CSV Diff Toolkit
Compare two versions of a CSV at the row and cell level. Specify a key column that uniquely identifies a record (id, email, or similar), drop in the old and new files, and receive an added / removed / changed breakdown with cell-level highlighting. For tabular data this is more useful than plain diff, which flags reordered rows as different and can't point at the specific cell that changed. A typical application is sanity-checking a data migration before cutover.
Before you start
You need:
- Two CSVs — the "old" (left) and "new" (right) version of the same dataset. They should share a header row and, ideally, share most columns.
- A key column (or composite of columns) that uniquely identifies a record in both files. Usually id, sku, email, or similar. Without a key, the tool falls back to row-text compare; see below.
The two files don't have to have the same columns or the same row order. The diff matches rows by key, then compares cells of the matched rows.
How to use it
- Paste or drop your old CSV into the left pane.
- Paste or drop your new CSV into the right pane.
- Type the Key column(s): one name (id) or a comma-separated list for a composite key (country, sku).
- Click Compare.
- Review the result below the panes. Colour legend:
- green — row exists only in the new file (added).
- red — row exists only in the old file (removed).
- yellow — row exists in both, but at least one cell differs. The specific differing cells are highlighted.
- Click Download diff for a CSV with a change_type column.
Key-column vs. row-text diff
Leave the key field empty and the tool falls back to comparing entire row strings as a set — useful when a file has no stable ID, but weaker because a single-cell change looks like a delete + add, and you lose the cell-level highlight.
Use a key whenever possible. If no column is unique on its own, combine two or three into a composite key; the result is almost always more useful.
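To see why the row-text fallback is weaker, here is a minimal Python sketch of a set-based row compare. It is an illustration under assumptions, not the tool's actual code: the function name row_text_diff and the sample data are hypothetical.

```python
# Hypothetical sample data: row order differs and one price changed.
OLD = "sku,price\nA1,10\nB2,20\n"
NEW = "sku,price\nB2,25\nA1,10\n"

def row_text_diff(old_text, new_text):
    """Compare whole row strings as sets, skipping the header line."""
    old_rows = set(old_text.splitlines()[1:])
    new_rows = set(new_text.splitlines()[1:])
    added = sorted(new_rows - old_rows)
    removed = sorted(old_rows - new_rows)
    return added, removed

added_rows, removed_rows = row_text_diff(OLD, NEW)
print(added_rows)    # ['B2,25']
print(removed_rows)  # ['B2,20']
```

The reordering is ignored, but the price change on B2 surfaces as one removed row plus one added row, with no indication of which cell differs.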
What "changed" really means
- Comparison is string-exact. "1.0" and "1" are different. Normalise types in the source if that's noise.
- Only columns that exist in both files are compared for changes. A column added in the new file is reported as part of the "added" row context, not as a cell change for every existing row.
- If the same key appears more than once in one file, the extra copies are flagged so you can fix the source before trusting the diff.
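If you want to check a file for these issues before diffing, a short Python sketch along these lines may help. The helpers duplicate_keys and same_value are hypothetical names for illustration; the tool's own duplicate check and comparison may work differently.

```python
import csv
import io
from collections import Counter
from decimal import Decimal, InvalidOperation

def duplicate_keys(text, key):
    """Return key values that occur more than once in one CSV text."""
    counts = Counter(row[key] for row in csv.DictReader(io.StringIO(text)))
    return sorted(k for k, n in counts.items() if n > 1)

def same_value(a, b):
    """Compare numerically when both cells parse as decimals, else as strings."""
    try:
        return Decimal(a) == Decimal(b)
    except InvalidOperation:
        return a == b

print("1.0" == "1")            # False: string-exact, like the diff
print(same_value("1.0", "1"))  # True: numeric view, useful when normalising
print(duplicate_keys("id,name\n1,Alice\n2,Bob\n1,Alicia\n", "id"))  # ['1']
```

A cell pair where the string compare and same_value disagree is a candidate for type normalisation in the source.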
Example
Old CSV:
id,name,city
1,Alice,Berlin
2,Bob,Paris
3,Carol,Rome
New CSV:
id,name,city
1,Alice,Berlin
2,Bob,Lyon
4,Dana,Madrid
Diff on id:
- id=1: unchanged.
- id=2: changed, city Paris → Lyon.
- id=3: removed.
- id=4: added.
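The same result can be reproduced with a small Python sketch of a key-based diff. This is an assumption about how such a diff could be written, not the tool's implementation: index_by_key and diff_by_key are hypothetical names, keys are assumed unique, and only columns present in both files are compared.

```python
import csv
import io

OLD = "id,name,city\n1,Alice,Berlin\n2,Bob,Paris\n3,Carol,Rome\n"
NEW = "id,name,city\n1,Alice,Berlin\n2,Bob,Lyon\n4,Dana,Madrid\n"

def index_by_key(text, key):
    """Map each key value to its row dict (assumes keys are unique)."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def diff_by_key(old_text, new_text, key):
    old = index_by_key(old_text, key)
    new = index_by_key(new_text, key)
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for k in sorted(old.keys() & new.keys()):
        # Compare only columns that exist in both rows, as exact strings.
        shared = sorted(old[k].keys() & new[k].keys())
        cells = {c: (old[k][c], new[k][c]) for c in shared if old[k][c] != new[k][c]}
        if cells:
            changed[k] = cells
    return added, removed, changed

added, removed, changed = diff_by_key(OLD, NEW, "id")
print(added)    # ['4']
print(removed)  # ['3']
print(changed)  # {'2': {'city': ('Paris', 'Lyon')}}
```

Because rows are looked up by key in a dictionary, neither file needs to be sorted.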
Tips & common pitfalls
- Trim noisy columns first. Timestamps and "last updated" columns change on every row and drown out real diffs. Strip them, or diff them in a separate run.
- Sort both files by key before download if you plan to eyeball the diff output — the tool itself doesn't need sorted input, but a sorted CSV is much easier to review.
- Big + wide files are slow because every matched row builds a string key and a per-column comparison. If you hit a wall, cut the file by column or by row with Split CSV first.
- Reordered columns are invisible to the diff — the tool lines up columns by name, not by position. If a column was renamed, it looks like "old column removed, new column added".
- Watch out for whitespace: trailing spaces in one file make the cell look "changed" even when it isn't. If the source has that problem, run the file through Dedupe CSV with trim first, or preprocess with a script.
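If you go the script route for preprocessing, something like the following Python sketch could drop noisy columns and trim cell whitespace before you paste the files in. The function clean_csv and the updated_at column are hypothetical; adapt the names to your data.

```python
import csv
import io

def clean_csv(text, drop=()):
    """Return CSV text without the `drop` columns and with cells trimmed."""
    reader = csv.DictReader(io.StringIO(text))
    keep = [c for c in reader.fieldnames if c not in drop]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keep, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Short rows yield None for missing cells; treat those as empty.
        writer.writerow({c: (row[c] or "").strip() for c in keep})
    return out.getvalue()

raw = "id,city,updated_at\n1, Berlin ,2024-01-01\n"
print(clean_csv(raw, drop=("updated_at",)))
```

Run both the old and the new file through the same cleaning step so that the diff compares like with like.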
Troubleshooting
Every row shows up as changed.
Most likely the files have different line endings or an encoding mismatch, or the key column has a subtle difference (BOM, trailing space). Open both files in a plain editor and compare the first row byte-for-byte.
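As an alternative to eyeballing bytes in an editor, a short Python sketch can report the usual suspects. The helper describe_start is a hypothetical name; it only inspects raw bytes and makes no assumption about how the tool reads files.

```python
import codecs

def describe_start(raw):
    """Summarise BOM, line endings, and the first line of raw CSV bytes."""
    return {
        "has_utf8_bom": raw.startswith(codecs.BOM_UTF8),
        "has_crlf": b"\r\n" in raw,
        "first_line": raw.splitlines()[0] if raw else b"",
    }

sample = b"\xef\xbb\xbfid,name\r\n1,Alice\r\n"
print(describe_start(sample))

# Decoding with "utf-8-sig" drops a leading BOM if one is present.
print(repr(sample.decode("utf-8-sig").splitlines()[0]))  # 'id,name'
```

Run it on the bytes of both files (for example from open(path, "rb").read()) and compare the two summaries.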
Rows I expect to match are flagged as "added + removed".
Your key isn't actually unique, or its values don't match exactly across files. Try a composite key or normalise the key column first (trim, lowercase).
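To find out whether normalisation would help, you could list keys that match only after trimming and lowercasing. The Python sketch below is one way to do that; near_matches is a hypothetical helper and the e-mail values are made-up sample data.

```python
def near_matches(old_keys, new_keys):
    """Pair keys that differ as raw strings but match after trim + lowercase."""
    def norm(key):
        return key.strip().lower()

    old_by_norm = {norm(k): k for k in old_keys}
    pairs = []
    for k in new_keys:
        candidate = old_by_norm.get(norm(k))
        if candidate is not None and candidate != k:
            pairs.append((candidate, k))
    return sorted(pairs)

old_keys = ["Ann@Example.com", "bob@example.com"]
new_keys = ["ann@example.com ", "bob@example.com"]
print(near_matches(old_keys, new_keys))
# [('Ann@Example.com', 'ann@example.com ')]
```

Each pair it reports is a record that shows up as "added + removed" until the key column is normalised in the source.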
The tool says "duplicate key" but I only have one row with that id.
Check for whitespace or invisible characters in the key cell. Two rows with keys "42" and "42 " collide after trimming.
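Invisible characters are hard to spot by eye, so a small Python sketch that names them can help. suspicious_chars is a hypothetical helper; it flags whitespace plus Unicode control and format characters, which covers spaces, tabs, zero-width spaces, and a stray BOM.

```python
import unicodedata

def suspicious_chars(cell):
    """Return (index, name) pairs for whitespace and invisible characters."""
    found = []
    for i, ch in enumerate(cell):
        if ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf"):
            # Control characters have no Unicode name; fall back to the code point.
            found.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return found

print(suspicious_chars("42"))        # []
print(suspicious_chars("42 "))       # [(2, 'SPACE')]
print(suspicious_chars("42\u200b"))  # [(2, 'ZERO WIDTH SPACE')]
```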
Frequently asked questions
Why not just use diff?
Command-line diff does byte-level line comparison. It gets confused by reordered rows or different line endings, and it can't tell you which cell changed in a row. This tool understands keys and cells.
Can I diff on a composite key?
Yes. Enter comma-separated column names, e.g. country, sku. Rows match when the full tuple matches.
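As a rough illustration of tuple matching, the Python sketch below indexes rows by a composite key parsed from a comma-separated spec. The function composite_index and the sample rows are hypothetical; the tool's internal key format is not documented here.

```python
import csv
import io

def composite_index(text, key_spec):
    """Index rows by a tuple of the columns named in a comma-separated spec."""
    columns = [c.strip() for c in key_spec.split(",")]
    return {
        tuple(row[c] for c in columns): row
        for row in csv.DictReader(io.StringIO(text))
    }

data = "country,sku,price\nDE,A1,10\nFR,A1,12\n"
index = composite_index(data, "country, sku")
print(sorted(index))               # [('DE', 'A1'), ('FR', 'A1')]
print(index[("FR", "A1")]["price"])  # 12
```

The sku A1 appears twice, yet the two rows stay distinct because the country is part of the key.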
Is my data uploaded?
No. Both files stay in your browser. See the privacy policy.
Can I export only the changed rows?
Yes, with one extra step. Download diff exports all rows with a change_type column (added / removed / changed / unchanged); filter on that column in any spreadsheet or with awk.
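If you prefer a script, the filter could look like this Python sketch. The change_type column name and its values come from the export described above; the sample rows, the column order, and the function only_changed are assumptions for illustration.

```python
import csv
import io

# Hypothetical export; the real file's column order may differ.
SAMPLE = (
    "id,name,city,change_type\n"
    "1,Alice,Berlin,unchanged\n"
    "2,Bob,Lyon,changed\n"
    "3,Carol,Rome,removed\n"
    "4,Dana,Madrid,added\n"
)

def only_changed(diff_text, wanted=("changed",)):
    """Keep rows whose change_type is one of the wanted values."""
    reader = csv.DictReader(io.StringIO(diff_text))
    return [row for row in reader if row["change_type"] in wanted]

print([row["id"] for row in only_changed(SAMPLE)])  # ['2']
print([row["id"] for row in only_changed(SAMPLE, ("added", "removed"))])  # ['3', '4']
```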