csvkit.org
CSV (Comma-Separated Values) utilities, in the browser

Remove Duplicates From CSV

updated 12 March 2026

CSV Dedupe Toolkit

Remove duplicate rows from a CSV by any column or combination of columns. Leave the column field empty for whole-row dedupe. Choose whether to keep the first or last occurrence of each duplicate, and optionally ignore case or trim whitespace during comparison. Comparison rules are explicit, so nothing in the data is changed silently. Runs entirely in the browser.

How to use it

  1. Paste or drop your CSV in the left pane.
  2. In Cols, type the column names that define a duplicate, separated by commas. For example, enter email for a single-column key, or email, country for a two-column key. Leave blank for whole-row dedupe.
  3. Toggle ignore case if casing shouldn't matter.
  4. Toggle trim if leading/trailing whitespace shouldn't matter.
  5. Pick Keep first (default) or last.
  6. Click Dedupe. The status bar reports how many duplicates were removed.
  7. Copy or Download .csv.

Options explained

Cols (the dedupe key)

Each row is reduced to a key built from the listed columns, joined with a separator. Two rows with the same key are duplicates. Leaving the field blank uses every column — equivalent to whole-row dedupe.

Examples: email (one row per email); email, country (same email in different countries is allowed); first_name, last_name, dob (dedupe by identity tuple).
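
The key-building rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's own code: the separator character, the make_key helper, and the sample addresses are all assumptions made for the example.

```python
import csv
import io

# Hypothetical separator: the page only says "a separator". The ASCII unit
# separator is used here because it rarely appears inside real CSV cells.
SEP = "\x1f"

def make_key(row: dict, cols: list[str]) -> str:
    """Join the values of the key columns; an empty list means every column."""
    names = cols or list(row.keys())
    return SEP.join(row[name] for name in names)

# Made-up sample data, for illustration only.
data = (
    "email,country,name\n"
    "a@example.com,US,Ann\n"
    "a@example.com,DE,Ann\n"
    "a@example.com,US,Anna\n"
)
rows = list(csv.DictReader(io.StringIO(data)))

by_email = {make_key(r, ["email"]) for r in rows}
by_email_country = {make_key(r, ["email", "country"]) for r in rows}
whole_row = {make_key(r, []) for r in rows}

print(len(by_email), len(by_email_country), len(whole_row))  # 1 2 3
```

The narrower the key, the more rows collapse: one distinct key by email, two by email and country, and three when every column counts.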

ignore case & trim

ignore case lowercases the key before comparing: Alice and alice collapse. trim removes leading/trailing whitespace from each key component: "alice@x " and "alice@x" collapse. Both operate on the comparison key only — the output row preserves the original values.
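
As a rough sketch of how the two toggles could act on the comparison key while leaving the row alone (the normalize helper and the sample row are assumptions for illustration, not the tool's code):

```python
def normalize(value: str, ignore_case: bool, trim: bool) -> str:
    """Return the comparison form of one key component."""
    if trim:
        value = value.strip()   # drop leading/trailing whitespace
    if ignore_case:
        value = value.lower()   # the page describes lowercasing the key
    return value

# Made-up row; the trailing space and capital letter are deliberate.
row = {"email": "Alice@x ", "name": "Alice"}

key = normalize(row["email"], ignore_case=True, trim=True)

print(repr(key))           # 'alice@x'
print(repr(row["email"]))  # 'Alice@x ' (the row itself is untouched)
```

Because only the derived key is normalized, whichever row survives is written out with its original casing and spacing.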

Keep first / last

Which duplicate survives. First keeps the earliest occurrence of each key. Last keeps the latest occurrence, which is useful when the source is an append-only log and the newest row holds the canonical state. Either way, surviving rows come out in first-seen order (see the FAQ on file order).
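
One way to picture the two policies is a dictionary keyed by the dedupe key. The sketch below is an assumption about how such logic might look, not the tool's implementation; the dedupe function and the sample log rows are hypothetical. It uses a tuple as the key instead of a joined string, and it relies on Python dictionaries keeping insertion order, which yields the first-seen output order described in the FAQ.

```python
def dedupe(rows, cols, keep="first"):
    """Return one row per key; keep is "first" or "last"."""
    survivors = {}  # insertion-ordered: each key keeps its first-seen slot
    for row in rows:
        key = tuple(row[c] for c in (cols or row.keys()))
        if key not in survivors or keep == "last":
            # Overwriting an existing key replaces the row but not its position.
            survivors[key] = row
    return list(survivors.values())

# Made-up append-only log.
log = [
    {"id": "1", "state": "new"},
    {"id": "2", "state": "new"},
    {"id": "1", "state": "done"},
]

first = dedupe(log, ["id"], keep="first")
last = dedupe(log, ["id"], keep="last")

print([r["state"] for r in first])  # ['new', 'new']
print([r["state"] for r in last])   # ['done', 'new']
```

With keep="last", id 1 carries its newest state but still appears ahead of id 2, because id 1 was seen first.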

Example

Input (duplicates by email):

email,name,last_login
[email protected],Alice,2024-01-01
[email protected],Bob,2024-01-02
[email protected],Alice (updated),2024-02-09

Dedupe on email, ignore case on, Keep last:

email,name,last_login
[email protected],Alice (updated),2024-02-09
[email protected],Bob,2024-01-02

Troubleshooting

"Column not found" on a column that's clearly in the file.

This is usually a casing mismatch or an invisible character in the header, such as a byte order mark (BOM) in front of the first column name. Try copying the header name straight from the file rather than typing it.
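
To check for a BOM outside the tool, a short Python sketch like the following can help. The sample bytes are made up; the "utf-8-sig" codec is part of the Python standard library and strips a leading BOM if one is present.

```python
import csv
import io

# Made-up file contents that start with the UTF-8 BOM bytes EF BB BF.
raw = b"\xef\xbb\xbfemail,name\na@example.com,Ann\n"

# Plain "utf-8" keeps the BOM as an invisible U+FEFF on the first header.
naive_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))

# "utf-8-sig" removes it, so the header matches what you type.
clean_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))

print("email" in naive_header)  # False
print("email" in clean_header)  # True
```

If the first check is False and the second is True, the BOM is the reason the column name is not found.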

The row count went down more than expected.

Turn off trim and ignore case to see the count with exact comparison, then re-enable them one by one. The difference tells you how dirty the source is.
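
The same comparison can be run offline to see how much each option contributes. This is a sketch under assumptions: count_unique is a hypothetical helper and the addresses are invented.

```python
from itertools import product

def count_unique(values, ignore_case, trim):
    """Count distinct keys under the given comparison settings."""
    seen = set()
    for v in values:
        if trim:
            v = v.strip()
        if ignore_case:
            v = v.lower()
        seen.add(v)
    return len(seen)

# Made-up column values with one casing variant and one trailing space.
emails = ["a@example.com", "A@example.com", "a@example.com ", "b@example.com"]

# Keys of the dict are (ignore_case, trim) pairs.
counts = {
    (ic, tr): count_unique(emails, ic, tr)
    for ic, tr in product([False, True], repeat=2)
}

for (ic, tr), n in counts.items():
    print(f"ignore_case={ic!s:5} trim={tr!s:5} -> {n} unique")
```

Here exact comparison leaves 4 unique values, each option alone leaves 3, and both together leave 2, so casing and whitespace each account for one extra duplicate.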

I want to see which rows were removed, not which survived.

That's not a current option. Workaround: dedupe the file, then use CSV Diff to compare the original and the result.
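
If you are comfortable with a script, the removed rows can also be collected directly. The sketch below is not a feature of the tool; split_duplicates is a hypothetical helper that applies a keep-first policy to made-up data.

```python
def split_duplicates(rows, cols):
    """Return (kept, removed) under a keep-first policy."""
    seen = set()
    kept, removed = [], []
    for row in rows:
        key = tuple(row[c] for c in cols)
        # A key seen before marks this row as a removed duplicate.
        (removed if key in seen else kept).append(row)
        seen.add(key)
    return kept, removed

# Made-up rows.
rows = [
    {"email": "a@example.com", "name": "Ann"},
    {"email": "b@example.com", "name": "Bob"},
    {"email": "a@example.com", "name": "Ann (dup)"},
]

kept, removed = split_duplicates(rows, ["email"])

print([r["name"] for r in kept])     # ['Ann', 'Bob']
print([r["name"] for r in removed])  # ['Ann (dup)']
```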

Frequently asked questions

Can I dedupe on multiple columns?

Yes. Enter a comma-separated list, e.g. email, country. A row is a duplicate only if the combination of those columns matches.

Does it upload my file?

No. All deduping runs in your browser. The "Removed N duplicates" feedback is computed locally. See the privacy policy.

Will the original file order be preserved?

Yes. Output rows appear in the order they were first seen in the input. If you set Keep: last, the surviving row for each key is the last-seen one, but the surviving rows still come out in first-seen order.

Can it handle a million rows?

Yes, as long as it fits in your browser's RAM. A million thin rows is fine on a laptop; wide rows with long text will use a lot more memory.
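
For files that do not fit comfortably in memory, one option outside the browser is to stream the rows and remember only the keys. This is a minimal sketch using Python's standard csv module with a keep-first policy; dedupe_stream and the sample data are assumptions, not part of the tool.

```python
import csv
import io

def dedupe_stream(src, dst, cols):
    """Copy rows from src to dst, skipping repeated keys; return the number removed."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    seen = set()  # holds keys only, never whole rows
    removed = 0
    for row in reader:
        key = tuple(row[c] for c in cols)
        if key in seen:
            removed += 1
            continue
        seen.add(key)
        writer.writerow(row)
    return removed

# Made-up input; in practice src and dst would be open file objects.
src = io.StringIO(
    "email,name\n"
    "a@example.com,Ann\n"
    "b@example.com,Bob\n"
    "a@example.com,Ann again\n"
)
dst = io.StringIO()

removed_count = dedupe_stream(src, dst, ["email"])
print(removed_count)  # 1
print(dst.getvalue())
```

Memory use then grows with the number of distinct keys rather than with row width, which matters most for wide rows with long text.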