Duplicate Line Remover
Strip duplicate lines from any list. Ignore-case, trim-whitespace and drop-blank options plus a per-line frequency table so you can see exactly what repeated.
What is the Duplicate Line Remover?
Duplicate Line Remover deduplicates any newline-separated list (email exports, log lines, URL lists, CSV columns) and shows you exactly what was duplicated. It offers four options: ignore-case (Apple = APPLE), trim-whitespace (lines that differ only in leading or trailing space collapse), drop-blank-lines, and a preserve-order vs. sort-A→Z output toggle. Below the main output, a frequency table lists every unique line ranked by how often it appeared, so you can spot the worst offenders. Stats at the top track the original line count, unique lines kept, duplicates dropped and blank lines found. A live filter narrows the frequency table when the list is long. The dedupe logic is a set of pure functions with no upload, so everything stays on this device.
How to use it
- Paste your list (one entry per line) into the input panel.
- Toggle ignore-case, trim-whitespace or drop-blank-lines depending on what counts as 'the same' line.
- Pick preserve-order (default) or sort-A→Z for the output.
- Copy the cleaned list, download it as .txt, or browse the frequency table to see what repeated.
Benefits
- Preserve-order mode keeps the first occurrence and drops repeats — same semantics as `awk '!seen[$0]++'`.
- Ignore-case and trim-whitespace options collapse near-duplicates that differ only in capitalisation or padding.
- Drop-blank-lines option cleans up paste-artifact empty rows.
- Live frequency table shows every unique line with its repeat count for spot-checking.
- Filter input narrows the table by substring so you can hunt a specific offender in a huge list.
- Stats row tracks original lines, unique lines kept, duplicates dropped and blank lines.
- 'Replace input with unique' button feeds the deduped output back into the editor.
- Runs 100% in your browser — your list never leaves the device.
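As a rough illustration of the stats row mentioned above, the four counters could be gathered in one pass over the lines. This is only a sketch: the `DedupeStats` shape and the `dedupeStats` name are hypothetical, and it assumes that drop-blank-lines is on and that "blank" means an exactly empty line.

```typescript
// Hypothetical shape for the stats row; field names are illustrative only.
interface DedupeStats {
  originalLines: number;
  uniqueKept: number;
  duplicatesDropped: number;
  blankLines: number;
}

// Single pass: blanks are counted and skipped, repeats are counted as dropped.
// Assumption: a blank line is the empty string (no whitespace-only handling).
function dedupeStats(lines: string[]): DedupeStats {
  const seen = new Set<string>();
  let blankLines = 0;
  let duplicatesDropped = 0;
  for (const line of lines) {
    if (line === "") {
      blankLines++;
      continue;
    }
    if (seen.has(line)) {
      duplicatesDropped++;
    } else {
      seen.add(line);
    }
  }
  return {
    originalLines: lines.length,
    uniqueKept: seen.size,
    duplicatesDropped,
    blankLines,
  };
}
```

Under these assumptions the counters always add up: unique kept + duplicates dropped + blank lines equals the original line count.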
Frequently asked questions
What counts as a duplicate line?
Two lines are duplicates when their normalised forms match. With no options enabled, that means the lines must be identical character for character. With 'trim whitespace' on, leading/trailing spaces are stripped first. With 'ignore case' on, both lines are lowercased before comparison.
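One way to picture the normalisation step is as a function that turns each line into a comparison key. The sketch below is not the tool's actual code: `MatchOptions`, `normaliseKey` and `isDuplicate` are hypothetical names, and it assumes `String.prototype.trim` is an acceptable stand-in for "strip leading/trailing spaces" (it also removes tabs and other whitespace).

```typescript
// Hypothetical option names mirroring the two matching toggles.
interface MatchOptions {
  trimWhitespace: boolean;
  ignoreCase: boolean;
}

// Build the comparison key: trim first, then lowercase, as the toggles dictate.
function normaliseKey(line: string, opts: MatchOptions): string {
  let key = line;
  if (opts.trimWhitespace) key = key.trim();
  if (opts.ignoreCase) key = key.toLowerCase();
  return key;
}

// Two lines are duplicates when their keys are strictly equal.
function isDuplicate(a: string, b: string, opts: MatchOptions): boolean {
  return normaliseKey(a, opts) === normaliseKey(b, opts);
}
```

With both toggles off the key is the line itself, so only exact matches count as duplicates.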
Which copy of a duplicate is kept?
In 'preserve order' mode the first occurrence wins and every later copy is dropped. In 'sort A→Z' mode the unique set is collected first, then sorted.
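The two output modes can be sketched as a single function with a mode switch. The names `OrderMode` and `uniqueLines` are hypothetical, and the sort here is a plain code-unit comparison, which is an assumption: it puts uppercase before lowercase, and a real implementation might prefer `localeCompare`.

```typescript
// Hypothetical mode names for the order toggle.
type OrderMode = "preserve" | "sort";

function uniqueLines(lines: string[], mode: OrderMode): string[] {
  const seen = new Set<string>();
  const kept: string[] = [];
  for (const line of lines) {
    // First occurrence wins; later copies are skipped.
    if (!seen.has(line)) {
      seen.add(line);
      kept.push(line);
    }
  }
  if (mode === "sort") {
    // Assumption: simple code-unit ordering, not locale-aware collation.
    kept.sort((a, b) => (a < b ? -1 : a > b ? 1 : 0));
  }
  return kept;
}
```

Because the set lookup is constant time on average, the preserve-order path is a single linear pass over the input.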
Does it remove duplicates that appear out of order?
Yes. Order doesn't matter for detection — a duplicate is any line whose normalised form matches a line seen earlier anywhere in the input.
What's the frequency table for?
It shows every unique line and how many times it appeared in the input, ranked by count. Useful for finding the most-repeated entries when a CSV got mis-merged.
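A frequency table of this kind can be built from a counting map and one sort. The sketch below uses hypothetical names (`FrequencyRow`, `frequencyTable`) and assumes that ties are shown in first-seen order, which the page does not specify.

```typescript
// Hypothetical row shape: one entry per unique line.
interface FrequencyRow {
  line: string;
  count: number;
}

function frequencyTable(lines: string[]): FrequencyRow[] {
  // Map keeps insertion order, so rows start out in first-seen order.
  const counts = new Map<string, number>();
  for (const line of lines) {
    counts.set(line, (counts.get(line) ?? 0) + 1);
  }
  const rows: FrequencyRow[] = [];
  counts.forEach((count, line) => {
    rows.push({ line, count });
  });
  // Highest count first. Array.prototype.sort is stable in ES2019+,
  // so lines with equal counts stay in first-seen order.
  rows.sort((a, b) => b.count - a.count);
  return rows;
}
```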
Can I remove blank lines too?
Yes. The 'drop blank lines' toggle skips empty lines entirely — they won't appear in the output or the frequency table.
Does it work on huge lists?
Yes. The implementation uses a Set for O(n) detection; we routinely test 100 000-line inputs in the browser without lag.
Will it break CSVs or quoted strings?
No — the tool only splits on newline characters. Each CSV row is treated as a whole line, including any commas, quotes or embedded escapes.
Can I sort the result alphabetically?
Yes — switch the order toggle to 'Sort A → Z' and the unique lines are alphabetised before output.
Does the trailing newline matter?
No. We strip a single trailing newline before splitting, so a file that ends with a newline doesn't pick up a phantom blank entry.
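The splitting behaviour can be sketched as follows. The `splitLines` name is hypothetical, and two details are assumptions rather than documented behaviour: Windows-style `\r\n` line endings are treated as one newline, and empty input yields an empty list.

```typescript
function splitLines(text: string): string[] {
  // Assumption: empty input means zero lines rather than one blank line.
  if (text === "") return [];
  // Remove at most one trailing line break so it does not create a phantom entry.
  const body = text.replace(/\r?\n$/, "");
  // Split on newlines only; commas, quotes and other characters stay inside the line.
  return body.split(/\r?\n/);
}
```

Only one trailing line break is removed, so an input such as `"a\n\n"` still keeps its genuine blank second line.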
Is my list uploaded anywhere?
No. The dedupe is a pure browser function — Toollyz has no server that ever sees your list.
Does this work on emojis or non-ASCII text?
Yes. Comparison is by string equality after the chosen normalisations, so Unicode lines compare correctly.
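One caveat of plain string equality is that two visually identical lines can differ in how they are encoded, for example a precomposed "é" versus "e" followed by a combining accent. The sketch below contrasts strict equality with an optional `String.prototype.normalize("NFC")` pass. Whether the tool applies any Unicode normalisation is not stated here, so `sameLineNfc` is purely a hypothetical extra step.

```typescript
// Strict string equality, as described above.
function sameLine(a: string, b: string): boolean {
  return a === b;
}

// Hypothetical extra step: compare after Unicode NFC normalisation.
function sameLineNfc(a: string, b: string): boolean {
  return a.normalize("NFC") === b.normalize("NFC");
}

const precomposed: string = "caf\u00e9"; // "é" as a single code point
const decomposed: string = "cafe\u0301"; // "e" plus a combining acute accent

const strictMatch = sameLine(precomposed, decomposed); // false: different code units
const nfcMatch = sameLineNfc(precomposed, decomposed); // true: same text after NFC
const emojiMatch = sameLine("🎉 party", "\u{1F389} party"); // true: same code point
```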