CSV Encoding Detector and UTF-8 Converter

Detect the character encoding of a CSV file and convert it to UTF-8 with a byte order mark.

About CSV Encoder

Upload a CSV and the tool inspects the raw bytes to guess the encoding using chardet, decodes with TextDecoder, and re-encodes the content as UTF-8 with a BOM so spreadsheet software opens it correctly.

CSV files exported from older systems often use legacy encodings such as the single-byte Windows-1252 and ISO-8859-1 or the multi-byte Shift_JIS and GB18030. When those files are opened in software that expects UTF-8, accented or non-Latin characters appear as garbled text (mojibake). This tool reads the raw bytes of an uploaded file, runs them through chardet to identify the most likely encoding, decodes the bytes with the browser's TextDecoder, and re-encodes the text as UTF-8.
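The effect is easy to reproduce. The sketch below assumes a runtime that provides the standard TextDecoder and TextEncoder globals (a browser, or a Node.js build with full ICU); the helper name `transcodeToUtf8` is hypothetical and not taken from the tool itself.

```typescript
// Hypothetical helper: decode bytes in a known source encoding, then re-encode as UTF-8.
function transcodeToUtf8(bytes: Uint8Array, sourceEncoding: string): Uint8Array {
  const text = new TextDecoder(sourceEncoding).decode(bytes);
  return new TextEncoder().encode(text);
}

// "café" as Windows-1252: the é is the single byte 0xE9.
const legacyBytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);

// Decoding as UTF-8 cannot interpret the lone 0xE9 and substitutes U+FFFD.
const wrong = new TextDecoder("utf-8").decode(legacyBytes);

// Decoding with the right label recovers the text; UTF-8 then stores é as 0xC3 0xA9.
const utf8Bytes = transcodeToUtf8(legacyBytes, "windows-1252");
const right = new TextDecoder("utf-8").decode(utf8Bytes);
```

Here `wrong` ends in the replacement character, while `right` is the original word and `utf8Bytes` is one byte longer than the input.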

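The download includes a UTF-8 byte order mark (the bytes EF BB BF). Microsoft Excel relies on the BOM to recognize a CSV as UTF-8 and display non-Latin characters correctly. Not every consumer strips the mark, though: Python's utf-8 codec and most command-line tools such as awk pass it through as part of the first field, so read the file with a BOM-aware option (for example Python's utf-8-sig codec) or remove the first three bytes when the extra character causes problems.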
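When a downstream reader does not handle the mark itself, it can be detected and removed by comparing the first three bytes with EF BB BF. A minimal sketch; `hasUtf8Bom` and `stripUtf8Bom` are hypothetical names used only for illustration.

```typescript
// The UTF-8 byte order mark is U+FEFF encoded as the three bytes EF BB BF.
const UTF8_BOM = [0xef, 0xbb, 0xbf];

function hasUtf8Bom(bytes: Uint8Array): boolean {
  return bytes.length >= 3 && UTF8_BOM.every((b, i) => bytes[i] === b);
}

// Returns a view without the BOM; the input is returned unchanged if no BOM is present.
function stripUtf8Bom(bytes: Uint8Array): Uint8Array {
  return hasUtf8Bom(bytes) ? bytes.subarray(3) : bytes;
}
```

TextDecoder itself drops a leading BOM unless it is constructed with `ignoreBOM: true`, so this check matters mainly for consumers that work on the raw bytes.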

If automatic detection picks the wrong encoding, which can happen with very short or ambiguous files, the override dropdown lets you re-decode with a specific charset. The byte counts before and after conversion are shown so you can sanity-check the result. Converting from a single-byte encoding adds the three BOM bytes plus at least one byte for each non-ASCII character; a change far outside what the file's content would explain usually indicates that the wrong source encoding was used.
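One way to make that check concrete is to compare the sizes and count U+FFFD replacement characters, which a decoder emits for bytes it cannot map. This is a sketch under the assumption that the output is the decoded text plus a three-byte BOM; `describeConversion` and `ConversionReport` are hypothetical names.

```typescript
interface ConversionReport {
  sourceBytes: number;
  outputBytes: number; // UTF-8 size including the 3-byte BOM
  replacementChars: number; // count of U+FFFD in the decoded text
}

function describeConversion(bytes: Uint8Array, encoding: string): ConversionReport {
  const text = new TextDecoder(encoding).decode(bytes);
  const outputBytes = new TextEncoder().encode(text).length + 3;
  const replacementChars = text.split("\uFFFD").length - 1;
  return { sourceBytes: bytes.length, outputBytes, replacementChars };
}

// "café,naïve" stored as Windows-1252 (10 bytes, two of them non-ASCII).
const sample = new Uint8Array([0x63, 0x61, 0x66, 0xe9, 0x2c, 0x6e, 0x61, 0xef, 0x76, 0x65]);
const asWindows1252 = describeConversion(sample, "windows-1252");
const asUtf8 = describeConversion(sample, "utf-8");
```

With the correct label the output is 15 bytes and contains no replacement characters; decoding the same bytes as UTF-8 yields two. A non-zero count is a strong hint that the source encoding is wrong, although single-byte encodings that map every byte will never produce one.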

How to use the CSV Encoder
  1. Upload the file

    Choose a CSV file. The raw bytes are loaded into the browser as a Uint8Array, ready for inspection.

  2. Detect the encoding

    chardet analyses byte patterns and returns the most likely encoding with a confidence score. The top match is used to decode the file.

  3. Download as UTF-8

    The decoded text is encoded as UTF-8 with a byte order mark and offered as a download. The original file on disk is never modified.
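The three steps above can be sketched as follows. This is not the tool's source: the function names are hypothetical, and `guessEncoding` is a deliberately simplified stand-in for chardet's statistical detection (it only checks for a BOM, tries a strict UTF-8 decode, and otherwise falls back to an assumed default).

```typescript
// Step 1: load the raw bytes. File extends Blob, so a File from <input type="file"> works here.
async function readBytes(file: Blob): Promise<Uint8Array> {
  return new Uint8Array(await file.arrayBuffer());
}

// Step 2 (simplified stand-in for chardet): BOM check, then strict UTF-8, then a fallback.
function guessEncoding(bytes: Uint8Array, fallback = "windows-1252"): string {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8";
  }
  try {
    // With fatal: true the decoder throws on invalid UTF-8 instead of inserting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return "utf-8";
  } catch {
    return fallback; // assumption: treat everything else as the legacy default
  }
}

// Step 3: decode, then build a UTF-8 Blob with a leading BOM.
// String parts of a Blob are encoded as UTF-8, so "\uFEFF" becomes EF BB BF.
function toUtf8CsvBlob(bytes: Uint8Array, encoding: string): Blob {
  const text = new TextDecoder(encoding).decode(bytes);
  return new Blob(["\uFEFF", text], { type: "text/csv;charset=utf-8" });
}
```

In a browser, the resulting Blob can be offered as a download by passing it to `URL.createObjectURL` and assigning the URL to a link with a `download` attribute. Because everything operates on an in-memory copy of the bytes, the original file is not touched.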

Common use cases

Fix Excel exports with broken accents

Convert Windows-1252 exports from older Excel installations into UTF-8 so accented French, Spanish, or German names display correctly.

Prepare CSVs for Postgres COPY

Postgres COPY interprets a file in the encoding named by its ENCODING option or, when that option is omitted, in the current client encoding. Convert files to UTF-8 first so the import does not fail on the first non-ASCII byte.

Open Japanese CSVs in modern editors

Files saved as Shift_JIS from legacy POS software become readable when transcoded to UTF-8, which is the default for VS Code and most editors.

Override an incorrect guess

Very short files can confuse encoding detection. Pick the correct encoding from the override list to redo the conversion.

Frequently asked questions
Is my file uploaded anywhere?

No. Detection and conversion both run in your browser using chardet and the standard TextDecoder API. The file never leaves your device.

Why does the output include a BOM?

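Microsoft Excel uses the UTF-8 byte order mark to recognize UTF-8 encoded CSV files. Many other programs tolerate or strip the BOM, but some, including Python's utf-8 codec and most command-line tools, treat it as part of the first field. In that case, use a BOM-aware reader (such as Python's utf-8-sig codec) or remove the first three bytes.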

What if detection picks the wrong encoding?

Use the override dropdown to re-decode with a specific encoding. chardet relies on statistical patterns, which can be inconclusive for very short or unusual files.
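The ambiguity is easy to see with a tiny input: the same byte is valid in several single-byte encodings and only the surrounding text reveals which one was meant. The sketch below decodes one byte under two candidate labels; `decodeCandidates` is a hypothetical helper, and the example assumes the runtime's TextDecoder supports both labels.

```typescript
// Decode the same bytes under several candidate labels for side-by-side comparison.
function decodeCandidates(bytes: Uint8Array, labels: string[]): Record<string, string> {
  const results: Record<string, string> = {};
  for (const label of labels) {
    results[label] = new TextDecoder(label).decode(bytes);
  }
  return results;
}

// The single byte 0xE9 is valid in both encodings but maps to different characters:
// Latin "é" (U+00E9) in Windows-1252 and Cyrillic "й" (U+0439) in Windows-1251.
const previews = decodeCandidates(new Uint8Array([0xe9]), ["windows-1252", "windows-1251"]);
```

A preview like this is what makes a manual override practical: a person can tell at a glance which decoding is readable, even when the statistics are inconclusive.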

Which encodings can the browser decode?

TextDecoder supports the encodings listed in the Encoding Standard, including UTF-8, UTF-16, most ISO-8859 variants, the Windows-125x family, Shift_JIS, EUC-JP, EUC-KR, GB18030, Big5, and KOI8-R. The labels ISO-8859-1 and ISO-8859-9 are treated as aliases of Windows-1252 and Windows-1254.
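Whether a particular label is available can be checked at runtime, because the TextDecoder constructor throws for labels it does not recognize. A minimal sketch; `canonicalEncoding` is a hypothetical helper name.

```typescript
// Returns the canonical encoding name for a label, or null if the runtime cannot decode it.
function canonicalEncoding(label: string): string | null {
  try {
    return new TextDecoder(label).encoding;
  } catch {
    return null; // the constructor throws a RangeError for unknown labels
  }
}
```

For example, `canonicalEncoding("utf8")` returns `"utf-8"`, and an unknown label returns `null`. In runtimes that follow the Encoding Standard's label table, `canonicalEncoding("latin1")` is expected to report `"windows-1252"`.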

Will conversion change my data?

Conversion only changes the byte representation of characters, not the characters themselves. If the source encoding is correct, the visible content of every cell is preserved exactly.
