CSV Size and Compression Analyzer

See row count, column count, detected delimiter, average row size, and estimated gzip size for any CSV.

About CSV Size Analyzer

Upload a CSV and inspect its shape and weight: total bytes, row count, column count, detected delimiter, average row size, and projected gzip size. Large files are sampled so analysis stays fast in the browser.

When a CSV grows past a few megabytes it stops being something you can eyeball. This tool reads up to 5 MB of the file directly in the browser, walks the sample, and reports row count, column count, average bytes per row, and detected delimiter. For files larger than the sample cap, the totals are extrapolated from the ratio of sampled bytes to file size, which is accurate when rows are uniform in width.
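The extrapolation described above can be sketched in a few lines of Python. This is an illustration, not the tool's own code: the function name `extrapolate_rows` and the returned keys are hypothetical, and a row cut off at the end of the sample is simply counted as a row.

```python
def extrapolate_rows(sample: bytes, file_size: int) -> dict:
    """Estimate whole-file row statistics from a leading sample (hypothetical helper)."""
    # Count non-empty lines in the sample; a truncated final line counts as a row.
    rows = [line for line in sample.splitlines() if line.strip()]
    sampled_rows = len(rows)
    if sampled_rows == 0:
        return {"sampled_rows": 0, "avg_row_bytes": 0.0, "estimated_rows": 0}

    # Average bytes per row, newline bytes included.
    avg_row_bytes = len(sample) / sampled_rows

    if file_size <= len(sample):
        # The whole file fit in the sample, so the count is exact.
        estimated_rows = sampled_rows
    else:
        # Scale by file size over sampled bytes; reliable only if row width is uniform.
        estimated_rows = round(sampled_rows * file_size / len(sample))

    return {
        "sampled_rows": sampled_rows,
        "avg_row_bytes": avg_row_bytes,
        "estimated_rows": estimated_rows,
    }
```

For example, a 12-byte sample holding 3 rows, taken from a 120-byte file, yields an estimate of 30 rows at 4 bytes per row.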

Delimiter detection compares comma, semicolon, tab, and pipe across the first ten non-empty lines. The candidate that produces the most consistent column count wins. This is similar in spirit to Python's csv.Sniffer, which pandas read_csv relies on when sep=None is used with the Python engine. If your file uses a separator outside those four, the result falls back to whichever candidate parses most consistently, which may not be your real separator.
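A minimal sketch of consistency-based detection, assuming a simple scoring rule: for each candidate, count how many of the first ten non-empty lines share the most common column count. The name `detect_delimiter` and the tie-breaking are assumptions for illustration, and the sketch ignores quoted fields, which a real parser would need to handle.

```python
from collections import Counter

CANDIDATES = [",", ";", "\t", "|"]  # the four separators the page says are tested


def detect_delimiter(text: str, max_lines: int = 10) -> str:
    """Pick the candidate with the most consistent column count (hypothetical helper)."""
    lines = [line for line in text.splitlines() if line.strip()][:max_lines]
    if not lines:
        return CANDIDATES[0]

    best, best_score = CANDIDATES[0], (-1, -1)
    for delim in CANDIDATES:
        # Naive column count: occurrences of the delimiter plus one (no quote handling).
        counts = [line.count(delim) + 1 for line in lines]
        columns, frequency = Counter(counts).most_common(1)[0]
        # A delimiter that never appears gives one column on every line:
        # perfectly consistent but meaningless, so it scores zero.
        score = (frequency, columns) if columns >= 2 else (0, 0)
        if score > best_score:
            best, best_score = delim, score
    return best
```

With the input `"name;note\nx;hello, world\ny;z\n"`, the semicolon yields two columns on all three lines while the comma splits only one line, so the semicolon wins.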

The gzip estimate compresses up to 1 MB of content with pako (a JavaScript port of zlib) and multiplies the resulting ratio across the full file. CSV often compresses very well because of repeating values in categorical columns. A quick check here is useful before deciding whether to ship a dataset uncompressed, gzipped, or in a columnar format like Parquet.
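The same projection can be reproduced outside the browser. The sketch below uses Python's standard `gzip` module rather than pako, so exact byte counts may differ slightly from the tool's; the function name `estimate_gzip_size` and the 1,000,000-byte reading of "1 MB" are assumptions.

```python
import gzip


def estimate_gzip_size(sample: bytes, file_size: int, cap: int = 1_000_000) -> int:
    """Project a gzip size for the full file from a compressed sample (hypothetical helper)."""
    chunk = sample[:cap]  # compress at most `cap` bytes
    if not chunk:
        return 0
    # Compression ratio measured on the sample: compressed bytes / original bytes.
    ratio = len(gzip.compress(chunk)) / len(chunk)
    # Apply the sample's ratio to the full file size.
    return round(file_size * ratio)
```

Because the ratio comes from the head of the file only, the projection assumes the rest of the file compresses about as well as the sample.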

How to use the CSV Size Analyzer
  1. Upload a file

    Pick a CSV from your computer. Files over 5 MB are sampled so the page stays responsive even with multi-gigabyte exports.

  2. Detect the shape

    The tool walks the sample to count rows, determine the delimiter, and read the header row to count columns.

  3. Estimate gzip size

    Up to 1 MB is compressed with pako to measure the compression ratio. That ratio is applied to the full file size to project a gzip total.

Common use cases

Plan a database import

Confirm row and column counts before loading a CSV into Postgres or BigQuery so you can preallocate disk and pick a sensible batch size.

Decide whether to gzip

Compare the estimated gzip size with the original to see if it is worth compressing the file before uploading to object storage.

Confirm the delimiter

Inspect files exported from a tool that uses semicolons or tabs before opening them in software that expects comma-separated values.

Spot malformed exports

If the column count looks wrong, the delimiter is likely wrong too. The detected delimiter often reveals which separator the producing tool actually used.
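One way to run this check by hand is to split the header row with each candidate separator and compare the resulting column counts. The sketch below uses Python's standard `csv` module; the helper name `header_columns_by_delimiter` is hypothetical and the candidate list mirrors the four separators the tool tests.

```python
import csv

CANDIDATES = [",", ";", "\t", "|"]


def header_columns_by_delimiter(header_line: str) -> dict:
    """Return the header's column count under each candidate separator (hypothetical helper)."""
    result = {}
    for delim in CANDIDATES:
        # csv.reader accepts any iterable of lines; parse just the header.
        fields = next(csv.reader([header_line], delimiter=delim))
        result[delim] = len(fields)
    return result
```

A header such as `id;name;city` gives one column for comma, tab, and pipe but three for semicolon, which points to the separator the producing tool used.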

Frequently asked questions
Is my CSV uploaded to a server?

No. The file is read into memory by your browser using the File API, analyzed locally, and discarded when you leave the page. No content is sent over the network.

Why are large files sampled?

Loading a multi-gigabyte file fully into memory could crash the tab. Sampling the first 5 MB gives representative per-row averages, provided rows are roughly uniform in width, without exhausting browser memory.

How accurate is the gzip estimate?

Very close to the real result for typical CSVs, which have repeating values that compress uniformly. Estimates can differ by a few percent if the data distribution changes deep in the file.

Can it detect any delimiter?

Comma, semicolon, tab, and pipe are tested. If you use another character, the detection picks whichever candidate parses the most consistently and may not match your real separator.

Does the row count include the header?

Yes. The reported row count is the number of non-empty lines in the file, header included (for files over the sample cap, it is extrapolated from the sample). Subtract one if your file has a header and you only want data rows.
