CSV to Parquet Converter: Write Columnar Files in Your Browser
Convert a CSV file to Apache Parquet with inferred column types, locally in your browser.
Paste CSV text or upload a .csv file. The converter parses the rows, walks every column to decide whether the values are integers, doubles, or strings, then writes a Parquet file with Snappy-compressed pages using hyparquet-writer. The result downloads as a .parquet file ready to load into Spark, DuckDB, pandas, or any other Parquet-aware engine.
Parquet is the de facto storage format for analytical workloads because columnar layout plus dictionary encoding plus Snappy compression shrinks files dramatically and lets engines read only the columns they need. Most data, though, still lands as CSV: API exports, database dumps, spreadsheet downloads. Converting from one to the other has historically meant a one-off pandas script or a DuckDB CLI session.
This converter does the same job in the browser. The CSV is parsed with a hand-rolled state machine that handles quoted fields, escaped quotes, and CRLF line endings. Each column is then scanned: if every non-empty value matches the integer regex /^-?\d+$/, the column is written as INT64; otherwise, if every non-empty value parses as a finite number, it becomes DOUBLE; otherwise it's STRING. Empty cells become nulls. Header detection is opt-in via a checkbox.
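The inference rule described above can be sketched roughly as follows. This is a minimal illustration, not the converter's actual source: the helper names inferColumnType and convertColumn are hypothetical, Number() is assumed as the test for "parses as a finite number", and an all-empty column is assumed to fall back to STRING.

```javascript
// Integer pattern quoted in the text above.
const INTEGER_PATTERN = /^-?\d+$/;

// Hypothetical helper: decide a column type from its raw string cells.
function inferColumnType(values) {
  let sawValue = false;
  let allIntegers = true;
  for (const raw of values) {
    if (raw == null || raw === "") continue; // empty cells become nulls and do not vote
    sawValue = true;
    if (INTEGER_PATTERN.test(raw)) continue;
    allIntegers = false;
    // Assumption: Number() decides "finite number". Whitespace-only cells are
    // rejected explicitly because Number("  ") evaluates to 0.
    if (raw.trim() === "" || !Number.isFinite(Number(raw))) return "STRING";
  }
  if (!sawValue) return "STRING"; // assumption: nothing to infer from
  return allIntegers ? "INT64" : "DOUBLE";
}

// Hypothetical helper: turn raw cells into typed values, with nulls for empty cells.
function convertColumn(values, type) {
  return values.map((raw) => {
    if (raw == null || raw === "") return null;
    if (type === "INT64") return BigInt(raw); // 64-bit integers map to BigInt in JavaScript
    if (type === "DOUBLE") return Number(raw);
    return raw;
  });
}
```

One consequence of this rule is that identifier-like values such as ZIP codes with leading zeros match the integer pattern, so a column of them would be typed INT64 and lose the zeros.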
Writing uses hyparquet-writer, which assembles row groups, applies the default Snappy codec, and emits a valid Parquet 2.x file with proper page and footer metadata. The resulting file opens cleanly in pandas (pd.read_parquet), DuckDB (read_parquet), Arrow, and Spark. Because all of this runs locally, confidential CSVs never touch a server.
1. Load the CSV
Upload a .csv file or paste the contents into the textarea. Set the header checkbox to reflect whether the first row holds column names.
2. Convert
Click Convert to Parquet. The page parses the rows, infers per-column types (INT64, DOUBLE, or STRING), and writes a Parquet buffer.
3. Download .parquet
The output card shows row count, column count, and file size. Click Download to save the resulting .parquet file.
Shrink an archive before upload
Convert a large CSV export to Parquet before pushing it to S3 or a data lake. Compressed columnar files are often 5 to 10 times smaller than the source CSV, though the ratio depends on how repetitive the data is.
Prep data for DuckDB or pandas
Both read Parquet faster than CSV because they can load only the columns a query needs and push filters down to skip data that cannot match.
Standardize a one-off CSV
Get strict types into the file once instead of casting them inside every downstream query.
Convert without installing tooling
A teammate doesn't have pyarrow, fastparquet, or DuckDB installed. Send them the URL instead.
Does the CSV go to a server?
No. Parsing and writing both happen in your browser tab: the CSV is parsed by the page itself and the Parquet file is written by hyparquet-writer. There is no upload.
How is column type inferred?
Each column is scanned in full. If every non-empty cell matches the integer pattern, the column becomes INT64. Otherwise, if every non-empty cell parses as a finite number, it becomes DOUBLE. Otherwise it becomes STRING. Empty cells are written as nulls.
What compression is used?
Snappy, the default in hyparquet-writer. It is supported by every modern Parquet reader, including pandas, DuckDB, Spark, and Arrow.
Will it handle quoted fields with embedded commas?
Yes. The CSV parser implements the RFC 4180 quoting rules, including doubled quotes inside quoted fields and CRLF line endings.
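As an illustration of those quoting rules, a state-machine parser might look roughly like the sketch below. It is not the tool's actual parser: the function name parseCsv is hypothetical, the delimiter is assumed to be a comma, and blank lines are assumed to produce a row with a single empty field.

```javascript
// Hypothetical sketch of a CSV parser with RFC 4180 style quoting.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  let pending = false; // true once the current record has consumed any character

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    pending = true;
    if (inQuotes) {
      if (ch === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // doubled quote inside a quoted field is a literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // commas and line breaks are literal while quoted
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one line ending
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
      pending = false;
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input has no trailing line ending.
  if (pending) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

For example, the input line "x, y","say ""hi""" yields two fields under this sketch: x, y and say "hi".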
Can I control the schema explicitly?
Not in this tool; types are inferred. For column-by-column control, write the file from a script that calls parquetWriteBuffer directly with a custom column array.
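A script along those lines could build the column array itself, as in this rough sketch. The helper buildColumnData and the sample data are hypothetical; the { name, data, type } column shape and the columnData option follow hyparquet-writer's documented usage as I understand it, so check the library's README before relying on it. The library call is shown only as a comment so the block runs without dependencies.

```javascript
// Hypothetical helper: pivot parsed rows into per-column arrays with caller-chosen types.
function buildColumnData(header, rows, schema) {
  return header.map((name, index) => {
    const type = schema[name] ?? "STRING"; // assumption: unspecified columns stay text
    const data = rows.map((row) => {
      const raw = row[index];
      if (raw === undefined || raw === "") return null; // empty cells become nulls
      if (type === "INT64") return BigInt(raw);
      if (type === "INT32" || type === "DOUBLE") return Number(raw);
      return raw;
    });
    return { name, data, type };
  });
}

// Sample data for illustration only.
const columnData = buildColumnData(
  ["id", "price", "zip"],
  [
    ["1", "9.50", "02139"],
    ["2", "", "94103"],
  ],
  { id: "INT64", price: "DOUBLE", zip: "STRING" } // zip stays text to keep its leading zero
);

// In a script with hyparquet-writer installed, the array would then be passed on:
//   import { parquetWriteBuffer } from "hyparquet-writer";
//   const arrayBuffer = parquetWriteBuffer({ columnData });
```

Declaring zip as STRING here is the kind of choice inference cannot make, since a column of all-digit values would otherwise be typed INT64.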