parquet-lf
A lingua franca utility for converting between data formats (NDJSON, CSV) and Parquet.
Installation
uv tool install parquet-lf
Usage
Convert to Parquet
# Convert CSV to Parquet
parquet-lf to-parquet csv input.csv -o output.parquet
# Convert NDJSON to Parquet
parquet-lf to-parquet ndjson input.ndjson -o output.parquet
# jsonl is an alias for ndjson
parquet-lf to-parquet jsonl input.jsonl -o output.parquet
Convert from Parquet
# Convert Parquet to CSV
parquet-lf from-parquet csv input.parquet -o output.csv
# Convert Parquet to NDJSON
parquet-lf from-parquet ndjson input.parquet -o output.ndjson
# jsonl is an alias for ndjson
parquet-lf from-parquet jsonl input.parquet -o output.jsonl
Output to stdout
When the -o/--output flag is omitted, output is written to stdout:
# Output CSV to stdout
parquet-lf from-parquet csv input.parquet
# Pipe to another command
parquet-lf from-parquet csv input.parquet | head -10
# Output Parquet to stdout (binary) and redirect to file
parquet-lf to-parquet csv input.csv > output.parquet
Note: Logs are written to stderr, so they won't interfere with piped data.
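For scripted use, the stdout/stderr split means a caller can read converted data straight from the process without filtering out log lines. A minimal Python sketch, assuming only the standard library; the helper name `read_csv_from_command` is hypothetical and not part of parquet-lf:

```python
import csv
import io
import subprocess


def read_csv_from_command(cmd):
    """Run a command and parse its stdout as CSV with a header row.

    Hypothetical helper, not part of parquet-lf. stderr is captured
    separately, so log lines never end up in the parsed rows.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    reader = csv.DictReader(io.StringIO(result.stdout))
    return list(reader), result.stderr
```

With parquet-lf installed, calling `read_csv_from_command(["parquet-lf", "from-parquet", "csv", "input.parquet"])` should return the rows as dictionaries together with whatever the tool logged.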
Inspect Files
Use the info command to view file metadata and schema without loading the entire dataset:
# Show file info (schema, row count, size)
parquet-lf info examples/sample.parquet
# Show file info with preview of first N rows
parquet-lf info --head 5 examples/sample.parquet
parquet-lf info -n 5 examples/sample.csv
The info command supports all formats (Parquet, CSV, NDJSON) and auto-detects the format from the file extension.
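Extension-based detection can be pictured roughly as follows. This is an illustrative Python sketch, not parquet-lf's actual implementation: the mapping, the `detect_format` name, and the handling of `.jsonl` are all assumptions.

```python
from pathlib import Path

# Assumed mapping for illustration; the extensions parquet-lf actually
# recognizes may differ.
EXTENSION_TO_FORMAT = {
    ".parquet": "parquet",
    ".csv": "csv",
    ".ndjson": "ndjson",
    ".jsonl": "ndjson",  # assumption: treated like the jsonl alias
}


def detect_format(path):
    """Return a format name based on the file extension (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"unrecognized file extension: {suffix!r}") from None
```

A consequence of this style of detection is that a file with a missing or misleading extension cannot be identified, so naming files consistently matters.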
Help
parquet-lf --help
parquet-lf to-parquet --help
parquet-lf from-parquet --help
parquet-lf info --help
Supported Formats
NDJSON (Newline Delimited JSON)
NDJSON is a text format in which each line is a complete, valid JSON object representing one record. Like CSV, it is row-oriented, which makes it well suited to exchanging tabular data.
Example NDJSON file:
{"name": "alice", "value": 10}
{"name": "bob", "value": 20}
{"name": "charlie", "value": 30}
Both ndjson and jsonl commands are supported as synonyms.
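Because every line is an independent JSON document, an NDJSON file can be read one line at a time. A minimal Python sketch using only the standard library; `parse_ndjson` is a hypothetical helper, and the data is the example file shown above:

```python
import json

NDJSON_TEXT = """\
{"name": "alice", "value": 10}
{"name": "bob", "value": 20}
{"name": "charlie", "value": 30}
"""


def parse_ndjson(text):
    """Parse NDJSON text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = parse_ndjson(NDJSON_TEXT)
```

Here `records` is a list of three dictionaries, and `value` arrives as an integer because JSON preserves the number type.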
CSV
Standard comma-separated values format with a header row.
Example CSV file:
name,value
alice,10
bob,20
charlie,30
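To show how the CSV and NDJSON examples describe the same rows, here is a small Python sketch that turns one into the other with the standard library. It does not reflect how parquet-lf converts data; `csv_to_ndjson` and its `int_columns` parameter are hypothetical, and the explicit cast is needed only because the `csv` module reads every field as a string.

```python
import csv
import io
import json

CSV_TEXT = """\
name,value
alice,10
bob,20
charlie,30
"""


def csv_to_ndjson(text, int_columns=()):
    """Convert header-row CSV text to NDJSON, one JSON object per row.

    csv.DictReader yields every field as a string, so columns listed in
    int_columns are cast to int explicitly.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        for column in int_columns:
            row[column] = int(row[column])
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


ndjson_text = csv_to_ndjson(CSV_TEXT, int_columns=("value",))
```

The result matches the NDJSON example line for line, which illustrates why the two formats are interchangeable for flat, tabular data.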
Example Files
The examples/ directory contains sample data files for experimenting with the CLI:
- examples/sample.parquet - Parquet format
- examples/sample.csv - CSV format
- examples/sample.ndjson - NDJSON format
These files contain the same 5-row dataset with columns: id, name, age, city, score.
Development
See CONTRIBUTING.md for development setup and guidelines.
License
MIT License - see LICENSE for details.