A utility for converting Parquet files into CSV and vice versa.
ParquetConv
A command-line tool for converting between Parquet and CSV file formats using pandas.
Features
- Automatic format detection: Determines whether the input file is Parquet or CSV
- Bidirectional conversion: Convert Parquet to CSV or CSV to Parquet
- Flexible output naming: Auto-generates output filenames or allows custom naming
- Error handling: Reports missing files, unsupported formats, and read/write failures with informative messages
- Force conversion: Option to force conversion even with uncertain file formats
Installation
The project uses uv for dependency management. Install dependencies with:
uv sync
Usage
Basic Usage
Convert a Parquet file to CSV:
python main.py input.parquet
Convert a CSV file to Parquet:
python main.py input.csv
Advanced Usage
Specify a custom output filename:
python main.py input.parquet -o custom_output.csv
python main.py input.csv -o custom_output.parquet
Force conversion (useful when file format detection is uncertain):
python main.py input_file --force
Command Line Options
- input_file: Path to the input file (required)
- -o, --output: Custom output file path (optional)
- --force: Force conversion even if file format detection is uncertain
- -h, --help: Show help message
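The options above map naturally onto an `argparse` parser. The following is a minimal sketch, not the project's actual source: the function name `build_parser` and the help strings are assumptions made for illustration, and only the option names come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Convert between Parquet and CSV files.",
    )
    # Required positional argument: the file to convert.
    parser.add_argument("input_file", help="Path to the input file")
    # Optional output path; None means "derive a name from the input".
    parser.add_argument("-o", "--output", default=None, help="Custom output file path")
    # Boolean switch, False unless the flag is given.
    parser.add_argument(
        "--force",
        action="store_true",
        help="Force conversion even if file format detection is uncertain",
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["data.csv", "-o", "processed_data.parquet"])
print(args.input_file, args.output, args.force)
```

`argparse` adds `-h, --help` automatically, so that option needs no explicit definition.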
Examples
# Convert Parquet to CSV with auto-generated filename
python main.py data.parquet
# Output: data.csv
# Convert CSV to Parquet with custom filename
python main.py data.csv -o processed_data.parquet
# Convert with force flag
python main.py unknown_file --force
Requirements
- Python 3.9+
- pandas >= 2.3.2
- pyarrow >= 21.0.0
How It Works
- File Detection: The tool first checks the file extension, then attempts to read the file to determine its format
- Format Conversion: Uses pandas to read the input file and convert it to the opposite format
- Output Generation: Creates the output file with an appropriate extension if not specified
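The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the tool's real implementation: the function names `detect_format`, `default_output_path`, and `convert` are hypothetical, and the content check relies on the `PAR1` magic bytes that open every Parquet file, which may differ from how the tool inspects content.

```python
from pathlib import Path
from typing import Optional

PARQUET_MAGIC = b"PAR1"  # Parquet files begin (and end) with these four bytes


def detect_format(path: str) -> str:
    """Return "parquet" or "csv"; raise ValueError if neither can be determined."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix == ".parquet":
        return "parquet"
    if suffix == ".csv":
        return "csv"
    # Extension was inconclusive, so inspect the first bytes of the file.
    with p.open("rb") as fh:
        head = fh.read(4)
    if head == PARQUET_MAGIC:
        return "parquet"
    raise ValueError(f"Cannot determine the format of {path}")


def default_output_path(path: str, fmt: str) -> str:
    """Swap the extension for the opposite format, e.g. data.parquet -> data.csv."""
    new_suffix = ".csv" if fmt == "parquet" else ".parquet"
    return str(Path(path).with_suffix(new_suffix))


def convert(path: str, output: Optional[str] = None) -> str:
    """Convert the file to the opposite format and return the output path."""
    import pandas as pd  # imported lazily; Parquet I/O also needs pyarrow

    fmt = detect_format(path)
    out = output or default_output_path(path, fmt)
    if fmt == "parquet":
        pd.read_parquet(path).to_csv(out, index=False)
    else:
        pd.read_csv(path).to_parquet(out, index=False)
    return out
```

Passing `index=False` keeps the pandas row index out of the output, so a round trip does not add an extra column.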
Error Handling
The tool provides clear error messages for:
- Missing input files
- Unsupported file formats
- Read/write errors during conversion
- Invalid file content
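One way to turn these failure cases into messages is to catch exceptions at the top level and return an exit code. The sketch below is an assumption about structure, not the tool's own code: `run_safely` and the message wording are hypothetical, and the conversion function is passed in as a parameter so the wrapper stays independent of pandas.

```python
import sys
from typing import Callable


def run_safely(convert: Callable[[str], str], input_file: str) -> int:
    """Call convert(input_file) and translate failures into a message and exit code."""
    try:
        out = convert(input_file)
    except FileNotFoundError:
        # Must come before OSError, which is its parent class.
        print(f"Error: input file not found: {input_file}", file=sys.stderr)
        return 1
    except ValueError as exc:
        # Covers unsupported formats and invalid file content.
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    except OSError as exc:
        # Remaining read/write failures, such as permission errors.
        print(f"Error: could not read or write file: {exc}", file=sys.stderr)
        return 1
    print(f"Wrote {out}")
    return 0
```

Returning an integer rather than calling `sys.exit` inside the function keeps the wrapper easy to test; a caller can pass the result to `sys.exit` itself.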
License
This project is open source and available under the MIT License.
Download files
File details
Details for the file parquetconv-0.1.0.tar.gz.
File metadata
- Download URL: parquetconv-0.1.0.tar.gz
- Upload date:
- Size: 15.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96ceacf53c3454e6c457b17ffce3f876254daef9fce35180dd56526bce821f44 |
| MD5 | 84e289404d6f065f14c2be43973e3db9 |
| BLAKE2b-256 | d2c43ae7c9f674d835505ad7041dd83dda9a0a2f955863d3245fa4307b007f24 |
File details
Details for the file parquetconv-0.1.0-py3-none-any.whl.
File metadata
- Download URL: parquetconv-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7fce1470fb4c4e5a29fed2a1cde84270e470905c1e99b14e579658f507971d1e |
| MD5 | 4a3a66cf051a4e71af75ced052f740b6 |
| BLAKE2b-256 | 42bc02083795e34622c311c325f6303c1099c128b073c91a24319259ccd01b06 |