cdef-converter
cdef-converter is a Python CLI tool that converts CSV files to Parquet format efficiently.
Features
- Convert multiple CSV files to Parquet format in parallel
- Detect file encoding automatically
- Generate summary of processed files
- Progress tracking with rich console output
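Automatic encoding detection can be pictured as reading a leading chunk of each file and testing it against candidate encodings. The sketch below is only an illustration under assumptions: `guess_encoding` and the candidate list are hypothetical names, and the README does not say which detection method or library cdef-converter actually uses.

```python
import codecs
from typing import Optional, Sequence

# Assumed candidate list, checked in order; not taken from cdef-converter.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def guess_encoding(
    path: str,
    chunk_size_mb: float = 1,
    candidates: Sequence[str] = CANDIDATE_ENCODINGS,
) -> Optional[str]:
    """Return the first candidate that decodes the leading chunk of the file."""
    chunk_bytes = int(chunk_size_mb * 1024 * 1024)
    with open(path, "rb") as handle:
        sample = handle.read(chunk_bytes)

    for name in candidates:
        # An incremental decoder with final=False tolerates a multi-byte
        # character that was cut off at the end of the chunk.
        decoder = codecs.getincrementaldecoder(name)(errors="strict")
        try:
            decoder.decode(sample, final=False)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

Trying strict encodings first and permissive single-byte ones last matters here, because a single-byte encoding accepts almost any byte sequence.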
Installation
pip install cdef-converter
Usage
cdef-converter /path/to/input/directory --processes 4
Options
- input_directory: Path to the directory containing CSV files (required)
- output_directory: Path to the directory where Parquet files will be saved (default: ./registers)
- --processes: Number of processes to use for parallel processing (default: 4)
- --encoding-chunk-size: Chunk size in MB for encoding detection (default: 1 MB)
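To make the option list concrete, here is a minimal sketch of an equivalent command-line interface built with the standard library's `argparse`. It is an assumption for illustration: the README does not say which CLI library cdef-converter uses, and treating `output_directory` as an optional positional argument is a guess based on the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="cdef-converter",
        description="Convert CSV files to Parquet format.",
    )
    parser.add_argument(
        "input_directory",
        help="Path to the directory containing CSV files",
    )
    # Assumed to be an optional positional argument with the documented default.
    parser.add_argument(
        "output_directory",
        nargs="?",
        default="./registers",
        help="Path to the directory where Parquet files will be saved",
    )
    parser.add_argument(
        "--processes",
        type=int,
        default=4,
        help="Number of processes to use for parallel processing",
    )
    parser.add_argument(
        "--encoding-chunk-size",
        type=int,
        default=1,
        help="Chunk size in MB for encoding detection",
    )
    return parser


# Parse the command line shown in the Usage section.
args = build_parser().parse_args(["/path/to/input/directory", "--processes", "4"])
```

With this parser, omitted options fall back to the documented defaults, so `args.output_directory` is `./registers` and `args.encoding_chunk_size` is `1`.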
Output
- Parquet files are saved in /path/to/your/fixed/output/directory/registers
- A summary JSON file is generated at register_summary.json
License
This project is licensed under the MIT License - see the LICENSE file for details.
Recent Changes
Encoding Chunk Size Modification
We've updated the encoding chunk size option to use megabytes (MB) instead of kilobytes (KB) for easier user input:
- The default encoding chunk size is now 1 MB.
- Users can specify the encoding chunk size in MB using the --encoding-chunk-size option.
- Internally, the program converts the MB value to bytes for processing.
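The MB-to-bytes conversion described above amounts to a single multiplication. The sketch below assumes binary megabytes (1 MB = 1024 × 1024 bytes); the README does not state which definition the tool uses, and `mb_to_bytes` is a hypothetical helper name.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def mb_to_bytes(size_mb: float) -> int:
    """Convert a chunk size given in MB to a whole number of bytes."""
    if size_mb <= 0:
        raise ValueError("chunk size must be positive")
    return int(size_mb * BYTES_PER_MB)
```

Under that assumption, the default of 1 MB corresponds to 1,048,576 bytes read per file for encoding detection.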
Error Handling Improvements
- Enhanced exception handling throughout the codebase.
- Added more specific error messages for common issues like file not found and permission errors.
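As an illustration of what specific error messages for these cases can look like, the sketch below maps common I/O exceptions to readable text. The function name and message wording are hypothetical and not taken from cdef-converter's code.

```python
from typing import Optional


def describe_read_error(path: str) -> Optional[str]:
    """Try to open a file and return a specific message if that fails."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
    except FileNotFoundError:
        return f"File not found: {path}"
    except PermissionError:
        return f"Permission denied: {path}"
    except OSError as exc:
        # Fallback for other I/O problems, such as the path being a directory.
        return f"Could not read {path}: {exc}"
    return None
```

The specific exception classes are listed before `OSError` because both are subclasses of it; reversing the order would hide the more helpful messages.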
Type Hinting Updates
- Updated type hints to be compatible with Python 3.12.
- Replaced MPQueue with Queue from the queue module for better type checking.
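A brief sketch of why `queue.Queue` is convenient in annotations: it is a generic class, so a type checker can verify what goes in and what comes out. The `drain` helper and the choice of `str` items are assumptions for illustration; the README does not show where the project uses the annotation.

```python
from __future__ import annotations

from queue import Empty, Queue


def drain(progress_queue: Queue[str]) -> list[str]:
    """Collect every item currently waiting in the queue without blocking."""
    items: list[str] = []
    while True:
        try:
            items.append(progress_queue.get_nowait())
        except Empty:
            return items
```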
Code Structure and Style
- Improved code organization and readability.
- Added or updated docstrings for better function documentation.
Performance Optimization
- Implemented dynamic chunk size adjustment for very large files in the encoding detection process.
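One way to picture dynamic chunk size adjustment is a function that keeps the base chunk for ordinary files and scales it for very large ones. Everything specific here is an assumption: the README gives neither the thresholds nor the direction of the adjustment, so the 1024 MB cutoff, the 16 MB cap, and the choice to grow the chunk are hypothetical.

```python
MB = 1024 * 1024


def effective_chunk_bytes(
    file_size_bytes: int,
    base_chunk_mb: int = 1,
    large_file_mb: int = 1024,  # hypothetical cutoff for "very large"
    max_chunk_mb: int = 16,     # hypothetical upper bound
) -> int:
    """Return the number of bytes to sample for encoding detection."""
    base = base_chunk_mb * MB
    threshold = large_file_mb * MB
    if file_size_bytes <= threshold:
        return base
    # Grow the sample with file size, but never beyond the cap.
    scaled = base * (file_size_bytes // threshold + 1)
    return min(scaled, max_chunk_mb * MB)
```

With these assumed values, a 10 MB file is sampled with the 1 MB default, while a 3 GB file is sampled with 4 MB.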