CLI tool to benchmark compression algorithms on Parquet datasets
compressbench
Benchmark compression algorithms on Parquet datasets.
Why
Compression settings affect performance, storage cost, and latency, but most data engineers inherit the defaults without ever testing them.
compressbench benchmarks compression ratio, compression speed, and decompression speed across algorithms, using your own Parquet files as input.
Features
- Accepts local Parquet files as input.
- Supports gzip and snappy.
- Outputs:
  - Compression ratio.
  - Compression time.
  - Decompression time.
- CLI built with Typer.
- Unit tests with pytest.
Installation
pip install compressbench
Usage
compressbench input.parquet --algorithms gzip snappy

If --algorithms is omitted, compressbench runs benchmarks for all available algorithms.
Example Output

Algorithm: gzip
Compression ratio: 2.91
Compression time: 0.43s
Decompression time: 0.12s

Algorithm: snappy
Compression ratio: 1.67
Compression time: 0.12s
Decompression time: 0.05s
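For a sense of how figures like these can be measured, here is a minimal sketch. It is not compressbench's implementation. To stay dependency-free it times the standard-library gzip and zlib codecs on raw bytes rather than Parquet column data, and snappy is left out because it is not in the standard library. The `benchmark` function, the `CODECS` table, and the definition of the ratio as uncompressed size divided by compressed size are assumptions made for illustration.

```python
import gzip
import time
import zlib


def benchmark(data: bytes, compress, decompress) -> dict:
    """Time one compress/decompress round trip and report the size ratio."""
    start = time.perf_counter()
    compressed = compress(data)
    compress_time = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(compressed)
    decompress_time = time.perf_counter() - start

    # A benchmark result is meaningless if the codec did not round-trip.
    if restored != data:
        raise ValueError("round trip did not reproduce the input")

    return {
        # Assumed definition: uncompressed size divided by compressed size.
        "ratio": len(data) / len(compressed),
        "compress_time": compress_time,
        "decompress_time": decompress_time,
    }


# Hypothetical codec table: stdlib stand-ins, not the tool's Parquet codecs.
CODECS = {
    "gzip": (gzip.compress, gzip.decompress),
    "zlib": (zlib.compress, zlib.decompress),
}

if __name__ == "__main__":
    sample = b"user_id,event,timestamp\n" * 10_000
    for name, (compress, decompress) in CODECS.items():
        result = benchmark(sample, compress, decompress)
        print(f"Algorithm: {name}")
        print(f"Compression ratio: {result['ratio']:.2f}")
        print(f"Compression time: {result['compress_time']:.2f}s")
        print(f"Decompression time: {result['decompress_time']:.2f}s")
```

A single timed run like this is noisy. Repeating each measurement and taking the median would give steadier numbers.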
CLI Options
input.parquet   Path to the Parquet file to benchmark.
--algorithms    List of algorithms to test (gzip, snappy).
--level         Not supported in v0.1.0. Reserved for future versions.
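The option semantics above, including the fall-back to all algorithms when --algorithms is omitted, can be sketched as follows. The tool itself is built with Typer, so this is not its actual code: the sketch swaps in the standard-library argparse module to stay self-contained, and `build_parser`, `resolve_algorithms`, and `AVAILABLE` are hypothetical names.

```python
import argparse

# Codecs this README lists for v0.1.0.
AVAILABLE = ["gzip", "snappy"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the invocation shown under Usage."""
    parser = argparse.ArgumentParser(prog="compressbench")
    parser.add_argument("input", help="Path to the Parquet file to benchmark.")
    parser.add_argument(
        "--algorithms",
        nargs="+",          # one or more space-separated names
        choices=AVAILABLE,  # anything else is rejected with a usage error
        default=None,
        help="Algorithms to test; defaults to all available.",
    )
    return parser


def resolve_algorithms(args: argparse.Namespace) -> list[str]:
    """Fall back to every available algorithm when none were requested."""
    return list(args.algorithms) if args.algorithms else list(AVAILABLE)


if __name__ == "__main__":
    demo = build_parser().parse_args(["input.parquet", "--algorithms", "gzip"])
    print(demo.input, resolve_algorithms(demo))
```

The --level option is left out of the sketch because v0.1.0 does not support it.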
Roadmap
See ROADMAP.md for planned features.
License
MIT
File details
Details for the file compressbench-0.1.0.tar.gz.
File metadata
- Download URL: compressbench-0.1.0.tar.gz
- Upload date:
- Size: 5.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41204ae22615ca25d59b68c65074fefc632270bd0c849690f34335ec23e9905f |
| MD5 | 33730f3aec0a928c518ea4e9e9c7aaba |
| BLAKE2b-256 | 4555f6dc3b34a296c84f97f765aff73e3b1c20751555300b7d07a8521c1235ad |
File details
Details for the file compressbench-0.1.0-py3-none-any.whl.
File metadata
- Download URL: compressbench-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a9bc5fb5baa4494d2c611f9032d653b44279eaec398d4dd7f3cfcbd6ebb009a3 |
| MD5 | 8922f2678a1fc4227ddac0425f0e1f6e |
| BLAKE2b-256 | 234eef948c448b1103a800487c11d1e38c0e3485ff3c831b100f0c3a1d4c7d90 |
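A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the standard library. The helper names `sha256_of` and `matches` are hypothetical, and the sketch assumes the files sit in the current directory under the names shown in the file details.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "compressbench-0.1.0.tar.gz":
        "41204ae22615ca25d59b68c65074fefc632270bd0c849690f34335ec23e9905f",
    "compressbench-0.1.0-py3-none-any.whl":
        "a9bc5fb5baa4494d2c611f9032d653b44279eaec398d4dd7f3cfcbd6ebb009a3",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    for name, expected in EXPECTED_SHA256.items():
        path = Path(name)  # assumed location: current directory
        if path.exists():
            print(name, "OK" if matches(path, expected) else "MISMATCH")
        else:
            print(name, "not found")
```

pip can enforce the same check at install time through its hash-checking mode, which reads pinned `--hash=sha256:...` entries from a requirements file.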