Pure in-memory ZPAQ compression for Python (real pybind11 bindings, prebuilt wheels, no C++ toolchain needed to install).
zpaq
Pure in-memory ZPAQ compression for Python — up to 5.9× faster than the official zpaq CLI at the same ratio (with optional fragment-level deduplication), byte-exact CLI-interoperable in both directions, multi-threaded compress + decompress, prebuilt wheels for every modern Python on Windows / Linux / macOS, zero C++ toolchain or runtime dependencies to install.
import zpaq
blob = zpaq.compress(b"hello world " * 1_000, level=3) # bytes -> bytes
assert zpaq.decompress(blob) == b"hello world " * 1_000
By default zpaq.compress(...) auto-scales across all CPU cores (threads=0). If you want the absolute best compression ratio (around 0.5-3 percentage points better) at the cost of throughput, pass threads=1 to keep the input as a single block:
blob = zpaq.compress(big_data, level=5, threads=1) # max ratio, single-thread
For inputs with repeated content (logs, large text corpora, similar binaries, snapshots), pass dedup=True to get fragment-level deduplication: the input is split into ~64 KB content-defined chunks, and identical chunks are stored once. The result matches what the official zpaq a CLI produces, so the output can be extracted with zpaq x:
blob = zpaq.compress(repetitive_data, level=5, dedup=True) # JIDAC archive
Why this exists
Every other ZPAQ binding on PyPI shells out to the zpaq executable, which forces temp files and a subprocess fork. This package is both:
- A real pybind11 binding around libzpaq (Matt Mahoney's underlying C++ library, the same one the official zpaq CLI is built on top of), wrapping abstract Reader/Writer adapters that read from and write to bytes objects with no filesystem detour.
- Distributed as prebuilt wheels for Windows, Linux, and macOS (including Apple Silicon) across Python 3.8 through 3.13. Installing it never compiles anything.
On Windows, the wheel statically links the C and C++ runtimes so users don't need any "Visual C++ Redistributable" installed — if Python runs, zpaq works.
Performance
Speedup vs the official zpaq.exe -m5 (Ryzen-class 12-core x86_64, level 5):
| workload | zpaq.exe -m5 | best zpaq.compress | speedup |
|---|---|---|---|
| 40 KB text | 0.16 s | 0.11 s | 1.5× |
| 1 MB text | 2.13 s | 0.49 s (t=12) | 4.3× |
| 10 MB text | 23.17 s | 3.91 s (t=12) | 5.9× |
| 125 MB text | 183.77 s | 55.42 s (t=12) | 3.3× |
Full benchmark by thread count below. CLI is the official zpaq.exe v7.15 invoked with -m5 (the speeds shown include its -t0 default of two worker threads). mem(t=N) is zpaq.compress(data, level=5, threads=N). Times in seconds; ratio is bytes-reduced over original.
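The two derived figures in these tables are plain arithmetic. A minimal sketch follows, using hypothetical helper names (`ratio_percent` and `speedup` are illustrative and not part of the package), applied to the 10 MB timings quoted in this section:

```python
def ratio_percent(original_size: int, compressed_size: int) -> float:
    """Bytes removed as a percentage of the original size (higher is better)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# 10 MB text row: zpaq.exe -m5 took 23.17 s, zpaq.compress(t=12) took 3.91 s.
print(f"{speedup(23.17, 3.91):.1f}x")
# A hypothetical 1000-byte input that compresses to 200 bytes.
print(f"{ratio_percent(1000, 200):.1f} %")
```

Under this definition a ratio of 84.2 % means the output is 15.8 % of the input size.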
40 KB text:
| algo | compress | decompress | ratio % |
|---|---|---|---|
| zpaq.exe -m5 | 0.16 s | 0.15 s | 71.5 % |
| zpaq.compress(t=1) | 0.11 s | 0.11 s | 73.4 % |
1 MB text:
| algo | compress | decompress | ratio % |
|---|---|---|---|
| zpaq.exe -m5 | 2.13 s | 2.16 s | 80.0 % |
| zpaq.compress(t=1) | 1.97 s | 2.03 s | 80.1 % |
| zpaq.compress(t=4) | 0.73 s | 2.27 s | 79.3 % |
| zpaq.compress(t=12) | 0.49 s | 2.31 s | 77.6 % |
10 MB text:
| algo | compress | decompress | ratio % |
|---|---|---|---|
| zpaq.exe -m5 | 23.17 s | 23.82 s | 84.2 % |
| zpaq.compress(t=1) | 21.07 s | 22.23 s | 84.2 % |
| zpaq.compress(t=4) | 6.83 s | 21.17 s | 82.8 % |
| zpaq.compress(t=12) | 3.91 s | 21.37 s | 81.2 % |
125 MB text:
| algo | compress | decompress | ratio % |
|---|---|---|---|
| zpaq.exe -m5 | 183.77 s | 187.99 s | 86.7 % |
| zpaq.compress(t=4) | 101.40 s | 296.47 s | 85.8 % |
| zpaq.compress(t=8) | 66.03 s | 297.45 s | 85.0 % |
| zpaq.compress(t=12) | 55.42 s | 289.96 s | 84.5 % |
How the speedup is achieved. libzpaq's reference compiler emits an interpreter for the per-byte context-mixing predictor at compression levels 3-5. The official zpaq.exe on x86_64 ships with that interpreter replaced by a JIT that translates the predictor bytecode into native machine code at archive-open time. This package's x86_64 wheels enable the same JIT path plus:
- multi-threaded block compression via threads=N (the official CLI tops out at 2 cores by default)
- multi-threaded block decompression: we scan the archive for ZPAQ locator-tag block boundaries, dispatch each block to a worker, and concatenate. The official libzpaq API exposes only sequential decompress; doing it block-parallel makes our 125 MB decompress about 3× faster than zpaq.exe x.
- skip-checksum-by-default (verify=False), since pure-data workflows rarely need the SHA-1 per block that zpaq.exe always computes
- AVX2-enabled compile flags (auto-vectorization on x86_64; CPUs from 2013+ are covered, older fall back to the sdist build)
- a libsais-backed suffix array constructor for level-3 BWT mode (Apache 2.0, several times faster than libzpaq's vendored libdivsufsort-lite)
- a faster decompress path for archives produced by zpaq.compress (avoids the JIDAC-aware per-segment buffering)
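The scan, dispatch, and concatenate idea behind block-parallel decompression can be sketched with a toy container. Everything here is an assumption made for illustration: `TAG` is a made-up marker (not the real ZPAQ locator tag), zlib stands in for libzpaq, and the helper names are hypothetical rather than this package's internals.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Made-up marker for this toy container; NOT the real ZPAQ locator tag.
TAG = b"\x00TOY-BLOCK-\xff\x00"


def pack_blocks(data: bytes, block_size: int) -> bytes:
    """Compress fixed-size blocks independently, each preceded by TAG."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        out += TAG + zlib.compress(data[start:start + block_size])
    return bytes(out)


def split_blocks(archive: bytes) -> List[bytes]:
    """Scan for TAG and return the compressed payload that follows each one."""
    # A payload could in principle contain TAG by chance; a real format needs
    # a stronger guarantee than this toy provides.
    return archive.split(TAG)[1:]


def parallel_unpack(archive: bytes, workers: int = 4) -> bytes:
    """Decompress every block on a thread pool and concatenate in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the join restores the original bytes.
        return b"".join(pool.map(zlib.decompress, split_blocks(archive)))


data = b"hello world " * 50_000
assert parallel_unpack(pack_blocks(data, block_size=64 * 1024)) == data
```

Threads are enough here because zlib, like the bindings described in this README, releases the GIL while it works. The blocks must have been compressed independently for this to be possible at all.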
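Compress time drops steadily as threads are added, though short of linearly: on the 10 MB workload it goes from 21.07 s at t=1 to 6.83 s at t=4 (about 3.1×) and 3.91 s at t=12 (about 5.4×). Compression ratio drops slightly as threads increase (more block boundaries reduce per-block context size); at t=1 the ratio matches or beats the CLI on every workload where t=1 was measured (40 KB, 1 MB, 10 MB).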
Pass dedup=True to match zpaq.exe's ratio on inputs with repeated content — fragment-level dedup splits the input into ~64 KB content-defined chunks and stores each unique chunk once. The output is a JIDAC archive that both this package's decompress and the official zpaq x CLI extract byte-exactly. Default behavior remains the raw streaming format (faster, slightly worse ratio on heavily-repetitive inputs).
ARM / Apple Silicon wheels disable the x86-only JIT and AVX2 flags but still benefit from threading, libsais, and the fast decompress path.
API
zpaq.compress(
    data,           # bytes-like
    level=5,        # 0..5 (0=store, 5=strongest)
    threads=0,      # 0 (default) = auto-detect host CPU count, clamped
                    # by input size (64KB minimum chunk per worker).
                    # 1 = single-thread, deterministic, best ratio.
                    # N>1 = pin to exactly N workers.
    hints=False,    # If True, scan input for text/exe signatures and order-1
                    # redundancy, pass them to libzpaq via the method string.
                    # Slight overhead, helps ratio on some mixed/binary data.
    verify=False,   # If True, compute & embed SHA-1 per segment. zpaq.exe
                    # also writes these by default; turning them off makes
                    # both this package and zpaq.exe skip verification on
                    # extract, which is faster but won't catch corruption.
    method=None,    # Optional raw libzpaq method-string override (e.g. "x4,4,1"
                    # for custom predictor specs). Overrides level/hints when set.
    dedup=False,    # If True, emit a JIDAC-format archive with fragment-level
                    # deduplication. Input is content-defined-chunked into ~64KB
                    # fragments, identical fragments are stored once. Output is
                    # zpaq.exe-extractable. Improves ratio on repetitive data;
                    # currently single-threaded encode.
) -> bytes

zpaq.decompress(
    data,           # bytes-like ZPAQ stream
    verify=False,   # If True, recompute SHA-1 of each segment and compare to
                    # the one stored in the archive. Raises zpaq.Error on
                    # mismatch. Default off for speed.
) -> bytes

zpaq.Error          # Raised on libzpaq failures (corrupt stream, bad header, etc.)
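The threads comment above can be read as a small clamping rule. The sketch below is one plausible reading of that comment, not the package's actual code: `effective_workers`, `MIN_CHUNK`, and the `cpu_count` override are hypothetical names introduced for illustration.

```python
import os

MIN_CHUNK = 64 * 1024  # the 64 KB per-worker minimum quoted in the signature


def effective_workers(input_size: int, threads: int = 0, cpu_count: int = 0) -> int:
    """Guess how many workers a given threads= value would use (assumed rule)."""
    if threads == 1:
        return 1  # single block, best ratio
    if threads > 1:
        return threads  # "pin to exactly N workers"
    # threads == 0: auto-detect, then clamp so each worker gets >= 64 KB.
    cpus = cpu_count or os.cpu_count() or 1
    return max(1, min(cpus, input_size // MIN_CHUNK))


print(effective_workers(40 * 1024, cpu_count=12))         # small input
print(effective_workers(10 * 1024 * 1024, cpu_count=12))  # large input
```

Under this reading a 40 KB input is too small to split, which would be consistent with the 40 KB benchmark listing only a t=1 row.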
Both compress and decompress release the GIL while libzpaq runs, so zpaq plays well with threaded workloads.
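Because the GIL is released, independent payloads can be compressed concurrently from ordinary Python threads. A minimal sketch, assuming a hypothetical helper `compress_many`; it defaults to `zlib.compress` as a stand-in so it runs without the package installed:

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def compress_many(
    payloads: Iterable[bytes],
    compress: Callable[[bytes], bytes] = zlib.compress,  # stand-in codec
    workers: int = 4,
) -> List[bytes]:
    """Compress independent payloads concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress, payloads))


payloads = [bytes([i]) * 100_000 for i in range(8)]
blobs = compress_many(payloads)
assert [zlib.decompress(b) for b in blobs] == payloads

# With the package installed, the same helper could be called as:
#   from functools import partial
#   import zpaq
#   blobs = compress_many(payloads, partial(zpaq.compress, level=3, threads=1))
```

Passing threads=1 in that last call keeps each compress single-threaded, so the pool rather than the library decides how many cores are in use.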
Compatibility with the zpaq CLI
zpaq.compress() emits the same on-disk format libzpaq itself writes, and zpaq.decompress() understands archives produced by the zpaq a journaling archiver (it identifies the JIDAC index/hash/info segments, discards them, and strips each data segment's trailing fragment-size footer so the recovered bytes match the original file exactly).
Tested on ten varied real files (1 KB to 25 MB, text/image/csv/jar/png/jpg/svg/exe/binary, compression levels 1-5):
| Direction | Result |
|---|---|
| zpaq.compress → zpaq.decompress | 10 / 10 byte-exact |
| zpaq.compress → official zpaq x CLI | 10 / 10 byte-exact |
| official zpaq a CLI → zpaq.decompress | 10 / 10 byte-exact |
import zpaq

# Pipe to the official CLI
with open("out.zpaq", "wb") as f:
    f.write(zpaq.compress(my_bytes, level=5))
# ...later, from any machine with the zpaq executable installed:
# $ zpaq x out.zpaq

# Read an archive that someone else produced with `zpaq a`
with open("their.zpaq", "rb") as f:
    file_bytes = zpaq.decompress(f.read())
When zpaq.decompress is fed a multi-file archive it returns the concatenated bytes of every file in the order the CLI stored them. A future release will expose a per-segment iterator API so individual files can be addressed by name.
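Until such an API exists, a caller who already knows each file's size from some other source could slice the concatenated result by hand. This is only a sketch under that assumption: `split_by_sizes` is a hypothetical helper, and nothing here reads sizes or names from the archive itself.

```python
from typing import List, Sequence


def split_by_sizes(blob: bytes, sizes: Sequence[int]) -> List[bytes]:
    """Slice a concatenated payload into pieces of the given lengths."""
    if sum(sizes) != len(blob):
        raise ValueError("sizes do not add up to the payload length")
    pieces, offset = [], 0
    for size in sizes:
        pieces.append(blob[offset:offset + size])
        offset += size
    return pieces


# Three hypothetical files of 3, 2, and 1 bytes, concatenated in stored order.
print(split_by_sizes(b"aaabbc", [3, 2, 1]))
```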
Future work
A few performance levers are still on the table; pull requests welcome:
- Profile-guided optimization (PGO). Adding /GENPROFILE + /USEPROFILE to the MSVC build (and equivalents on gcc/clang) typically gains another 5-15%. Skipped here because cibuildwheel doesn't expose a clean two-stage build hook yet.
- Hand-written SIMD in the predictor. /arch:AVX2 is enabled so the compiler auto-vectorizes where it can. The actual hot loop at compression levels 3-5 is the JIT-emitted predictor, which currently emits one x86 instruction at a time; rewriting the JIT to emit AVX2 mul-add chains for the MIX/ISSE components would be a real gain.
- Parallel JIDAC encode. dedup=True is currently single-threaded; splitting the fragment-build pass across cores would speed up large dedup compresses.
- Per-segment archive API. zpaq.decompress currently returns the concatenated bytes of every segment in a multi-file zpaq a archive. A future iterator API would let callers address individual files by name.
License
This package is released under the same terms as the underlying libzpaq sources: public domain. See src/zpaq/vendor/COPYING.
The vendored libsais suffix array library is Apache 2.0 (Ilya Grebnov). See src/zpaq/vendor/LICENSE-libsais.
Not affiliated with Matt Mahoney. libzpaq was released into the public domain by its original author; this Python package wraps those sources and is an independent community project.