Asynchronous gzip file reader/writer with aiocsv support.
aiogzip ⚡️
An asynchronous library for reading and writing gzip-compressed files.
aiogzip provides a fast, simple, and asyncio-native interface for handling .gz files, making it a useful complement to Python's built-in gzip module for asynchronous applications.
Features
- Truly Asynchronous: Built with `asyncio` and `aiofiles`.
- High-Performance: Optimized buffer handling for fast I/O.
- Drop-in Replacement: Mimics `gzip.open()` with async `seek`, `tell`, `peek`, and `readinto` support; verified against tarfile-style access patterns and aiocsv workflows.
- Reproducible Archives: Control gzip `mtime` and embedded filenames.
- Type-Safe: Distinct `AsyncGzipBinaryFile` and `AsyncGzipTextFile`.
- `aiocsv` Ready: Seamless integration for CSV pipelines.
- Predictable Performance: Backward seeks rewind the stream and re-decompress data (same as `gzip.GzipFile`), so treat random access as O(n) and prefer forward-only patterns when possible.
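The O(n) cost of backward seeks follows from the fact that a deflate stream can only be decoded front to back. The sketch below illustrates the idea with the standard library's `zlib` only. `RewindingGzipReader` and its `restarts` counter are hypothetical names made up for this illustration, not part of aiogzip, and the class only handles a single in-memory gzip member.

```python
import gzip
import zlib


class RewindingGzipReader:
    """Toy reader: forward reads stream, backward seeks start over."""

    def __init__(self, compressed: bytes) -> None:
        self._compressed = compressed
        self.restarts = 0  # how many times decompression began again
        self._reset()

    def _reset(self) -> None:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        self._decomp = zlib.decompressobj(wbits=31)
        self._src_pos = 0   # offset into the compressed bytes
        self._pos = 0       # offset into the uncompressed stream
        self._buffer = b""  # decompressed bytes not yet handed out

    def read(self, size: int) -> bytes:
        # Feed compressed input in small pieces until `size` bytes are ready
        # or the input is exhausted.
        while len(self._buffer) < size and self._src_pos < len(self._compressed):
            piece = self._compressed[self._src_pos:self._src_pos + 4096]
            self._src_pos += len(piece)
            self._buffer += self._decomp.decompress(piece)
        out, self._buffer = self._buffer[:size], self._buffer[size:]
        self._pos += len(out)
        return out

    def seek(self, offset: int) -> int:
        if offset < self._pos:
            # The stream cannot run backwards: restart from byte 0.
            self._reset()
            self.restarts += 1
        # Decompress and discard everything up to the target offset.
        self.read(offset - self._pos)
        return self._pos


payload = bytes(range(256)) * 400
reader = RewindingGzipReader(gzip.compress(payload))
reader.seek(50_000)   # forward: decodes 50,000 bytes once
reader.seek(10)       # backward: decodes again from the start
print(reader.restarts, reader.read(4) == payload[10:14])
```

Every backward seek in this sketch pays for the full prefix again, which is why forward-only access patterns are the cheaper choice.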
Append mode and large files
- Append mode (`"ab"`, `"at"`) writes a new gzip member. The file ends up as two (or more) concatenated gzip members. Every standards-compliant reader (including `aiogzip`, `gzip.open()`, and command-line `gunzip`) transparently concatenates the output, but each additional open writes a new member rather than extending the existing deflate stream.
- Backward seeks restart decompression from the beginning of the file, so forward-only access is much faster than mixed-direction access.
- Non-seekable input streams use a bounded rewind cache. By default, up to 128 MiB of compressed input is retained so backward seeks can replay the stream; pass `max_rewind_cache_size=<bytes>` to tune this, or `None` to allow an unbounded cache.
- Writes past 4 GiB of uncompressed data produce a gzip trailer whose `ISIZE` field wraps to `size & 0xFFFFFFFF` (this matches the gzip format spec and `gzip.open()`). Pass `strict_size=True` to refuse writes that would exceed the limit instead.
- Guard against decompression bombs by passing `max_decompressed_size=<bytes>` when reading untrusted files; the decompressor aborts with `OSError` once the cap is exceeded.
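Several of the points above can be observed with the standard library alone. The sketch below uses stdlib `gzip` and `zlib` rather than aiogzip. `append_twice`, `count_members`, `stored_isize`, and `bounded_decompress` are hypothetical helpers written for this illustration, and the size cap only approximates what `max_decompressed_size` is described as doing.

```python
import gzip
import os
import struct
import tempfile
import zlib


def append_twice(path: str) -> bytes:
    """Write one member, append a second, and return the raw file bytes."""
    with gzip.open(path, "wb") as f:
        f.write(b"first,")
    with gzip.open(path, "ab") as f:  # append mode starts a new gzip member
        f.write(b"second")
    with open(path, "rb") as f:
        return f.read()


def count_members(raw: bytes) -> int:
    """Count gzip members by decoding each one and following unused_data."""
    count = 0
    while raw:
        decomp = zlib.decompressobj(wbits=31)  # 31 = gzip framing
        decomp.decompress(raw)
        raw = decomp.unused_data  # bytes left over after this member
        count += 1
    return count


def stored_isize(raw: bytes) -> int:
    """Return ISIZE of the final member: the last 4 bytes, little-endian."""
    return struct.unpack("<I", raw[-4:])[0]


def bounded_decompress(raw: bytes, limit: int) -> bytes:
    """Decompress all members, raising OSError above `limit` output bytes."""
    out = bytearray()
    while raw:
        decomp = zlib.decompressobj(wbits=31)
        # Ask for one byte more than the remaining budget so that an
        # overrun is detectable without inflating the whole input.
        out += decomp.decompress(raw, limit + 1 - len(out))
        if len(out) > limit:
            raise OSError(f"decompressed size exceeds {limit} bytes")
        raw = decomp.unused_data
    return bytes(out)


with tempfile.TemporaryDirectory() as tmp:
    raw = append_twice(os.path.join(tmp, "log.gz"))

print(count_members(raw))        # number of gzip members in the file
print(gzip.decompress(raw))      # readers concatenate the member payloads
print(stored_isize(raw))         # ISIZE covers the final member only
print((5 * 2**30) & 0xFFFFFFFF)  # value a 5 GiB member would store
```

Because `ISIZE` is a per-member 32-bit field, it is not a reliable way to learn the total uncompressed size of an appended or very large file.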
Quickstart
Install:

    pip install aiogzip

Write and read:

    import asyncio
    from aiogzip import AsyncGzipFile

    async def main():
        # Write
        async with AsyncGzipFile("file.gz", "wb") as f:
            await f.write(b"Hello, async world!")

        # Read
        async with AsyncGzipFile("file.gz", "rb") as f:
            print(await f.read())

    asyncio.run(main())

Deterministic metadata (inside a coroutine):

    async with AsyncGzipFile(
        "dataset.gz", "wb", mtime=0, original_filename="dataset.csv"
    ) as f:
        await f.write(b"stable bytes")
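To see why fixing `mtime` and the embedded filename makes output reproducible, it can help to look at the gzip header itself. The sketch below uses the standard library's `gzip.GzipFile` rather than aiogzip, on the assumption that both write the same header fields. `build_member` is a hypothetical helper for this illustration.

```python
import gzip
import io


def build_member(data: bytes, mtime: int, name: str) -> bytes:
    """Compress `data` in memory with an explicit mtime and stored filename."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


first = build_member(b"stable bytes", mtime=0, name="dataset.csv")
second = build_member(b"stable bytes", mtime=0, name="dataset.csv")
later = build_member(b"stable bytes", mtime=1, name="dataset.csv")

# Header layout: magic (2 bytes), method (1), flags (1), MTIME (4),
# extra flags (1), OS (1), then the NUL-terminated filename when the
# FNAME flag (0x08) is set.
print(first == second)          # same inputs, same bytes
print(first == later)           # only MTIME differs
print(first[4:8], first[10:22])
```

With the timestamp pinned, two builds of the same input compare equal byte for byte, which is what checksums and content-addressed caches need.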
Performance
- Text I/O: Often ~2-3x faster than standard `gzip` in bulk text workflows.
- Binary I/O: Typically near parity for bulk reads/writes, and can be slower for very small chunk sizes.
- Concurrency: CPU-heavy `zlib` compress/decompress calls run in the default executor above a 256 KiB threshold, so multiple gzip streams on the same event loop compress and decompress in parallel instead of serializing on the loop thread. The repo's concurrent-I/O benchmark runs ~4x faster on 1.4.0 than on 1.3.x as a result; single-stream throughput stays at parity.
- Memory: Optimized buffer management for stable memory usage.
- JSONL: For large gzipped JSONL files, prefer `AsyncGzipTextFile(..., newline="\n", chunk_size=512 * 1024)` to reduce line-iteration overhead.
See the Performance Guide for detailed benchmarks.
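The concurrency point can be illustrated with a minimal sketch of threshold-based offloading. This is not aiogzip's implementation: `OFFLOAD_THRESHOLD`, `compress_chunk`, and `compress_many` are hypothetical names, and each chunk is compressed as its own gzip member to stand in for an independent stream.

```python
import asyncio
import zlib

OFFLOAD_THRESHOLD = 256 * 1024  # assumed to mirror the 256 KiB figure above


def _gzip_member(data: bytes, level: int) -> bytes:
    # wbits=31 makes zlib emit gzip framing (header and trailer).
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    return comp.compress(data) + comp.flush()


async def compress_chunk(data: bytes, level: int = 6) -> bytes:
    """Compress one chunk, using a worker thread only for large inputs."""
    if len(data) < OFFLOAD_THRESHOLD:
        # Small input: the thread hop would cost more than it saves.
        return _gzip_member(data, level)
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor. zlib releases the GIL
    # while it works, so large chunks can compress on separate cores.
    return await loop.run_in_executor(None, _gzip_member, data, level)


async def compress_many(chunks: list[bytes]) -> list[bytes]:
    """Compress independent chunks concurrently on one event loop."""
    return await asyncio.gather(*(compress_chunk(c) for c in chunks))


async def main() -> None:
    chunks = [b"small", b"x" * (512 * 1024), b"y" * (512 * 1024)]
    members = await compress_many(chunks)
    print([len(m) for m in members])


if __name__ == "__main__":
    asyncio.run(main())
```

The threshold matters because handing work to a thread has a fixed cost; below it, compressing inline on the loop thread is usually cheaper.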
Contributing
See CONTRIBUTING.md for development instructions.
File details
Details for the file aiogzip-1.5.0.tar.gz.
File metadata
- Download URL: aiogzip-1.5.0.tar.gz
- Upload date:
- Size: 58.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25844e7628fb6f69c6579de80ca65d4444d9258554c4628907414df3b54eaaef |
| MD5 | 26f830b7084bdbf2d81bd6dea4d9e90f |
| BLAKE2b-256 | 3780a9b3e3f6443032904f6d0772d3f2e5b2da4dc0f8b12867c9fe2b3948df37 |
Provenance
The following attestation bundles were made for aiogzip-1.5.0.tar.gz:
Publisher: publish.yml on geoff-davis/aiogzip
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aiogzip-1.5.0.tar.gz
- Subject digest: 25844e7628fb6f69c6579de80ca65d4444d9258554c4628907414df3b54eaaef
- Sigstore transparency entry: 1366402780
- Sigstore integration time:
- Permalink: geoff-davis/aiogzip@7223d9451e9c4c3ab8a89a6245d4f94446411f69
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/geoff-davis
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7223d9451e9c4c3ab8a89a6245d4f94446411f69
- Trigger Event: push
File details
Details for the file aiogzip-1.5.0-py3-none-any.whl.
File metadata
- Download URL: aiogzip-1.5.0-py3-none-any.whl
- Upload date:
- Size: 26.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 150a65f7a69dfa54c623dcfba3cda244e4040a3a6ccd69b2448818a84967e934 |
| MD5 | 5363b61bbd77a3c0ddfff5c40822039f |
| BLAKE2b-256 | 2d42529116321f52f148be4e64ab8a8341178b047034e6154e04389939d17c5b |
Provenance
The following attestation bundles were made for aiogzip-1.5.0-py3-none-any.whl:
Publisher: publish.yml on geoff-davis/aiogzip
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aiogzip-1.5.0-py3-none-any.whl
- Subject digest: 150a65f7a69dfa54c623dcfba3cda244e4040a3a6ccd69b2448818a84967e934
- Sigstore transparency entry: 1366402855
- Sigstore integration time:
- Permalink: geoff-davis/aiogzip@7223d9451e9c4c3ab8a89a6245d4f94446411f69
- Branch / Tag: refs/tags/v1.5.0
- Owner: https://github.com/geoff-davis
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@7223d9451e9c4c3ab8a89a6245d4f94446411f69
- Trigger Event: push