
Asynchronous gzip file reader/writer with aiocsv support.


aiogzip ⚡️

An asynchronous library for reading and writing gzip-compressed files.

License: MIT | Python 3.8-3.14

aiogzip provides a fast, simple, and asyncio-native interface for handling .gz files, making it a useful complement to Python's built-in gzip module for asynchronous applications.

🚀 Read the Documentation

Features

  • Truly Asynchronous: Built with asyncio and aiofiles.
  • High-Performance: Optimized buffer handling for fast I/O.
  • Drop-in Replacement: Mimics gzip.open() with async seek, tell, peek, and readinto support; verified against tarfile-style access patterns and aiocsv workflows.
  • Reproducible Archives: Control gzip mtime and embedded filenames.
  • Type-Safe: Distinct AsyncGzipBinaryFile and AsyncGzipTextFile.
  • aiocsv Ready: Seamless integration for CSV pipelines.
  • Predictable Performance: Backward seeks rewind the stream and re-decompress data (same as gzip.GzipFile), so treat random access as O(n) and prefer forward-only patterns when possible.
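The cost of a backward seek can be observed with the standard library alone. The sketch below is an illustration using gzip.GzipFile, not aiogzip's own code; CountingBytesIO and backward_seek_cost are hypothetical helper names introduced here. It counts how many compressed bytes the reader pulls from the underlying stream before and after seeking back to the start.

```python
import gzip
import io


class CountingBytesIO(io.BytesIO):
    """BytesIO that records how many compressed bytes are read from it."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def backward_seek_cost(payload):
    """Return (first 16 bytes, compressed bytes read before and after a backward seek)."""
    raw = CountingBytesIO(gzip.compress(payload))
    with gzip.GzipFile(fileobj=raw, mode="rb") as f:
        f.read()                      # forward pass: decompress everything once
        forward_cost = raw.bytes_read
        f.seek(0)                     # backward seek: the reader rewinds the stream
        head = f.read(16)             # compressed data is read again from the start
        total_cost = raw.bytes_read
    return head, forward_cost, total_cost


payload = bytes(range(256)) * 4096    # 1 MiB of sample data
head, forward_cost, total_cost = backward_seek_cost(payload)
print(head == payload[:16], total_cost > forward_cost)
```

The second counter is larger than the first because the reader has to consume compressed input again after rewinding, which is why forward-only access patterns are cheaper.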

Append mode and large files

  • Append mode ("ab", "at") writes a new gzip member, so the file ends up as two or more concatenated gzip members. Standards-compliant readers, including aiogzip, gzip.open(), and command-line gunzip, transparently concatenate the decompressed output. Each append-mode open adds another member rather than extending the existing deflate stream.
  • Backward seeks restart decompression from the beginning of the file, so forward-only access is much faster than mixed-direction access.
  • Writes past 4 GiB of uncompressed data produce a gzip trailer whose ISIZE field wraps to size & 0xFFFFFFFF (this matches the gzip format spec and gzip.open()). Pass strict_size=True to refuse writes that would exceed the limit instead.
  • Guard against decompression bombs by passing max_decompressed_size=<bytes> when reading untrusted files; the decompressor aborts with OSError once the cap is exceeded.
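The behaviors listed above can be illustrated with the standard library. This is a minimal sketch using gzip, struct, and zlib, not aiogzip's implementation; the helper names (append_member, stored_isize, wrapped_isize, bounded_gunzip) are hypothetical, and bounded_gunzip handles only a single gzip member.

```python
import gzip
import struct
import zlib


def append_member(existing, more):
    """Mimic append mode: add a second, independent gzip member."""
    return existing + gzip.compress(more, mtime=0)


def stored_isize(gz_member):
    """Read ISIZE: the last 4 bytes of a gzip member, little-endian."""
    return struct.unpack("<I", gz_member[-4:])[0]


def wrapped_isize(uncompressed_size):
    """Value that ends up in ISIZE: the size modulo 2**32."""
    return uncompressed_size & 0xFFFFFFFF


def bounded_gunzip(data, max_decompressed_size):
    """Decompress one gzip member, raising OSError once the cap is exceeded."""
    # wbits = 16 + MAX_WBITS tells zlib to expect gzip framing.
    decomp = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
    # Ask for at most one byte more than the cap so an overflow is detectable
    # without inflating the whole payload.
    out = decomp.decompress(data, max_decompressed_size + 1)
    if len(out) > max_decompressed_size:
        raise OSError("decompressed size exceeds cap")
    return out


two_members = append_member(gzip.compress(b"first,", mtime=0), b"second")
print(gzip.decompress(two_members))        # both members are read back as one stream
print(stored_isize(gzip.compress(b"hello")))
print(wrapped_isize(2**32 + 5))
```

With aiogzip itself, the equivalent controls are the strict_size=True and max_decompressed_size=<bytes> arguments described in the list.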

Quickstart

pip install aiogzip

import asyncio
from aiogzip import AsyncGzipFile

async def main():
    # Write
    async with AsyncGzipFile("file.gz", "wb") as f:
        await f.write(b"Hello, async world!")

    # Read
    async with AsyncGzipFile("file.gz", "rb") as f:
        print(await f.read())

asyncio.run(main())

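# Deterministic metadata. Reuses the imports above; "async with" must run inside a coroutine.
async def write_reproducible():
    async with AsyncGzipFile(
        "dataset.gz", "wb", mtime=0, original_filename="dataset.csv"
    ) as f:
        await f.write(b"stable bytes")

asyncio.run(write_reproducible())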
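To see why a fixed mtime and an explicit embedded filename make archives reproducible, the following sketch builds the same archive twice with the standard library's gzip.GzipFile and inspects the header. It illustrates the gzip header layout only and is not aiogzip code; reproducible_gzip is a hypothetical helper name.

```python
import gzip
import io


def reproducible_gzip(data, name):
    """Compress data in memory with a zeroed MTIME and an embedded file name."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=name, mode="wb", fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


first = reproducible_gzip(b"stable bytes", "dataset.csv")
second = reproducible_gzip(b"stable bytes", "dataset.csv")

print(first == second)             # identical bytes on every run
print(first[4:8])                  # MTIME field (header bytes 4-7) is all zeros
print(bool(first[3] & 0x08))       # FNAME flag is set in the FLG byte
print(first[10:22])                # null-terminated name follows the 10-byte header
```

Without mtime=0 the header would carry the current timestamp, so two runs over identical input would produce different files.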

Performance

  • Text I/O: Often ~2-3x faster than standard gzip in bulk text workflows.
  • Binary I/O: Typically near parity for bulk reads/writes, and can be slower for very small chunk sizes.
  • Concurrency: CPU-heavy zlib compress/decompress calls run in the default executor above a 256 KiB threshold, so multiple gzip streams on the same event loop compress and decompress in parallel instead of serializing on the loop thread. The repo's concurrent-I/O benchmark runs ~4x faster on 1.4.0 than on 1.3.x as a result; single-stream throughput stays at parity.
  • Memory: Optimized buffer management for stable memory usage.
  • JSONL: For large gzipped JSONL files, prefer AsyncGzipTextFile(..., newline="\n", chunk_size=512 * 1024) to reduce line-iteration overhead.
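The concurrency point can be sketched with asyncio and zlib from the standard library. This is an illustration of the general technique (offloading large zlib calls to the default executor), not aiogzip's actual code; OFFLOAD_THRESHOLD, compress_chunk, gzip_stream, and gzip_many are hypothetical names, and the threshold simply mirrors the 256 KiB figure quoted above.

```python
import asyncio
import zlib

OFFLOAD_THRESHOLD = 256 * 1024  # hypothetical constant mirroring the 256 KiB figure


async def compress_chunk(compressor, chunk):
    """Compress inline for small chunks; use the default executor for large ones."""
    if len(chunk) < OFFLOAD_THRESHOLD:
        return compressor.compress(chunk)
    loop = asyncio.get_running_loop()
    # zlib releases the GIL while compressing, so worker threads can run in parallel.
    return await loop.run_in_executor(None, compressor.compress, chunk)


async def gzip_stream(chunks):
    """Compress one sequence of chunks into a single gzip member."""
    compressor = zlib.compressobj(wbits=16 + zlib.MAX_WBITS)  # gzip framing
    parts = []
    for chunk in chunks:  # chunks of one stream are compressed strictly in order
        parts.append(await compress_chunk(compressor, chunk))
    parts.append(compressor.flush())
    return b"".join(parts)


async def gzip_many(streams):
    """Compress several independent streams concurrently on one event loop."""
    return await asyncio.gather(*(gzip_stream(s) for s in streams))


streams = [[b"a" * (300 * 1024), b"tail"], [b"b" * 10]]
results = asyncio.run(gzip_many(streams))
print([len(r) for r in results])
```

Each stream keeps its own compressor object and awaits its chunks one at a time, so a compressor is never used from two threads at once; only different streams overlap.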

See the Performance Guide for detailed benchmarks.

Contributing

See CONTRIBUTING.md for development instructions.


