Fast random access of gzip files in Python

Project description

The indexed_gzip project is a Python extension which provides the IndexedGzipFile class, intended as a drop-in replacement for the built-in Python gzip.GzipFile class.
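As a rough illustration of what "drop-in" could look like in practice, the sketch below swaps one class for the other behind a single name. It assumes that IndexedGzipFile can be imported from the top level of the indexed_gzip package and that its constructor accepts a file name in the same way as gzip.GzipFile; the read_slice helper and the example file are hypothetical. If indexed_gzip is not installed, the sketch falls back to the standard class.

```python
import gzip
import os
import tempfile

try:
    # Assumption: IndexedGzipFile is importable from the package top level
    # and accepts a file name, like gzip.GzipFile.
    from indexed_gzip import IndexedGzipFile as GzFile
except ImportError:
    # Fall back to the standard class when indexed_gzip is not installed.
    GzFile = gzip.GzipFile


def read_slice(path, offset, size):
    """Read ``size`` uncompressed bytes starting at ``offset`` (hypothetical helper)."""
    f = GzFile(path)
    try:
        f.seek(offset)
        return f.read(size)
    finally:
        f.close()


# Made-up test file: 200,000 bytes with a predictable pattern.
payload = bytes(i % 251 for i in range(200_000))
path = os.path.join(tempfile.mkdtemp(), "example.gz")
with gzip.open(path, "wb") as out:
    out.write(payload)

chunk = read_slice(path, 150_000, 32)
```

Because the calling code only relies on seek, read and close, it does not need to know which of the two classes it was given.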

indexed_gzip was written to allow fast random access of compressed NIFTI image files (for which GZIP is the de facto compression standard), but will work with any GZIP file. indexed_gzip is easy to use with nibabel (http://nipy.org/nibabel/).

The standard gzip.GzipFile class exposes a random access-like interface (via its seek and read methods), but a gzip stream can only be decompressed sequentially. To reach a new point in the uncompressed data stream, the GzipFile instance has to decompress everything that precedes it: a backward seek restarts decompression from the beginning of the file, and a forward seek decompresses and discards all of the data in between.
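The interface in question can be shown with the standard library alone. The following is a minimal sketch; the file name and the payload are made up for illustration. The calls look like random access, but each one costs time proportional to the amount of data that has to be decompressed to reach the requested offset.

```python
import gzip
import os
import tempfile

# Made-up payload: 1,000,000 bytes with a predictable pattern.
payload = bytes(i % 251 for i in range(1_000_000))

path = os.path.join(tempfile.mkdtemp(), "example.gz")
with gzip.open(path, "wb") as f:
    f.write(payload)

with gzip.GzipFile(path, "rb") as f:
    # seek() takes an offset into the *uncompressed* stream. There is no
    # index, so all of the data before this offset must be decompressed.
    f.seek(900_000)
    late = f.read(16)

    # Seeking backwards makes the reader rewind and decompress from the
    # start of the file again.
    f.seek(100)
    early = f.read(16)
```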

An IndexedGzipFile instance gets around this performance limitation by building an index of seek points: mappings between corresponding locations in the compressed and uncompressed data streams. Each seek point is accompanied by a chunk (32KB) of uncompressed data which is used to initialise the decompression algorithm, allowing us to start reading from any seek point. If the index is built with a seek point spacing of 1MB, we only have to decompress (on average) 512KB of data to read from any location in the file.
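The lookup step can be sketched in a few lines. This is not the extension's actual data structure, only an illustration of the idea: the SeekPoint record, the function names and the example offsets are all hypothetical, and the 32KB window that a real seek point carries is left out. Given a sorted list of seek points, the reader finds the last one at or before the requested offset and decompresses forward from there.

```python
import bisect
from collections import namedtuple

# Hypothetical seek point record: matching offsets in the two streams.
# A real index would also keep the 32KB window of uncompressed data
# needed to initialise the decompressor at this point.
SeekPoint = namedtuple("SeekPoint", ["uncmp_offset", "cmp_offset"])

SPACING = 1024 * 1024  # assumed seek point spacing: 1MB of uncompressed data


def nearest_seek_point(points, target):
    """Return the last seek point at or before ``target``.

    ``points`` must be sorted by uncompressed offset.
    """
    offsets = [p.uncmp_offset for p in points]
    i = bisect.bisect_right(offsets, target) - 1
    if i < 0:
        raise ValueError("target precedes the first seek point")
    return points[i]


def bytes_to_skip(points, target):
    """Uncompressed bytes to decompress and discard before ``target``."""
    return target - nearest_seek_point(points, target).uncmp_offset


# Made-up index for a 4MB stream; compressed offsets assume a 2:1 ratio.
index = [SeekPoint(i * SPACING, i * SPACING // 2) for i in range(4)]

target = 2 * SPACING + 300 * 1024
point = nearest_seek_point(index, target)
print(point.uncmp_offset // SPACING, bytes_to_skip(index, target) // 1024)
# prints: 2 300

# Average skip over evenly spaced targets, in KB.
samples = range(0, 4 * SPACING, 4096)
average = sum(bytes_to_skip(index, t) for t in samples) / len(samples)
print(round(average / 1024))  # prints 510, close to half the spacing
```

The second figure is where the 512KB estimate comes from: a target chosen uniformly at random lies, on average, half a spacing past the nearest preceding seek point.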

See https://github.com/pauldmccarthy/indexed_gzip for more details.

Download files

Source Distribution

indexed_gzip-0.3.1.tar.gz (222.2 kB)

Uploaded Source

Built Distributions

indexed_gzip-0.3.1-cp35-cp35m-macosx_10_6_intel.whl (411.3 kB)

Uploaded CPython 3.5m, macOS 10.6+ Intel (x86-64, i386)

indexed_gzip-0.3.1-cp27-cp27m-macosx_10_6_intel.whl (428.1 kB)

Uploaded CPython 2.7m, macOS 10.6+ Intel (x86-64, i386)

File details

Details for the file indexed_gzip-0.3.1.tar.gz.

File metadata

  • Download URL: indexed_gzip-0.3.1.tar.gz
  • Upload date:
  • Size: 222.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for indexed_gzip-0.3.1.tar.gz
Algorithm Hash digest
SHA256 eb4739258e5f9ec6bb246a4c53cdae0c04d1ef25613f9304daa5f4e2952e4273
MD5 c2e5426648b51deb39752b8d10946c0e
BLAKE2b-256 fe873d0bafc3195795b60be08aaa34e687bc51fdadff9e99b80dabfd4a88991b

See more details on using hashes here.
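One way to use these digests is to recompute the SHA256 of a downloaded file and compare it with the published value. The sketch below uses only the standard library; the helper names are hypothetical, and the usage lines assume that the source archive has been saved in the current directory under its original name.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at ``path``."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 digest equals ``expected_hex``."""
    return sha256_of(path) == expected_hex.lower()


# Digest listed above for indexed_gzip-0.3.1.tar.gz.
EXPECTED = "eb4739258e5f9ec6bb246a4c53cdae0c04d1ef25613f9304daa5f4e2952e4273"

# Usage, assuming the archive is in the current directory:
# if not matches("indexed_gzip-0.3.1.tar.gz", EXPECTED):
#     raise SystemExit("SHA256 mismatch: do not install this file")
```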

File details

Details for the file indexed_gzip-0.3.1-cp35-cp35m-macosx_10_6_intel.whl.

File hashes

Hashes for indexed_gzip-0.3.1-cp35-cp35m-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 016219c8c700ea2a262a87a1489b92a4e168f3eedba3714dd14afc53e1da09aa
MD5 50c649adf39fbcb00a432f87c057d036
BLAKE2b-256 eb65091ab1ea52c2e20b4bb32d13026d28bbb337a4fedd052492abe18c99001c

File details

Details for the file indexed_gzip-0.3.1-cp27-cp27m-macosx_10_6_intel.whl.

File hashes

Hashes for indexed_gzip-0.3.1-cp27-cp27m-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 6935241dbfe8ad01c14331272269bb4fc96aae6f9e34e8458f211a93d5dc0381
MD5 17a4406c5a02150b987c56521c8e06f9
BLAKE2b-256 f05d710a2e5b0569cd3f1c49f68f7bc69e6d1b412128a3d623ab3e4f3a3ad94b
