
Parallel Random Access Gzip (pragzip)


This module provides a PragzipFile class, which can be used to seek inside gzip files without having to decompress them first. Alternatively, you can use it simply as a parallelized gzip decoder, as a replacement for Python's built-in gzip module, in order to fully utilize all your cores.

The random seeking support is the same as that provided by indexed_gzip, but further speedups are realized, at the cost of higher memory usage, thanks to a least-recently-used cache combined with a parallelized prefetcher.
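The interplay of the cache and the prefetcher can be illustrated with a small standard-library-only sketch. This is not pragzip's actual code, and all names here (PrefetchingLRUCache, fetch) are hypothetical; it only models the idea: serve blocks from an LRU cache and, on each access, speculatively decode the next blocks in background threads.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor

class PrefetchingLRUCache:
    """Toy model of an LRU block cache combined with a sequential prefetcher."""

    def __init__(self, fetch, capacity=4, prefetch_count=2):
        self._fetch = fetch          # function: block index -> decoded block
        self._capacity = capacity
        self._prefetch_count = prefetch_count
        self._cache = OrderedDict()  # block index -> decoded data, in LRU order
        self._pending = {}           # block index -> Future for in-flight decodes
        self._pool = ThreadPoolExecutor(max_workers=prefetch_count)

    def get(self, index):
        if index in self._cache:
            # Cache hit: mark the block as most recently used.
            self._cache.move_to_end(index)
        else:
            # Use the prefetched result if one is in flight, else decode now.
            future = self._pending.pop(index, None)
            data = future.result() if future is not None else self._fetch(index)
            self._insert(index, data)
        # Assume sequential access: start decoding the next blocks in the background.
        for i in range(index + 1, index + 1 + self._prefetch_count):
            if i not in self._cache and i not in self._pending:
                self._pending[i] = self._pool.submit(self._fetch, i)
        return self._cache[index]

    def _insert(self, index, data):
        self._cache[index] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used block
```

Sequential reads then mostly hit either the cache or an already-running prefetch, which is where the parallel speedup over a purely serial decoder comes from.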

Table of Contents

  1. Installation
  2. Usage
    1. Command Line Tool
    2. Python Library
    3. Via Ratarmount
    4. C++ Library
  3. Performance comparison with gzip module when a gzip index exists
  4. Internal Architecture
  5. Tracing the Decoder

Installation

You can simply install it from PyPI:

python3 -m pip install --upgrade pip  # Recommended for newer manylinux wheels
python3 -m pip install pragzip

The latest unreleased development version can be tested out with:

python3 -m pip install --force-reinstall 'git+https://github.com/mxmlnkn/indexed_bzip2.git@master#egg=pragzip&subdirectory=python/pragzip'

To build locally, you can use the build package and then install the wheel:

cd python/pragzip
rm -rf dist
python3 -m build .
python3 -m pip install --force-reinstall --user dist/*.whl

Usage

Command Line Tool

pragzip --help

# Parallel decoding: 1.7 s
time pragzip -d -c -P 0 sample.gz | wc -c

# Serial decoding: 22 s
time gzip -d -c sample.gz | wc -c

Python Library

Simple open, seek, read, and close

import os

from pragzip import PragzipFile

file = PragzipFile( "example.gz", parallelization = os.cpu_count() )

# You can now use it like a normal file
file.seek( 123 )
data = file.read( 100 )
file.close()

The first call to seek ensures that the block offset list is complete and therefore might have to create it first. Because of this, the first call to seek can take a while.

Use with context manager

import os
import pragzip

with pragzip.open( "example.gz", parallelization = os.cpu_count() ) as file:
    file.seek( 123 )
    data = file.read( 100 )

Storing and loading the block offset map

The creation of the list of gzip blocks can take a while because it has to decode the gzip file completely. To avoid this setup when opening a gzip file, the block offset list can be exported and imported.
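This section does not restate pragzip's exact index API (set_block_offsets appears in the performance section below), so here is a standard-library-only sketch of the underlying idea: record pairs of compressed and uncompressed offsets, then answer a seek by decompressing only from the nearest recorded entry point. A multi-member gzip file stands in for the indexed seek points, since each member can be decompressed independently.

```python
import gzip
import io

# Build a gzip file with several members; each member boundary is an
# independent decompression entry point, like an index seek point.
chunks = [b"a" * 1000, b"b" * 1000, b"c" * 1000]
compressed = io.BytesIO()
offsets = []  # (compressed offset, uncompressed offset) pairs
uncompressed_offset = 0
for chunk in chunks:
    offsets.append((compressed.tell(), uncompressed_offset))
    compressed.write(gzip.compress(chunk))
    uncompressed_offset += len(chunk)

def read_at(target, size):
    """Seek using the offset map: decompress only from the nearest
    preceding entry point instead of from the start of the file."""
    comp_off, uncomp_off = max(
        (o for o in offsets if o[1] <= target), key=lambda o: o[1])
    compressed.seek(comp_off)
    data = gzip.GzipFile(fileobj=compressed).read(target - uncomp_off + size)
    return data[target - uncomp_off : target - uncomp_off + size]

print(read_at(1500, 4))  # b'bbbb', without touching the first member
```

Exporting the index amounts to persisting the offsets list; importing it restores random access without the full initial decode.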

Open a pure Python file-like object for indexed reading

import io
import os
import pragzip

with open( "example.gz", 'rb' ) as file:
    in_memory_file = io.BytesIO( file.read() )

with pragzip.open( in_memory_file, parallelization = os.cpu_count() ) as file:
    file.seek( 123 )
    data = file.read( 100 )

Via Ratarmount

Because pragzip can be used as a backend in ratarmount, you can use ratarmount to mount single gzip files easily. Furthermore, since ratarmount 0.11.0, parallelization is the default and does not have to be specified explicitly with -P.

base64 /dev/urandom | head -c $(( 4 * 1024 * 1024 * 1024 )) | gzip > sample.gz
# Serial decoding: 23 s
time gzip -c -d sample.gz | wc -c

python3 -m pip install --user ratarmount
ratarmount sample.gz mounted

# Parallel decoding: 3.5 s
time cat mounted/sample | wc -c

# Random seeking to the middle of the file and reading 1 MiB: 0.287 s
time dd if=mounted/sample bs=$(( 1024 * 1024 )) \
       iflag=skip_bytes,count_bytes skip=$(( 2 * 1024 * 1024 * 1024 )) count=$(( 1024 * 1024 )) | wc -c

C++ Library

Because it is written in C++, it can of course also be used as a C++ library. In order to make heavy use of templates and to simplify compiling with Python setuptools, it is mostly header-only, so integrating it into another project should be easy. The license is also permissive enough for most use cases.

I have not yet tested integration into other projects beyond simply copying the sources in src/core and src/pragzip, plus src/external/zlib if the bundled zlib is desired. If you have suggestions or wishes, such as CMake or Conan support, please open an issue.

Performance comparison with gzip module when a gzip index exists

These are simple timing tests for reading all the contents of a gzip file sequentially.

import gzip
import time

with gzip.open( gzipFilePath ) as file:
    t0 = time.time()
    while file.read( 4*1024*1024 ):
        pass
    t1 = time.time()
    print( f"Decoded file in {t1-t0}s" )

The usage of pragzip is slightly different:

import indexed_gzip
import pragzip
import time

with indexed_gzip.IndexedGzipFile(gzipFilePath) as file:
    file.build_full_index()
    file.export_index(gzipFilePath + ".index")

# parallelization = 0 means that all available cores are used automatically.
for parallelization in [0, 1, 2, 6, 12, 24, 32]:
    with pragzip.PragzipFile(gzipFilePath, parallelization = parallelization) as file:
        file.set_block_offsets(open(gzipFilePath + ".index", 'rb'))

        t0 = time.time()
        while file.read( 4*1024*1024 ):
            pass
        t1 = time.time()
        print( f"Decoded file in {t1-t0}s" )

Results for an AMD Ryzen 3900X 12-core (24 virtual cores) processor and with gzipFilePath=4GB-base64.gz, which is a 4 GiB gzip compressed file with base64 random data.

Module                               Runtime / s
gzip                                        17.2
pragzip with parallelization = 0            1.25
pragzip with parallelization = 1            13.8
pragzip with parallelization = 2             7.0
pragzip with parallelization = 6             2.5
pragzip with parallelization = 12           1.47
pragzip with parallelization = 24           1.25
pragzip with parallelization = 32           1.33

The speedup of pragzip over the gzip module with parallelization = 0 is 17.2/1.25 ≈ 14. When using only one core, pragzip is faster by (17.2-13.8)/17.2 ≈ 20%.

Internal Architecture

The main part of the internal architecture used for parallelizing is the same as used for indexed_bzip2.
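The core idea of that architecture, decoding independent compressed blocks on multiple cores and concatenating the results in order, can be sketched with the standard library alone. This is only an illustration, not pragzip's implementation: here the block boundaries are multi-member gzip boundaries known by construction, whereas pragzip has to discover usable entry points inside a single deflate stream itself.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

# A multi-member gzip file: the member boundaries play the role of the
# block boundaries that pragzip finds inside the compressed stream.
members = [gzip.compress(bytes([65 + i]) * 100) for i in range(8)]
blob = b"".join(members)

# Compressed offset and length of each member (known by construction here).
boundaries = []
position = 0
for member in members:
    boundaries.append((position, len(member)))
    position += len(member)

def decode_block(block):
    offset, length = block
    # Each slice is a self-contained gzip member, decodable independently.
    return gzip.decompress(blob[offset : offset + length])

# Decode all blocks in parallel and reassemble them in order. CPython's
# zlib releases the GIL during decompression, so threads parallelize here.
with ThreadPoolExecutor() as pool:
    decoded = b"".join(pool.map(decode_block, boundaries))

print(decoded[:5])  # b'AAAAA'
```

pool.map preserves the input order, so the concatenation reproduces the original byte stream even though blocks finish out of order.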

