zlib-state
Low-level interface to the zlib library that enables capturing the decoding state.
Install
From PyPI:
pip install zlib-state
From source:
pip install .
Tested on Ubuntu/macOS/Windows with Python 3.7-3.12.
GzipStateFile
Wraps Decompressor as a buffered reader.
Based on my benchmarking, this is somewhat slower than Python's gzip module.
A typical usage pattern looks like:
import zlib_state

TARGET_LINE = 5000  # pick back up after around the 5,000th line
# Specify keep_last_state=True to tell object to grab and keep the state and pos after each block
with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz', keep_last_state=True) as f:
    for i, line in enumerate(f):
        if i == TARGET_LINE:
            state, pos = f.last_state, f.last_state_pos

with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz') as f:
    f.zseek(pos, state)
    remainder = f.read()
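For comparison, here is a minimal sketch of the same "pick back up" task using only Python's standard gzip module, which has no saved state to seek to. The sample file, its contents, and the small TARGET_LINE value are made up for illustration and are not part of zlib-state.

```python
import gzip
import os
import tempfile

TARGET_LINE = 5  # hypothetical small target, just for this demo

# Build a tiny gzip file so the sketch is self-contained.
lines = [f"line {i}\n".encode() for i in range(10)]
path = os.path.join(tempfile.mkdtemp(), "sample.txt.gz")
with gzip.open(path, "wb") as f:
    f.writelines(lines)

# Without a captured state, reaching TARGET_LINE means decompressing
# every line before it again, on every run.
with gzip.open(path, "rb") as f:
    for i, line in enumerate(f):
        if i == TARGET_LINE:
            break
    remainder = f.read()  # everything after TARGET_LINE
```

The cost of the loop grows with the distance to the target, which is the work that a saved state and position are meant to avoid.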
Decompressor
Very basic decompression object that's picky and unforgiving.
Based on my benchmarking, this can iterate over gzip files faster than Python's gzip module.
A typical usage pattern looks like:
import zlib_state

decomp = zlib_state.Decompressor(32 + 15)  # from zlib; 32 enables automatic gzip/zlib header detection, 15 is the window size
block_count = 0
with open('testdata/frankenstein.txt.gz', 'rb') as f:
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            # decomp needs more input, and it tells you how much.
            decomp.feed_input(f.read(needed_input))
        # next_chunk may be empty (e.g., if finished with gzip headers) or may contain data.
        # It sends as much as it has left in its output buffer, or asks zlib to continue.
        next_chunk = decomp.read()  # you can also pass a maximum size to take and/or a buffer to write to
        if decomp.block_boundary():
            block_count += 1
            # When it reaches the end of a deflate block, it always stops. At these times, you can grab the state
            # if you wish.
            if block_count == 4:  # resume after the 4th block
                state = decomp.get_state()  # includes zdict, bits, byte -- everything it needs to resume from pos
                pos = decomp.total_in()  # the current position in the binary file to resume from
    print(f'{block_count} blocks processed')

    # resume from somewhere in the file. Only possible spots are the block boundaries, given the state
    f.seek(pos)
    decomp = zlib_state.Decompressor(-15)  # from zlib; 15 window size, negative means no headers
    decomp.set_state(*state)
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            # decomp needs more input, and it tells you how much.
            decomp.feed_input(f.read(needed_input))
        next_chunk = decomp.read()
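To illustrate why the saved state carries a dictionary, the sketch below resumes a raw deflate stream at a block boundary using only Python's standard zlib module. It does not use zlib-state and makes no claim about its internals. The sample data is invented, and the boundary is forced with Z_SYNC_FLUSH so that it lands on a whole byte. In an ordinary gzip file a boundary can fall in the middle of a byte, which is presumably why the state above also includes bits and a byte.

```python
import zlib

# Two parts compressed as ONE raw deflate stream (negative wbits = no headers).
part1 = b"It was on a dreary night of November. " * 50
part2 = b"It was on a dreary night of November, again. " * 50

comp = zlib.compressobj(wbits=-15)
# Z_SYNC_FLUSH ends the current block on a byte boundary but keeps the
# compressor's history, so later blocks may still refer back into part1.
head = comp.compress(part1) + comp.flush(zlib.Z_SYNC_FLUSH)
tail = comp.compress(part2) + comp.flush()

# Resume at the boundary: a fresh raw-deflate decompressor, primed with the
# output that preceded the boundary (deflate looks back at most 32 KiB).
window = part1[-32768:]
resumed = zlib.decompressobj(wbits=-15, zdict=window)
remainder = resumed.decompress(tail) + resumed.flush()
```

Here len(head) plays the role of pos and window plays the role of the zdict: together they are enough to decode tail without touching head again.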