zlib-state
Low-level interface to the zlib library that enables capturing the decoding state.
Install
From PyPI:
pip install zlib-state
From source:
python setup.py install
Tested on Ubuntu, macOS, and Windows with Python 3.6 through 3.10.
GzipStateFile
Wraps Decompressor as a buffered reader.
Based on my benchmarking, this is somewhat slower than Python's gzip module.
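The speed comparison can be checked on your own data with a small timing harness. The sketch below is an illustration, not the benchmark behind the statement above; `time_line_iteration` is a hypothetical helper name, and it assumes only that the reader it is given is a context manager that yields lines when iterated.

```python
import gzip
import time


def time_line_iteration(opener, path):
    """Iterate over every line of `path` and return (elapsed_seconds, line_count).

    `opener` is any callable that takes a path and returns a context-managed,
    line-iterable reader, for example gzip.open.
    """
    start = time.perf_counter()
    count = 0
    with opener(path) as f:
        for _ in f:
            count += 1
    return time.perf_counter() - start, count
```

Calling `time_line_iteration(gzip.open, 'testdata/frankenstein.txt.gz')` and then `time_line_iteration(zlib_state.GzipStateFile, 'testdata/frankenstein.txt.gz')` gives two timings to compare. Run each several times, since a single run is dominated by disk caching effects.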
A typical usage pattern looks like:
import zlib_state

TARGET_LINE = 5000  # pick back up after around the 5,000th line
# Specify keep_last_state=True to tell object to grab and keep the state and pos after each block
with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz', keep_last_state=True) as f:
    for i, line in enumerate(f):
        if i == TARGET_LINE:
            state, pos = f.last_state, f.last_state_pos

with zlib_state.GzipStateFile('testdata/frankenstein.txt.gz') as f:
    f.zseek(pos, state)
    remainder = f.read()
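A captured state is most useful when it outlives the process that produced it. The following is a minimal sketch of saving and reloading the `state` and `pos` pair with `pickle`. The helper names `save_checkpoint` and `load_checkpoint` are hypothetical, and the sketch assumes that `f.last_state` is a plain picklable value (for example a tuple of bytes and integers), which this README does not state explicitly.

```python
import pickle


def save_checkpoint(path, state, pos):
    """Write a (state, pos) pair to `path` so a later process can resume."""
    # Assumption: `state` is picklable (e.g. a tuple of bytes and ints).
    with open(path, "wb") as out:
        pickle.dump({"state": state, "pos": pos}, out)


def load_checkpoint(path):
    """Read back the (state, pos) pair written by save_checkpoint."""
    with open(path, "rb") as src:
        checkpoint = pickle.load(src)
    return checkpoint["state"], checkpoint["pos"]
```

After the first loop, `save_checkpoint('frankenstein.ckpt', state, pos)` stores the resume point. A later run can call `state, pos = load_checkpoint('frankenstein.ckpt')` and pass the values to `f.zseek(pos, state)`. Only load checkpoint files you created yourself, since unpickling untrusted data is unsafe.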
Decompressor
Very basic decompression object that's picky and unforgiving.
Based on my benchmarking, this can iterate over gzip files faster than Python's gzip module.
A typical usage pattern looks like:
import zlib_state

# wbits as in zlib: adding 32 auto-detects the zlib/gzip header, 15 is the window size (2**15 bytes)
decomp = zlib_state.Decompressor(32 + 15)
block_count = 0
with open('testdata/frankenstein.txt.gz', 'rb') as f:
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            # decomp needs more input, and it tells you how much.
            decomp.feed_input(f.read(needed_input))
        # next_chunk may be empty (e.g., if finished with gzip headers) or may contain data.
        # It sends as much as it has left in its output buffer, or asks zlib to continue.
        next_chunk = decomp.read()  # you can also pass a maximum size to take and/or a buffer to write to
        if decomp.block_boundary():
            block_count += 1
            # When it reaches the end of a deflate block, it always stops. At these times, you can grab the state
            # if you wish.
            if block_count == 4:  # resume after the 4th block
                state = decomp.get_state()  # includes zdict, bits, byte -- everything it needs to resume from pos
                pos = decomp.total_in()  # the current position in the binary file to resume from
    print(f'{block_count} blocks processed')

    # Resume from somewhere in the file. Only possible spots are the block boundaries, given the state.
    f.seek(pos)
    # wbits as in zlib: window size 15, negative means raw deflate with no headers
    decomp = zlib_state.Decompressor(-15)
    decomp.set_state(*state)
    while not decomp.eof():
        needed_input = decomp.needs_input()
        if needed_input > 0:
            # decomp needs more input, and it tells you how much.
            decomp.feed_input(f.read(needed_input))
        next_chunk = decomp.read()
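The two window-bits values in the example (`32 + 15` and `-15`) follow zlib's own conventions, so they can be tried out with Python's standard `zlib` module, without `zlib_state`. The sketch below uses only the standard library; the function names are hypothetical. It shows that `32 + 15` accepts data with a gzip or zlib header, while `-15` expects a raw deflate stream, which is presumably why the resumed decompressor above uses it: when resuming at a block boundary, the gzip header is already behind the read position.

```python
import gzip
import zlib


def decode_with_header(data):
    """Decode gzip- or zlib-wrapped data: +32 auto-detects the header, 15 is the window size."""
    decomp = zlib.decompressobj(32 + 15)
    return decomp.decompress(data) + decomp.flush()


def decode_raw_deflate(data):
    """Decode a raw deflate stream: negative wbits means no header or trailer."""
    decomp = zlib.decompressobj(-15)
    return decomp.decompress(data) + decomp.flush()


def raw_deflate(data):
    """Produce a raw deflate stream (no header), for trying out decode_raw_deflate."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()
```

For example, `decode_with_header(gzip.compress(b'abc'))` and `decode_raw_deflate(raw_deflate(b'abc'))` both return the original bytes, whereas passing gzip data to `decode_raw_deflate` raises `zlib.error`.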