Pure Python module providing a variable-length, content-based blocking algorithm

Project description

Chop a file into variable-length, content-based chunks.

Example use:

.. code-block:: python

    >>> import rolling_checksum_mod
    >>> with open('/tmp/big-file.bin', 'rb') as file_:
    ...     for chunk in rolling_checksum_mod.min_max_chunker(file_):
    ...         # chunk is a piece of the data from file_; chunks do not all have the same length.
    ...         # Instead, they have the property that if you insert a byte at the beginning of
    ...         # /tmp/big-file.bin, most of the chunks of the file remain the same. This is useful
    ...         # for a deduplicating backup program.
    ...         print(len(chunk))
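To illustrate why content-based chunk boundaries survive an insertion, here is a minimal sketch of the general technique. It is not this module's implementation: the function ``sketch_chunker``, its parameters (``min_len``, ``max_len``, ``window``, ``mask``), and the simple rolling sum used as the checksum are all assumptions made for illustration. A boundary is placed wherever the checksum of the last few bytes matches a fixed pattern, so boundaries depend on nearby content rather than on absolute file offsets.

```python
import random


def sketch_chunker(data, min_len=64, max_len=1024, window=16, mask=0xFF):
    """Yield variable-length chunks of `data` (a bytes object).

    Hypothetical illustration only; not the algorithm used by
    rolling_checksum_mod.min_max_chunker.
    """
    start = 0    # index where the current chunk begins
    rolling = 0  # sum of the most recent `window` bytes
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        length = i - start + 1
        # Cut when the checksum matches the pattern and the chunk is long
        # enough, or unconditionally once the chunk reaches max_len.
        at_boundary = (rolling & mask) == 0 and length >= min_len
        if at_boundary or length >= max_len:
            yield data[start:i + 1]
            start = i + 1
    if start < len(data):
        yield data[start:]  # trailing partial chunk


# Deterministic pseudo-random test data, then the same data with one
# byte inserted at the beginning.
rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(100_000))
shifted = b'X' + original

original_chunks = list(sketch_chunker(original))
shifted_chunks = list(sketch_chunker(shifted))
shared = set(original_chunks) & set(shifted_chunks)
print(len(original_chunks), len(shifted_chunks), len(shared))
```

With fixed-size blocks, the inserted byte would shift every block and nothing would match. Here the two chunk lists are expected to fall back into step shortly after the insertion point, which is what lets a deduplicating backup program store only the changed chunks.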

Download files

Source Distribution

rolling_checksum_py_mod-1.0.1.tar.gz (3.8 kB)
