Binary delta encoding tools.
Project description
About
Binary delta encoding in Python 3.6+ and C.
Based on http://www.daemonology.net/bsdiff/ and HDiffPatch, with the following features:
bsdiff, hdiffpatch and match-blocks algorithms.
sequential, hdiffpatch or in-place (resumable) patch types.
BZ2, LZ4, LZMA, Zstandard, heatshrink or CRLE compression.
Sequential patches allow streaming.
Maximum file size is 2 GB for the bsdiff algorithm. There is practically no limit for the hdiffpatch and match-blocks algorithms.
Incremental apply patch implemented in C, suitable for memory constrained embedded devices. Only the sequential patch type is supported.
SA-IS or divsufsort instead of qsufsort for bsdiff.
Optional experimental data format aware algorithm for potentially smaller patches. I don’t recommend anyone to use this functionality as the gain is small in relation to memory usage and code complexity!
There is a risk this functionality uses patent https://patents.google.com/patent/EP1988455B1/en. Anyway, this patent expires in August 2019 as I understand it.
Supported data formats:
ARM Cortex-M4
AArch64
Project homepage: https://github.com/eerimoq/detools
Documentation: http://detools.readthedocs.org/en/latest
Installation
pip install detools
Statistics
Patch sizes, memory usage (RSS) and elapsed times when creating a patch from Python-3.7.3.tar (79M) to Python-3.8.1.tar (84M) for various algorithm, patch type and compression combinations.
See tests/benchmark.sh for details on how the data was collected.
Algorithm |
Patch type |
Compr. |
Patch size |
RSS |
Time |
---|---|---|---|---|---|
bsdiff |
sequential |
lzma |
3,5M |
662M |
0:24.29 |
bsdiff |
sequential |
none |
86M |
646M |
0:15.20 |
hdiffpatch |
hdiffpatch |
lzma |
2,4M |
523M |
0:13.74 |
hdiffpatch |
hdiffpatch |
none |
7,2M |
523M |
0:10.24 |
match-blocks |
sequential |
lzma |
2,9M |
273M |
0:08.57 |
match-blocks |
sequential |
none |
84M |
273M |
0:01.72 |
match-blocks |
hdiffpatch |
lzma |
2,6M |
212M |
0:06.07 |
match-blocks |
hdiffpatch |
none |
9,7M |
212M |
0:01.30 |
Same as above, but for MicroPython ESP8266 binary releases (from 604k to 615k).
Algorithm |
Patch type |
Compr. |
Patch size |
RSS |
Time |
---|---|---|---|---|---|
bsdiff |
sequential |
lzma |
71K |
46M |
0:00.64 |
bsdiff |
sequential |
none |
609K |
27M |
0:00.33 |
hdiffpatch |
hdiffpatch |
lzma |
65K |
42M |
0:00.37 |
hdiffpatch |
hdiffpatch |
none |
123K |
25M |
0:00.32 |
match-blocks |
sequential |
lzma |
194K |
46M |
0:00.44 |
match-blocks |
sequential |
none |
606K |
25M |
0:00.22 |
match-blocks |
hdiffpatch |
lzma |
189K |
43M |
0:00.38 |
match-blocks |
hdiffpatch |
none |
313K |
24M |
0:00.19 |
Example usage
Examples in C are found in c.
Command line tool
The create patch subcommand
Create a patch foo.patch from tests/files/foo/old to tests/files/foo/new.
$ detools create_patch tests/files/foo/old tests/files/foo/new foo.patch
Successfully created 'foo.patch' in 0.01 seconds!
$ ls -l foo.patch
-rw-rw-r-- 1 erik erik 127 feb 2 10:35 foo.patch
Create the same patch as above, but without compression.
$ detools create_patch --compression none \
tests/files/foo/old tests/files/foo/new foo-no-compression.patch
Successfully created 'foo-no-compression.patch' in 0 seconds!
$ ls -l foo-no-compression.patch
-rw-rw-r-- 1 erik erik 2792 feb 2 10:35 foo-no-compression.patch
Create a hdiffpatch patch foo-hdiffpatch.patch.
$ detools create_patch --algorithm hdiffpatch --patch-type hdiffpatch \
tests/files/foo/old tests/files/foo/new foo-hdiffpatch.patch
Successfully created patch 'foo-hdiffpatch.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch.patch
-rw-rw-r-- 1 erik erik 146 feb 2 10:37 foo-hdiffpatch.patch
Lower memory usage with --algorithm match-blocks algorithm. Mainly useful for big files. Creates slightly bigger patches than bsdiff and hdiffpatch.
$ detools create_patch --algorithm match-blocks \
tests/files/foo/old tests/files/foo/new foo-hdiffpatch-64.patch
Successfully created patch 'foo-hdiffpatch-64.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch-64.patch
-rw-rw-r-- 1 erik erik 404 feb 8 11:03 foo-hdiffpatch-64.patch
Non-sequential but smaller patch with --patch-type hdiffpatch.
$ detools create_patch \
--algorithm match-blocks --patch-type hdiffpatch \
tests/files/foo/old tests/files/foo/new foo-hdiffpatch-sequential.patch
Successfully created 'foo-hdiffpatch-sequential.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch-sequential.patch
-rw-rw-r-- 1 erik erik 389 feb 8 11:05 foo-hdiffpatch-sequential.patch
The create in-place patch subcommand
Create an in-place patch foo-in-place.patch.
$ detools create_patch_in_place --memory-size 3000 --segment-size 500 \
tests/files/foo/old tests/files/foo/new foo-in-place.patch
Successfully created 'foo-in-place.patch' in 0.01 seconds!
$ ls -l foo-in-place.patch
-rw-rw-r-- 1 erik erik 672 feb 2 10:36 foo-in-place.patch
The create bsdiff patch subcommand
Create a bsdiff patch foo-bsdiff.patch, compatible with the original bsdiff program.
$ detools create_patch_bsdiff \
tests/files/foo/old tests/files/foo/new foo-bsdiff.patch
Successfully created 'foo-bsdiff.patch' in 0 seconds!
$ ls -l foo-bsdiff.patch
-rw-rw-r-- 1 erik erik 261 feb 2 10:36 foo-bsdiff.patch
The apply patch subcommand
Apply the patch foo.patch to tests/files/foo/old to create foo.new.
$ detools apply_patch tests/files/foo/old foo.patch foo.new
Successfully created 'foo.new' in 0 seconds!
$ ls -l foo.new
-rw-rw-r-- 1 erik erik 2780 feb 2 10:38 foo.new
The in-place apply patch subcommand
Apply the in-place patch foo-in-place.patch to foo.mem.
$ cp tests/files/foo/in-place-3000-500.mem foo.mem
$ detools apply_patch_in_place foo.mem foo-in-place.patch
Successfully created 'foo.mem' in 0 seconds!
$ ls -l foo.mem
-rw-rw-r-- 1 erik erik 3000 feb 2 10:40 foo.mem
The bsdiff apply patch subcommand
Apply the patch foo-bsdiff.patch to tests/files/foo/old to create foo.new.
$ detools apply_patch_bsdiff tests/files/foo/old foo-bsdiff.patch foo.new
Successfully created 'foo.new' in 0 seconds!
$ ls -l foo.new
-rw-rw-r-- 1 erik erik 2780 feb 2 10:41 foo.new
The patch info subcommand
Print information about the patch foo.patch.
$ detools patch_info foo.patch
Type: sequential
Patch size: 127 bytes
To size: 2.71 KiB
Patch/to ratio: 4.6 % (lower is better)
Diff/extra ratio: 9828.6 % (higher is better)
Size/data ratio: 0.3 % (lower is better)
Compression: lzma
Number of diffs: 2
Total diff size: 2.69 KiB
Average diff size: 1.34 KiB
Median diff size: 1.34 KiB
Number of extras: 2
Total extra size: 28 bytes
Average extra size: 14 bytes
Median extra size: 14 bytes
Contributing
Fork the repository.
Install prerequisites.
pip install -r requirements.txt
Implement the new feature or bug fix.
Implement test case(s) to ensure that future changes do not break legacy.
Run the tests.
make test
Create a pull request.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file detools-0.53.0.tar.gz
.
File metadata
- Download URL: detools-0.53.0.tar.gz
- Upload date:
- Size: 11.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 95b5bdeaaaf76e73ca4c8015f2d7c0cc98fbe91f5119b0a97faeb991315241db |
|
MD5 | 4230a386ac69ebcaa3e33d1dc1bf8ade |
|
BLAKE2b-256 | ef4cbb789ed55b5266cb02101d6efe52ea48e9fc4a0900690868f297153e3db2 |