Reversible adjacent XOR differencing transform algo
XOR ∆
A reversible adjacent XOR differencing transform algo
"To compress, perchance to save..."
Synopsis
xor-delta is a small experimental Python package that explores XOR-adjacent delta encoding as a preprocessing transform for compression.
It answers a very specific question:
Does XORing adjacent values reduce entropy in a way that helps real compressors?
- Short answer: maybe 🤷🏼♂️
- Long answer: Step into my office...
What is XOR-delta?
For Those in a Hurry...
The core transform used by this project is:
$$A_i^{(k+1)} = A_i^{(k)} \oplus A_{i+1}^{(k)} \quad \forall i$$
Where:
- $A^{(k)}$ is the sequence after $k$ applications of the transform, so $A^{(0)}$ is the original sequence
- $\oplus$ denotes bitwise XOR
- One endpoint value (the anchor) is stored to make the transform reversible
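As a rough illustration of the formula above, the sketch below applies the step repeatedly to a short list of integers. The helper name `xor_step` is made up for this example and is not part of the package's API. Anchors are left out, so it shows the forward direction only.

```python
def xor_step(seq):
    """One application of A_i^(k+1) = A_i^(k) XOR A_(i+1)^(k)."""
    # Pair each element with its right-hand neighbour; the result is one shorter.
    return [a ^ b for a, b in zip(seq, seq[1:])]


levels = [[10, 11, 12, 13]]          # A^(0), the original sequence
while len(levels[-1]) > 1:
    levels.append(xor_step(levels[-1]))

for k, level in enumerate(levels):
    print(f"A^({k}) = {level}")
# A^(0) = [10, 11, 12, 13]
# A^(1) = [1, 7, 1]
# A^(2) = [6, 6]
# A^(3) = [0]
```

Each step shortens the sequence by one element, which is why an anchor has to be kept for every step you want to undo.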
How Does It Work?
Given a sequence of values:
v0, v1, v2, v3, ...
XOR-delta encoding stores:
- one anchor value (first or last)
- a list of XORs between adjacent values:
v0 ^ v1, v1 ^ v2, v2 ^ v3, ...
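Here is a minimal sketch of why a single anchor is enough to undo the transform. The names `encode` and `decode` and the `side` argument are hypothetical and may not match the package's own functions. The sketch assumes a non-empty list of integers.

```python
def encode(values, side="first"):
    """Return (anchor, diffs) for a non-empty list of ints."""
    anchor = values[0] if side == "first" else values[-1]
    diffs = [a ^ b for a, b in zip(values, values[1:])]
    return anchor, diffs


def decode(anchor, diffs, side="first"):
    """Rebuild the original list from the anchor and the adjacent XORs."""
    if side == "first":
        out = [anchor]
        for d in diffs:                 # v[i+1] = v[i] ^ (v[i] ^ v[i+1])
            out.append(out[-1] ^ d)
        return out
    out = [anchor]
    for d in reversed(diffs):           # v[i] = v[i+1] ^ (v[i] ^ v[i+1])
        out.append(out[-1] ^ d)
    out.reverse()                       # walked right-to-left, so flip back
    return out


values = [10, 11, 12, 13]
for side in ("first", "last"):
    anchor, diffs = encode(values, side)
    assert decode(anchor, diffs, side) == values
```

Decoding works because `x ^ (x ^ y) == y`: each stored XOR recovers the next neighbour from the one already known.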
This transform is:
- lossless
- reversible
- cheap
- not compression by itself
It’s a preprocessing step you can feed into standard compressors like zlib, bz2, or lzma.
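To make the "preprocessing step" idea concrete, here is a small sketch that runs the same bytes through the standard library compressors with and without an adjacent-XOR pass. The helpers `xor_adjacent` and `compressed_sizes` are illustrative names, and the byte layout (anchor first, then the XORs) is an assumption rather than the package's actual format.

```python
import bz2
import lzma
import zlib


def xor_adjacent(data: bytes) -> bytes:
    """Keep the first byte as the anchor, then store XORs of adjacent bytes."""
    if not data:
        return b""
    return bytes([data[0]]) + bytes(a ^ b for a, b in zip(data, data[1:]))


def compressed_sizes(data: bytes) -> dict:
    """Compressed size of `data` under each compressor, default settings."""
    return {
        "zlib": len(zlib.compress(data)),
        "bz2": len(bz2.compress(data)),
        "lzma": len(lzma.compress(data)),
    }


sample = b"hello world " * 200          # stand-in input; substitute your own bytes
raw_sizes = compressed_sizes(sample)
xor_sizes = compressed_sizes(xor_adjacent(sample))
for name in raw_sizes:
    print(f"{name}: raw={raw_sizes[name]} xor={xor_sizes[name]}")
```

The transformed stream has the same length as the input, so any size difference comes entirely from how the compressor handles it.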
Installation
pip install xor-delta
Or
git clone https://GitHub.com/DJStompZone/xor_delta
cd xor_delta
pip install . # with Poetry, run `poetry install --with dev` instead if you plan to run tests
Python API
Integer sequences
from xor_delta import xor_delta_encode_ints, xor_delta_decode_ints
data = [10, 11, 12, 13]
encoded = xor_delta_encode_ints(data)
decoded = xor_delta_decode_ints(encoded)
assert decoded == data
Byte sequences
from xor_delta import xor_delta_encode_bytes, xor_delta_decode_bytes
data = b"hello world"
anchor, diffs, side = xor_delta_encode_bytes(data)
restored = xor_delta_decode_bytes(anchor, diffs, side)
assert restored == data
CLI Benchmark Tool
xor-delta ships with a benchmarking CLI that compares compression before and after XOR-delta.
Run the default benchmark (Shakespeare)
xor-delta-bench
This automatically downloads Shakespeare from Project Gutenberg, caches it locally, and benchmarks:
- raw bytes
- XOR-adjacent bytes
using:
- zlib
- bz2
- lzma
corpus_cache/pg100.txt.<hash>
RAW raw=5,638,525 zlib=2,138,296 (0.379x) bz2=1,586,908 (0.281x) lzma=1,673,804 (0.297x)
XOR raw=5,638,525 zlib=2,546,436 (0.452x) bz2=1,708,046 (0.303x) lzma=1,890,440 (0.335x)
xor-vs-raw zlib +19.09% bz2 +7.63% lzma +12.94%
Interpretation:
XOR-delta made compression worse for English text across all tested compressors: output grew by 19.09% with zlib, 7.63% with bz2, and 12.94% with lzma.
That’s the point: we measured it instead of guessing.
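For readers who want to check the figures, the ratios and percentages in the benchmark output above can be reproduced from the reported byte counts. This is plain arithmetic on the numbers shown, not the CLI's own code.

```python
raw_size = 5_638_525
raw = {"zlib": 2_138_296, "bz2": 1_586_908, "lzma": 1_673_804}
xor = {"zlib": 2_546_436, "bz2": 1_708_046, "lzma": 1_890_440}

changes = {}
for name in raw:
    raw_ratio = raw[name] / raw_size                    # e.g. the 0.379x column
    xor_ratio = xor[name] / raw_size
    changes[name] = (xor[name] / raw[name] - 1) * 100   # the xor-vs-raw row
    print(f"{name}: raw {raw_ratio:.3f}x, xor {xor_ratio:.3f}x, "
          f"change {changes[name]:+.2f}%")
# zlib: raw 0.379x, xor 0.452x, change +19.09%
# bz2: raw 0.281x, xor 0.303x, change +7.63%
# lzma: raw 0.297x, xor 0.335x, change +12.94%
```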
Benchmark your own files
xor-delta-bench myfile.bin
xor-delta-bench mydir/
Use a Gutenberg preset
xor-delta-bench --gutenberg shakespeare
# Feel free to send a PR if you want more presets <3
Or any URL
xor-delta-bench --gutenberg-url https://example.com/text.txt
Downloads are cached in corpus_cache/.
When does XOR-delta help?
XOR-adjacent transforms can help when:
- data has small local variation
- values are structured, not textual
- adjacent samples are correlated
Examples:
- counters
- timestamps
- some sensor streams
- monotonic-ish numeric data
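As a hedged illustration of the counter case, the sketch below compares the byte-level Shannon entropy of an incrementing 32-bit counter before and after an adjacent-XOR pass. `byte_entropy` is a helper written for this example. A byte histogram is only a crude proxy and does not predict what a real compressor will achieve, so treat it as a sanity check rather than a benchmark.

```python
import math
import struct
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


counter = list(range(4096))                       # a plain incrementing counter
# First value as the anchor, followed by XORs of adjacent values.
xored = [counter[0]] + [a ^ b for a, b in zip(counter, counter[1:])]

raw_bytes = struct.pack(f">{len(counter)}I", *counter)   # 32-bit big-endian
xor_bytes = struct.pack(f">{len(xored)}I", *xored)

print(f"raw: {byte_entropy(raw_bytes):.2f} bits/byte")
print(f"xor: {byte_entropy(xor_bytes):.2f} bits/byte")
```

For a counter, `i ^ (i + 1)` is always a run of low one-bits (1, 3, 7, 15, ...), so the transformed stream is dominated by a handful of byte values.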
It can hurt when:
- data is already high-entropy
- the compressor already exploits the structure better on its own (for example, LZ-based compressors on text)
- XOR destroys symbol locality
Development
Run tests:
pytest
# Or if you're using Poetry
poetry run pytest
License
MIT
Credits
Created by DJ Stomp https://github.com/DJStompZone/xor_delta
Inspired by spectcow's original description of the algorithm; full credit for the core concept goes to them.