High-performance Myers diff algorithm in C with Python bindings
Project description
myers-diff
High-performance Myers diff algorithm implementation in C with Python bindings.
Installation
pip install myers-diff
Usage
import myers_diff
# Compare two lists
list_a = ["hello", "world", "foo", "bar"]
list_b = ["hello", "there", "foo", "baz"]
# Get full edit operations
ops = myers_diff.diff(list_a, list_b)
for op in ops:
print(f"{op['type']} [{op['index']}]: {op['line']}")
# DELETE [1]: world
# INSERT [1]: there
# DELETE [3]: bar
# INSERT [3]: baz
# Get just the edit count
count = myers_diff.diff_count(list_a, list_b)
print(f"Edit distance: {count}") # 4
# Bounded diff - early exit if distance exceeds max_d
ops = myers_diff.diff(list_a, list_b, max_d=2)
if ops is None:
print("Lists are too different (> 2 edits)")
else:
print(f"Found {len(ops)} edits")
API
diff(a, b, max_d=-1)
Compute the edit operations to transform list a into list b.
a: Source list of stringsb: Target list of stringsmax_d: Maximum allowed edit distance. If the actual distance exceeds this, returnsNonefor early exit. Use-1(default) for no limit.
Returns a list of operations [{"type": "DELETE"|"INSERT", "index": int, "line": str}, ...] or None if max_d is exceeded.
diff_count(a, b, max_d=-1)
Get the number of edit operations without building the full operation list.
Returns the count as an integer, or None if max_d is exceeded.
distance(a, b)
Compute the edit distance using dynamic programming (no early exit option).
Returns the edit distance as an integer.
Performance
Optimized C implementation with:
- Pre-computed string hashes for fast comparison
- Common prefix/suffix trimming
- Compact trace storage
- Optional early termination with
max_dparameter
Benchmarks on ~4500 word lists with ~2200 edits:
- Full diff: ~10ms
- Bounded diff with early exit: <1ms
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file myers_diff-0.1.0.tar.gz.
File metadata
- Download URL: myers_diff-0.1.0.tar.gz
- Upload date:
- Size: 71.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
77c30b61b0682768d75bc5aeaa3635edf2cb8a6150b12b09d66fb03df4cacccd
|
|
| MD5 |
1e391a2bd884232f6ec8be142bc957ab
|
|
| BLAKE2b-256 |
ee1f4c6ed6f8557f4dfebfb926903b98d008558e0ab56071663a0f00982a1ad4
|
File details
Details for the file myers_diff-0.1.0-cp312-cp312-macosx_10_9_universal2.whl.
File metadata
- Download URL: myers_diff-0.1.0-cp312-cp312-macosx_10_9_universal2.whl
- Upload date:
- Size: 126.7 kB
- Tags: CPython 3.12, macOS 10.9+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c87a451992de93b8d075b204825daccd6f5ca0efd2b63dcc2b7f29f2d0351355
|
|
| MD5 |
bf9bf45642ca9716c18017fa1be4ca0a
|
|
| BLAKE2b-256 |
c9ffb1c61b1a19420a0e7dc9ddd2c9a0175a3c0c65ae37d500e31c9b287bb9e6
|