Skip to main content

High-performance Myers diff algorithm in C with Python bindings

Project description

myers-diff

High-performance Myers diff algorithm implementation in C with Python bindings.

Installation

pip install myers-diff

Usage

import myers_diff

# Compare two lists
list_a = ["hello", "world", "foo", "bar"]
list_b = ["hello", "there", "foo", "baz"]

# Get full edit operations
ops = myers_diff.diff(list_a, list_b)
for op in ops:
    print(f"{op['type']} [{op['index']}]: {op['line']}")
# DELETE [1]: world
# INSERT [1]: there
# DELETE [3]: bar
# INSERT [3]: baz

# Get just the edit count
count = myers_diff.diff_count(list_a, list_b)
print(f"Edit distance: {count}")  # 4

# Bounded diff - early exit if distance exceeds max_d
ops = myers_diff.diff(list_a, list_b, max_d=2)
if ops is None:
    print("Lists are too different (> 2 edits)")
else:
    print(f"Found {len(ops)} edits")

API

diff(a, b, max_d=-1)

Compute the edit operations to transform list a into list b.

  • a: Source list of strings
  • b: Target list of strings
  • max_d: Maximum allowed edit distance. If the actual distance exceeds this, returns None for early exit. Use -1 (default) for no limit.

Returns a list of operations [{"type": "DELETE"|"INSERT", "index": int, "line": str}, ...] or None if max_d is exceeded.

diff_count(a, b, max_d=-1)

Get the number of edit operations without building the full operation list.

Returns the count as an integer, or None if max_d is exceeded.

distance(a, b)

Compute the edit distance using dynamic programming (no early exit option).

Returns the edit distance as an integer.

Performance

Optimized C implementation with:

  • Pre-computed string hashes for fast comparison
  • Common prefix/suffix trimming
  • Compact trace storage
  • Optional early termination with max_d parameter

Benchmarks on ~4500 word lists with ~2200 edits:

  • Full diff: ~10ms
  • Bounded diff with early exit: <1ms

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

myers_diff-0.1.0.tar.gz (71.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

myers_diff-0.1.0-cp312-cp312-macosx_10_9_universal2.whl (126.7 kB view details)

Uploaded CPython 3.12macOS 10.9+ universal2 (ARM64, x86-64)

File details

Details for the file myers_diff-0.1.0.tar.gz.

File metadata

  • Download URL: myers_diff-0.1.0.tar.gz
  • Upload date:
  • Size: 71.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for myers_diff-0.1.0.tar.gz
Algorithm Hash digest
SHA256 77c30b61b0682768d75bc5aeaa3635edf2cb8a6150b12b09d66fb03df4cacccd
MD5 1e391a2bd884232f6ec8be142bc957ab
BLAKE2b-256 ee1f4c6ed6f8557f4dfebfb926903b98d008558e0ab56071663a0f00982a1ad4

See more details on using hashes here.

File details

Details for the file myers_diff-0.1.0-cp312-cp312-macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for myers_diff-0.1.0-cp312-cp312-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 c87a451992de93b8d075b204825daccd6f5ca0efd2b63dcc2b7f29f2d0351355
MD5 bf9bf45642ca9716c18017fa1be4ca0a
BLAKE2b-256 c9ffb1c61b1a19420a0e7dc9ddd2c9a0175a3c0c65ae37d500e31c9b287bb9e6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page