A pure, minimalist, no-dependency Python library of various edit distances.

pyeditdistance

A pure, minimalist Python library of various edit distance metrics. MIT-licensed, zero dependencies.

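Implemented methods: Levenshtein, normalized Levenshtein [1], Damerau-Levenshtein, Hamming, and longest common subsequence (LCS).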

Levenshtein and Damerau-Levenshtein distances use the Wagner-Fischer dynamic programming algorithm [2].
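
As an illustration of the Wagner-Fischer recurrence, here is a minimal sketch with unit edit costs. It is not the library's source code: the function names `levenshtein_sketch` and `osa_sketch` are hypothetical, and the transposition handling uses the restricted (optimal string alignment) variant, which is an assumption about which Damerau-Levenshtein flavor is meant.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Levenshtein distance via Wagner-Fischer, keeping only two rows."""
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def osa_sketch(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (assumed variant): adds adjacent swaps."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters counts as one edit.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(levenshtein_sketch("I am Joe Bloggs", "I am John Galt"))  # 8
print(osa_sketch("abc", "cb"))  # 2
```

The two-row form needs O(len(b)) memory for Levenshtein, while the transposition check looks two rows back, so the second function keeps the full table for clarity.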

Basic unit tests are included and can be run with pytest.

Installation

pip install pyeditdistance

Optional (user-specific): pip install --user pyeditdistance

Usage

from pyeditdistance import distance as d

s1 = "I am Joe Bloggs"
s2 = "I am John Galt"

# Levenshtein distance
res = d.levenshtein(s1, s2) # => 8

# Normalized Levenshtein
res = d.normalized_levenshtein(s1, s2) # => 0.4324...

# Damerau-Levenshtein
s3 = "abc"
s4 = "cb"
res = d.damerau_levenshtein(s3, s4) # => 2

# Hamming distance
s5 = "abcccdeeffghh zz"
s6 = "bacccdeeffhghz z"
res = d.hamming(s5, s6) # => 6

# Longest common subsequence (LCS)
s7 = "AAGGQQERqer"
s8 = "AaQERqer"
res = d.longest_common_subsequence(s7, s8) # => 7
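
The values above can be cross-checked with a few lines of plain Python. The sketch below is independent of the library and makes several assumptions: the helper names are hypothetical, the normalized distance is taken to be the Yujian-Bo formula from [1] with unit costs, 2d / (|s1| + |s2| + d), and unequal-length input to the Hamming helper raises `ValueError` (the library's own behavior in that case is not documented here).

```python
def hamming_sketch(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Assumed behavior; the library may handle this case differently.
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def lcs_length_sketch(a: str, b: str) -> int:
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)  # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_from_distance(dist: int, len_a: int, len_b: int) -> float:
    """Yujian-Bo normalization (assumed): 2d / (|a| + |b| + d)."""
    if len_a == 0 and len_b == 0:
        return 0.0  # two empty strings are identical
    return 2 * dist / (len_a + len_b + dist)


print(hamming_sketch("abcccdeeffghh zz", "bacccdeeffhghz z"))  # 6
print(lcs_length_sketch("AAGGQQERqer", "AaQERqer"))            # 7
# Levenshtein distance 8 between strings of length 15 and 14: 16 / 37
print(round(normalized_from_distance(8, 15, 14), 4))           # 0.4324
```

Under this formula the result always lies in [0, 1], reaching 0 only for identical strings.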

References

  1. L. Yujian and L. Bo, "A normalized Levenshtein distance metric," IEEE Transactions on Pattern Analysis and Machine Intelligence (2007). https://ieeexplore.ieee.org/document/4160958
  2. R. Wagner and M. Fischer, "The string-to-string correction problem," Journal of the ACM, 21(1):168-173, 1974.
