A pure, minimalist, no-dependency Python library of various edit distances.
pyeditdistance
A pure, minimalist Python library of various edit distance metrics. MIT-licensed, zero dependencies.
Implemented methods:
- Levenshtein (iterative and recursive implementations)
- Normalized Levenshtein (using Yujian-Bo [1])
- Damerau-Levenshtein
- Hamming distance
- Longest common subsequence (LCS)
Levenshtein and Damerau-Levenshtein distances use the Wagner-Fischer dynamic programming algorithm [2].
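As an illustration of the dynamic programming approach mentioned above, here is a minimal sketch of a Wagner-Fischer table. It is not the library's source code: the function name `wagner_fischer` and the `transpositions` flag are hypothetical, and the transposition rule shown is the restricted "optimal string alignment" variant, which may or may not match what `damerau_levenshtein` does internally.

```python
def wagner_fischer(a: str, b: str, transpositions: bool = False) -> int:
    """Edit distance between a and b using a full dynamic-programming table.

    With transpositions=True, swapping two adjacent characters also counts
    as one edit (restricted "optimal string alignment" variant; assumed here
    for illustration only).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] = distance between the first i chars of a and the first j chars of b
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete all i characters of a
    for j in range(cols):
        table[0][j] = j  # insert all j characters of b

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # substitution or match
            )
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                # adjacent transposition counts as a single edit
                table[i][j] = min(table[i][j], table[i - 2][j - 2] + 1)
    return table[-1][-1]


print(wagner_fischer("I am Joe Bloggs", "I am John Galt"))  # 8
print(wagner_fischer("abc", "cb", transpositions=True))     # 2
```

The full table uses O(len(a) * len(b)) memory. Keeping only the last two or three rows would be enough if just the distance is needed.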
Basic unit tests can be run with pytest.
Installation
pip install pyeditdistance
Optionally, install for the current user only:
pip install --user pyeditdistance
Usage
from pyeditdistance import distance as d
s1 = "I am Joe Bloggs"
s2 = "I am John Galt"
# Levenshtein distance
res = d.levenshtein(s1, s2) # => 8
# Normalized Levenshtein
res = d.normalized_levenshtein(s1, s2) # => 0.4324...
# Damerau-Levenshtein
s3 = "abc"
s4 = "cb"
res = d.damerau_levenshtein(s3, s4) # => 2
# Hamming distance
s5 = "abcccdeeffghh zz"
s6 = "bacccdeeffhghz z"
res = d.hamming(s5, s6) # => 6
# Longest common subsequence (LCS)
s7 = "AAGGQQERqer"
s8 = "AaQERqer"
res = d.longest_common_subsequence(s7, s8) # => 7
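The expected values above can be cross-checked without the library. The sketch below is an independent illustration, not pyeditdistance code, and every function name in it is hypothetical. The normalized distance assumes the unit-cost form of the Yujian-Bo metric [1], 2d / (|s1| + |s2| + d), where d is the Levenshtein distance. That formula reproduces the 0.4324 value above, but whether the library computes it exactly this way is an assumption.

```python
def levenshtein_sketch(a: str, b: str) -> int:
    """Levenshtein distance, keeping only two rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_levenshtein_sketch(a: str, b: str) -> float:
    """Assumed unit-cost Yujian-Bo form: 2d / (|a| + |b| + d)."""
    if not a and not b:
        return 0.0  # two empty strings are identical
    d = levenshtein_sketch(a, b)
    return 2 * d / (len(a) + len(b) + d)


def hamming_sketch(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def lcs_length_sketch(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(round(normalized_levenshtein_sketch("I am Joe Bloggs", "I am John Galt"), 4))  # 0.4324
print(hamming_sketch("abcccdeeffghh zz", "bacccdeeffhghz z"))  # 6
print(lcs_length_sketch("AAGGQQERqer", "AaQERqer"))            # 7
```

In this sketch the normalized value stays in the range 0 to 1, and the LCS function returns the length of the subsequence rather than the subsequence itself, matching the integer result shown in the usage example.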
References
- [1] L. Yujian and L. Bo, "A normalized Levenshtein distance metric," IEEE Transactions on Pattern Analysis and Machine Intelligence (2007). https://ieeexplore.ieee.org/document/4160958
- [2] R. Wagner and M. Fischer, "The string-to-string correction problem," Journal of the ACM, 21(1):168-173, 1974.
File details
Details for the file pyeditdistance-1.0.1.tar.gz.
File metadata
- Download URL: pyeditdistance-1.0.1.tar.gz
- Upload date:
- Size: 5.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cab95198abe506437d2a82bfe151f63ed1f62358e3358522d4c0b5e96d258308 |
| MD5 | aa7d81aaa8836b0ef2e582c2b906fae1 |
| BLAKE2b-256 | cfcb2946404f631983903ddaa53da379bc16d15b922dc190c526b8958d81e229 |
File details
Details for the file pyeditdistance-1.0.1-py3-none-any.whl.
File metadata
- Download URL: pyeditdistance-1.0.1-py3-none-any.whl
- Upload date:
- Size: 4.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 525fc3c241bc9dbd3a713236d3d85ab299620b37a186c6df76ff4856597db148 |
| MD5 | 4decaa0970d42f2876ddbbd572662467 |
| BLAKE2b-256 | 33b6e9ada1f6cc8bb748ed0f81912d760edfa03e1c94ff5f70e70a47099b5243 |