Damerau-Levenshtein implementation in Rust for fast performance.
Project description
Rust implementation of the Damerau-Levenshtein distance
A Damerau-Levenshtein implementation in Rust, packaged for Python. Use this package if you need a high-performance distance metric for lists of integers or strings.
This package is based on the C implementation pyxDamerauLevenshtein.
Install
pip install pyrsdameraulevenshtein
Use
import pyrsdameraulevenshtein
distance = pyrsdameraulevenshtein.distance_int([1, 2, 3], [1, 3])
# distance = 1
normalized_distance = pyrsdameraulevenshtein.normalized_distance_int([1, 2, 3], [1, 3])
# normalized_distance = 0.33
similarity = pyrsdameraulevenshtein.similarity_int([1, 2, 3], [1, 3])
# similarity = 0.66
distance = pyrsdameraulevenshtein.distance_str(["A", "B", "C"], ["A", "C"])
# distance = 1
normalized_distance = pyrsdameraulevenshtein.normalized_distance_str(["A", "B", "C"], ["A", "C"])
# normalized_distance = 0.33
similarity = pyrsdameraulevenshtein.similarity_str(["A", "B", "C"], ["A", "C"])
# similarity = 0.66
distance = pyrsdameraulevenshtein.distance_unicode("ABC", "AC")
# distance = 1
normalized_distance = pyrsdameraulevenshtein.normalized_distance_unicode("ABC", "AC")
# normalized_distance = 0.33
similarity = pyrsdameraulevenshtein.similarity_unicode("ABC", "AC")
# similarity = 0.66
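To make the three kinds of result above easier to interpret, here is a minimal pure-Python sketch of an edit distance with adjacent transpositions. It is an illustration only, not the package's Rust code. It assumes the optimal string alignment variant, normalization by the length of the longer input, and similarity as one minus the normalized distance. These assumptions are inferred from the example outputs above, and the function names are hypothetical.

```python
def osa_distance(a, b):
    """Edit distance with insert, delete, substitute, and adjacent swap.

    Assumption: optimal string alignment variant; the package may differ.
    Works on any two sequences whose items support ==.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Swap of two adjacent items counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def normalized_osa_distance(a, b):
    """Assumed normalization: distance divided by the longer length."""
    longest = max(len(a), len(b))
    return osa_distance(a, b) / longest if longest else 0.0


def osa_similarity(a, b):
    """Assumed definition: 1 minus the normalized distance."""
    return 1.0 - normalized_osa_distance(a, b)
```

Under these assumptions, `osa_distance([1, 2, 3], [1, 3])` is 1 and the normalized distance is 1/3, which matches the values shown in the usage examples.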
Get started
- First, create a virtual Python environment.
- Install packages
pip install -r requirements.txt
- Create the Rust binary
- Full performance:
maturin build --release
and
pip install target/wheels/*.whl
- Development version:
maturin develop
- Run the tests
python tests/DamerauLevenshteinTest.py
Performance
In the benchmark below, this package runs roughly 4 times faster than the C implementation pyxDamerauLevenshtein (0.0864 seconds versus 0.3195 seconds for 100,000 pairs of 10-element integer lists).
import random
import time
import pyrsdameraulevenshtein
from pyxdameraulevenshtein import damerau_levenshtein_distance
n = 100000
x = 10
a_lists = [random.sample(list(range(x)), k=x, counts=[x for i in range(x)]) for i in range(n)]
b_lists = [random.sample(list(range(x)), k=x, counts=[x for i in range(x)]) for i in range(n)]
tic = time.perf_counter()
for a, b in zip(a_lists, b_lists):
    result = pyrsdameraulevenshtein.distance_int(a, b)
toc = time.perf_counter()
print(f"{toc - tic:0.4f} seconds, RUST implementation")
# 0.0864 seconds, RUST implementation
tic = time.perf_counter()
for a, b in zip(a_lists, b_lists):
    result = damerau_levenshtein_distance(a, b)
toc = time.perf_counter()
print(f"{toc - tic:0.4f} seconds, Gold standard - pyxdameraulevenshtein implementation")
# 0.3195 seconds, Gold standard - pyxdameraulevenshtein implementation
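A single timed pass can be noisy, so a comparison like the one above is often repeated several times with the fastest pass reported. The following is a minimal sketch of such a harness. The helper names `best_of` and `random_pairs` are hypothetical, and the inputs are drawn with `randrange` from a seeded generator, which differs from the `random.sample` call used in the benchmark above.

```python
import random
import time


def random_pairs(n, x, seed=0):
    """Build n pairs of integer lists, each of length x with values in [0, x).

    Assumption: a fixed seed is used so that repeated runs see the same data.
    """
    rng = random.Random(seed)
    return [
        ([rng.randrange(x) for _ in range(x)],
         [rng.randrange(x) for _ in range(x)])
        for _ in range(n)
    ]


def best_of(fn, pairs, repeats=5):
    """Return the fastest wall-clock time in seconds over several full passes."""
    best = float("inf")
    for _ in range(repeats):
        tic = time.perf_counter()
        for a, b in pairs:
            fn(a, b)
        best = min(best, time.perf_counter() - tic)
    return best


def hamming_like(a, b):
    """Stand-in distance function so the sketch runs without extra packages."""
    return sum(p != q for p, q in zip(a, b))


pairs = random_pairs(n=1000, x=10)
print(f"{best_of(hamming_like, pairs):0.4f} seconds, stand-in function")
```

To compare the two libraries, pass `pyrsdameraulevenshtein.distance_int` and `damerau_levenshtein_distance` as `fn` in place of the stand-in function, using the same `pairs` for both.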