Useful differ function with Levenshtein distance.
Python C Extension for Comparing Two Sequences
How to install
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences (strings, lists, tuples, iterators, or ranges).
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.7
>>> similar('hoge', 'bar')
0.0
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> # `rep_rate` is the matching rate option, in percent (default: 60)
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
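Each record in the output above appears to have the shape [tag, source_index, destination_index, source_item, destination_item]. As a minimal sketch of how such records could be consumed, the helper below rebuilds the destination sequence from them. The name `apply_ops` is hypothetical and not part of cdiffer, and the sketch assumes the records arrive in destination order, as they do in the examples. The records are hardcoded from the first example so the snippet runs without the package installed.

```python
def apply_ops(ops):
    """Rebuild the destination sequence from differ-style records.

    Hypothetical helper for illustration; not part of cdiffer.
    Assumes records are listed in destination order.
    """
    out = []
    for tag, src_idx, dst_idx, src_item, dst_item in ops:
        if tag in ('equal', 'insert', 'replace'):
            # These tags all contribute one item to the destination.
            out.append(dst_item)
        # 'delete' records exist only in the source, so they are skipped.
    return out


# Records copied from the differ('coffee', 'cafe') example above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['insert', None, 1, None, 'a'],
    ['delete', 1, None, 'o', None],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]

print(''.join(apply_ops(ops)))  # prints: cafe
```

The same helper also handles the `rep_rate=0` output, where the insert and delete pair for 'o' and 'a' is reported as a single 'replace' record.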
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
162 ns ± 0.752 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
709 ns ± 3.33 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
658 ns ± 7.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.08 µs ± 6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.14 µs ± 5.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
199 ns ± 0.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
50.3 ns ± 0.103 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
161 ns ± 0.079 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
708 ns ± 5.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
671 ns ± 2.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.11 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.15 µs ± 7.85 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
196 ns ± 0.242 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.9 ns ± 0.628 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
683 ns ± 1.41 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.21 µs ± 9.12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.16 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.63 µs ± 9.98 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.1 µs ± 8.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1 µs ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
493 ns ± 1.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
436 ns ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
434 ns ± 2.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
932 ns ± 3.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)