Useful differ function with Levenshtein distance.
Project description
Python C extension for comparing two sequences
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences. As the examples show, strings, lists, tuples, iterators and ranges are all accepted.
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1. It is usually equal to or somewhat higher than difflib.SequenceMatcher.ratio(), because it is based on the true minimal edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
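One definition that reproduces both values above is 2 * LCS / (len(a) + len(b)), where LCS is the length of the longest common subsequence. Whether cdiffer computes exactly this is an assumption; the helper names below are hypothetical. The sketch also prints the difflib ratio, which counts matches found by a greedy heuristic and therefore cannot exceed this value.

```python
from difflib import SequenceMatcher


def lcs_length(a, b):
    """Length of the longest common subsequence (hypothetical helper)."""
    a, b = list(a), list(b)
    previous = [0] * (len(b) + 1)
    for x in a:
        current = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def similarity_sketch(a, b):
    """Assumed definition: 2 * LCS / total length, in the range 0..1."""
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty sequences are treated as identical
    return 2.0 * lcs_length(a, b) / total


for left, right in [('coffee', 'cafe'), ('hoge', 'bar')]:
    print(left, right,
          similarity_sketch(left, right),
          SequenceMatcher(None, left, right).ratio())
```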
cdiffer.differ
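Find the sequence of edit operations that transforms one sequence into another. Judging from the examples, each row is [operation, source index, destination index, source item, destination item], and diffonly=True omits the 'equal' rows.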
Usage
differ(source_sequence, destination_sequence, diffonly=False)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
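To make the row layout concrete, the following sketch rebuilds both inputs from the full (diffonly=False) rows printed above. The rows are copied from the example as literal data, so cdiffer itself is not needed to run it. The `rebuild` helper is hypothetical, and the handling of 'insert' rows (None in the source columns) is an assumption by symmetry with 'delete', since no insert appears in the example.

```python
# Rows copied from the differ('coffee', 'cafe') output above.
ROWS = [
    ['equal', 0, 0, 'c', 'c'],
    ['replace', 1, 1, 'o', 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]


def rebuild(rows):
    """Recover (source, destination) item lists from full differ rows."""
    source, destination = [], []
    for tag, src_index, dst_index, src_item, dst_item in rows:
        # A None index means the item does not exist on that side.
        if src_index is not None:
            source.append(src_item)
        if dst_index is not None:
            destination.append(dst_item)
    return source, destination


src, dst = rebuild(ROWS)
print(''.join(src), ''.join(dst))
```

Rows produced with diffonly=True are not enough for this, because the unchanged items are missing.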
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
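The timings above come from IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The helper name `ns_per_call` is hypothetical; it reports the best of several runs rather than mean ± std. dev., and the lambda wrapper adds a little call overhead, so the figures will not match the transcript exactly.

```python
import timeit


def ns_per_call(func, *args, number=100000, repeat=7):
    """Best observed time per call, in nanoseconds (hypothetical helper)."""
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    try:
        from cdiffer import dist, similar, differ
    except ImportError:
        print("cdiffer is not installed; run: pip install cdiffer")
    else:
        for func in (dist, similar, differ):
            print(func.__name__, ns_per_call(func, 'coffee', 'cafe'))
```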