Edit distance, similarity, and difference printing for two sequences
A Python C extension for comparing two sequences.
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
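Compute the absolute Levenshtein distance between two sequences. A substitution counts as 2 (one deletion plus one insertion), as the 'coffee' / 'xxxxxx' example below shows.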
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
4
>>> dist(list('coffee'), list('cafe'))
4
>>> dist(tuple('coffee'), tuple('cafe'))
4
>>> dist(iter('coffee'), iter('cafe'))
4
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
12
>>> dist('coffee', 'coffee')
0
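The values above are consistent with an edit distance that allows only insertions and deletions, each at cost 1. The following is a minimal pure-Python sketch under that assumption; `indel_distance` is a hypothetical name used for illustration, not part of cdiffer, and it is not the extension's C implementation.

```python
def indel_distance(a, b):
    """Edit distance using only insertions and deletions (cost 1 each).

    Assumption: this cost model is inferred from the documented examples.
    """
    a, b = list(a), list(b)  # accept any iterable, including iterators
    # prev[j] is the distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1])  # items match: no extra cost
            else:
                # either delete x from a, or insert y into a
                curr.append(1 + min(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


print(indel_distance('coffee', 'cafe'))
print(indel_distance(range(4), range(5)))
```

This sketch runs in O(len(a) * len(b)) time and keeps only two rows of the table in memory.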
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
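Both documented results match the ratio (len(a) + len(b) - distance) / (len(a) + len(b)), where the distance counts only insertions and deletions. The sketch below reproduces them under that assumption; the names `indel_distance` and `similarity` and the handling of two empty sequences are hypothetical choices for illustration, not cdiffer's confirmed behavior.

```python
def indel_distance(a, b):
    """Insert/delete-only edit distance (assumed cost model)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1])
            else:
                curr.append(1 + min(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Normalize the distance into the range 0..1 (1 means identical)."""
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty sequences count as identical
    return (total - indel_distance(a, b)) / float(total)


print(similarity('coffee', 'cafe'))
print(similarity('hoge', 'bar'))
```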
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True, rep_rate=0):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
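Each row in the output above appears to have the shape [tag, source_index, destination_index, source_item, destination_item]. As a sketch of how such rows could be consumed, the helper below rebuilds both sequences from a full (not diffonly) result. `rebuild` is a hypothetical helper, and the rows are copied from the documented output so that the example runs without cdiffer installed.

```python
def rebuild(ops):
    """Reconstruct (source, destination) from a full list of edit rows."""
    source, destination = [], []
    for tag, src_index, dst_index, src_item, dst_item in ops:
        # 'equal', 'delete' and 'replace' rows consume a source item
        if tag in ('equal', 'delete', 'replace'):
            source.append(src_item)
        # 'equal', 'insert' and 'replace' rows produce a destination item
        if tag in ('equal', 'insert', 'replace'):
            destination.append(dst_item)
    return source, destination


# Rows copied from the documented output of differ('coffee', 'cafe')
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['delete', 1, None, 'o', None],
    ['insert', None, 1, None, 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]

src, dst = rebuild(ops)
print(''.join(src), ''.join(dst))
```

Dispatching on the tag rather than testing items against None keeps the helper correct for sequences that themselves contain None. A diffonly=True result omits the 'equal' rows, so it cannot be used to rebuild the sequences this way.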
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
125 ns ± 0.534 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
677 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
681 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
843 ns ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
125 ns ± 0.417 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.5 ns ± 0.338 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
123 ns ± 0.301 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
680 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
647 ns ± 1.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
680 ns ± 7.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
848 ns ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
130 ns ± 0.595 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
54.8 ns ± 0.691 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
735 ns ± 4.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 5.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.31 µs ± 5.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 5.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 5.32 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 6.75 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
524 ns ± 2.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
539 ns ± 2.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 1.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
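The measurements above were taken with IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard library's timeit module. `per_call_ns` is a hypothetical helper; it reports the best of several runs rather than the mean and standard deviation that %timeit prints, so its numbers are not directly comparable.

```python
import timeit


def per_call_ns(call, number=100000, repeat=7):
    """Best-of-`repeat` time per call of a zero-argument callable, in ns."""
    timings = timeit.repeat(call, number=number, repeat=repeat)
    return min(timings) / number * 1e9


try:
    from cdiffer import dist
except ImportError:
    dist = None  # cdiffer is not installed; skip the measurements

if dist is not None:
    print(per_call_ns(lambda: dist('coffee', 'cafe')))
    # Build the iterators inside the lambda: an iterator created once
    # would be exhausted after the first call.
    print(per_call_ns(lambda: dist(iter('coffee'), iter('cafe'))))
```

The lambda wrapper adds a small constant overhead to every call, which matters at the nanosecond scale shown above.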