Python C Extension to Compare Two Sequences
Useful differ function with Levenshtein distance.
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences.
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1. It is usually equal to or somewhat higher than difflib.SequenceMatcher.ratio(), because it is based on the true minimal edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
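The two values above are consistent with one common normalization: count a replacement as two edits (a delete plus an insert) and divide the remaining overlap by the combined length. This is an inference from the examples, not something the project description states, so treat the sketch below as an assumption. The names `indel_distance` and `similarity_sketch` are hypothetical.

```python
from difflib import SequenceMatcher


def indel_distance(a, b):
    """Edit distance where a replacement costs 2 (assumed weighting)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                        # delete x
                cur[j - 1] + 1,                     # insert y
                prev[j - 1] + (0 if x == y else 2)  # keep, or replace at cost 2
            ))
        prev = cur
    return prev[-1]


def similarity_sketch(a, b):
    """(len(a) + len(b) - distance) / (len(a) + len(b)), assumed formula."""
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty sequences are treated as identical
    return (total - indel_distance(a, b)) / total


print(similarity_sketch('coffee', 'cafe'))  # 0.6
print(similarity_sketch('hoge', 'bar'))     # 0.0
print(SequenceMatcher(None, 'coffee', 'cafe').ratio())
```

Under this assumed formula the score can never fall below the difflib ratio, because difflib's greedy block matching finds at most as many matching items as an optimal alignment does.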
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
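Each operation reads as [tag, source index, destination index, source item, destination item]. As a rough illustration of how the full (diffonly=False) output could be consumed, the sketch below rebuilds the destination from the operations listed above. The helper `apply_ops` is hypothetical, and the shape of an 'insert' operation (destination item in the last slot) is an assumption, since no insert appears in the examples.

```python
def apply_ops(ops):
    """Rebuild the destination sequence from full differ-style operations."""
    out = []
    for tag, src_index, dst_index, src_item, dst_item in ops:
        if tag == 'delete':
            continue          # the source item has no counterpart in the destination
        out.append(dst_item)  # 'equal', 'replace', and (assumed) 'insert'
    return out


# Operations copied from the differ('coffee', 'cafe') example above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['replace', 1, 1, 'o', 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]

print(''.join(apply_ops(ops)))                     # cafe
print(sum(1 for op in ops if op[0] != 'equal'))    # 3
```

For this example the number of non-equal operations is 3, which matches dist('coffee', 'cafe').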
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
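The timings above use the IPython %timeit magic. To take comparable measurements from a plain script, a small harness built on the standard-library timeit module is one option. The helper name `bench` and its defaults are hypothetical, the keyword-only arguments require Python 3, and the lambda wrapper adds a little call overhead, so absolute numbers will not match %timeit exactly.

```python
import timeit


def bench(func, *args, repeat=7, number=100000):
    """Return the mean time per call in nanoseconds over `repeat` runs."""
    runs = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return sum(runs) / len(runs) / number * 1e9


# With cdiffer installed, one could measure it like this:
#     from cdiffer import dist
#     print(bench(dist, 'coffee', 'cafe'))
# A builtin stands in here so the snippet runs without the extension.
print('%.1f ns per call' % bench(len, 'coffee'))
```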