Useful differ function with Levenshtein distance.
Project description
Python C Extension for Comparing Two Sequences
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences.
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.7
>>> similar('hoge', 'bar')
0.0
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> # The matching rate is set with the `rep_rate` option (default: 60, in percent)
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
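Judging from the output shown, each row reads as `[tag, source_index, destination_index, source_item, destination_item]`. As a rough sketch, both sequences can be rebuilt from such rows. The rows below are copied from the first example rather than computed, and the helper names `rebuild` and `changes_only` are hypothetical, not part of cdiffer.

```python
# Rows copied from the differ('coffee', 'cafe') output above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['insert', None, 1, None, 'a'],
    ['delete', 1, None, 'o', None],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]


def rebuild(ops):
    """Return (source, destination) strings reconstructed from the rows."""
    # Every row except 'insert' carries a source item.
    source = ''.join(src for tag, _, _, src, _ in ops if tag != 'insert')
    # Every row except 'delete' carries a destination item.
    destination = ''.join(dst for tag, _, _, _, dst in ops if tag != 'delete')
    return source, destination


def changes_only(ops):
    """Drop 'equal' rows, mirroring the diffonly=True listing above."""
    return [op for op in ops if op[0] != 'equal']


print(rebuild(ops))
print(len(changes_only(ops)))
```

The same `rebuild` logic also works for rows tagged `'replace'`, since those carry both a source and a destination item.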
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
162 ns ± 0.752 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
709 ns ± 3.33 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
658 ns ± 7.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.08 µs ± 6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.14 µs ± 5.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
199 ns ± 0.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
50.3 ns ± 0.103 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
161 ns ± 0.079 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
708 ns ± 5.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
671 ns ± 2.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.11 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.15 µs ± 7.85 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
196 ns ± 0.242 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.9 ns ± 0.628 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
683 ns ± 1.41 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.21 µs ± 9.12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.16 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.63 µs ± 9.98 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.1 µs ± 8.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1 µs ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
493 ns ± 1.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
436 ns ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
434 ns ± 2.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
932 ns ± 3.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
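The figures above come from IPython's `%timeit`. Outside IPython, a comparable measurement can be sketched with the standard `timeit` module. The helper name `bench_ns` is hypothetical, and the snippet simply skips the measurement when cdiffer is not installed. It reports the best run rather than a mean, and the wrapping lambda adds a little call overhead, so the numbers will not match the session above exactly.

```python
import timeit


def bench_ns(func, args=(), number=100000, repeat=7):
    """Best per-call time in nanoseconds over `repeat` runs of `number` calls.

    Hypothetical helper for illustration; not part of cdiffer.
    """
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


try:
    from cdiffer import dist  # the function timed in the session above
except ImportError:
    dist = None  # cdiffer is not installed; nothing to time

if dist is not None:
    print('dist: %.0f ns per call' % bench_ns(dist, ('coffee', 'cafe')))
```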