Useful differ function with Levenshtein distance.
Python C extension for comparing two sequences
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences (strings, lists, tuples, iterators, ranges).
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
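For readers who want to see what these numbers mean, here is a minimal pure-Python sketch of the classic Levenshtein distance with unit cost for insert, delete and replace. The name levenshtein is hypothetical and this is not the cdiffer implementation (which is a C extension); it is only a reference that reproduces the results shown above.

```python
def levenshtein(a, b):
    """Reference edit distance: insert, delete and replace each cost 1."""
    # Materialize the inputs so that iterators and ranges work too.
    a, b = list(a), list(b)
    # prev[j] is the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # replace x by y, or keep if equal
            ))
        prev = curr
    return prev[-1]


print(levenshtein('coffee', 'cafe'))    # 3
print(levenshtein(range(4), range(5)))  # 1
print(levenshtein('coffee', 'xxxxxx'))  # 6
```

The sketch runs in O(len(a) * len(b)) time and keeps only two rows of the table in memory.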
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1. It is usually equal to or somewhat higher than difflib.SequenceMatcher.ratio(), because it is based on the true minimal edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
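The exact formula is not stated here, so the following is only a guess: a pure-Python sketch (Python 3) that normalizes an edit distance in which a replace costs 2 (one delete plus one insert) by the combined length of both inputs. The name similarity_sketch is hypothetical. It happens to reproduce the two values shown above, but the library's actual formula may differ.

```python
def similarity_sketch(a, b):
    """Assumed formula: (len(a) + len(b) - weighted_distance) / (len(a) + len(b))."""
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty sequences are treated as identical
    # Edit distance where insert and delete cost 1 and replace costs 2.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                         # delete x
                curr[j - 1] + 1,                     # insert y
                prev[j - 1] + (0 if x == y else 2),  # keep, or replace at cost 2
            ))
        prev = curr
    return (total - prev[-1]) / total


print(similarity_sketch('coffee', 'cafe'))  # 0.6
print(similarity_sketch('hoge', 'bar'))     # 0.0
```

Under this assumption the score equals twice the length of the longest common subsequence divided by the combined length, which is why it can never fall below a ratio built from matching blocks alone.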
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
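Each row reads as [tag, source index, destination index, source item, destination item]. As a small illustration of how the full (diffonly=False) output can be consumed, the sketch below rebuilds the destination from the rows printed above. The helper apply_ops is hypothetical and not part of cdiffer. It also assumes that an 'insert' row, which does not appear in this example, carries the new item in the last position.

```python
def apply_ops(ops):
    """Rebuild the destination sequence from full differ-style rows."""
    out = []
    for tag, _src_idx, _dst_idx, _src_item, dst_item in ops:
        # 'delete' rows have no destination item. 'equal' and 'replace' rows do,
        # and 'insert' rows are assumed to as well.
        if tag != 'delete':
            out.append(dst_item)
    return out


# Rows copied from the differ('coffee', 'cafe') example above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['replace', 1, 1, 'o', 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]
print(''.join(apply_ops(ops)))  # cafe
```

This only works with the full output. With diffonly=True the 'equal' rows are missing, so the unchanged items would have to be taken from the source sequence instead.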
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
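To take comparable measurements outside IPython, the standard timeit module can be used. The sketch below (Python 3) is one way to do it. The helper bench is hypothetical, and difflib.SequenceMatcher stands in as the function under test so that the snippet runs without cdiffer installed. Pass dist, similar or differ instead to time them.

```python
import timeit
from difflib import SequenceMatcher


def bench(func, *args, number=100000, repeat=7):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def difflib_ratio(a, b):
    # Baseline from the standard library, used here only as a stand-in.
    return SequenceMatcher(None, a, b).ratio()


best = bench(difflib_ratio, 'coffee', 'cafe', number=10000, repeat=3)
print('difflib ratio: {:.0f} ns per call'.format(best * 1e9))
```

Note that this reports the fastest run rather than the mean and standard deviation that %timeit prints, and that the lambda wrapper adds a small constant overhead to every call, so the figures are not directly comparable with the ones above.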