Useful differ function with Levenshtein distance.
Project description
Python C Extension for Comparing 2 Sequences
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two strings (or other sequences).
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
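For cross-checking, the results above agree with the textbook unit-cost Levenshtein distance. The following is a minimal pure-Python sketch of that definition; the name `levenshtein` is hypothetical and this is not the C implementation used by cdiffer, only a slow reference for comparing results.

```python
def levenshtein(a, b):
    """Reference Levenshtein distance for two iterables (unit costs)."""
    # Materialise the inputs so iterators and ranges can be indexed.
    a, b = list(a), list(b)
    # previous[j] holds the distance between the items of `a` seen so far
    # (before the current one) and the first j items of `b`.
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # turning i items into an empty sequence costs i deletions
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # replace x with y, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein('coffee', 'cafe'))
print(levenshtein(iter('coffee'), iter('cafe')))
print(levenshtein(range(4), range(5)))
```

The sketch runs in time proportional to the product of the two lengths and keeps only two rows of the table in memory.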
cdiffer.similar
Compute the similarity of two strings (or other sequences).
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.7
>>> similar('hoge', 'bar')
0.0
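A score between 0 and 1 is convenient for picking the closest candidate from a list. The sketch below shows one way to do that; `best_match` and `toy_score` are hypothetical names, and `toy_score` is only a standard-library stand-in so the example runs without cdiffer. Its values are not the same as those returned by `similar`. In real use, `similar` would be passed as the `score` argument instead.

```python
from difflib import SequenceMatcher


def toy_score(a, b):
    """Stand-in scorer in [0, 1]; NOT the formula used by cdiffer.similar."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(query, candidates, score, threshold=0.5):
    """Return (candidate, score) for the best candidate, or None if none reaches threshold."""
    scored = [(score(query, c), c) for c in candidates]
    if not scored:
        return None
    top_score, top = max(scored, key=lambda pair: pair[0])
    return (top, top_score) if top_score >= threshold else None


# With cdiffer installed, `similar` could replace `toy_score` here.
print(best_match('cafe', ['coffee', 'bar'], toy_score))
print(best_match('zzz', ['coffee', 'bar'], toy_score))
```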
cdiffer.differ
Find the sequence of edit operations that transforms one string into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['insert', None, 1, None, 'a']
['delete', 1, None, 'o', None]
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> # The matching rate is set with the `rep_rate` option (default: 60, in percent)
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
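Judging from the rows printed above, each row appears to have the shape `[tag, source_index, destination_index, source_item, destination_item]`. Under that assumption, a full result (with `diffonly=False`) carries enough information to rebuild both inputs. The sketch below uses rows copied from the output above rather than calling cdiffer, and `rebuild` is a hypothetical helper name.

```python
def rebuild(ops):
    """Rebuild (source, destination) item lists from full differ-style rows."""
    source, destination = [], []
    for tag, src_index, dst_index, src_item, dst_item in ops:
        if tag != 'insert':  # every row except an insert consumes a source item
            source.append(src_item)
        if tag != 'delete':  # every row except a delete yields a destination item
            destination.append(dst_item)
    return source, destination


# Rows copied from the differ('coffee', 'cafe') output shown above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['insert', None, 1, None, 'a'],
    ['delete', 1, None, 'o', None],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]

src, dst = rebuild(ops)
print(''.join(src), ''.join(dst))
```

The same helper works on the `rep_rate=0` rows, where a `replace` row contributes one item to each side. It does not work on `diffonly=True` output, because the `equal` rows are missing there.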
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
162 ns ± 0.752 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
709 ns ± 3.33 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
658 ns ± 7.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.08 µs ± 6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.14 µs ± 5.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
199 ns ± 0.38 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
50.3 ns ± 0.103 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
161 ns ± 0.079 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
708 ns ± 5.43 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
671 ns ± 2.35 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.11 µs ± 15.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.15 µs ± 7.85 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
196 ns ± 0.242 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.9 ns ± 0.628 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
683 ns ± 1.41 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.21 µs ± 9.12 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.16 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.63 µs ± 9.98 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.1 µs ± 8.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1 µs ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
493 ns ± 1.28 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
436 ns ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
434 ns ± 2.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
932 ns ± 3.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
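The measurements above use IPython's `%timeit`. Outside IPython, a similar measurement can be approximated with the standard `timeit` module. The helper below is a rough sketch; `bench` is a hypothetical name, and the example times the built-in `sorted` so that it runs without cdiffer installed. Passing `dist`, `similar`, or `differ` as `func` should work the same way.

```python
import statistics
import timeit


def bench(func, *args, repeat=7, number=100000):
    """Return (mean, stdev) in seconds per call over `repeat` runs of `number` calls."""
    timer = timeit.Timer(lambda: func(*args))
    # timer.repeat returns the total time of each run; divide to get per-call time.
    per_call = [total / number for total in timer.repeat(repeat=repeat, number=number)]
    return statistics.mean(per_call), statistics.stdev(per_call)


# Stand-in workload; replace `sorted` with e.g. `dist` and pass two sequences.
mean, stdev = bench(sorted, 'coffee', number=10000)
print('%.1f ns +/- %.1f ns per loop' % (mean * 1e9, stdev * 1e9))
```

The lambda wrapper adds a small constant overhead to every call, so absolute numbers will be somewhat higher than the `%timeit` figures and are best used for relative comparisons.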