Usefull differ function with Levenshtein distance.
Project description
Python C Extention 2 Sequence Compare
Usefull differ function with Levenshtein distance.
How to Install?
pip install cdiffer
Requirement
- python3.6 or later
- python2.7
cdiffer.dist
Compute absolute Levenshtein distance of two strings.
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute similarity of two strings.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, it's usually equal or somewhat higher than difflib.SequenceMatcher.ratio(), because it's based on real minimal edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
cdiffer.differ
Find sequence of edit operations transforming one string to another.
Usage
differ(source_sequence, destination_sequence, diffonly=False)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
... print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None,'f',None]
['delete', 4, None,'e',None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
... print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None,'f',None]
['delete', 4, None,'e',None]
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distributions
File details
Details for the file cdiffer-0.1.4-cp36-cp36m-manylinux2014_aarch64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp36-cp36m-manylinux2014_aarch64.whl
- Upload date:
- Size: 52.8 kB
- Tags: CPython 3.6m
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0d50185b1a60b2f23a2c9bd64521c78c0fd4896fa0f05dffc118bf20abda1314 |
|
MD5 | 861f78d72015fd62083c8d540606fd7e |
|
BLAKE2b-256 | 445cf6ea6425201691980cfcd97052872ae0692fef422f28965974d71c6fc523 |
File details
Details for the file cdiffer-0.1.4-cp36-cp36m-manylinux2010_x86_64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp36-cp36m-manylinux2010_x86_64.whl
- Upload date:
- Size: 54.5 kB
- Tags: CPython 3.6m, manylinux: glibc 2.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d21ede1b2972ed2d2d0fe05d631a6c96bb22e7b7dc9937dec679f6d8c55f3018 |
|
MD5 | 20d2d144f6fab768c5753607c1b4a338 |
|
BLAKE2b-256 | e265df4e64fa752b48d3deeb8f92f8451367eaed88934aae76a11e994a21e731 |
File details
Details for the file cdiffer-0.1.4-cp36-cp36m-macosx_10_15_x86_64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp36-cp36m-macosx_10_15_x86_64.whl
- Upload date:
- Size: 22.4 kB
- Tags: CPython 3.6m, macOS 10.15+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | cd6ce99d44d5efb2bf6aa7c0b1d45e76fed2d36aabbcc9c8dd6706962eac7da0 |
|
MD5 | e84341b46d032d6d6b3b71ecefebb138 |
|
BLAKE2b-256 | 3b4d328eddcab0d32922a6885eb04ac9f1b3180af15eceee67d43cb498f301cd |
File details
Details for the file cdiffer-0.1.4-cp27-cp27mu-manylinux2014_aarch64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp27-cp27mu-manylinux2014_aarch64.whl
- Upload date:
- Size: 50.1 kB
- Tags: CPython 2.7mu
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 2fb41869663633ff2683697cd13c75fc759ffd4f9a5a4d62b7f05a7ab359250f |
|
MD5 | d383e40469f0bc7784ec115b93aa36a6 |
|
BLAKE2b-256 | 9e2a669ed6028a3a7d2c0b45a11a11b0e110751836807b4068c525ae6bd9b488 |
File details
Details for the file cdiffer-0.1.4-cp27-cp27mu-manylinux2010_x86_64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp27-cp27mu-manylinux2010_x86_64.whl
- Upload date:
- Size: 51.5 kB
- Tags: CPython 2.7mu, manylinux: glibc 2.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 12ecbf75669f77c69a2b293826a16a2aacc43ea22b5d215b97a1a99764cdb9ef |
|
MD5 | e022c0bf1c305319d96a49fe6c606644 |
|
BLAKE2b-256 | 0e7e882479ebcb4574f7d69b7936e188a57a9e60e2c139ec41910dbd1748cd2b |
File details
Details for the file cdiffer-0.1.4-cp27-cp27m-macosx_10_15_x86_64.whl
.
File metadata
- Download URL: cdiffer-0.1.4-cp27-cp27m-macosx_10_15_x86_64.whl
- Upload date:
- Size: 22.5 kB
- Tags: CPython 2.7m, macOS 10.15+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.8.5
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | b9d6b58d4994fb8506ff85f07559e6ff72f8bcf092ced787fe67d7fbc888087a |
|
MD5 | aa5b76be6710d364b41b3368f602c512 |
|
BLAKE2b-256 | 1b500cecb3c27346a921e6b148676ef76180aa7fe5f9d0a3a2e358f97bb928a7 |