Edit distance, Similarity and 2 sequence differences printing
Project description
Python C extension for comparing two sequences
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
cdiffer.dist
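Compute the absolute Levenshtein distance between two sequences. In the examples below a substitution counts as 2 (one deletion plus one insertion), so 'coffee' to 'cafe' gives 4 and 'coffee' to 'xxxxxx' gives 12.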
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> dist('coffee', 'cafe')
4
>>> dist(list('coffee'), list('cafe'))
4
>>> dist(tuple('coffee'), tuple('cafe'))
4
>>> dist(iter('coffee'), iter('cafe'))
4
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
12
>>> dist('coffee', 'coffee')
0
cdiffer.similar
Compute similarity of two strings.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
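Both documented values agree with the formula 1 - distance / (len(a) + len(b)), where the distance counts a substitution as 2. The sketch below assumes that formula; the function name `similarity_from_distance` and the handling of two empty sequences are hypothetical choices, not confirmed library behavior.

```python
def similarity_from_distance(distance, len_a, len_b):
    """Normalize an edit distance into a similarity between 0 and 1.

    Assumption: the formula is inferred from the documented examples.
    """
    total = len_a + len_b
    if total == 0:
        # Assumption: two empty sequences are treated as identical.
        return 1.0
    return 1.0 - distance / total


# 'coffee' (6 items) vs 'cafe' (4 items), documented distance 4
print(similarity_from_distance(4, 6, 4))
# 'hoge' (4 items) vs 'bar' (3 items) share no items, so distance 4 + 3 = 7
print(similarity_from_distance(7, 4, 3))
```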
cdiffer.differ
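Find the sequence of edit operations that transforms one sequence into another.
Each operation is a list of the form [tag, source_index, destination_index, source_item, destination_item].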
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True, rep_rate=0):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
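One way to read the operation lists above is to replay them. The sketch below hard-codes the full (diffonly=False) output for 'coffee' and 'cafe' and rebuilds both sequences from it, so it runs without cdiffer installed. The helper name `rebuild` is hypothetical, and the record layout [tag, source_index, destination_index, source_item, destination_item] is inferred from the printed examples.

```python
# Operations copied from the differ('coffee', 'cafe') output shown above.
ops = [
    ['equal', 0, 0, 'c', 'c'],
    ['delete', 1, None, 'o', None],
    ['insert', None, 1, None, 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]


def rebuild(ops):
    """Reconstruct (source, destination) item lists from a full op list."""
    source, destination = [], []
    for tag, src_index, dst_index, src_item, dst_item in ops:
        if tag != 'insert':
            # 'equal', 'delete' and 'replace' all consume a source item
            source.append(src_item)
        if tag != 'delete':
            # 'equal', 'insert' and 'replace' all emit a destination item
            destination.append(dst_item)
    return source, destination


src, dst = rebuild(ops)
print(''.join(src), ''.join(dst))
```

This only works on the full listing: with diffonly=True the 'equal' entries are missing, so the unchanged items would have to be taken from the original source sequence instead.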
cdiffer.compare
Compare two sequences and pretty-print the result.
Usage
compare(source_sequence, destination_sequence, diffonly=False, rep_rate=60, condition_value=" ---> ")
Examples
>>> from cdiffer import compare
>>> compare('coffee', 'cafe')
[[60, 'insert', 'c', 'a', 'f', 'e'],
 [60, 'delete', 'c', 'o', 'f', 'f', 'e', 'e']]
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
125 ns ± 0.534 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
677 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
681 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
843 ns ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
125 ns ± 0.417 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.5 ns ± 0.338 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
123 ns ± 0.301 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
680 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
647 ns ± 1.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
680 ns ± 7.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
848 ns ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
130 ns ± 0.595 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
54.8 ns ± 0.691 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
735 ns ± 4.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 5.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.31 µs ± 5.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 5.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 5.32 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 6.75 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
524 ns ± 2.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
539 ns ± 2.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 1.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [6]: %timeit compare("coffee", "cafe")
...: %timeit compare([list("abc"), list("abc")], [list("abc"), list("acc"), list("xtz")], rep_rate=50)
...: %timeit compare(["abc", "abc"], ["abc", "acc", "xtz"], rep_rate=40)
...: %timeit compare(["abc", "abc"], ["abc", "acc", "xtz"], rep_rate=50)
844 ns ± 3.88 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
3.32 µs ± 6.92 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1.16 µs ± 3.94 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.3 µs ± 31.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
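The timings above were taken with IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard `timeit` module. The helper name `ns_per_call` is hypothetical, and it reports the best of several runs rather than the mean and standard deviation that %timeit prints, so the numbers are not directly comparable. The built-in `len` stands in for the function under test so the sketch runs without cdiffer installed.

```python
import timeit


def ns_per_call(func, *args, repeat=7, number=100_000):
    """Return the best observed time per call, in nanoseconds."""
    # The lambda wrapper adds a small constant overhead to every call.
    timer = timeit.Timer(lambda: func(*args))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


# Replace `len` with, for example, cdiffer.dist and two sequences.
print(f"{ns_per_call(len, 'coffee'):.1f} ns per call")
```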