Edit distance, Similarity and 2 sequence differences printing
Python C extension for comparing two sequences: edit distance, similarity, and printing of the differences between them.
How to Install?
pip install cdiffer
Requirements
- Python 3.6 or later
- Python 2.7
cdiffer.dist
Compute the absolute Levenshtein distance between two sequences.
Usage
dist(sequence, sequence)
Examples (it's hard to spell Levenshtein correctly):
>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
4
>>> dist(list('coffee'), list('cafe'))
4
>>> dist(tuple('coffee'), tuple('cafe'))
4
>>> dist(iter('coffee'), iter('cafe'))
4
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
12
>>> dist('coffee', 'coffee')
0
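The results above (for example 12 for 'coffee' versus 'xxxxxx') suggest that a replacement is counted as one deletion plus one insertion, that is, with cost 2. The following pure-Python sketch reproduces the example values under that assumption. The name indel_distance is hypothetical and is not part of cdiffer, and the C implementation may work differently.

```python
def indel_distance(a, b):
    """Edit distance using insertions and deletions only (cost 1 each).

    Assumption: a replacement counts as delete + insert (cost 2), which is
    what the documented example results imply.
    """
    a, b = list(a), list(b)          # also accepts iterators and ranges
    prev = list(range(len(b) + 1))   # distance from the empty prefix of a
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1])                  # match: no cost
            else:
                cur.append(1 + min(prev[j], cur[j - 1])) # delete or insert
        prev = cur
    return prev[-1]


print(indel_distance('coffee', 'cafe'))    # 4
print(indel_distance('coffee', 'xxxxxx'))  # 12
```

A reference like this is only useful for cross-checking small inputs; it runs in O(len(a) * len(b)) time in the interpreter.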
cdiffer.similar
Compute the similarity of two sequences.
Usage
similar(sequence, sequence)
The similarity is a number between 0 and 1, based on the Levenshtein edit distance.
Examples
>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
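Both example values are consistent with the ratio (len(a) + len(b) - distance) / (len(a) + len(b)), where a replacement costs 2. This is equivalent to 2 * LCS / (len(a) + len(b)), with LCS the length of the longest common subsequence. The sketch below assumes that formula. The name similarity_ratio is hypothetical, and the return value for two empty inputs is a choice made here, not documented behaviour.

```python
def similarity_ratio(a, b):
    """Similarity in [0, 1] derived from the longest common subsequence.

    Assumption: ratio = 2 * LCS / (len(a) + len(b)), which matches the
    documented examples; not taken from the cdiffer source.
    """
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty sequences are treated as identical
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return 2.0 * prev[-1] / total


print(similarity_ratio('coffee', 'cafe'))  # 0.6
print(similarity_ratio('hoge', 'bar'))     # 0.0
```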
cdiffer.differ
Find the sequence of edit operations that transforms one sequence into another.
Usage
differ(source_sequence, destination_sequence, diffonly=False, rep_rate=60)
Examples
>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['delete', 1, None, 'o', None]
['insert', None, 1, None, 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
>>> for x in differ('coffee', 'cafe', rep_rate=0):
...     print(x)
...
['equal', 0, 0, 'c', 'c']
['replace', 1, 1, 'o', 'a']
['equal', 2, 2, 'f', 'f']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
['equal', 5, 3, 'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True, rep_rate=0):
...     print(x)
...
['replace', 1, 1, 'o', 'a']
['delete', 3, None, 'f', None]
['delete', 4, None, 'e', None]
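Each row in the output above has the shape [tag, source_index, destination_index, source_item, destination_item]. As a sanity check on that reading, the sketch below rebuilds both inputs from a full (diffonly=False) result. The rows are copied from the example output rather than produced by calling differ, and the helper names rebuild_source and rebuild_destination are hypothetical.

```python
# Rows copied from the documented output of differ('coffee', 'cafe').
ROWS = [
    ['equal', 0, 0, 'c', 'c'],
    ['delete', 1, None, 'o', None],
    ['insert', None, 1, None, 'a'],
    ['equal', 2, 2, 'f', 'f'],
    ['delete', 3, None, 'f', None],
    ['delete', 4, None, 'e', None],
    ['equal', 5, 3, 'e', 'e'],
]


def rebuild_source(rows):
    """Collect source items: every row except 'insert' carries one."""
    return [src for tag, _i, _j, src, _dst in rows if tag != 'insert']


def rebuild_destination(rows):
    """Collect destination items: every row except 'delete' carries one."""
    return [dst for tag, _i, _j, _src, dst in rows if tag != 'delete']


print(''.join(rebuild_source(ROWS)))       # coffee
print(''.join(rebuild_destination(ROWS)))  # cafe
```

The same helpers work on the rep_rate=0 output, because a 'replace' row carries both a source and a destination item. They do not work on diffonly=True output, since the 'equal' rows are missing there.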
Performance
C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from cdiffer import *
In [2]: %timeit dist('coffee', 'cafe')
...: %timeit dist(list('coffee'), list('cafe'))
...: %timeit dist(tuple('coffee'), tuple('cafe'))
...: %timeit dist(iter('coffee'), iter('cafe'))
...: %timeit dist(range(4), range(5))
...: %timeit dist('coffee', 'xxxxxx')
...: %timeit dist('coffee', 'coffee')
125 ns ± 0.534 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
677 ns ± 2.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
681 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
843 ns ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
125 ns ± 0.417 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
50.5 ns ± 0.338 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [3]: %timeit similar('coffee', 'cafe')
...: %timeit similar(list('coffee'), list('cafe'))
...: %timeit similar(tuple('coffee'), tuple('cafe'))
...: %timeit similar(iter('coffee'), iter('cafe'))
...: %timeit similar(range(4), range(5))
...: %timeit similar('coffee', 'xxxxxx')
...: %timeit similar('coffee', 'coffee')
123 ns ± 0.301 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
680 ns ± 2.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
647 ns ± 1.78 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
680 ns ± 7.57 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
848 ns ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
130 ns ± 0.595 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
54.8 ns ± 0.691 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [4]: %timeit differ('coffee', 'cafe')
...: %timeit differ(list('coffee'), list('cafe'))
...: %timeit differ(tuple('coffee'), tuple('cafe'))
...: %timeit differ(iter('coffee'), iter('cafe'))
...: %timeit differ(range(4), range(5))
...: %timeit differ('coffee', 'xxxxxx')
...: %timeit differ('coffee', 'coffee')
735 ns ± 4.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 5.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.31 µs ± 5.25 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 5.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 5.32 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 6.75 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
638 ns ± 3.67 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: a = dict(zip('012345', 'coffee'))
...: b = dict(zip('0123', 'cafe'))
...: %timeit dist(a, b)
...: %timeit similar(a, b)
...: %timeit differ(a, b)
524 ns ± 2.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
539 ns ± 2.23 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.07 µs ± 1.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
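The measurements above use IPython's %timeit. Outside IPython, a roughly comparable number can be obtained with the standard timeit module, as sketched below. The helper bench is hypothetical, and it reports the best run rather than the mean and standard deviation that %timeit prints, so the figures are not directly interchangeable. The built-in len stands in for the measured function so that the sketch runs without cdiffer installed.

```python
import timeit


def bench(func, *args, number=100000, repeat=7):
    """Return the best observed time per call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(*args))
    runs = timer.repeat(repeat=repeat, number=number)
    return min(runs) / number * 1e9


# Stand-in target; with cdiffer installed, pass cdiffer.dist and two
# sequences instead, e.g. bench(dist, 'coffee', 'cafe').
print('%.1f ns per call' % bench(len, 'coffee'))
```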