Useful differ function with Levenshtein distance.

Project description

Python C extension for comparing two sequences

How to Install?

pip install cdiffer

Requirements

  • Python 3.6 or later
  • Python 2.7

cdiffer.dist

Compute the absolute Levenshtein distance between two sequences (strings, lists, tuples, iterators, ranges).

Usage

dist(sequence, sequence)

Examples (it's hard to spell Levenshtein correctly):

>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
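
To make the numbers above concrete, here is a minimal pure-Python sketch of the classic dynamic-programming Levenshtein distance. It is not cdiffer's C implementation; the name levenshtein_reference is hypothetical and only illustrates the definition, assuming unit cost for insert, delete, and replace.

```python
def levenshtein_reference(a, b):
    """Reference edit distance with unit-cost insert, delete, and replace.

    Hypothetical helper for illustration only; not part of cdiffer.
    """
    a, b = list(a), list(b)  # materialize iterators and ranges
    # At the start of row i, previous[j] is the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # distance between a[:i] and the empty sequence
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x from the source
                current[j - 1] + 1,          # insert y into the source
                previous[j - 1] + (x != y),  # replace x with y, or keep it
            ))
        previous = current
    return previous[-1]


print(levenshtein_reference('coffee', 'cafe'))    # 3
print(levenshtein_reference(range(4), range(5)))  # 1
```

A sketch like this can serve as a slow cross-check for dist in a test suite.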

cdiffer.similar

Compute the similarity of two sequences.

Usage

similar(sequence, sequence)

The similarity is a number between 0 and 1. It is usually equal to or somewhat higher than difflib.SequenceMatcher.ratio(), because it is based on the true minimal edit distance.

Examples

>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
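
The two results above are consistent with a normalized score of the form (len(a) + len(b) - d) / (len(a) + len(b)), where d is an edit distance in which a replacement costs 2 (one delete plus one insert). That formula is an assumption inferred from the examples, not something taken from cdiffer's source. The sketch below uses the hypothetical name similarity_sketch and prints difflib's ratio for the same pair for comparison.

```python
from difflib import SequenceMatcher


def similarity_sketch(a, b):
    """Assumed similarity: (total length - weighted distance) / total length.

    Hypothetical helper; the replace cost of 2 is an assumption, not cdiffer's API.
    """
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty sequences count as identical
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                         # delete
                current[j - 1] + 1,                      # insert
                previous[j - 1] + (0 if x == y else 2),  # keep, or replace at cost 2
            ))
        previous = current
    return (total - previous[-1]) / total


print(similarity_sketch('coffee', 'cafe'))              # 0.6
print(similarity_sketch('hoge', 'bar'))                 # 0.0
print(SequenceMatcher(None, 'coffee', 'cafe').ratio())  # difflib's score for the same pair
```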

cdiffer.differ

Find the sequence of edit operations that transforms one sequence into another.

Usage

differ(source_sequence, destination_sequence, diffonly=False)

Examples

>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal',   0, 0,   'c', 'c']
['replace', 1, 1,   'o', 'a']
['equal',   2, 2,   'f', 'f']
['delete',  3, None,'f',None]
['delete',  4, None,'e',None]
['equal',   5, 3,   'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['replace', 1, 1,   'o', 'a']
['delete',  3, None,'f',None]
['delete',  4, None,'e',None]
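
One way to read the rows above is as [tag, source_index, destination_index, source_item, destination_item]. Under that reading, the destination can be rebuilt from a full (diffonly=False) result by keeping the last field of every row that is not a delete. The sketch below uses the documented rows as literal data, so it runs without cdiffer installed. The helper name apply_ops is hypothetical, and the 'insert' tag it accepts is an assumption, since the examples only show 'equal', 'replace', and 'delete'.

```python
def apply_ops(ops):
    """Rebuild the destination sequence from a full list of differ-style rows.

    Hypothetical helper for illustration; not part of cdiffer.
    """
    result = []
    for tag, _src_index, _dst_index, _src_item, dst_item in ops:
        if tag == 'delete':
            continue  # the item exists only in the source
        if tag in ('equal', 'replace', 'insert'):  # 'insert' is an assumed tag
            result.append(dst_item)
        else:
            raise ValueError('unknown tag: %r' % (tag,))
    return result


# Rows copied from the documented differ('coffee', 'cafe') output.
ops = [
    ['equal',   0, 0,    'c', 'c'],
    ['replace', 1, 1,    'o', 'a'],
    ['equal',   2, 2,    'f', 'f'],
    ['delete',  3, None, 'f', None],
    ['delete',  4, None, 'e', None],
    ['equal',   5, 3,    'e', 'e'],
]
print(''.join(apply_ops(ops)))  # cafe
```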

Performance

C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from cdiffer import *

In [2]: %timeit dist('coffee', 'cafe')
   ...: %timeit dist(list('coffee'), list('cafe'))
   ...: %timeit dist(tuple('coffee'), tuple('cafe'))
   ...: %timeit dist(iter('coffee'), iter('cafe'))
   ...: %timeit dist(range(4), range(5))
   ...: %timeit dist('coffee', 'xxxxxx')
   ...: %timeit dist('coffee', 'coffee')
   ...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [3]: %timeit similar('coffee', 'cafe')
   ...: %timeit similar(list('coffee'), list('cafe'))
   ...: %timeit similar(tuple('coffee'), tuple('cafe'))
   ...: %timeit similar(iter('coffee'), iter('cafe'))
   ...: %timeit similar(range(4), range(5))
   ...: %timeit similar('coffee', 'xxxxxx')
   ...: %timeit similar('coffee', 'coffee')
   ...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit differ('coffee', 'cafe')
   ...: %timeit differ(list('coffee'), list('cafe'))
   ...: %timeit differ(tuple('coffee'), tuple('cafe'))
   ...: %timeit differ(iter('coffee'), iter('cafe'))
   ...: %timeit differ(range(4), range(5))
   ...: %timeit differ('coffee', 'xxxxxx')
   ...: %timeit differ('coffee', 'coffee')
   ...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: a = dict(zip('012345', 'coffee'))
   ...: b = dict(zip('0123', 'cafe'))
   ...: %timeit dist(a, b)
   ...: %timeit similar(a, b)
   ...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
