Useful differ function with Levenshtein distance.

Project description

Python C Extension for Comparing Two Sequences

How to Install?

pip install cdiffer

Requirements

  • Python 3.6 or later
  • Python 2.7

cdiffer.dist

Compute the absolute Levenshtein distance between two strings (or other sequences).

Usage

dist(sequence, sequence)

Examples (it's hard to spell Levenshtein correctly):

>>> from cdiffer import dist
>>>
>>> dist('coffee', 'cafe')
3
>>> dist(list('coffee'), list('cafe'))
3
>>> dist(tuple('coffee'), tuple('cafe'))
3
>>> dist(iter('coffee'), iter('cafe'))
3
>>> dist(range(4), range(5))
1
>>> dist('coffee', 'xxxxxx')
6
>>> dist('coffee', 'coffee')
0
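
For readers who want to check these numbers by hand, the following is a minimal pure-Python sketch of the classic Levenshtein distance with unit cost for insert, delete, and replace. The name `levenshtein_reference` is hypothetical and is not part of cdiffer; the C extension's internal algorithm may differ, but this sketch reproduces the outputs listed above.

```python
def levenshtein_reference(a, b):
    """Edit distance with unit cost for insert, delete, and replace.

    Hypothetical reference helper, not part of cdiffer.
    """
    # Materialize the inputs so that iterators and ranges can be indexed.
    a, b = list(a), list(b)
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to the empty sequence
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # replace x with y, or keep a match
            ))
        prev = curr
    return prev[-1]


print(levenshtein_reference('coffee', 'cafe'))
print(levenshtein_reference(range(4), range(5)))
```

The sketch keeps only two rows of the dynamic-programming table, so memory use grows with the length of the second sequence rather than with the product of both lengths.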

cdiffer.similar

Compute the similarity of two strings.

Usage

similar(sequence, sequence)

The similarity is a number between 0 and 1. It is usually equal to or somewhat higher than difflib.SequenceMatcher.ratio(), because it is based on the true minimal edit distance.

Examples

>>> from cdiffer import similar
>>>
>>> similar('coffee', 'cafe')
0.6
>>> similar('hoge', 'bar')
0.0
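
The two values above are consistent with a ratio of the form (total length - distance) / total length, where a replace counts as 2 (one delete plus one insert). That formula is an assumption inferred from the example outputs, not something this page states, and `similarity_sketch` below is a hypothetical name rather than a cdiffer function.

```python
def similarity_sketch(a, b):
    """Similarity in [0, 1] under the ASSUMED formula described above.

    Hypothetical helper, not part of cdiffer.
    """
    a, b = list(a), list(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty sequences count as identical
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,                         # delete x
                curr[j - 1] + 1,                     # insert y
                prev[j - 1] + (0 if x == y else 2),  # replace costs 2
            ))
        prev = curr
    return (total - prev[-1]) / float(total)


print(similarity_sketch('coffee', 'cafe'))
print(similarity_sketch('hoge', 'bar'))
```

For 'coffee' and 'cafe' the weighted distance is 4 and the total length is 10, which gives 0.6. 'hoge' and 'bar' share no characters, so the distance equals the total length and the ratio is 0.0.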

cdiffer.differ

Find the sequence of edit operations that transforms one string into another.

Usage

differ(source_sequence, destination_sequence, diffonly=False)

Examples

>>> from cdiffer import differ
>>>
>>> for x in differ('coffee', 'cafe'):
...     print(x)
...
['equal',   0, 0,   'c', 'c']
['replace', 1, 1,   'o', 'a']
['equal',   2, 2,   'f', 'f']
['delete',  3, None,'f',None]
['delete',  4, None,'e',None]
['equal',   5, 3,   'e', 'e']
>>> for x in differ('coffee', 'cafe', diffonly=True):
...     print(x)
...
['replace', 1, 1,   'o', 'a']
['delete',  3, None,'f',None]
['delete',  4, None,'e',None]
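
Each row in the output above appears to have the shape [tag, source_index, destination_index, source_item, destination_item]. As a small sketch of how such rows could be consumed, the hypothetical helper `rebuild` below recovers both sequences from the full (diffonly=False) result. The rows are copied from the example above so the snippet runs without cdiffer installed. It assumes that an insert row, which is not shown on this page, would carry None in the source positions.

```python
# Rows copied from the differ('coffee', 'cafe') example above.
ops = [
    ['equal',   0, 0,    'c', 'c'],
    ['replace', 1, 1,    'o', 'a'],
    ['equal',   2, 2,    'f', 'f'],
    ['delete',  3, None, 'f', None],
    ['delete',  4, None, 'e', None],
    ['equal',   5, 3,    'e', 'e'],
]


def rebuild(ops):
    """Recover (source, destination) item lists from a full list of rows.

    Hypothetical helper, not part of cdiffer.
    """
    # A row contributes to the source when it has a source index.
    source = [a for tag, i, j, a, b in ops if i is not None]
    # A row contributes to the destination when it has a destination index.
    dest = [b for tag, i, j, a, b in ops if j is not None]
    return source, dest


source, dest = rebuild(ops)
print(''.join(source), '->', ''.join(dest))
```

This only works on the full result; with diffonly=True the equal rows are omitted, so the sequences cannot be rebuilt from the rows alone.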

Performance

C:\Windows\system>ipython
Python 3.7.7 (tags/v3.7.7:d7c567b08f, Mar 10 2020, 10:41:24) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.21.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from cdiffer import *

In [2]: %timeit dist('coffee', 'cafe')
   ...: %timeit dist(list('coffee'), list('cafe'))
   ...: %timeit dist(tuple('coffee'), tuple('cafe'))
   ...: %timeit dist(iter('coffee'), iter('cafe'))
   ...: %timeit dist(range(4), range(5))
   ...: %timeit dist('coffee', 'xxxxxx')
   ...: %timeit dist('coffee', 'coffee')
   ...:
173 ns ± 0.206 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
741 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
702 ns ± 2.15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 7.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
882 ns ± 7.51 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
210 ns ± 0.335 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
51.8 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [3]: %timeit similar('coffee', 'cafe')
   ...: %timeit similar(list('coffee'), list('cafe'))
   ...: %timeit similar(tuple('coffee'), tuple('cafe'))
   ...: %timeit similar(iter('coffee'), iter('cafe'))
   ...: %timeit similar(range(4), range(5))
   ...: %timeit similar('coffee', 'xxxxxx')
   ...: %timeit similar('coffee', 'coffee')
   ...:
186 ns ± 0.476 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
718 ns ± 0.878 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
691 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
706 ns ± 2.01 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
920 ns ± 8.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
223 ns ± 0.938 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
55 ns ± 0.308 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit differ('coffee', 'cafe')
   ...: %timeit differ(list('coffee'), list('cafe'))
   ...: %timeit differ(tuple('coffee'), tuple('cafe'))
   ...: %timeit differ(iter('coffee'), iter('cafe'))
   ...: %timeit differ(range(4), range(5))
   ...: %timeit differ('coffee', 'xxxxxx')
   ...: %timeit differ('coffee', 'coffee')
   ...:
814 ns ± 2.79 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.36 µs ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.33 µs ± 4.19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.37 µs ± 4.64 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
2.03 µs ± 19.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
865 ns ± 1.89 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
724 ns ± 1.72 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [5]: a = dict(zip('012345', 'coffee'))
   ...: b = dict(zip('0123', 'cafe'))
   ...: %timeit dist(a, b)
   ...: %timeit similar(a, b)
   ...: %timeit differ(a, b)
320 ns ± 1.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
327 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
983 ns ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
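
To get comparable timings outside IPython, a rough harness can be sketched with the standard-library timeit module. The `bench` helper is hypothetical, and it reports the best of several runs rather than the mean ± std. dev. that %timeit prints. difflib is used as the timed function here so the snippet runs without cdiffer; if cdiffer is installed, passing `cdiffer.similar` instead should work the same way.

```python
import timeit
from difflib import SequenceMatcher


def bench(func, *args, number=1000, repeat=5):
    """Return the best observed time per call, in seconds.

    Hypothetical helper; `number` calls are timed `repeat` times.
    """
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def difflib_ratio(a, b):
    # Standard-library baseline for comparison with similar().
    return SequenceMatcher(None, a, b).ratio()


baseline = bench(difflib_ratio, 'coffee', 'cafe')
print('difflib ratio: %.0f ns per call' % (baseline * 1e9))
```

Absolute numbers depend on the machine and Python build, so they are only meaningful when both functions are measured in the same session.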

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • cdiffer-0.1.5-cp36-cp36m-manylinux2014_aarch64.whl (52.8 kB): CPython 3.6m
  • cdiffer-0.1.5-cp36-cp36m-manylinux2010_x86_64.whl (54.5 kB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
  • cdiffer-0.1.5-cp36-cp36m-macosx_10_15_x86_64.whl (22.4 kB): CPython 3.6m, macOS 10.15+ x86-64
  • cdiffer-0.1.5-cp27-cp27mu-manylinux2014_aarch64.whl (50.1 kB): CPython 2.7mu
  • cdiffer-0.1.5-cp27-cp27mu-manylinux2010_x86_64.whl (51.5 kB): CPython 2.7mu, manylinux: glibc 2.12+ x86-64
  • cdiffer-0.1.5-cp27-cp27m-macosx_10_15_x86_64.whl (22.5 kB): CPython 2.7m, macOS 10.15+ x86-64
