
C implementation of parts of difflib


cdifflib

Python difflib sequence matcher reimplemented in C.

Only parts of it are actually reimplemented. The package creates a CSequenceMatcher type which inherits most functions from difflib.SequenceMatcher.

cdifflib is about 4x the speed of the pure Python difflib when diffing large streams.
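The speedup depends on the input, so it is worth measuring on your own data. The following is a minimal timing sketch, not a rigorous benchmark. The helper names make_lines and time_ratio are made up for illustration, and the input is synthetic random lines. If cdifflib is not installed, only the pure Python timing is printed.

```python
import difflib
import random
import timeit

try:
    from cdifflib import CSequenceMatcher
except ImportError:  # cdifflib not installed: time difflib only
    CSequenceMatcher = None


def make_lines(n, seed):
    """Hypothetical helper: build n pseudo-random lines, reproducibly."""
    rng = random.Random(seed)
    return ["line %d" % rng.randrange(n) for _ in range(n)]


def time_ratio(matcher_cls, a, b, repeat=3):
    """Hypothetical helper: best-of-`repeat` time to build a matcher and call ratio()."""
    return min(timeit.repeat(lambda: matcher_cls(None, a, b).ratio(),
                             number=1, repeat=repeat))


a = make_lines(2000, seed=1)
b = make_lines(2000, seed=2)

pure = time_ratio(difflib.SequenceMatcher, a, b)
print("difflib:  %.4f s" % pure)

if CSequenceMatcher is not None:
    fast = time_ratio(CSequenceMatcher, a, b)
    print("cdifflib: %.4f s (%.1fx)" % (fast, pure / fast))
```

Both inputs are already lists, so the timing does not include any list conversion in the constructor.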

Limitations

The C part of the code only works on lists rather than generic iterables, so anything that isn't a list is converted to a list in the CSequenceMatcher constructor. This may cause undesirable behavior if you're not expecting it.
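One way to avoid surprises from that conversion is to do it yourself before constructing the matcher, so it is obvious where a generator gets consumed. This is only a sketch: as_list is a made-up helper name, and the code falls back to difflib.SequenceMatcher when cdifflib is not installed.

```python
try:
    from cdifflib import CSequenceMatcher as Matcher
except ImportError:  # fall back to the pure Python matcher
    from difflib import SequenceMatcher as Matcher


def as_list(seq):
    """Hypothetical helper: return seq unchanged if it is a list, else copy it into one."""
    return seq if isinstance(seq, list) else list(seq)


# A generator and a tuple, both turned into lists up front.
a = as_list(line.strip() for line in ["one\n", "two\n", "three\n"])
b = as_list(("one", "three", "four"))

m = Matcher(None, a, b)
for tag, i1, i2, j1, j2 in m.get_opcodes():
    print(tag, a[i1:i2], b[j1:j2])
```

After the explicit conversion, a and b remain available as lists for slicing with the opcode indices, which would not be possible with the original generator.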

Works with Python 2.7 and 3.6 (should work on all 3.3+).

Usage

Can be used just like difflib.SequenceMatcher. Anything that isn't a list, such as the strings below, is converted to a list first. These examples are right out of the difflib docs:

>>> from cdifflib import CSequenceMatcher
>>> s = CSequenceMatcher(None, ' abcd', 'abcd abcd')
>>> s.find_longest_match(0, 5, 0, 9)
Match(a=0, b=4, size=5)
>>> s = CSequenceMatcher(lambda x: x == " ",
...                      "private Thread currentThread;",
...                      "private volatile Thread currentThread;")
>>> print(round(s.ratio(), 3))
0.866

It's completely compatible, so you can replace the difflib version on startup, before importing other libraries, and they will then use CSequenceMatcher too, e.g.:

from cdifflib import CSequenceMatcher
import difflib
difflib.SequenceMatcher = CSequenceMatcher
import library_that_uses_difflib

# Now the library will transparently be using the C SequenceMatcher - other
# things remain the same
library_that_uses_difflib.do_some_diffing()

Making

To install:

python setup.py install

To test:

python setup.py test

License etc

This code lives at https://github.com/mduggan. See LICENSE for the license.

Changelog

  • 1.1.0 - Added Python 3.6 support (thanks Bclavie)
  • 1.0.4 - Changes to make it compile on MSVC++ compiler, no change for other platforms
  • 1.0.2 - Bugfix - also replace set_seq1 implementation so difflib.compare works with a CSequenceMatcher
  • 1.0.1 - Implement more bits in c to squeeze a bit more speed out
  • 1.0.0 - First release


