Python extension for computing string edit distances and similarities.

Maintainer wanted

I am looking for a new maintainer for the project, as I have not needed this particular library for well over 7 years now, owing to it being a C-only library and to its somewhat restrictive original license.

Introduction

The Levenshtein Python C extension module contains functions for fast computation of

  • Levenshtein (edit) distance, and edit operations

  • string similarity

  • approximate median strings, and generally string averaging

  • string sequence and set similarity

It supports both normal and Unicode strings.
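To make the first item concrete, here is a minimal pure-Python sketch of what an edit distance computes: the smallest number of single-character insertions, deletions and substitutions that turn one string into another. The helper name edit_distance is hypothetical and is not part of this package; the C extension is the fast implementation, and this sketch only illustrates the definition.

```python
def edit_distance(a, b):
    """Return the Levenshtein (edit) distance between sequences a and b.

    Illustrative only: a hypothetical helper, not this package's code.
    """
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer input, keep rows short
    # previous[j] holds the distance between the processed prefix of a
    # and the first j items of b.
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]  # turning i items into an empty prefix: i deletions
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution, or a free match
            ))
        previous = current
    return previous[-1]
```

With the extension installed, Levenshtein.distance("kitten", "sitting") should agree with edit_distance("kitten", "sitting") from this sketch.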

Python 2.2 or newer is required; Python 3 is supported.

StringMatcher.py is an example SequenceMatcher-like class built on top of Levenshtein. It lacks some of SequenceMatcher’s functionality, but adds some extras of its own.
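For orientation, the snippet below shows the standard library's difflib.SequenceMatcher interface that StringMatcher is modeled on. It uses difflib only. Which of these methods StringMatcher actually provides is an assumption to check against StringMatcher.py itself.

```python
from difflib import SequenceMatcher

# The first argument is an optional "junk" predicate; None disables it.
matcher = SequenceMatcher(None, "kitten", "sitting")

similarity = matcher.ratio()            # float in [0, 1]
opcodes = matcher.get_opcodes()         # (tag, i1, i2, j1, j2) tuples
blocks = matcher.get_matching_blocks()  # ends with a zero-length sentinel

for tag, i1, i2, j1, j2 in opcodes:
    print(tag, "kitten"[i1:i2], "->", "sitting"[j1:j2])
```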

Levenshtein.c can also be used as a pure C library. You only have to define the NO_PYTHON preprocessor symbol (-DNO_PYTHON) when compiling it. The functionality is similar to that of the Python extension. No separate documentation is provided yet, so read the source. The two builds are not interchangeable, however:

  • C functions exported when compiling with -DNO_PYTHON (see Levenshtein.h) are not exported when compiling as a Python extension (and vice versa)

  • The Unicode character type used with -DNO_PYTHON is wchar_t, whereas the Python extension uses Py_UNICODE; they may be the same, but don’t count on it

Installation

pip install python-Levenshtein

Documentation

gendoc.sh generates the HTML API documentation. You probably want a self-contained rather than an includable version, so run ./gendoc.sh --selfcontained. It requires genextdoc.py and an already installed Levenshtein.

License

Levenshtein is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

See the file COPYING for the full text of GNU General Public License version 2.

History

This package was long missing from the Python Package Index and available as source checkout only, but can now be found on PyPI again.

We needed to restore this package for the Go Mobile for Plone and Pywurfl projects, which depend on it.

Source code

Authors

  • Maintainer: Antti Haapala <antti@haapala.name>

  • Python 3 compatibility: Esa Määttä

  • Documentation generation fixes: Jonatas CD

  • Previous maintainer: Mikko Ohtamaa

  • Original code: David Necas (Yeti) <yeti at physics.muni.cz>

Changelog

0.12.1

  • Fixed the handling of numerous possible integer wraparounds when calculating the size of memory allocations; in previous versions of the library, these could cause denial of service or possibly even remote code execution.

0.12.0

  • Fixed a bug in StringMatcher.StringMatcher.get_matching_blocks / extract_editops for Python 3; now allow only str editops on both Python 2 and Python 3, for simpler and working code.

  • Added documentation to the source distribution and to Git.

  • Fixed the package layout: renamed the .so/.dll to _levenshtein, and made it reside inside a package, along with the StringMatcher class.

  • Fixed spelling errors.

0.11.2

  • Fixed a bug in setup.py: installation would fail on Python 3 if the locale did not specify UTF-8 charset (Felix Yan).

  • Added COPYING, StringMatcher.py, gendoc.sh and NEWS in MANIFEST.in, as they were missing from source distributions.

0.11.1

  • Added Levenshtein.h to MANIFEST.in

0.11.0

  • Python 3 support, maintainership passed to Antti Haapala

0.10.1 - 0.10.2

  • Made python-Levenshtein Git compatible and use setuptools for PyPI upload

  • Created HISTORY.txt and made README reST compatible

