A fast C-implemented library for Levenshtein distance

Website: https://ceptord.net/

Latest Release: v0.8 (2022-10-02)

License: MIT License

1. Introduction

polyleven is a Pythonic Levenshtein distance library that:

  • Is fast regardless of input type, so it can be used for both short inputs (like English words) and long inputs (like DNA sequences).

  • Is not covered by a restrictive license such as the GPL, so it can be used freely in private code.

  • Supports Python 3.x.

2. How to install

The official package is available on PyPI:

$ pip install polyleven

3. How to use

Polyleven provides a single interface function levenshtein(). You can use this function to measure the edit distance between two strings.

>>> from polyleven import levenshtein
>>> levenshtein('aaa', 'ccc')
3

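If you only care about distances at or below a certain threshold, you can pass the maximum threshold as the third argument. When the actual distance exceeds the threshold, the result is the threshold plus one, as the last call below shows.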

>>> levenshtein('acc', 'ccc', 1)
1
>>> levenshtein('aaa', 'ccc', 1)
2

In general, you can gain a noticeable speed boost with threshold \(k < 3\).
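
To make the threshold semantics concrete, here is a minimal pure-Python sketch that reproduces the results of the examples above. It is not polyleven's C implementation: the name `reference_levenshtein` and the convention that a negative `k` means "no threshold" are assumptions made for illustration. It also assumes, based on the examples above, that a distance greater than `k` is reported as `k + 1`.

```python
def reference_levenshtein(a, b, k=-1):
    """Plain dynamic-programming edit distance (illustrative only).

    If k >= 0 and the true distance exceeds k, return k + 1.
    A negative k (hypothetical convention) disables the threshold.
    """
    # The length difference is a lower bound on the distance,
    # so we can bail out early when it already exceeds k.
    if k >= 0 and abs(len(a) - len(b)) > k:
        return k + 1

    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur

    dist = prev[-1]
    # Clamp the result to k + 1 when the threshold is exceeded.
    return k + 1 if 0 <= k < dist else dist


print(reference_levenshtein('aaa', 'ccc'))     # 3
print(reference_levenshtein('acc', 'ccc', 1))  # 1
print(reference_levenshtein('aaa', 'ccc', 1))  # 2
```

This sketch only clamps the result. It does not reproduce the speed boost that a small threshold gives in polyleven.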

4. Benchmark

4.1. English Words

To compare Polyleven with other Pythonic edit distance libraries, a million word pairs were generated from SCOWL.

Each library was timed on how long it takes to evaluate all of these word pairs. The following table summarises the results:

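==============================  =========  ==============
Function Name                   TIME[sec]  SPEED[pairs/s]
==============================  =========  ==============
edlib                           4.763      208216
editdistance                    1.943      510450
jellyfish.levenshtein_distance  0.722      1374081
distance.levenshtein            0.623      1591396
Levenshtein.distance            0.500      1982764
polyleven.levenshtein           0.431      2303420
==============================  =========  ==============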
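
The measurement described above can be approximated with a small timing harness. This is a sketch under assumptions, not the script used to produce the table: `benchmark` and `count_mismatches` are hypothetical names, the three word pairs stand in for the million SCOWL pairs, and `count_mismatches` is only a placeholder for the function being measured (for example `polyleven.levenshtein`).

```python
import time


def benchmark(func, pairs):
    """Return (elapsed seconds, pairs per second) for func over all pairs."""
    start = time.perf_counter()
    for a, b in pairs:
        func(a, b)
    elapsed = time.perf_counter() - start
    speed = len(pairs) / elapsed if elapsed > 0 else float('inf')
    return elapsed, speed


def count_mismatches(a, b):
    """Placeholder workload; swap in the distance function under test."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# Tiny stand-in data set, repeated to get a measurable run time.
pairs = [('apple', 'maple'), ('kitten', 'sitting'), ('flaw', 'lawn')] * 10000

elapsed, speed = benchmark(count_mismatches, pairs)
print(f'TIME[sec]={elapsed:.3f} SPEED[pairs/s]={speed:.0f}')
```

Passing the function as an argument keeps the harness identical for every library, so only the call under test differs between runs.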

4.2. Longer Inputs

To evaluate the efficiency for longer inputs, I created 5000 pairs of random strings of size 16, 32, 64, 128, 256, 512 and 1024.

Each library was measured how fast it can process these entries. [1]

============  =====  =====  =====  =====  =====  =====  ======
Library       N=16   N=32   N=64   N=128  N=256  N=512  N=1024
============  =====  =====  =====  =====  =====  =====  ======
edlib         0.040  0.063  0.094  0.205  0.432  0.908  2.089
editdistance  0.027  0.049  0.086  0.178  0.336  0.740  58.139
jellyfish     0.009  0.032  0.118  0.470  1.874  8.877  42.848
distance      0.007  0.029  0.109  0.431  1.726  6.950  27.998
Levenshtein   0.006  0.022  0.085  0.336  1.328  5.286  21.097
polyleven     0.003  0.005  0.010  0.043  0.149  0.550  2.109
============  =====  =====  =====  =====  =====  =====  ======
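
A data set of this shape could be generated along the following lines. The source does not state the alphabet or the random seed, so the lowercase ASCII alphabet, the seed value, and the helper name `random_pairs` are all assumptions. The demo uses 100 pairs per size to keep it quick; the benchmark above used 5000.

```python
import random
import string

# String lengths used in the benchmark above.
SIZES = (16, 32, 64, 128, 256, 512, 1024)


def random_pairs(size, count=5000, alphabet=string.ascii_lowercase, seed=0):
    """Return `count` pairs of random strings, each of length `size`."""
    rng = random.Random(seed)  # fixed seed makes the data reproducible

    def make():
        return ''.join(rng.choices(alphabet, k=size))

    return [(make(), make()) for _ in range(count)]


# One data set per string length (reduced count for a quick demo).
datasets = {n: random_pairs(n, count=100) for n in SIZES}

for n, pairs in datasets.items():
    print(n, len(pairs), len(pairs[0][0]))
```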

4.3. List of Libraries

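============  =======  ===========================================
Library       Version  URL
============  =======  ===========================================
edlib         v1.2.1   https://github.com/Martinsos/edlib
editdistance  v0.4     https://github.com/aflc/editdistance
jellyfish     v0.5.6   https://github.com/jamesturk/jellyfish
distance      v0.1.3   https://github.com/doukremt/distance
Levenshtein   v0.12    https://github.com/ztane/python-Levenshtein
polyleven     v0.3     https://github.com/fujimotos/polyleven
============  =======  ===========================================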

Download files

Source Distribution

  • polyleven-0.8.tar.gz (6.4 kB): Source

Built Distributions

  • polyleven-0.8-pp39-pypy39_pp73-win_amd64.whl (10.6 kB): PyPy, Windows x86-64
  • polyleven-0.8-pp39-pypy39_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.8 kB): PyPy, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (10.0 kB): PyPy, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-pp39-pypy39_pp73-macosx_10_9_x86_64.whl (7.0 kB): PyPy, macOS 10.9+ x86-64
  • polyleven-0.8-pp38-pypy38_pp73-win_amd64.whl (10.6 kB): PyPy, Windows x86-64
  • polyleven-0.8-pp38-pypy38_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.8 kB): PyPy, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (10.0 kB): PyPy, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-pp38-pypy38_pp73-macosx_10_9_x86_64.whl (7.0 kB): PyPy, macOS 10.9+ x86-64
  • polyleven-0.8-pp37-pypy37_pp73-win_amd64.whl (10.6 kB): PyPy, Windows x86-64
  • polyleven-0.8-pp37-pypy37_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.8 kB): PyPy, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (10.0 kB): PyPy, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-pp37-pypy37_pp73-macosx_10_9_x86_64.whl (7.0 kB): PyPy, macOS 10.9+ x86-64
  • polyleven-0.8-cp311-cp311-win_amd64.whl (10.6 kB): CPython 3.11, Windows x86-64
  • polyleven-0.8-cp311-cp311-win32.whl (11.2 kB): CPython 3.11, Windows x86
  • polyleven-0.8-cp311-cp311-musllinux_1_1_x86_64.whl (26.2 kB): CPython 3.11, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp311-cp311-musllinux_1_1_i686.whl (28.3 kB): CPython 3.11, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21.0 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (22.8 kB): CPython 3.11, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp311-cp311-macosx_10_9_x86_64.whl (7.8 kB): CPython 3.11, macOS 10.9+ x86-64
  • polyleven-0.8-cp310-cp310-win_amd64.whl (10.6 kB): CPython 3.10, Windows x86-64
  • polyleven-0.8-cp310-cp310-win32.whl (11.2 kB): CPython 3.10, Windows x86
  • polyleven-0.8-cp310-cp310-musllinux_1_1_x86_64.whl (23.9 kB): CPython 3.10, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp310-cp310-musllinux_1_1_i686.whl (26.1 kB): CPython 3.10, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (21.2 kB): CPython 3.10, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp310-cp310-macosx_10_9_x86_64.whl (7.7 kB): CPython 3.10, macOS 10.9+ x86-64
  • polyleven-0.8-cp39-cp39-win_amd64.whl (10.6 kB): CPython 3.9, Windows x86-64
  • polyleven-0.8-cp39-cp39-win32.whl (11.2 kB): CPython 3.9, Windows x86
  • polyleven-0.8-cp39-cp39-musllinux_1_1_x86_64.whl (23.7 kB): CPython 3.9, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp39-cp39-musllinux_1_1_i686.whl (26.0 kB): CPython 3.9, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.2 kB): CPython 3.9, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (21.0 kB): CPython 3.9, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp39-cp39-macosx_10_9_x86_64.whl (7.7 kB): CPython 3.9, macOS 10.9+ x86-64
  • polyleven-0.8-cp38-cp38-win_amd64.whl (10.6 kB): CPython 3.8, Windows x86-64
  • polyleven-0.8-cp38-cp38-win32.whl (11.2 kB): CPython 3.8, Windows x86
  • polyleven-0.8-cp38-cp38-musllinux_1_1_x86_64.whl (23.9 kB): CPython 3.8, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp38-cp38-musllinux_1_1_i686.whl (26.2 kB): CPython 3.8, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.8 kB): CPython 3.8, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp38-cp38-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (21.6 kB): CPython 3.8, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp38-cp38-macosx_10_9_x86_64.whl (7.7 kB): CPython 3.8, macOS 10.9+ x86-64
  • polyleven-0.8-cp37-cp37m-win_amd64.whl (10.6 kB): CPython 3.7m, Windows x86-64
  • polyleven-0.8-cp37-cp37m-win32.whl (11.2 kB): CPython 3.7m, Windows x86
  • polyleven-0.8-cp37-cp37m-musllinux_1_1_x86_64.whl (25.0 kB): CPython 3.7m, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp37-cp37m-musllinux_1_1_i686.whl (27.2 kB): CPython 3.7m, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.8 kB): CPython 3.7m, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp37-cp37m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (21.6 kB): CPython 3.7m, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp37-cp37m-macosx_10_9_x86_64.whl (7.7 kB): CPython 3.7m, macOS 10.9+ x86-64
  • polyleven-0.8-cp36-cp36m-win_amd64.whl (11.4 kB): CPython 3.6m, Windows x86-64
  • polyleven-0.8-cp36-cp36m-win32.whl (12.1 kB): CPython 3.6m, Windows x86
  • polyleven-0.8-cp36-cp36m-musllinux_1_1_x86_64.whl (24.1 kB): CPython 3.6m, musllinux: musl 1.1+ x86-64
  • polyleven-0.8-cp36-cp36m-musllinux_1_1_i686.whl (26.3 kB): CPython 3.6m, musllinux: musl 1.1+ i686
  • polyleven-0.8-cp36-cp36m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.8 kB): CPython 3.6m, manylinux: glibc 2.17+ x86-64, manylinux: glibc 2.5+ x86-64
  • polyleven-0.8-cp36-cp36m-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (21.6 kB): CPython 3.6m, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
  • polyleven-0.8-cp36-cp36m-macosx_10_9_x86_64.whl (7.7 kB): CPython 3.6m, macOS 10.9+ x86-64
