fuzzysearch is useful for finding approximate subsequence matches

Project description

Easy fuzzy search that just works, fast!

>>> find_near_matches('PATTERN', '---PATERN---', max_l_dist=1)
[Match(start=3, end=9, dist=1)]

  • Approximate sub-string searches

  • A single, simple function to use

    • Chooses the fastest available search mechanism based on the given input

  • Uses the Levenshtein Distance metric with configurable parameters

    • Separately configure the max. allowed distance, substitutions, deletions and insertions

  • Advanced algorithms with optional C and Cython optimizations

  • Extensively tested

  • Free software: MIT license
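
To make "approximate sub-string search" under the Levenshtein metric concrete, here is a minimal sketch of the textbook dynamic-programming formulation. It is an illustration only and not how fuzzysearch itself searches (the library chooses among faster mechanisms); the function name approximate_match_ends is hypothetical.

```python
def approximate_match_ends(pattern, text, max_dist):
    """Return (end, dist) pairs where some substring of `text` ending at
    index `end` (exclusive) is within `max_dist` edits of `pattern`."""
    # Row 0 is all zeros: an empty pattern matches at any position for free,
    # which is what lets a match start anywhere in the text.
    previous = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        current = [i]  # matching pattern[:i] against nothing costs i edits
        for j, tc in enumerate(text, start=1):
            current.append(min(
                previous[j] + 1,               # skip a character of the pattern
                current[j - 1] + 1,            # skip a character of the text
                previous[j - 1] + (pc != tc),  # match or substitute
            ))
        previous = current
    return [(end, dist) for end, dist in enumerate(previous) if dist <= max_dist]


print(approximate_match_ends('PATTERN', '---PATERN---', max_dist=1))  # [(9, 1)]
```

The single result ends at index 9 with distance 1, which agrees with the end and dist fields of the Match shown at the top of this page. This sketch reports only end positions; recovering start positions takes extra bookkeeping that is omitted here.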

For more info, see the documentation.

Installation

$ pip install fuzzysearch

Installation succeeds even if building the C and Cython extensions fails; in that case, pure-Python fallbacks are used.

Usage

Just call find_near_matches() with the sub-sequence you’re looking for, the sequence to search, and the matching parameters:

>>> from fuzzysearch import find_near_matches
# search for 'PATTERN' with a maximum Levenshtein Distance of 1
>>> find_near_matches('PATTERN', '---PATERN---', max_l_dist=1)
[Match(start=3, end=9, dist=1)]
>>> sequence = '''\
GACTAGCACTGTAGGGATAACAATTTCACACAGGTGGACAATTACATTGAAAATCACAGATTGGTCACACACACA
TTGGACATACATAGAAACACACACACATACATTAGATACGAACATAGAAACACACATTAGACGCGTACATAGACA
CAAACACATTGACAGGCAGTTCAGATGATGACGCCCGACTGATACTCGCGTAGTCGTGGGAGGCAAGGCACACAG
GGGATAGG'''
>>> subsequence = 'TGCACTGTAGGGATAACAAT' # distance = 1
>>> find_near_matches(subsequence, sequence, max_l_dist=2)
[Match(start=3, end=24, dist=1)]
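
The results above carry start, end and dist fields. The sketch below shows one way to turn such a result back into the matched text and to pick the closest of several results. It assumes that end is exclusive, as the outputs above suggest (start=3, end=9 covers the six characters of 'PATERN'). The Match namedtuple here is a stand-in so that the snippet runs without the library, and matched_text and closest are hypothetical helper names.

```python
from collections import namedtuple

# Stand-in mirroring the fields printed above; with fuzzysearch installed,
# the objects returned by find_near_matches() would be used instead.
Match = namedtuple('Match', ['start', 'end', 'dist'])


def matched_text(sequence, match):
    """Return the slice of `sequence` covered by `match` (end is exclusive)."""
    return sequence[match.start:match.end]


def closest(matches):
    """Return the match with the smallest distance, or None if there are none."""
    return min(matches, key=lambda m: m.dist, default=None)


matches = [Match(start=3, end=9, dist=1)]
best = closest(matches)
print(matched_text('---PATERN---', best))  # PATERN
```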

Matching Criteria

The search function supports four possible match criteria, which may be supplied in any combination:

  • maximum Levenshtein distance (max_l_dist)

  • maximum number of substitutions (max_substitutions)

  • maximum number of deletions (max_deletions; “delete” = skip a character in the sub-sequence)

  • maximum number of insertions (max_insertions; “insert” = skip a character in the sequence)

Not supplying a criterion means that there is no limit for it. For this reason, you must always supply max_l_dist, all three of the other criteria, or both.
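
The rule about which criteria must be supplied can be written as a small check. This is only a sketch of the rule as stated here, not the library's own argument validation; the function name has_bounded_criteria is hypothetical, and the parameter names are the keyword arguments used in the examples on this page.

```python
def has_bounded_criteria(max_l_dist=None, max_substitutions=None,
                         max_insertions=None, max_deletions=None):
    """Return True if the supplied limits bound the total number of edits."""
    # A Levenshtein limit bounds every kind of edit at once.
    if max_l_dist is not None:
        return True
    # Otherwise each individual kind of edit needs its own limit.
    limits = (max_substitutions, max_insertions, max_deletions)
    return all(limit is not None for limit in limits)


print(has_bounded_criteria(max_l_dist=1))                          # True
print(has_bounded_criteria(max_deletions=1, max_insertions=1))     # False
print(has_bounded_criteria(max_substitutions=0, max_insertions=1,
                           max_deletions=1))                       # True
```

The second call returns False because substitutions are left unlimited, so nothing bounds the total number of edits.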

>>> find_near_matches('PATTERN', '---PATERN---', max_l_dist=1)
[Match(start=3, end=9, dist=1)]

# this will not match since max_deletions is set to zero
>>> find_near_matches('PATTERN', '---PATERN---', max_l_dist=1, max_deletions=0)
[]

# note that a deletion + insertion may be combined to match a substitution
>>> find_near_matches('PATTERN', '---PAT-ERN---', max_deletions=1, max_insertions=1, max_substitutions=0)
[Match(start=3, end=10, dist=1)] # the Levenshtein distance is still 1

# ... but deletion + insertion may also match other, non-substitution differences
>>> find_near_matches('PATTERN', '---PATERRN---', max_deletions=1, max_insertions=1, max_substitutions=0)
[Match(start=3, end=10, dist=2)]
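
The dist values in the examples above can be double-checked with a plain Levenshtein distance function applied to the matched slices. The implementation below is the standard dynamic-programming version, written for illustration; it is not taken from fuzzysearch, and the name levenshtein is chosen here.

```python
def levenshtein(a, b):
    """Return the Levenshtein distance between sequences `a` and `b`."""
    # previous[j] is the distance between the prefix of `a` handled so far
    # and b[:j]; it starts as the cost of building b[:j] from nothing.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # skip a character of `a`
                current[j - 1] + 1,            # skip a character of `b`
                previous[j - 1] + (ca != cb),  # match or substitute
            ))
        previous = current
    return previous[-1]


print(levenshtein('PATTERN', '---PATERN---'[3:9]))    # 1
print(levenshtein('PATTERN', '---PAT-ERN---'[3:10]))  # 1
print(levenshtein('PATTERN', '---PATERRN---'[3:10]))  # 2
```

The second case shows why a deletion plus an insertion can still be reported with dist=1: the same slice is reachable with a single substitution, and the Levenshtein distance is the minimum over all edit sequences.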

History

0.6.1 (2018-12-08)

  • Fixed some C compiler warnings for the C and Cython modules

0.6.0 (2018-12-07)

  • Dropped support for Python versions 2.6, 3.2 and 3.3

  • Added support and testing for Python 3.7

  • Optimized the n-grams Levenshtein search for long sub-sequences

  • Further optimized the n-grams Levenshtein search

  • Cython versions of the optimized parts of the n-grams Levenshtein search

0.5.0 (2017-09-05)

  • Fixed search_exact_byteslike() to support supplying start and end indexes

  • Added support for lists, tuples and other Sequence types to search_exact()

  • Fixed a bug where find_near_matches() could return a wrong Match.end with max_l_dist=0

  • Added more tests and improved some existing ones.

0.4.0 (2017-07-06)

  • Added support and testing for Python 3.5 and 3.6

  • Many small improvements to README, setup.py and CI testing

0.3.0 (2015-02-12)

  • Added C extensions for several search functions as well as internal functions

  • Use C extensions if available, or pure-Python implementations otherwise

  • setup.py attempts to build C extensions, but installs without if build fails

  • Added --noexts setup.py option to avoid trying to build the C extensions

  • Greatly improved testing and coverage

0.2.2 (2014-03-27)

  • Added support for searching through BioPython Seq objects

  • Added specialized search function allowing only substitutions and insertions

  • Fixed several bugs

0.2.1 (2014-03-14)

  • Fixed major match grouping bug

0.2.0 (2014-03-13)

  • New utility function find_near_matches() for easier use

  • Additional documentation

0.1.0 (2013-11-12)

  • Two working implementations

  • Extensive test suite; all tests passing

  • Full support for Python 2.6-2.7 and 3.1-3.3

  • Bumped status from Pre-Alpha to Alpha

0.0.1 (2013-11-01)

  • First release on PyPI.
