Skip to main content

fuzzysearch is useful for finding approximate subsequence matches

Project description

https://badge.fury.io/py/fuzzysearch.png https://travis-ci.org/taleinat/fuzzysearch.png?branch=master https://pypip.in/d/fuzzysearch/badge.png

fuzzysearch is useful for finding approximate subsequence matches

Features

  • Fuzzy sub-sequence search: Find parts of a sequence which match a given sub-sequence up to a given maximum Levenshtein distance.

Simple Example

You can usually just use the find_near_matches() utility function, which chooses a suitable fuzzy search implementation according to the given parameters:

>>> from fuzzysearch import find_near_matches
>>> find_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1)
[Match(start=3, end=9, dist=1)]

Advanced Example

If needed you can choose a specific search implementation, such as find_near_matches_with_ngrams():

>>> sequence = '''\
GACTAGCACTGTAGGGATAACAATTTCACACAGGTGGACAATTACATTGAAAATCACAGATTGGTCACACACACA
TTGGACATACATAGAAACACACACACATACATTAGATACGAACATAGAAACACACATTAGACGCGTACATAGACA
CAAACACATTGACAGGCAGTTCAGATGATGACGCCCGACTGATACTCGCGTAGTCGTGGGAGGCAAGGCACACAG
GGGATAGG'''
>>> subsequence = 'TGCACTGTAGGGATAACAAT' #distance 1
>>> max_distance = 2

>>> from fuzzysearch import find_near_matches_with_ngrams
>>> find_near_matches_with_ngrams(subsequence, sequence, max_distance)
[Match(start=3, end=24, dist=1)]

History

0.1.0 (2013-11-12)

  • Two working implementations

  • Extensive test suite; all tests passing

  • Full support for Python 2.6-2.7 and 3.1-3.3

  • Bumped status from Pre-Alpha to Alpha

0.0.1 (2013-11-01)

  • First release on PyPI.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fuzzysearch-0.2.0.tar.gz (7.6 kB view details)

Uploaded Source

File details

Details for the file fuzzysearch-0.2.0.tar.gz.

File metadata

  • Download URL: fuzzysearch-0.2.0.tar.gz
  • Upload date:
  • Size: 7.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for fuzzysearch-0.2.0.tar.gz
Algorithm Hash digest
SHA256 21eb2ab04698c42740160a5fde7edfb14c4d1bbece0a24711c8e7e25e5b1afd1
MD5 6b411ac88406ffe52f21ff98cbfe61df
BLAKE2b-256 e9cc798296a5e5a65a10c89bfd298c96f4c936cfc574dd014f60216c823937cd

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page