
fuzzysearch is useful for finding approximate subsequence matches


Features

  • Fuzzy subsequence search: find the parts of a sequence that match a given subsequence to within a given maximum Levenshtein distance.
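For readers unfamiliar with the metric, the following is a minimal sketch of Levenshtein distance using the standard dynamic-programming recurrence. The function name levenshtein is chosen for illustration only; it is not part of fuzzysearch, and the library's own search routines may work quite differently.

```python
def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between sequences a and b.

    Illustrative helper only; not a fuzzysearch function.
    """
    # At the start of iteration i, prev[j] is the distance between
    # a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


# Dropping one 'T' turns 'PATTERN' into 'PATERN'.
print(levenshtein('PATTERN', 'PATERN'))  # 1
```

A fuzzy subsequence search reports the stretches of the longer sequence whose distance to the pattern does not exceed the chosen maximum.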

Simple Example

In most cases you can use the find_near_matches() utility function, which chooses a suitable fuzzy search implementation according to the given parameters:

>>> from fuzzysearch import find_near_matches
>>> find_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1)
[Match(start=3, end=9, dist=1)]
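The result above suggests that start and end behave like half-open slice indices into the searched sequence. The sketch below illustrates this with a stand-in Match namedtuple, so that it runs without fuzzysearch installed; the stand-in is an assumption for illustration and only mirrors the three fields shown in the output.

```python
from collections import namedtuple

# Stand-in for the library's result type (assumption: only the three
# fields shown in the example output are modelled here).
Match = namedtuple('Match', ['start', 'end', 'dist'])

sequence = 'aaaPATERNaaa'
matches = [Match(start=3, end=9, dist=1)]  # copied from the example above

for m in matches:
    # Slicing with start:end recovers the matched text.
    matched_text = sequence[m.start:m.end]
    print(matched_text, m.dist)  # PATERN 1
```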

Advanced Example

If needed, you can choose a specific search implementation, such as find_near_matches_with_ngrams():

>>> sequence = '''\
GACTAGCACTGTAGGGATAACAATTTCACACAGGTGGACAATTACATTGAAAATCACAGATTGGTCACACACACA
TTGGACATACATAGAAACACACACACATACATTAGATACGAACATAGAAACACACATTAGACGCGTACATAGACA
CAAACACATTGACAGGCAGTTCAGATGATGACGCCCGACTGATACTCGCGTAGTCGTGGGAGGCAAGGCACACAG
GGGATAGG'''
>>> subsequence = 'TGCACTGTAGGGATAACAAT'  # distance 1
>>> max_distance = 2

>>> from fuzzysearch import find_near_matches_with_ngrams
>>> find_near_matches_with_ngrams(subsequence, sequence, max_distance)
[Match(start=3, end=24, dist=1)]
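One common idea behind n-gram based fuzzy search is the pigeonhole principle: if a pattern is cut into max_distance + 1 disjoint pieces, any match within max_distance edits must contain at least one piece unchanged. The sketch below shows only that candidate-finding step. It is an assumption about the general technique, not a description of how find_near_matches_with_ngrams() is actually implemented, and the name ngram_candidates is hypothetical.

```python
def ngram_candidates(pattern, text, max_dist):
    """Yield (piece_offset, text_index) for each exact hit of a pattern piece.

    Hypothetical helper illustrating the pigeonhole filter; not a
    fuzzysearch function.
    """
    n = len(pattern) // (max_dist + 1)  # length of each piece
    if n == 0:
        raise ValueError('pattern too short for the given max_dist')
    for k in range(max_dist + 1):
        offset = k * n
        piece = pattern[offset:offset + n]
        # Collect every (possibly overlapping) exact occurrence of the piece.
        idx = text.find(piece)
        while idx != -1:
            yield offset, idx
            idx = text.find(piece, idx + 1)


# 'PATTERN' with max_dist=1 is cut into the pieces 'PAT' and 'TER'.
print(list(ngram_candidates('PATTERN', 'aaaPATERNaaa', 1)))
# [(0, 3), (3, 5)]
```

Each candidate only marks a region worth checking; a full search would still need to verify the surrounding text against the whole pattern with an edit-distance computation.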

History

0.2.2 (2014-03-27)

  • Added support for searching through BioPython Seq objects

  • Added specialized search function allowing only substitutions and insertions

  • Fixed several bugs

0.2.1 (2014-03-14)

  • Fixed major match grouping bug

0.2.0 (2014-03-13)

  • New utility function find_near_matches() for easier use

  • Additional documentation

0.1.0 (2013-11-12)

  • Two working implementations

  • Extensive test suite; all tests passing

  • Full support for Python 2.6-2.7 and 3.1-3.3

  • Bumped status from Pre-Alpha to Alpha

0.0.1 (2013-11-01)

  • First release on PyPI.

