
A `set` subclass providing fuzzy search based on N-grams.

Project description

Here is the documentation and the tutorial.

How does it work?

The NGram class extends the Python set class with the ability to search for set members ranked by their N-gram string similarity to the query. There are also methods for comparing a pair of strings.

The set stores arbitrary items by using a specified “key” function to produce the string representation of each set member for the N-gram indexing.
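As an illustration of the key-function idea, here is a minimal sketch in plain Python. It does not use the NGram class itself; `build_index`, the `people` data, and the choice of a dictionary from N-gram to items are all assumptions made for this example, not the library's internals.

```python
from collections import defaultdict

def build_index(items, key=str, n=3):
    """Map each n-gram of key(item) to the set of items containing it."""
    index = defaultdict(set)
    for item in items:
        text = key(item)  # string representation used for indexing
        for i in range(len(text) - n + 1):
            index[text[i:i + n]].add(item)
    return index

# Hypothetical items: (id, name) tuples. The key function selects the
# name and lowercases it, so the tuples themselves stay unchanged.
people = {(1, "Joseph"), (2, "Josephine"), (3, "Maria")}
index = build_index(people, key=lambda person: person[1].lower())

print(sorted(index["jos"]))  # [(1, 'Joseph'), (2, 'Josephine')]
```

The items stored are the original tuples; only the string returned by the key function is split into N-grams.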

N-grams are obtained by splitting strings into overlapping substrings of N (usually N=3) characters in length.
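The splitting step can be sketched in a few lines of plain Python. The helper name `ngrams` is hypothetical, and this sketch does no padding; implementations often pad the ends of the string so that its first and last characters appear in as many N-grams as the interior ones.

```python
def ngrams(text, n=3):
    """Return the overlapping substrings of length n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("ngram"))       # ['ngr', 'gra', 'ram']
print(ngrams("ngram", n=2))  # ['ng', 'gr', 'ra', 'am']
print(ngrams("ab"))          # [] (string shorter than n)
```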

To find items similar to a query string, NGram splits the query into N-grams, collects all items sharing at least one N-gram with the query, and ranks those items by a score based on the ratio of shared to unshared N-grams between the two strings.
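The search procedure described above can be approximated as follows. This is a sketch under stated assumptions, not the library's code: the function names are hypothetical, the score is a set-based Jaccard ratio (shared N-grams divided by all distinct N-grams in either string) standing in for the library's own scoring, and every item is scanned instead of being looked up through an index.

```python
def ngrams(text, n=3):
    """Return the set of overlapping substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def similarity(a, b, n=3):
    """Assumed score: shared n-grams / all distinct n-grams (Jaccard)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0

def search(items, query, n=3):
    """Return (item, score) pairs for items sharing an n-gram with query."""
    query_grams = ngrams(query, n)
    results = [
        (item, similarity(item, query, n))
        for item in items
        if query_grams & ngrams(item, n)  # at least one shared n-gram
    ]
    # Best score first; ties are broken alphabetically for a stable order.
    return sorted(results, key=lambda pair: (-pair[1], pair[0]))

results = search(["spam", "spams", "eggs", "span"], "spam")
print([item for item, score in results])  # ['spam', 'spams', 'span']
```

Here "eggs" is dropped because it shares no trigram with the query, "spam" scores 1.0, "spams" shares two of three distinct trigrams, and "span" shares one of three.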

Credits

The starting point was the Perl String::Trigram module by Tarek Ahmed. In 2007, Michel Albert (exhuma) wrote the ngram module and submitted 2.0.0b2 to Sourceforge. Since late 2008 python-ngram has been developed by Graham Poulter, adding features, documentation, performance improvements and Python 3 support.


