Fuzzy string matching in Python
Project description
Fuzzy-Match
Fuzzy string matching in Python. By default it uses trigrams to calculate a similarity score and find matches, splitting strings into n-grams of length 3. The n-gram length can be changed if desired. Cosine, Levenshtein distance, and Jaro-Winkler distance algorithms are available as alternatives.
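To make the trigram idea concrete, here is a minimal sketch of one common way to score two strings by their n-grams. It is not taken from the package source. It assumes a PostgreSQL pg_trgm-style scheme: lowercase the text, split it on whitespace, pad each word with n-1 leading spaces and one trailing space, then divide the number of shared n-grams by the number of distinct n-grams. The names ngrams and ngram_similarity are hypothetical. Under these assumptions the sketch happens to reproduce the trigram scores shown on this page.

```python
def ngrams(text, n=3):
    """Return the set of n-grams in `text`, padding each word separately."""
    grams = set()
    for word in text.lower().split():
        # Assumption: n-1 leading spaces and one trailing space per word.
        padded = " " * (n - 1) + word + " "
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def ngram_similarity(a, b, n=3):
    """Shared n-grams divided by all distinct n-grams (Jaccard index)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    union = grams_a | grams_b
    if not union:
        return 0.0
    return len(grams_a & grams_b) / len(union)


score = ngram_similarity("this is a test string", "this is another test string")
print(round(score, 6))  # 0.703704 (19 shared trigrams out of 27 distinct)
```

Passing a different n changes the n-gram length; how the package itself pads words for other lengths is not documented here, so treat that part as a guess.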
Usage
>>> from fuzzy_match import match
>>> from fuzzy_match import algorithims
Trigram
>>> algorithims.trigram("this is a test string", "this is another test string")
0.703704
Cosine
>>> algorithims.cosine("this is a test string", "this is another test string")
0.7999999999999998
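For readers curious what a cosine score over strings can mean, the sketch below treats each string as a bag of lowercase, whitespace-separated words and computes the cosine of the angle between the two word-count vectors. This is an assumption about the approach, not the package's code, and cosine_similarity is a hypothetical name. For the two strings above, four of the five words on each side coincide, which gives a value of about 0.8.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two strings."""
    # Assumption: tokens are lowercase, whitespace-separated words.
    vec_a = Counter(a.lower().split())
    vec_b = Counter(b.lower().split())
    dot = sum(vec_a[w] * vec_b[w] for w in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity("this is a test string", "this is another test string"))
```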
Levenshtein
>>> algorithims.levenshtein("this is a test string", "this is another test string")
0.7777777777777778
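The Levenshtein score above is a similarity between 0 and 1, not a raw edit distance. A minimal sketch of one way to get such a score, assuming the edit distance is normalized by the length of the longer string (1 - distance / max length), is shown below. The function names are hypothetical and the code is not taken from the package, but this normalization is consistent with the Levenshtein values shown on this page: the two test strings differ by 6 insertions and the longer one has 27 characters, so the score is 1 - 6/27.

```python
def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein_distance(a, b) / longest


print(levenshtein_similarity("this is a test string", "this is another test string"))
```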
Jaro-Winkler
>>> algorithims.jaro_winkler("this is a test string", "this is another test string")
0.798941798941799
Match
>>> choices = ["simple strings", "strings are simple", "sim string", "string to match", "matching simple strings", "matching strings again"]
>>> match.extract("simple string", choices, limit=2)
[('simple strings', 0.8), ('sim string', 0.642857)]
>>> match.extractOne("simple string", choices)
('simple strings', 0.8)
You can also pass additional arguments to extract and extractOne to set a score cutoff value or to use one of the other algorithms mentioned above. Here is an example:
>>> match.extract("simple string", choices, match_type='levenshtein', score_cutoff=0.7)
[('simple strings', 0.9285714285714286), ('sim string', 0.7692307692307693)]
match_type options include trigram, cosine, levenshtein, and jaro_winkler.
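The extract and extractOne calls above follow a familiar pattern: score every choice against the query, drop scores below the cutoff, sort from best to worst, and keep the first few. The sketch below illustrates that pattern only. It is not the package's implementation: the functions extract and extract_one, their parameters, and the default limit are hypothetical stand-ins, and the scorer is difflib.SequenceMatcher from the standard library rather than any of the package's algorithms, so the scores differ from those shown above.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in scorer from the standard library, returning a value in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def extract(query, choices, scorer=ratio, score_cutoff=0.0, limit=5):
    """Score every choice, filter by cutoff, and return the best `limit` matches."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    kept = [pair for pair in scored if pair[1] >= score_cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # best score first
    return kept[:limit]


def extract_one(query, choices, scorer=ratio, score_cutoff=0.0):
    """Return the single best match, or None if nothing passes the cutoff."""
    best = extract(query, choices, scorer, score_cutoff, limit=1)
    return best[0] if best else None


choices = ["simple strings", "strings are simple", "sim string",
           "string to match", "matching simple strings", "matching strings again"]
for choice, score in extract("simple string", choices, limit=2):
    print(choice, round(score, 3))
```

Taking the scorer as a parameter keeps the ranking logic independent of the similarity measure, which mirrors how match_type selects the algorithm in the calls above.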
File details
Details for the file fuzzy-match-0.0.1.tar.gz.
File metadata
- Download URL: fuzzy-match-0.0.1.tar.gz
- Upload date:
- Size: 4.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | be0c4568b555394947e1457cf21f9a965429f7d291f6990f1d4821de996bb6ae
MD5 | dfcc75f5370489bb0243b4222595b219
BLAKE2b-256 | 2f887d2bc2d9d87d3c7437292def951ad8fbcb7956c3a8b9e250ec229623f01e
File details
Details for the file fuzzy_match-0.0.1-py3-none-any.whl.
File metadata
- Download URL: fuzzy_match-0.0.1-py3-none-any.whl
- Upload date:
- Size: 5.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.23.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.8.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b0cc8eede335bfd7ab18509da593fef5b5336e2eec0757f7bb886c828d1ff849
MD5 | b1ac251e92c7a58060c94538eb7bd271
BLAKE2b-256 | e3aebe76d0df7d70f5912b0475bd06b48920d87fa18254666c86d8bdd4911678