The package includes an algorithm to calculate the similarity between two strings.
Simple Fuzzy Comparison
This package calculates how similar two strings are by analysing the number of common and uncommon characters in them. The algorithm runs in O(n + m) time, where n and m are the lengths of the compared words. It uses two lists to count how many times each character occurs in each word, then uses those counts to compute a matching score between 0 and 1, where 1 represents a complete match and 0 represents no match.
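The counting approach described above can be sketched as follows. This is a hypothetical re-implementation, not the package's source: the function name `sketch_compare`, the Dice-style formula (twice the shared character count divided by the total tracked character count), the choice to ignore characters outside the range, and the result for two empty inputs are all assumptions. The formula was chosen because it agrees with the scores listed under How To Use.

```python
def sketch_compare(a: str, b: str, start: int = 97, end: int = 122) -> float:
    """Hypothetical sketch of a count-based similarity score in [0, 1]."""
    size = end - start + 1          # one slot per allowed code point
    counts_a = [0] * size
    counts_b = [0] * size
    tracked = 0                     # number of in-range characters seen

    # One pass over each string: O(n + m) overall.
    for text, counts in ((a, counts_a), (b, counts_b)):
        for ch in text:
            code = ord(ch)
            if start <= code <= end:    # assumption: other characters are ignored
                counts[code - start] += 1
                tracked += 1

    if tracked == 0:
        return 0.0                  # assumption: nothing to compare means no match

    # Characters shared by both strings, counted with multiplicity.
    common = sum(min(x, y) for x, y in zip(counts_a, counts_b))
    return 2 * common / tracked


for other in ("hello", "helllo", "hell", "hallo", "world", ""):
    print(other, sketch_compare("hello", other))
```

Besides the two passes over the strings, the sketch also does one pass over the count lists, so its running time grows with the size of the allowed range as well as with n + m.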
How To Use
from fuzzy_compare import compare_english_words
print(compare_english_words("hello", "hello")) # Out: 1.0
print(compare_english_words("hello", "helllo")) # Out: 0.9090...
print(compare_english_words("hello", "hell")) # Out: 0.888...
print(compare_english_words("hello", "hallo")) # Out: 0.8
print(compare_english_words("hello", "world")) # Out: 0.4
print(compare_english_words("hello", "")) # Out: 0.0
Support For Other Languages
The algorithm can compare two strings containing any characters that Python supports. The compare_english_words function tracks only characters whose Unicode code points fall between 97 and 122, because 97 is the code point of 'a' and 122 is the code point of 'z'. It is built from the CompareStrings class:
from fuzzy_compare import CompareStrings
_eng_words_comp_obj = CompareStrings(97, 122)
compare_english_words = _eng_words_comp_obj.compare_strings
print(compare_english_words("hello", "world")) # Out: 0.4
The CompareStrings class can compare strings containing Unicode characters in any given code range. Keep in mind, however, that the algorithm maintains two lists whose length equals the number of code points in the allowed range. For compare_english_words, it uses two lists of length 26 each to keep the character counts.
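To get a feel for the memory cost before choosing a range, the list length can be computed from the first and last allowed characters. The helper below is a minimal sketch; `code_range` is a hypothetical name and is not part of fuzzy_compare. It assumes the bounds are inclusive, which matches CompareStrings(97, 122) using lists of length 26.

```python
from typing import Tuple


def code_range(first: str, last: str) -> Tuple[int, int, int]:
    """Return (start, end, list_length) for an inclusive character range."""
    start, end = ord(first), ord(last)
    if start > end:
        raise ValueError("first character must not come after last character")
    return start, end, end - start + 1


# English lowercase letters, as used by compare_english_words.
print(code_range("a", "z"))             # (97, 122, 26)

# Russian lowercase letters, U+0430 to U+044F.
print(code_range("\u0430", "\u044f"))   # (1072, 1103, 32)

# The full Unicode range needs over a million slots per list.
print(code_range("\u0000", "\U0010ffff")[2])   # 1114112

# The first two values would then be passed on, for example:
# comparer = CompareStrings(start, end)
```

A narrow range that covers only the alphabet of interest keeps both lists small, while a range spanning all of Unicode allocates two lists of more than a million entries for every comparison object.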