string-similarity

Finds the degree of similarity between two strings based on Dice's coefficient, which is often a better fit for fuzzy matching than Levenshtein distance.

🎮 Usage

In your code:

from string_similarity import StringSimilarity

similarity = StringSimilarity.compareTwoStrings('french', 'quebec')

matches = StringSimilarity.findBestMatch('healed', ['edward', 'sealed', 'theatre'])

📚 API

StringSimilarity.compareTwoStrings(string, otherString)

Returns a fraction between 0 and 1 that indicates the degree of similarity between the two strings: 0 means completely different strings, 1 means identical strings. The comparison is case- and diacritic-sensitive.
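Because the comparison is case- and diacritic-sensitive, callers who want a looser match may normalize both strings first. The following is a minimal sketch using only the Python standard library; the helper name normalize_for_comparison is hypothetical and not part of this package.

```python
import unicodedata


def normalize_for_comparison(text):
    """Hypothetical helper: fold case and strip diacritics before comparing."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() is a more aggressive lower() intended for caseless matching.
    return without_marks.casefold()


print(normalize_for_comparison("FrancE"))  # → france
print(normalize_for_comparison("Québec"))  # → quebec
```

The normalized strings can then be passed to StringSimilarity.compareTwoStrings in place of the raw inputs.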

Arguments

  • string (str): The first string
  • otherString (str): The second string

Argument order does not affect the result.

Returns

(float): A fraction from 0 to 1, inclusive. A higher number indicates greater similarity.

Examples

StringSimilarity.compareTwoStrings('healed', 'sealed')  # → 0.8

StringSimilarity.compareTwoStrings('france', 'FrancE')  # → 0.6

StringSimilarity.compareTwoStrings('x', None)  # → 0.0

StringSimilarity.compareTwoStrings('Olive-green table for sale, in extremely good condition.', 'For sale: table in very good condition, olive green in colour.')  # → 0.6060606060606061
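To illustrate how a score like the ones above can arise, here is a minimal sketch of Dice's coefficient computed over character bigrams. It is not taken from this package's source: the function name dice_coefficient is hypothetical, and the whitespace stripping and None handling are assumptions modelled on the JavaScript project this package credits.

```python
import re
from collections import Counter


def dice_coefficient(first, second):
    """Hypothetical sketch: Dice's coefficient over character bigrams."""
    # Assumption: a missing string scores 0.0, as in the 'x' vs. None example.
    if first is None or second is None:
        return 0.0
    # Assumption: whitespace is ignored, as in the original JavaScript project.
    first = re.sub(r"\s+", "", first)
    second = re.sub(r"\s+", "", second)
    if first == second:
        return 1.0
    if len(first) < 2 or len(second) < 2:
        return 0.0  # no bigrams to compare
    # Count each adjacent character pair, keeping duplicates.
    first_bigrams = Counter(first[i:i + 2] for i in range(len(first) - 1))
    second_bigrams = Counter(second[i:i + 2] for i in range(len(second) - 1))
    # Counter & Counter keeps the minimum count of each shared bigram.
    shared = sum((first_bigrams & second_bigrams).values())
    # 2 * |shared| / (|bigrams of first| + |bigrams of second|)
    return 2.0 * shared / (len(first) + len(second) - 2)


print(dice_coefficient("healed", "sealed"))  # → 0.8
print(dice_coefficient("france", "FrancE"))  # → 0.6
```

In the first call, 'healed' and 'sealed' each have five bigrams and share four of them (ea, al, le, ed), giving 2 × 4 / 10 = 0.8.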

StringSimilarity.findBestMatch(string, targetStrings)

Compares string against each string in targetStrings.

Arguments

  • string (str): The main string to compare against the target strings
  • targetStrings (list of str): Each string in this list will be matched against the main string.

Returns

(BestMatch): An object with a ratings property, which gives a similarity rating for each target string; a bestMatch property, which specifies which target string was most similar to the main string; and a bestMatchIndex property, which specifies the index of the bestMatch in the targetStrings list.

Examples

StringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
    'For sale: green Subaru Impreza, 210,000 miles',
    'For sale: table in very good condition, olive green in colour.',
    'Wanted: mountain bike with at least 21 gears.',
    None
])

# →
{ ratings: [
    { target: 'For sale: green Subaru Impreza, 210,000 miles', rating: 0.2558139534883721 },
    { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
    { target: 'Wanted: mountain bike with at least 21 gears.', rating: 0.1411764705882353 },
    { target: None, rating: 0.0 }
  ],
  bestMatch: { target: 'For sale: table in very good condition, olive green in colour.', rating: 0.6060606060606061 },
  bestMatchIndex: 1
}
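The result shape described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the package's implementation: find_best_match and bigram_score are hypothetical names, the result is modelled as a dict, and ties are assumed to resolve to the first highest-rated target.

```python
from collections import Counter


def bigram_score(a, b):
    """Hypothetical scorer: Dice's coefficient over character bigrams."""
    if a is None or b is None:
        return 0.0  # assumption: a missing string rates 0.0
    a, b = "".join(a.split()), "".join(b.split())  # assumption: ignore whitespace
    if a == b:
        return 1.0
    if len(a) < 2 or len(b) < 2:
        return 0.0
    pairs_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    pairs_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    shared = sum((pairs_a & pairs_b).values())
    return 2.0 * shared / (len(a) + len(b) - 2)


def find_best_match(main_string, target_strings):
    """Rate every target, then report the highest-rated one and its index."""
    ratings = [
        {"target": target, "rating": bigram_score(main_string, target)}
        for target in target_strings
    ]
    # max() returns the first index with the top rating; it raises ValueError
    # if target_strings is empty.
    best_index = max(range(len(ratings)), key=lambda i: ratings[i]["rating"])
    return {
        "ratings": ratings,
        "bestMatch": ratings[best_index],
        "bestMatchIndex": best_index,
    }


result = find_best_match("healed", ["edward", "sealed", "theatre"])
print(result["bestMatch"]["target"])  # → sealed
print(result["bestMatchIndex"])       # → 1
```

Keeping the full ratings list alongside the best match lets a caller apply its own threshold, for example rejecting a best match whose rating is too low to be meaningful.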

🔮 Credit

Based on the 'string-similarity' JavaScript project: https://github.com/aceakash/string-similarity

Developer

Malay Patel

Follow @Malay1121

LinkedIn: @Malay Patel

Download files

Source Distribution

string_similarity-0.0.2.tar.gz (6.1 kB)

Uploaded Source

Built Distribution

string_similarity-0.0.2-py3-none-any.whl (6.0 kB)

Uploaded Python 3

File details

Details for the file string_similarity-0.0.2.tar.gz.

File metadata

  • Download URL: string_similarity-0.0.2.tar.gz
  • Upload date:
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.5

File hashes

Hashes for string_similarity-0.0.2.tar.gz
Algorithm Hash digest
SHA256 461d99f5167d9ec176618203ba50649aed940837ab4b83915c0765efa7c2a480
MD5 aa377e8d5746fe6cc7df8f39dfee9457
BLAKE2b-256 d6ae78f01f9459fcaa79d29c8a9eda9db01f98eefb71ffeeecdb055f8dfd3470

File details

Details for the file string_similarity-0.0.2-py3-none-any.whl.

File hashes

Hashes for string_similarity-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2c833f7182a13e58967bf69338c59a24ce11ff96dceae721f0de7f1dfe09ce0a
MD5 ab1f738f251cdbc0d888e70840f5f75f
BLAKE2b-256 30020f84f3a33da33ac1f1ce239ae2081b5e72f9692ea0623f378bc7caa874a4
