A smart match package

Introduction

The smart-match module contains functions for calculating the similarity of strings and sets.

Concepts

  1. similarity: A value in the range [0, 1] that represents how similar two strings are. The larger the value, the more similar the strings.

  2. dissimilarity: A value in the range [0, 1] that represents how dissimilar two strings are. The larger the value, the more dissimilar the strings. For any pair of strings, similarity = 1 - dissimilarity.

  3. distance: How far apart two strings are. Note that not every method supports distance.

  4. score: The larger the score, the more similar the two strings are. Note that not every method provides a score.
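The relationship between distance, dissimilarity, and similarity can be sketched in plain Python for the default Levenshtein method. This is a minimal illustration, not the package's implementation: the helper names are hypothetical, and dividing the distance by the longer string's length is an assumption, chosen because it is consistent with the values shown in the Usage section.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def dissimilarity(a: str, b: str) -> float:
    """Distance scaled into [0, 1] (assumed normalization: longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty strings are treated as identical
    return levenshtein_distance(a, b) / longest


def similarity(a: str, b: str) -> float:
    """Similarity defined as 1 - dissimilarity."""
    return 1.0 - dissimilarity(a, b)


if __name__ == "__main__":
    print(levenshtein_distance("hello", "hero"))
    print(dissimilarity("hello", "hero"))
    print(similarity("hello", "hero"))
```

Under this normalization, both values always stay within [0, 1], because the Levenshtein distance can never exceed the length of the longer string.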

We support three levels of string matching.

  1. char: Similarity computation based on characters in the strings.

  2. term: Similarity computation based on terms in the strings.

  3. gram: Similarity computation based on q-grams in the strings.
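To make the three levels concrete, the sketch below shows one way a string could be broken into characters, terms, and q-grams before comparison. The function names, whitespace-based term splitting, and the default q = 2 are assumptions for illustration only; they are not taken from the smart-match API.

```python
from typing import List


def char_tokens(text: str) -> List[str]:
    """char level: every character is a token."""
    return list(text)


def term_tokens(text: str) -> List[str]:
    """term level: whitespace-separated words are tokens (assumed splitting rule)."""
    return text.split()


def qgram_tokens(text: str, q: int = 2) -> List[str]:
    """gram level: overlapping substrings of length q are tokens."""
    if q < 1:
        raise ValueError("q must be at least 1")
    if len(text) < q:
        # Too short to form a full q-gram; keep the text as a single token.
        return [text] if text else []
    return [text[i:i + q] for i in range(len(text) - q + 1)]


if __name__ == "__main__":
    sample = "smart match"
    print(char_tokens(sample))
    print(term_tokens(sample))
    print(qgram_tokens(sample, q=3))
```

The same similarity measure can give different results at each level, because the units being compared change from single characters to whole words to short overlapping substrings.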

Methods

We support the following methods.

Method similarity dissimilarity distance score
Levenshtein (default)
Euclidean
Damerau Levenshtein
Block Distance
Cosine
Tanimoto Coefficient
Dice
Simon White
Longest Common Substring
Longest Common Subsequence
Overlap Coefficient
Generalized Overlap Coefficient
Jaccard
Generalized Jaccard
Hamming
Jaro
Jaro Winkler
Needleman Wunsch
Smith Waterman
Smith Waterman Gotoh
Monge Elkan
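Several of the listed methods (Jaccard, Dice, Overlap Coefficient) compare sets of tokens rather than edit operations. The sketch below uses their standard textbook definitions on Python sets. It is an illustration under assumptions, not the package's code: the function names are hypothetical, and treating two empty sets as fully similar is a convention chosen here.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """|A ∩ B| / |A ∪ B|"""
    if not a and not b:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(a | b)


def dice(a: Set[str], b: Set[str]) -> float:
    """2 * |A ∩ B| / (|A| + |B|)"""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def overlap_coefficient(a: Set[str], b: Set[str]) -> float:
    """|A ∩ B| / min(|A|, |B|)"""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0  # nothing can be shared with an empty set
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    left = {"smart", "string", "match"}
    right = {"smart", "set", "match"}
    # The sets share 2 of 4 distinct terms.
    print(jaccard(left, right))
    print(dice(left, right))
    print(overlap_coefficient(left, right))
```

All three return values in [0, 1], so each fits the similarity concept described earlier, with dissimilarity obtained as 1 minus the result.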

Installation

pip install smart-match

Usage

import smart_match
print(smart_match.similarity('hello', 'hero'))
print(smart_match.dissimilarity('hello', 'hero'))
print(smart_match.distance('hello', 'hero'))

Output:

0.6
0.4
2

See the project wiki for more details.

License

smart-match is free software. See the LICENSE file for the full text.
