
A smart match package

Project description

Introduction

The smart-match module provides functions for calculating the similarity between strings or sets.

Concept

  1. similarity: A value in the range [0, 1] that represents how similar two strings are. The larger the value, the more similar the strings.

  2. dissimilarity: A value in the range [0, 1] that represents how dissimilar two strings are. The larger the value, the more dissimilar the strings. For any pair of strings, similarity = 1 - dissimilarity.

  3. distance: How far apart two strings are. Note that not all methods support distance.

  4. score: The larger the score, the more similar the two strings are. Note that not all methods support score.
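To make the relationship between distance, dissimilarity, and similarity concrete, here is a minimal sketch for the Levenshtein method. It does not use or reproduce smart-match's internals: the function names are hypothetical, and normalizing the distance by the length of the longer string is an assumption (one that happens to agree with the output shown under Usage).

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def dissimilarity(a: str, b: str) -> float:
    # Assumption: normalize by the longer string so the value lies in [0, 1].
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def similarity(a: str, b: str) -> float:
    # By definition, similarity = 1 - dissimilarity.
    return 1.0 - dissimilarity(a, b)


if __name__ == "__main__":
    print(levenshtein("hello", "hero"))
    print(dissimilarity("hello", "hero"))
    print(similarity("hello", "hero"))
```

The distance is an unbounded count of edits, while the two normalized values always sum to 1.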

We support three levels of string matching.

  1. char: Similarity computation based on characters in the strings.

  2. term: Similarity computation based on terms in the strings.

  3. gram: Similarity computation based on q-grams in the strings.
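The three levels differ only in how a string is split into units before comparison. The sketch below illustrates the idea with a set-based Jaccard similarity; the helper names, whitespace tokenization, unpadded q-grams, and the default q = 2 are all assumptions made for illustration and may not match how smart-match tokenizes.

```python
def char_tokens(s: str) -> list[str]:
    """char level: every character is a unit."""
    return list(s)


def term_tokens(s: str) -> list[str]:
    """term level: whitespace-separated words are the units (assumed)."""
    return s.split()


def qgram_tokens(s: str, q: int = 2) -> list[str]:
    """gram level: overlapping substrings of length q, without padding."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def jaccard(x: list[str], y: list[str]) -> float:
    """Set-based Jaccard similarity: |X ∩ Y| / |X ∪ Y|."""
    sx, sy = set(x), set(y)
    if not sx and not sy:
        return 1.0  # two empty inputs are treated as identical
    return len(sx & sy) / len(sx | sy)


if __name__ == "__main__":
    a, b = "hello", "hero"
    print(qgram_tokens(a))
    print(jaccard(char_tokens(a), char_tokens(b)))
    print(jaccard(qgram_tokens(a), qgram_tokens(b)))
    print(jaccard(term_tokens("smart match"), term_tokens("smart package")))
```

The same pair of strings can score quite differently depending on the level, so the choice of unit matters as much as the choice of method.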

Methods

We support the following methods.

Method similarity dissimilarity distance score
Levenshtein (default)
Euclidean
Damerau Levenshtein
Block Distance
Cosine
Tanimoto Coefficient
Dice
Simon White
Longest Common Substring
Longest Common Subsequence
Overlap Coefficient
Generalized Overlap Coefficient
Jaccard
Generalized Jaccard
Hamming
Jaro
Jaro Winkler
Needleman Wunsch
Smith Waterman
Smith Waterman Gotoh
Monge Elkan

Installation

pip install smart-match

Usage

import smart_match
print(smart_match.similarity('hello', 'hero'))
print(smart_match.dissimilarity('hello', 'hero'))
print(smart_match.distance('hello', 'hero'))

Output:

0.6
0.4
2

See the project Wiki for more details.

License

smart-match is free software. See the LICENSE file for the full text.

