
Find the Jaro Winkler Distance which indicates the similarity score between two Strings



The Jaro measure is the weighted sum of the percentage of matched characters from each string and of transposed characters. Winkler increased this measure for strings whose initial characters match.
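To make the description above concrete, here is a minimal sketch of the textbook Jaro and Jaro-Winkler formulas. It is not the code shipped in pyjarowinkler: the function names `jaro` and `jaro_winkler`, the `max_prefix` parameter, and the Wikipedia-style match window (longer // 2 - 1) are assumptions made for illustration only.

```python
def jaro(first, second):
    """Textbook Jaro similarity (illustrative sketch, not the library code)."""
    if first == second:
        return 1.0
    len1, len2 = len(first), len(second)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Assumption: Wikipedia-style window, longer // 2 - 1 (never negative).
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(first):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and second[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order count as
    # half a transposition each.
    seq1 = [c for c, m in zip(first, matched1) if m]
    seq2 = [c for c, m in zip(second, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    m = float(matches)
    return (m / len1 + m / len2 + (m - transpositions) / m) / 3.0


def jaro_winkler(first, second, scaling=0.1, max_prefix=4):
    """Boost the Jaro score for a shared prefix of up to max_prefix characters."""
    base = jaro(first, second)
    prefix = 0
    for a, b in zip(first, second):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * scaling * (1.0 - base)


print(jaro("hello", "haloa"))
print(jaro_winkler("hello", "haloa"))
```

For "hello" and "haloa" there are three matches (h, l, o), no transpositions, and a shared prefix of length one, so the sketch gives roughly 0.7333 for Jaro and 0.76 after the Winkler adjustment.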

The Implementation

The original implementation is based on the Jaro Winkler Similarity Algorithm article that can be found on Wikipedia. This Python version of the original implementation is based on the Apache StringUtils library.

Correctness

Unit tests similar to those found in the StringUtils library were used to validate the implementation.

Note

StringUtils uses a limit of shorter / 2 + 1. This differs from Wikipedia and from Winkler’s paper, which use a distance of longer / 2 - 1, corresponding to positions of longer / 2. As of version 1.8, the algorithm works correctly with the "CRATE" - "TRACE" example from Wikipedia.
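The two limits in the note can be compared side by side with a small sketch. The helper names `stringutils_limit` and `wikipedia_limit` are hypothetical and exist only for this illustration, and integer (floor) division is assumed for both formulas. The sketch only computes the limits; it does not say which one a given release of the library applies.

```python
def stringutils_limit(first, second):
    # Limit described for StringUtils: shorter / 2 + 1 (floor division assumed).
    shorter = min(len(first), len(second))
    return shorter // 2 + 1


def wikipedia_limit(first, second):
    # Distance described by Wikipedia and Winkler's paper: longer / 2 - 1.
    longer = max(len(first), len(second))
    return longer // 2 - 1


for pair in [("CRATE", "TRACE"), ("DWAYNE", "DUANE")]:
    print(pair, stringutils_limit(*pair), wikipedia_limit(*pair))
```

For two five-character strings such as "CRATE" and "TRACE" the limits are 3 and 1 respectively, so the StringUtils-style limit lets characters match across a wider span than the Wikipedia one.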

Example

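>>> from pyjarowinkler import distance
>>> # Scaling is 0.1 by default
>>> # print(...) with a single argument works on both Python 2 and Python 3;
>>> # the digits shown are from Python 2, Python 3 prints more of them
>>> print(distance.get_jaro_distance("hello", "haloa", winkler=True, scaling=0.1))
0.76
>>> print(distance.get_jaro_distance("hello", "haloa", winkler=False, scaling=0.1))
0.733333333333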
Version:

1.8 of 2016-03-22


