Fuzzy string matching in Python

FuzzyWuzzy

Fuzzy string matching like a boss. It uses Levenshtein distance to calculate the differences between sequences, wrapped in a simple-to-use package.
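For readers new to the term, Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The following is a minimal pure-Python sketch of that idea, not the code FuzzyWuzzy itself runs; the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            substitute_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, substitute_cost))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The scores shown under Usage are normalized to a 0-100 scale rather than reported as raw edit counts.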

Requirements

For testing

  • pycodestyle

  • hypothesis

  • pytest

Installation

Using pip via PyPI

pip install fuzzywuzzy

or, to install python-Levenshtein as well:

pip install fuzzywuzzy[speedup]

Using pip via GitHub

pip install git+https://github.com/seatgeek/fuzzywuzzy.git@0.18.0#egg=fuzzywuzzy

Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)

git+ssh://git@github.com/seatgeek/fuzzywuzzy.git@0.18.0#egg=fuzzywuzzy

Manually via Git

git clone https://github.com/seatgeek/fuzzywuzzy.git fuzzywuzzy
cd fuzzywuzzy
python setup.py install

Usage

>>> from fuzzywuzzy import fuzz
>>> from fuzzywuzzy import process

Simple Ratio

>>> fuzz.ratio("this is a test", "this is a test!")
    97
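To see where a score like 97 can come from, here is a rough sketch using the standard library's `difflib.SequenceMatcher`, which computes 2*M/T (M matched characters, T total characters in both strings). The helper name `simple_ratio` is hypothetical, and the library's own result may differ slightly depending on whether python-Levenshtein is installed.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Similarity of two strings on a 0-100 scale (illustrative sketch)."""
    return round(100 * SequenceMatcher(None, s1, s2).ratio())


# 14 matched characters out of 14 + 15 in total: 2 * 14 / 29 = 0.9655...
print(simple_ratio("this is a test", "this is a test!"))  # 97
```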

Partial Ratio

>>> fuzz.partial_ratio("this is a test", "this is a test!")
    100
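The partial ratio rewards the case where the shorter string appears almost intact inside the longer one. The brute-force sketch below slides a window the length of the shorter string across the longer one and keeps the best score. It is an approximation for illustration only: the name `partial_ratio_sketch` is hypothetical, the library picks candidate windows differently, and returning 0 for an empty input is a choice made for this sketch.

```python
from difflib import SequenceMatcher


def partial_ratio_sketch(s1: str, s2: str) -> int:
    """Best 0-100 score of the shorter string against any equal-length window."""
    shorter, longer = sorted((s1, s2), key=len)
    if not shorter:
        return 0  # sketch-only choice for empty input
    window = len(shorter)
    best = 0.0
    for start in range(len(longer) - window + 1):
        candidate = longer[start:start + window]
        best = max(best, SequenceMatcher(None, shorter, candidate).ratio())
    return round(100 * best)


print(partial_ratio_sketch("this is a test", "this is a test!"))  # 100
```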

Token Sort Ratio

>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100
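The idea behind the token sort ratio is to make word order irrelevant: split each string into tokens, sort them, rejoin, and compare the results. A minimal sketch under simplifying assumptions follows; `token_sort_ratio_sketch` is a hypothetical name, and this version only lowercases and splits on whitespace, whereas the library applies more preprocessing.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(s1: str, s2: str) -> int:
    """Compare two strings after sorting their whitespace-separated tokens."""

    def normalize(text: str) -> str:
        return " ".join(sorted(text.lower().split()))

    matcher = SequenceMatcher(None, normalize(s1), normalize(s2))
    return round(100 * matcher.ratio())


print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```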

Token Set Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100
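The token set ratio goes one step further and ignores repeated tokens: it compares the tokens the two strings share against each string's shared-plus-unique tokens and keeps the best score. The sketch below illustrates that idea with the standard library; the names `token_set_ratio_sketch` and `_ratio` are hypothetical, and the preprocessing is simpler than the library's.

```python
from difflib import SequenceMatcher


def _ratio(a: str, b: str) -> int:
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_set_ratio_sketch(s1: str, s2: str) -> int:
    """Compare strings by shared and unique tokens, ignoring duplicates and order."""
    tokens1 = set(s1.lower().split())
    tokens2 = set(s2.lower().split())
    common = " ".join(sorted(tokens1 & tokens2))
    only1 = " ".join(sorted(tokens1 - tokens2))
    only2 = " ".join(sorted(tokens2 - tokens1))
    combined1 = f"{common} {only1}".strip()  # shared tokens, then those unique to s1
    combined2 = f"{common} {only2}".strip()  # shared tokens, then those unique to s2
    return max(
        _ratio(common, combined1),
        _ratio(common, combined2),
        _ratio(combined1, combined2),
    )


print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))  # 100
```

With this approach, a string whose tokens are a subset of the other's also scores 100, which is why it suits inputs of very different lengths.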

Process

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
    ('Dallas Cowboys', 90)
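Conceptually, extraction scores every choice against the query, sorts by score, and returns the top results. The sketch below shows that shape with a plain `difflib`-based scorer. The names `extract_sketch`, `extract_one_sketch`, and `simple_ratio` are hypothetical, and because the library's default scorer is a weighted combination of ratios rather than this simple one, the scores here will not match the numbers above.

```python
from difflib import SequenceMatcher


def simple_ratio(s1: str, s2: str) -> int:
    """Case-insensitive 0-100 similarity (illustrative scorer)."""
    return round(100 * SequenceMatcher(None, s1.lower(), s2.lower()).ratio())


def extract_sketch(query, choices, scorer=simple_ratio, limit=5):
    """Return up to `limit` (choice, score) pairs, best score first."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def extract_one_sketch(query, choices, scorer=simple_ratio):
    """Return the single best (choice, score) pair, or None if there are no choices."""
    best = extract_sketch(query, choices, scorer=scorer, limit=1)
    return best[0] if best else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(extract_sketch("new york jets", choices, limit=2))
print(extract_one_sketch("cowboys", choices))
```

Keeping the scorer as a plain function argument lets any of the ratio functions be swapped in without changing the ranking logic.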

You can also pass additional parameters to the extractOne method to make it use a specific scorer. A typical use case is matching file paths (here songs is a list of file path strings):

>>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
    ('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
    ("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)

Known Ports

FuzzyWuzzy is being ported to other languages too! Here are a few ports we know about:
