Extension for `pandas` to use `rapidfuzz` for fuzzy matching.

pandas-fuzz

Requirements

Installation

pip install pandas_fuzz

Usage

To register the extension, import pandas_fuzz before using it.

import pandas as pd
import pandas_fuzz

Alternatively, you can import pandas from pandas_fuzz directly.

from pandas_fuzz import pandas as pd

rapidfuzz.fuzz

pandas_fuzz integrates the following functions from rapidfuzz.fuzz into pandas. These functions are available in the fuzz namespace for both pandas.Series and pandas.DataFrame.

  • rapidfuzz.fuzz.ratio
  • rapidfuzz.fuzz.partial_ratio
  • rapidfuzz.fuzz.partial_ratio_alignment
  • rapidfuzz.fuzz.token_sort_ratio
  • rapidfuzz.fuzz.token_set_ratio
  • rapidfuzz.fuzz.token_ratio
  • rapidfuzz.fuzz.partial_token_sort_ratio
  • rapidfuzz.fuzz.partial_token_set_ratio
  • rapidfuzz.fuzz.partial_token_ratio
  • rapidfuzz.fuzz.WRatio
  • rapidfuzz.fuzz.QRatio

pandas.Series

Apply fuzz.ratio element-wise to a pd.Series.

>>> pd.Series(["this is a test", "this is a test!"]).fuzz.ratio("this is a test!")
0     96.551724
1    100.000000
dtype: float64

pandas.DataFrame

Apply fuzz.ratio row-wise to the columns s1 and s2.

>>> pd.DataFrame({
...     "s1": ["this is a test", "this is a test!"],
...     "s2": ["this is a test", "this is a test!"],
... }).fuzz.ratio("s1", "s2")
0    100.0
1    100.0
dtype: float64

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandas_fuzz-0.1.4.tar.gz (6.8 kB)

Uploaded Source

Built Distribution

pandas_fuzz-0.1.4-py3-none-any.whl (6.4 kB)

Uploaded Python 3

File details

Details for the file pandas_fuzz-0.1.4.tar.gz.

File metadata

  • Download URL: pandas_fuzz-0.1.4.tar.gz
  • Size: 6.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.10.12 Linux/6.5.0-1025-azure

File hashes

Hashes for pandas_fuzz-0.1.4.tar.gz
Algorithm Hash digest
SHA256 081dd8b00bb735f6f09e1ac48e20fce1c7e3951043296786a7733c505cdc65ac
MD5 2e089c25a7b8cc18394d2d99f6ceb14a
BLAKE2b-256 0731dec799c353332627e8e224166b1965b1d26a21b31b3d9a351c7e8732f4d4

File details

Details for the file pandas_fuzz-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: pandas_fuzz-0.1.4-py3-none-any.whl
  • Size: 6.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.10.12 Linux/6.5.0-1025-azure

File hashes

Hashes for pandas_fuzz-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c510b25c7cdff3958b0c3fa823f63408e3821f81779eebb497e7c7c4e27aaa59
MD5 c2874c8b9700b48c878065333d53d0a7
BLAKE2b-256 365023ab0e7f2b6b8f87d4a95ec8efe88f5c851999e68a77506035db89e01f51
