A library for cleaning and standardizing inconsistently spelled words.

License: MIT

string_treatment is a library for cleaning and standardizing inconsistently spelled words.

Overview

This library uses string similarity from rapidfuzz to group words with similar spelling into clusters, mapping each word to the most frequent (canonical) form within its cluster.
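The clustering idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the library's actual code: the names `similarity` and `standardize` and the `threshold` parameter are hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for a rapidfuzz scorer so the sketch runs without extra dependencies (`rapidfuzz.fuzz.ratio`, which returns a score from 0 to 100, could be swapped in).

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; a stand-in for a rapidfuzz scorer (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def standardize(words, threshold=0.8):
    """Hypothetical helper: map each word to the most frequent form in its cluster."""
    counts = Counter(words)
    unique = list(counts)
    parent = {w: w for w in unique}  # union-find forest over distinct words

    def find(w):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    # Merge every pair of distinct words whose similarity reaches the threshold.
    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            if similarity(a, b) >= threshold:
                parent[find(a)] = find(b)

    # Collect the members of each cluster under its root.
    clusters = {}
    for w in unique:
        clusters.setdefault(find(w), []).append(w)

    # The canonical form is the member that occurs most often in the input.
    mapping = {}
    for members in clusters.values():
        canonical = max(members, key=lambda w: counts[w])
        for w in members:
            mapping[w] = canonical
    return mapping
```

Calling `standardize(["apple", "apple", "aple"])` maps both spellings to "apple". Because clusters are formed transitively, a low threshold can chain unrelated words together, which is one reason a visual check of the groupings is useful.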

Since the clustering process may not always be perfectly accurate, the library can generate an interactive graph to help visualize the groupings.
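To give a rough idea of what such a graph contains, the sketch below lists the word pairs that would be linked and writes them out as Graphviz DOT text. It is an assumption-laden illustration, not the library's graph feature: the functions `similarity_edges` and `to_dot` are hypothetical, the output is static rather than interactive, and `difflib.SequenceMatcher` again stands in for a rapidfuzz scorer.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity_edges(words, threshold=0.8):
    """Hypothetical helper: return (word_a, word_b, score) for each similar pair."""
    unique = sorted(set(words))
    edges = []
    for a, b in combinations(unique, 2):
        score = SequenceMatcher(None, a, b).ratio()  # stand-in scorer (assumption)
        if score >= threshold:
            edges.append((a, b, round(score, 2)))
    return edges


def to_dot(edges):
    """Render the edges as an undirected Graphviz DOT graph.

    Words containing double quotes would need escaping; this sketch skips that.
    """
    lines = ["graph clusters {"]
    for a, b, score in edges:
        lines.append(f'  "{a}" -- "{b}" [label="{score}"];')
    lines.append("}")
    return "\n".join(lines)
```

Each connected component of the resulting graph corresponds to one cluster, so an unexpected edge between two unrelated words is easy to spot.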

[Image: example graph]

Installation

Install the latest stable version from PyPI:

pip install string-treatment

Example

For a usage example, see the test script test_standardize.py in the repository root.


Download files

Source Distribution

string_treatment-2.0.0.tar.gz (5.3 kB)

Uploaded Source

Built Distribution

string_treatment-2.0.0-py3-none-any.whl (5.9 kB)

Uploaded Python 3

File details

Details for the file string_treatment-2.0.0.tar.gz.

File metadata

  • Download URL: string_treatment-2.0.0.tar.gz
  • Upload date:
  • Size: 5.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for string_treatment-2.0.0.tar.gz
Algorithm Hash digest
SHA256 b607e761395e4b722bf185795ea0a72a31b2d5f22762a3a145801f5f5f9966a6
MD5 bcbfd32b51b942e56cf5dd2f831dca3b
BLAKE2b-256 ec0be4ca13f46e008818b680fc9a398ce9a986dc5c2b03d4cf61e682c4f2abb2

File details

Details for the file string_treatment-2.0.0-py3-none-any.whl.

File hashes

Hashes for string_treatment-2.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0189ea0f257b2adea1293a507849e2475de67ff8662881d37e3d8b0d02796ae5
MD5 a0d48b9f878b798e6f5d4895cf655171
BLAKE2b-256 400a1c55e78a75841570af80204b7fe59094787b4f52b90dca173785151ece4e
