Text preprocessing package

Project description

Preprocess YourText

Preprocess YourText is a Python package, distributed on PyPI as mngdataclean, that simplifies cleaning and preparing text data for natural language processing (NLP).

Features

  • HTML Tag Removal: Easily remove HTML tags from text data.
  • URL Removal: Remove URLs from text data.
  • Email Removal: Remove email addresses from text data.
  • Special Character Removal: Remove special characters from text data.
  • Accent Removal: Remove accents from characters in text data.
  • Contractions Expansion: Expand contractions in text data (e.g., "don't" to "do not").
  • Lemmatization: Lemmatize words in text data to their base form.
  • Spelling Correction: Correct spelling mistakes in text data.
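To make the steps listed above concrete, here is a minimal standard-library sketch of what several of them can look like. It is not the package's implementation: every function name, regular expression, and the tiny `CONTRACTIONS` table are assumptions made for illustration. Lemmatization and spelling correction are left out because they need external language resources.

```python
import html
import re
import unicodedata

# Illustrative patterns only; real-world data usually needs stricter rules.
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
SPECIAL_RE = re.compile(r"[^\w\s]")

# Hypothetical, deliberately tiny lookup table.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "won't": "will not"}
CONTRACTION_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(key) for key in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)


def remove_html_tags(text: str) -> str:
    """Drop anything that looks like a tag, then decode entities such as &amp;."""
    return html.unescape(TAG_RE.sub("", text))


def remove_urls(text: str) -> str:
    """Remove http(s):// and www. URLs up to the next whitespace."""
    return URL_RE.sub("", text)


def remove_emails(text: str) -> str:
    """Remove simple name@domain.tld addresses."""
    return EMAIL_RE.sub("", text)


def remove_accents(text: str) -> str:
    """Decompose characters and discard the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_contractions(text: str) -> str:
    """Replace known contractions; the replacement is always lowercase."""
    return CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def remove_special_chars(text: str) -> str:
    """Keep only word characters and whitespace."""
    return SPECIAL_RE.sub("", text)


def basic_clean(text: str) -> str:
    """Apply the steps in an order that keeps them from interfering."""
    text = remove_html_tags(text)
    text = remove_emails(text)
    text = remove_urls(text)
    text = expand_contractions(text)  # must run before apostrophes are stripped
    text = remove_accents(text)
    text = remove_special_chars(text)
    return " ".join(text.split())  # collapse leftover whitespace
```

The order matters in this sketch: contractions are expanded before special characters are removed, because stripping the apostrophe first would turn "don't" into "dont" and the lookup would no longer match.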

License

This project is licensed under the MIT License. See the LICENSE file for details.

Installation

You can install the package via pip:

pip install mngdataclean
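After installing, one way to confirm that the distribution is visible to the current interpreter is to query its metadata. This is a small sketch using the standard library's `importlib.metadata` (Python 3.8 or later); the helper name `installed_version` is an assumption made for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "mngdataclean") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version is not None else "mngdataclean is not installed")
```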

Usage

import mngdataclean as mdc

# Example: clean a string that contains HTML tags
text = "This is an example text with HTML tags <b>and URLs</b>."
clean_text = mdc.get_clean(text)
print(clean_text)

# Output:
# This is an example text with HTML tags and URLs.
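In practice a cleaning function is usually applied to many documents rather than one string. The sketch below shows one way to do that. The helper `clean_corpus` is hypothetical and not part of mngdataclean; it accepts any callable that maps a string to a string, so `mdc.get_clean` could be passed in as the cleaner.

```python
from typing import Callable, Iterable, List


def clean_corpus(texts: Iterable[object], cleaner: Callable[[str], str]) -> List[str]:
    """Apply `cleaner` to each usable text, skipping non-strings and blank entries."""
    cleaned = []
    for text in texts:
        if not isinstance(text, str) or not text.strip():
            continue  # ignore None, numbers, and empty or whitespace-only strings
        cleaned.append(cleaner(text))
    return cleaned
```

With the package installed, a call might look like `clean_corpus(documents, mdc.get_clean)`, where `documents` is any iterable of raw strings.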

Download files

Source Distribution

mngdataclean-0.4.2.tar.gz (3.3 kB, source)

Built Distribution

mngdataclean-0.4.2-py3-none-any.whl (3.6 kB, Python 3)

