This package cleans English text as a text preprocessing step.

English Text Cleaning Python Package

Text cleaning is a common preprocessing step for almost all NLP tasks. I designed this package mainly for text classification, but you can use it for other NLP tasks as well. Contributions to the package are welcome.

Install the package

pip install eng_text_cleaner

The package provides a number of methods to clean text: removing emoji, punctuation, HTML tags, URLs, characters other than letters, digits, or underscores, digits, and stopwords, as well as spell correction and lemmatization. One method, clean_text, applies all of these in a single call. Let's explore the package.

from eng_text_cleaner import preprocessing 

Start by removing punctuation

text = "Neither too small nor too large, and nice resolution at a good price."
# create a TextCleaner instance
textcleaner = preprocessing.TextCleaner()
# remove punctuation and print the result
print(textcleaner.remove_punctuation(text))

Output:

Neither too small nor too large and nice resolution at a good price
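
For readers curious how punctuation removal can work, here is a minimal standard-library sketch. It is not the package's actual implementation. The function name strip_punctuation is hypothetical, and it assumes that only ASCII punctuation needs to be removed.

```python
import string


def strip_punctuation(text: str) -> str:
    """Delete every ASCII punctuation character from text (illustrative only)."""
    # A translation table that maps each punctuation character to "deleted".
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)


text = "Neither too small nor too large, and nice resolution at a good price."
print(strip_punctuation(text))
```

Building the translation table once and calling str.translate avoids looping over characters in Python, which matters when cleaning a large corpus.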

Clean the text fully

# fully clean the text and print the result
print(textcleaner.clean_text(text))

Output:

neither small large nice resolution good price
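
To show what a chained cleaning pass might look like, the sketch below combines a few of the steps listed earlier (lowercasing, HTML tag, URL, punctuation, digit, and stopword removal) using only the standard library. It is an assumption-laden illustration, not how clean_text is implemented. The name simple_clean, the regular expressions, and the tiny STOPWORDS set are all hypothetical, and spell correction and lemmatization are left out.

```python
import re
import string

# Tiny illustrative stopword set (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "at", "in", "nor", "of", "the", "to", "too"}

TAG_RE = re.compile(r"<[^>]+>")                    # HTML tags such as <p> or </div>
URL_RE = re.compile(r"https?://\S+|www\.\S+")      # http(s) and www-style URLs
DIGIT_RE = re.compile(r"\d+")                      # runs of digits
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def simple_clean(text: str) -> str:
    """Apply a fixed sequence of cleaning steps and return the cleaned text."""
    text = text.lower()
    text = TAG_RE.sub(" ", text)         # strip tags first so URLs are not glued to them
    text = URL_RE.sub(" ", text)         # strip URLs before punctuation is removed
    text = text.translate(PUNCT_TABLE)   # drop ASCII punctuation
    text = DIGIT_RE.sub(" ", text)       # drop digits
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


text = "Neither too small nor too large, and nice resolution at a good price."
print(simple_clean(text))
```

The order of the steps matters: URLs have to be removed before punctuation, because stripping the punctuation first would break the "://" pattern that the URL expression relies on.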

Author:

  • Md Abdullah Al Hasib

Download files

Source Distribution

eng_text_cleaner-0.0.4.tar.gz (4.1 kB view details)

Uploaded Source

File details

Details for the file eng_text_cleaner-0.0.4.tar.gz.

File metadata

  • Download URL: eng_text_cleaner-0.0.4.tar.gz
  • Upload date:
  • Size: 4.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for eng_text_cleaner-0.0.4.tar.gz
Algorithm Hash digest
SHA256 df26efc789367c9863e4b3f96d0a1eb70f3a15473fc652aee97f28496ad36df3
MD5 ecbf77c8fa227c09052c99c033146477
BLAKE2b-256 f40422983da95e3f046be1703af0ce44ea2d6022e8738c05187c8899c6cc5763
