
This package cleans English text as a text preprocessing step.


Text Cleaning for English: A Python Package

Text cleaning is a common preprocessing step for almost all NLP tasks. I designed this package mainly for text classification, but you can use it for other NLP tasks as well. Contributions to the package are welcome.

Install the package

pip install eng-text-cleaner

The package provides a number of methods for cleaning text: removing emoji, punctuation, HTML tags, URLs, characters that are not letters, digits, or underscores, digits, and stopwords, as well as correcting spelling and lemmatizing words. A single method, clean_text, applies all of these steps at once. Let's explore the package.

from eng_text_cleaner import preprocessing 

Start by removing punctuation

text = "Neither too small nor too large, and nice resolution at a good price."
# create textcleaner instance
textcleaner = preprocessing.TextCleaner()
# remove punctuation
textcleaner.remove_punctuation(text)

Output:

Neither too small nor too large and nice resolution at a good price
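For readers curious about what a punctuation-removal step involves, here is a minimal standard-library sketch. It is not the package's implementation, which is not shown here. The helper name strip_punctuation is hypothetical, and the sketch assumes that only ASCII punctuation (string.punctuation) needs to be removed.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def strip_punctuation(text: str) -> str:
    """Return text with ASCII punctuation removed (hypothetical helper)."""
    return text.translate(_PUNCT_TABLE)


text = "Neither too small nor too large, and nice resolution at a good price."
print(strip_punctuation(text))
# Neither too small nor too large and nice resolution at a good price
```

Using str.translate with a prebuilt table avoids a per-character Python loop, which matters when cleaning many documents.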

To clean the text fully

# fully clean the text
textcleaner.clean_text(text)

Output:

neither small large nice resolution good price
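To give a rough idea of how several cleaning steps can be chained, the sketch below strips URLs and HTML tags, lowercases the text, removes punctuation, and drops stopwords. It is only an illustration under stated assumptions, not the package's own code: the function name simple_clean, the regular expressions, and the tiny STOPWORDS set are all hypothetical, and spell correction and lemmatization are left out.

```python
import re
import string

# Tiny illustrative stopword list (an assumption); real lists are much longer.
STOPWORDS = {"a", "an", "and", "at", "in", "nor", "of", "the", "to", "too"}

# Simplified patterns (assumptions): they cover common cases, not every URL or tag.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def simple_clean(text: str) -> str:
    """Apply a few cleaning steps in sequence (hypothetical helper)."""
    text = URL_RE.sub(" ", text)                 # drop URLs before punctuation is stripped
    text = TAG_RE.sub(" ", text)                 # drop HTML tags
    text = text.lower().translate(PUNCT_TABLE)   # lowercase, then remove punctuation
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return " ".join(tokens)


print(simple_clean("Neither too small nor too large, and nice resolution at a good price."))
# neither small large nice resolution good price
```

The order of steps matters in this sketch: URLs and tags are removed first, because stripping punctuation beforehand would destroy the characters the patterns rely on.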

Author:

  • Md Abdullah Al Hasib
