This package cleans English text as a text preprocessing step.
Project description
A Python Package for Cleaning English Text
Text cleaning is a common preprocessing step for almost all NLP tasks. I designed this package mainly for text classification, but you can use it for other NLP tasks as well. Contributions to the package are welcome.
Install the package
pip install eng_text_cleaner
The package provides a number of methods to clean text: removing emoji, punctuation, HTML tags, URLs, non-word characters (anything that is not a letter, digit, or underscore), digits, and stopwords, as well as correcting spelling and lemmatizing words. One method, clean_text, applies all of these steps in a single call. Let's explore the package.
from eng_text_cleaner import preprocessing
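To give a feel for what some of the listed steps do, here is a minimal standard-library sketch of removing HTML tags, URLs, and digits. The helper name strip_noise and the regular expressions are assumptions made for illustration; they are not the package's API and may differ from how eng_text_cleaner implements these steps.

```python
import re

# Hypothetical patterns for illustration only; not taken from eng_text_cleaner.
HTML_TAG_PATTERN = re.compile(r"<[^>]+>")
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
DIGIT_PATTERN = re.compile(r"\d+")


def strip_noise(text: str) -> str:
    """Remove HTML tags, URLs and digits, then collapse repeated whitespace."""
    # Tags go first so that a URL directly followed by a tag is not merged with it.
    text = HTML_TAG_PATTERN.sub(" ", text)
    text = URL_PATTERN.sub(" ", text)
    text = DIGIT_PATTERN.sub(" ", text)
    return " ".join(text.split())


sample = "<p>Bought 2 units from https://example.com today</p>"
print(strip_noise(sample))  # Bought units from today
```

Each step replaces its match with a space rather than an empty string, so neighbouring words are not glued together; the final split and join collapses the extra spaces.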
Start by removing punctuation
text = "Neither too small nor too large, and nice resolution at a good price."
# create a TextCleaner instance
textcleaner = preprocessing.TextCleaner()
# remove punctuation
textcleaner.remove_punctuation(text)
Output:
Neither too small nor too large and nice resolution at a good price
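For comparison, the same result can be reproduced with the standard library alone. This is only a sketch of the idea, assuming that punctuation means the ASCII characters in string.punctuation; the function name strip_punctuation is hypothetical and the package's own implementation may work differently.

```python
import string


def strip_punctuation(text: str) -> str:
    """Delete every ASCII punctuation character from the text."""
    # str.maketrans with a third argument maps those characters to None (deletion).
    return text.translate(str.maketrans("", "", string.punctuation))


text = "Neither too small nor too large, and nice resolution at a good price."
print(strip_punctuation(text))
# Neither too small nor too large and nice resolution at a good price
```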
To clean the text fully
# fully clean the text
textcleaner.clean_text(text)
Output:
neither small large nice resolution good price
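The output above is consistent with lowercasing, punctuation removal, and stopword filtering applied in sequence. The sketch below chains those three steps using only the standard library. The function name clean and the tiny stopword set are assumptions chosen for this one sentence; the package's actual stopword list and its additional steps (such as spell correction and lemmatization) are not reproduced here.

```python
import string

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "and", "at", "nor", "too"}


def clean(text: str) -> str:
    """Lowercase the text, drop ASCII punctuation, then remove stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = [token for token in no_punct.split() if token not in STOPWORDS]
    return " ".join(tokens)


text = "Neither too small nor too large, and nice resolution at a good price."
print(clean(text))
# neither small large nice resolution good price
```

Lowercasing before the stopword check matters here: the stopword set is lowercase, so a capitalized stopword would otherwise slip through.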
Author:
- Md Abdullah Al Hasib
Source Distribution
File details
Details for the file eng_text_cleaner-0.0.4.tar.gz.
File metadata
- Download URL: eng_text_cleaner-0.0.4.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | df26efc789367c9863e4b3f96d0a1eb70f3a15473fc652aee97f28496ad36df3
MD5 | ecbf77c8fa227c09052c99c033146477
BLAKE2b-256 | f40422983da95e3f046be1703af0ce44ea2d6022e8738c05187c8899c6cc5763