
Text cleaning and feature extraction using NLP and traditional approaches.

Project description

NLPurify


NLPurify is a text cleaning and extraction engine. It combines traditional techniques, such as Unicode translation and regular-expression cleaning, with modern tools, such as natural language processing (NLP) and large language models (LLMs), to detect and clean long texts and to create word vectors.
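
To make the "traditional" side of that description concrete, here is a minimal sketch of Unicode translation followed by regular-expression cleaning, written with the Python standard library only. It does not use or reproduce the NLPurify API; the function name `clean_text` and the character mapping are assumptions chosen for illustration.

```python
import re
import unicodedata

# Hypothetical mapping (not taken from NLPurify): typographic characters
# translated to plain ASCII equivalents.
_TRANSLATION = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
})

# Any run of whitespace (spaces, tabs, newlines) collapses to one space.
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Illustrative cleaner: normalize, translate, then collapse whitespace."""
    # NFKC folds compatibility characters, e.g. a no-break space becomes a space.
    text = unicodedata.normalize("NFKC", text)
    # Unicode translation step: replace typographic punctuation.
    text = text.translate(_TRANSLATION)
    # Regular-expression step: collapse whitespace and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


print(clean_text("\u201cHello\u201d \u2014  world\u00a0!\n"))
```

Keeping the translation table and the regular expression at module level means they are built once and reused across calls, which matters when cleaning long texts.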

Getting Started

The source code is hosted on GitHub at sharkutilities/NLPurify. Binary installers for the latest release are available from the Python Package Index (PyPI).

pip install -U NLPurify

The module is currently under development, and new ideas are welcome. To propose one, open a new PR or issue. The changes between each release are available here.


[!CAUTION] This code deprecates the previously designed GitHub Gist. Check #1 for more details.

[!NOTE] Legacy code is available as a submodule. Check #5 for more details.



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nlpurify-2.0.0a0.tar.gz (14.2 kB)

Uploaded Source

Built Distribution

NLPurify-2.0.0a0-py3-none-any.whl (16.8 kB)

Uploaded Python 3

File details

Details for the file nlpurify-2.0.0a0.tar.gz.

File metadata

  • Download URL: nlpurify-2.0.0a0.tar.gz
  • Upload date:
  • Size: 14.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for nlpurify-2.0.0a0.tar.gz
Algorithm Hash digest
SHA256 341e810b7b0b433baeeb156b7e06b815df23b33a6d16fd1ff7a74ed0903c30b1
MD5 4dc197da12817c72036b21b4df8692b6
BLAKE2b-256 cba35b4867b7f8ae4fcef08d8ebf4f25b6cd238a1c9b71a4ae0aafb6bbdc9f4a


File details

Details for the file NLPurify-2.0.0a0-py3-none-any.whl.

File metadata

  • Download URL: NLPurify-2.0.0a0-py3-none-any.whl
  • Upload date:
  • Size: 16.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.4

File hashes

Hashes for NLPurify-2.0.0a0-py3-none-any.whl
Algorithm Hash digest
SHA256 ebd9c8e1280c829bf3607dfaa6913a5df749a7396817cc04e7874ed259a6cc63
MD5 d130ef87526dc2e30dfa6f7ce5ca02ed
BLAKE2b-256 c508ac2b83d344ebb7ed0c435902875c28062ff249c594e4518b6e2c720d8b7d
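
The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `matches_expected` and the local file path are assumptions for illustration, while the expected value is the SHA256 digest listed for nlpurify-2.0.0a0.tar.gz.

```python
import hashlib
import hmac

# SHA256 digest listed for nlpurify-2.0.0a0.tar.gz.
EXPECTED_SHA256 = "341e810b7b0b433baeeb156b7e06b815df23b33a6d16fd1ff7a74ed0903c30b1"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected):
    """Compare the file's digest with the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected.lower())


if __name__ == "__main__":
    # Hypothetical location of the downloaded source distribution.
    archive = "nlpurify-2.0.0a0.tar.gz"
    try:
        ok = matches_expected(archive, EXPECTED_SHA256)
        print("digest matches" if ok else "digest MISMATCH")
    except FileNotFoundError:
        print(f"{archive} not found; download it first")
```

The wheel can be checked the same way by substituting its file name and its listed SHA256 digest.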

