A lightweight text preprocessing toolkit with tokenization, stopword removal, stemming, lemmatization, and emoji/special character cleaning.

process-text

A simple and lightweight text preprocessing toolkit for NLP pipelines.

Features

  • Lowercasing
  • Punctuation removal
  • Tokenization
  • Stopword removal
  • Stemming
  • Lemmatization with POS tagging
  • Emoji removal
  • Special character filtering
  • Misspelling correction using TextBlob
  • Compose multiple transformations into a pipeline
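To illustrate what composing transformations into a pipeline can look like, here is a minimal sketch using only the Python standard library. It does not use the process-text API: every name below (compose, lowercase, remove_non_ascii, remove_punctuation, tokenize, remove_stopwords, STOPWORDS) is hypothetical and chosen for illustration, and the stopword list is a tiny stand-in for a real one. Stemming, lemmatization, and spelling correction are left out because they need third-party resources.

```python
import string
from typing import Any, Callable, List

# Hypothetical, tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "and", "of", "to"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def lowercase(text: str) -> str:
    return text.lower()


def remove_non_ascii(text: str) -> str:
    # Crude stand-in for emoji/special-character cleaning:
    # drops every non-ASCII character, including accented letters.
    return text.encode("ascii", "ignore").decode("ascii")


def remove_punctuation(text: str) -> str:
    return text.translate(_PUNCT_TABLE)


def tokenize(text: str) -> List[str]:
    # Whitespace tokenization; real tokenizers handle far more cases.
    return text.split()


def remove_stopwords(tokens: List[str]) -> List[str]:
    return [t for t in tokens if t not in STOPWORDS]


def compose(*steps: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain steps left to right; each step receives the previous output."""
    def pipeline(value: Any) -> Any:
        for step in steps:
            value = step(value)
        return value
    return pipeline


preprocess = compose(
    lowercase,
    remove_non_ascii,
    remove_punctuation,
    tokenize,
    remove_stopwords,
)

print(preprocess("The cat is on the mat! 🎉"))  # ['cat', 'on', 'mat']
```

Order matters in such a chain: string-level steps (lowercasing, character cleaning) run before tokenization, and token-level steps (stopword removal) run after it.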

Installation

pip install process-text

Download files

Source Distribution

akoang_library-0.1.0.tar.gz (2.9 kB)

Uploaded Source

Built Distribution

akoang_library-0.1.0-py3-none-any.whl (3.2 kB)

Uploaded Python 3

File details

Details for the file akoang_library-0.1.0.tar.gz.

File metadata

  • Download URL: akoang_library-0.1.0.tar.gz
  • Upload date:
  • Size: 2.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.21

File hashes

Hashes for akoang_library-0.1.0.tar.gz
Algorithm Hash digest
SHA256 c3ddc302a1eb1ade89c6eeb8ed3d10664124bb1bf028a85e7d611b35c94c36bf
MD5 ac919be563409d08cb1714a63ea303ab
BLAKE2b-256 cf1654e976db147c160e33dce490bc710ee2252fb5d292d285588847652ef6b8
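
A downloaded file can be checked against a published SHA256 digest before installing it. The sketch below uses only Python's standard hashlib module; the helper names sha256_of_file and verify_sha256 are hypothetical, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the source archive is in the current directory):
# verify_sha256(
#     "akoang_library-0.1.0.tar.gz",
#     "c3ddc302a1eb1ade89c6eeb8ed3d10664124bb1bf028a85e7d611b35c94c36bf",
# )
```

A result of False means the file differs from the one the digest was computed for and should not be installed.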

File details

Details for the file akoang_library-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: akoang_library-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 3.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.21

File hashes

Hashes for akoang_library-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fdf40d1dffde3711d9ce7b05cb1970820f180d1ad0b5634de2f843199c7abd80
MD5 362237475b7a6807600dc7ac0dfe4261
BLAKE2b-256 d822d772af1cbadc6674c03e67b32fcd968d3f357a7b219c69c4096b534bf5af
