This package contains preprocessing functions

Project description

NLPPREPROCESS

NLPPREPROCESS is a preprocessing package for NLP tasks. Its main objective is to reduce the time spent on preprocessing by providing ready-made functions.

Requirements

  • Python 3.4 or higher

Installation

Using pip via PyPI

$ pip install nlppreprocess

Manually via Git

$ git clone https://github.com/gaganmanku96/nlppreprocess
$ cd nlppreprocess
$ python setup.py install

Functionalities

  1. Replace words
  2. Remove stopwords
  3. Remove numbers
  4. Remove HTML tags
  5. Remove punctuation
  6. Lemmatize words using either WordNet or Snowball
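
To make the cleaning steps above concrete, here is a minimal standard-library sketch of items 3 to 5 (numbers, HTML tags, punctuation). It is not the package's implementation; `strip_noise` and the regular expressions are assumptions chosen for illustration, and the real NLP class may behave differently.

```python
import re
import string
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")      # anything that looks like an HTML tag
NUMBER_RE = re.compile(r"\d+")       # runs of digits
PUNCT_TABLE = str.maketrans("", "", string.punctuation)  # ASCII punctuation only


def strip_noise(text):
    """Hypothetical helper: drop HTML tags, digits, and ASCII punctuation."""
    text = TAG_RE.sub(" ", text)         # replace tags with a space
    text = unescape(text)                # decode entities such as &amp;
    text = NUMBER_RE.sub(" ", text)      # replace digit runs with a space
    text = text.translate(PUNCT_TABLE)   # delete punctuation characters
    return " ".join(text.split())        # collapse repeated whitespace


print(strip_noise("<p>Call me at 555-1234, &amp; say hi!</p>"))
# prints: Call me at say hi
```

Replacing tags and digits with a space rather than deleting them keeps neighbouring words from being glued together.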

Usage

>>> from nlpuitls import NLP
>>> obj = NLP()

Parameters

>>> obj = NLP(
       replace_words=True,
       remove_stopwords=True,
       remove_numbers=True,
       remove_HTML_tags=True,
       remove_punctation=True,
       lemmatize=False,
       lemmatize_method='wordnet'
      )

Using with Pandas Library

>>> dataFrame['text'] = dataFrame['text'].apply(obj.process)

Using with plain text

>>> print(obj.process("Pass a text here"))

Add more stopwords

>>> obj = NLP()
>>> obj.add_stopword(['this', 'and this'])
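
As a rough illustration of what extending a stopword list does, the sketch below filters tokens against a set that can grow at runtime. `StopwordFilter` and `DEFAULT_STOPWORDS` are hypothetical stand-ins, not the package's NLP class or its built-in list, and the whitespace tokenization is an assumption.

```python
DEFAULT_STOPWORDS = {"a", "an", "the", "is", "and"}  # tiny illustrative list


class StopwordFilter:
    """Hypothetical stand-in that mimics an extendable stopword filter."""

    def __init__(self, stopwords=None):
        base = DEFAULT_STOPWORDS if stopwords is None else stopwords
        self.stopwords = {w.lower() for w in base}

    def add_stopword(self, words):
        # Accept a list of extra words and store them lowercased.
        self.stopwords.update(w.lower() for w in words)

    def process(self, text):
        # Keep only whitespace-separated tokens that are not stopwords.
        kept = [w for w in text.split() if w.lower() not in self.stopwords]
        return " ".join(kept)


f = StopwordFilter()
print(f.process("The cat is on the mat"))   # prints: cat on mat
f.add_stopword(["on"])
print(f.process("The cat is on the mat"))   # prints: cat mat
```

With this token-by-token design a multi-word entry would never match, so each added stopword should be a single word in the sketch.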

Add more replacement words

>>> obj = NLP()
>>> obj.add_replacement(this="by this", that="by that")
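
The idea behind replacement words can be sketched with a plain mapping applied on word boundaries. This is an assumption about the general technique, not the package's code; `replace_words` and the sample mapping are hypothetical.

```python
import re


def replace_words(text, replacements):
    """Hypothetical helper: swap whole words according to a mapping."""
    if not replacements:
        return text
    # Longest keys first so a longer key is tried before any shorter prefix.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    # Look up each matched word in the mapping (case-sensitive).
    return pattern.sub(lambda m: replacements[m.group(0)], text)


mapping = {"can't": "cannot", "won't": "will not"}
print(replace_words("I can't go, it won't work", mapping))
# prints: I cannot go, it will not work
```

The `\b` anchors keep a key such as "can" from matching inside a longer word such as "cannot".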

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

nlppreprocess-1.0.2-py3-none-any.whl (5.1 kB)

Uploaded Python 3

File details

Details for the file nlppreprocess-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: nlppreprocess-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 5.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.6.8

File hashes

Hashes for nlppreprocess-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 d3eacb1bab2d240d03083d85cedf629a6aafe5b526f6ced3d3f8061bb4bd0a93
MD5 f03ade7b659e291ff51dbdce6b6aea0a
BLAKE2b-256 668d3a0584b924248c865a8e7ee04a93175551ebcaf156ee9b73346cd62446e6
