preprocessing

pre-processing package for text strings

‘preprocessing’

Summary

A text pre-processing package to aid NLP package development in Python 3. It lets you apply text-cleaning functions in the order you choose, rather than relying on the order fixed by an arbitrary NLP package.

Installation

pip:

pip install preprocessing

PyPI: you can also download the source distribution from:

https://pypi.python.org/pypi/preprocessing/

You can then either run

pip install <path_to_tar_file>

on the downloaded tar file, or run

python setup.py install

from inside the extracted package directory to install preprocessing.

Example

Once the package is installed, using it from Python 3 takes the following form:

import preprocessing.text as ptext
from preprocessing.text import keyword_tokenize, remove_unbound_punct, remove_urls

text_string = "important string at: http://example.com"

clean_string = ptext.preprocess_text(text_string, [
    remove_urls,
    remove_unbound_punct,
    keyword_tokenize
])

>>> print(clean_string)
"important string"

If the functions are applied in a different order (i.e. keyword_tokenize -> remove_urls -> remove_unbound_punct), the result changes:

>>> print(clean_string)
"important string http example.com"

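To illustrate why the order of the cleaning functions changes the result, here is a minimal, self-contained sketch of the same pipeline idea. The names apply_in_order, strip_urls, strip_punct and drop_stopwords, and the tiny stopword set, are hypothetical stand-ins written for this illustration; they are not the preprocessing package's functions and say nothing about how the package is implemented, so their output differs slightly from the example above.

```python
import re
from functools import reduce

# Hypothetical stand-ins for illustration only; NOT the package's own functions.
STOPWORDS = {"at", "the", "a", "an", "of"}  # assumed, deliberately tiny list


def strip_urls(text):
    """Remove anything that still looks like an http(s) URL."""
    return re.sub(r"https?://\S+", "", text)


def strip_punct(text):
    """Replace punctuation with spaces, then collapse runs of whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text).split())


def drop_stopwords(text):
    """Lowercase, split on whitespace, and drop exact stopword matches."""
    return " ".join(w for w in text.lower().split() if w not in STOPWORDS)


def apply_in_order(text, functions):
    """Feed the output of each function into the next one, left to right."""
    return reduce(lambda acc, fn: fn(acc), functions, text)


text_string = "important string at: http://example.com"

# URL removed first: the URL is gone before punctuation is touched.
urls_first = apply_in_order(text_string, [strip_urls, strip_punct, drop_stopwords])

# URL removed last: punctuation stripping has already broken the URL apart,
# and "at:" did not match the stopword "at", so both survive.
urls_last = apply_in_order(text_string, [drop_stopwords, strip_punct, strip_urls])

print(urls_first)  # important string
print(urls_last)   # important string at http example com
```

Each function only sees what the previous one left behind, which is why the caller, not the library, should decide the order.
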
Organisation

This package currently consists of a single module, with no subpackages planned. It depends on NLTK for tokenizers and stopwords; apart from that, it relies only on modules built into Python 3.

Contributing

If you feel like contributing:

1. Check for open issues or open a new issue.
2. Fork the preprocessing repository to start making your changes.
3. Write a test which shows that the bug was fixed or that the feature works as expected.
4. Send a pull request, and remember to add yourself to CONTRIBUTORS.md.

License

This project is licensed under the MIT license (see LICENSE).

Project details

Release history

0.1.13 (this version): Oct 25, 2017
0.1.12: Aug 30, 2017

Download files

Source Distribution

preprocessing-0.1.13.tar.gz (14.8 kB)

Uploaded Oct 25, 2017 Source

Built Distribution

preprocessing-0.1.13-py3-none-any.whl (349.6 kB)

Uploaded Oct 25, 2017 Python 3

File details

Details for the file preprocessing-0.1.13.tar.gz.

File metadata

Download URL: preprocessing-0.1.13.tar.gz
Upload date: Oct 25, 2017
Size: 14.8 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for preprocessing-0.1.13.tar.gz
Algorithm	Hash digest
SHA256	`4c6ef9f4b94bf02664fc4c6bdc3814dfc17a94bbbde002f2a9113c91fdfe7f87`
MD5	`0e1a2b853c7f0e5312cf6c4af3ada664`
BLAKE2b-256	`e3ca102f0cb754c3dfdd095110711faa8566c66fa857fa0ffd2c3040ab2d8a81`
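
A downloaded archive can be checked against the published digest before installing it. The following is a minimal sketch using Python's standard hashlib module; the helper names sha256_of_file and matches_published_hash, and the local file path in the usage comment, are assumptions made for illustration.

```python
import hashlib

# Published SHA256 digest for preprocessing-0.1.13.tar.gz, copied from the table above.
EXPECTED_SHA256 = "4c6ef9f4b94bf02664fc4c6bdc3814dfc17a94bbbde002f2a9113c91fdfe7f87"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()


# Usage (the path is an assumption; point it at wherever the file was saved):
# matches_published_hash("preprocessing-0.1.13.tar.gz")
```

If the comparison returns False, the file differs from the one that was uploaded and should not be installed.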

File details

Details for the file preprocessing-0.1.13-py3-none-any.whl.

File metadata

Download URL: preprocessing-0.1.13-py3-none-any.whl
Upload date: Oct 25, 2017
Size: 349.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No

File hashes

Hashes for preprocessing-0.1.13-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7323b9bd514f676019b3bd5d97360df0cc7262a58fb7eee6e80e87a1894c7f15`
MD5	`4fb36e168ef5d18fdeff3036bf75966a`
BLAKE2b-256	`79f9cadc71dbd774398e486f0608fb6746de36f562edf32fc59ebbe94a589c79`
