
Download and pre-process data for NLP tasks

Project description




Features

  • Handles over 100 datasets
  • Generates a statistics report about the processed dataset
  • Supports many pre-processing methods
  • Provides a panel for entering your parameters at runtime
  • Easy to adapt to your own dataset and pre-processing utility

Online Explorer

https://voidful.github.io/NLPrep-Datasets/

Documentation

Learn more from the docs.

Quick Start

Installing via pip

pip install nlprep

Get one of the datasets

nlprep --dataset clas_udicstm --outdir sentiment
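
The quick-start command above can be wrapped in a small script that checks for the executable before running. This is only a minimal sketch: `NLPREP_RUN` is a hypothetical switch introduced here, not an nlprep feature, and the script prints the command instead of running it unless that switch is set to 1.

```shell
#!/bin/sh
# Values taken from the quick-start command.
dataset="clas_udicstm"
outdir="sentiment"
cmd="nlprep --dataset $dataset --outdir $outdir"

# NLPREP_RUN is a hypothetical switch for this sketch (default: dry run).
# The command runs only when the switch is 1 and nlprep is on PATH.
if [ "${NLPREP_RUN:-0}" = "1" ] && command -v nlprep >/dev/null 2>&1; then
  $cmd
else
  echo "dry run: $cmd"
fi
```

Saved under a name of your choice, the script performs the download when invoked with `NLPREP_RUN=1` set in the environment.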

You can also try nlprep in Google Colab.

Overview

$ nlprep
arguments:
  --dataset     which dataset to use     
  --outdir      processed result output directory       
  
optional arguments:
  -h, --help    show this help message and exit
  --util        data preprocessing utility, multiple utility are supported 
  --cachedir    dir for caching raw dataset
  --infile      local dataset path
  --report      generate a html statistics report
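
The optional arguments can be combined with the two required ones. The sketch below assembles such a command line as a string so it can be inspected before anything is downloaded. `build_nlprep_cmd` is a hypothetical helper, `./cache` is a placeholder path, and the sketch assumes `--report` is a switch that takes no value, which the help text suggests but does not state.

```shell
#!/bin/sh
# build_nlprep_cmd is a hypothetical helper for this sketch, not part of nlprep.
# Usage: build_nlprep_cmd DATASET OUTDIR [CACHEDIR] [report]
build_nlprep_cmd() {
  cmd="nlprep --dataset $1 --outdir $2"
  # Add the cache directory only when one is given.
  if [ -n "${3:-}" ]; then
    cmd="$cmd --cachedir $3"
  fi
  # Assumption: --report is a switch that takes no value.
  if [ "${4:-}" = "report" ]; then
    cmd="$cmd --report"
  fi
  printf '%s\n' "$cmd"
}

# Print the assembled command for the quick-start dataset.
build_nlprep_cmd clas_udicstm sentiment ./cache report
```

The final line prints `nlprep --dataset clas_udicstm --outdir sentiment --cachedir ./cache --report`. Building the string first keeps the sketch safe to run on a machine where nlprep is not installed.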

Contributing

Thanks for your interest. There are many ways to contribute to this project. Get started here.

License

Icons reference

Icons modified from originals by Darius Dan at www.flaticon.com
Icons modified from originals by Freepik at www.flaticon.com

Download files


Source Distribution

nlprep-0.2.1.tar.gz (33.5 kB)

Uploaded Source

Built Distributions


nlprep-0.2.1-py3.7.egg (112.2 kB)

Uploaded Egg

nlprep-0.2.1-py3-none-any.whl (51.5 kB)

Uploaded Python 3

File details

Details for the file nlprep-0.2.1.tar.gz.

File metadata

  • Download URL: nlprep-0.2.1.tar.gz
  • Size: 33.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for nlprep-0.2.1.tar.gz
Algorithm Hash digest
SHA256 331c14feb982fd74925213f71b107d73ec3b8a62e5f97e29a3ae1e0820cd8123
MD5 cbf23ae017f33adccc55a9df45203955
BLAKE2b-256 c5c5d8ce3379cb1f35625c2280486b83d274579c6dc948d0542fc8753d3f00cf

See more details on using hashes here.

File details

Details for the file nlprep-0.2.1-py3.7.egg.

File metadata

  • Download URL: nlprep-0.2.1-py3.7.egg
  • Size: 112.2 kB
  • Tags: Egg
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for nlprep-0.2.1-py3.7.egg
Algorithm Hash digest
SHA256 4ffc137c3a2f6c154448fc894cc3498de2012a1eb9c2a9114693f816af676a22
MD5 2dc3a1f34ec6be007465966c2c731f72
BLAKE2b-256 8c98caf33600ddc7422e7f63aa38d4aeff537326939e9b6a4f6ef792689a0de3


File details

Details for the file nlprep-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: nlprep-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 51.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8

File hashes

Hashes for nlprep-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 07fcb242bd2a95d79b23641e15a13d3839ac3fe8cb1061839de8324ee6bd6792
MD5 c8e0a3d37e016950b721edf5a160956f
BLAKE2b-256 8eb41460e80ffd9e1abc1e5449402b3513f6eff1d40aa651981eaed21616ac11

