Download and pre-process data for NLP tasks

Project description


  • Handles over 100 datasets
  • Generates a statistics report for the processed dataset
  • Supports many pre-processing methods
  • Provides a panel for entering your parameters at runtime
  • Easy to adapt to your own datasets and pre-processing utilities

Online Explorer


Learn more from the docs.

Quick Start

Installing via pip

pip install nlprep

Get one of the datasets

nlprep --dataset clas_udicstm --outdir sentiment

You can also try nlprep in Google Colab.
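If you prefer to drive the quick-start command from a script, a minimal Python sketch follows. The helper name `build_nlprep_command` is hypothetical (it is not part of nlprep); it only assembles the same command line shown above, using the `--dataset` and `--outdir` options. The actual invocation is left commented out because it needs nlprep installed and, presumably, network access to fetch the dataset.

```python
import shlex


def build_nlprep_command(dataset, outdir):
    """Return the argument list for the quick-start command.

    Hypothetical helper for illustration; it only mirrors the
    documented --dataset and --outdir options.
    """
    return ["nlprep", "--dataset", dataset, "--outdir", outdir]


cmd = build_nlprep_command("clas_udicstm", "sentiment")

# Show the command exactly as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it (requires nlprep to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if the output directory contains spaces.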


$ nlprep
  --dataset     which dataset to use
  --outdir      processed result output directory
optional arguments:
  -h, --help    show this help message and exit
  --util        data pre-processing utility, multiple utilities are supported
  --cachedir    directory for caching the raw dataset
  --infile      local dataset path
  --report      generate an HTML statistics report
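To make the option list above concrete, here is a rough sketch of how a command-line interface with these options could be declared using Python's `argparse`. This is an illustration, not nlprep's actual source. Two details are assumptions that the help text does not confirm: that `--util` accepts several space-separated values, and that `--report` is a simple on/off switch.

```python
import argparse


def make_parser():
    """Build a parser mirroring the documented nlprep options (illustrative)."""
    parser = argparse.ArgumentParser(prog="nlprep")
    parser.add_argument("--dataset", required=True,
                        help="which dataset to use")
    parser.add_argument("--outdir", required=True,
                        help="processed result output directory")
    # Assumption: several utilities can be passed after a single --util.
    parser.add_argument("--util", nargs="+", default=[],
                        help="data pre-processing utility")
    parser.add_argument("--cachedir",
                        help="directory for caching the raw dataset")
    parser.add_argument("--infile",
                        help="local dataset path")
    # Assumption: --report is a boolean switch with no value.
    parser.add_argument("--report", action="store_true",
                        help="generate an HTML statistics report")
    return parser


# Parse the quick-start arguments, with the report switch turned on.
args = make_parser().parse_args(
    ["--dataset", "clas_udicstm", "--outdir", "sentiment", "--report"]
)
print(args.dataset, args.outdir, args.report)
```

Under these assumptions, omitting `--dataset` or `--outdir` makes the parser exit with a usage error, while the remaining options fall back to their defaults.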


Thanks for your interest. There are many ways to contribute to this project. Get started here.

License

Icons reference

Icons modified from Darius Dan
Icons modified from Freepik

Download files

Source Distribution

nlprep-0.2.1.tar.gz (33.5 kB)

Uploaded Source

Built Distributions

nlprep-0.2.1-py3.7.egg (112.2 kB)

Uploaded Source

nlprep-0.2.1-py3-none-any.whl (51.5 kB)

Uploaded Python 3
