Download and pre-process data for NLP tasks

Features

  • Handles over 100 datasets
  • Generates a statistics report for each processed dataset
  • Supports many pre-processing methods
  • Provides a panel for entering your parameters at runtime
  • Easy to adapt to your own dataset and pre-processing utility

Online Explorer

https://voidful.github.io/NLPrep-Datasets/

Documentation

Learn more from the docs.

Quick Start

Installing via pip

pip install nlprep

Get one of the datasets

nlprep --dataset clas_udicstm --outdir sentiment

You can also try nlprep in Google Colab.

Overview

$ nlprep
arguments:
  --dataset     which dataset to use     
  --outdir      processed result output directory       

optional arguments:
  -h, --help    show this help message and exit
  --util        data preprocessing utility; multiple utilities are supported
  --cachedir    dir for caching raw dataset
  --infile      local dataset path
  --report      generate an HTML statistics report
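
If you want to drive the command line from a script, the flags listed above can be assembled programmatically. The following is a minimal sketch, not part of nlprep itself: `build_nlprep_command` and `run_nlprep` are hypothetical helper names, the `nlprep_cache` directory is an arbitrary example, and it assumes `--report` is a boolean switch that takes no value. `--util` is left out because the names of the available utilities are not listed here.

```python
import shutil
import subprocess
from typing import List, Optional


def build_nlprep_command(
    dataset: str,
    outdir: str,
    cachedir: Optional[str] = None,
    infile: Optional[str] = None,
    report: bool = False,
) -> List[str]:
    """Assemble an nlprep argument list from the flags in the usage text."""
    # --dataset and --outdir are the two required arguments.
    cmd = ["nlprep", "--dataset", dataset, "--outdir", outdir]
    if cachedir is not None:
        cmd += ["--cachedir", cachedir]
    if infile is not None:
        cmd += ["--infile", infile]
    if report:
        # Assumption: --report is a switch and takes no value.
        cmd.append("--report")
    return cmd


def run_nlprep(cmd: List[str]) -> None:
    """Run the assembled command, failing early if nlprep is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("nlprep not found on PATH; try `pip install nlprep`")
    subprocess.run(cmd, check=True)


cmd = build_nlprep_command(
    "clas_udicstm", "sentiment", cachedir="nlprep_cache", report=True
)
# Prints: nlprep --dataset clas_udicstm --outdir sentiment --cachedir nlprep_cache --report
print(" ".join(cmd))
```

Calling `run_nlprep(cmd)` would then execute the command. It is kept separate from building the argument list so the command can be inspected or logged before anything is downloaded.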

Contributing

Thanks for your interest. There are many ways to contribute to this project. Get started here.

License

Icons reference

Icons modified from Darius Dan at www.flaticon.com
Icons modified from Freepik at www.flaticon.com
