Pre-processing of NLP training corpora

Protogenie

How to cite

@software{thibault_clerice_2020_3883586,
  author       = {Thibault Clérice},
  title        = {Protogenie, post-processing for NLP dataset},
  month        = jun,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3883585},
  url          = {https://doi.org/10.5281/zenodo.3883585}
}

Install from release

pip install protogenie

Install unstable

pip install --upgrade https://github.com/hipster-philology/protogenie/archive/master.zip

Install from source

Start by cloning the repository and moving into the created folder:

git clone https://github.com/hipster-philology/protogenie.git
cd protogenie/

Create a virtual environment, activate it, and run:

pip install -r requirements.txt

Configuration file

To write a configuration file, have a look at the examples in ./tests/test_config. More generally, you can and should use the schema: ./ppa_splitter/schema.rng
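As an illustration of using the schema, here is a minimal sketch that checks a configuration file against a RelaxNG schema. It assumes the configuration is an XML file and that the third-party lxml package is installed; the function name `validate_config` is hypothetical and is not part of Protogenie.

```python
from lxml import etree


def validate_config(config_path, schema_path):
    """Return a list of validation messages; an empty list means the file is valid.

    Hypothetical helper, not part of Protogenie: it only wraps lxml's
    RelaxNG validator.
    """
    # Load the RelaxNG schema (itself an XML document) and compile it.
    schema = etree.RelaxNG(etree.parse(schema_path))
    # Parse the configuration file to validate.
    config = etree.parse(config_path)
    # validate() returns True or False; details are collected in error_log.
    if schema.validate(config):
        return []
    errors = [f"line {entry.line}: {entry.message}" for entry in schema.error_log]
    # Fall back to a generic message if the validator reported no details.
    return errors or ["document does not match the schema"]
```

For instance, `validate_config("my_config.xml", "./ppa_splitter/schema.rng")` would return an empty list for a valid file, where `my_config.xml` stands for your own configuration file.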

Workflow

What's the workflow?
