Pre-processing of NLP training corpora

Protogenie

How to cite

@software{thibault_clerice_2020_3883586,
  author       = {Thibault Clérice},
  title        = {Protogenie, post-processing for NLP dataset},
  month        = jun,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3883585},
  url          = {https://doi.org/10.5281/zenodo.3883585}
}

Install from release

pip install protogenie

Install unstable

pip install --upgrade https://github.com/hipster-philology/protogenie/archive/master.zip

Install from source

Start by cloning the repository and moving into the created folder:

git clone https://github.com/hipster-philology/protogenie.git
cd protogenie/

Create a virtual environment, activate it, and run:

pip install -r requirements.txt
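The virtual environment step is not spelled out above. A minimal sketch, assuming Python 3 with its built-in venv module and a POSIX shell; the directory name env is an arbitrary choice, not something the project requires:

```shell
# Create a virtual environment in ./env (the directory name is an assumption)
python3 -m venv env
# Activate it for the current shell session
. env/bin/activate
```

Run these two commands before installing the requirements, so the dependencies land inside the environment rather than in the system Python.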

Configuration file

To configure Protogenie, have a look at the examples in ./tests/test_config. More generally, you can and should use the schema: ./ppa_splitter/schema.rng
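One possible way to use the schema is to validate a configuration file against it before running anything. The sketch below assumes the configuration is an XML file (the .rng extension indicates a RELAX NG schema), that xmllint from libxml2 is installed, and that config.xml is a hypothetical name standing in for your own file:

```shell
# Path to the schema, as given above
SCHEMA="./ppa_splitter/schema.rng"
# Hypothetical configuration file name; replace with your own
CONFIG="${CONFIG:-config.xml}"

if command -v xmllint >/dev/null 2>&1 && [ -f "$SCHEMA" ] && [ -f "$CONFIG" ]; then
    # --noout suppresses the document dump; only the validation result is printed
    xmllint --noout --relaxng "$SCHEMA" "$CONFIG"
else
    echo "xmllint, schema, or configuration file not found; skipping validation"
fi
```

Run it from the root of the cloned repository so the relative schema path resolves.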

Workflow

What's the workflow?
