
Pre-processing of NLP training corpora


Protogenie


How to cite

@software{thibault_clerice_2020_3883586,
  author       = {Thibault Clérice},
  title        = {Protogenie, post-processing for NLP dataset},
  month        = jun,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3883585},
  url          = {https://doi.org/10.5281/zenodo.3883585}
}

Install from release

pip install protogenie

Install unstable

pip install --upgrade https://github.com/hipster-philology/protogenie/archive/master.zip

Install from source

Start by cloning the repository and moving into the created folder:

git clone https://github.com/hipster-philology/protogenie.git
cd protogenie/

Create a virtual environment, activate it, and run:

pip install -r requirements.txt
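
The virtual environment step is not spelled out above. As a minimal sketch, assuming Python 3 with the standard venv module and an environment directory named env (the name is arbitrary and not prescribed by the project), the environment can be created and activated before running the pip command:

```shell
# Create a virtual environment in ./env (directory name is an assumption)
python3 -m venv env

# Activate it in the current shell; "." is the POSIX equivalent of "source"
. env/bin/activate
```

Once activated, pip installs the requirements into env rather than into the system Python. Run deactivate to leave the environment.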

Configuration file

To configure Protogenie, have a look at the examples in ./tests/test_config. More generally, you can and should use the schema: ./ppa_splitter/schema.rng

Workflow

What's the workflow?
