Python wrapper for the CETEMPublico corpus

cetem-publico is a Python wrapper for the CETEMPublico corpus. It takes care of downloading, storing and importing the corpus into NLTK.

THIS IS STILL A WORK IN PROGRESS; THE API MIGHT BREAK WITHOUT WARNING.

Installing

Install and update using pip:

pip install [--user] cetem-publico

A Simple Example

import CETEMPublico

cp = CETEMPublico.load() # loads a small 10KB sample
# or
cp = CETEMPublico.load(full=True) # loads the full 12GB

print(cp.tagged_sents())
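
Once the corpus is loaded, the tagged sentences can be processed like any other NLTK-style tagged data. The sketch below assumes that `cp.tagged_sents()` yields sentences as lists of `(word, tag)` pairs, as NLTK corpus readers usually do; this README does not confirm the exact shape. The `tag_counts` helper and the `sample` data are hypothetical, used here so the snippet runs without downloading anything, and the tags shown are illustrative rather than the corpus's real tagset.

```python
from collections import Counter


def tag_counts(tagged_sents):
    """Count how often each tag occurs across all sentences.

    Assumes each sentence is an iterable of (word, tag) pairs.
    """
    return Counter(tag for sent in tagged_sents for _word, tag in sent)


# Hypothetical stand-in for cp.tagged_sents(); the tags are made up
# for illustration and are not taken from CETEMPublico.
sample = [
    [("O", "DET"), ("jornal", "N"), ("chegou", "V")],
    [("A", "DET"), ("equipa", "N"), ("trabalha", "V"), ("bem", "ADV")],
]

counts = tag_counts(sample)
print(counts.most_common(2))
```

If the assumption about the return shape holds, `tag_counts(cp.tagged_sents())` would give the same kind of summary for the real corpus. With `full=True` that pass covers the whole 12GB corpus, so trying it on the small sample first is probably wise.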

Acknowledgements

This module only exists thanks to the Publico newspaper and the team responsible for the CETEMPublico corpus.

Bugs and stuff

Open a GitHub issue or, preferably, send me a pull request.

License

MIT
