
A corpus mini-framework that allows memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

Project description

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are highly flexible: slicing accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Internally, each subcorpus maintains a parent-child relationship with the original corpus in a tree.
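The slicing model described above can be illustrated with a minimal, self-contained sketch. This is not the atap_corpus API: the class SketchCorpus and the methods slice_by, slice_contains and root are hypothetical names invented for illustration. The sketch assumes one possible way to keep slicing cheap on memory, namely that every subcorpus shares the root's document list and stores only the indices it keeps.

```python
from __future__ import annotations

from typing import Callable, List, Optional


class SketchCorpus:
    """Hypothetical stand-in for a corpus; NOT the atap_corpus API."""

    def __init__(
        self,
        docs: List[str],
        parent: Optional[SketchCorpus] = None,
        indices: Optional[List[int]] = None,
    ) -> None:
        self._docs = docs  # shared with the root corpus, never copied
        self._indices = list(range(len(docs))) if indices is None else indices
        self.parent = parent  # None for the root corpus
        self.children: List[SketchCorpus] = []

    def __len__(self) -> int:
        return len(self._indices)

    def docs(self) -> List[str]:
        """Materialise the documents visible in this (sub)corpus."""
        return [self._docs[i] for i in self._indices]

    def slice_by(self, predicate: Callable[[str], bool]) -> SketchCorpus:
        """General slicing: keep the documents for which predicate is true."""
        kept = [i for i in self._indices if predicate(self._docs[i])]
        child = SketchCorpus(self._docs, parent=self, indices=kept)
        self.children.append(child)  # record the parent-child link
        return child

    def slice_contains(self, term: str) -> SketchCorpus:
        """Convenience slicer layered on top of slice_by."""
        return self.slice_by(lambda doc: term in doc)

    def root(self) -> SketchCorpus:
        """Walk up the tree to the original corpus."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


corpus = SketchCorpus(["the cat sat", "a dog barked", "the dog sat"])
dogs = corpus.slice_contains("dog")
sitting_dogs = dogs.slice_by(lambda doc: doc.endswith("sat"))
```

Here sitting_dogs is a child of dogs, which is a child of corpus, and each slice is again a SketchCorpus, mirroring the idea that slicing always returns a corpus instance.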

A Corpus can also be serialised and deserialised, which allows it to be carried across different ATAP analytics notebooks.
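As a rough illustration of that round trip, the sketch below writes a corpus-like object to a file and reads it back, as one notebook might hand data to another. Everything in it is an assumption: SketchCorpus, to_dict, from_dict, save and load are hypothetical names, and JSON is chosen only for convenience. The source does not state which format or methods atap_corpus itself uses.

```python
import json
import tempfile
from pathlib import Path


class SketchCorpus:
    """Hypothetical stand-in for a corpus; NOT the atap_corpus API."""

    def __init__(self, name, docs):
        self.name = name
        self.docs = list(docs)

    def to_dict(self):
        # Reduce the corpus to plain, JSON-friendly data.
        return {"name": self.name, "docs": self.docs}

    @classmethod
    def from_dict(cls, data):
        return cls(data["name"], data["docs"])


def save(corpus, path):
    """Serialise the corpus to a file (format is an assumption: JSON)."""
    path.write_text(json.dumps(corpus.to_dict()), encoding="utf-8")


def load(path):
    """Deserialise a corpus previously written by save()."""
    return SketchCorpus.from_dict(json.loads(path.read_text(encoding="utf-8")))


# "Notebook A" saves the corpus; "notebook B" loads it from the same path.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "corpus.json"
    save(SketchCorpus("demo", ["the cat sat", "a dog barked"]), path)
    restored = load(path)
```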

pip install atap_corpus

Tests

To run all the unit tests, execute the following script:

./scripts/run_tests.sh

This repo originated from Juxtorpus as a decoupling effort. The Juxtorpus repo may be accessed here.

Download files


Source Distribution

atap_corpus-0.1.4.tar.gz (24.3 kB view details)

Uploaded Source

Built Distribution

atap_corpus-0.1.4-py3-none-any.whl (29.3 kB view details)

Uploaded Python 3

File details

Details for the file atap_corpus-0.1.4.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.4.tar.gz
  • Size: 24.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.4.tar.gz
Algorithm Hash digest
SHA256 1361548c3fcafd5b9da721c998d19d70a190673b660911662e66f2b2a794ee3e
MD5 ee6fa0afd1c44d8e7f9fac8663b668bd
BLAKE2b-256 392089666e0ac9a699c27fc00c01c082dfe0447e8a29781d188e77cf1e67728c


File details

Details for the file atap_corpus-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.4-py3-none-any.whl
  • Size: 29.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 4b2b75300ccad1675f1206eebcc59c5ce90039016bc8d7e2190a93e81fa5ed4b
MD5 3f648417d8087642d250a4a4a27a58ca
BLAKE2b-256 559dbb645b05305cdf2b94c41c83aa541b4c4bb3326ee42668b05fae3f12a50a

