Corpus mini-framework that allows memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are flexible: slicing accepts a user-defined function and comes with convenience slicing operations layered on top of it out of the box. Internally, each subcorpus maintains a parent-child relationship with its original corpus in a tree.
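
The slicing model described above can be illustrated with a small, self-contained sketch. This is not the atap_corpus API: the class name SketchCorpus and the methods slice_by, slice_by_keyword and root are hypothetical names chosen for illustration only. It assumes that a subcorpus stores just the indices of the documents it keeps (which is one way to make slicing memory-efficient), that convenience slicers are thin wrappers over a user-defined predicate, and that each slice records its parent so the corpora form a tree.

```python
from __future__ import annotations

from typing import Callable, Iterator, Optional


class SketchCorpus:
    """Hypothetical stand-in for a BaseCorpus subclass (NOT the atap_corpus API)."""

    def __init__(
        self,
        docs: list[str],
        indices: Optional[list[int]] = None,
        parent: Optional[SketchCorpus] = None,
    ) -> None:
        self._docs = docs  # shared with the parent, never copied
        self._indices = list(range(len(docs))) if indices is None else indices
        self.parent = parent
        self.children: list[SketchCorpus] = []

    def __len__(self) -> int:
        return len(self._indices)

    def __iter__(self) -> Iterator[str]:
        return (self._docs[i] for i in self._indices)

    def slice_by(self, predicate: Callable[[str], bool]) -> SketchCorpus:
        """Slice with a user-defined function; returns the same subclass."""
        kept = [i for i in self._indices if predicate(self._docs[i])]
        child = type(self)(self._docs, kept, parent=self)
        self.children.append(child)  # record the parent-child edge
        return child

    def slice_by_keyword(self, keyword: str) -> SketchCorpus:
        """Convenience slicer layered on top of slice_by."""
        return self.slice_by(lambda doc: keyword.lower() in doc.lower())

    @property
    def root(self) -> SketchCorpus:
        """Walk up the tree to the original corpus."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


corpus = SketchCorpus(["The cat sat.", "A dog barked.", "The cat and the dog slept."])
cats = corpus.slice_by_keyword("cat")
cats_and_dogs = cats.slice_by(lambda doc: "dog" in doc)
```

In this sketch, cats_and_dogs holds a single index into the shared document list, and its parent chain leads back through cats to corpus. The real library's slicing operations and class names may differ.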

A Corpus can also be serialised and deserialised, which allows it to be carried across different ATAP analytics notebooks.
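
As a rough illustration of the round trip this implies, the sketch below writes a corpus to a file in one place and reads it back in another. It does not use the atap_corpus API: the functions serialise and deserialise, the JSON layout and the file name corpus.json are all assumptions made for this example, and the library's actual on-disk format is not described here.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def serialise(name: str, docs: list[str], path: Path) -> None:
    """Hypothetical helper: write a corpus name and its documents to a JSON file."""
    payload = {"name": name, "docs": docs}
    path.write_text(json.dumps(payload), encoding="utf-8")


def deserialise(path: Path) -> tuple[str, list[str]]:
    """Hypothetical helper: read back what serialise wrote."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["name"], payload["docs"]


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "corpus.json"
    # "Notebook A" saves the corpus ...
    serialise("pets", ["The cat sat.", "A dog barked."], target)
    # ... and "notebook B" loads it from the same file.
    name, docs = deserialise(target)
```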

pip install atap_corpus

Tests

To run all the unit tests, execute the provided script:

./scripts/run_tests.sh

This repo originated from Juxtorpus as part of a decoupling effort. The Juxtorpus repo may be accessed here.


Source Distribution

atap_corpus-0.1.11.tar.gz (25.6 kB)

Built Distribution

atap_corpus-0.1.11-py3-none-any.whl (30.7 kB)

File details

Details for the file atap_corpus-0.1.11.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.11.tar.gz
  • Upload date:
  • Size: 25.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.11.tar.gz
Algorithm Hash digest
SHA256 604036444ae3a9bca6a8beae5c478bd2623d440a9c609c5483c8ace3d1870233
MD5 cfdc9e0fe3646be9621a70183a0139f8
BLAKE2b-256 986dab6b7cc7c9960f6bddcf427deeb246057f633ff685a7951bd53ee9f57f0e

File details

Details for the file atap_corpus-0.1.11-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.11-py3-none-any.whl
  • Upload date:
  • Size: 30.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.11-py3-none-any.whl
Algorithm Hash digest
SHA256 6d839fd8b7733ec6ed67963fb39f4e2668805b26dea1e716e5c6064208378745
MD5 672051e930b14a35edd1d4679f8fb98a
BLAKE2b-256 42fc60b1db42fa94c138fabc63749e190b96e467b6adb9fd348a9ae8528f5dc7
