A corpus mini-framework that allows memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

Project description

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are flexible: slicing accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Each subcorpus maintains a parent-child relationship with its original corpus, stored internally as a tree.
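
The slicing behaviour described above can be sketched in a few lines. This is a self-contained illustration only and does not use the atap_corpus API: the class name ToyCorpus and the methods slice, slice_contains and root are hypothetical names chosen for this sketch. It assumes a subcorpus can be represented as a list of indices into the root's documents, which is one way to keep slicing memory-efficient.

```python
from __future__ import annotations

from typing import Callable, Iterator, Optional


class ToyCorpus:
    """Hypothetical stand-in for a corpus; NOT the atap_corpus BaseCorpus."""

    def __init__(self, docs: list[str],
                 parent: Optional[ToyCorpus] = None,
                 indices: Optional[list[int]] = None) -> None:
        self._docs = docs          # shared with the root corpus, never copied
        self._parent = parent      # None for the root corpus
        self._indices = list(range(len(docs))) if indices is None else indices
        self._children: list[ToyCorpus] = []

    def __len__(self) -> int:
        return len(self._indices)

    def __iter__(self) -> Iterator[str]:
        return (self._docs[i] for i in self._indices)

    @property
    def parent(self) -> Optional[ToyCorpus]:
        return self._parent

    @property
    def children(self) -> tuple[ToyCorpus, ...]:
        return tuple(self._children)

    def root(self) -> ToyCorpus:
        """Walk up the parent links to the original corpus."""
        node = self
        while node._parent is not None:
            node = node._parent
        return node

    def slice(self, predicate: Callable[[str], bool]) -> ToyCorpus:
        """Slice with a user-defined function; only indices are stored."""
        kept = [i for i in self._indices if predicate(self._docs[i])]
        # type(self) means a subclass slices into its own type.
        child = type(self)(self._docs, parent=self, indices=kept)
        self._children.append(child)
        return child

    def slice_contains(self, term: str) -> ToyCorpus:
        """Convenience operation layered on top of slice()."""
        return self.slice(lambda doc: term.lower() in doc.lower())


corpus = ToyCorpus(["Cats purr.", "Dogs bark.", "Cats and dogs play."])
cats = corpus.slice_contains("cats")
short = cats.slice(lambda doc: len(doc.split()) <= 2)
```

Here short is a child of cats, which is a child of corpus, and all three share one underlying document list. The real library may store and expose the tree differently.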

A Corpus can also be serialised and deserialised, so it can be carried across different ATAP analytics notebooks.
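
The round trip between notebooks can be sketched as follows. This is an illustration of the idea only: ToyCorpus, serialise and deserialise are hypothetical names, and JSON is an assumed on-disk format chosen for the sketch, not necessarily what atap_corpus uses.

```python
import json
import tempfile
from pathlib import Path


class ToyCorpus:
    """Hypothetical stand-in for a corpus; NOT the atap_corpus API."""

    def __init__(self, name: str, docs: list) -> None:
        self.name = name
        self.docs = list(docs)

    def serialise(self, path: Path) -> None:
        # Write the corpus to disk (this would happen in the first notebook).
        payload = {"name": self.name, "docs": self.docs}
        path.write_text(json.dumps(payload), encoding="utf-8")

    @classmethod
    def deserialise(cls, path: Path) -> "ToyCorpus":
        # Rebuild the corpus from disk (this would happen in the second notebook).
        payload = json.loads(path.read_text(encoding="utf-8"))
        return cls(payload["name"], payload["docs"])


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "corpus.json"
    original = ToyCorpus("demo", ["Cats purr.", "Dogs bark."])
    original.serialise(path)
    restored = ToyCorpus.deserialise(path)
```

The restored object is a new instance with the same name and documents as the original.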

pip install atap_corpus

Tests

To run all the unit tests, execute the provided script:

./scripts/run_tests.sh

This repo originated from Juxtorpus as a decoupling effort. The Juxtorpus repo may be accessed here.

Download files

Source Distribution

atap_corpus-0.1.13.tar.gz (25.7 kB)

Uploaded Source

Built Distribution

atap_corpus-0.1.13-py3-none-any.whl (30.7 kB)

Uploaded Python 3

File details

Details for the file atap_corpus-0.1.13.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.13.tar.gz
  • Upload date:
  • Size: 25.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.13.tar.gz
Algorithm    Hash digest
SHA256       3710e18ef6540bfcb29bd7a7f6b685de96c388e43fe92a643e6cd299d52f9e87
MD5          9edda44fc5f090f5bcea4df0c5ef60c5
BLAKE2b-256  9f34930d38793cdd96cff1d546f6fca109be6bf2e32a234449dc61c168b1d570

File details

Details for the file atap_corpus-0.1.13-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.13-py3-none-any.whl
  • Upload date:
  • Size: 30.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.13-py3-none-any.whl
Algorithm    Hash digest
SHA256       cb1308bcbfe6ab9a999e0e6f734bc6f1215a9dba5119913275335c4c295f5bfc
MD5          ce4ccd0229430c355f62c4f6985e6e7a
BLAKE2b-256  8d5f835a318329e56d2ae9474c85c589fc10cb86ad18f2aad1c7e24c12243c6a
