A corpus mini-framework that allows memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are flexible: slicing accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Each subcorpus maintains a parent-child relationship with the original corpus, tracked internally as a tree.
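To make the slicing idea concrete, here is a minimal, self-contained sketch. It does not use atap_corpus and is not its API: the ToyCorpus class and the names slice_by, slice_by_keyword and root are hypothetical, invented for illustration. It assumes one plausible way to get memory-efficient slicing, where each subcorpus stores only indices into a shared document list and keeps a link to its parent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


# Hypothetical illustration only; this is NOT the atap_corpus API.
# eq=False keeps identity-based equality, which suits nodes in a tree.
@dataclass(eq=False)
class ToyCorpus:
    docs: List[str]                      # shared by reference, never copied
    indices: List[int]                   # positions in `docs` this corpus can see
    parent: Optional["ToyCorpus"] = None
    children: List["ToyCorpus"] = field(default_factory=list)

    @classmethod
    def from_docs(cls, docs: List[str]) -> "ToyCorpus":
        return cls(docs=docs, indices=list(range(len(docs))))

    def __len__(self) -> int:
        return len(self.indices)

    def texts(self) -> List[str]:
        return [self.docs[i] for i in self.indices]

    def slice_by(self, predicate: Callable[[str], bool]) -> "ToyCorpus":
        """Generic slicing: keep the documents for which `predicate` is true."""
        kept = [i for i in self.indices if predicate(self.docs[i])]
        # type(self) means a subclass slices into an instance of itself.
        child = type(self)(docs=self.docs, indices=kept, parent=self)
        self.children.append(child)      # record the parent-child link
        return child

    def slice_by_keyword(self, keyword: str) -> "ToyCorpus":
        """Convenience operation layered on top of slice_by."""
        return self.slice_by(lambda text: keyword.lower() in text.lower())

    def root(self) -> "ToyCorpus":
        """Walk up the tree to the original corpus."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


corpus = ToyCorpus.from_docs(["Cats purr.", "Dogs bark.", "Cats and dogs play."])
cats = corpus.slice_by_keyword("cats")               # convenience slice
short = cats.slice_by(lambda text: len(text) < 12)   # user-defined function
```

In this sketch, short.root() returns corpus, and every slice shares the same docs list, so slicing repeatedly costs only a list of integers per subcorpus.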

A Corpus can also be serialised and deserialised, which allows it to be carried across different ATAP analytics notebooks.
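The round trip between notebooks can be pictured with the sketch below. It is an assumption-laden illustration rather than the library's own mechanism: TinyCorpus, serialise and deserialise are hypothetical names, and JSON is used here only because it is simple and portable. The actual on-disk format used by atap_corpus is not stated in this description.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import List


# Hypothetical stand-in for a corpus; NOT the atap_corpus API.
@dataclass
class TinyCorpus:
    name: str
    docs: List[str]


def serialise(corpus: TinyCorpus, path: Path) -> Path:
    """Write the corpus to disk (what the first notebook would do)."""
    path.write_text(json.dumps(asdict(corpus)), encoding="utf-8")
    return path


def deserialise(path: Path) -> TinyCorpus:
    """Rebuild the corpus from disk (what the second notebook would do)."""
    return TinyCorpus(**json.loads(path.read_text(encoding="utf-8")))


original = TinyCorpus(name="demo", docs=["Cats purr.", "Dogs bark."])
with tempfile.TemporaryDirectory() as tmp:
    saved_path = serialise(original, Path(tmp) / "demo_corpus.json")
    restored = deserialise(saved_path)
```

The restored object compares equal to the original but is a separate instance, which is the property that lets a later notebook pick up where an earlier one left off.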

pip install atap_corpus

Tests

To run all the unit tests, execute the following script:

./scripts/run_tests.sh

This repo originated from Juxtorpus as part of a decoupling effort. The Juxtorpus repo may be accessed here.


Source Distribution

atap_corpus-0.1.15.tar.gz (25.5 kB)

Built Distribution

atap_corpus-0.1.15-py3-none-any.whl (30.8 kB)

File details

Details for the file atap_corpus-0.1.15.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.15.tar.gz
  • Size: 25.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.10.10 Darwin/23.6.0

File hashes

Hashes for atap_corpus-0.1.15.tar.gz
Algorithm    Hash digest
SHA256       cd7a15177c4636899f90fed708de7d5280a89839258101b2ab3954e03580015e
MD5          570610343c3a54bdd09a2710eedf760d
BLAKE2b-256  3d90c14bd27250e7c9c44e06ac0083c4096310f2d7c63fb5f2364fa262ff41e4
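A published digest such as the SHA256 value above can be compared against a downloaded file. The sketch below uses only Python's standard hashlib module. The helper names sha256_of and matches_published_digest are made up for this example, and the local file path in the commented usage line is an assumption about where the archive was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for atap_corpus-0.1.15.tar.gz.
EXPECTED_SHA256 = "cd7a15177c4636899f90fed708de7d5280a89839258101b2ab3954e03580015e"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Return True when the file's SHA256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the archive was downloaded to the current directory:
# matches_published_digest(Path("atap_corpus-0.1.15.tar.gz"), EXPECTED_SHA256)
```

A mismatch means the local file differs from the one that was uploaded, so it should not be installed.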

File details

Details for the file atap_corpus-0.1.15-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.15-py3-none-any.whl
  • Size: 30.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.10.10 Darwin/23.6.0

File hashes

Hashes for atap_corpus-0.1.15-py3-none-any.whl
Algorithm    Hash digest
SHA256       dc7a06e315dbe7ef39992bc8d310874a175c7860e9e9b9dd184cc353d84a89d8
MD5          508cf0a6b6758ceea19b619e9e10c5ca
BLAKE2b-256  6790baa6c9391f44db3a23a0f66be2942b4415d82ca43ce4ec51d0e702105791
