A corpus mini-framework that allows memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

Project description

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are flexible: slicing accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Internally, each subcorpus maintains a parent-child relationship with its original corpus in a tree.
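The slicing idea above can be illustrated with a small, self-contained sketch. This is not the atap_corpus API: the class MiniCorpus and its methods slice_by, slice_contains and root are hypothetical names invented for this example, and sharing one document list while storing only indices is just one assumed way to keep slices memory-efficient.

```python
from __future__ import annotations

from typing import Callable, Optional


class MiniCorpus:
    """Toy stand-in for a corpus (hypothetical, NOT the atap_corpus API)."""

    def __init__(self, docs: list[str], parent: Optional[MiniCorpus] = None,
                 indices: Optional[list[int]] = None) -> None:
        self._docs = docs  # shared with the root corpus, never copied
        self._indices = indices if indices is not None else list(range(len(docs)))
        self.parent = parent                  # link to the corpus this was sliced from
        self.children: list[MiniCorpus] = []  # subcorpora sliced from this one

    def __len__(self) -> int:
        return len(self._indices)

    def docs(self) -> list[str]:
        return [self._docs[i] for i in self._indices]

    def slice_by(self, predicate: Callable[[str], bool]) -> MiniCorpus:
        """Slice with a user-defined function; returns the same (sub)class."""
        kept = [i for i in self._indices if predicate(self._docs[i])]
        child = type(self)(self._docs, parent=self, indices=kept)
        self.children.append(child)
        return child

    def slice_contains(self, term: str) -> MiniCorpus:
        """Convenience slicer layered on top of slice_by."""
        return self.slice_by(lambda doc: term.lower() in doc.lower())

    @property
    def root(self) -> MiniCorpus:
        node = self
        while node.parent is not None:
            node = node.parent
        return node


root = MiniCorpus(["The cat sat.", "A dog barked.", "The cat and the dog slept."])
cats = root.slice_contains("cat")                        # convenience slice
cats_and_dogs = cats.slice_by(lambda doc: "dog" in doc)  # user-defined function
```

In this sketch every slice is itself a MiniCorpus, so slices can be sliced again, and walking the parent links from any subcorpus leads back to the original corpus.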

A Corpus can also be serialised and deserialised, which allows it to be carried across different ATAP analytics notebooks.
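As a rough illustration of carrying a corpus between notebooks, the sketch below writes a toy object to disk and reads it back. The TinyCorpus class, its serialise and deserialise methods, and the JSON file format are all assumptions made for this example. The description above does not say which format or method names atap_corpus actually uses.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class TinyCorpus:
    """Hypothetical toy corpus, NOT the atap_corpus API."""
    name: str
    docs: list[str]

    def serialise(self, path: Path) -> None:
        # Write the corpus to disk as JSON (format chosen for this sketch only).
        path.write_text(json.dumps(asdict(self)), encoding="utf-8")

    @classmethod
    def deserialise(cls, path: Path) -> TinyCorpus:
        # Rebuild an equal corpus from the file written by serialise().
        return cls(**json.loads(path.read_text(encoding="utf-8")))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "corpus.json"
    original = TinyCorpus("demo", ["first document", "second document"])
    original.serialise(path)                 # e.g. at the end of one notebook
    restored = TinyCorpus.deserialise(path)  # e.g. at the start of another
```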

pip install atap_corpus

Tests

To run all the unit tests, execute the provided script:

./scripts/run_tests.sh

This repository originated from Juxtorpus as a decoupling effort.

Download files

Source Distribution

atap_corpus-0.1.7.tar.gz (25.4 kB view details)

Uploaded Source

Built Distribution

atap_corpus-0.1.7-py3-none-any.whl (30.5 kB view details)

Uploaded Python 3

File details

Details for the file atap_corpus-0.1.7.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.7.tar.gz
  • Upload date:
  • Size: 25.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.7.tar.gz

Algorithm    Hash digest
SHA256       5fb555c0ed58816f7bc41e23d16d60d99cb3c5f3d62eaf3396086247e803b6ea
MD5          5a90c5d1d7ab26954e4d65c97535e065
BLAKE2b-256  cf3a65c196149070ed64d6356d214956f7439679cf188c10da4ab4b6c4a5ca74
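
A downloaded file can be checked against its listed SHA256 digest with the Python standard library. The following is a minimal sketch. The helper names sha256_of and matches_published are made up for this example, and the file is assumed to have already been downloaded to a local path.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published(Path("atap_corpus-0.1.7.tar.gz"), "5fb555c0ed58816f7bc41e23d16d60d99cb3c5f3d62eaf3396086247e803b6ea") should return True for an intact copy of the source distribution saved under that name in the current directory.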

File details

Details for the file atap_corpus-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 30.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.7-py3-none-any.whl

Algorithm    Hash digest
SHA256       9690a40d1cbff1f1c219d4eaed6a234a73d3e470952d60228d6829c567d0c2b6
MD5          70066d3e628365c8d2fdcaf39bde9f9c
BLAKE2b-256  8c4b112af3a87096bb312362ed32a3f4e0a1c907180962ac2f35c94aa5ee9895
