A corpus mini-framework that supports memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.

Project description

ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criteria are highly flexible: slicing accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Internally, each subcorpus maintains a parent-child relationship with the original corpus in a tree.
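The slicing idea can be illustrated with a minimal, self-contained sketch. This is not the atap_corpus API: the class SketchCorpus and its methods (from_docs, slice_by, slice_contains, root) are hypothetical names invented for illustration. The sketch assumes that memory efficiency comes from subcorpora sharing the parent's document storage and holding only index lists, which the description above does not confirm.

```python
from __future__ import annotations

from typing import Callable, Optional


class SketchCorpus:
    """Hypothetical stand-in for a corpus; NOT the atap_corpus API."""

    def __init__(self, docs: list[str], indices: list[int],
                 parent: Optional[SketchCorpus] = None) -> None:
        self.docs = docs              # document storage, shared by the whole tree
        self.indices = indices        # positions in `docs` visible to this corpus
        self.parent = parent          # None for the root corpus
        self.children: list[SketchCorpus] = []

    @classmethod
    def from_docs(cls, docs: list[str]) -> SketchCorpus:
        return cls(docs, list(range(len(docs))))

    def __len__(self) -> int:
        return len(self.indices)

    def texts(self) -> list[str]:
        return [self.docs[i] for i in self.indices]

    def slice_by(self, predicate: Callable[[str], bool]) -> SketchCorpus:
        """Core slicing: keep the documents for which the user function is true."""
        kept = [i for i in self.indices if predicate(self.docs[i])]
        # type(self) keeps the result an instance of the same (sub)class.
        child = type(self)(self.docs, kept, parent=self)
        self.children.append(child)
        return child

    def slice_contains(self, term: str) -> SketchCorpus:
        """Convenience operation layered on top of slice_by."""
        return self.slice_by(lambda text: term.lower() in text.lower())

    def root(self) -> SketchCorpus:
        """Walk the parent links back to the original corpus."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


corpus = SketchCorpus.from_docs(["Cats purr.", "Dogs bark.", "Cats and dogs play."])
cats = corpus.slice_contains("cats")
both = cats.slice_by(lambda text: "dogs" in text.lower())
print(cats.texts())
print(both.texts())
```

In this sketch each subcorpus stores only a list of integer positions, so slicing repeatedly does not copy the documents, and the parent and children links form the tree that relates every subcorpus to the corpus it was sliced from.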

A Corpus can also be serialised and deserialised, so it can be carried across different ATAP analytics notebooks.
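The round trip between notebooks can be sketched as follows. The file format and the method names that atap_corpus uses are not stated here, so this example does not call the library at all: save_docs and load_docs are hypothetical helpers, and JSON is an assumed format chosen only to keep the sketch self-contained.

```python
import json
import tempfile
from pathlib import Path


def save_docs(docs: list[str], path: Path) -> None:
    """Hypothetical helper: write the documents to disk (first notebook)."""
    payload = {"version": 1, "docs": docs}
    path.write_text(json.dumps(payload), encoding="utf-8")


def load_docs(path: Path) -> list[str]:
    """Hypothetical helper: read the documents back (second notebook)."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["docs"]


original = ["Cats purr.", "Dogs bark."]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "corpus.json"
    save_docs(original, target)      # serialise
    restored = load_docs(target)     # deserialise

print(restored == original)
```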

pip install atap_corpus

Extras: Viz:

Out of the box, Corpus also comes with simple and quick visualisations such as word clouds, timelines etc.

pip install atap_corpus[viz]

Tests

To run all the unit tests, execute the following script.

./scripts/run_tests.sh

This repo originated from Juxtorpus as a decoupling effort. The Juxtorpus repo may be accessed here.

Download files


Source Distribution

atap_corpus-0.1.2.tar.gz (24.0 kB)

Uploaded Source

Built Distribution

atap_corpus-0.1.2-py3-none-any.whl (29.0 kB)

Uploaded Python 3

File details

Details for the file atap_corpus-0.1.2.tar.gz.

File metadata

  • Download URL: atap_corpus-0.1.2.tar.gz
  • Size: 24.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.2.tar.gz
Algorithm    Hash digest
SHA256       bad156018c80eef79483d6a6f3e577d14aa1984eb1b511d60379e27c3ee50bd0
MD5          5b89ccd6936b8a7df5d4656572e05a33
BLAKE2b-256  1713fe7c37f944a8b133446cacba4cfce9c17f68823544ae739be2e641c16a7d


File details

Details for the file atap_corpus-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: atap_corpus-0.1.2-py3-none-any.whl
  • Size: 29.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.10.11 Darwin/22.6.0

File hashes

Hashes for atap_corpus-0.1.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       6e35b4327ed3b62d37e4f4b3f01d8719b54c362abafe7daf381b5526414ef9ca
MD5          05211ac3ed2c823ed04b28bd69ab484d
BLAKE2b-256  ed2b86227315a358193611a5e391e6994475b6d80a0287f343b6e2c2b5e45d64

