
A corpus mini-framework that supports memory-efficient slicing and provides a standardised base corpus structure for the collection of ATAP tools.


ATAP Corpus

Provides a standardised base Corpus structure for ATAP tools.

A Corpus can be sliced into subcorpora based on different criteria, and slicing always returns an instance of a BaseCorpus subclass. The slicing criterion is highly flexible: it accepts a user-defined function, and convenience slicing operations are layered on top of it out of the box. Internally, each subcorpus maintains a parent-child relationship with the original corpus in a tree.
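The slicing model described above can be sketched in plain Python. This is a minimal illustration only: `ToyCorpus`, `filter_by`, `filter_by_keyword` and `find_root` are hypothetical names invented for this sketch and are not the atap_corpus API. It assumes a subcorpus stores only index positions into documents shared with its parent, which is one way slicing can stay memory-efficient.

```python
from __future__ import annotations

from typing import Callable, Iterator, Optional, Sequence


class ToyCorpus:
    """Hypothetical stand-in for illustration; NOT the atap_corpus API."""

    def __init__(
        self,
        docs: Sequence[str],
        parent: Optional[ToyCorpus] = None,
        indices: Optional[Sequence[int]] = None,
    ) -> None:
        self._docs = docs          # shared with the root corpus, never copied
        self._parent = parent      # None for the root corpus
        self._children = []        # subcorpora sliced from this corpus
        # A subcorpus keeps only positions into the shared documents.
        self._indices = list(range(len(docs))) if indices is None else list(indices)

    def __len__(self) -> int:
        return len(self._indices)

    def __iter__(self) -> Iterator[str]:
        return (self._docs[i] for i in self._indices)

    @property
    def parent(self) -> Optional[ToyCorpus]:
        return self._parent

    def find_root(self) -> ToyCorpus:
        """Walk up the parent links to the original corpus."""
        node = self
        while node._parent is not None:
            node = node._parent
        return node

    def filter_by(self, condition: Callable[[str], bool]) -> ToyCorpus:
        """Slice with a user-defined function; returns the same class."""
        kept = [i for i in self._indices if condition(self._docs[i])]
        child = type(self)(self._docs, parent=self, indices=kept)
        self._children.append(child)
        return child

    def filter_by_keyword(self, keyword: str) -> ToyCorpus:
        """Convenience operation layered on top of filter_by."""
        return self.filter_by(lambda doc: keyword.lower() in doc.lower())


corpus = ToyCorpus(["The cat sat.", "A dog barked.", "Cats and dogs."])
cats = corpus.filter_by_keyword("cat")               # convenience slicing
short = cats.filter_by(lambda doc: len(doc) < 13)    # user-defined slicing
```

Here `short` is a child of `cats`, which is a child of `corpus`, so `short.find_root()` walks back to the original corpus without any documents having been duplicated along the way.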

A Corpus can also be serialised and deserialised, which allows it to be carried across different ATAP analytics notebooks.
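The round trip between notebooks can be pictured with a small sketch. The on-disk format used by atap_corpus is not stated here, so this example assumes a simple JSON file; `serialise`, `deserialise` and the file name are hypothetical and only illustrate the idea of writing a corpus in one notebook and restoring it in another.

```python
import json
import tempfile
from pathlib import Path


def serialise(name, docs, path):
    """Write a corpus name and its documents to a JSON file (assumed format)."""
    payload = {"name": name, "docs": list(docs)}
    path.write_text(json.dumps(payload), encoding="utf-8")
    return path


def deserialise(path):
    """Read back the name and documents written by serialise()."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["name"], payload["docs"]


with tempfile.TemporaryDirectory() as tmp:
    # "Notebook A" saves the corpus to a file.
    saved = serialise("pets", ["The cat sat.", "A dog barked."], Path(tmp) / "pets.json")
    # "Notebook B" restores it from the same file.
    name, docs = deserialise(saved)
```

After the round trip, `name` and `docs` hold the same values that were written, which is the property a serialised corpus needs in order to be shared between notebooks.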

pip install atap_corpus

Extras: Viz:

Out of the box, Corpus also comes with simple and quick visualisations such as word clouds, timelines etc.

pip install "atap_corpus[viz]"

Tests

To run all the unit tests, execute the provided script.

./scripts/run_tests.sh

This repository originated from Juxtorpus as a decoupling effort. The Juxtorpus repository may be accessed here.

Download files


Source Distribution

atap_corpus-0.1.1.tar.gz (24.0 kB, Source)

Built Distribution

atap_corpus-0.1.1-py3-none-any.whl (29.0 kB, Python 3)

