CTApy

CTApy is a Python package for Conditional Topic Allocation (CTA), a text-analysis method that identifies topics correlated with numerical outcomes.

How does CTA work?

CTA finds topics by conditioning on observables. For example, it can be used to ask whether Republicans write differently about politics than Democrats. The method consists of three steps:


1. Predict the outcome variable with text.
  • Uses DistilBERT to predict the outcome.

2. Select words with high predictive power (positive or negative).
  • Calculates SHAP values for each word and selects the words with a statistically significant SHAP value.

3. Group words by semantic similarity.
  • Returns topics with either positive or negative correlation with the outcome.
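
Steps 2 and 3 above can be illustrated with a small, self-contained sketch. It is not CTApy's implementation or API: the attribution scores are made-up stand-ins for SHAP values, the embeddings are made-up 2-D vectors, the significance test (a z-test approximation on the mean score) and the greedy cosine-similarity grouping are assumptions chosen to keep the example short, and all function names are hypothetical. Step 1 (the DistilBERT prediction) is skipped because it needs a trained model.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical per-word attribution scores (stand-ins for SHAP values),
# one value per occurrence of the word in a toy corpus.
attributions = {
    "fast":     [0.42, 0.38, 0.51, 0.45, 0.40],
    "quick":    [0.30, 0.35, 0.28, 0.33, 0.31],
    "friendly": [0.25, 0.22, 0.27, 0.24, 0.26],
    "broken":   [-0.47, -0.52, -0.39, -0.44, -0.50],
    "defect":   [-0.29, -0.35, -0.31, -0.27, -0.33],
    "the":      [0.02, -0.03, 0.01, -0.01, 0.00],
}

# Hypothetical 2-D word embeddings; real embeddings have many more dimensions.
embeddings = {
    "fast": (0.9, 0.1),
    "quick": (0.8, 0.3),
    "friendly": (-0.7, 0.6),
    "broken": (0.1, 0.9),
    "defect": (0.2, 0.8),
    "the": (0.5, 0.5),
}


def significant_words(scores, alpha=0.05):
    """Step 2 (assumed test): keep words whose mean score differs from zero."""
    selected = {}
    for word, values in scores.items():
        avg = mean(values)
        std_err = stdev(values) / math.sqrt(len(values))
        if std_err == 0:
            continue  # degenerate case: no variation, skipped in this sketch
        z = avg / std_err
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided, normal approx.
        if p_value < alpha:
            selected[word] = avg
    return selected


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms


def group_by_similarity(words, vectors, threshold=0.9):
    """Step 3 (assumed method): greedily attach each word to the first group
    whose seed word is similar enough, otherwise start a new group."""
    groups = []
    for word in words:
        for group in groups:
            if cosine(vectors[word], vectors[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


selected = significant_words(attributions)
topics = {
    "positive": group_by_similarity([w for w, m in selected.items() if m > 0], embeddings),
    "negative": group_by_similarity([w for w, m in selected.items() if m < 0], embeddings),
}
print(topics)
# {'positive': [['fast', 'quick'], ['friendly']], 'negative': [['broken', 'defect']]}
```

In this toy data the word "the" is dropped because its scores scatter around zero, while the remaining words are split by sign and then grouped into topics.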

CTA supports all languages.

Installation

CTApy requires Python 3.9 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U CTApy

If you want to use Jupyter, make sure you have it installed in the current environment.
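
After installing, a short check along the following lines can confirm which interpreter and package version are active. This is only a sketch using the standard library; the helper name `environment_report` is hypothetical and not part of CTApy, and the virtual-environment test only detects `venv`/`virtualenv` environments, not conda environments.

```python
import sys
from importlib import metadata


def environment_report(dist_name="CTApy"):
    """Summarize the running interpreter and the installed package version."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # the distribution is not installed in this environment
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        # True inside venv/virtualenv; conda environments report False here.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "package_version": version,
    }


report = environment_report()
print(report)
```

A `package_version` of `None` means the package was not found in the environment that runs the script, which usually indicates that pip installed it into a different interpreter.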

Quickstart

Please see the hands-on tutorials, which replicate the research paper: https://github.com/twekhof/CTA/tree/main/tutorials.

Author

CTApy was developed by Tobias Wekhof, ETH Zurich.

Disclaimer

This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.
