Python package for the Conditional Topic Allocation (CTA)

CTApy

Python package for the "Conditional Topic Allocation" (CTA): a text-analysis method that identifies topics that correlate with numerical outcomes.

How does CTA work?

CTA finds topics by conditioning on observables. For example: do Republicans write differently about politics than Democrats? The method consists of three steps:


1. Predict the outcome variable with text.
  • Uses DistilBERT to predict the outcome.

2. Select words with high predictive power (positive or negative).
  • Calculates SHAP values for each word and selects the words whose SHAP value is statistically significant.

3. Group words by semantic similarity.
  • Returns topics with either positive or negative correlation with the outcome.
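
Steps 2 and 3 above can be illustrated with a minimal, standard-library sketch. It is not the CTApy API: the function names `significant_words` and `group_by_similarity` are hypothetical, the per-occurrence scores are hand-written toy numbers standing in for SHAP values from a fitted DistilBERT model (step 1 is not reproduced here), the significance test is assumed to be a simple two-sided z-test on each word's mean score, and the two-dimensional word vectors are made up for illustration.

```python
import math
from collections import defaultdict
from statistics import NormalDist, mean, stdev


def significant_words(attributions, alpha=0.05):
    """Keep words whose mean attribution score differs significantly from zero.

    `attributions` is a list of (word, score) pairs, one per word occurrence.
    Assumption: a two-sided z-test stands in for the package's actual test.
    """
    by_word = defaultdict(list)
    for word, score in attributions:
        by_word[word].append(score)

    selected = {}
    for word, scores in by_word.items():
        if len(scores) < 2:
            continue  # a single occurrence gives no variance estimate
        spread = stdev(scores)
        if spread == 0:
            continue  # degenerate case, skipped to keep the sketch simple
        z = mean(scores) / (spread / math.sqrt(len(scores)))
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            selected[word] = mean(scores)
    return selected


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def group_by_similarity(words, embeddings, threshold=0.8):
    """Greedy grouping: a word joins the first group whose first word is similar enough."""
    groups = []
    for word in words:
        for group in groups:
            if cosine(embeddings[word], embeddings[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            groups.append([word])
    return groups


# Toy per-occurrence scores (hypothetical stand-ins for SHAP values).
attributions = [
    ("tax", 0.30), ("tax", 0.28), ("tax", 0.33), ("tax", 0.31),
    ("taxes", 0.25), ("taxes", 0.27), ("taxes", 0.24), ("taxes", 0.29),
    ("climate", -0.40), ("climate", -0.37), ("climate", -0.42), ("climate", -0.39),
    ("the", 0.02), ("the", -0.03), ("the", 0.01), ("the", -0.01),
]
# Toy word vectors (hypothetical); a real pipeline would use learned embeddings.
embeddings = {"tax": [1.0, 0.1], "taxes": [0.9, 0.2], "climate": [0.1, 1.0]}

selected = significant_words(attributions)
positive = [word for word, score in selected.items() if score > 0]
negative = [word for word, score in selected.items() if score < 0]
positive_topics = group_by_similarity(positive, embeddings)
negative_topics = group_by_similarity(negative, embeddings)
print("positive topics:", positive_topics)
print("negative topics:", negative_topics)
```

In this toy data, "the" has scores scattered around zero and is filtered out, while the remaining words are split by the sign of their mean score before grouping, which mirrors the positive and negative topics described in step 3.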

CTA supports all languages.

Installation

CTApy requires Python 3.9 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U CTApy

If you want to use Jupyter, make sure you have it installed in the current environment.
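
After installing, a short check like the following can confirm that the active environment matches the notes above. This is only a sketch: `installed_version` and `environment_report` are hypothetical helper names, the distribution name `CTApy` is taken from the install command, and looking for the `jupyter_core` module is an assumed proxy for "Jupyter is installed".

```python
import sys
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the facts the installation notes ask about."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "CTApy": installed_version("CTApy"),
        # Assumption: the presence of jupyter_core indicates a Jupyter install.
        "jupyter_available": util.find_spec("jupyter_core") is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A value of `None` for `CTApy` means the package is not installed in the environment that runs the script, which usually indicates that a different virtual environment is active.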

Quickstart

Please see the hands-on tutorials, which replicate the analysis in the research paper: https://github.com/twekhof/CTA/tree/main/tutorials.

Author

CTApy was developed by Tobias Wekhof, ETH Zurich.

Disclaimer

This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.


