
Python package for the Conditional Topic Allocation (CTA)


CTApy

Python package for the "Conditional Topic Allocation" (CTA): a text-analysis method that identifies topics that correlate with numerical outcomes.

How does CTA work?

CTA finds topics by conditioning on observables. For example, do Republicans write differently about politics than Democrats? The method consists of three steps:


1. Predict the outcome variable with text.
  • Use DistilBERT to predict the outcome.

2. Select words with high predictive power (positive or negative).
  • Calculate SHAP values for each word and select words with a statistically significant SHAP value.

3. Group words by semantic similarity.
  • The resulting word groups form topics that correlate either positively or negatively with the outcome.
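
As a rough illustration of steps 2 and 3, the following standard-library sketch selects words and groups them into topics. It is not CTApy's implementation: the attribution scores and two-dimensional embeddings are hypothetical toy values standing in for SHAP values and real word embeddings, the one-sample t-test is one possible reading of "statistically significant", and the greedy cosine-similarity grouping is a simple stand-in for whatever clustering the package uses. Step 1 (the DistilBERT prediction) is not shown.

```python
import math
from statistics import mean, stdev

# Hypothetical per-document attribution scores for each word
# (toy stand-ins for SHAP values from a text model).
attributions = {
    "taxes": [0.42, 0.38, 0.51, 0.45, 0.40],
    "freedom": [0.30, 0.35, 0.28, 0.33, 0.31],
    "climate": [-0.44, -0.39, -0.47, -0.41, -0.43],
    "the": [0.02, -0.03, 0.01, -0.02, 0.02],
}

# Hypothetical 2-d word embeddings (real ones have hundreds of dimensions).
embeddings = {
    "taxes": (0.9, 0.1),
    "freedom": (0.8, 0.3),
    "climate": (0.1, 0.9),
    "the": (0.5, 0.5),
}


def t_statistic(values):
    """One-sample t-statistic for the null hypothesis 'mean attribution is 0'."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


def select_words(scores, critical_t=2.776):
    """Keep words whose mean attribution differs significantly from zero.

    2.776 is the two-sided 5% critical value for 4 degrees of freedom,
    which matches the five observations per word in the toy data.
    """
    return {
        word: mean(values)
        for word, values in scores.items()
        if abs(t_statistic(values)) > critical_t
    }


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def group_words(selected, vectors, min_similarity=0.8):
    """Greedy grouping: a word joins the first topic that has the same sign
    and whose first member is similar enough; otherwise it starts a new topic."""
    topics = []  # list of (sign, [words])
    for word, score in selected.items():
        sign = 1 if score > 0 else -1
        for topic_sign, members in topics:
            similar = cosine(vectors[word], vectors[members[0]]) >= min_similarity
            if topic_sign == sign and similar:
                members.append(word)
                break
        else:
            topics.append((sign, [word]))
    return topics


selected = select_words(attributions)
topics = group_words(selected, embeddings)
print(topics)
```

Splitting topics by the sign of the mean attribution mirrors the description above, where each topic correlates either positively or negatively with the outcome.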

CTA supports all languages.

Installation

Runs on Windows and requires Python 3.9 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U CTApy

If you want to use Jupyter, make sure you have it installed in the current environment.

Quickstart

Please see the hands-on tutorials, which replicate the research paper: https://github.com/twekhof/CTA/tutorials. The paper uses the following package versions:
  • torch: 2.4.0
  • transformers: 4.32.1
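
To check whether the current environment matches the versions listed above, a small script along these lines may help. It is only a sketch: `PAPER_VERSIONS` and `compare_versions` are names introduced here for illustration and are not part of CTApy. It relies on the standard-library `importlib.metadata` module.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above as used for the paper.
PAPER_VERSIONS = {"torch": "2.4.0", "transformers": "4.32.1"}


def compare_versions(expected):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Ignore local build tags, so "2.4.0+cu121" still counts as "2.4.0".
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions(PAPER_VERSIONS).items():
        print(f"{package}: paper used {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the tutorials will fail; it only flags a difference from the setup used for the paper.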

Author

CTApy was developed by

Tobias Wekhof, ETH Zurich

Disclaimer

This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.
