Python package for the Conditional Topic Allocation (CTA)

CTApy

Python package for the "Conditional Topic Allocation" (CTA): a text-analysis method that identifies topics that correlate with numerical outcomes.

How does CTA work?

CTA finds topics by conditioning on observables. For example, it can show whether Republicans write differently about politics than Democrats. The method consists of three steps:


1. Predict the outcome variable with text.
  • Uses DistilBERT to predict the outcome.

2. Select words with high predictive power (positive or negative).
  • Calculates SHAP values for each word and selects the words with a statistically significant SHAP value.

3. Group words by semantic similarity.
  • Returns topics with either positive or negative correlation with the outcome.
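
Steps 2 and 3 above can be illustrated with a toy sketch. This is not the CTApy API or its actual implementation: the function names `select_words` and `group_words`, the t-statistic threshold, the greedy cosine-similarity grouping, and all numbers are assumptions made for illustration. Step 1 and the SHAP computation are not shown; the per-word attribution values are simply made up.

```python
import math
from statistics import mean, stdev


def select_words(attributions, t_threshold=2.0, min_count=3):
    """Keep words whose mean attribution is clearly different from zero.

    `attributions` maps each word to the attribution values it received
    across documents (hypothetical stand-ins for SHAP values).
    Returns {word: mean_attribution} for words with |t| >= t_threshold.
    """
    selected = {}
    for word, values in attributions.items():
        if len(values) < min_count:
            continue  # too few observations for a meaningful test
        spread = stdev(values)
        if spread == 0:
            continue  # t-statistic undefined; skipped in this sketch
        t_stat = mean(values) / (spread / math.sqrt(len(values)))
        if abs(t_stat) >= t_threshold:
            selected[word] = mean(values)
    return selected


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def group_words(words, embeddings, min_similarity=0.8):
    """Greedy grouping: a word joins the first topic whose first word
    is similar enough, otherwise it starts a new topic."""
    topics = []
    for word in words:
        for topic in topics:
            if cosine(embeddings[word], embeddings[topic[0]]) >= min_similarity:
                topic.append(word)
                break
        else:
            topics.append([word])
    return topics


# Made-up attribution values and 2-d embeddings, for illustration only.
attributions = {
    "tax": [0.30, 0.25, 0.35, 0.28],
    "taxes": [0.22, 0.27, 0.31],
    "climate": [-0.40, -0.35, -0.45, -0.38],
    "the": [0.05, -0.06, 0.01, -0.02],
}
embeddings = {
    "tax": [1.0, 0.1],
    "taxes": [0.9, 0.2],
    "climate": [0.1, 1.0],
    "the": [0.5, 0.5],
}

selected = select_words(attributions)
# Split by sign so that topics correlate positively or negatively with the outcome.
positive_topics = group_words([w for w, m in selected.items() if m > 0], embeddings)
negative_topics = group_words([w for w, m in selected.items() if m < 0], embeddings)
print(positive_topics, negative_topics)
```

In this toy data, "the" has attribution values scattered around zero and is dropped, while the remaining words are grouped separately for the positive and the negative side.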

CTA supports all languages.

Installation

Runs on Windows and requires Python 3.9 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.

# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools

# install the package
python -m pip install -U CTApy

If you want to use Jupyter, make sure you have it installed in the current environment.
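
Before installing, it can help to confirm that the active interpreter matches the requirements above. The following is a minimal sketch using only the standard library; the helper name `environment_report` is hypothetical, and the check only recognizes environments created with venv or virtualenv, not conda environments.

```python
import sys


def environment_report(version_info=sys.version_info,
                       prefix=sys.prefix,
                       base_prefix=sys.base_prefix):
    """Summarize the Python version and whether a venv-style environment is active."""
    return {
        "python": f"{version_info[0]}.{version_info[1]}",
        "is_python_3_9": tuple(version_info[:2]) == (3, 9),
        # In a venv/virtualenv, sys.prefix points at the environment while
        # sys.base_prefix points at the base installation.
        "in_virtual_env": prefix != base_prefix,
    }


print(environment_report())
```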

Quickstart

Please see the hands-on tutorials, which replicate the research paper: https://github.com/twekhof/CTA/tutorials. The paper uses the following package versions:
  • torch: 2.4.0
  • transformers: 4.32.1
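
When replicating the paper, the installed versions can be compared with the ones listed above. This is a minimal sketch built on the standard-library `importlib.metadata`; the helper name `compare_versions` and the choice to ignore local build tags such as `+cpu` are assumptions, not part of CTApy.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README as the ones used in the paper.
PAPER_VERSIONS = {"torch": "2.4.0", "transformers": "4.32.1"}


def compare_versions(expected, get_version=version):
    """Return {package: (expected, installed)} for every mismatch.

    `installed` is None when the package is not installed.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None
        # A local build tag such as "2.4.0+cpu" counts as the same release.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in compare_versions(PAPER_VERSIONS).items():
    print(f"{package}: paper used {wanted}, found {installed or 'not installed'}")
```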

Author

CTApy was developed by Tobias Wekhof, ETH Zurich.

Disclaimer

This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.
