CTApy
Python package for the "Conditional Topic Allocation" (CTA): a text-analysis method that identifies topics that correlate with numerical outcomes.
- Corresponding research paper: Conditional Topic Allocations for Open-Ended Survey Responses (2024).
How does CTA work?
CTA finds topics by conditioning on observables. For example, it can show whether Republicans write differently about politics than Democrats. The method consists of three steps:
1. Predict the outcome variable with text.
- Uses DistilBERT to predict the outcome.
2. Select words with high predictive power (positive or negative).
- Calculates SHAP values for each word and selects words with a statistically significant SHAP value.
3. Group words by semantic similarity.
- Returns topics with either positive or negative correlation with the outcome.
CTA supports all languages.
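To make step 2 more concrete, here is a minimal sketch of how words with a statistically significant attribution could be selected and split by sign. It is not the CTApy implementation: the function `select_significant_words`, its parameters, and the toy attribution values are hypothetical, and the significance test is a simple z-test with a normal approximation, chosen only so the example runs with the standard library. In practice, the per-word values would come from SHAP applied to the fitted DistilBERT model.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist, mean, stdev


def select_significant_words(attributions, alpha=0.05, min_count=3):
    """Keep words whose mean attribution differs significantly from zero.

    `attributions` is an iterable of (word, value) pairs, one pair per
    occurrence of a word in the corpus. Hypothetical helper, not CTApy API.
    """
    by_word = defaultdict(list)
    for word, value in attributions:
        by_word[word].append(value)

    selected = {}
    for word, values in by_word.items():
        if len(values) < min_count:
            continue  # too few occurrences to test
        sd = stdev(values)
        if sd == 0:
            continue  # test statistic is undefined without variance
        z = mean(values) / (sd / sqrt(len(values)))
        # Two-sided p-value under a normal approximation (an assumption).
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            selected[word] = mean(values)
    return selected


# Toy attribution values, invented for illustration only.
toy_attributions = [
    ("taxes", 0.42), ("taxes", 0.38), ("taxes", 0.51), ("taxes", 0.45),
    ("welfare", -0.30), ("welfare", -0.35), ("welfare", -0.28), ("welfare", -0.33),
    ("the", 0.02), ("the", -0.03), ("the", 0.01), ("the", -0.01),
]

selected = select_significant_words(toy_attributions)

# Split by sign: these word lists would then be grouped into topics (step 3).
positive_words = sorted(w for w, m in selected.items() if m > 0)
negative_words = sorted(w for w, m in selected.items() if m < 0)

print("positive:", positive_words)
print("negative:", negative_words)
```

Words such as "the", whose attributions scatter around zero, are dropped, while consistently positive or negative words are kept for grouping.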
Installation
CTApy requires Python 3.9 and pip.
It is highly recommended to use a virtual environment (or conda environment) for the installation.
# upgrade pip, wheel and setuptools
python -m pip install -U pip wheel setuptools
# install the package
python -m pip install -U CTApy
If you want to use Jupyter, make sure you have it installed in the current environment.
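After installing, a quick way to confirm that the package is visible in the active environment is to query the installed distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical, and treating any Python version from 3.9 upward as acceptable is an assumption based on the requirement stated above.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: Python 3.9 or newer satisfies the stated requirement.
python_ok = sys.version_info >= (3, 9)
ctapy_version = installed_version("CTApy")

print("Python version OK:", python_ok)
print("CTApy version:", ctapy_version or "not installed in this environment")
```

If the second line reports that the package is not installed, the virtual environment used for `pip install` is probably not the one currently active.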
Quickstart
Please see the hands-on tutorials, which replicate the research paper: https://github.com/twekhof/CTA/tree/main/tutorials.
Author
CTApy was developed by Tobias Wekhof, ETH Zurich.
Disclaimer
This Python package is a research tool currently under development. The authors take no responsibility for the accuracy or reliability of the results produced by it.
Download files
File details
Details for the file ctapy-0.1.4.tar.gz.
File metadata
- Download URL: ctapy-0.1.4.tar.gz
- Upload date:
- Size: 9.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 166cf8ea9e2b8e93b07a2df359d995ea55ddb5cf2e288b6be50f5b93369b7de9 |
| MD5 | 9723d6a0cb7c9a2e89bb7387fed20ef1 |
| BLAKE2b-256 | 10f05911f184262209332830461edf159a2aff137e786675285fc3497cacf159 |
File details
Details for the file CTApy-0.1.4-py3-none-any.whl.
File metadata
- Download URL: CTApy-0.1.4-py3-none-any.whl
- Upload date:
- Size: 12.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 51e9eb7f901fbb50fec3f4b9e5a651daca2c848ce73240f805675cbda7af65bf |
| MD5 | 6ccd9081ea686a84ffaa127fec46a0b1 |
| BLAKE2b-256 | 04a33142356f920c0532354531a37ed08e7f2c32d298eb323d6d9bb0182014ef |