Visualization of Topic Modeling Results
tmplot
tmplot is a Python package for analyzing and visualizing topic modeling results. Built for data scientists and researchers, it provides interactive reports along with analytics (topic stability, model comparison, saliency, entropy) that extend beyond what LDAvis/pyLDAvis offer.
Analyze • Visualize • Compare multiple topic models with ease
Key Features
Interactive Visualization
- Topic scatter plots with customizable coordinates and sizing
- Term probability charts with relevance weighting
- Document analysis showing top documents per topic
- Interactive reports with real-time parameter adjustment
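To make "relevance weighting" concrete: in LDAvis-style tools, the relevance of term w to topic t is usually defined as λ·log p(w|t) + (1 − λ)·log(p(w|t) / p(w)), where λ is the slider value. The following is a minimal NumPy sketch of that formula, not tmplot's own code; the function name `term_relevance`, the topics-by-terms orientation, and the toy matrix are assumptions made for illustration.

```python
import numpy as np

def term_relevance(phi, topic_weights, lam=0.6):
    """LDAvis-style relevance.

    phi           -- array (n_topics, n_terms), each row is p(w | t)
    topic_weights -- array (n_topics,), marginal topic probabilities p(t)
    lam           -- weight between term probability and lift, in [0, 1]
    """
    phi = np.asarray(phi, dtype=float)
    topic_weights = np.asarray(topic_weights, dtype=float)
    eps = 1e-12                          # guards against log(0)
    p_w = topic_weights @ phi            # marginal term probability p(w)
    log_phi = np.log(phi + eps)          # log p(w | t)
    log_lift = log_phi - np.log(p_w + eps)  # log of p(w | t) / p(w)
    return lam * log_phi + (1.0 - lam) * log_lift

# Toy example (hypothetical data): 2 topics over 3 terms.
phi = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.2, 0.7]])
weights = np.array([0.5, 0.5])
relevance = term_relevance(phi, weights, lam=0.6)
top_terms = np.argsort(-relevance, axis=1)  # term indices, most relevant first
```

With λ = 1 the ranking reduces to plain term probability; lower values favor terms that are distinctive for the topic.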
Advanced Analytics
- Topic stability analysis across multiple model runs
- Model comparison with sophisticated distance metrics
- Saliency calculations for term importance
- Entropy metrics for model optimization
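As a reference point for the saliency bullet above, term saliency is commonly defined (Chuang et al., Termite) as p(w) multiplied by the term's distinctiveness, which is the KL divergence between p(t|w) and p(t). The sketch below implements that textbook definition in NumPy. It is an illustration under stated assumptions, not tmplot's implementation: `term_saliency` is a hypothetical name, and the matrix is assumed to be topics by terms.

```python
import numpy as np

def term_saliency(phi, topic_weights):
    """Saliency = p(w) * sum_t p(t|w) * log(p(t|w) / p(t)).

    phi           -- array (n_topics, n_terms), each row is p(w | t)
    topic_weights -- array (n_topics,), marginal topic probabilities p(t)
    """
    phi = np.asarray(phi, dtype=float)
    p_t = np.asarray(topic_weights, dtype=float)
    eps = 1e-12
    joint = phi * p_t[:, None]           # p(t, w)
    p_w = joint.sum(axis=0)              # p(w)
    p_t_given_w = joint / (p_w + eps)    # p(t | w)
    distinctiveness = np.sum(
        p_t_given_w * np.log((p_t_given_w + eps) / (p_t[:, None] + eps)),
        axis=0,
    )
    return p_w * distinctiveness

# Toy example (hypothetical data): term 1 is equally likely in both topics,
# so it carries no information about the topic and its saliency is ~0.
phi = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.2, 0.7]])
saliency = term_saliency(phi, np.array([0.5, 0.5]))
```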
Model Support
- tomotopy: LDAModel, LLDAModel, CTModel, DMRModel, HDPModel, PTModel, SLDAModel, GDMRModel
- gensim: LdaModel, LdaMulticore
- bitermplus: BTM
Distance Metrics
- Kullback-Leibler (symmetric & non-symmetric)
- Jensen-Shannon divergence
- Jeffrey's divergence
- Hellinger & Bhattacharyya distances
- Total variation distance
- Jaccard index
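For readers unfamiliar with these measures, here is a small NumPy sketch of several of them applied to two discrete distributions (for example, two topics' term probabilities). These are the standard textbook definitions; tmplot's own normalization conventions may differ, and all function names here are illustrative rather than part of the tmplot API.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Non-symmetric Kullback-Leibler divergence KL(p || q)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    return float(np.sum(p * np.log(p / q)))

def jeffreys_divergence(p, q):
    """Jeffrey's divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def jensen_shannon_divergence(p, q):
    """Average KL divergence of p and q to their midpoint distribution."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def hellinger_distance(p, q):
    """Hellinger distance, bounded in [0, 1]."""
    diff = np.sqrt(np.asarray(p, dtype=float)) - np.sqrt(np.asarray(q, dtype=float))
    return float(np.sqrt(0.5 * np.sum(diff ** 2)))

def total_variation_distance(p, q):
    """Half of the L1 distance between the two distributions."""
    return float(0.5 * np.sum(np.abs(np.asarray(p, dtype=float) - np.asarray(q, dtype=float))))

# Two toy topic-term distributions (hypothetical data).
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.2, 0.5, 0.3])
tv = total_variation_distance(p, q)
js = jensen_shannon_divergence(p, q)
```

Unlike plain KL divergence, the Jensen-Shannon, Jeffrey's, Hellinger, and total variation measures are symmetric in their arguments, which makes them convenient for building a topic distance matrix.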
Dimensionality Reduction
t-SNE, SpectralEmbedding, MDS, LocallyLinearEmbedding, Isomap
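The names above correspond to estimators in `sklearn.manifold`. As a rough illustration of what such a projection involves, the sketch below embeds a precomputed topic distance matrix into two dimensions with scikit-learn's `TSNE`. The toy topics are randomly generated, and the choice of Hellinger distance and the parameter values are assumptions for the example, not tmplot defaults.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical data: 6 topics over a 50-term vocabulary.
rng = np.random.default_rng(0)
phi = rng.dirichlet(np.ones(50), size=6)

# Pairwise Hellinger distances between topics, shape (6, 6).
sqrt_phi = np.sqrt(phi)
diff = sqrt_phi[:, None, :] - sqrt_phi[None, :, :]
dist = np.sqrt(0.5 * np.sum(diff ** 2, axis=2))

# With a precomputed metric, t-SNE needs init="random",
# and perplexity must be smaller than the number of topics.
tsne = TSNE(
    n_components=2,
    metric="precomputed",
    init="random",
    perplexity=2.0,
    random_state=0,
)
coords = tsne.fit_transform(dist)  # one (x, y) pair per topic
```

With only a handful of topics, the resulting layout is sensitive to the perplexity and the random seed, so it is worth comparing a few settings or algorithms before interpreting distances on the plot.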
Donate
If you find this package useful, please consider donating any amount of money. This will help me spend more time on supporting open-source software.
Quick Start
Installation
# From PyPI (recommended)
pip install tmplot
# Development version
pip install git+https://github.com/maximtrp/tmplot.git
Basic Usage
import tmplot as tmp
# Load your topic model and documents
model = your_fitted_model # tomotopy, gensim, or bitermplus
docs = your_documents
# Create interactive report
tmp.report(model, docs=docs)
# Or create individual visualizations
coords = tmp.prepare_coords(model)
tmp.plot_scatter_topics(coords, size_col='size')
Advanced Examples
Get Stable Topics
import tmplot as tmp
# Find stable topics across multiple models
models = [model1, model2, model3, model4]
closest_topics, distances = tmp.get_closest_topics(models)
stable_topics, stable_distances = tmp.get_stable_topics(closest_topics, distances)
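The idea behind stable topics can be sketched without tmplot: pick a reference model, find the closest topic in every other model for each reference topic, and keep the reference topics whose matches all stay within a distance threshold. The NumPy code below is one simple way to do this under stated assumptions (a shared vocabulary, Hellinger distance, a hand-picked threshold). The helper names are hypothetical, and this is not necessarily the procedure used by `tmp.get_closest_topics` and `tmp.get_stable_topics`.

```python
import numpy as np

def hellinger_matrix(a, b):
    """Pairwise Hellinger distances between rows of a and rows of b."""
    diff = np.sqrt(a)[:, None, :] - np.sqrt(b)[None, :, :]
    return np.sqrt(0.5 * np.sum(diff ** 2, axis=2))

def match_topics(phis, ref=0):
    """For each topic of the reference model, find the closest topic in every model."""
    n_topics = phis[ref].shape[0]
    idx = np.empty((n_topics, len(phis)), dtype=int)
    dist = np.zeros((n_topics, len(phis)))
    for m, phi in enumerate(phis):
        d = hellinger_matrix(phis[ref], phi)
        idx[:, m] = d.argmin(axis=1)   # index of the closest topic in model m
        dist[:, m] = d.min(axis=1)     # distance to that topic
    return idx, dist

def filter_stable(idx, dist, max_dist=0.2):
    """Keep reference topics whose closest matches are all within max_dist."""
    keep = dist.max(axis=1) <= max_dist
    return idx[keep], dist[keep]

# Toy models (hypothetical data): rows are topics, columns are terms.
base = np.array([[0.70, 0.20, 0.05, 0.05],
                 [0.05, 0.70, 0.20, 0.05],
                 [0.05, 0.05, 0.20, 0.70]])
model_b = base[[2, 0, 1]]            # same topics, different order
model_c = base[[1, 2, 0]].copy()
model_c[1] = 0.25                    # base topic 2 replaced by a uniform topic

idx, dist = match_topics([base, model_b, model_c])
stable_idx, stable_dist = filter_stable(idx, dist, max_dist=0.2)
```

Here the first two reference topics reappear in every model and are kept, while the third has no close counterpart in `model_c` and is dropped.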
Analyze Model
# Calculate entropy for model selection (phi: topic-term probability matrix)
entropy_score = tmp.entropy(phi)
# Find salient terms (theta: topic-document probability matrix)
saliency = tmp.get_salient_terms(phi, theta)
Visualize
# Create topic distance matrix with different metrics
topic_dists = tmp.get_topics_dist(phi, method='jensen-shannon')
# Generate coordinates with custom algorithm
coords = tmp.get_topics_scatter(topic_dists, theta, method='tsne')
tmp.plot_scatter_topics(coords, topic=3) # Highlight topic 3
Documentation & Examples
- Complete Tutorial - Step-by-step guide
- API Reference - Full documentation
- Example Notebooks - Jupyter examples
Requirements
Core dependencies: numpy, scipy, scikit-learn, pandas, altair, ipywidgets
Optional models: tomotopy, gensim, bitermplus
File details
Details for the file tmplot-0.3.0.tar.gz.
File metadata
- Download URL: tmplot-0.3.0.tar.gz
- Upload date:
- Size: 20.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0790ed843edfb97174229369fd28f5e9ab8f89558ceb9dc375169e955fec54af |
| MD5 | daba62f88fe59a97282ed25d519ddeeb |
| BLAKE2b-256 | 7fc587420574f59d0e4e247cac102b58566a2da0fc7fabc61844440a89d29ca8 |
File details
Details for the file tmplot-0.3.0-py3-none-any.whl.
File metadata
- Download URL: tmplot-0.3.0-py3-none-any.whl
- Upload date:
- Size: 19.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6259be41693f8629cf86a68f9cdd812b646a2ea1f6c051d986868ede950bec3c |
| MD5 | 3379e9445e8e938d4af9676d3101cfc7 |
| BLAKE2b-256 | 3971b1b7910e0ba22859aa72adad048a0406e14c46d4b1fd24f15e40dbebae44 |