Interactive Dimensionality Reduction, Clustering, and Visualization

InterDim

InterDim is a Python package for interactive exploration of latent data dimensions. It wraps existing tools for dimensionality reduction, clustering, and data visualization in a streamlined interface, allowing for quick and intuitive analysis of high-dimensional data.

Features

  • Easy-to-use pipeline for dimensionality reduction, clustering, and visualization
  • Interactive 3D scatter plots for exploring reduced data
  • Support for various dimensionality reduction techniques (PCA, t-SNE, UMAP, etc.)
  • Multiple clustering algorithms (K-means, DBSCAN, etc.)
  • Customizable point visualizations for detailed data exploration

Installation

You can install from PyPI via pip (recommended):

pip install interdim

Or from source:

git clone https://github.com/MShinkle/interdim.git
cd interdim
pip install .

Quick Start

Here's a basic example using the Iris dataset:

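from sklearn.datasets import load_iris
from interdim import InterDimAnalysis

# Load the Iris data and pass the species labels as ground truth
iris = load_iris()
analysis = InterDimAnalysis(iris.data, true_labels=iris.target)

# Reduce to 3 dimensions with t-SNE, then cluster with K-means
analysis.reduce(method='tsne', n_components=3)
analysis.cluster(method='kmeans', n_clusters=3)

# Open the interactive 3D scatter plot; hovering over a point shows its bar chart
analysis.show(n_components=3, point_visualization='bar')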

Figure: 3D scatter plot with interactive bar charts.

This reduces the Iris dataset to 3 dimensions using t-SNE, clusters the data using K-means, and displays an interactive 3D scatter plot that shows a bar chart for each data point as you hover over it.
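To make the reduction and clustering steps concrete, here is a rough sketch of the same two operations done with scikit-learn directly. It illustrates the workflow only and is not InterDim's internal implementation. The `random_state` and `n_init` values are assumptions added for reproducibility, and clustering the reduced points (rather than the original features) is also an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

iris = load_iris()

# Reduce the 4 Iris features to 3 dimensions with t-SNE.
# random_state is an assumption, fixed here so runs are repeatable.
reduced = TSNE(n_components=3, random_state=0).fit_transform(iris.data)

# Cluster the reduced points into 3 groups with K-means.
# Assumption: clustering is applied to the reduced data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(reduced)

print(reduced.shape)  # (150, 3): one 3D point per Iris sample
```

The interactive plot is what InterDim adds on top of steps like these, so the `show` call has no counterpart in this sketch.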

This is only a small example of what InterDim can do. You can use it to explore many kinds of data, including high-dimensional data such as language model embeddings!
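As a sketch of the higher-dimensional case, the snippet below builds synthetic 64-dimensional data and passes it through the same calls used in the Quick Start. The helper `make_synthetic_embeddings`, the wrapper `explore`, and all sizes are hypothetical and chosen for illustration; the random Gaussian blobs are a stand-in for real embeddings, not actual model output.

```python
import numpy as np


def make_synthetic_embeddings(n_points=200, n_dims=64, n_groups=4, seed=0):
    """Return (data, labels): Gaussian blobs standing in for real embeddings."""
    rng = np.random.default_rng(seed)
    # One random center per group, spread out so the groups are separable.
    centers = rng.normal(scale=5.0, size=(n_groups, n_dims))
    labels = rng.integers(0, n_groups, size=n_points)
    # Each point is its group's center plus unit-variance noise.
    data = centers[labels] + rng.normal(size=(n_points, n_dims))
    return data, labels


def explore(data, labels, n_clusters=4):
    """Run the Quick Start pipeline on arbitrary data (hypothetical wrapper)."""
    # Imported here so the helper above works without InterDim installed.
    from interdim import InterDimAnalysis

    analysis = InterDimAnalysis(data, true_labels=labels)
    analysis.reduce(method='tsne', n_components=3)
    analysis.cluster(method='kmeans', n_clusters=n_clusters)
    analysis.show(n_components=3, point_visualization='bar')
```

Calling `explore(*make_synthetic_embeddings())` should open the same kind of interactive plot as the Iris example, with one bar per input dimension in the hover chart.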

Demo Notebooks

For more in-depth examples and use cases, check out our demo notebooks:

  1. Iris Species Analysis: Basic usage with the classic Iris dataset.

  2. DNN Latent Space Exploration: Visualizing deep neural network activations.

  3. LLM Token Analysis: Exploring language model token embeddings and layer activations.

Documentation

For detailed API documentation and advanced usage, visit our GitHub Pages.

Contributing

We welcome discussion and contributions!

License

InterDim is released under the BSD 3-Clause License. See the LICENSE file for details.

Contact

For questions and feedback, please open an issue on GitHub.
