Interactive Dimensionality Reduction, Clustering, and Visualization

InterDim

InterDim is a Python package for interactive exploration of latent data dimensions. It wraps existing tools for dimensionality reduction, clustering, and data visualization in a streamlined interface, allowing for quick and intuitive analysis of high-dimensional data.

Features

  • Easy-to-use pipeline for dimensionality reduction, clustering, and visualization
  • Interactive 3D scatter plots for exploring reduced data
  • Support for various dimensionality reduction techniques (PCA, t-SNE, UMAP, etc.)
  • Multiple clustering algorithms (K-means, DBSCAN, etc.)
  • Customizable point visualizations for detailed data exploration

Installation

You can install from PyPI via pip (recommended):

pip install interdim

Or from source:

git clone https://github.com/MShinkle/interdim.git
cd interdim
pip install .
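
To confirm that either install route worked, you can ask Python which version it sees. This is a small sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of InterDim.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if InterDim is installed, otherwise None.
print(installed_version("interdim"))
```

If this prints `None`, the package is not visible to the interpreter you ran it with, which usually means pip installed into a different environment.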

Quick Start

Here's a basic example using the Iris dataset:

from sklearn.datasets import load_iris
from interdim import InterDimAnalysis

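# Load the data and keep the species labels for reference
iris = load_iris()
analysis = InterDimAnalysis(iris.data, true_labels=iris.target)

# Reduce to 3 dimensions with t-SNE, then cluster with K-means
analysis.reduce(method='tsne', n_components=3)
analysis.cluster(method='kmeans', n_clusters=3)

# Show the interactive 3D scatter plot with per-point bar charts
analysis.show(n_components=3, point_visualization='bar')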

[Figure: 3D scatter plot with interactive bar charts]

This reduces the Iris dataset to 3 dimensions using t-SNE, clusters the data using K-means, and displays an interactive 3D scatter plot that shows a bar chart for each data point as you hover over it.
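
For a sense of what the reduce and cluster steps involve, here is a rough equivalent written directly against scikit-learn. It is a sketch, not InterDim's actual implementation: the `random_state` and `n_init` values are arbitrary choices, and clustering the reduced embedding (rather than the original features) is an assumption, since this page does not say which one InterDim uses.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

iris = load_iris()

# Reduce the 4 Iris features to 3 dimensions with t-SNE.
embedding = TSNE(n_components=3, random_state=0).fit_transform(iris.data)

# Group the embedded points into 3 clusters with K-means.
# Assumption: clustering runs on the embedding, not on the raw features.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(embedding)

print(embedding.shape)  # (150, 3)
```

The interactive plot is the part this sketch leaves out; that is what `analysis.show(...)` adds on top of the two steps.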

This is only a small example of what InterDim can do. The same workflow applies to other high-dimensional data, such as language model embeddings.
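
As an illustration of the data shape involved, the sketch below builds a random array that stands in for real embeddings and projects it to 3 dimensions with scikit-learn's PCA. The sizes (200 vectors of width 768) and the random values are placeholders, not real model output.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical stand-in for real embeddings: 200 vectors of width 768,
# laid out as (n_samples, n_features) like iris.data.
embeddings = rng.normal(size=(200, 768))

# Project the 768-dimensional vectors down to 3 dimensions.
reduced = PCA(n_components=3).fit_transform(embeddings)

print(reduced.shape)  # (200, 3)
```

Because `embeddings` has the same two-dimensional layout as `iris.data`, it could presumably be passed to `InterDimAnalysis` in the same way as in the Quick Start, although this page does not document the accepted input types.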

Demo Notebooks

For more in-depth examples and use cases, check out our demo notebooks:

  1. Iris Species Analysis: Basic usage with the classic Iris dataset.

  2. DNN Latent Space Exploration: Visualizing deep neural network activations.

  3. LLM Token Analysis: Exploring language model token embeddings and layer activations.

Documentation

For detailed API documentation and advanced usage, visit our documentation on GitHub Pages.

Contributing

We welcome discussion and contributions!

License

InterDim is released under the BSD 3-Clause License. See the LICENSE file for details.

Contact

For questions and feedback, please open an issue on GitHub.
