Interactive vector embedding visualization and transformation system
VectorScope
Interactive web-based system for exploring, transforming, and visualizing vector embeddings.
Overview
VectorScope is a visualization tool designed for exploring high-dimensional vector embeddings. It provides an interactive interface for:
- Loading and managing multiple embedding datasets
- Applying transformations (scaling, rotation, PCA-based affine)
- Creating projections to 2D/3D space (PCA, t-SNE, UMAP)
- Selecting and tracking specific points across views
- Building visual transformation pipelines via a graph editor
Features
- Multiple Data Sources: Load data from CSV, NumPy files (.npy, .npz), or built-in sklearn datasets
- Column Configuration: For tabular data, choose which columns are features vs labels
- Visual Transformation Graph: Build data pipelines by connecting layers, transformations, and views
- Interactive Projections: Configure PCA components, t-SNE perplexity, UMAP parameters, and more
- Linked Viewports: Synchronized selection across multiple views
- Session Persistence: Save and reload entire workspaces
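As a rough illustration of column configuration, the sketch below splits a small CSV into feature vectors and labels using only the standard library. The helper name `split_columns` and the sample column names are hypothetical; this is not VectorScope's own loader.

```python
import csv
import io


def split_columns(csv_text, feature_columns, label_column=None):
    """Split CSV text into feature vectors and optional labels (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    vectors, labels = [], []
    for row in reader:
        # Feature columns are parsed as floats; the label column is kept as text.
        vectors.append([float(row[name]) for name in feature_columns])
        labels.append(row[label_column] if label_column else None)
    return vectors, labels


sample = "x,y,species\n1.0,2.0,a\n3.5,4.5,b\n"
vectors, labels = split_columns(sample, ["x", "y"], label_column="species")
print(vectors)  # [[1.0, 2.0], [3.5, 4.5]]
print(labels)   # ['a', 'b']
```

Columns that are neither features nor labels are simply ignored in this sketch.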
Roadmap
Completed Features
- Core Infrastructure
  - FastAPI backend with REST API
  - React + TypeScript frontend with Zustand state management
  - Vite dev server with API proxy
  - Session save/load functionality
- Data Loading
  - CSV file upload with column configuration
  - NumPy file support (.npy, .npz)
  - Built-in sklearn datasets (iris, digits, wine, etc.)
  - Synthetic data generation (clustered Gaussian)
- Projections
  - PCA with configurable component selection
  - t-SNE with perplexity, learning rate, and iterations parameters
  - UMAP with n_neighbors, min_dist, and spread parameters
  - Direct axes view (raw dimension values)
  - Histogram view (per-dimension distribution)
  - Box plot view (per-dimension by class)
  - Corner plot (all axis pairs + diagonal histograms)
- Transformations
  - Scaling transformation with per-axis sliders (linked/unlinked modes)
  - Rotation transformation with selectable rotation plane (any dimension pair)
  - PCA-based affine transformation (with explained variance display)
- Visualization
  - Multiple synchronized viewports with Plotly (2D and 3D scatter)
  - Linked selection across views
  - View sets (save/load viewport configurations)
  - Graph editor for transformation pipelines
  - View editor with header bar layout (layer/view selection, add view)
  - Configurable axis ranges (X, Y, Z for 3D views)
- Annotations & Selections
  - Interactive box selection (drag to select points)
  - Additive selection (Shift + box select to add more points)
  - Point toggling (Shift + click to add/remove individual points)
  - Click-to-clear (click empty area to clear selection)
  - Named selections (save selections for reuse)
  - Selection management (apply, delete saved selections)
  - Virtual points / Barycenters (create centroids from selections)
  - Named barycenters with custom labels
  - Auto-generate selections from class labels
  - Auto-generate barycenters from class labels
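To make the barycenter items concrete, here is a minimal sketch of how a centroid could be computed from a selection, and how selections could be derived from class labels. The function names are hypothetical and the code assumes a barycenter is the plain arithmetic mean of the selected vectors.

```python
def barycenter(vectors, selected_indices):
    """Return the mean vector of the selected points (hypothetical helper)."""
    if not selected_indices:
        raise ValueError("selection is empty")
    dim = len(vectors[0])
    total = [0.0] * dim
    for i in selected_indices:
        for d in range(dim):
            total[d] += vectors[i][d]
    return [t / len(selected_indices) for t in total]


def selections_from_labels(labels):
    """Group point indices by class label, one selection per label."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    return groups


vectors = [[0.0, 0.0], [2.0, 4.0], [10.0, 10.0]]
labels = ["a", "a", "b"]
selections = selections_from_labels(labels)
centroids = {label: barycenter(vectors, idx) for label, idx in selections.items()}
print(centroids)  # {'a': [1.0, 2.0], 'b': [10.0, 10.0]}
```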
In Progress
- Custom axis projections (define axes from point pairs)
- Instance tracking panel
Planned Features
- Transformation Coefficient Visualization
  - Visualize how output axes relate to input axes
  - Stacked bar chart showing normalized coefficients per output dimension
  - Color-coded by input dimension contribution
  - Helps interpret PCA components and affine transforms
- Phase 5: Polish
  - Keyboard shortcuts
  - Improved error handling & loading states
- Phase 6: Onboarding & UX
  - Interactive tutorial (React Joyride or Shepherd.js)
  - Step-by-step guided tour for first-time users
  - Contextual help tooltips
  - In-app help panel
- Future Ideas
  - Export visualizations as images
  - Collaborative sessions
  - Plugin system for custom transformations
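The planned transformation coefficient visualization mentions "normalized coefficients per output dimension". One possible reading, sketched below, is to take the absolute value of each coefficient and divide by the row sum, so that each output dimension's contributions add up to 1. The function name and the row-per-output-dimension layout are assumptions, since the feature is not yet implemented.

```python
def normalized_coefficients(matrix):
    """Normalize absolute coefficients so each output dimension (row) sums to 1."""
    result = []
    for row in matrix:
        total = sum(abs(c) for c in row)
        # An all-zero row has no contributions; keep it as zeros.
        result.append([abs(c) / total if total else 0.0 for c in row])
    return result


# Rows are output dimensions, columns are input dimensions.
components = [[3.0, -1.0], [0.0, 2.0]]
print(normalized_coefficients(components))  # [[0.75, 0.25], [0.0, 1.0]]
```

Each row could then feed one stacked bar, with one colored segment per input dimension.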
Quick Start
Option 1: Install from PyPI
Install the backend package:
pip install vectorscope
Then run the server:
uvicorn backend.main:app --port 8000
Note: The PyPI package includes the backend API. For the full interactive UI, use the development installation below.
Option 2: Development Installation
Prerequisites
- Pixi package manager (handles Python environment)
- Node.js 18+ (for frontend)
Installation
# Clone the repository
git clone https://github.com/cranmer/vectorscope.git
cd vectorscope
# Install Python dependencies
pixi install
# Install frontend dependencies
cd frontend && npm install && cd ..
Running
Start the backend and frontend:
# Start both backend and frontend
pixi run dev
# Or start them separately:
# Terminal 1: Backend (port 8000)
pixi run backend
# Terminal 2: Frontend (port 5173)
cd frontend && npm run dev
Open http://localhost:5173 in your browser.
Architecture
┌────────────────────────────────────────────────────────────┐
│                      Frontend (React)                      │
├─────────────┬─────────────┬─────────────┬──────────────────┤
│ Viewports   │ Graph Editor│   Config    │ State (Zustand)  │
│  (Plotly)   │ (ReactFlow) │   Panels    │                  │
└──────┬──────┴──────┬──────┴──────┬──────┴────────┬─────────┘
       │             │             │               │
       └─────────────┴──────┬──────┴───────────────┘
                            │ REST API (Vite Proxy)
                            ▼
┌────────────────────────────────────────────┐
│             Backend (FastAPI)              │
├─────────────┬─────────────┬────────────────┤
│ Data Store  │  Transform  │   Projection   │
│  (Layers)   │   Engine    │     Engine     │
└─────────────┴─────────────┴────────────────┘
Core Concepts
- Layer: A collection of points with n-dimensional vectors. Source layers contain original data; derived layers are created by transformations.
- Point: A single data point with a vector, optional label, and metadata.
- Transformation: An operation that maps one layer to another (scaling, rotation, affine).
- Projection: A dimensionality reduction from n-D to 2D/3D for visualization (PCA, t-SNE, UMAP).
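The relationship between these concepts can be sketched in a few lines of standard-library Python. The class and function names below are illustrative only and do not mirror the Pydantic models in backend/models/; the rotation and the "direct axes" projection are simplified stand-ins for the real engines.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Point:
    vector: list
    label: Optional[str] = None
    metadata: dict = field(default_factory=dict)


@dataclass
class Layer:
    name: str
    points: list
    derived_from: Optional[str] = None  # None marks a source layer


def rotate(layer, axis_a, axis_b, degrees):
    """Transformation: rotate every point in the (axis_a, axis_b) plane."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    rotated = []
    for p in layer.points:
        v = list(p.vector)
        v[axis_a] = c * p.vector[axis_a] - s * p.vector[axis_b]
        v[axis_b] = s * p.vector[axis_a] + c * p.vector[axis_b]
        rotated.append(Point(v, p.label, dict(p.metadata)))
    return Layer(f"{layer.name}-rotated", rotated, derived_from=layer.name)


def project_axes(layer, x_axis=0, y_axis=1):
    """Projection: pick two raw dimensions as 2D coordinates."""
    return [(p.vector[x_axis], p.vector[y_axis]) for p in layer.points]


source = Layer("demo", [Point([1.0, 0.0, 5.0], label="a")])
derived = rotate(source, 0, 1, 90)   # derived layer, source left untouched
coords = project_axes(derived)       # 2D view of the derived layer
print([(round(x, 6), round(y, 6)) for x, y in coords])  # [(0.0, 1.0)]
```

The transformation keeps the dimensionality (3-D in, 3-D out) and produces a new layer, while the projection only produces coordinates for display.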
Project Structure
vectorscope/
├── backend/                     # FastAPI Python backend
│   ├── main.py                  # Application entry point
│   ├── models/                  # Pydantic data models
│   │   ├── layer.py             # Layer, Point, PointData
│   │   ├── transformation.py
│   │   └── projection.py
│   ├── services/                # Business logic
│   │   ├── data_store.py        # In-memory layer storage
│   │   ├── transform_engine.py  # Transformation logic
│   │   └── projection_engine.py # PCA, t-SNE computation
│   ├── routers/                 # API endpoints
│   │   ├── layers.py
│   │   ├── transformations.py
│   │   ├── projections.py
│   │   └── scenarios.py
│   └── fixtures.py              # Test data loaders
├── frontend/                    # React TypeScript frontend
│   ├── src/
│   │   ├── App.tsx              # Main application component
│   │   ├── components/
│   │   │   ├── Viewport.tsx     # Plotly scatter plot
│   │   │   ├── GraphEditor.tsx  # ReactFlow DAG editor
│   │   │   ├── ConfigPanel.tsx  # Node configuration UI
│   │   │   └── ViewportGrid.tsx # Multi-viewport layout
│   │   ├── stores/
│   │   │   └── appStore.ts      # Zustand state management
│   │   ├── api/
│   │   │   └── client.ts        # REST API client
│   │   └── types/
│   │       └── index.ts         # TypeScript interfaces
│   ├── vite.config.ts           # Vite configuration with API proxy
│   └── package.json
├── scenarios/                   # Saved scenario files
├── docs/                        # Documentation (Sphinx)
└── pixi.toml                    # Pixi environment configuration
API Reference
Layers
| Endpoint | Method | Description |
|---|---|---|
| /layers | GET | List all layers |
| /layers/{id} | GET | Get layer by ID |
| /layers/{id} | PATCH | Update layer (name, columns) |
| /layers/{id}/points | GET | Get points in a layer |
| /layers/upload | POST | Upload data file (CSV, NPY, NPZ) |
| /layers/synthetic | POST | Generate synthetic dataset |
| /layers/sklearn/{name} | POST | Load sklearn dataset |
Projections
| Endpoint | Method | Description |
|---|---|---|
| /projections | GET | List all projections |
| /projections | POST | Create projection (PCA, t-SNE) |
| /projections/{id} | PATCH | Update projection parameters |
| /projections/{id}/coordinates | GET | Get 2D coordinates |
Transformations
| Endpoint | Method | Description |
|---|---|---|
| /transformations | GET | List all transformations |
| /transformations | POST | Create transformation |
| /transformations/{id} | PATCH | Update transformation parameters |
Scenarios
| Endpoint | Method | Description |
|---|---|---|
| /scenarios | GET | List available scenarios |
| /scenarios/save | POST | Save current state |
| /scenarios/load/{name} | POST | Load saved scenario |
| /scenarios/upload | POST | Upload scenario files |
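The endpoints above can be exercised from any HTTP client. The sketch below uses only the Python standard library and assumes the backend is reachable at http://localhost:8000 with the paths exactly as listed; whether the server mounts them under a prefix, and what the request and response bodies look like, is not stated here, so the example sticks to endpoints that take their argument in the path. The helper names `build_request` and `call` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: backend started on port 8000


def build_request(method, path, payload=None):
    """Build (but do not send) a request for a VectorScope endpoint."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(method, path, payload=None):
    """Send the request and decode the JSON response; needs a running backend."""
    with urllib.request.urlopen(build_request(method, path, payload)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building requests does not touch the network, so this part runs offline.
load_iris = build_request("POST", "/layers/sklearn/iris")
list_layers = build_request("GET", "/layers")
print(load_iris.get_method(), load_iris.full_url)
print(list_layers.get_method(), list_layers.full_url)
```

With the backend running, `call("POST", "/layers/sklearn/iris")` followed by `call("GET", "/layers")` would load a built-in dataset and list the resulting layers.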
Extending VectorScope
Adding a New Transformation Type
- Define the transformation type in backend/models/transformation.py:

    class TransformationType(str, Enum):
        SCALING = "scaling"
        ROTATION = "rotation"
        PCA = "pca"
        # Add your new type:
        MY_TRANSFORM = "my_transform"

- Implement the transformation in backend/services/transform_engine.py:

    def _apply_my_transform(self, vectors: np.ndarray, params: dict) -> np.ndarray:
        # Your transformation logic goes here; this placeholder returns a copy of the input.
        transformed_vectors = vectors.copy()
        return transformed_vectors

- Add to the apply method:

    def apply_transformation(self, transformation, vectors):
        # Use the member name exactly as declared in the enum (MY_TRANSFORM).
        if transformation.type == TransformationType.MY_TRANSFORM:
            return self._apply_my_transform(vectors, transformation.parameters)

- Update the frontend in ConfigPanel.tsx to show UI controls for your transformation.
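The steps above follow an enum-plus-dispatch pattern. As a self-contained illustration, the sketch below wires a hypothetical CENTERING transformation into a stand-in engine class. The class `TransformEngine`, its method signatures, and the `factors` parameter are assumptions made for the example and are not taken from the VectorScope source.

```python
from enum import Enum

import numpy as np


class TransformationType(str, Enum):
    SCALING = "scaling"
    CENTERING = "centering"  # hypothetical new type


class TransformEngine:
    """Stand-in for the real engine; only the dispatch pattern is shown."""

    def apply_transformation(self, type_, vectors, params):
        if type_ == TransformationType.SCALING:
            return self._apply_scaling(vectors, params)
        if type_ == TransformationType.CENTERING:
            return self._apply_centering(vectors, params)
        raise ValueError(f"unsupported transformation: {type_}")

    def _apply_scaling(self, vectors, params):
        # Per-axis scale factors broadcast across all rows.
        return vectors * np.asarray(params["factors"], dtype=float)

    def _apply_centering(self, vectors, params):
        # Subtract the per-dimension mean so the layer is centered at the origin.
        return vectors - vectors.mean(axis=0)


engine = TransformEngine()
data = np.array([[1.0, 2.0], [3.0, 6.0]])
centered = engine.apply_transformation(TransformationType.CENTERING, data, {})
print(centered.tolist())  # [[-1.0, -2.0], [1.0, 2.0]]
```

Raising on an unknown type, rather than returning the input unchanged, makes a forgotten dispatch branch visible immediately.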
Adding a New Projection Type
- Define the projection type in backend/models/projection.py:

    class ProjectionType(str, Enum):
        pca = "pca"
        tsne = "tsne"
        # Add your new type:
        my_projection = "my_projection"

- Implement the projection in backend/services/projection_engine.py:

    def _compute_my_projection(self, vectors: np.ndarray, params: dict) -> np.ndarray:
        # Your projection logic goes here (should return 2D or 3D coordinates);
        # this placeholder keeps the first two dimensions.
        coordinates = vectors[:, :2]
        return coordinates

- Add to the compute method:

    def _compute_projection(self, projection):
        # `vectors` is assumed to have been loaded from the projection's layer earlier in this method.
        if projection.type == ProjectionType.my_projection:
            coords = self._compute_my_projection(vectors, projection.parameters)

- Update the frontend to show your projection type in dropdowns and add any parameter controls.
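For a concrete idea of what a custom projection function might compute, the sketch below implements a seeded Gaussian random projection with NumPy. It is a stand-alone function rather than an engine method, and the parameter names `n_components` and `seed` are assumptions chosen for the example.

```python
import numpy as np


def compute_random_projection(vectors, params):
    """Project n-D vectors to 2D or 3D with a seeded Gaussian random matrix."""
    n_components = params.get("n_components", 2)
    if n_components not in (2, 3):
        raise ValueError("n_components must be 2 or 3")
    rng = np.random.default_rng(params.get("seed", 0))
    # Scale by 1/sqrt(k) so squared distances are preserved in expectation.
    matrix = rng.normal(size=(vectors.shape[1], n_components)) / np.sqrt(n_components)
    return vectors @ matrix


vectors = np.arange(20, dtype=float).reshape(5, 4)
coords = compute_random_projection(vectors, {"n_components": 2, "seed": 42})
print(coords.shape)  # (5, 2)
```

Fixing the seed keeps the layout stable between calls, which matters when the same projection is shown in several linked viewports.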
Tech Stack
- Backend: Python 3.11+, FastAPI, NumPy, scikit-learn, Pydantic
- Frontend: React 18, TypeScript, Plotly.js, ReactFlow, Zustand
- Build Tools: Vite, Pixi (Python environment management)
Development
Running Tests
# Backend tests
pixi run test-backend
# Frontend tests
cd frontend && npm test
Code Style
- Python: Follow PEP 8, use type hints
- TypeScript: Use strict mode, prefer interfaces over types
License
MIT License - see LICENSE for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests
- Submit a pull request
Credits
Conceptualized by: Kyle Cranmer
Implemented by: Claude Code (Anthropic's AI coding assistant)
Acknowledgments
- Plotly.js for interactive visualizations
- ReactFlow for the graph editor
- scikit-learn for dimensionality reduction algorithms