Interactive vector embedding visualization and transformation system

VectorScope

Interactive web-based system for exploring, transforming, and visualizing vector embeddings.

Overview

VectorScope is a visualization tool designed for exploring high-dimensional vector embeddings. It provides an interactive interface for:

  • Loading and managing multiple embedding datasets
  • Applying transformations (scaling, rotation, PCA-based affine)
  • Creating projections to 2D/3D space (PCA, t-SNE, UMAP)
  • Selecting and tracking specific points across views
  • Building visual transformation pipelines via a graph editor

Features

  • Multiple Data Sources: Load data from CSV, NumPy files (.npy, .npz), or built-in sklearn datasets
  • Column Configuration: For tabular data, choose which columns are features vs labels
  • Visual Transformation Graph: Build data pipelines by connecting layers, transformations, and views
  • Interactive Projections: Configure PCA components, t-SNE perplexity, UMAP parameters, and more
  • Linked Viewports: Synchronized selection across multiple views
  • Session Persistence: Save and reload entire workspaces

Roadmap

✅ Completed Features

  • Core Infrastructure

    • FastAPI backend with REST API
    • React + TypeScript frontend with Zustand state management
    • Vite dev server with API proxy
    • Session save/load functionality
  • Data Loading

    • CSV file upload with column configuration
    • NumPy file support (.npy, .npz)
    • Built-in sklearn datasets (iris, digits, wine, etc.)
    • Synthetic data generation (clustered Gaussian)
  • Projections

    • PCA with configurable component selection
    • t-SNE with perplexity, learning rate, iterations parameters
    • UMAP with n_neighbors, min_dist, spread parameters
    • Direct axes view (raw dimension values)
    • Histogram view (per-dimension distribution)
    • Box plot view (per-dimension by class)
    • Corner plot (all axis pairs + diagonal histograms)
  • Transformations

    • Scaling transformation with per-axis sliders (linked/unlinked modes)
    • Rotation transformation with selectable rotation plane (any dimension pair)
    • PCA-based affine transformation (with explained variance display)
  • Visualization

    • Multiple synchronized viewports with Plotly (2D and 3D scatter)
    • Linked selection across views
    • View sets (save/load viewport configurations)
    • Graph editor for transformation pipelines
    • View editor with header bar layout (layer/view selection, add view)
    • Configurable axis ranges (X, Y, Z for 3D views)
  • Annotations & Selections

    • Interactive box selection (drag to select points)
    • Additive selection (Shift + box select to add more points)
    • Point toggling (Shift + click to add/remove individual points)
    • Click-to-clear (click empty area to clear selection)
    • Named selections (save selections for reuse)
    • Selection management (apply, delete saved selections)
    • Virtual points / Barycenters (create centroids from selections)
    • Named barycenters with custom labels
    • Auto-generate selections from class labels
    • Auto-generate barycenters from class labels
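
As a small worked example of the barycenter feature listed above, the centroid of a selection is the per-dimension mean of the selected vectors. This is a minimal sketch that uses plain Python lists in place of NumPy arrays; the function name `barycenter` and the sample data are illustrative assumptions, not VectorScope's actual code.

```python
from statistics import fmean


def barycenter(vectors: list[list[float]], selected: list[int]) -> list[float]:
    """Return the per-dimension mean of the vectors at the selected indices."""
    if not selected:
        raise ValueError("selection is empty")
    chosen = [vectors[i] for i in selected]
    # zip(*chosen) iterates over dimensions (columns) instead of points (rows).
    return [fmean(column) for column in zip(*chosen)]


# Hypothetical data: three points in 3-D, with the first two selected.
points = [[0.0, 0.0, 1.0], [2.0, 4.0, 1.0], [10.0, 10.0, 10.0]]
center = barycenter(points, selected=[0, 1])
print(center)  # [1.0, 2.0, 1.0]
```

The resulting vector can be treated like any other point, which is what makes a barycenter usable as a "virtual point" in a view.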

🔄 In Progress

  • Custom axis projections (define axes from point pairs)
  • Instance tracking panel

📋 Planned Features

  • Transformation Coefficient Visualization

    • Visualize how output axes relate to input axes
    • Stacked bar chart showing normalized coefficients per output dimension
    • Color-coded by input dimension contribution
    • Helps interpret PCA components and affine transforms
  • Phase 5: Polish

    • Keyboard shortcuts
    • Improved error handling & loading states
  • Phase 6: Onboarding & UX

    • Interactive tutorial (React Joyride or Shepherd.js)
    • Step-by-step guided tour for first-time users
    • Contextual help tooltips
    • In-app help panel
  • Future Ideas

    • Export visualizations as images
    • Collaborative sessions
    • Plugin system for custom transformations

Quick Start

Option 1: Install from PyPI

Install the backend package:

pip install vectorscope

Then run the server:

uvicorn backend.main:app --port 8000

Note: The PyPI package includes the backend API. For the full interactive UI, use the development installation below.

Option 2: Development Installation

Prerequisites

  • Pixi package manager (handles Python environment)
  • Node.js 18+ (for frontend)

Installation

# Clone the repository
git clone https://github.com/cranmer/vectorscope.git
cd vectorscope

# Install Python dependencies
pixi install

# Install frontend dependencies
cd frontend && npm install && cd ..

Running

Start the backend and frontend:

# Start both backend and frontend
pixi run dev

# Or start them separately:
# Terminal 1: Backend (port 8000)
pixi run backend

# Terminal 2: Frontend (port 5173)
cd frontend && npm run dev

Open http://localhost:5173 in your browser.

Architecture

┌────────────────────────────────────────────────────────────┐
│                      Frontend (React)                      │
├─────────────┬─────────────┬─────────────┬──────────────────┤
│  Viewports  │ Graph Editor│   Config    │  State (Zustand) │
│  (Plotly)   │ (ReactFlow) │   Panels    │                  │
└──────┬──────┴──────┬──────┴──────┬──────┴────────┬─────────┘
       │             │             │               │
       └─────────────┴──────┬──────┴───────────────┘
                            │ REST API (Vite Proxy)
                            ▼
       ┌────────────────────────────────────────────┐
       │           Backend (FastAPI)                │
       ├─────────────┬─────────────┬────────────────┤
       │ Data Store  │ Transform   │ Projection     │
       │ (Layers)    │ Engine      │ Engine         │
       └─────────────┴─────────────┴────────────────┘

Core Concepts

  • Layer: A collection of points with n-dimensional vectors. Source layers contain original data; derived layers are created by transformations.
  • Point: A single data point with a vector, optional label, and metadata.
  • Transformation: An operation that maps one layer to another (scaling, rotation, affine).
  • Projection: A dimension reduction from n-D to 2D/3D for visualization (PCA, t-SNE, UMAP).
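
The relationship between these concepts can be sketched in a few lines. This is only an illustration under stated assumptions: it uses standard-library dataclasses rather than the backend's Pydantic models, and the field names (`source_layer`, `metadata`) and helper functions (`scale_layer`, `project_axes`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Point:
    vector: list[float]
    label: Optional[str] = None
    metadata: dict = field(default_factory=dict)


@dataclass
class Layer:
    name: str
    points: list[Point]
    source_layer: Optional[str] = None  # None marks a source layer


def scale_layer(layer: Layer, factors: list[float]) -> Layer:
    """Transformation: map a layer to a derived layer by per-axis scaling."""
    scaled = []
    for p in layer.points:
        if len(p.vector) != len(factors):
            raise ValueError("one scale factor is required per dimension")
        vector = [x * f for x, f in zip(p.vector, factors)]
        scaled.append(Point(vector, p.label, dict(p.metadata)))
    return Layer(f"{layer.name}_scaled", scaled, source_layer=layer.name)


def project_axes(layer: Layer, axes: tuple[int, int]) -> list[tuple[float, float]]:
    """Projection: reduce n-D points to 2D by keeping two raw dimensions."""
    i, j = axes
    return [(p.vector[i], p.vector[j]) for p in layer.points]


source = Layer("demo", [Point([1.0, 2.0, 3.0], "a"), Point([4.0, 5.0, 6.0], "b")])
derived = scale_layer(source, [2.0, 1.0, 0.5])
coords = project_axes(derived, (0, 2))
print(coords)  # [(2.0, 1.5), (8.0, 3.0)]
```

A transformation returns a new layer with the same dimensionality, while a projection returns only display coordinates and leaves the layer untouched.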

Project Structure

vectorscope/
├── backend/               # FastAPI Python backend
│   ├── main.py           # Application entry point
│   ├── models/           # Pydantic data models
│   │   ├── layer.py      # Layer, Point, PointData
│   │   ├── transformation.py
│   │   └── projection.py
│   ├── services/         # Business logic
│   │   ├── data_store.py       # In-memory layer storage
│   │   ├── transform_engine.py # Transformation logic
│   │   └── projection_engine.py # PCA, t-SNE computation
│   ├── routers/          # API endpoints
│   │   ├── layers.py
│   │   ├── transformations.py
│   │   ├── projections.py
│   │   └── scenarios.py
│   └── fixtures.py       # Test data loaders
├── frontend/             # React TypeScript frontend
│   ├── src/
│   │   ├── App.tsx       # Main application component
│   │   ├── components/
│   │   │   ├── Viewport.tsx      # Plotly scatter plot
│   │   │   ├── GraphEditor.tsx   # ReactFlow DAG editor
│   │   │   ├── ConfigPanel.tsx   # Node configuration UI
│   │   │   └── ViewportGrid.tsx  # Multi-viewport layout
│   │   ├── stores/
│   │   │   └── appStore.ts       # Zustand state management
│   │   ├── api/
│   │   │   └── client.ts         # REST API client
│   │   └── types/
│   │       └── index.ts          # TypeScript interfaces
│   ├── vite.config.ts    # Vite configuration with API proxy
│   └── package.json
├── scenarios/            # Saved scenario files
├── docs/                 # Documentation (Sphinx)
└── pixi.toml            # Pixi environment configuration

API Reference

Layers

Endpoint                 Method   Description
/layers                  GET      List all layers
/layers/{id}             GET      Get layer by ID
/layers/{id}             PATCH    Update layer (name, columns)
/layers/{id}/points      GET      Get points in a layer
/layers/upload           POST     Upload data file (CSV, NPY, NPZ)
/layers/synthetic        POST     Generate synthetic dataset
/layers/sklearn/{name}   POST     Load sklearn dataset
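
The layer endpoints can be exercised from Python with the standard library alone. The sketch below only constructs the requests; it assumes the backend is reachable at `http://localhost:8000` (the port used in Quick Start), that the routes are mounted at the root path exactly as listed, and that responses are JSON. The helper names `build_request` and `call` are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumption: backend started as in Quick Start, with no extra path prefix.
BASE_URL = "http://localhost:8000"


def build_request(method: str, path: str) -> Request:
    """Create (but do not send) a request for one of the documented endpoints."""
    return Request(
        f"{BASE_URL}{path}",
        method=method,
        headers={"Accept": "application/json"},
    )


def call(method: str, path: str):
    """Send the request and decode the body, assuming a JSON response."""
    with urlopen(build_request(method, path)) as response:
        return json.loads(response.read().decode("utf-8"))


# Build requests for two endpoints from the table; nothing is sent here.
load_iris = build_request("POST", "/layers/sklearn/iris")
list_layers = build_request("GET", "/layers")
print(load_iris.get_method(), load_iris.full_url)
print(list_layers.get_method(), list_layers.full_url)
```

With a running backend, `call("POST", "/layers/sklearn/iris")` followed by `call("GET", "/layers")` would be the natural first check; the shape of the returned JSON is not documented here, so inspect it before relying on specific fields.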

Projections

Endpoint                        Method   Description
/projections                    GET      List all projections
/projections                    POST     Create projection (PCA, t-SNE)
/projections/{id}               PATCH    Update projection parameters
/projections/{id}/coordinates   GET      Get 2D coordinates

Transformations

Endpoint                Method   Description
/transformations        GET      List all transformations
/transformations        POST     Create transformation
/transformations/{id}   PATCH    Update transformation parameters

Scenarios

Endpoint                 Method   Description
/scenarios               GET      List available scenarios
/scenarios/save          POST     Save current state
/scenarios/load/{name}   POST     Load saved scenario
/scenarios/upload        POST     Upload scenario files

Extending VectorScope

Adding a New Transformation Type

  1. Define the transformation type in backend/models/transformation.py:

    class TransformationType(str, Enum):
        SCALING = "scaling"
        ROTATION = "rotation"
        PCA = "pca"
        # Add your new type:
        MY_TRANSFORM = "my_transform"
    
  2. Implement the transformation in backend/services/transform_engine.py:

    def _apply_my_transform(self, vectors: np.ndarray, params: dict) -> np.ndarray:
        # Your transformation logic
        return transformed_vectors
    
  3. Add to the apply method:

    def apply_transformation(self, transformation, vectors):
        if transformation.type == TransformationType.MY_TRANSFORM:
            return self._apply_my_transform(vectors, transformation.parameters)
    
  4. Update the frontend in ConfigPanel.tsx to show UI controls for your transformation.
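
To make the backend steps concrete, here is a self-contained sketch that adds a hypothetical mean-centering transformation. It is not the project's actual engine: plain lists stand in for NumPy arrays, the `CENTERING` member and the handler names are invented for illustration, and a dictionary lookup replaces the if-chain so that each new type needs only one extra entry.

```python
from enum import Enum
from statistics import fmean


class TransformationType(str, Enum):
    SCALING = "scaling"
    CENTERING = "centering"  # hypothetical new type


def apply_scaling(vectors: list[list[float]], params: dict) -> list[list[float]]:
    """Multiply every coordinate by a single factor (default 1.0)."""
    factor = params.get("factor", 1.0)
    return [[x * factor for x in row] for row in vectors]


def apply_centering(vectors: list[list[float]], params: dict) -> list[list[float]]:
    """Subtract the per-dimension mean so the data is centered at the origin."""
    means = [fmean(column) for column in zip(*vectors)]
    return [[x - m for x, m in zip(row, means)] for row in vectors]


# Dispatch table: one entry per transformation type.
HANDLERS = {
    TransformationType.SCALING: apply_scaling,
    TransformationType.CENTERING: apply_centering,
}


def apply_transformation(kind: str, vectors: list[list[float]], params: dict):
    """Look up the handler for `kind` and apply it, rejecting unknown types."""
    try:
        handler = HANDLERS[TransformationType(kind)]
    except (ValueError, KeyError):
        raise ValueError(f"unsupported transformation type: {kind!r}") from None
    return handler(vectors, params)


centered = apply_transformation("centering", [[1.0, 10.0], [3.0, 30.0]], {})
print(centered)  # [[-1.0, -10.0], [1.0, 10.0]]
```

Raising on an unknown type, instead of silently returning `None`, surfaces a missing branch as soon as a new enum member is used.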

Adding a New Projection Type

  1. Define the projection type in backend/models/projection.py:

    class ProjectionType(str, Enum):
        pca = "pca"
        tsne = "tsne"
        # Add your new type:
        my_projection = "my_projection"
    
  2. Implement the projection in backend/services/projection_engine.py:

    def _compute_my_projection(self, vectors: np.ndarray, params: dict) -> np.ndarray:
        # Your projection logic (should return 2D or 3D coordinates)
        return coordinates
    
  3. Add to the compute method:

    def _compute_projection(self, projection):
        if projection.type == ProjectionType.my_projection:
            coords = self._compute_my_projection(vectors, projection.parameters)
    
  4. Update the frontend to show your projection type in dropdowns and add any parameter controls.
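
As one possible body for a new projection, the sketch below implements a Gaussian random projection, which maps n-D vectors to 2D or 3D by taking dot products with random directions. It is an illustrative stand-in written with the standard library only; the function name, the `seed` parameter, and the use of lists instead of NumPy arrays are assumptions.

```python
import random


def random_projection(
    vectors: list[list[float]], n_components: int = 2, seed: int = 0
) -> list[list[float]]:
    """Project n-D vectors to 2D or 3D using random Gaussian directions."""
    if n_components not in (2, 3):
        raise ValueError("n_components must be 2 or 3")
    if not vectors:
        return []
    n_dims = len(vectors[0])
    rng = random.Random(seed)  # seeded so repeated calls give the same layout
    scale = 1.0 / n_components ** 0.5
    # One random direction in the input space per output axis.
    directions = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_dims)]
        for _ in range(n_components)
    ]
    # Each output coordinate is the dot product of the point with a direction.
    return [
        [sum(x * d for x, d in zip(row, direction)) for direction in directions]
        for row in vectors
    ]


data = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 3.0, 0.0], [1.0, 1.0, 1.0, 1.0]]
coords = random_projection(data, n_components=2, seed=42)
print(len(coords), len(coords[0]))  # 3 2
```

Fixing the seed keeps the coordinates stable between recomputations, which matters when the same projection feeds several linked views.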

Tech Stack

  • Backend: Python 3.11+, FastAPI, NumPy, scikit-learn, Pydantic
  • Frontend: React 18, TypeScript, Plotly.js, ReactFlow, Zustand
  • Build Tools: Vite, Pixi (Python environment management)

Development

Running Tests

# Backend tests
pixi run test-backend

# Frontend tests
cd frontend && npm test

Code Style

  • Python: Follow PEP 8, use type hints
  • TypeScript: Use strict mode, prefer interfaces over types

License

MIT License - see LICENSE for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests
  5. Submit a pull request

Credits

Conceptualized by: Kyle Cranmer

Implemented by: Claude Code (Anthropic's AI coding assistant)

Acknowledgments
