
A streamlined and fast implementation of parametric UMAP using PyTorch and FAISS

Parametric UMAP

A PyTorch implementation of Parametric UMAP (Uniform Manifold Approximation and Projection) for learning low-dimensional parametric embeddings of high-dimensional data.

Install

pip install parametric-umap

Or install the latest version from the repository:

pip install git+https://github.com/fcarli/parametric_umap.git

GPU acceleration

The pip install pulls the default PyTorch build from PyPI. If you need a specific CUDA version, install PyTorch first following the official instructions, then install this package.
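For example, to pair this package with a CUDA-specific PyTorch build (index URL shown for CUDA 12.6; pick the one matching your driver from the official PyTorch instructions):

```shell
# Install a CUDA-specific PyTorch build first, from PyTorch's own index...
pip install torch --index-url https://download.pytorch.org/whl/cu126

# ...then install parametric-umap; pip reuses the already-installed torch.
pip install parametric-umap
```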

For developers using uv, CUDA version selection is built in via extras:

# macOS / Windows (CPU automatic, no extra needed)
uv sync --extra dev --extra test --extra examples

# Linux — pick your CUDA version
uv sync --extra dev --extra test --extra examples --extra cu126

# Linux — CPU only
uv sync --extra dev --extra test --extra examples --extra cpu

Available CUDA extras: cu118, cu121, cu124, cu126, cu128.

Apple Silicon Macs are automatically detected and use the MPS backend — no extra configuration needed. You can also pass device='mps' explicitly.

Overview

Parametric UMAP (Sainburg, McInnes, and Gentner, 2021) extends the original UMAP algorithm by learning a neural network that maps new data points into the low-dimensional space without rerunning the full optimization. This (unofficial) implementation provides a flexible and efficient way to perform parametric dimensionality reduction by leveraging PyTorch and FAISS.

Features

  • Neural network-based parametric mapping
  • Efficient nearest neighbor computation using FAISS
  • Sparse matrix operations for memory efficiency
  • GPU acceleration support
  • Model saving and loading capabilities
  • Correlation loss term to preserve distance relationships
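The correlation loss term can be sketched as follows. This is a hypothetical NumPy illustration, not the package's actual code: it penalizes low Pearson correlation between pairwise distances in the input space and in the embedding space, which encourages the embedding to preserve distance relationships.

```python
import numpy as np

def correlation_loss(x, z, eps=1e-8):
    """Hypothetical sketch: 1 minus the Pearson correlation between
    pairwise distances in the input space (x) and embedding space (z)."""
    def pdists(m):
        # condensed pairwise Euclidean distances (upper triangle)
        sq = np.sum(m ** 2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * m @ m.T
        iu = np.triu_indices(len(m), k=1)
        return np.sqrt(np.maximum(d2[iu], 0.0))

    dx = pdists(x) - pdists(x).mean()   # centered input-space distances
    dz = pdists(z) - pdists(z).mean()   # centered embedding-space distances
    corr = (dx * dz).sum() / (np.linalg.norm(dx) * np.linalg.norm(dz) + eps)
    return 1.0 - corr                   # 0 when distances correlate perfectly

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 10))
loss = correlation_loss(x, 2.0 * x)     # scaled copy: loss is near 0
```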

Quick Start

from parametric_umap import ParametricUMAP
from sklearn.datasets import make_swiss_roll
import numpy as np

# Create sample data
n_samples = 1000
X, color = make_swiss_roll(n_samples=n_samples, random_state=42)

# Initialize and fit the model (auto-detects CUDA / MPS / CPU)
pumap = ParametricUMAP(
    n_components=2,
    hidden_dim=128,
    n_layers=3,
    n_epochs=10,
)

# Fit and transform the data
embeddings = pumap.fit_transform(X)

# Transform new data
X_new = np.random.rand(100, 3)
new_embeddings = pumap.transform(X_new)

You can also specify the device explicitly:

pumap = ParametricUMAP(device='cuda:0')   # specific CUDA GPU
pumap = ParametricUMAP(device='mps')      # Apple Silicon GPU
pumap = ParametricUMAP(device='cpu')      # force CPU

Note that by default the data is moved to the specified device before training to speed up the training loop. However, if the entire dataset does not fit in GPU memory, you can override this behavior by setting the low_memory argument to True:

embeddings = pumap.fit_transform(X, low_memory=True)

Similarly, transform() sends the entire input to the device in a single forward pass. For very large inputs that don't fit in memory, pass batch_size to process in chunks:

new_embeddings = pumap.transform(X_new, batch_size=4096)
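Conceptually, batch_size slices the input and concatenates the per-chunk outputs, as in this illustrative sketch (transform_fn stands in for the model's forward pass; this is not the package's actual code):

```python
import numpy as np

def transform_in_chunks(transform_fn, X, batch_size=4096):
    """Illustrative helper: apply transform_fn to X in chunks so only one
    chunk at a time needs to fit in device memory."""
    parts = [transform_fn(X[i:i + batch_size])
             for i in range(0, len(X), batch_size)]
    return np.concatenate(parts, axis=0)

X = np.random.rand(10_000, 3)
# Stand-in "model" that keeps the first two coordinates:
emb = transform_in_chunks(lambda x: x[:, :2], X, batch_size=4096)
print(emb.shape)  # (10000, 2)
```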

Key Parameters

UMAP parameters

  • n_neighbors: Number of nearest neighbors for the UMAP knn graph (default: 15)
  • a: Parameter for scaling distances between embedded points (default: 0.1)
  • b: Parameter controlling the sharpness of the similarity curve's transition between attraction and repulsion (default: 1.0)
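The roles of a and b come from UMAP's low-dimensional similarity curve, 1 / (1 + a * d^(2b)), where d is the distance between two embedded points. A quick sketch using this package's stated defaults:

```python
import numpy as np

def low_dim_similarity(d, a=0.1, b=1.0):
    """UMAP's low-dimensional similarity curve, 1 / (1 + a * d**(2b)).
    Larger a shrinks similarity at a given distance; larger b sharpens
    the transition between attraction and repulsion."""
    return 1.0 / (1.0 + a * d ** (2 * b))

d = np.linspace(0.0, 5.0, 6)
sims = low_dim_similarity(d)            # defaults a=0.1, b=1.0
sims_sharp = low_dim_similarity(d, b=2.0)  # larger b: sharper cutoff
```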

Parametric model

  • device: Compute device — auto-detected by default (CUDA > MPS > CPU). Pass a specific device like 'cuda:1' or 'mps' to override
  • n_components: Dimension of the output embedding (default: 2)
  • hidden_dim: Dimension of hidden layers in the MLP (default: 1024)
  • n_layers: Number of hidden layers (default: 3)
  • correlation_weight: Weight of the correlation loss term (default: 0.1)
  • learning_rate: Learning rate for optimization (default: 1e-4)
  • n_epochs: Number of training epochs (default: 10)
  • batch_size: Training batch size (default: 32)
  • use_batchnorm: Whether to use batch normalization in the embedding MLP (default: False)
  • use_dropout: Whether to use dropout in the embedding MLP (default: False)
  • compile_model: Apply torch.compile to the MLP for faster training on PyTorch 2.x (default: False). Adds a one-time compilation delay on the first forward pass

Development

See CONTRIBUTING.md for development setup and guidelines.

make install    # Install all dependencies (CPU torch)
make test       # Run tests
make lint       # Lint checks
make format     # Format code

Citation

If you use this package in your research, please cite the original Parametric UMAP paper:

@article{sainburg2021parametric,
  title={Parametric UMAP Embeddings for Representation and Semisupervised Learning},
  author={Sainburg, Tim and McInnes, Leland and Gentner, Timothy Q},
  journal={Neural Computation},
  volume={33},
  number={11},
  pages={2881--2907},
  year={2021},
  publisher={MIT Press}
}

License

BSD License

Download files

Download the file for your platform.

Source Distribution

parametric_umap-0.2.0.tar.gz (520.5 kB)

Uploaded Source

Built Distribution

parametric_umap-0.2.0-py3-none-any.whl (20.6 kB)

Uploaded Python 3

File details

Details for the file parametric_umap-0.2.0.tar.gz.

File metadata

  • Download URL: parametric_umap-0.2.0.tar.gz
  • Size: 520.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for parametric_umap-0.2.0.tar.gz
  • SHA256: a6c9d81a1af7222235ece054cee868fd1fae559fb01e045370050cd6a0f5c165
  • MD5: 1859d407e38b325aa3053074a90826b8
  • BLAKE2b-256: 7ffa9a367309003a17822bc8c59e283674d1ece3a855bf2b3502829386c030ca


Provenance

The following attestation bundles were made for parametric_umap-0.2.0.tar.gz:

Publisher: release.yml on fcarli/parametric_umap

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file parametric_umap-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for parametric_umap-0.2.0-py3-none-any.whl
  • SHA256: 1949f234eafd45e9dcb3099414560e87193aed45824e816e3ffd0d17e6c2c297
  • MD5: 9858a0ec1f45ef6f7e83a24227b28a1b
  • BLAKE2b-256: 1b07b0c47d9aee62d1cd84b02935eff20c3c890da63ac678b000afa0e7dc0690


Provenance

The following attestation bundles were made for parametric_umap-0.2.0-py3-none-any.whl:

Publisher: release.yml on fcarli/parametric_umap

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
