PyTorch and RAPIDS (cuVS/cuML) accelerated dimensionality reduction

These details have been verified by PyPI

Owner

The University of Austin

Maintainers

Sasha_Kolpakov

Project description

DiRe Rapids

GPU-accelerated implementation of DiRe using PyTorch and optionally NVIDIA RAPIDS for massive-scale datasets.

What is DiRe?

DiRe (Dimensionality Reduction) is a dimensionality reduction algorithm based on force-directed graph layout. Unlike methods that focus solely on local neighborhood preservation, DiRe preserves both local and global structure of the data manifold, with theoretical guarantees for homological stability -- the topology (connected components, loops) of the original point cloud is faithfully reflected in the low-dimensional embedding. See the paper on arXiv for details.
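
To make "force-directed graph layout" concrete, here is a toy sketch of one layout update in 2D. It is not DiRe's actual force law: the function name layout_step, the linear attraction, the 1/(1 + d^2) repulsion weight, and the parameters lr and repulsion are all assumptions chosen for illustration. Points joined by a neighbor edge are pulled together while every pair is pushed apart.

```python
import math


def layout_step(pos, edges, lr=0.1, repulsion=0.05):
    """Apply one update of a toy 2D force-directed layout (illustrative only)."""
    force = [[0.0, 0.0] for _ in pos]

    # Attraction: each graph edge pulls its two endpoints toward each other.
    for i, j in edges:
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        force[i][0] += dx
        force[i][1] += dy
        force[j][0] -= dx
        force[j][1] -= dy

    # Repulsion: every pair pushes apart, with a weight that decays with distance.
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            w = repulsion / (1.0 + dx * dx + dy * dy)
            force[i][0] -= w * dx
            force[i][1] -= w * dy
            force[j][0] += w * dx
            force[j][1] += w * dy

    # Move every point a small step along its net force.
    return [[p[0] + lr * f[0], p[1] + lr * f[1]] for p, f in zip(pos, force)]


# Three points; only points 0 and 1 share a (hypothetical) neighbor edge.
pos = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
edges = [(0, 1)]
for _ in range(20):
    pos = layout_step(pos, edges)

d01 = math.dist(pos[0], pos[1])  # linked pair
d02 = math.dist(pos[0], pos[2])  # unlinked pair
print(f"linked: {d01:.3f}, unlinked: {d02:.3f}")
```

After a few iterations the linked pair ends up much closer together than the unlinked pair, which is the basic mechanism a graph layout uses to turn neighbor relations into geometry.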

Performance

DiRe is 9--42x faster than UMAP on CPU while delivering competitive or better embedding quality (neighborhood preservation). On GPU it leverages torch.compile for kernel fusion, pushing throughput even further.

Dataset	N	D	DiRe (s)	UMAP (s)	Speedup
digits	5,620	64	1.3	11.9	9.2x
mnist_784	10,000	784	2.5	49.4	19.8x
Fashion-MNIST	10,000	784	2.3	46.6	20.3x
har	10,299	561	2.4	101.0	42.1x
covertype	20,000	54	3.9	43.9	11.3x

Benchmarks on OpenML datasets; times are wall-clock on a single CPU core.
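
The Speedup column is the ratio of the UMAP time to the DiRe time. A short check that recomputes it from the timings in the table above (the variable names are ours, not part of the package):

```python
# (dataset, DiRe seconds, UMAP seconds), copied from the table above
timings = [
    ("digits", 1.3, 11.9),
    ("mnist_784", 2.5, 49.4),
    ("Fashion-MNIST", 2.3, 46.6),
    ("har", 2.4, 101.0),
    ("covertype", 3.9, 43.9),
]

# Speedup = UMAP wall-clock time divided by DiRe wall-clock time.
speedups = {name: umap_s / dire_s for name, dire_s, umap_s in timings}

for name, ratio in speedups.items():
    print(f"{name:>14}: {ratio:.1f}x")
```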

At large scale (500K+ points), DiRe also beats cuML UMAP on embedding quality (neighborhood preservation), making it the best choice for both speed and fidelity on big data.

Topological Preservation

DiRe is designed to preserve the topology of the original data manifold. We measure this by computing Betti curves on the original point cloud and on the 2D embedding, then comparing them via DTW distance (lower = better preservation):

Dataset	Topology	DiRe DTW β₀	cuML DTW β₀	DiRe DTW β₁	cuML DTW β₁
circle (S¹)	β₀=1, β₁=1	56	76	29	47
torus (T²)	β₀=1, β₁=2	38	48	36	41
linked rings	β₀=2, β₁=2	66	65	17	42
5 blobs (R¹⁰)	β₀=5, β₁=0	33	37	378	370

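DiRe wins 6 of the 8 comparisons and loses the other two narrowly (linked rings β₀: 66 vs 65; 5 blobs β₁: 378 vs 370). On most of these benchmarks it preserves connected components (β₀) and loops (β₁) better than cuML UMAP, which is consistent with DiRe's theoretical guarantees for homological stability.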
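
For readers unfamiliar with DTW, the sketch below shows the textbook dynamic time warping recurrence applied to two Betti curves. It is a minimal illustration, assuming the absolute difference as local cost and no normalization; the package's own DTW computation may differ, and the curves used here are made-up values, not benchmark data.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i values of a
    # with the first j values of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat b[j-1]
                cost[i][j - 1],      # repeat a[i-1]
                cost[i - 1][j - 1],  # advance both curves
            )
    return cost[n][m]


# Hypothetical beta_0 curves (component counts along a filtration).
original = [5, 3, 2, 1, 1, 1]
embedding_a = [5, 4, 2, 1, 1, 1]  # close to the original curve
embedding_b = [5, 5, 4, 4, 3, 1]  # components merge much later

print(dtw_distance(original, embedding_a))
print(dtw_distance(original, embedding_b))
```

A curve that tracks the original closely gets a small distance, while one whose features appear or disappear at very different filtration values gets a larger one.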

Installation

From PyPI (stable)

# Basic installation (CPU + PyTorch)
python -m pip install "dire-rapids==0.3.1"

# With PyKeOps support for the optional PyKeOps k-NN engine
python -m pip install "dire-rapids[keops]==0.3.1"

# With CUDA CuPy support
python -m pip install "dire-rapids[cuda]==0.3.1"

From Repository (development)

git clone https://github.com/sashakolpakov/dire-rapids.git
cd dire-rapids

python -m pip install -e .            # CPU + PyTorch
python -m pip install -e ".[cuda]"    # With CUDA CuPy support
python -m pip install -e ".[keops]"   # With PyKeOps support
python -m pip install -e ".[dev]"     # Development (testing + dev tools)

With RAPIDS Support (Optional, GPU only)

Use a clean virtual environment. The rapids extra installs cuML/cuVS/cuDF from the NVIDIA index and PyTorch from the matching CUDA wheel index.

python -m pip install \
  --extra-index-url https://pypi.nvidia.com \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  "dire-rapids[rapids,keops]==0.3.1"

# From a clone:
python -m pip install \
  --extra-index-url https://pypi.nvidia.com \
  --extra-index-url https://download.pytorch.org/whl/cu128 \
  -e ".[rapids,keops]"
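
After installing, it can help to confirm which optional dependencies are importable in the current environment. The snippet below is a generic check using only the standard library; it is not part of dire-rapids, and the module list is our assumption about the import names of the packages mentioned above.

```python
from importlib.util import find_spec

# Import names of the optional packages mentioned in the installation notes.
OPTIONAL_MODULES = ["torch", "pykeops", "cupy", "cuml", "cuvs"]


def available_modules(names):
    """Map each top-level module name to True if it can be found for import."""
    return {name: find_spec(name) is not None for name in names}


status = available_modules(OPTIONAL_MODULES)
for name, ok in status.items():
    print(f"{name:>8}: {'found' if ok else 'missing'}")
```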

Quick Start

from dire_rapids import DiRePyTorch, DiRePyTorchMemoryEfficient
from sklearn.datasets import make_blobs

# Generate sample data
X, _ = make_blobs(n_samples=1_000, centers=12, n_features=10, random_state=42)

# Standard PyTorch backend
reducer = DiRePyTorch(n_components=2, n_neighbors=16, verbose=True)
X_embedded = reducer.fit_transform(X)

# Memory-efficient backend (recommended for large datasets)
reducer = DiRePyTorchMemoryEfficient(n_components=2, n_neighbors=16, verbose=True)
X_embedded = reducer.fit_transform(X)

Figure: 12 blobs with 100k points embedded in dimension 2.

Custom Distance Metrics

DiRe Rapids supports custom distance metrics for k-nearest neighbor computation while keeping layout forces Euclidean:

# L1 (Manhattan) distance for k-NN
reducer = DiRePyTorch(metric='(x - y).abs().sum(-1)', n_neighbors=32, knn_backend='pytorch')
X_embedded = reducer.fit_transform(X)

# Cosine distance via callable
def cosine_distance(x, y):
    # Norms are taken without keepdim so the denominator has the same shape
    # as the reduced dot product (x * y).sum(-1).
    return 1 - (x * y).sum(-1) / (x.norm(dim=-1) * y.norm(dim=-1) + 1e-8)

reducer = DiRePyTorch(metric=cosine_distance, n_neighbors=32, knn_backend='pytorch')
X_embedded = reducer.fit_transform(X)

Supported metric types: None / 'euclidean' / 'l2' (default), string tensor expressions, or callable functions taking (x, y) tensors.

Custom metric expressions and callables run on the PyTorch/PyKeOps k-NN paths. cuVS supports named native metrics only; forcing knn_backend='cuvs' with a custom expression/callable raises.
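
To see why the choice of k-NN metric matters, the following pure-Python sketch runs a brute-force nearest-neighbor search under two metrics on four made-up points. It is a conceptual illustration only: knn_indices and the list-based metric functions are hypothetical helpers, not the package's tensor-based implementation.

```python
import math


def l1_distance(x, y):
    """Manhattan distance between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_distance(x, y):
    """One minus the cosine similarity, with a small epsilon for stability."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norms + 1e-8)


def knn_indices(points, k, metric):
    """Brute-force k-NN: for each point, the indices of its k nearest others."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: metric(p, points[j]))
        result.append(others[:k])
    return result


points = [(1.0, 0.0), (2.0, 0.5), (0.0, 1.0), (5.0, 4.0)]
knn_l1 = knn_indices(points, 1, l1_distance)
knn_cos = knn_indices(points, 1, cosine_distance)
print("L1:    ", knn_l1)
print("cosine:", knn_cos)
```

Point 2 picks a different nearest neighbor under the two metrics, so the neighbor graph that drives the layout changes even though the layout forces themselves stay Euclidean.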

Available Backends

DiRePyTorch -- Standard PyTorch implementation with adaptive chunking
DiRePyTorchMemoryEfficient -- FP16 support, point-by-point force computation, optional PyKeOps lazy tensors for repulsion
DiReCuVS -- RAPIDS cuVS backend for massive-scale datasets

backend selects the DiRe implementation. knn_backend selects the k-nearest-neighbor engine used inside that implementation. Leave knn_backend='auto' to use the built-in heuristics, or set it explicitly to 'pytorch', 'pykeops', or 'cuvs'. Explicit k-NN backend requests are strict: unsupported engines raise instead of silently falling back.
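
The motivation for chunking and FP16 is easiest to see with some memory arithmetic. The numbers below are illustrative assumptions (a dataset of 100,000 points and a chunk of 1,000 rows); they are not the chunk sizes the package actually picks.

```python
def pairwise_matrix_bytes(n_rows, n_cols, bytes_per_value):
    """Bytes needed to hold an n_rows x n_cols matrix of pairwise values."""
    return n_rows * n_cols * bytes_per_value


n = 100_000    # hypothetical dataset size
chunk = 1_000  # hypothetical number of rows processed at once

full_fp32 = pairwise_matrix_bytes(n, n, 4)       # all pairs, 4-byte floats
full_fp16 = pairwise_matrix_bytes(n, n, 2)       # all pairs, 2-byte floats
chunk_fp32 = pairwise_matrix_bytes(chunk, n, 4)  # one chunk of rows, 4-byte floats

for label, size in [("full FP32", full_fp32),
                    ("full FP16", full_fp16),
                    ("chunk FP32", chunk_fp32)]:
    print(f"{label:>10}: {size / 1e9:.1f} GB")
```

Halving the precision halves the footprint, but only processing a bounded number of rows at a time brings an all-pairs computation down to a size that fits in typical device memory.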

Backend and k-NN Engine Selection

from dire_rapids import create_dire

# Auto-select reducer implementation and k-NN engine
# Implementation priority: cuVS > PyTorchMemoryEfficient > PyTorch > CPU
reducer = create_dire(n_neighbors=32, verbose=True)
X_embedded = reducer.fit_transform(X)

# Force memory-efficient backend with FP16
reducer = create_dire(memory_efficient=True, use_fp16=True)
X_embedded = reducer.fit_transform(X)

# Force the k-NN engine independently of the reducer implementation
reducer = create_dire(backend='pytorch_cpu', knn_backend='pytorch')

# Force PyKeOps or cuVS for k-NN when those optional dependencies are available
reducer = create_dire(knn_backend='pykeops')
reducer = create_dire(knn_backend='cuvs')
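
The "strict" behavior of explicit k-NN backend requests can be pictured with the sketch below. It is not the package's code: resolve_knn_backend, the exception types, and the preference order used for 'auto' are all assumptions made for illustration.

```python
KNOWN_KNN_BACKENDS = ("cuvs", "pykeops", "pytorch")


def resolve_knn_backend(requested, available):
    """Pick a k-NN engine; explicit requests never fall back silently."""
    if requested == "auto":
        # Assumed preference order for this sketch only.
        for name in KNOWN_KNN_BACKENDS:
            if name in available:
                return name
        raise RuntimeError("no k-NN engine is available")
    if requested not in KNOWN_KNN_BACKENDS:
        raise ValueError(f"unknown knn_backend: {requested!r}")
    if requested not in available:
        # Strict mode: report the problem instead of choosing another engine.
        raise RuntimeError(f"knn_backend={requested!r} was requested but is not available")
    return requested


print(resolve_knn_backend("auto", {"pytorch"}))
print(resolve_knn_backend("pykeops", {"pytorch", "pykeops"}))
```

Failing loudly on an unavailable explicit choice keeps benchmark results attributable to the engine that was actually requested.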

Betti Curves / Topology

The betti_curve module computes filtered Betti curves that track topological features across filtration thresholds. It prefers ripser when available; otherwise it builds a kNN atlas complex and updates Betti numbers incrementally with union-find for beta_0 and GF(2) bitset elimination for beta_1.

from dire_rapids.betti_curve import compute_betti_curve

# Automatic backend selection: ripser, then GPU atlas, then CPU atlas
result = compute_betti_curve(X, k_neighbors=20, n_steps=50)

print(result['filtration_values'])  # filtration thresholds
print(result['beta_0'])             # connected components at each step
print(result['beta_1'])             # 1-cycles (loops) at each step

The atlas fallback uses GPU kNN when cuVS/cuML is available, then performs the set-heavy atlas merge and incremental rank update on CPU.
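
The two ingredients named above, union-find for beta_0 and GF(2) bitset elimination for beta_1, can be illustrated on a tiny simplicial complex. This is a minimal sketch at a single filtration threshold, not the module's incremental implementation; the helper names are ours. It uses the identity beta_1 = E - V + beta_0 - rank(d2), where d2 is the triangle-to-edge boundary matrix over GF(2).

```python
def count_components(n_vertices, edges):
    """beta_0 via union-find: number of connected components of the graph."""
    parent = list(range(n_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are Python ints used as bitsets."""
    pivots = {}  # leading bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # XOR eliminates the leading bit
    return len(pivots)


def betti_numbers(n_vertices, edges, triangles):
    """Return (beta_0, beta_1) of a complex given by its edges and triangles."""
    edge_index = {frozenset(e): i for i, e in enumerate(edges)}
    boundary_rows = []
    for a, b, c in triangles:
        row = 0
        for face in ((a, b), (b, c), (a, c)):
            row |= 1 << edge_index[frozenset(face)]  # mark each boundary edge
        boundary_rows.append(row)
    beta_0 = count_components(n_vertices, edges)
    beta_1 = len(edges) - n_vertices + beta_0 - gf2_rank(boundary_rows)
    return beta_0, beta_1


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
hollow = betti_numbers(4, square, [])
filled = betti_numbers(4, square + [(0, 2)], [(0, 1, 2), (0, 2, 3)])
print("hollow square:", hollow)
print("filled square:", filled)
```

The hollow square has one component and one loop; adding a diagonal and two triangles fills the loop in, so beta_1 drops to zero.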

ReducerRunner Framework

General-purpose framework for running and comparing dimensionality reduction algorithms. See benchmarking/dire_rapids_benchmarks.ipynb for complete examples.

from dire_rapids.utils import ReducerRunner, ReducerConfig
from dire_rapids import create_dire

config = ReducerConfig(
    name="DiRe",
    reducer_class=create_dire,
    reducer_kwargs={"n_neighbors": 16},
    visualize=True,
    max_points=10000
)

runner = ReducerRunner(config=config)
result = runner.run("sklearn:digits")
result = runner.run("openml:mnist_784")

Data sources: sklearn:name, openml:name, cytof:name, dire:name (geometric datasets), file:path (.csv, .npy, .npz, .parquet).
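
Each data source is a scheme:name string. How ReducerRunner parses these internally is not documented here; the following is only a hypothetical sketch of splitting such a string, with parse_source and KNOWN_SCHEMES as illustrative names.

```python
KNOWN_SCHEMES = {"sklearn", "openml", "cytof", "dire", "file"}


def parse_source(spec):
    """Split 'scheme:name' into its two parts, validating the scheme."""
    # partition splits on the first colon only, so file paths may contain colons.
    scheme, sep, name = spec.partition(":")
    if not sep or not name:
        raise ValueError(f"expected 'scheme:name', got {spec!r}")
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unknown data source scheme: {scheme!r}")
    return scheme, name


print(parse_source("sklearn:digits"))
print(parse_source("file:data/points.npy"))
```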

Metrics Module

Evaluation metrics for dimensionality reduction quality:

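from dire_rapids.metrics import evaluate_embedding

# data: original high-dimensional array; layout: its low-dimensional embedding;
# labels: per-point class labels
results = evaluate_embedding(data, layout, labels, compute_topology=True)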
print(f"Stress: {results['local']['stress']:.4f}")
print(f"SVM accuracy: {results['context']['svm'][1]:.4f}")
print(f"DTW beta_0: {results['topology']['metrics']['dtw_beta0']:.6f}")
print(f"DTW beta_1: {results['topology']['metrics']['dtw_beta1']:.6f}")
print(results['topology']['protocol'])

Topology protocol parameters are exposed as topology_n_steps, topology_k_neighbors, topology_density_threshold, topology_overlap_factor, and topology_metrics_only.

Metrics: distortion (stress, neighborhood preservation), context (SVM/kNN accuracy), topology (DTW distances between Betti curves). See METRICS_README.md for details.
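
As a rough idea of what a neighborhood preservation score measures, the sketch below computes the average overlap between each point's k nearest neighbors in the original data and in the embedding. This is one common definition, assumed here for illustration; the exact formula used by the metrics module may differ, and the helper names and toy coordinates are ours.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest other points."""
    sets = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        sets.append(set(order[:k]))
    return sets


def neighborhood_preservation(data, layout, k):
    """Mean fraction of each point's k-NN in `data` that survive in `layout`."""
    high = knn_sets(data, k)
    low = knn_sets(layout, k)
    shared = sum(len(h & l) for h, l in zip(high, low))
    return shared / (k * len(data))


# Toy 3D data and two hypothetical 2D layouts of the same four points.
data = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
faithful = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
scrambled = [(0.0, 0.0), (7.0, 0.0), (1.0, 0.0), (3.0, 0.0)]

score_faithful = neighborhood_preservation(data, faithful, k=1)
score_scrambled = neighborhood_preservation(data, scrambled, k=1)
print(score_faithful, score_scrambled)
```

A layout that keeps every point next to its original neighbors scores 1.0, while a layout that shuffles points along the line scores much lower.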

Testing

# CPU tests (CI)
pytest tests/test_cpu_basic.py tests/test_reducer_runner.py -v

# Full test suite
pytest tests/ -v

Citation

If you use this work, please cite:

@misc{kolpakov-rivin-2025dimensionality,
  title={Dimensionality reduction for homological stability and global structure preservation},
  author={Kolpakov, Alexander and Rivin, Igor},
  year={2025},
  eprint={2503.03156},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.03156}
}

Requirements

Python 3.10+
PyTorch 2.0+
NumPy, SciPy, scikit-learn
(Optional) PyKeOps 2.1+ (python -m pip install "dire-rapids[keops]==0.3.1")
(Optional) CUDA 12.x+ for GPU acceleration
(Optional) RAPIDS 26.2+ for the cuVS k-NN engine
(Optional) CuPy for GPU-accelerated Betti curves

Release history

0.3.1 (this version): May 31, 2026
0.3.0: May 3, 2026
0.2.0: Oct 28, 2025
0.1.5: Sep 18, 2025
0.1.0: Sep 4, 2025

Download files

Source Distribution

dire_rapids-0.3.1.tar.gz (103.0 kB view details)

Uploaded May 31, 2026 Source

Built Distribution

dire_rapids-0.3.1-py3-none-any.whl (74.9 kB view details)

Uploaded May 31, 2026 Python 3

File details

Details for the file dire_rapids-0.3.1.tar.gz.

File metadata

Download URL: dire_rapids-0.3.1.tar.gz
Upload date: May 31, 2026
Size: 103.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for dire_rapids-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`53bbee8935b4320c9735e19527b00dc20ccec0ab5ee9c234758dd7d723acfe72`
MD5	`090106364132719bd90bc7b8f02bc21a`
BLAKE2b-256	`7fdc894d479333b2221738dd720694777480707129e145a3637c62fd9290a254`

File details

Details for the file dire_rapids-0.3.1-py3-none-any.whl.

File metadata

Download URL: dire_rapids-0.3.1-py3-none-any.whl
Upload date: May 31, 2026
Size: 74.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.5

File hashes

Hashes for dire_rapids-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f45987b19184da9b127ab5ce5dd3d5f8791c5d5112f5aa19cae9091e36effcb3`
MD5	`c8bd61a58f6686580b6e0e1e81b8f32c`
BLAKE2b-256	`a8bb03f032b2eb6065d00ae71150c59823b07c8eabeda5da6918e81d6bc321e0`
