An independently maintained, heavily modified fork of TopOMetry: geometry-first topological dimensionality reduction and manifold learning.
About topometry-nosc
Provenance.
topometry-nosc is an independently maintained, heavily modified fork of TopOMetry by David S Oliveira (original copyright and MIT license preserved). Its API, internals and behaviour may differ substantially from upstream. It is not an official release of the original project, nor is it affiliated with or endorsed by it. The import package is still named topo, so it cannot be installed alongside the upstream topometry distribution in the same environment. Please cite both this fork and the original work (see CITATION.cff).
topometry-nosc is a Python toolkit for Manifold Learning, Dimensionality Reduction, and Spectral Clustering. It explores high-dimensional data by approximating the Laplace-Beltrami Operator (LBO) via Continuous k-Nearest Neighbors (CkNN) and Diffusion Maps. The pipeline learns neighborhood graphs → Laplace–Beltrami–type operators → spectral scaffolds → refined graphs to find clusters and build low-dimensional layouts.
- scikit-learn–style transformers compatible with standard machine learning workflows
- Diffusion Maps and multiscale spectral scaffolds for geometry preservation
- Operator-native metrics to quantify geometry preservation and Riemannian diagnostics to evaluate distortion in visualizations
- Designed for large, diverse datasets
For background, see the original paper: https://doi.org/10.7554/eLife.100361.2
Geometry-first rationale (short)
We approximate the Laplace–Beltrami operator (LBO) by learning well-weighted similarity graphs and their Laplacian/diffusion operators. The eigenfunctions of these operators form an orthonormal basis—the spectral scaffold—that captures the dataset’s intrinsic geometry across scales. This view connects to Diffusion Maps, Laplacian Eigenmaps, and related kernel eigenmaps, and enables downstream tasks such as clustering and graph-layout optimization with geometry preserved.
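For reference, the standard diffusion-maps construction behind this rationale can be written as below. This is the textbook formulation with a fixed bandwidth $\varepsilon$ and density normalization $\alpha = 1$. The symbols are notation chosen for this sketch, and the kernels used in this fork (for example CkNN or adaptive bandwidths) may differ in detail.

```latex
\begin{aligned}
K_{ij} &= \exp\!\left(-\frac{\lVert x_i - x_j\rVert^2}{\varepsilon}\right),
  \qquad q_i = \sum_j K_{ij},\\
\tilde K_{ij} &= \frac{K_{ij}}{q_i\, q_j},
  \qquad d_i = \sum_j \tilde K_{ij},
  \qquad P_{ij} = \frac{\tilde K_{ij}}{d_i},\\
P\,\psi_k &= \lambda_k\,\psi_k,
  \qquad 1 = \lambda_0 \ge \lambda_1 \ge \dots,\\
\Psi_t(x_i) &= \left(\lambda_1^{t}\,\psi_1(i),\ \dots,\ \lambda_m^{t}\,\psi_m(i)\right).
\end{aligned}
```

Here $P$ is a row-stochastic diffusion operator, and the leading non-trivial eigenvectors $\psi_1, \dots, \psi_m$ supply the coordinates of the spectral scaffold. With $\alpha = 1$, $(P - I)/\varepsilon$ approximates the Laplace-Beltrami operator up to a constant factor as the sample size grows and $\varepsilon \to 0$, regardless of the sampling density.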
When to use TopoMetry
Use TopoMetry when you want:
- Geometry-faithful representations beyond variance maximization (e.g., PCA)
- Robust low-dimensional views and clustering from operator-grounded features
- Quantitative operator-native metrics to compare methods and parameter choices
- Reproducible, non-destructive pipelines
Empirically, TopoMetry often outperforms PCA-based pipelines and stand-alone layouts. Still, let the data decide—TopoMetry includes metrics and reports to support evidence-based choices.
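As a generic illustration of letting the data decide, the sketch below scores two embeddings of the same dataset with scikit-learn's trustworthiness score. This is an assumption-laden stand-in: trustworthiness is a scikit-learn metric, not one of this package's operator-native metrics, and PCA and SpectralEmbedding are used only because they need no extra dependencies.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding, trustworthiness

X, _ = make_swiss_roll(n_samples=500, noise=0.5, random_state=42)

# Two candidate 2-D embeddings of the same data.
embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "SpectralEmbedding": SpectralEmbedding(
        n_components=2, n_neighbors=15, random_state=42
    ).fit_transform(X),
}

# Trustworthiness lies in [0, 1]; higher means local neighborhoods
# in the embedding are more faithful to those in the input space.
scores = {
    name: trustworthiness(X, emb, n_neighbors=15)
    for name, emb in embeddings.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The same loop can score any embedding array of shape (n_samples, 2), including layouts produced by a fitted TopOGraph.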
For practical guidance on UMAP vs. TopoMetry, target-aware embedding checks, and
layout troubleshooting, see docs/faq.md.
When not to use TopoMetry
- Very small sample sizes where the manifold hypothesis is weak
- Workflows needing streaming/online updates or inverse transforms (embedding new points without recomputing operators is not currently supported). If that’s critical, consider UMAP or parametric/autoencoder approaches—and you can still use TopoMetry to audit geometry or estimate intrinsic dimensionality to guide model design.
Installation
[!WARNING] Do not install alongside the original topometry. This fork ships the same import package name (topo). Installing both topometry and topometry-nosc in one environment makes them overwrite each other's files. Running import topo raises an error if it detects both. Use a fresh virtualenv, or run pip uninstall topometry first.
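To check an environment before installing, a small standard-library script can report which of the two distributions are present. This is a minimal sketch and not the check that import topo performs internally; the helper name installed_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names=("topometry", "topometry-nosc")):
    """Return {distribution name: version} for the distributions that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment; skip it.
            pass
    return found


found = installed_versions()
if len(found) > 1:
    print(f"Conflict: both distributions are installed: {found}")
else:
    print(f"No conflict detected: {found}")
```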
topometry-nosc is a standard, pip-installable package. The core install depends on numpy, scipy, scikit-learn, numba, joblib, matplotlib, pandas, and jupyterlab:
pip install topometry-nosc # core
pip install "topometry-nosc[all]" # core + ANN backends and extra layouts
Optional features are grouped into extras — install only what you need:
| Extra | Adds |
|---|---|
| ann | hnswlib (fast approximate nearest neighbors) |
| amg | pyamg (algebraic-multigrid eigensolver='amg') |
| layouts | pacmap, pymde, trimap (extra projections) |
| all | everything above |
Missing an optional dependency raises a clear message telling you which extra to
install (e.g. pip install topometry-nosc[layouts]).
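To see ahead of time which extras are missing, the optional modules can be probed without importing them. The sketch below assumes each optional package's import name matches the name listed in the table above; the EXTRAS mapping and the missing_extras helper are hypothetical and not part of this package.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the modules it provides.
EXTRAS = {
    "ann": ["hnswlib"],
    "amg": ["pyamg"],
    "layouts": ["pacmap", "pymde", "trimap"],
}


def missing_extras(extras=EXTRAS):
    """Return {extra: [modules that cannot be found]} for incomplete extras."""
    missing = {}
    for extra, modules in extras.items():
        absent = [m for m in modules if find_spec(m) is None]
        if absent:
            missing[extra] = absent
    return missing


for extra, modules in missing_extras().items():
    print(f'{extra}: missing {", ".join(modules)} '
          f'-> pip install "topometry-nosc[{extra}]"')
```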
Development install
This project uses uv:
uv sync --all-extras # package + all extras + dev tooling
uv run pytest -q # run the tests
Tutorials and documentation
Documentation for this fork: https://HauserGroup.github.io/topometryNoSC/ (installation, quickstart, concepts, and an auto-generated API reference).
The documentation of the original upstream project may still be useful for background, but it describes the upstream API, which differs from this fork.
Minimal example
import topo as tp
from sklearn.datasets import make_swiss_roll
X, color = make_swiss_roll(n_samples=2000, noise=0.5, random_state=42)
# Fit runs the whole pipeline: kNN -> kernel -> eigenbasis -> scaffold ->
# refined graph -> 2-D layouts (defaults: MAP + PaCMAP).
tg = tp.TopOGraph()
tg.fit(X)
# Layouts computed during fit, available as attributes:
print(tg.TopoMAP.shape) # (2000, 2)
print(tg.msTopoPaCMAP.shape) # (2000, 2)
# Compute another layout on demand from the same fitted model:
emb = tg.project(projection_method="PaCMAP")
Step-by-step (under the hood)
TopOGraph.fit above chains four building blocks. Each is a standalone,
scikit-learn-style estimator you can use on its own — swap one out, stop early,
or feed your own matrices in:
from sklearn.datasets import make_swiss_roll
from topo.base.ann import kNN
from topo.tpgraph.kernels import Kernel
from topo.spectral.eigen import EigenDecomposition
from topo.layouts.projector import Projector
X, color = make_swiss_roll(n_samples=2000, noise=0.5, random_state=42)
# 1. k-nearest-neighbor graph (sparse). Useful on its own.
knn_graph = kNN(X, n_neighbors=15, metric="euclidean")
# 2. Affinity kernel -> Laplace-Beltrami-type operator.
kernel = Kernel(n_neighbors=15, metric="euclidean").fit(X)
affinity = kernel.K # sparse affinity matrix
operator = kernel.P # diffusion operator
# 3. Spectral scaffold: eigendecompose the operator (diffusion maps).
eig = EigenDecomposition(n_components=20, method="DM").fit(kernel)
scaffold = eig.transform(kernel) # (2000, ~20) eigen-coordinates
# 4. 2-D layout from the scaffold (any feature matrix works here).
emb = Projector(n_components=2, projection_method="MAP").fit_transform(scaffold)
print(emb.shape) # (2000, 2)
Each step maps to a documented class: kNN, Kernel, EigenDecomposition,
Projector. See the API reference
and the step-by-step tutorial.
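To make the four steps concrete without depending on this package, the following is a minimal sketch in plain NumPy and scikit-learn. It is a textbook diffusion-maps construction, not the code behind kNN, Kernel, EigenDecomposition, or Projector. The fixed median bandwidth, the dense eigensolver, and the use of the first two eigen-coordinates as a layout are all simplifying assumptions of this sketch.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph

X, _ = make_swiss_roll(n_samples=500, noise=0.5, random_state=42)
n_components = 10

# 1. Sparse kNN graph holding Euclidean distances on its edges.
knn = kneighbors_graph(X, n_neighbors=15, mode="distance", metric="euclidean")

# 2. Gaussian affinities on the kNN edges, symmetrized and densified
#    (dense is fine for 500 points; large data needs sparse solvers).
sigma = np.median(knn.data)
aff = knn.copy()
aff.data = np.exp(-(aff.data / sigma) ** 2)
K = aff.maximum(aff.T).toarray()

#    Density normalization (alpha = 1), then row-normalize into a
#    Markov (diffusion) operator P.
q = K.sum(axis=1)
K = K / np.outer(q, q)
d = K.sum(axis=1)
P = K / d[:, None]

# 3. Eigendecompose the symmetric conjugate S = D^{-1/2} K D^{-1/2},
#    which shares its eigenvalues with P, then map back to P's eigenvectors.
S = K / np.sqrt(np.outer(d, d))
vals, vecs = np.linalg.eigh(S)           # ascending order
vals, vecs = vals[::-1], vecs[:, ::-1]   # descending order
psi = vecs / np.sqrt(d)[:, None]         # right eigenvectors of P

#    Drop the trivial constant eigenvector and scale by the eigenvalues.
scaffold = psi[:, 1:n_components + 1] * vals[1:n_components + 1]

# 4. Simplest possible layout: the first two scaffold coordinates.
emb = scaffold[:, :2]
print(scaffold.shape, emb.shape)
```

Working through the symmetric matrix S keeps the eigenproblem numerically well behaved, since P itself is not symmetric. A real layout step would optimize a 2-D embedding from the scaffold rather than truncate it.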
Output
Example TopOGraph fit:
Changelog
0.2.0 — Core-only release
- Removed single-cell / scanpy / AnnData wrappers (now a standalone geometry toolkit)
- Core API unchanged: TopOGraph, spectral scaffolds, graph operators, layouts, metrics, plotting
Citation
@article{Oliveira_2024,
title={TopoMetry systematically learns and evaluates the latent geometry of single-cell data},
volume={13},
ISSN={2050-084X},
url={http://dx.doi.org/10.7554/eLife.100361.2},
DOI={10.7554/elife.100361.2},
journal={eLife},
publisher={eLife Sciences Publications, Ltd},
author={Oliveira, David S and Domingos, Ana I and Velloso, Licio A},
year={2024},
month=aug
}