An independently maintained, heavily modified fork of TopOMetry: geometry-first topological dimensionality reduction and manifold learning.

Project description

About topometry-nosc

Provenance. topometry-nosc is an independently maintained, heavily modified fork of TopOMetry by David S Oliveira (original copyright and MIT license preserved). Its API, internals and behaviour may differ substantially from upstream. It is not an official release of, affiliated with, or endorsed by the original project. The import package is still named topo, so it cannot be installed alongside the upstream topometry distribution in the same environment. Please cite both this fork and the original work (see CITATION.cff).

topometry-nosc is a Python toolkit for Manifold Learning, Dimensionality Reduction, and Spectral Clustering. It explores high-dimensional data by approximating the Laplace-Beltrami Operator (LBO) via Continuous k-Nearest Neighbors (CkNN) and Diffusion Maps. The pipeline learns neighborhood graphs → Laplace–Beltrami–type operators → spectral scaffolds → refined graphs to find clusters and build low-dimensional layouts.

  • scikit-learn–style transformers compatible with standard machine learning workflows
  • Diffusion Maps and multiscale spectral scaffolds for geometry preservation
  • Operator-native metrics to quantify geometry preservation and Riemannian diagnostics to evaluate distortion in visualizations
  • Designed for large, diverse datasets
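To make the Continuous k-Nearest Neighbors (CkNN) idea mentioned above concrete, here is a minimal sketch of the unweighted CkNN rule: points i and j are connected when d(i, j) < delta * sqrt(d_k(i) * d_k(j)), where d_k is the distance to the k-th nearest neighbor. It uses only NumPy and scikit-learn and is not the topometry-nosc implementation; the function name `cknn_adjacency` and its parameters are assumptions made for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def cknn_adjacency(X, n_neighbors=10, delta=1.0):
    """Illustrative unweighted CkNN graph (hypothetical helper, not the topo API)."""
    X = np.asarray(X, dtype=float)
    # Distance from every point to its k-th nearest neighbor.
    # We ask for k + 1 neighbors because each point is returned as its own neighbor.
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    dists, _ = nn.kneighbors(X)
    d_k = dists[:, -1]
    # Dense pairwise distances: fine for a small illustration, not for large data.
    diff = X[:, None, :] - X[None, :, :]
    pairwise = np.sqrt((diff ** 2).sum(axis=-1))
    # Local scale for each pair: geometric mean of the two k-th neighbor distances.
    scale = np.sqrt(np.outer(d_k, d_k))
    adjacency = pairwise < delta * scale
    np.fill_diagonal(adjacency, False)  # no self-loops
    return adjacency


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
A = cknn_adjacency(X_demo, n_neighbors=10, delta=1.0)
print(A.shape, int(A.sum()) // 2, "edges")
```

Because the threshold depends on the local scale of both endpoints, the resulting graph is symmetric by construction and adapts to varying sampling density, which is the property that makes it a useful starting point for operator estimation.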

For background, see the original paper: https://doi.org/10.7554/eLife.100361.2

Geometry-first rationale (short)

We approximate the Laplace–Beltrami operator (LBO) by learning well-weighted similarity graphs and their Laplacian/diffusion operators. The eigenfunctions of these operators form an orthonormal basis—the spectral scaffold—that captures the dataset’s intrinsic geometry across scales. This view connects to Diffusion Maps, Laplacian Eigenmaps, and related kernel eigenmaps, and enables downstream tasks such as clustering and graph-layout optimization with geometry preserved.
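The chain described above (similarity graph, then diffusion operator, then eigenfunctions) can be sketched with NumPy, SciPy, and scikit-learn alone. This is a textbook-style diffusion map under simplifying assumptions (a fixed global bandwidth, a dense eigensolver, a connected graph), not the kernels or solvers used by topometry-nosc; `diffusion_map` and its parameters are hypothetical names.

```python
import numpy as np
from scipy.sparse import diags
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import kneighbors_graph


def diffusion_map(X, n_neighbors=15, n_components=5):
    """Illustrative diffusion map; not the topometry-nosc implementation."""
    # 1. Symmetrized kNN graph holding Euclidean distances on its edges.
    dist = kneighbors_graph(X, n_neighbors=n_neighbors, mode="distance")
    dist = dist.maximum(dist.T).tocsr()

    # 2. Gaussian affinities with one global bandwidth (median edge length).
    sigma = np.median(dist.data)
    K = dist.copy()
    K.data = np.exp(-(K.data ** 2) / (2.0 * sigma ** 2))

    # 3. Symmetric normalization S = D^{-1/2} K D^{-1/2}. S has the same
    #    eigenvalues as the random-walk (diffusion) operator P = D^{-1} K.
    degree = np.asarray(K.sum(axis=1)).ravel()
    inv_sqrt = 1.0 / np.sqrt(degree)
    S = (diags(inv_sqrt) @ K @ diags(inv_sqrt)).toarray()

    # 4. Dense eigendecomposition (fine for a small demo), largest first.
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1][: n_components + 1]
    vals, vecs = vals[order], vecs[:, order]

    # 5. Convert to right eigenvectors of P, drop the trivial constant one
    #    (eigenvalue 1 on a connected graph), and scale by the eigenvalues
    #    (diffusion time t = 1).
    psi = vecs * inv_sqrt[:, None]
    return vals[1:], psi[:, 1:] * vals[1:]


X, _ = make_swiss_roll(n_samples=800, noise=0.5, random_state=42)
eigenvalues, coords = diffusion_map(X)
print(coords.shape)
```

The columns of `coords` play the role of a small spectral scaffold: each one is an eigenfunction of the graph diffusion operator, ordered from coarse to fine scale.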

When to use TopoMetry

Use TopoMetry when you want:

  • Geometry-faithful representations beyond variance maximization (e.g., PCA)
  • Robust low-dimensional views and clustering from operator-grounded features
  • Quantitative operator-native metrics to compare methods and parameter choices
  • Reproducible, non-destructive pipelines

Empirically, TopoMetry often outperforms PCA-based pipelines and stand-alone layouts. Still, let the data decide—TopoMetry includes metrics and reports to support evidence-based choices.
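One library-independent way to "let the data decide" is to score candidate embeddings with a neighborhood-preservation metric. The sketch below uses scikit-learn's `trustworthiness` on two stand-in embeddings (PCA and Isomap); it is not one of the operator-native metrics shipped with this package, and the choice of candidates and of `n_neighbors=15` is an assumption for illustration. Any `(n_samples, 2)` array, such as a layout produced by this package, could be added to the dictionary.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, trustworthiness

X, _ = make_swiss_roll(n_samples=1000, noise=0.5, random_state=42)

# Candidate 2-D embeddings to compare (stand-ins for any layout you compute).
candidates = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "Isomap": Isomap(n_neighbors=15, n_components=2).fit_transform(X),
}

# Trustworthiness lies in [0, 1]; higher means fewer "false" neighbors
# were introduced by the embedding.
scores = {
    name: trustworthiness(X, emb, n_neighbors=15)
    for name, emb in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: trustworthiness = {score:.3f}")
```

A single score rarely settles the question, so it is worth comparing several neighborhood sizes before preferring one method.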

For practical guidance on UMAP vs. TopoMetry, target-aware embedding checks, and layout troubleshooting, see docs/faq.md.

When not to use TopoMetry

  • Very small sample sizes where the manifold hypothesis is weak
  • Workflows needing streaming/online updates, out-of-sample embedding, or inverse transforms: embedding new points without recomputing operators is not currently supported. If that’s critical, consider UMAP or parametric/autoencoder approaches. You can still use TopoMetry to audit geometry or estimate intrinsic dimensionality to guide model design.

Installation

Warning: Do not install alongside the original topometry. This fork ships the same import package name (topo), so installing both topometry and topometry-nosc in one environment makes them overwrite each other's files, and import topo raises an error if it detects both. Use a fresh virtualenv, or run pip uninstall topometry first.
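To check an environment before installing, you can ask the standard library which of the two distributions are already present. This is a small sketch using `importlib.metadata`; the helper name `installed_versions` is hypothetical and the check is independent of whatever detection the package itself performs on import.

```python
from importlib import metadata

# The two distributions that both provide the `topo` import package.
DISTRIBUTIONS = ("topometry", "topometry-nosc")


def installed_versions(names=DISTRIBUTIONS):
    """Return {distribution name: version} for the names that are installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
    return found


found = installed_versions()
if len(found) == len(DISTRIBUTIONS):
    print("Conflict: both distributions are installed:", found)
elif found:
    print("Installed:", found)
else:
    print("Neither distribution is installed.")
```

If the check reports a conflict, uninstalling both distributions and reinstalling only the one you need is the safest way to restore a consistent `topo` package.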

topometry-nosc is a standard, pip-installable package. The core install depends on numpy, scipy, scikit-learn, numba, joblib, matplotlib, pandas, and jupyterlab:

pip install topometry-nosc            # core
pip install "topometry-nosc[all]"     # core + ANN backends and extra layouts

Optional features are grouped into extras — install only what you need:

Extra     Adds
ann       hnswlib (fast approximate nearest neighbors)
amg       pyamg (algebraic-multigrid eigensolver, eigensolver='amg')
layouts   pacmap, pymde, trimap (extra projections)
all       everything above

Missing an optional dependency raises a clear message telling you which extra to install (e.g. pip install topometry-nosc[layouts]).
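The behaviour described above follows a common optional-dependency pattern: attempt the import lazily and, if it fails, re-raise with a message naming the extra. The sketch below shows that pattern in general terms; `require_optional` is a hypothetical helper, and the exact wording of the message is an assumption rather than the text this package emits.

```python
import importlib


def require_optional(module_name, extra):
    """Import an optional module or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "topometry-nosc[{extra}]"'
        ) from exc


# A module that is always available imports normally.
json_module = require_optional("json", extra="all")

# A missing module produces the hint instead of a bare ImportError.
try:
    require_optional("a_module_that_does_not_exist_xyz", extra="layouts")
except ImportError as err:
    message = str(err)
    print(message)
```

Deferring the import to the point of use keeps the core install light while still giving users a direct path to the fix.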

Development install

This project uses uv:

uv sync --all-extras   # package + all extras + dev tooling
uv run pytest -q       # run the tests

Tutorials and documentation

Documentation for this fork: https://HauserGroup.github.io/topometryNoSC/ (installation, quickstart, concepts, and an auto-generated API reference).

The documentation of the original upstream project may still be useful for background, but it describes the upstream API, which differs from this fork.

Minimal example

import topo as tp
from sklearn.datasets import make_swiss_roll

X, color = make_swiss_roll(n_samples=2000, noise=0.5, random_state=42)

# Fit runs the whole pipeline: kNN -> kernel -> eigenbasis -> scaffold ->
# refined graph -> 2-D layouts (defaults: MAP + PaCMAP).
tg = tp.TopOGraph()
tg.fit(X)

# Layouts computed during fit, available as attributes:
print(tg.TopoMAP.shape)         # (2000, 2)
print(tg.msTopoPaCMAP.shape)    # (2000, 2)

# Compute another layout on demand from the same fitted model:
emb = tg.project(projection_method="PaCMAP")

Step-by-step (under the hood)

TopOGraph.fit above chains four building blocks. Each is a standalone, scikit-learn-style estimator you can use on its own — swap one out, stop early, or feed your own matrices in:

from sklearn.datasets import make_swiss_roll

from topo.base.ann import kNN
from topo.tpgraph.kernels import Kernel
from topo.spectral.eigen import EigenDecomposition
from topo.layouts.projector import Projector

X, color = make_swiss_roll(n_samples=2000, noise=0.5, random_state=42)

# 1. k-nearest-neighbor graph (sparse). Useful on its own.
knn_graph = kNN(X, n_neighbors=15, metric="euclidean")

# 2. Affinity kernel -> Laplace-Beltrami-type operator.
kernel = Kernel(n_neighbors=15, metric="euclidean").fit(X)
affinity = kernel.K              # sparse affinity matrix
operator = kernel.P              # diffusion operator

# 3. Spectral scaffold: eigendecompose the operator (diffusion maps).
eig = EigenDecomposition(n_components=20, method="DM").fit(kernel)
scaffold = eig.transform(kernel) # (2000, ~20) eigen-coordinates

# 4. 2-D layout from the scaffold (any feature matrix works here).
emb = Projector(n_components=2, projection_method="MAP").fit_transform(scaffold)
print(emb.shape)                 # (2000, 2)

Each step maps to a documented class: kNN, Kernel, EigenDecomposition, Projector. See the API reference and the step-by-step tutorial.

Output

[Figure: example TopOGraph fit]

Changelog

0.2.0 — Core-only release

  • Removed single-cell / scanpy / AnnData wrappers (now a standalone geometry toolkit)
  • Core API unchanged: TopOGraph, spectral scaffolds, graph operators, layouts, metrics, plotting

Citation


@article{Oliveira_2024,
	title={TopoMetry systematically learns and evaluates the latent geometry of single-cell data},
	volume={13},
	ISSN={2050-084X},
	url={http://dx.doi.org/10.7554/eLife.100361.2},
	DOI={10.7554/elife.100361.2},
	journal={eLife},
	publisher={eLife Sciences Publications, Ltd},
	author={Oliveira, David S and Domingos, Ana I and Velloso, Licio A},
	year={2024},
	month=aug
}
