Tree-based visualization for high-dimensional data
TMAP2
Tree-based visualization for high-dimensional data. Organizes similar items into interactive tree structures. Ideal for chemical space, protein embeddings, single-cell data, or any high-dimensional dataset.
Why Trees?
Most dimensionality reduction tools (UMAP, t-SNE) produce point clouds. TMAP produces a tree, a connected structure where every point is linked to its neighbors through branches. This makes the layout itself explorable: you can follow branches, trace paths between any two points, and discover how regions connect.
For example, in a TMAP of pet breed images, following the branch from terriers toward cats reveals that the bridge between the two groups runs through chihuahuas and sphynx cats (the bald ones), which is both hilarious and logical: both are small, short-haired, and big-eyed. The tree doesn't just cluster similar things; it also shows you how dissimilar things are connected.
Because the layout is a tree, you get operations that point clouds can't support:
path = model.path(idx_a, idx_b) # nodes along the tree path
d = model.distance(idx_a, idx_b) # sum of edge weights along the path
pseudotime = model.distances_from(idx) # tree distance from one point to all others
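To make these three operations concrete, here is a minimal pure-Python sketch on a hand-built toy tree. It does not use TMAP or reflect its internals; the edge list and the helper names (`build_adjacency`, `distances_from`, `tree_path`) are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical toy tree: undirected weighted edges (node_a, node_b, weight).
EDGES = [(0, 1, 0.2), (1, 2, 0.5), (1, 3, 0.1), (3, 4, 0.4)]


def build_adjacency(edges):
    """Return {node: [(neighbor, weight), ...]} for an undirected tree."""
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj


def distances_from(adj, source):
    """Tree distance from `source` to every node, plus parent pointers."""
    dist = {source: 0.0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, weight in adj[node]:
            if neighbor not in dist:  # a tree has one route to each node
                dist[neighbor] = dist[node] + weight
                parent[neighbor] = node
                queue.append(neighbor)
    return dist, parent


def tree_path(adj, a, b):
    """Nodes along the unique tree path from a to b."""
    _, parent = distances_from(adj, a)
    path = [b]
    while path[-1] != a:
        path.append(parent[path[-1]])
    return path[::-1]


adj = build_adjacency(EDGES)
path = tree_path(adj, 0, 4)
dist, _ = distances_from(adj, 0)
print(path)     # [0, 1, 3, 4]
print(dist[4])  # 0.2 + 0.1 + 0.4, about 0.7
```

Because a tree has exactly one path between any two nodes, a single traversal is enough to get both the path and the summed edge weights; a point cloud has no such path to follow.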
Installation
pip install tmap2
Optional extras:
pip install rdkit # chemistry helpers (fingerprints_from_smiles, molecular_properties)
pip install jupyter-scatter # notebook interactive widgets
Note: The import name is tmap, not tmap2.
Quick Start
import numpy as np
from tmap import TMAP
# Binary fingerprints (Jaccard)
X = np.random.randint(0, 2, (1000, 2048), dtype=np.uint8)
model = TMAP(metric="jaccard", n_neighbors=20, seed=42).fit(X)
model.to_html("map.html")
# Dense embeddings (cosine / euclidean)
X = np.random.random((1000, 128)).astype(np.float32)
model = TMAP(metric="cosine", n_neighbors=20).fit(X)
new_coords = model.transform(X[:10])
# Interactive notebook widget
model.plot(color_by="label", data=df, tooltip_properties=["name", "score"])
Key Features
- Tree structure: follow branches, trace paths, compute pseudotime
- Deterministic: same input + seed = same output
- Multiple metrics: jaccard, cosine, euclidean, precomputed
- Incremental: add_points() and transform() for new data
- Model persistence: save() / load()
- Three viz backends: interactive HTML, jupyter-scatter, matplotlib
Visualization
Interactive HTML: lasso selection, light/dark theme, filter and search panels, pinned metadata cards, binary mode for large datasets.
Notebook widgets: color switching, categorical filtering, and lasso selection with pandas-backed metadata:
viz = model.to_tmapviz()
viz.add_color_layout("Molecular Weight", mw.tolist(), categorical=False)
viz.add_color_layout("Scaffold", scaffolds, categorical=True, color="tab10")
viz.add_label("SMILES", smiles_list)
viz.show(width=1000, height=620, controls=True)
Static plots: matplotlib output for publication figures, via model.plot_static(color_by=labels)
Domain Utilities
Built-in helpers for common scientific workflows:
from tmap.utils.chemistry import fingerprints_from_smiles, molecular_properties
from tmap.utils.proteins import fetch_uniprot, sequence_properties
from tmap.utils.singlecell import from_anndata
| Domain | Metric | Utilities |
|---|---|---|
| Chemoinformatics | jaccard | fingerprints_from_smiles, molecular_properties, murcko_scaffolds |
| Proteins | cosine / euclidean | fetch_uniprot, fetch_alphafold, read_fasta, sequence_properties |
| Single-cell | cosine / euclidean | from_anndata, cell_metadata, marker_scores |
| Generic embeddings | cosine / euclidean / precomputed | No domain utils needed |
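As a rough guide to what the metric names in the table measure, the following sketch computes each distance in plain Python. These are the textbook definitions, not TMAP's implementation; the function names are illustrative, and returning 0.0 for two all-zero fingerprints is an assumption made here.

```python
import math


def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two all-zero vectors are treated as identical.
    return 0.0 if either == 0 else 1.0 - both / either


def cosine_distance(u, v):
    """1 - cosine similarity for two non-zero dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two dense vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


fp_a = [1, 1, 0, 1, 0, 0]
fp_b = [1, 0, 0, 1, 1, 0]
print(jaccard_distance(fp_a, fp_b))  # 2 shared bits out of 4 set in either: 0.5
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors: 1.0
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Jaccard looks only at which bits are set, which suits fingerprints; cosine ignores vector length, which suits embeddings whose magnitude carries little meaning.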
Notebooks
| Notebook | Topic |
|---|---|
| 01 Quick Start | End-to-end walkthrough |
| 02 MinHash Deep Dive | Encoding methods and when to use each |
| 03 Legacy LSH Pipeline | Lower-level MinHash + LSHForest + layout workflow |
| 04 Notebook Widgets | Selection, filtering, zoom, export |
| 05 Single-Cell | RNA-seq with PBMC 3k, pseudotime, UMAP comparison |
| 06 Metric Guide | Choosing the right metric |
| 07 FAQ | Troubleshooting and common questions |
| 08 Cheminformatics | Molecules, fingerprints, SAR |
| 09 Protein Analysis | FASTA, ESM embeddings, AlphaFold |
| 10 Card Configuration | Pinned card layout, fields, and links |
| 11 Default Params Benchmark | Defaults across dataset sizes and types |
| 12 USearch Jaccard | Binary Jaccard with USearch backend |
Lower-Level Pipeline
For direct control over indexing, hashing, and layout, see the legacy pipeline notebook. The main building blocks:
from tmap.index import USearchIndex # dense / binary kNN
from tmap import MinHash, LSHForest # Jaccard on sets / strings
from tmap.layout import LayoutConfig, layout_from_lsh_forest
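For readers unfamiliar with MinHash, the sketch below shows the general idea of estimating Jaccard similarity between sets from short signatures. It is a standalone illustration using only the standard library, not the `MinHash` class imported above; the hash family, `num_perm`, and all function names are assumptions chosen for the example.

```python
import random
import zlib

PRIME = (1 << 61) - 1  # large Mersenne prime for the linear hash family


def make_hash_params(num_perm, seed=42):
    """Draw (a, b) pairs for hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_perm)]


def minhash_signature(tokens, params):
    """For each hash function, keep the minimum value over a non-empty token set."""
    base = [zlib.crc32(t.encode("utf-8")) for t in tokens]  # stable across runs
    return [min((a * x + b) % PRIME for x in base) for a, b in params]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions where the two minima agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


params = make_hash_params(num_perm=256)
set_a = {f"tok{i}" for i in range(0, 80)}
set_b = {f"tok{i}" for i in range(40, 120)}

true_jaccard = len(set_a & set_b) / len(set_a | set_b)  # 40 / 120
estimate = estimated_jaccard(
    minhash_signature(set_a, params), minhash_signature(set_b, params)
)
print(true_jaccard, estimate)
```

The estimate approaches the true similarity as the number of hash functions grows, which is what lets an index compare fixed-length signatures instead of the original sets.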
Your Data
├─→ Binary matrix ─────────→ USearch (Jaccard / cosine / euclidean)
└─→ Sets / strings ───────→ MinHash → LSHForest
↓
k-NN Graph → MST → OGDF Tree Layout → Interactive Visualization
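The middle of this pipeline (k-NN graph, then MST) can be sketched in a few lines of plain Python. This is a brute-force illustration under stated assumptions, not the library's code: it uses exhaustive Euclidean search instead of USearch or LSHForest, Kruskal's algorithm for the spanning tree, and omits the OGDF layout step. All names are hypothetical.

```python
import math


def knn_edges(points, k):
    """Brute-force k-NN: sorted (distance, i, j) edges to each point's k nearest neighbours."""
    edges = set()
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges.add((d, min(i, j), max(i, j)))  # store each undirected edge once
    return sorted(edges)


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm with union-find; returns the kept (i, j, distance) edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kept = []
    for d, i, j in edges:  # edges arrive sorted by distance
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # keep the edge only if it joins two components
            parent[root_i] = root_j
            kept.append((i, j, d))
    return kept


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
mst = minimum_spanning_forest(len(points), knn_edges(points, k=3))
print(len(mst))  # 5 edges connect the 6 points
print(mst)
```

If the k-NN graph is disconnected (for example, when `k` is too small), the result is a forest rather than a single tree, which is why the helper is named accordingly.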
Development
git clone https://github.com/afloresep/tmap2.git
cd tmap2
pip install ".[dev]"
pytest -v
License
MIT License - see LICENSE for details.
Based on the original TMAP by Daniel Probst and Jean-Louis Reymond.