Tree-based visualization for high-dimensional data

Supported Python versions: 3.11+

TMAP2

Tree-based visualization for high-dimensional data. Organizes similar items into interactive tree structures. Ideal for chemical space, protein embeddings, single-cell data, or any high-dimensional dataset.

[Figures: interactive HTML export; AlphaFold protein clusters]

Why Trees?

Most dimensionality reduction tools (UMAP, t-SNE) produce point clouds. TMAP produces a tree, a connected structure where every point is linked to its neighbors through branches. This makes the layout itself explorable: you can follow branches, trace paths between any two points, and discover how regions connect.

For example, in a TMAP of pet breed images, following the branch from terriers toward cats reveals that the bridge between the two groups runs through chihuahuas and sphynx cats (the bald ones). This is both hilarious and logical: both are small, short-haired, and big-eyed. The tree doesn't just cluster similar things; it also shows you how dissimilar things are connected.

[Figure: exploring the pet breed tree]

Because the layout is a tree, you get operations that point clouds can't support:

path = model.path(idx_a, idx_b)         # nodes along the tree path
d = model.distance(idx_a, idx_b)        # sum of edge weights along the path
pseudotime = model.distances_from(idx)  # tree distance from one point to all others
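To make these three operations concrete, here is a minimal, library-independent sketch of what they compute on a small weighted tree. The adjacency dictionary `TREE` and the helper functions are hypothetical illustrations, not TMAP's internal data structures or implementation; they only mirror the meaning given in the comments above (nodes along the path, sum of edge weights, distance from one node to all others).

```python
from collections import deque

# Hypothetical toy tree: node -> {neighbor: edge weight}.
# This is an illustration only, not TMAP's internal format.
TREE = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 2.0, 3: 0.5},
    2: {1: 2.0},
    3: {1: 0.5, 4: 1.5},
    4: {3: 1.5},
}


def distances_from(tree, source):
    """Return (dist, parent): tree distance from `source` to every node,
    plus the predecessor of each node on its unique path from `source`."""
    dist = {source: 0.0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr, weight in tree[node].items():
            if nbr not in dist:  # in a tree, each node is reached exactly once
                dist[nbr] = dist[node] + weight
                parent[nbr] = node
                queue.append(nbr)
    return dist, parent


def path(tree, a, b):
    """Nodes along the unique tree path from `a` to `b`."""
    _, parent = distances_from(tree, a)
    nodes = [b]
    while nodes[-1] != a:
        nodes.append(parent[nodes[-1]])
    return nodes[::-1]


def distance(tree, a, b):
    """Sum of edge weights along the path from `a` to `b`."""
    dist, _ = distances_from(tree, a)
    return dist[b]


print(path(TREE, 0, 4))         # [0, 1, 3, 4]
print(distance(TREE, 0, 4))     # 3.0
print(distances_from(TREE, 2)[0])
```

Because a tree has exactly one path between any two nodes, a single traversal is enough; no shortest-path search is needed. A point cloud has no edges, so none of these quantities is defined for it.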

Installation

pip install tmap2

Optional extras:

pip install rdkit # chemistry helpers (fingerprints_from_smiles, molecular_properties)
pip install jupyter-scatter # notebook interactive widgets

Note: The import name is tmap, not tmap2.

Quick Start

import numpy as np
from tmap import TMAP

# Binary fingerprints (Jaccard)
X = np.random.randint(0, 2, (1000, 2048), dtype=np.uint8)
model = TMAP(metric="jaccard", n_neighbors=20, seed=42).fit(X)
model.to_html("map.html")

# Dense embeddings (cosine / euclidean)
X = np.random.random((1000, 128)).astype(np.float32)
model = TMAP(metric="cosine", n_neighbors=20).fit(X)
new_coords = model.transform(X[:10])

# Interactive notebook widget
# (df is assumed to be a pandas DataFrame with one row per point
#  and "label", "name", and "score" columns; it is not defined above)
model.plot(color_by="label", data=df, tooltip_properties=["name", "score"])

Key Features

  • Tree structure: follow branches, trace paths, compute pseudotime
  • Deterministic: same input + seed = same output
  • Multiple metrics: jaccard, cosine, euclidean, precomputed
  • Incremental: add_points() and transform() for new data
  • Model persistence: save() / load()
  • Three viz backends: interactive HTML, jupyter-scatter, matplotlib
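As a rough guide to what the listed metrics measure, the sketch below computes Jaccard, cosine, and Euclidean distances in plain Python. These are the textbook definitions, written out for illustration; the function names are hypothetical and this is not how the library computes them internally. Returning 0.0 for two all-zero fingerprints is an assumption made here, since the Jaccard distance is undefined in that case.

```python
import math


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two equal-length 0/1 vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    # Assumption: two empty fingerprints are treated as identical.
    return 0.0 if union == 0 else 1.0 - inter / union


def cosine_distance(a, b):
    """1 - cos(angle between a and b); ignores vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def euclidean_distance(a, b):
    """Straight-line distance between two dense vectors."""
    return math.dist(a, b)


fp_a = [1, 1, 0, 1, 0]
fp_b = [1, 0, 0, 1, 1]
print(jaccard_distance(fp_a, fp_b))          # 0.5 (2 shared bits, 4 set in either)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))   # 1.0 (orthogonal vectors)
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

Jaccard suits binary fingerprints and sets, while cosine and Euclidean suit dense embeddings; cosine disregards magnitude, so it is often preferred when only direction carries meaning.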

Visualization

Interactive HTML: lasso selection, light/dark theme, filter and search panels, pinned metadata cards, binary mode for large datasets.

Notebook widgets: color switching, categorical filtering, and lasso selection with pandas-backed metadata:

viz = model.to_tmapviz()
viz.add_color_layout("Molecular Weight", mw.tolist(), categorical=False)
viz.add_color_layout("Scaffold", scaffolds, categorical=True, color="tab10")
viz.add_label("SMILES", smiles_list)
viz.show(width=1000, height=620, controls=True)

Static plots: matplotlib for publication figures, e.g. model.plot_static(color_by=labels)

Domain Utilities

Built-in helpers for common scientific workflows:

from tmap.utils.chemistry import fingerprints_from_smiles, molecular_properties
from tmap.utils.proteins import fetch_uniprot, sequence_properties
from tmap.utils.singlecell import from_anndata

| Domain | Metric | Utilities |
| --- | --- | --- |
| Chemoinformatics | jaccard | fingerprints_from_smiles, molecular_properties, murcko_scaffolds |
| Proteins | cosine / euclidean | fetch_uniprot, fetch_alphafold, read_fasta, sequence_properties |
| Single-cell | cosine / euclidean | from_anndata, cell_metadata, marker_scores |
| Generic embeddings | cosine / euclidean / precomputed | No domain utils needed |

Notebooks

| # | Notebook | Topic |
| --- | --- | --- |
| 01 | Quick Start | End-to-end walkthrough |
| 02 | MinHash Deep Dive | Encoding methods and when to use each |
| 03 | Legacy LSH Pipeline | Lower-level MinHash + LSHForest + layout workflow |
| 04 | Notebook Widgets | Selection, filtering, zoom, export |
| 05 | Single-Cell | RNA-seq with PBMC 3k, pseudotime, UMAP comparison |
| 06 | Metric Guide | Choosing the right metric |
| 07 | FAQ | Troubleshooting and common questions |
| 08 | Cheminformatics | Molecules, fingerprints, SAR |
| 09 | Protein Analysis | FASTA, ESM embeddings, AlphaFold |
| 10 | Card Configuration | Pinned card layout, fields, and links |
| 11 | Default Params Benchmark | Defaults across dataset sizes and types |
| 12 | USearch Jaccard | Binary Jaccard with USearch backend |

Lower-Level Pipeline

For direct control over indexing, hashing, and layout, see the legacy pipeline notebook. The main building blocks:

from tmap.index import USearchIndex           # dense / binary kNN
from tmap import MinHash, LSHForest           # Jaccard on sets / strings
from tmap.layout import LayoutConfig, layout_from_lsh_forest

Your Data
   ├─→ Binary / dense matrix ─→ USearch        (Jaccard / cosine / euclidean)
   └─→ Sets / strings ───────→ MinHash → LSHForest
                ↓
             k-NN Graph → MST → OGDF Tree Layout → Interactive Visualization
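The "k-NN Graph → MST" step in the diagram above can be illustrated with a small standard-library sketch. It is a conceptual stand-in under stated assumptions, not the library's code: brute-force Euclidean neighbor search replaces USearch or LSHForest, Kruskal's algorithm with union-find builds the spanning tree, and the OGDF layout step is omitted. The function names `knn_edges` and `minimum_spanning_forest` are hypothetical.

```python
import math


def knn_edges(points, k):
    """Brute-force k-NN graph as sorted undirected edges (weight, i, j)."""
    edges = set()
    for i, p in enumerate(points):
        # Distance from point i to every other point, nearest first.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for d, j in dists[:k]:
            edges.add((d, min(i, j), max(i, j)))  # store each edge once
    return sorted(edges)


def minimum_spanning_forest(n, edges):
    """Kruskal's algorithm over edges sorted by weight.

    Returns tree edges (i, j, weight). If the k-NN graph is disconnected,
    the result is a forest with fewer than n - 1 edges.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


# Four points on a line; the k-NN graph has 5 edges, the tree keeps 3.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
graph = knn_edges(points, k=2)
mst = minimum_spanning_forest(len(points), graph)
print(len(graph))  # 5
print(mst)         # [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 2.0)]
```

The spanning tree keeps only the cheapest edges needed to connect the points, which is why the final layout contains no cycles. With a small `k`, the neighbor graph can split into several components, in which case this sketch returns a forest rather than a single tree.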

Development

git clone https://github.com/afloresep/tmap2.git
cd tmap2
pip install ".[dev]"
pytest -v

License

MIT License - see LICENSE for details.

Based on the original TMAP by Daniel Probst and Jean-Louis Reymond.
