Optimal Transport-based Matrix Factorization for spatial transcriptomics deconvolution.
spOT-NMF
Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. Abdelkareem, A.O. et al. (2025)
spOT-NMF is a Python package for unsupervised deconvolution and discovery of gene programs in spatial transcriptomics. It integrates Optimal Transport (OT) into a non-negative matrix factorization (NMF) framework, enabling robust topic modeling, high-resolution spatial deconvolution, and rich biological annotation.
This package supports the analyses in: spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics — bioRxiv (2025). DOI: 10.1101/2025.08.02.668292
✨ Key Features
- OT-NMF Deconvolution: Reference-free topic modeling with OT-regularized NMF.
- HVG Selection: Flexible, batch-aware highly variable gene selection.
- Biological Annotation: Automated enrichment and gene-set overlap of inferred programs.
- Spatial Visualization: Publication-quality spatial plots for topic/program usage.
- Scalable & Modular: Built for large datasets and multi-sample workflows.
- CLI & Python API: Run from the command line or import in notebooks.
📦 Installation
spOT-NMF requires Python ≥ 3.12. We recommend uv for a fast, reproducible setup. PyTorch is installed separately so you can pick the build (CPU or CUDA) for your platform.
Recommended: uv
# 1. Create and activate an isolated environment (uv fetches Python 3.12 if needed)
uv venv --python 3.12
# Linux/macOS: source .venv/bin/activate
# Windows: .venv\Scripts\activate
# 2. Install PyTorch for your platform (see pytorch.org)
# CPU-only:
uv pip install torch --index-url https://download.pytorch.org/whl/cpu
# CUDA 12.x (Linux/Windows with NVIDIA GPUs):
# uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
# 3. Install spOT-NMF
uv pip install spot-nmf
Alternative: pip
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install spot-nmf
From source (development)
git clone https://github.com/MorrissyLab/spOT-NMF.git
cd spOT-NMF
uv venv --python 3.12
uv pip install torch --index-url https://download.pytorch.org/whl/cpu
uv pip install -e ".[dev]" # editable install with test dependencies
uv run pytest -q # run the test suite
Verify the install
spotnmf --help
If no GPU is available, spOT-NMF automatically runs on CPU.
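To see what your environment provides before a run, you can query it from Python. The sketch below uses only the standard library plus an optional torch import. The helper name environment_report is ours and is not part of spOT-NMF, and it reports what PyTorch can see rather than reproducing the package's own device selection.

```python
import shutil
from importlib import metadata


def environment_report() -> dict:
    """Summarize the install: package version, CLI location, PyTorch device."""
    report = {}

    # Distribution name as used in `pip install spot-nmf`.
    try:
        report["spot_nmf_version"] = metadata.version("spot-nmf")
    except metadata.PackageNotFoundError:
        report["spot_nmf_version"] = None

    # Path of the `spotnmf` console script, or None if it is not on PATH.
    report["cli_path"] = shutil.which("spotnmf")

    # PyTorch is installed separately, so tolerate its absence.
    try:
        import torch
        report["device"] = "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        report["device"] = None

    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

A device value of "cpu" means PyTorch found no usable CUDA GPU, which is the case where the CPU fallback described above applies.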
🚀 Quick Start
Command Line
Full pipeline (deconvolution → annotation → spatial plots → networks):
spotnmf spotnmf \
--sample_name SAMPLE1 \
--adata_path ./data/sample1.h5ad \
--data_mode h5ad \
--results_dir ./results \
--k 5 \
--genome GRCh38
--data_mode selects how the input is read: h5ad for a single AnnData .h5ad file, visium (the default) for a Space Ranger output directory, or visium_hd for Visium HD. Pass --data_mode h5ad whenever --adata_path points to a .h5ad file.
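When scripting many runs, the rule above can be encoded in a small helper. This is a minimal sketch: infer_data_mode and its visium_hd flag are hypothetical names, the example paths are made up, and the helper cannot tell a Visium HD directory from a standard one on its own, so that choice stays with the caller.

```python
from pathlib import Path


def infer_data_mode(path: Path, visium_hd: bool = False) -> str:
    """Pick a --data_mode value for an input path (hypothetical helper)."""
    if path.suffix.lower() == ".h5ad":
        return "h5ad"
    # Anything else is assumed to be a Space Ranger output directory.
    return "visium_hd" if visium_hd else "visium"


print(infer_data_mode(Path("data/sample1.h5ad")))               # h5ad
print(infer_data_mode(Path("data/sample1_spaceranger")))        # visium
print(infer_data_mode(Path("data/sample1_hd"), visium_hd=True)) # visium_hd
```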
Other commands:
spotnmf deconvolve --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --data_mode h5ad --results_dir ./results --k 5
spotnmf plot --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --data_mode h5ad --results_dir ./results
spotnmf annotate --sample_name SAMPLE1 --results_dir ./results --genome GRCh38
spotnmf network --sample_name SAMPLE1 --results_dir ./results --usage_threshold 0 --n_bins 1000 --edge_threshold 0.199
The network command reuses the per-spot usages written by deconvolve. On small datasets no topic pairs may pass --n_bins/--edge_threshold; in that case it prints a notice and skips plotting. Lower the thresholds to force a graph.
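For multi-sample workflows, the same commands can be generated from Python instead of being typed per sample. The sketch below only builds and prints deconvolve invocations using the flags shown above. The function name, the sample names, and the file paths are illustrative assumptions, and so is the choice to share one results directory across samples.

```python
import shlex
from pathlib import Path


def deconvolve_command(sample_name: str, adata_path: Path,
                       results_dir: Path, k: int) -> list[str]:
    """Build the argument list for one `spotnmf deconvolve` run."""
    return [
        "spotnmf", "deconvolve",
        "--sample_name", sample_name,
        "--adata_path", str(adata_path),
        "--data_mode", "h5ad",
        "--results_dir", str(results_dir),
        "--k", str(k),
    ]


# Illustrative inputs; replace with your own samples.
samples = {
    "SAMPLE1": Path("data/sample1.h5ad"),
    "SAMPLE2": Path("data/sample2.h5ad"),
}
commands = [
    deconvolve_command(name, path, Path("results"), k=5)
    for name, path in samples.items()
]

for cmd in commands:
    # Print a shell-quoted version for inspection.
    print(shlex.join(cmd))
    # To execute instead: subprocess.run(cmd, check=True)
```

Passing the arguments as a list keeps sample names and paths with spaces intact if you later hand the command to subprocess.run.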
Python / Notebooks
from pathlib import Path
import spotnmf as spot
# === Configuration === #
DATA_PATH = Path("data/test_data/dataset10_adata_spatial.h5ad")
RESULTS_DIR = Path(r"/data/test_results/")
SAMPLE_NAME = "TestSample"
GENOME = "mm10"
# === Read Data === #
adata = spot.io.read_adata(
    data_path=DATA_PATH,
    data_mode="h5ad",
)
# === Model Parameters === #
model_params = {
    "lr": 0.001,  # Learning rate
    "h": 0.01,  # H regularization
    "w": 0.01,  # W regularization
    "eps": 0.05,  # Epsilon
    "normalize_rows": True,
}
# === Run Factorization === #
results = spot.cli.run_experiment(
    adata_spatial=adata,
    k=5,  # Number of topics (factorization rank)
    sample_name=SAMPLE_NAME,
    results_dir=str(RESULTS_DIR),
    genome=GENOME,
    annotate=False,
    plot=False,
    network=False,
    is_visium=True,
    model_params=model_params,
)
# === Annotate Programs === #
spot.cli.annotate_programs(
    results_dir=str(RESULTS_DIR),
    sample_name=SAMPLE_NAME,
    genome=GENOME,
)
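If you want to compare several settings, one option is to generate the run configurations up front. The following is a sketch only: the grid values are arbitrary illustrations rather than recommended settings, and the naming scheme for sample_name is our own. It prepares keyword arguments matching the k, sample_name, and model_params parameters used above and does not call spOT-NMF itself.

```python
from itertools import product

# Defaults copied from the example above.
base_params = {"lr": 0.001, "h": 0.01, "w": 0.01, "eps": 0.05, "normalize_rows": True}

# Illustrative grid; the values are assumptions, not recommendations.
grid = {"k": [5, 10], "eps": [0.01, 0.05]}

runs = []
for k, eps in product(grid["k"], grid["eps"]):
    params = {**base_params, "eps": eps}  # copy, so base_params stays untouched
    runs.append({
        "sample_name": f"TestSample_k{k}_eps{eps}",
        "k": k,
        "model_params": params,
    })

for run in runs:
    print(run["sample_name"], run["model_params"])
    # Each dict can then be unpacked into run_experiment alongside the
    # remaining arguments (adata_spatial, results_dir, genome, ...).
```

Encoding the settings in sample_name keeps the outputs of different runs from overwriting each other, assuming output files are named after the sample as listed in the Outputs section.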
📓 Tutorials
A fully worked, well-commented notebook runs the entire pipeline end-to-end on the small example dataset that ships with the repo (CPU-only, ~1 minute) — loading data, selecting HVGs, running the OT-NMF deconvolution, mapping programs spatially, extracting marker genes, and validating the recovered programs against ground-truth cell types. All figures are pre-rendered in the notebook.
- Full pipeline tutorial → (docs/source/tutorials/full_pipeline.ipynb)
GitHub renders the notebook (with figures) directly in the browser — just click the link.
⚙️ CLI Overview
| Command | Description |
|---|---|
| spotnmf | Full pipeline: deconvolution → annotation → spatial plotting |
| deconvolve | Run OT-NMF and save results |
| plot | Visualize spatial topic/program usage |
| annotate | Enrich and annotate gene programs |
| network | Visualize niche networks based on topic interactions |
Run spotnmf <command> --help for per-command options.
📁 Outputs
- topics_per_spot_{sample}.csv: topic/program usage per spot
- genescores_per_topic_{sample}.csv: gene scores per topic
- ranked_genescores_{sample}.csv: ranked marker genes per topic
- Pathway enrichment and gene-set overlap tables
- Spatial plots & QC visualizations
- Network plots of topic–topic interactions
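The per-spot usage table is a natural starting point for downstream analysis, for example assigning each spot its dominant topic. The sketch below assumes a layout that this README does not confirm: one row per spot, the spot identifier in the first column, and one column per topic. The inline data and column names are made up so that the example runs without any output files.

```python
import csv
import io

# Made-up stand-in for topics_per_spot_{sample}.csv; real column names may differ.
EXAMPLE_CSV = """spot,Topic_1,Topic_2,Topic_3
AAAC-1,0.70,0.20,0.10
AAAG-1,0.05,0.15,0.80
AACT-1,0.30,0.60,0.10
"""


def dominant_topics(handle) -> dict[str, str]:
    """Map each spot to the topic column with the highest usage."""
    reader = csv.reader(handle)
    header = next(reader)
    topics = header[1:]  # assumes the first column holds spot identifiers
    result = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        usages = [float(value) for value in row[1:]]
        best = max(range(len(usages)), key=usages.__getitem__)
        result[row[0]] = topics[best]
    return result


print(dominant_topics(io.StringIO(EXAMPLE_CSV)))
# {'AAAC-1': 'Topic_1', 'AAAG-1': 'Topic_3', 'AACT-1': 'Topic_2'}
```

With a real file, pass open(path, newline="") in place of the StringIO object, and check the header first to confirm the layout matches.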
🔬 Reproducibility (Manuscript Notebooks)
The main branch provides the reusable software package.
The original Jupyter notebooks used to reproduce manuscript figures are maintained in the manuscript branch:
git fetch origin
git checkout manuscript
Notebooks are in:
scripts/manuscript_notebooks/
Use manuscript to regenerate paper figures; use main for running the package on your data.
🧾 Citation
Please cite:
Abdelkareem, A.O., Gill, G.S., Manoharan, V.T., Verhey, T.B., & Morrissy, A.S. spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. bioRxiv (2025). https://doi.org/10.1101/2025.08.02.668292
@article{abdelkareem2025spotnmf,
title = {spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics},
author = {Abdelkareem, Aly O. and Gill, Gurveer S. and Manoharan, Varsha Thoppey and Verhey, Theodore B. and Morrissy, A. Sorana},
journal = {bioRxiv},
year = {2025},
doi = {10.1101/2025.08.02.668292},
url = {https://www.biorxiv.org/content/10.1101/2025.08.02.668292v1},
note = {Preprint}
}
🤝 Contributing
We welcome ideas, bug reports, and feature requests—please open a GitHub Issue: https://github.com/MorrissyLab/spOT-NMF/issues
📜 License
GPL-3.0. See LICENSE for details.
💬 Support
Questions or need help? Open an Issue: https://github.com/MorrissyLab/spOT-NMF/issues