
Optimal Transport-based Matrix Factorization for spatial transcriptomics deconvolution.


spOT-NMF

Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. Abdelkareem, A.O. et al. (2025)

spOT-NMF is a Python package for unsupervised deconvolution and discovery of gene programs in spatial transcriptomics. It integrates Optimal Transport (OT) into a non-negative matrix factorization (NMF) framework, enabling robust topic modeling, high-resolution spatial deconvolution, and rich biological annotation.

This package supports the analyses in: spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics — bioRxiv (2025). DOI: 10.1101/2025.08.02.668292
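
For readers new to NMF, the sketch below shows the plain factorization that the description above builds on: a non-negative matrix X (spots × genes) is approximated by W·H, where W holds per-spot topic usage and H holds per-topic gene scores. It uses the classic multiplicative updates for squared error and is dependency-free on purpose. It is not the spOT-NMF algorithm: the package replaces this loss with an Optimal Transport-based objective, and every name here (`nmf`, `frobenius_error`, `tiny`) is illustrative rather than part of the package API.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Frobenius norm of the residual X - W·H."""
    wh = matmul(w, h)
    return sum((xi - yi) ** 2
               for x_row, y_row in zip(x, wh)
               for xi, yi in zip(x_row, y_row)) ** 0.5


def nmf(x, k, n_iter=200, seed=0, tiny=1e-9):
    """Plain NMF with multiplicative updates (squared-error loss, no OT term)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random initialisation keeps both factors non-negative.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + tiny) for j in range(m)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + tiny) for j in range(k)]
             for i in range(n)]
    return w, h


# Toy 4 spots x 3 genes matrix built from two underlying programs.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 0.0, 4.0], [0.0, 3.0, 3.0]]
W, H = nmf(X, k=2)
print("reconstruction error:", frobenius_error(X, W, H))
```

In this picture, W corresponds to the per-spot usage table and H to the per-topic gene scores that the package writes out.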


✨ Key Features

  • OT-NMF Deconvolution: Reference-free topic modeling with OT-regularized NMF.
  • HVG Selection: Flexible, batch-aware highly variable gene selection.
  • Biological Annotation: Automated enrichment and gene-set overlap of inferred programs.
  • Spatial Visualization: Publication-quality spatial plots for topic/program usage.
  • Scalable & Modular: Built for large datasets and multi-sample workflows.
  • CLI & Python API: Run from the command line or import in notebooks.

📦 Installation

  1. Install PyTorch (CPU or CUDA) for your platform (see pytorch.org). Examples:
# CPU-only
pip install torch --index-url https://download.pytorch.org/whl/cpu
# CUDA 11.8 (Linux/Windows with NVIDIA GPUs)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  2. Install spOT-NMF:
pip install spot-nmf
  3. Verify the CLI:
spotnmf --help

Conda users:

conda create -n spotnmf python=3.12
conda activate spotnmf
# install torch as above, then:
pip install spot-nmf
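
As an optional complement to `spotnmf --help`, the install can also be checked from Python. The sketch below only tests whether the modules are importable and, if PyTorch is present, reports whether CUDA is visible. The helper name `check_packages` is hypothetical; `spotnmf` is the import name used in the Quick Start.

```python
import importlib.util


def check_packages(names):
    """Return {module_name: bool} telling whether each top-level module is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    status = check_packages(["torch", "spotnmf"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")

    # Only import torch if it is actually installed.
    if status["torch"]:
        import torch
        print("CUDA available:", torch.cuda.is_available())
```

If `torch` is reported missing, install it first as in step 1, then reinstall `spot-nmf`.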

🚀 Quick Start

Command Line

Full pipeline (deconvolution → annotation → spatial plots):

spotnmf spotnmf \
  --sample_name SAMPLE1 \
  --adata_path ./data/sample1.h5ad \
  --results_dir ./results \
  --k 5

Other commands:

spotnmf deconvolve --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --results_dir ./results --k 5
spotnmf plot       --sample_name SAMPLE1 --adata_path ./data/sample1.h5ad --results_dir ./results
spotnmf annotate   --sample_name SAMPLE1 --results_dir ./results --genome GRCh38
spotnmf network    --sample_name SAMPLE1 --results_dir ./results --usage_threshold 0 --n_bins 1000 --edge_threshold 0.199
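
For multi-sample runs, the same commands can be generated from Python. The sketch below only assembles argument lists from the flags shown above (`--sample_name`, `--adata_path`, `--results_dir`, `--k`) and prints them. The helper `build_deconvolve_command`, the second sample, and its path are hypothetical; uncomment the `subprocess.run` line to actually launch the CLI.

```python
import shlex
import subprocess  # used only if the run line below is uncommented


def build_deconvolve_command(sample_name, adata_path, results_dir, k):
    """Assemble the argv list for `spotnmf deconvolve` using the flags from the README."""
    return [
        "spotnmf", "deconvolve",
        "--sample_name", sample_name,
        "--adata_path", str(adata_path),
        "--results_dir", str(results_dir),
        "--k", str(k),
    ]


# Hypothetical sample sheet: sample name -> path to its .h5ad file.
samples = {
    "SAMPLE1": "./data/sample1.h5ad",
    "SAMPLE2": "./data/sample2.h5ad",
}

for name, path in samples.items():
    cmd = build_deconvolve_command(name, path, "./results", k=5)
    print(shlex.join(cmd))  # show the exact shell command
    # subprocess.run(cmd, check=True)  # uncomment to execute
```

Passing the arguments as a list avoids shell-quoting problems with sample names or paths that contain spaces.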

Python / Notebooks

from pathlib import Path

import spotnmf as spot

# === Configuration === #
DATA_PATH = Path("data/test_data/dataset10_adata_spatial.h5ad")
RESULTS_DIR = Path(r"/data/test_results/")
SAMPLE_NAME = "TestSample"
GENOME = "mm10"

# === Read Data === #
adata = spot.io.read_adata(
    data_path=DATA_PATH,
    data_mode="h5ad"
)

# === Model Parameters === #
model_params = {
    "lr": 0.001,         # Learning rate
    "h": 0.01,           # H regularization
    "w": 0.01,           # W regularization
    "eps": 0.05,         # Epsilon
    "normalize_rows": True,
}

# === Run Factorization === #
results = spot.cli.run_experiment(
    adata_spatial=adata,
    k=5,                        # Number of topics/programs (factorization rank)
    sample_name=SAMPLE_NAME,
    results_dir=str(RESULTS_DIR),
    genome=GENOME,
    annotate=False,
    plot=False,
    network=False,
    is_visium=True,
    model_params=model_params,
)

# === Annotate Programs === #
spot.cli.annotate_programs(
    results_dir=str(RESULTS_DIR),
    sample_name=SAMPLE_NAME,
    genome=GENOME,
)

⚙️ CLI Overview

Command      Description
spotnmf      Full pipeline: deconvolution → annotation → spatial plotting
deconvolve   Run OT-NMF and save results
plot         Visualize spatial topic/program usage
annotate     Enrich and annotate gene programs
network      Visualize niche networks based on topic interactions

Run spotnmf <command> --help for per-command options.


📁 Outputs

  • topics_per_spot_{sample}.csv — topic/program usage per spot
  • genescores_per_topic_{sample}.csv — gene scores per topic
  • ranked_genescores_{sample}.csv — ranked marker genes per topic
  • Pathway enrichment and gene-set overlap tables
  • Spatial plots & QC visualizations
  • Network plots of topic–topic interactions
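
The usage table can be post-processed with the standard library alone. The sketch below assigns each spot its highest-usage topic. It rests on two assumptions that are not confirmed by this README: that `topics_per_spot_{sample}.csv` sits directly under the results directory, and that its first column is the spot identifier with one numeric column per topic after it. Check both against your own output before relying on it; the helper names are hypothetical.

```python
import csv
from pathlib import Path


def topics_file(results_dir, sample):
    """Path of the usage table; ASSUMES it sits directly under results_dir."""
    return Path(results_dir) / f"topics_per_spot_{sample}.csv"


def dominant_topic_per_spot(csv_path):
    """Map each spot to its highest-usage topic.

    ASSUMES: a header row, first column = spot id, remaining columns = numeric usage.
    """
    dominant = {}
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        topic_names = next(reader)[1:]  # header minus the spot-id column
        for row in reader:
            if not row:
                continue  # skip blank lines
            usages = [float(value) for value in row[1:]]
            best = max(range(len(usages)), key=usages.__getitem__)
            dominant[row[0]] = topic_names[best]
    return dominant


path = topics_file("./results", "SAMPLE1")
if path.exists():
    for spot, topic in list(dominant_topic_per_spot(path).items())[:5]:
        print(spot, topic)
else:
    print(f"{path} not found; run the deconvolution first")
```

A hard assignment like this discards mixed spots, so it is best used as a quick sanity check next to the spatial plots rather than as a replacement for the continuous usage values.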

🔬 Reproducibility (Manuscript Notebooks)

The main branch provides the reusable software package. The original Jupyter notebooks used to reproduce manuscript figures are maintained in the manuscript branch:

git fetch origin
git checkout manuscript

Notebooks are in:

scripts/manuscript_notebooks/

Use manuscript to regenerate paper figures; use main for running the package on your data.


🧾 Citation

Please cite:

Abdelkareem, A.O., Gill, G.S., Manoharan, V.T., Verhey, T.B., & Morrissy, A.S. spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics. bioRxiv (2025). https://doi.org/10.1101/2025.08.02.668292

@article{abdelkareem2025spotnmf,
  title   = {spOT-NMF: Optimal Transport-Based Matrix Factorization for Accurate Deconvolution of Spatial Transcriptomics},
  author  = {Abdelkareem, Aly O. and Gill, Gurveer S. and Manoharan, Varsha Thoppey and Verhey, Theodore B. and Morrissy, A. Sorana},
  journal = {bioRxiv},
  year    = {2025},
  doi     = {10.1101/2025.08.02.668292},
  url     = {https://www.biorxiv.org/content/10.1101/2025.08.02.668292v1},
  note    = {Preprint}
}

🤝 Contributing

We welcome ideas, bug reports, and feature requests—please open a GitHub Issue: https://github.com/MorrissyLab/spOT-NMF/issues


📜 License

GPL-3.0. See LICENSE for details.


💬 Support

Questions or need help? Open an Issue: https://github.com/MorrissyLab/spOT-NMF/issues
