SpatialFusion: Deep learning models for spatial omics data analysis.

SpatialFusion

SpatialFusion is a Python package for deep learning-based analysis of spatial omics data. It provides a lightweight framework that integrates spatial transcriptomics (ST) with H&E histopathology to learn joint multimodal embeddings of cellular neighborhoods and group them into spatial niches.

The method operates at single-cell resolution, and can be applied to:

  • paired ST + H&E datasets
  • H&E whole-slide images alone

By combining molecular and morphological features, SpatialFusion captures coordinated patterns of tissue architecture and gene expression. A key design principle is a biologically informed definition of niches: not simply spatial neighborhoods, but reproducible microenvironments characterized by pathway-level activation signatures and functional coherence across tissues. To reflect this prior, the latent space of the model is trained to encode biologically meaningful pathway activations, enabling robust discovery of integrated niches.

The method is described in the paper: XXX (citation forthcoming).

You can find detailed documentation at https://uhlerlab.github.io/spatialfusion/


Installation

We provide pretrained weights for the multimodal autoencoder (AE) and graph convolutional masked autoencoder (GCN) under data/.

SpatialFusion depends on PyTorch and DGL, which have different builds for CPU and GPU systems. You can install it using pip or inside a conda/mamba environment.


1. Create mamba environment

mamba create -n spatialfusion python=3.10 -y
mamba activate spatialfusion
# Then install GPU or CPU version below

2. Install platform-specific libraries (GPU vs CPU)

GPU (CUDA 12.4)

pip install "torch==2.4.1" "torchvision==0.19.1" \
  --index-url https://download.pytorch.org/whl/cu124
conda install -c dglteam/label/th24_cu124 dgl

Note: TorchText issues exist for this version (https://github.com/pytorch/text/issues/2272); this may affect scGPT.


GPU (CUDA 12.1) - Recommended if using scGPT

pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 \
  --index-url https://download.pytorch.org/whl/cu121
conda install -c dglteam/label/th21_cu121 dgl

# Optional: embeddings used by scGPT
pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0

# Optional: UNI (H&E embedding model)
pip install timm

CPU-only

pip install "torch==2.4.1" "torchvision==0.19.1" \
  --index-url https://download.pytorch.org/whl/cpu
pip install dgl -f https://data.dgl.ai/wheels/torch-2.4/repo.html

# Optional, used for scGPT
pip install --no-cache-dir torchtext==0.18.0 torchdata==0.9.0

# Optional, used for UNI
pip install timm

💡 Replace cu124 with the CUDA version matching your system (e.g., cu121).


3. Install SpatialFusion package

Basic installation - Recommended for users

cd spatialfusion/
pip install -e .

Developer installation - Recommended for contributors

Includes: pytest, black, ruff, sphinx, matplotlib, seaborn.

cd spatialfusion/
pip install -e ".[dev,docs]"

4. Verify Installation

python - <<'PY'
import torch, dgl, spatialfusion
print("Torch:", torch.__version__, "CUDA available:", torch.cuda.is_available())
print("DGL:", dgl.__version__)
print("SpatialFusion OK")
PY
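
The same CUDA check can be used to choose the device string passed to the embedding pipeline. The following is a minimal sketch; pick_device is a hypothetical helper written for illustration and is not part of SpatialFusion.

```python
def pick_device() -> str:
    """Return "cuda:0" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper still runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Using device:", device)
```

The returned string can then be supplied wherever a device argument is expected, so the same script runs on both GPU and CPU-only installations.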

5. Notes

  • Default output directory is:

    $HOME/spatialfusion_runs
    

    Override with:

    export SPATIALFUSION_ROOT=/your/path
    
  • CPU installations work everywhere but are significantly slower.
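
To see which output directory a script would end up with, the documented default and override can be mirrored in a few lines. This is a sketch under the assumption that the override takes precedence whenever SPATIALFUSION_ROOT is set and non-empty; resolve_output_root is a hypothetical name, and the package's own resolution logic may differ.

```python
import os
from pathlib import Path


def resolve_output_root() -> Path:
    """Hypothetical helper: mirror the documented default and override."""
    override = os.environ.get("SPATIALFUSION_ROOT")
    if override:
        # An explicit override wins over the default location.
        return Path(override).expanduser()
    # Documented default: $HOME/spatialfusion_runs
    return Path.home() / "spatialfusion_runs"


print(resolve_output_root())
```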


Usage Example

A minimal example showing how to embed a dataset using the pretrained AE and GCN:

from spatialfusion.embed.embed import AEInputs, run_full_embedding
import pandas as pd
import scanpy as sc

# Load external embeddings (UNI + scGPT)
uni_df = pd.read_parquet('UNI.parquet')
scgpt_df = pd.read_parquet('scGPT.parquet')

# Load AnnData object
adata = sc.read_h5ad("object.h5ad")

# Mapping sample_name -> AEInputs
sample_name = 'sample1'
ae_inputs_by_sample = {
    sample_name: AEInputs(
        adata=adata,
        z_uni=uni_df,
        z_scgpt=scgpt_df,
    ),
}

# Run the multimodal embedding pipeline
emb_df = run_full_embedding(
    ae_inputs_by_sample=ae_inputs_by_sample,
    device="cuda:0",  # use "cpu" on CPU-only installations
    combine_mode="average",
    spatial_key='spatial',
    celltype_key='major_celltype',
    save_ae_dir=None,  # optional
)

This produces a DataFrame containing the final integrated embedding for all cells/nuclei.
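
Before building AEInputs, it can help to confirm that the UNI and scGPT tables cover the same cells as the AnnData object. The sketch below assumes that both embedding tables are indexed by the same cell identifiers as adata.obs_names, which this README does not state explicitly; report_missing_cells is a hypothetical helper, not a SpatialFusion function.

```python
def report_missing_cells(reference_ids, **embedding_ids):
    """Return, for each named embedding, the reference cell IDs it lacks."""
    missing = {}
    for name, ids in embedding_ids.items():
        present = set(ids)
        missing[name] = [cid for cid in reference_ids if cid not in present]
    return missing


# Toy IDs standing in for adata.obs_names, uni_df.index and scgpt_df.index.
cells = ["cell_1", "cell_2", "cell_3"]
gaps = report_missing_cells(
    cells,
    uni=["cell_1", "cell_2", "cell_3"],
    scgpt=["cell_1", "cell_3"],
)
print(gaps)  # {'uni': [], 'scgpt': ['cell_2']}
```

With real data the call would look like report_missing_cells(adata.obs_names, uni=uni_df.index, scgpt=scgpt_df.index), again under the indexing assumption above.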


Required Inputs

SpatialFusion operates on a single-cell AnnData object paired with an H&E whole-slide image (WSI). In H&E-only mode, it instead accepts a WSI together with cell coordinates.

AnnData fields

Key                                Description
adata.obsm['spatial']              X/Y centroid coordinates of each cell/nucleus in WSI pixel space.
adata.X                            Raw counts (cell × gene). Must be single-cell resolution.
adata.obs['celltype'] (optional)   Annotated cell types (major_celltype in examples).
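
The fields above can be checked before running the pipeline. This is a minimal sketch that relies only on the .shape attribute and on membership tests, so it runs here on a stand-in object; check_required_fields is a hypothetical helper, and the two-column expectation follows from the X/Y description above.

```python
from types import SimpleNamespace


def check_required_fields(adata, spatial_key="spatial"):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    n_cells = adata.X.shape[0]
    if spatial_key not in adata.obsm:
        problems.append(f"adata.obsm[{spatial_key!r}] is missing")
    else:
        shape = tuple(adata.obsm[spatial_key].shape)
        if shape != (n_cells, 2):
            problems.append(
                f"expected shape ({n_cells}, 2) for coordinates, got {shape}"
            )
    return problems


# Stand-in object exposing only .shape, so the sketch runs without anndata.
toy = SimpleNamespace(
    X=SimpleNamespace(shape=(100, 500)),
    obsm={"spatial": SimpleNamespace(shape=(100, 2))},
)
print(check_required_fields(toy))  # []
```

The same function should work on a real AnnData object, since adata.X and the arrays in adata.obsm expose .shape and adata.obsm supports the in operator.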

Whole-Slide Image (WSI)

A high-resolution H&E image corresponding to the same tissue section used for ST. Used to compute morphology embeddings such as UNI.


Typical Workflow

  1. Prepare ST AnnData and the matched H&E WSI

  2. Run scGPT to compute molecular embeddings

  3. Run UNI to compute morphology embeddings

  4. Run SpatialFusion to integrate all modalities into joint embeddings

  5. Cluster & visualize

    • Leiden clustering
    • UMAP
    • Spatial niche maps
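
Step 5 can be sketched with scanpy. The sketch makes several assumptions: the embedding DataFrame (emb_df in the usage example) is indexed by the same cell IDs as adata.obs_names, its numeric columns are the embedding dimensions, and scanpy is installed with its Leiden dependencies. The function name cluster_niches and the keys "X_spatialfusion" and "niche" are illustrative choices, not part of SpatialFusion.

```python
def cluster_niches(adata, emb_df, resolution=1.0):
    """Leiden-cluster cells on the joint embedding and compute a UMAP."""
    import scanpy as sc  # lazy import: scanpy is only needed when this runs

    # Assumption: emb_df rows are indexed by the same IDs as adata.obs_names,
    # and its numeric columns hold the embedding dimensions.
    numeric = emb_df.select_dtypes(include="number")
    adata.obsm["X_spatialfusion"] = numeric.loc[adata.obs_names].to_numpy()

    # Build the neighbor graph on the joint embedding, then cluster and embed.
    sc.pp.neighbors(adata, use_rep="X_spatialfusion")
    sc.tl.leiden(adata, resolution=resolution, key_added="niche")
    sc.tl.umap(adata)
    return adata
```

After calling cluster_niches(adata, emb_df), sc.pl.umap(adata, color="niche") shows the clusters in UMAP space, and sc.pl.embedding(adata, basis="spatial", color="niche") draws them at the cell coordinates as a simple niche map.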

Tutorials

A complete tutorial notebook is available at:

tutorials/embed-and-finetune-sample.ipynb

Additional required packages (scGPT, UNI dependencies) must be installed manually. Follow the instructions at: https://github.com/bowang-lab/scGPT

We also provide a ready-to-use environment file:

spatialfusion_env.yml

Tutorial data is available on Zenodo: https://zenodo.org/records/17594071


Repository Structure

.
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── spatialfusion
│       ├── data
│       │   ├── checkpoint_dir_ae
│       │   │   └── spatialfusion-multimodal-ae.pt
│       │   └── checkpoint_dir_gcn
│       │       ├── spatialfusion-full-gcn.pt
│       │       └── spatialfusion-he-gcn.pt
│       ├── embed/
│       ├── finetune/
│       ├── models/
│       └── utils/
├── tests
│   ├── test_basic.py
│   ├── test_finetune.py
│   └── test_imports.py
└── tutorials
    ├── data
    └── embed-and-finetune-sample.ipynb

Highlights:

  • src/spatialfusion/data/ - packaged pretrained AE and GCN checkpoints

  • src/spatialfusion/ - main library modules

    • embed/ - embedding utilities & pipeline
    • finetune/ - niche-level finetuning
    • models/ - neural network architectures
    • utils/ - loaders, graph utilities, checkpoint code
  • tests/ - basic test suite

  • tutorials/ - practical examples and sample data


Citing

If you use SpatialFusion, please cite:

Broad Institute Spatial Foundation, SpatialFusion (2025). https://github.com/broadinstitute/spatialfusion

Full manuscript citation will be added when available.


Version

This is the initial public release (v0.1.0).


License

MIT License. See LICENSE for details.
