GPU-accelerated Robust Cell Type Decomposition (RCTD) for spatial transcriptomics using PyTorch

rctd-py

GPU-accelerated spatial transcriptomics deconvolution — 4× faster than R

A Python reimplementation of the spacexr RCTD algorithm (Cable et al., Nature Biotechnology 2022) with GPU acceleration via PyTorch.

Deconvolve spatial transcriptomics spots (Visium, Xenium, MERFISH, Slide-seq, …) into cell type proportions using a scRNA-seq reference — in minutes instead of hours.

✨ Highlights

| Highlight | Details |
| --- | --- |
| 🚀 4× end-to-end speedup | Xenium 58k cells: 12 min (L40S GPU) vs 51 min (R, 8 CPU) |
| 🎯 99.7% concordance with R spacexr | 100% with sigma_override; the per-pixel solver is bit-identical |
| 🔧 Drop-in replacement | Same algorithm, same parameters, near-identical results, just faster |
| 📦 pip install rctd-py | Pure Python, works on CPU out of the box |

Quick Start

from rctd import Reference, run_rctd
import anndata

# Load data
reference = Reference(anndata.read_h5ad("reference.h5ad"), cell_type_col="cell_type")
spatial = anndata.read_h5ad("spatial.h5ad")

# Run RCTD — handles normalization, sigma estimation, and deconvolution
result = run_rctd(spatial, reference, mode="doublet")

📓 Tutorial notebook (marimo) · 🌐 Rendered tutorial

Installation

pip install rctd-py   # CPU (works everywhere; GPU auto-detected if CUDA available)

GPU setup and CUDA compatibility

Recommended setup

Install PyTorch with CUDA before installing rctd-py — pip install rctd-py alone pulls CPU-only PyTorch on most systems:

# CUDA 12.4 (recommended for drivers >= 550.54)
pip install torch --index-url https://download.pytorch.org/whl/cu124
pip install rctd-py

# CUDA 12.1 (for older drivers >= 530.30)
pip install torch --index-url https://download.pytorch.org/whl/cu121

# CUDA 11.8 (legacy, drivers >= 520.61)
pip install torch --index-url https://download.pytorch.org/whl/cu118

Verify GPU detection

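import torch

print(torch.cuda.is_available())         # True  (False means CPU-only torch or driver issue)
if torch.cuda.is_available():
    # get_device_name() raises an error when no CUDA device is usable, so guard it
    print(torch.cuda.get_device_name())  # e.g. 'NVIDIA L40S'
print(torch.version.cuda)                # e.g. '12.4' (None on CPU-only builds)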

CUDA compatibility table

No separate CUDA toolkit installation needed. PyTorch ships its own CUDA runtime — you only need a compatible NVIDIA driver.

| PyTorch version | Bundled CUDA | Minimum NVIDIA driver |
| --- | --- | --- |
| 2.5+ | CUDA 12.4 | >= 550.54 |
| 2.3–2.4 | CUDA 12.1 | >= 530.30 |
| 2.0–2.2 | CUDA 11.8 | >= 520.61 |

Tip: Check your driver version with nvidia-smi (top right of the output). This is the driver version, not the CUDA toolkit version — nvcc --version shows the toolkit version, which is irrelevant here since PyTorch bundles its own runtime.
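
The driver thresholds above can be expressed as a small lookup. The following is only a sketch: `pick_cuda_wheel` is a hypothetical helper (not part of rctd-py or PyTorch), and the thresholds and index URLs are copied from the install commands and table in this section.

```python
from typing import Optional

# (minimum driver version, CUDA wheel tag), copied from the table above, newest first.
_DRIVER_TABLE = [
    ((550, 54), "cu124"),
    ((530, 30), "cu121"),
    ((520, 61), "cu118"),
]


def pick_cuda_wheel(driver_version: str) -> Optional[str]:
    """Map a driver version string (e.g. '550.54.14', as shown by nvidia-smi)
    to a PyTorch wheel index URL, or None if the driver is older than 520.61."""
    parts = driver_version.strip().split(".")
    # Compare only major.minor; any patch component is ignored.
    major_minor = (int(parts[0]), int(parts[1]) if len(parts) > 1 else 0)
    for minimum, tag in _DRIVER_TABLE:
        if major_minor >= minimum:
            return f"https://download.pytorch.org/whl/{tag}"
    return None


print(pick_cuda_wheel("535.104.05"))
```

A return value of `None` means none of the three listed CUDA builds applies, in which case the CPU install is the fallback.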

Tested GPUs

| GPU | VRAM | Speedup (58k cells, K=45) |
| --- | --- | --- |
| NVIDIA L40S | 48 GB | 4.3× |

Memory management

Use the batch_size parameter in run_rctd to control GPU memory usage:

| VRAM | Recommended batch_size |
| --- | --- |
| 24+ GB | 10,000 (default) |
| 8–16 GB | 5,000 |
| < 8 GB | 2,000 |
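
As a rough illustration, the table above can be wrapped in a helper. `recommended_batch_size` is a hypothetical name, not an rctd-py function. The table does not cover cards between 16 and 24 GB; this sketch assumes the more conservative 5,000 for that range.

```python
def recommended_batch_size(vram_gb: float) -> int:
    """Return a batch_size suggestion from the VRAM table above.

    Assumption: cards between 16 and 24 GB, which the table does not list,
    get the conservative 5,000 setting.
    """
    if vram_gb >= 24:
        return 10_000  # default
    if vram_gb >= 8:
        return 5_000
    return 2_000


for vram in (48, 12, 6):
    print(vram, "GB ->", recommended_batch_size(vram))
```

On a CUDA machine the VRAM figure could be read from `torch.cuda.get_device_properties(0).total_memory` (in bytes), and the result passed as `batch_size` to `run_rctd`.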

Deconvolution Modes

| Mode | What it does | Best for |
| --- | --- | --- |
| full | Estimates weights for all K cell types per pixel (constrained IRWLS) | Visium, continuous mixtures |
| doublet | Classifies each pixel as singlet or doublet, estimates top 1–2 types | Slide-seq, sparse spatial |
| multi | Greedy forward selection of up to 4 cell types per pixel | Xenium, MERFISH, dense platforms |

Benchmarks

End-to-end performance (Xenium, 45 cell types, doublet mode, L40S GPU)

[Figure: benchmark bar plot]

| Dataset | # cells | R spacexr (8 CPU) | rctd-py (L40S GPU) | Speedup |
| --- | --- | --- | --- | --- |
| Xenium (large) | 58,191 | 51.1 min | 11.8 min | 4.3× |
| Xenium (small) | 13,940 | 14.1 min | 3.5 min | 4.0× |

Note: The IRWLS solver loop is memory-bandwidth bound for large cell type panels (K=45). The speedup shrinks as the number of cell types grows; smaller panels (K < 20) see larger speedups.
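
The speedup column is simply the ratio of the two wall-clock times. A short check using the numbers from the benchmark table (the dictionary keys are illustrative names):

```python
# Wall-clock times in minutes, copied from the benchmark table above.
benchmarks = {
    "Xenium (large)": {"r_spacexr": 51.1, "rctd_py": 11.8},
    "Xenium (small)": {"r_spacexr": 14.1, "rctd_py": 3.5},
}


def speedup(r_minutes: float, py_minutes: float) -> float:
    """Speedup = baseline time divided by accelerated time."""
    return r_minutes / py_minutes


speedups = {
    name: round(speedup(t["r_spacexr"], t["rctd_py"]), 1)
    for name, t in benchmarks.items()
}
print(speedups)  # {'Xenium (large)': 4.3, 'Xenium (small)': 4.0}
```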

Validation

Validated against R spacexr on two Xenium datasets (45 cell types, 380 genes, doublet mode, UMI_min=20):

| Dataset | # cells | Dominant type agreement | With sigma_override |
| --- | --- | --- | --- |
| Xenium (small) | 13,940 | 99.73% | 100% |
| Xenium (large) | 58,191 | 99.71% | |

The small default gap (0.27% and 0.29% on the two datasets) traces entirely to platform-effect estimation (fit_bulk), not to the per-pixel solver, which is bit-identical to R. All disagreeing pixels are genuinely ambiguous (margin < 0.05 between the top two types).
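
To make the two metrics concrete, here is a minimal sketch. It assumes that "dominant type agreement" means the fraction of pixels whose highest-weight cell type matches between the two implementations, and that "margin" is the difference between the two largest weights of a pixel. The function names and the toy weights are invented for illustration; they are not rctd-py code or real results.

```python
from typing import Sequence


def dominant_type(weights: Sequence[float]) -> int:
    """Index of the cell type with the largest weight (first one on ties)."""
    return max(range(len(weights)), key=lambda k: weights[k])


def top2_margin(weights: Sequence[float]) -> float:
    """Difference between the largest and second-largest weight."""
    ordered = sorted(weights, reverse=True)
    return ordered[0] - ordered[1]


def dominant_type_agreement(w_a, w_b) -> float:
    """Fraction of pixels whose dominant type is the same in both results."""
    matches = sum(dominant_type(a) == dominant_type(b) for a, b in zip(w_a, w_b))
    return matches / len(w_a)


# Toy weights for 4 pixels and 3 cell types (made-up numbers).
w_r = [[0.70, 0.20, 0.10], [0.48, 0.50, 0.02], [0.10, 0.10, 0.80], [0.30, 0.60, 0.10]]
w_py = [[0.68, 0.22, 0.10], [0.51, 0.47, 0.02], [0.10, 0.15, 0.75], [0.25, 0.65, 0.10]]

print(dominant_type_agreement(w_r, w_py))  # 0.75
print(top2_margin(w_r[1]) < 0.05)          # True: the disagreeing pixel is ambiguous
```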

sigma_override is not needed for normal use. The default Python-estimated sigma is valid and produces near-identical results. It exists for specific scenarios:

  • Validation: proving solver equivalence with R
  • Migration: replicating exact R spacexr results when you already have R's sigma
  • Reproducibility: locking sigma to a known value across runs

# Only if you need exact R concordance and know R's sigma value:
result = run_rctd(spatial, reference, mode="doublet", sigma_override=62)

API

run_rctd(spatial, reference, mode, config, batch_size, sigma_override)

End-to-end pipeline. Takes an AnnData spatial object and a Reference, returns a typed result (FullResult, DoubletResult, or MultiResult). Pass sigma_override (int) to skip sigma estimation and use a known value (e.g. from R).

Reference(adata, cell_type_col, cell_min, n_max_cells, min_UMI)

Constructs cell type profiles from a scRNA-seq AnnData. Filters cell types below cell_min, caps per-type cells at n_max_cells.
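
The filtering described above can be pictured with plain Python. This is only an illustration of the idea, not the library's implementation: `select_reference_cells` is a hypothetical helper, and it keeps the first `n_max_cells` cells of each type, whereas the real `Reference` may subsample differently.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def select_reference_cells(
    cell_types: Sequence[str], cell_min: int, n_max_cells: int
) -> Dict[str, List[int]]:
    """Return {cell type: kept cell indices} after the two filters.

    1. Drop cell types with fewer than cell_min cells.
    2. Keep at most n_max_cells cells per remaining type
       (assumption: the first ones encountered).
    """
    counts = Counter(cell_types)
    kept: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(cell_types):
        if counts[label] < cell_min:
            continue  # type too rare to build a profile from
        if len(kept[label]) < n_max_cells:
            kept[label].append(idx)
    return dict(kept)


labels = ["T"] * 5 + ["B"] * 2 + ["NK"] * 1
print(select_reference_cells(labels, cell_min=2, n_max_cells=3))
# {'T': [0, 1, 2], 'B': [5, 6]}
```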

RCTD(spatial, reference, config)

Stateful class for step-by-step control. Call fit_platform_effects(), then run_full_mode, run_doublet_mode, or run_multi_mode.

RCTDConfig — key parameters

| Parameter | Default | Description |
| --- | --- | --- |
| UMI_min | 100 | Minimum UMI count per pixel |
| UMI_min_sigma | 300 | Minimum UMI for sigma estimation |
| N_fit | 1000 | Number of cells used for sigma fitting |
| MAX_MULTI_TYPES | 4 | Max cell types in multi mode |
| CONFIDENCE_THRESHOLD | 5.0 | Singlet confidence threshold |
| DOUBLET_THRESHOLD | 20.0 | Doublet certainty threshold |

Result types

  • FullResult: weights (N×K), cell_type_names, converged
  • DoubletResult: weights, weights_doublet (N×2), spot_class, first_type, second_type
  • MultiResult: weights, cell_type_indices, n_types, conf_list
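
As a hedged example of post-processing, the sketch below averages an N×K weight matrix into one mean proportion per cell type. It assumes `weights` can be iterated row by row and that `cell_type_names` has length K, matching the FullResult fields listed above. `mean_composition` is a hypothetical helper, and the toy input stands in for a real result.

```python
from typing import Dict, Sequence


def mean_composition(
    weights: Sequence[Sequence[float]], cell_type_names: Sequence[str]
) -> Dict[str, float]:
    """Average each cell type's weight over all pixels."""
    n_pixels = len(weights)
    totals = [0.0] * len(cell_type_names)
    for row in weights:
        for k, w in enumerate(row):
            totals[k] += float(w)
    return {name: total / n_pixels for name, total in zip(cell_type_names, totals)}


# Toy stand-in for result.weights and result.cell_type_names (made-up values).
toy_weights = [[0.5, 0.5], [1.0, 0.0]]
toy_names = ["A", "B"]
print(mean_composition(toy_weights, toy_names))  # {'A': 0.75, 'B': 0.25}
```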

Citation

If you use rctd-py, please cite the original RCTD paper:

@article{cable2022robust,
  title={Robust decomposition of cell type mixtures in spatial transcriptomics},
  author={Cable, Dylan M and Murray, Evan and Zou, Luli S and Goeva, Aleksandrina and Macosko, Evan Z and Chen, Fei and Irizarry, Rafael A},
  journal={Nature Biotechnology},
  volume={40},
  pages={517--526},
  year={2022},
  doi={10.1038/s41587-021-00830-w}
}

Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions, or open an issue.

License

GNU General Public License v3.0
