Skip to main content

STEP, an acronym for Spatial Transcriptomics Embedding Procedure, is a deep learning-based tool for the analysis of single-cell RNA (scRNA-seq) and spatially resolved transcriptomics (SRT) data. STEP introduces a unified approach to process and analyze multiple samples of scRNA-seq data as well as align several sections of SRT data, disregarding location relationships. Furthermore, STEP conducts integrative analysis across different modalities like scRNA-seq and SRT.

Project description

STEP: Spatial Transcriptomics Embedding Procedure

Docs Pages PyPI version DOI

image image

Introduction

STEP, an acronym for Spatial Transcriptomics Embedding Procedure, is a foundation deep learning/AI architecture for the analysis of spatially resolved transcriptomics (SRT) data, and is also compatible with scRNA-seq data. STEP roots on the precise captures of three major varitions occured in the SRT (and scRNA-seq) data: Transcriptional Variations, Batch Variations and Spatial Variations with the correponding modular designs: Backbone model: a Transformer based model togther with gene module seqeunce mapping; Batch-effect model: A pair of inverse transformations utilizing the batch-embedding conception for the decoupled batch-effect elimination; Spatial model: a GCN-based spatial filter/smoother working on the extracted embedding from the Backbone model, different from the usage of GCN in other methods as a feature extractor. Thus, with the proper combinations of these models, STEP introduces a unified approach to systematically process and analyze single or multiple samples of SRT data, disregarding location relationships between sections (meaning both contiguous and non-contiguous sections), to reveal multi-scale bilogical heterogeneities (cell types and spatial domains) in multi-resolution SRT data. Furthermore, STEP can also conduct integrative analysis on scRNA-seq and SRT data.

Key Features

  • Integration of multiple scRNA-seq and single-cell resolution SRT samples to reveal cell-type level heterogeneities.
  • Alignment of various SRT data sections contiguous or non-contiguous to identify spatial domains across sections.
  • Scalable to different data resolutions, i.e., wild range of technologies and platforms of SRT data, including Visium HD, Visum, MERFISH, STARmap, Stereo-seq, ST, etc.
  • Scalable to large datasets with a high number of cells and spatial locations.
  • Performance of integrative analysis across modalities (scRNA-seq and SRT) and cell-type deconvolution for the non-single-cell resolution SRT data.

Other Capabilities

  • Capability to produce not only the batch-corrected embeddings but also batch-corrected gene expression profiles for scRNA-seq data.
  • Capability to perform spatial mapping of reference scRNA-seq data points to the spatial locations of SRT data based on the learned co-embeddings and kNN.
  • Comprehensive adata processing by specifically desinged BaseDataset class and its view version MaskedDataset class for the SRT data, which can be easily integrated with the PyTorch DataLoader class for training and validation.
  • Low computational cost and high efficiency in processing large-scale SRT data: 4-8 GB GPU memory is sufficient for processing a dataset with 100,000 spatial locations with 2000+ sample-size in fast mode, and consumes less than 6 minutes for training in 2000 iterations (tested on NVIDIA RTX 3090).

System Requiremtns

  • Software Requirements: Python 3.10+, CUDA 11.6+, PyTorch 1.13.1+, and dgl 1.1.3+.
  • Hardware Requirements: NVIDIA GPU with CUDA support (recommended), 8GB+ GPU memory. As possible as high RAM and CPU cores for storing and processing large-scale data (this is not required by STEP, but by the data itself).

Installation

Quick Installation (Recommended for Python 3.10, Linux, CUDA 11.7)

pip install step-kit[cu117]

This command installs STEP along with compatible versions of PyTorch and DGL.

Basic Installation

pip install step-kit

⚠️ Critical Note:

  • This basic installation does not include PyTorch and DGL.
  • STEP will not function without PyTorch and DGL properly installed.
  • You must install these dependencies separately before using STEP.

Manual Dependency Installation

If you need to install specific versions of PyTorch and DGL:

  1. Install DGL (example for latest DGL compatible with PyTorch 2.3.x and CUDA 11.8):

    pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html
    
  2. Install PyTorch (example for PyTorch 2.3.1 with CUDA 11.8):

    pip install torch==2.3.1+cu118 -f https://download.pytorch.org/whl/cu118/torch_stable.html
    

For more information on installing PyTorch and DGL, visit:

Remember to choose versions compatible with your system configuration and CUDA version (if applicable).

Contribution

We welcome contributions! Please see CONTRIBUTING.md for more details!

License

step is licensed under LICENSE

Contact

If you have any questions, please feel free to contact us at here, or feel free to open an issue on this repository.

Citation

The preprint of STEP is available at bioRxiv. If you use STEP in your research, please cite:

@article{Li2024.04.15.589470,
  title = {{{STEP}}: {{Spatial}} Transcriptomics Embedding Procedure for Multi-Scale Biological Heterogeneities Revelation in Multiple Samples},
  author = {Li, Lounan and Li, Zhong and Li, Yuanyuan and Yin, Xiao-ming and Xu, Xiaojiang},
  year = {2024},
  journal = {bioRxiv : the preprint server for biology},
  eprint = {https://www.biorxiv.org/content/early/2024/04/20/2024.04.15.589470.full.pdf},
  publisher = {Cold Spring Harbor Laboratory},
  doi = {10.1101/2024.04.15.589470},
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

step_kit-0.2.5.tar.gz (85.3 kB view details)

Uploaded Source

Built Distribution

step_kit-0.2.5-py3-none-any.whl (99.2 kB view details)

Uploaded Python 3

File details

Details for the file step_kit-0.2.5.tar.gz.

File metadata

  • Download URL: step_kit-0.2.5.tar.gz
  • Upload date:
  • Size: 85.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Linux/6.6.51

File hashes

Hashes for step_kit-0.2.5.tar.gz
Algorithm Hash digest
SHA256 19d9aa551d5b67e85c2c96c9529828b3dfe044718542f1b4e1a1c127c04bf82f
MD5 508a2c05c0e8f562bacd2c31c6f0f138
BLAKE2b-256 9c8095f305ee75a7336bf5088683ca25956aac78e00d789a37aa106f6d652eae

See more details on using hashes here.

File details

Details for the file step_kit-0.2.5-py3-none-any.whl.

File metadata

  • Download URL: step_kit-0.2.5-py3-none-any.whl
  • Upload date:
  • Size: 99.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Linux/6.6.51

File hashes

Hashes for step_kit-0.2.5-py3-none-any.whl
Algorithm Hash digest
SHA256 e12b9f4294894dfad421dacc1c239290115a9beefa429668b062d65c048a039d
MD5 2bde6a352c034c1203aa9721ac0a1a31
BLAKE2b-256 49ad9f1b2840d1bb96e723d63b3adc343594a8e400d2431c809882417d03b1a4

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page