STDrug: A Computational Method to Use Spatial Transcriptomics to Aid Personalized Drug-reposition Recommendation
Drug repurposing is a cost-effective strategy for accelerating therapeutic discovery, yet existing single-cell RNA-seq (scRNA-seq)-based methods often overlook the spatial context critical for capturing tissue-specific drug responses. We introduce STADS (Spatial Transcriptomics to Aid Drug-reposition Strategy), a personalized computational framework that leverages spatial transcriptomics data to improve drug repurposing.
Illustration of STADS architecture. STADS utilizes paired diseased and normal tissues from the same patients as input for its spatial domain identification module. This module first performs batch correction and sample alignment before applying a GCN combined with the coherent point drift (CPD) algorithm to identify corresponding spatial domains across conditions. These paired spatial domains then serve as inputs for the drug repurposing module, which identifies potentially reversible genes by comparing differentially expressed genes (DEGs) between spatial domains and integrating drug perturbation data from the L1000 dataset. To prioritize key reversible genes, STADS leverages weights extracted from an XGBoost model trained on potential drug information retrieved from GPT-4o. Additionally, STADS accounts for spatial domain interactions in its drug score calculation. The final drug score is computed by integrating spatial domain proportions and interactions, the significance and weighted influence of reversible genes from XGBoost, as well as drug side effect profiles and sensitivity data. The potential drugs are then validated using empirical evidence from literature and clinical trials, LLM-validated potential drug information, in-silico validation using EHR, and in-vitro validation using cell line experiments.
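To make the notion of "reversible genes" more concrete, the following is a minimal sketch and not STDrug's actual implementation. It assumes a disease signature given as per-gene log fold changes (diseased versus normal spatial domain) and a drug signature given as L1000-style z-scores. A gene is treated as reversible when the drug shifts it in the opposite direction of the disease. The function names, the per-gene weights (standing in for model-derived importances), the domain proportions, and the scoring formula itself are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def reversible_genes(disease_lfc: Dict[str, float],
                     drug_z: Dict[str, float]) -> Dict[str, float]:
    """Return genes whose drug response opposes the disease change.

    The value is an illustrative reversal strength: |disease| * |drug|.
    """
    out = {}
    for gene, lfc in disease_lfc.items():
        z = drug_z.get(gene)
        if z is None:
            continue  # gene not measured in the drug signature
        if lfc * z < 0:  # opposite signs: the drug pushes against the disease
            out[gene] = abs(lfc) * abs(z)
    return out


def domain_score(disease_lfc: Dict[str, float],
                 drug_z: Dict[str, float],
                 weights: Dict[str, float]) -> float:
    """Weighted sum of reversal strengths for one spatial domain."""
    rev = reversible_genes(disease_lfc, drug_z)
    # Genes without an explicit weight default to 1.0.
    return sum(weights.get(g, 1.0) * s for g, s in rev.items())


def drug_score(domains: List[Tuple[float, Dict[str, float]]],
               drug_z: Dict[str, float],
               weights: Dict[str, float]) -> float:
    """Combine per-domain scores, weighting each by its domain proportion."""
    return sum(prop * domain_score(lfc, drug_z, weights)
               for prop, lfc in domains)


# Toy inputs, made up for illustration only.
drug = {"GENE_A": -2.0, "GENE_B": 1.0, "GENE_C": 0.5}
domain1 = {"GENE_A": 1.5, "GENE_B": -1.0, "GENE_C": 2.0}
domain2 = {"GENE_A": 0.5, "GENE_B": 1.0}
weights = {"GENE_A": 2.0}  # hypothetical importance weight
score = drug_score([(0.75, domain1), (0.25, domain2)], drug, weights)
print(score)
```

The sketch leaves out spatial domain interactions, side effect profiles, and sensitivity data, all of which the description above says enter the final score.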
Installation
Docker
Install the pre-built STDrug environment on AMD64 Linux using Docker.
docker pull akiyiwen/stdrug:latest
Run the Docker image. Follow the instructions printed in the console, then open a browser and navigate to http://localhost:8888 for Jupyter Notebook or http://localhost:8787 for RStudio. Both tools are needed to run the complete STDrug pipeline. For details, follow the Tutorial and make sure to download the reference data, either inside or outside the Docker container. If the reference data is downloaded outside the container, mount the corresponding path when starting the image. If prompted for a username and password, use the ones printed in the console.
docker run -it --rm -p 8787:8787 -p 8888:8888 akiyiwen/stdrug:latest
# Your user name is: arch
# Your password is: <long password>
# Start Jupyter Notebook at http://localhost:8888
# Start RStudio Server at http://localhost:8787
# If the reference data is outside docker, use -v to mount
# docker run -it --rm -p 8787:8787 -p 8888:8888 -v <downloaded data dir>:/home/arch/data akiyiwen/stdrug:latest
Manual install
STDrug consists of a spatial domain matching module and a drug score calculation module, written in Python and R respectively. For a manual installation and custom package usage, follow the directions below.
The package is tested under Python 3.12 and R 4.5.2.
Install Python requirements using pip:
pip install stdrug
Install R environments using devtools and BiocManager:
install.packages(c("BiocManager", "devtools"))
BiocManager::install(c("cmapR", "limma"))
devtools::install_github(c("immunogenomics/presto", "jinworks/CellChat"))
devtools::install_github("akiyiwen/STdrug")
Alternatively, use renv:
install.packages("renv")
renv::init(bioconductor = TRUE, repos = "https://cloud.r-project.org")
# Restart R session
renv::install("akiyiwen/STdrug")
Tutorial
Data preparation
If working on Linux, use scripts/datasets.sh to download and extract required reference data and example data for STDrug input.
curl -sL 'https://raw.githubusercontent.com/akiyiwen/STdrug/bd2f05cdcfe1af77db95c5796884d72c853f464b/scripts/datasets.sh' | bash
Alternatively, first manually download drug reference data from Dropbox. It is recommended to create a folder named data and extract the reference files under data/reference. After downloading, the folder should have the following structure:
data
└── reference
    ├── drug_validation
    │   ├── liver.csv
    │   └── prostate.csv
    ├── l1000
    │   ├── GSE70138.tar.gz
    │   └── GSE92742.tar.gz
    ├── tahoe
    │   └── drug_ref.rds
    ├── fda.txt
    ├── gdsc.csv
    └── sider.csv
Extract tarball files using tar:
tar -xzvf data/reference/l1000/GSE70138.tar.gz -C data/reference/l1000
tar -xzvf data/reference/l1000/GSE92742.tar.gz -C data/reference/l1000
After extraction, the folder should have the following structure:
data
└── reference
    ├── drug_validation
    │   ├── liver.csv
    │   └── prostate.csv
    ├── l1000
    │   ├── GSE70138
    │   │   ├── GSE70138_Broad_LINCS_cell_info_2017-04-28.txt
    │   │   ├── GSE70138_Broad_LINCS_gene_info_2017-03-06.txt
    │   │   ├── GSE70138_Broad_LINCS_inst_info_2017-03-06.txt
    │   │   ├── GSE70138_Broad_LINCS_Level5_COMPZ_n118050x12328_2017-03-06.gctx
    │   │   └── GSE70138_Broad_LINCS_sig_info_2017-03-06.txt
    │   └── GSE92742
    │       ├── GSE92742_Broad_LINCS_cell_info.txt
    │       ├── GSE92742_Broad_LINCS_gene_info.txt
    │       ├── GSE92742_Broad_LINCS_Level5_COMPZ.MODZ_n473647x12328.gctx
    │       └── GSE92742_Broad_LINCS_sig_info.txt
    ├── tahoe
    │   └── drug_ref.rds
    ├── fda.txt
    ├── gdsc.csv
    └── sider.csv
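Before starting the pipeline, it can help to confirm that the extracted reference data matches the layout above. The following is a small sketch using only the Python standard library. The file list is copied from the tree above, while the helper name `missing_reference_files` and the default location `data/reference` are assumptions for illustration.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED = [
    "drug_validation/liver.csv",
    "drug_validation/prostate.csv",
    "l1000/GSE70138/GSE70138_Broad_LINCS_cell_info_2017-04-28.txt",
    "l1000/GSE70138/GSE70138_Broad_LINCS_gene_info_2017-03-06.txt",
    "l1000/GSE70138/GSE70138_Broad_LINCS_inst_info_2017-03-06.txt",
    "l1000/GSE70138/GSE70138_Broad_LINCS_Level5_COMPZ_n118050x12328_2017-03-06.gctx",
    "l1000/GSE70138/GSE70138_Broad_LINCS_sig_info_2017-03-06.txt",
    "l1000/GSE92742/GSE92742_Broad_LINCS_cell_info.txt",
    "l1000/GSE92742/GSE92742_Broad_LINCS_gene_info.txt",
    "l1000/GSE92742/GSE92742_Broad_LINCS_Level5_COMPZ.MODZ_n473647x12328.gctx",
    "l1000/GSE92742/GSE92742_Broad_LINCS_sig_info.txt",
    "tahoe/drug_ref.rds",
    "fda.txt",
    "gdsc.csv",
    "sider.csv",
]


def missing_reference_files(reference_dir):
    """Return the expected files that are absent under reference_dir."""
    root = Path(reference_dir)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


# Assumed location; change it if the data was extracted elsewhere.
missing = missing_reference_files("data/reference")
if missing:
    print("Missing reference files:")
    for rel in missing:
        print("  " + rel)
else:
    print("All expected reference files are present.")
```

An empty result means every file named in the tree was found. The check does not validate file contents.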
(Optional) Download the sample data analyzed in the manuscript from Dropbox. You can also put them under data.
data
├── HCC01N.h5ad
├── HCC01N.rds
├── HCC01T.h5ad
├── HCC01T.rds
├── HCC02N.h5ad
├── HCC02N.rds
├── HCC02T.h5ad
├── HCC02T.rds
├── HCC03N.h5ad
├── HCC03N.rds
├── HCC03T.h5ad
├── HCC03T.rds
├── HCC04N.h5ad
├── HCC04N.rds
├── HCC04T.h5ad
└── HCC04T.rds
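Because STDrug works on paired tissues from the same patient, it may be useful to group the sample files by patient before running the pipeline. The sketch below assumes, based only on the file names listed above, that each name is a patient ID followed by `N` (normal) or `T` (tumor). That reading of the suffix, the regular expression, and the helper name `pair_samples` are assumptions, not part of STDrug.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming scheme: <patient id><N|T>.<ext>, e.g. HCC01N.h5ad
SAMPLE_RE = re.compile(r"^(?P<patient>.+?)(?P<cond>[NT])\.(?P<ext>h5ad|rds)$")


def pair_samples(filenames, ext="h5ad"):
    """Group file names by patient; keep patients with both N and T files."""
    by_patient = defaultdict(dict)
    for name in filenames:
        m = SAMPLE_RE.match(Path(name).name)
        if m and m.group("ext") == ext:
            by_patient[m.group("patient")][m.group("cond")] = name
    return {
        patient: (conds["N"], conds["T"])
        for patient, conds in sorted(by_patient.items())
        if "N" in conds and "T" in conds
    }


# Example with a deliberately incomplete second patient.
files = ["HCC01N.h5ad", "HCC01T.h5ad", "HCC02N.h5ad", "HCC02T.rds", "HCC01N.rds"]
pairs = pair_samples(files)
print(pairs)
```

Patients that lack either file for the requested extension are dropped, which makes incomplete downloads easy to spot.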
Quick Start
Spatial domain identification
The first step of STDrug is to identify matching spatial domains between a patient's tumor tissue and adjacent normal tissue. Run the Python script by following the tutorial Identify spatial domains using STDrug for multiple samples. If using Docker, open example.ipynb in Jupyter Notebook.
This module should produce output files in the following structure:
./output
|-- checkpoint
|   |-- stads_cluster.h5ad    // AnnData of integrated spatial data with spatial clustering
|-- partition.csv             // Spatial domain annotation and meta data
Drug repurposing
Following the spatial domain identification module, STDrug uses a comprehensive drug ranking algorithm to repurpose drugs personalized for each patient. In this step, run the R script by following the tutorial Use STDrug to calculate drug score for multiple samples. If using Docker, open example.rmd in RStudio.
STDrug generates drug outputs structured as follows. The repurposed top drugs can be inspected from ./output/drugs_<patient>.csv.
./output
|-- checkpoint
|   |-- cci_ratio_<patient>.csv      // Cell-cell interaction results for patient
|   |-- drug_scores_<patient>.csv    // Spatial domain specific drug score results for patient
|   |-- stads_cluster.h5ad           // AnnData of integrated spatial data with spatial clustering
|   |-- stads_cluster.rds            // Seurat object of integrated spatial data with spatial clustering
|-- drugs_<patient>.csv              // Drug score and ranking for patient, higher drug score means better treatment potential
|-- partition.csv                    // Spatial domain annotation and meta data
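The ranked drugs in drugs_<patient>.csv can be inspected with any CSV reader. The following is a minimal sketch using Python's standard `csv` module. The column names `drug` and `score` are hypothetical, since the file's header is not documented here, so check the actual header and adjust the arguments accordingly.

```python
import csv


def top_drugs(csv_path, n=10, name_col="drug", score_col="score"):
    """Return the n highest-scoring (name, score) pairs from a CSV file.

    name_col and score_col are hypothetical column names; adjust them to
    the header of the actual drugs_<patient>.csv file.
    """
    with open(csv_path, newline="") as fh:
        rows = [(r[name_col], float(r[score_col])) for r in csv.DictReader(fh)]
    # Higher score means better treatment potential, so sort descending.
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]
```

Calling `top_drugs` with the path to one patient's file returns that patient's leading candidates. A `KeyError` indicates that the assumed column names do not match the file.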
Contributor
Citation
File details
Details for the file stdrug-0.0.2.tar.gz.
File metadata
- Download URL: stdrug-0.0.2.tar.gz
- Upload date:
- Size: 15.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ebc77b9d42572b4f88aa79db2287cc4f1797600389769277333345ad698313c7 |
| MD5 | 41fe67853164c5552e6bd5ff4429cba6 |
| BLAKE2b-256 | 722154c308521527f8397a3bc9fce88397eda308796004db831fdf8e4f667426 |
File details
Details for the file stdrug-0.0.2-py3-none-any.whl.
File metadata
- Download URL: stdrug-0.0.2-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce762cd93961b4983bfd06b89a45f5a1ba081dddaeb3b78945478f377e3321d0 |
| MD5 | 3e5a455e47d8cbf441c396f67d0f62e8 |
| BLAKE2b-256 | f0de2a6890d607036327f8b1cbbbac84856a2ca066314003d570a0a0fec85aa0 |