Spatial RNA to Protein prediction with stacking models
SR2P is a stacking-based framework for predicting spatial protein expression from spatial transcriptomics RNA profiles in spatial multi-omics data.
🌟 Features
- Stacking-based protein prediction: Integrates multiple base learners and a meta learner for robust inference.
- Spatial feature augmentation: Enables spatial-neighborhood-enhanced prediction for non-GNN models.
- Flexible model benchmarking: Supports both classical machine learning models and graph neural networks.
- Easy to install: Available via pip.
- Ready for RNA-only data: Can infer protein abundance for spatial transcriptomics datasets without protein measurements.
⏬ Installation
We recommend using a separate Conda environment. Information about Conda and how to install it can be found on the Anaconda website.
- Create a conda environment and install the SR2P package
conda create -n sr2p_env python=3.9
conda activate sr2p_env
pip install sr2p
SR2P has been installed successfully on the following operating systems:
- macOS Sequoia 15.3.2
- Ubuntu 22.04
- SUSE Linux Enterprise Server 15 SP5 (Dardel HPC system)
📊 Data Input
SR2P reads .h5ad files, which store AnnData objects, a format commonly used for spatial transcriptomics and spatial multi-omics analysis.
spatial_genomics.h5ad (Spatial multi-omics data: RNA + protein)
- .X: Feature matrix (spots × features), including RNA expression and protein abundance
- .obs: Spot metadata
- Spatial coordinates: stored in .obs or .obsm["spatial"]
st_adata.h5ad (Spatial transcriptomics data: RNA only)
- .X: Gene expression matrix (spots × genes)
- .obs: Spot metadata
- Spatial coordinates: stored in .obs or .obsm["spatial"]
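Because coordinates may live in either .obs or .obsm["spatial"], it can help to normalize them before any spatial step. The sketch below is an illustration, not part of the SR2P API: the helper name get_spatial_coords and the .obs column names "x" and "y" are assumptions, and small stand-in objects are used in place of a real AnnData so the example runs without extra dependencies. With a real file, the object would typically come from anndata.read_h5ad.

```python
from types import SimpleNamespace


def get_spatial_coords(adata, x_col="x", y_col="y"):
    """Return one (x, y) tuple per spot from an AnnData-like object.

    Looks in .obsm["spatial"] first, then falls back to two .obs columns.
    The column names x_col / y_col are assumptions made for this sketch.
    """
    if "spatial" in adata.obsm:
        return [(float(row[0]), float(row[1])) for row in adata.obsm["spatial"]]
    if x_col in adata.obs and y_col in adata.obs:
        return [
            (float(x), float(y))
            for x, y in zip(adata.obs[x_col], adata.obs[y_col])
        ]
    raise KeyError("no spatial coordinates found in .obsm['spatial'] or .obs")


# Stand-in objects that mimic the two layouts described above.
with_obsm = SimpleNamespace(obsm={"spatial": [[0, 0], [1, 2]]}, obs={})
with_obs = SimpleNamespace(obsm={}, obs={"x": [0, 1], "y": [0, 2]})

coords_a = get_spatial_coords(with_obsm)
coords_b = get_spatial_coords(with_obs)
```

Both layouts yield the same list of coordinate pairs, so downstream code only has to handle one representation.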
🔗 Example Data Download
- Download the Spatial Multi-Omics Data Example.
Example datasets used in the tutorial can be organized under:
sr2p_data/
├── human_breast_cancer_rna_protein.h5ad
├── human_tonsil_rna_protein_1.h5ad
├── human_tonsil_rna_protein_2.h5ad
└── human_glioblastoma_rna_protein.h5ad
⚙️ Usage
A complete guide is provided in this tutorial.
🧬 SR2P workflow
A typical SR2P workflow includes:
- Load spatial multi-omics data
- RNA and protein preprocessing
- Train and test matrix construction
- Spatial neighborhood feature construction (optional)
- Single-model training and prediction (optional)
- Stacking-based integration for final predictions
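The optional spatial neighborhood step can be pictured with a small example. This is a minimal sketch under stated assumptions, not SR2P's implementation: the function name add_neighbor_means, the parameter k, and the choice to append the mean expression of the k nearest spots are all hypothetical. It only illustrates how a non-GNN model could see information from surrounding spots.

```python
import math


def add_neighbor_means(features, coords, k=2):
    """Append the mean feature vector of each spot's k nearest spots.

    Assumes at least two spots and k >= 1.
    """
    augmented = []
    for i, (row, xy) in enumerate(zip(features, coords)):
        # Rank all other spots by Euclidean distance to spot i.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(xy, coords[j]),
        )
        neighbors = others[:k]
        # Average each feature over the selected neighbors.
        means = [
            sum(features[j][f] for j in neighbors) / len(neighbors)
            for f in range(len(row))
        ]
        augmented.append(list(row) + means)
    return augmented


# Toy data: four spots with two "gene" features each.
coords = [(0, 0), (1, 0), (0, 1), (10, 10)]
rna = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [9.0, 9.0]]
augmented = add_neighbor_means(rna, coords, k=2)
```

Each row keeps its original features and gains one neighbor-mean column per feature, so the feature count doubles. The brute-force distance ranking is quadratic in the number of spots; a real dataset would likely call for a spatial index instead.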
📌 Supported models
SR2P supports the following model families:
| Model type | Methods |
|---|---|
| Linear model | PLS |
| Gradient boosting | XGBoost, LightGBM, CatBoost |
| Graph neural networks | GAT, GraphSAGE, DGAT |
| Meta learner | ExtraTrees |
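To show how base learners and a meta learner fit together, here is a generic stacking sketch in NumPy. It is not SR2P's code and uses swapped-in models: two ridge regressors with different penalties stand in for the base learners listed above, and an ordinary least-squares combiner stands in for the ExtraTrees meta learner. The function names, the alphas, and the fold count are assumptions for illustration.

```python
import numpy as np


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression; returns the weight vector."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def stack_predict(X_train, y_train, X_new, alphas=(0.1, 10.0), n_folds=3):
    """Stack ridge base learners with a least-squares meta learner."""
    n = X_train.shape[0]
    folds = np.array_split(np.arange(n), n_folds)
    oof = np.zeros((n, len(alphas)))  # out-of-fold base predictions

    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for m, alpha in enumerate(alphas):
            w = fit_ridge(X_train[train], y_train[train], alpha)
            oof[held_out, m] = X_train[held_out] @ w

    # Meta learner: fitted only on out-of-fold predictions to limit leakage.
    meta_w, *_ = np.linalg.lstsq(oof, y_train, rcond=None)

    # Refit each base learner on all training spots, then combine.
    base_new = np.column_stack(
        [X_new @ fit_ridge(X_train, y_train, a) for a in alphas]
    )
    return base_new @ meta_w


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))                     # toy "RNA" features
true_w = np.array([1.0, -2.0, 0.5, 0.0])
y = X @ true_w + 0.01 * rng.normal(size=60)      # toy "protein" target
pred = stack_predict(X[:50], y[:50], X[50:])
```

The key design point is that the meta learner never sees predictions a base learner made on its own training spots, which is what distinguishes stacking from simply averaging fitted models.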
📁 Output
SR2P returns predicted protein abundance as a pandas.DataFrame:
- Rows: spatial spots
- Columns: proteins
- Values: predicted protein abundance
Predictions can be exported to CSV for downstream analysis.
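Exporting relies on standard pandas functionality. The sketch below builds a toy DataFrame in the shape described above and writes it with DataFrame.to_csv. The spot IDs, the protein names CD3 and CD20, and the values are made up for illustration, and an in-memory buffer is used in place of a file path.

```python
import io

import pandas as pd

# Toy stand-in for SR2P output: rows are spots, columns are proteins.
pred_df = pd.DataFrame(
    [[0.8, 1.2], [0.1, 2.4]],
    index=["spot_1", "spot_2"],
    columns=["CD3", "CD20"],
)

# Replace the buffer with a path such as "predicted_protein.csv" to write a file.
buffer = io.StringIO()
pred_df.to_csv(buffer, index_label="spot")

# Read the CSV back, restoring spot IDs as the row index.
buffer.seek(0)
reloaded = pd.read_csv(buffer, index_col="spot")
```

Writing an explicit index label keeps the spot identifiers in a named first column, which makes the file easier to join with other per-spot tables later.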
License
GNU General Public License v3.0