A cluster-based cell-type deconvolution of spatial transcriptomic data (DECLUST)
DECLUST is a Python package developed to identify spatially coherent clusters of spots by integrating gene expression profiles with spatial coordinates in spatial transcriptomics data. It also enables accurate estimation of cell-type compositions within each cluster.
🌟 Features
Spatially-aware clustering: Combines gene expression and spatial coordinates.
Robust deconvolution: Aggregates signals over clusters to enhance cell type detection.
Easy to install: Available via pip.
Visualization: Includes modules for visualizing clustering and marker gene expression.
⏬ Installation
We recommend using a separate Conda environment. Information about Conda and how to install it can be found on the Anaconda website.
- Create a conda environment and install the DECLUST package
conda create -n declust_env python=3.9
conda activate declust_env
pip install declust
- The following dependencies must be installed in advance: scanpy, rpy2, and R (version >= 4.3) with the dplyr R package. They can be installed using the install_dependencies.sh script:
sh install_dependencies.sh
DECLUST has been installed successfully on the following operating systems:
- macOS Sequoia 15.3.2
- SUSE Linux Enterprise Server 15 SP5 (Dardel HPC system)
📊 Data Input
DECLUST uses .h5ad files, which are AnnData objects commonly used for storing annotated data matrices in single-cell and spatial transcriptomics analysis.
Each .h5ad file includes:
sc_adata.h5ad (Single-cell RNA-seq data)
- .X: Gene expression matrix (cells × genes)
- .obs: Cell type annotation of single cells
st_adata.h5ad (Spatial transcriptomics data)
- .X: Spatial gene expression matrix (spots × genes)
- .obs: Spot coordinates
💡 Both datasets should originate from the same tissue and have overlapping gene sets to ensure proper implementation of DECLUST.
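Before running the pipeline, it can help to confirm that the two datasets share enough genes. The sketch below is a minimal illustration and not part of DECLUST: the function name `shared_genes` and the toy gene lists are hypothetical. With real data, the two lists would typically be the `var_names` of each AnnData object loaded with `anndata.read_h5ad`.

```python
def shared_genes(sc_genes, st_genes):
    """Return the genes found in both datasets, keeping single-cell order."""
    st_lookup = set(st_genes)
    return [gene for gene in sc_genes if gene in st_lookup]


# Toy gene lists (hypothetical); real lists would come from adata.var_names.
sc_genes = ["DCN", "LUM", "C1S", "AGR2"]
st_genes = ["AGR2", "DCN", "PPDPF"]

overlap = shared_genes(sc_genes, st_genes)
print(f"{len(overlap)} shared genes: {overlap}")
```

An empty or very small overlap usually points to mismatched gene identifiers (for example, symbols in one file and Ensembl IDs in the other).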
🔗 Example Data Download
- Download the Real Data Example.
- Download the Simulation Data Example.
⚙️ Usage
DECLUST can be embedded in Python scripts or used as a standalone tool. A guide to using it in Python scripts is provided in this tutorial. This section describes how to use it as a bioinformatics pipeline.
Run the pipeline using the following command:
python declust.py --module <module_name> [other options]
- Available Modules
| Module | Description |
|---|---|
| marker | Construction of Reference Matrix from Annotated Single-Cell Transcriptomic Data |
| cluster | Identification of spatial clusters of spots from ST data |
| pseudo_bulk | Generate pseudo-bulk ST profiles per cluster |
| deconv | Run deconvolution by Ordinary Least Squares |
| visualize | Visualize markers or deconvolution results |
Type python declust.py --help in the terminal to see a list of available commands.
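The marker module is described above as building a reference matrix from annotated single-cell data. As a rough illustration of that idea, the sketch below averages expression per cell type and reports, for each gene, the cell type with the highest mean. This is an assumption about what a reference matrix can look like, not a description of DECLUST's actual marker selection; the function names and toy values are hypothetical.

```python
def mean_profiles(expression, cell_types):
    """Average expression per cell type.

    expression: list of rows (cells x genes)
    cell_types: one label per cell
    Returns {cell_type: [mean expression per gene]}.
    """
    sums, counts = {}, {}
    for row, label in zip(expression, cell_types):
        current = sums.setdefault(label, [0.0] * len(row))
        sums[label] = [s + v for s, v in zip(current, row)]
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def max_group(profiles, gene_index):
    """Cell type with the highest mean expression for one gene."""
    return max(profiles, key=lambda label: profiles[label][gene_index])


# Toy data (hypothetical): three cells, two genes.
genes = ["DCN", "AGR2"]
expression = [[8, 0], [6, 2], [1, 9]]
cell_types = ["CAFs", "CAFs", "Cancer Epithelial"]

profiles = mean_profiles(expression, cell_types)
for index, gene in enumerate(genes):
    print(gene, "->", max_group(profiles, index))
```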
🧬 DECLUST pipeline
- Download DECLUST:
wget https://github.com/Qingyueee/DECLUST/archive/refs/tags/0.1.1.tar.gz
tar -xvf 0.1.1.tar.gz
- Unpack data:
cd DECLUST-0.1.1
unzip data.zip
- Marker gene selection:
python declust.py --module marker \
--celltype_col <celltype_col> \
--sample_col <sample_col>
Outputs:
- sc_data_overlapped.csv and sc_label.csv in the data/ folder
- marker_genes.csv in the results/ folder
- Clustering:
python declust.py --module cluster
Performs Hierarchical Clustering → DBSCAN → Seeded Region Growing (SRG). Saves:
srg_df.csv and clustering plots in results/
- Deconvolution:
python declust.py --module deconv
Performs OLS-based deconvolution and outputs:
DECLUST_result.csv in results/
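The clustering step in the list above ends with Seeded Region Growing (SRG). The sketch below shows the general SRG idea on a toy example: starting from a few labelled seed spots, the unlabelled spot that borders a region and differs least from that region's mean is absorbed first, and this repeats until no reachable spot remains. It is a simplified illustration under stated assumptions (one scalar value per spot, a fixed neighbour radius, hypothetical function and variable names) and may differ from DECLUST's implementation.

```python
import math


def seeded_region_growing(coords, values, seeds, radius=1.0):
    """Grow seed labels over neighbouring spots.

    coords: list of (x, y) positions, one per spot
    values: one scalar expression summary per spot (for brevity)
    seeds:  dict mapping spot index -> cluster label
    """
    labels = dict(seeds)
    while len(labels) < len(coords):
        best = None  # (difference, spot index, label)
        for spot in range(len(coords)):
            if spot in labels:
                continue
            for other, label in labels.items():
                # Only labelled spots within the radius count as neighbours.
                if math.dist(coords[spot], coords[other]) > radius:
                    continue
                members = [values[i] for i, l in labels.items() if l == label]
                diff = abs(values[spot] - sum(members) / len(members))
                if best is None or diff < best[0]:
                    best = (diff, spot, label)
        if best is None:
            break  # remaining spots touch no labelled region
        labels[best[1]] = best[2]
    return labels


# Toy data (hypothetical): four spots on a line, two seeds at the ends.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 1.2, 4.8, 5.0]
grown = seeded_region_growing(coords, values, seeds={0: "A", 3: "B"})
print(sorted(grown.items()))
```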
You can run each step individually or execute the entire pipeline by running the deconvolution script.
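The deconv module is described as OLS-based. As a minimal sketch of ordinary least squares deconvolution, the code below fits one pseudo-bulk profile against a reference matrix by solving the normal equations, then rescales the coefficients to proportions. Clipping negative coefficients and normalizing to sum to one are assumptions made for illustration, as are all function names and toy numbers; in practice a routine such as `numpy.linalg.lstsq` would normally replace the hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reference matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ols_coefficients(reference, profile):
    """reference: genes x cell types; profile: one value per gene."""
    k = len(reference[0])
    xtx = [[sum(row[i] * row[j] for row in reference) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(reference, profile))
           for i in range(k)]
    return solve_linear_system(xtx, xty)


def to_proportions(coefficients):
    """Assumption: clip negatives to zero, then rescale to sum to one."""
    clipped = [max(c, 0.0) for c in coefficients]
    total = sum(clipped)
    return [c / total for c in clipped] if total > 0 else clipped


# Toy data (hypothetical): three marker genes, two cell types.
reference = [[10.0, 0.0], [0.0, 8.0], [2.0, 2.0]]
profile = [30.0, 8.0, 8.0]  # equals 3 * column 1 + 1 * column 2

coefficients = ols_coefficients(reference, profile)
proportions = to_proportions(coefficients)
print(coefficients, proportions)
```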
To export pseudo-bulk profiles for external methods:
python declust.py --module pseudo_bulk
- Generates pseudo_bulk.csv in the results/ folder.
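Conceptually, a pseudo-bulk profile pools the expression of all spots assigned to the same cluster. The sketch below sums counts per cluster on toy data. Summing (rather than averaging) is an assumption made for illustration, and the function name and values are hypothetical; the layout of the real pseudo_bulk.csv is not shown here.

```python
def pseudo_bulk(expression, cluster_labels):
    """Sum spot-level counts within each cluster.

    expression:     list of rows (spots x genes)
    cluster_labels: one cluster label per spot
    Returns {cluster: [summed counts per gene]}.
    """
    totals = {}
    for spot_counts, label in zip(expression, cluster_labels):
        current = totals.setdefault(label, [0] * len(spot_counts))
        totals[label] = [t + c for t, c in zip(current, spot_counts)]
    return totals


# Toy data (hypothetical): three spots, three genes, two clusters.
expression = [[1, 0, 2], [3, 1, 0], [0, 5, 1]]
cluster_labels = [0, 0, 1]

bulk = pseudo_bulk(expression, cluster_labels)
print(bulk)  # {0: [4, 1, 2], 1: [0, 5, 1]}
```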
💡 Custom Marker Genes
Users can provide their own marker gene list in one of two formats:
- CSV file containing two columns:
  - Gene: gene names
  - maxgroup: corresponding cell type annotations
--custom_marker_genes file_path
- Comma-separated gene list, along with a corresponding comma-separated list of cell types:
--custom_marker_genes "DCN, LUM, C1S, AGR2, PPDPF, ..."
--custom_marker_celltype "CAFs, CAFs, CAFs, Cancer Epithelial, Cancer Epithelial, ..."
⚠️ The provided marker genes and cell type annotations must exist in the single-cell dataset.
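A custom marker file in the first format can be produced with Python's standard `csv` module. The sketch below builds the two-column text (`Gene`, `maxgroup`) from paired lists and rejects lists of unequal length. The function name and the short gene list are hypothetical, and the check against the single-cell dataset mentioned above is not included.

```python
import csv
import io


def marker_csv_text(genes, cell_types):
    """Build CSV text with the Gene and maxgroup columns."""
    if len(genes) != len(cell_types):
        raise ValueError("each gene needs exactly one cell type")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Gene", "maxgroup"])
    writer.writerows(zip(genes, cell_types))
    return buffer.getvalue()


# Toy marker list (hypothetical selection from the example above).
csv_text = marker_csv_text(
    ["DCN", "AGR2"],
    ["CAFs", "Cancer Epithelial"],
)
print(csv_text)
```

Writing `csv_text` to a file gives a path that can be passed to --custom_marker_genes.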
📬 Quick example: running DECLUST on simulated data
# 1. Download DECLUST
wget https://github.com/Qingyueee/DECLUST/archive/refs/tags/0.1.1.tar.gz
tar -xvf 0.1.1.tar.gz
cd DECLUST-0.1.1
# 2. Configure the environment and install dependencies
conda create -n declust_env python=3.9
conda activate declust_env
pip install declust
sh install_dependencies.sh
# 3. Download and unpack simulated data
wget "https://drive.usercontent.google.com/download?id=1VY_vIuZalCBe2IhNCNBSQwo5m5Da8aFw&export=download&authuser=0&confirm=t&uuid=93730baf-2a12-49d7-b475-ab715a3644c3&at=APcmpow759exSs6opQk4zSMVbjXf%3A1744370330609" -O simulation_data.zip
unzip simulation_data.zip
# 4. Run pipeline - it may take about 2 minutes to complete on a personal computer
python declust.py --module deconv \
--data_dir simulation_data \
--results_dir simulation_results \
--sc_file sc_adata_200_per_celltype.h5ad \
--st_file st_simu_adata.h5ad \
--celltype_col celltype_major \
--sample_col Patient
# 5. Results visualization
python declust.py --module visualize \
--data_dir simulation_data \
--results_dir simulation_results \
--sc_file sc_adata_200_per_celltype.h5ad \
--st_file st_simu_adata.h5ad
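Steps 4 and 5 above can also be driven from a Python script. The sketch below assembles the same two commands with the flags shown in the quick example and prints them; setting `dry_run=False` would execute them with `subprocess.run`. The helper names `declust_command` and `run_steps` are hypothetical, and the script assumes it is started from the DECLUST-0.1.1 directory.

```python
import subprocess
import sys


def declust_command(module, **options):
    """Build the argument list for one declust.py module call."""
    command = [sys.executable, "declust.py", "--module", module]
    for name, value in options.items():
        command += [f"--{name}", str(value)]
    return command


def run_steps(steps, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for command in steps:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Options shared by both steps, copied from the quick example.
shared = {
    "data_dir": "simulation_data",
    "results_dir": "simulation_results",
    "sc_file": "sc_adata_200_per_celltype.h5ad",
    "st_file": "st_simu_adata.h5ad",
}

steps = [
    declust_command("deconv", **shared,
                    celltype_col="celltype_major", sample_col="Patient"),
    declust_command("visualize", **shared),
]

run_steps(steps)  # dry run: prints the two commands
```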
📁 Output Structure
project/
│
├── data/
│ ├── sc_adata_overlapped.h5ad
│ ├── sc_labels.csv
│ └── ...
│
├── results/
│ ├── marker_genes.csv
│ ├── srg_df.csv
│ ├── pseudo_bulk.csv
│ ├── DECLUST_result.csv
│ └── [visualization plots]
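After a run, a quick check that the expected result files exist can catch a step that failed silently. The sketch below is a hypothetical helper that uses the file names listed in the tree above; pseudo_bulk.csv is left out of the default list because it is produced only when the pseudo_bulk module is run. The demonstration uses a temporary directory in place of a real results/ folder.

```python
import tempfile
from pathlib import Path

DEFAULT_OUTPUTS = ("marker_genes.csv", "srg_df.csv", "DECLUST_result.csv")


def missing_outputs(results_dir, expected=DEFAULT_OUTPUTS):
    """Return the expected file names that are absent from results_dir."""
    results_dir = Path(results_dir)
    return [name for name in expected if not (results_dir / name).is_file()]


# Demonstration with a temporary folder that holds only one output file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "marker_genes.csv").write_text("Gene,maxgroup\n")
    missing = missing_outputs(tmp)

print("Missing:", missing)
```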
License
GNU General Public License v3.0