An end-to-end single-cell multimodal analysis model with deep parametric inference.
Single-cell multimodal modeling with deep parametric inference
The proliferation of single-cell multimodal sequencing technologies lets us study cellular heterogeneity from multiple views, providing novel and actionable biological insights into disease-driving mechanisms. Here, we propose the deep parametric inference (DPI) model, an end-to-end framework for single-cell multimodal data analysis. At the heart of DPI is the multimodal parameter space, where the parameters of each modality are inferred by neural networks.
The DPI framework works with scanpy and supports the following single-cell multimodal analyses:
- Multimodal data integration
- Multimodal data noise reduction
- Cell clustering and visualization
- Cell type annotation of query data from a labeled reference
- Cell state vector field visualization
Pip install
pip install dpi-sc
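After installation it can be useful to confirm that the package is visible to the interpreter. Note that the distribution is named `dpi-sc`, while the tutorial imports it as `dpi`. The following is a minimal sketch using only the standard library; the helper name `is_importable` is made up for this illustration and is not part of DPI.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "dpi" is the import name used in the tutorial; "dpi-sc" is only the
    # distribution name passed to pip.
    for name in ("dpi", "scanpy"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```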
Datasets
The datasets used in "Single-cell multimodal modeling with deep parametric inference" can be downloaded from the DPI data warehouse.
Tutorial
We use a PBMC dataset to demonstrate a DPI analysis of single-cell multimodal data. The code below loads the healthy-donor file from the COVID-19 PBMC data.
Import dependencies
import scanpy as sc
import dpi
Retina image output (optional)
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
Load dataset
# The dataset can be downloaded from [Datasets] above.
sc_data = sc.read_h5ad("PBMC_COVID19_Healthy_Annotated.h5ad")
Set marker collection
rna_markers = ["CCR7", "CD19", "CD3E", "CD4"]
protein_markers = ["AB_CCR7", "AB_CD19", "AB_CD3", "AB_CD4"]
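Marker names that are missing from the dataset typically only surface as errors at plotting time, so it may be worth checking them up front. The sketch below works on plain lists; `missing_markers` and the example gene list are hypothetical. With a loaded AnnData object one would pass `sc_data.var_names` for the RNA markers. Where the protein feature names are stored is not stated here, so that check is left out.

```python
from typing import Iterable, List


def missing_markers(available: Iterable[str], markers: Iterable[str]) -> List[str]:
    """Return the markers that are absent from `available`, preserving order."""
    known = set(available)
    return [m for m in markers if m not in known]


# Stand-in for the feature names of a loaded dataset (hypothetical values).
example_genes = ["CCR7", "CD19", "CD3E", "MS4A1"]
rna_markers = ["CCR7", "CD19", "CD3E", "CD4"]

print(missing_markers(example_genes, rna_markers))  # ['CD4']
```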
Preprocessing
dpi.preprocessing(sc_data)
dpi.normalize(sc_data, protein_expression_obsm_key="protein_expression")
sc_data.var_names_make_unique()
sc.pp.highly_variable_genes(
    sc_data,
    n_top_genes=3000,
    flavor="seurat_v3",
    subset=False,
)
dpi.add_genes(sc_data, rna_markers)
# Copy the subset so later steps operate on a standalone object rather than a view.
sc_data = sc_data[:, sc_data.var["highly_variable"]].copy()
dpi.scale(sc_data)
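The order of the calls above matters: `dpi.add_genes` runs after the highly variable genes are flagged and before the subsetting. This suggests that it marks the requested markers so that they survive the subset even when they are not highly variable. That reading is an inference from the call order, not documented behavior. The idea can be sketched on plain lists, with the hypothetical helper `keep_mask` standing in for the combined selection:

```python
from typing import List, Sequence


def keep_mask(gene_names: Sequence[str],
              highly_variable: Sequence[bool],
              markers: Sequence[str]) -> List[bool]:
    """Mark a gene as kept if it is highly variable or a requested marker."""
    wanted = set(markers)
    return [hv or (name in wanted) for name, hv in zip(gene_names, highly_variable)]


genes = ["CCR7", "ACTB", "CD4", "GAPDH"]
hv_flags = [True, False, False, False]  # toy selection result, not real data

mask = keep_mask(genes, hv_flags, ["CCR7", "CD4"])
kept = [g for g, keep in zip(genes, mask) if keep]
print(kept)  # ['CCR7', 'CD4']
```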
Prepare and run DPI model
Configure DPI model parameters
dpi.build_mix_model(
    sc_data,
    net_dim_rna_list=[512, 128],
    net_dim_pro_list=[128],
    net_dim_rna_mean=128,
    net_dim_pro_mean=128,
    net_dim_mix=128,
    lr=0.0001,
)
Run DPI model
dpi.fit(sc_data)
Visualize the loss
dpi.loss_plot(sc_data)
Save DPI model (optional)
dpi.saveobj2file(sc_data, "COVID19PBMC_healthy.dpi")
#sc_data = dpi.loadobj("COVID19PBMC_healthy.dpi")
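Saving the fitted object is mainly useful for skipping retraining on later runs. A small wrapper can make that explicit: load the file if it exists, otherwise train and save. The sketch below is generic and uses only the standard library; `load_or_compute` is a hypothetical helper, and the wiring to DPI in the trailing comment assumes that `dpi.loadobj` returns the object saved by `dpi.saveobj2file`, as the commented-out line above suggests.

```python
import os
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_compute(path: str,
                    load: Callable[[str], T],
                    compute_and_save: Callable[[str], T]) -> T:
    """Load a cached object if `path` exists, otherwise compute and save it."""
    if os.path.exists(path):
        return load(path)
    return compute_and_save(path)


# Hypothetical wiring with the tutorial's calls (not executed here):
#
# def train(path):
#     dpi.fit(sc_data)
#     dpi.saveobj2file(sc_data, path)
#     return sc_data
#
# sc_data = load_or_compute("COVID19PBMC_healthy.dpi", dpi.loadobj, train)
```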
Visualize the latent space
Extract latent spaces
dpi.get_spaces(sc_data)
Visualize the spaces
dpi.space_plot(sc_data, "mm_parameter_space", color="green", kde=True, bins=30)
dpi.space_plot(sc_data, "rna_latent_space", color="orange", kde=True, bins=30)
dpi.space_plot(sc_data, "pro_latent_space", color="blue", kde=True, bins=30)
Preparation for downstream analysis
Extract features
dpi.get_features(sc_data)
Get denoised data
dpi.get_denoised_rna(sc_data)
dpi.get_denoised_pro(sc_data)
Cell clustering and visualization
Cell clustering
sc.pp.neighbors(sc_data, use_rep="mix_features")
dpi.umap_run(sc_data, min_dist=0.4)
sc.tl.leiden(sc_data)
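Passing `use_rep="mix_features"` makes scanpy build the neighbor graph from the DPI features instead of its default representation, and Leiden then partitions that graph. To illustrate what "neighbors in a representation" means, here is a brute-force nearest-neighbor search on a toy feature matrix. It is a conceptual sketch only: scanpy's own implementation is more elaborate, and the function name and toy values are made up.

```python
import math
from typing import List, Sequence


def knn_indices(features: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """For each row, return the indices of its k nearest other rows (Euclidean)."""
    neighbors = []
    for i, row in enumerate(features):
        # Distance from row i to every other row, paired with that row's index.
        dists = [(math.dist(row, other), j)
                 for j, other in enumerate(features) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


# Toy 2-D "features": two tight pairs that lie far apart from each other.
toy = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2)]
print(knn_indices(toy, k=1))  # [[1], [0], [3], [2]]
```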
Cell cluster visualization
sc.pl.umap(sc_data, color="leiden")
Observe multimodal data markers
RNA markers
dpi.umap_plot(sc_data, featuretype="rna", color=rna_markers, ncols=2)
dpi.umap_plot(sc_data, featuretype="rna", color=rna_markers, ncols=2, layer="rna_denoised")
Protein markers
dpi.umap_plot(sc_data, featuretype="protein", color=protein_markers, ncols=2)
dpi.umap_plot(sc_data, featuretype="protein", color=protein_markers, ncols=2, layer="pro_denoised")
Reference and query
The reference object must already carry cell labels.
sc.pl.umap(sc_data, color="initial_clustering", frameon=False, title="PBMC COVID19 Healthy labels")
# The dataset can be downloaded from [Datasets] above.
# Example path from the authors' environment; point it at your local copy.
filepath = "/home/hh/bigdata/hh/DPI/COVID-19/COVID19_Asymptomatic.h5ad"
sc_data_COVID19_Asymptomatic = sc.read_h5ad(filepath)
dpi.normalize(sc_data_COVID19_Asymptomatic, protein_expression_obsm_key="protein_expression")
sc_data_COVID19_Asymptomatic = sc_data_COVID19_Asymptomatic[:,sc_data.var.index]
sc_data_COVID19_Asymptomatic.obsm["rna_nor"] = sc_data.mm_rna.transform(sc_data_COVID19_Asymptomatic.X).astype("float16")
sc_data_COVID19_Asymptomatic.obsm["pro_nor"] = sc_data.mm_pro.transform(sc_data_COVID19_Asymptomatic.obsm["pro_nor"]).astype("float16")
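# Arguments are passed positionally in the original order, because Python does not
# allow a positional argument after a keyword argument. This assumes the reference
# label name is the second parameter of dpi.annotate.
dpi.annotate(sc_data, "initial_clustering", sc_data_COVID19_Asymptomatic)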
sc.pl.umap(sc_data_COVID19_Asymptomatic, color="labels", frameon=False, title="PBMC COVID19 Asymptomatic Annotated")
sc.pl.umap(sc_data_COVID19_Asymptomatic, color="initial_clustering", frameon=False, title="PBMC COVID19 Asymptomatic labels")
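The two plots above compare the transferred labels with the labels already present in the query data by eye. A simple numeric complement is the fraction of cells on which the two agree. The sketch below assumes that both are per-cell label columns, for example `obs["labels"]` and `obs["initial_clustering"]` of the query object; that is inferred from the `color=` arguments rather than documented. The helper names and toy labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def label_agreement(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of cells whose predicted label equals the expected label."""
    if len(predicted) != len(expected):
        raise ValueError("label sequences must have the same length")
    if len(predicted) == 0:
        raise ValueError("label sequences must not be empty")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(predicted)


def confusion_counts(predicted: Sequence[str],
                     expected: Sequence[str]) -> Dict[Tuple[str, str], int]:
    """Count how often each (expected, predicted) label pair occurs."""
    return dict(Counter(zip(expected, predicted)))


# Toy labels for illustration only.
pred = ["CD4", "CD4", "B_cell", "CD8"]
true = ["CD4", "CD8", "B_cell", "CD8"]
print(label_agreement(pred, true))  # 0.75
```

The pair counts from `confusion_counts` show which cell types are confused with each other, which a single agreement number hides.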
File details
Details for the file dpi-sc-1.1.5.tar.gz.
File metadata
- Download URL: dpi-sc-1.1.5.tar.gz
- Upload date:
- Size: 15.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 28fb057f38c11630a8b8b91ab492437398bba2c4b725c8568d80b65a02de6ae0
MD5 | da8eb7f04354c5092e65f359ba27bfc2
BLAKE2b-256 | 2a584f3e26d9b7af23fda4aaeb614ef51af1bbe84f0cffe56bb4b5ea97725473
File details
Details for the file dpi_sc-1.1.5-py3-none-any.whl.
File metadata
- Download URL: dpi_sc-1.1.5-py3-none-any.whl
- Upload date:
- Size: 21.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.8.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | eeb949084d22befd2cad95958f0d6ad72ac27bf4cd680ba2869487a8fddf30a0
MD5 | 5262cfa7fcc280690b9385a583106816
BLAKE2b-256 | 9ce06cb835852e638f5720d3dade6bce81b4e8aa4309e90a724740187ffeae1c