A Python package to mix-and-match conflicting clustering results in single cell analysis, and generate reconciled clustering solutions.
scTriangulate
scTriangulate is a Python package to mix-and-match conflicting clustering results in single cell analysis, and generate reconciled clustering solutions.
scTriangulate leverages cooperative game theory (Shapley Value) in conjunction with complementary stability metrics (i.e. reassign score, TFIDF score and SCCAF score) to intelligently integrate clustering solutions from nearly unlimited sources. Applied to multimodal datasets, this approach highlights new cell populations and mechanisms underlying lineage diversity.
Please don't hesitate to reach out to me if you have any questions (contact details are further down the page); I will be responsive.
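To give a feel for the Shapley Value mentioned above, here is a minimal, generic sketch of how Shapley values are computed for a small cooperative game. It is not scTriangulate's implementation: the player names (`annotation_a`, `annotation_b`, `annotation_c`), the scores and the `toy_value` function are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal contribution
    over every possible ordering in which players join the coalition."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical characteristic function (an assumption for illustration only):
# each player brings an individual score, and two of them earn a shared bonus
# when they appear in the same coalition.
INDIVIDUAL_SCORES = {"annotation_a": 1.0, "annotation_b": 2.0, "annotation_c": 3.0}
SHARED_BONUS = 1.5


def toy_value(coalition):
    base = sum(INDIVIDUAL_SCORES[p] for p in coalition)
    bonus = SHARED_BONUS if {"annotation_a", "annotation_b"} <= coalition else 0.0
    return base + bonus


phi = shapley_values(list(INDIVIDUAL_SCORES), toy_value)
for name, share in phi.items():
    print(name, share)
```

In this toy game the shared bonus is split evenly between the two players that create it, and the shares add up to the value of the full coalition. Enumerating every ordering is only practical for a handful of players, which is why this is a sketch rather than a recipe.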
Overview
scTriangulate can be used in an array of settings:
- Integrate results from the same or multiple unsupervised clustering algorithms (e.g. Leiden, Seurat, SnapATAC) using different resolutions.
- Integrate results from both unsupervised and supervised (e.g. cellHarmony, Seurat label transfer) clustering algorithms.
- Integrate results from different reference atlases.
- Integrate labels from multi-modality single cell datasets (CITE-Seq, Multiome, TEA-Seq, ASAP-Seq, etc.).
Tutorials and Installation
Check out our full documentation and step-by-step tutorials. For a quick sense of the workflow, here is a minimal example:
pip install sctriangulate
import scanpy as sc
from sctriangulate import *
from sctriangulate.preprocessing import *
from sctriangulate.colors import *
# Your adata should have (a) adata.X, (b) at least two columns representing conflicting annotations in adata.obs, and (c) adata.obsm['X_umap'] for automatically generated visualizations
adata = sc.read('./test/input.h5ad')
sctri = ScTriangulate(dir='./output',adata=adata,query=['sctri_rna_leiden_1','sctri_rna_leiden_2','sctri_rna_leiden_3'])
sctri.lazy_run()
# All the results will be saved in the dir you specified
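Before running the example above, it can help to confirm that the input meets the three requirements listed in the comment. The following is a small sketch of such a check. The helper `check_sctriangulate_inputs` is hypothetical and not part of the sctriangulate package; it only relies on the `X`, `obs.columns` and `obsm` attributes named in the example, and the `SimpleNamespace` object is a stand-in so the snippet runs without loading a real `.h5ad` file.

```python
from types import SimpleNamespace


def check_sctriangulate_inputs(adata, query, embedding_key="X_umap"):
    """Hypothetical helper: return a list of problems found in the input.

    An empty list means the three requirements from the example are met.
    """
    problems = []
    if getattr(adata, "X", None) is None:
        problems.append("adata.X is missing")
    if len(query) < 2:
        problems.append("query should name at least two annotation columns")
    missing = [col for col in query if col not in adata.obs.columns]
    if missing:
        problems.append(f"columns not found in adata.obs: {missing}")
    if embedding_key not in adata.obsm:
        problems.append(f"adata.obsm[{embedding_key!r}] is missing")
    return problems


# Stand-in object that mimics only the attributes the check touches.
# With a real dataset you would pass the AnnData object returned by sc.read.
fake_adata = SimpleNamespace(
    X=[[0.0, 1.0], [2.0, 3.0]],
    obs=SimpleNamespace(columns=["sctri_rna_leiden_1", "sctri_rna_leiden_2"]),
    obsm={},  # no UMAP embedding stored yet
)

query = ["sctri_rna_leiden_1", "sctri_rna_leiden_2", "sctri_rna_leiden_3"]
for problem in check_sctriangulate_inputs(fake_adata, query):
    print(problem)
```

With the stand-in above, the check reports the missing third annotation column and the missing UMAP embedding, both of which would need to be fixed before the run.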
Citation
scTriangulate: Decision-level integration of multimodal single-cell data. bioRxiv, October 2021 (https://www.biorxiv.org/content/10.1101/2021.10.16.464640v1)
Reproducibility
All scripts for reproducing the analyses in the preprint are available in the reproduce folder. The necessary input files and intermediate outputs are available in Synapse storage.
Contact
Guangyuan (Frank) Li
Email: li2g2@mail.uc.edu
PhD student, Biomedical Informatics
Cincinnati Children’s Hospital Medical Center (CCHMC)
University of Cincinnati, College of Medicine
Built Distribution
Hashes for sctriangulate-0.12.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f371bda528fba9dcb12573007312c1ad171b86653ad93ce98e67ea7691813beb
MD5 | acb699366f33b0205addcc07179d5e86
BLAKE2b-256 | 9a2fb002f74bd09ebd619c41138059935c9f47092c51bc6d50c87f4473d0aa82
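If you download the wheel manually, the SHA256 digest listed above can be compared against the file on disk. The sketch below uses only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file location in the commented call is an assumption about where you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == published_hex.strip().lower()


# SHA256 digest copied from the table above.
PUBLISHED_SHA256 = "f371bda528fba9dcb12573007312c1ad171b86653ad93ce98e67ea7691813beb"

# Example call, assuming the wheel sits in the current working directory:
# matches_published_hash("sctriangulate-0.12.0-py3-none-any.whl", PUBLISHED_SHA256)
```

A result of `False` means the file differs from the one the digest was published for, for example because the download was incomplete.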