Alternative splicing quantification in single cells with Leaflet
LeafletSC
LeafletSC is a binomial mixture model for analyzing alternative splicing events in single-cell RNA sequencing data. It quantifies splicing variability at the single-cell level.
[Figure: graphical model representation]
Compatibility with sequencing platforms
LeafletSC supports analysis from the following single-cell RNA sequencing platforms:
- Smart-Seq2
- Split-seq
- 10X
Getting Started
LeafletSC is implemented in Python and requires Python 3.10 (3.11 has not been tested yet). We recommend installing it into a dedicated conda environment:
# create a conda environment with python 3.10
conda create -n "LeafletSC" python=3.10 ipython
# activate environment
conda activate LeafletSC
# install latest version of LeafletSC into this environment
pip install LeafletSC
Once the package is installed, you can load it in Python as follows:
import LeafletSC
# or specific submodules
from LeafletSC.utils import *
from LeafletSC.clustering import *
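As a quick sanity check after installation, the sketch below reports whether the running interpreter is Python 3.10 and which LeafletSC version, if any, is installed. It uses only the standard library; the helper name `check_environment` is hypothetical and not part of LeafletSC.

```python
import sys
from importlib import metadata


def check_environment(package="LeafletSC", expected=(3, 10)):
    """Return (python_ok, installed_version_or_None) for the current interpreter.

    `check_environment` is an illustrative helper, not a LeafletSC function.
    """
    # Compare only major.minor, since the README pins Python 3.10.
    python_ok = tuple(sys.version_info[:2]) == tuple(expected)
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        version = None
    return python_ok, version


if __name__ == "__main__":
    python_ok, version = check_environment()
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor} matches 3.10: {python_ok}")
    print(f"LeafletSC version: {version or 'not installed'}")
```

If the version prints as "not installed", the most likely cause is that the conda environment was not activated before running `pip install`.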
Requirements
Before using LeafletSC, please run regtools on your single-cell BAM files to extract splice junctions. Here is an example of what this might look like in a Snakefile:
{params.regtools_path} junctions extract -a 6 -m 50 -M 500000 {input.bam_use} -o {output.juncs} -s XS -b {output.barcodes}
# Combine junctions and cell barcodes
paste --delimiters='\t' {output.juncs} {output.barcodes} > {output.juncswbarcodes}
- Once you have your junction files, you can try out the mixture model tutorial under Tutorials.
- While optional, we recommend running LeafletSC intron clustering with a GTF file so that junctions can first be mapped to annotated splicing events.
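To inspect a junction file before running LeafletSC, a line can be parsed by hand. The sketch below assumes the first twelve tab-separated columns are the BED12 record written by `regtools junctions extract`, where the reported start and end include the anchor overhangs given in the block sizes and the score column holds the supporting read count. The layout of the pasted barcode columns is not described here, so anything after column twelve is returned untouched. `Junction` and `parse_junction_line` are hypothetical names, and the example line is made up.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    chrom: str
    intron_start: int  # 0-based, BED-style start of the intron
    intron_end: int    # BED-style (exclusive) end of the intron
    strand: str
    reads: int         # BED score column, assumed to be the supporting read count
    extra: tuple       # columns appended by `paste` (barcode information), unparsed


def parse_junction_line(line):
    """Parse one junction line: a BED12 record followed by optional extra columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(fields)}")
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    # blockSizes holds the left and right anchor (overhang) lengths.
    block_sizes = [int(size) for size in fields[10].split(",") if size]
    return Junction(
        chrom=fields[0],
        intron_start=chrom_start + block_sizes[0],
        intron_end=chrom_end - block_sizes[-1],
        strand=fields[5],
        reads=int(fields[4]),
        extra=tuple(fields[12:]),
    )


# Made-up example line; the last column stands in for the pasted barcode data.
example = "\t".join([
    "chr1", "100", "500", "JUNC00000001", "7", "+",
    "100", "500", "255,0,0", "2", "20,30", "0,370",
    "BARCODES_PLACEHOLDER",
])
junction = parse_junction_line(example)
print(junction)
```

Trimming the anchors matters because two reads spanning the same intron can have different overhangs, so the raw BED start and end alone do not identify the intron.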
Capabilities
With LeafletSC, you can:
- Infer cell states influenced by alternative splicing and identify differentially spliced regions.
- Conduct differential splicing analysis between specific cell groups if cell identities are known.
- Generate synthetic alternative splicing datasets for robust analysis testing.
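To make the first and third capabilities more concrete, here is a minimal sketch of a generic two-state binomial mixture for a single junction. It simulates synthetic junction counts per cell and then computes each cell's posterior probability of belonging to each state. This is not LeafletSC's API or its full model: the parameters are fixed rather than inferred, and all names (`simulate_cells`, `responsibilities`, `psi_by_state`) are hypothetical.

```python
import math
import random


def simulate_cells(n_cells, psi_by_state, weights, total_reads, seed=0):
    """Draw a hidden state per cell, then a junction count ~ Binomial(total_reads, psi)."""
    rng = random.Random(seed)
    states, counts = [], []
    for _ in range(n_cells):
        state = rng.choices(range(len(weights)), weights=weights)[0]
        # Sum of Bernoulli draws gives a binomial sample.
        count = sum(rng.random() < psi_by_state[state] for _ in range(total_reads))
        states.append(state)
        counts.append(count)
    return states, counts


def binomial_log_pmf(y, n, p):
    """Log-probability of y successes in n trials with success rate p (0 < p < 1)."""
    return math.log(math.comb(n, y)) + y * math.log(p) + (n - y) * math.log1p(-p)


def responsibilities(y, n, psi_by_state, weights):
    """Posterior probability of each state given junction count y out of n reads."""
    log_joint = [
        math.log(w) + binomial_log_pmf(y, n, p)
        for w, p in zip(weights, psi_by_state)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_joint)
    unnormalized = [math.exp(value - top) for value in log_joint]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


psi_by_state = [0.1, 0.8]   # junction usage ratio in each (assumed) cell state
weights = [0.5, 0.5]        # mixing proportions of the two states
total_reads = 20            # reads covering the splicing event in every cell

states, counts = simulate_cells(200, psi_by_state, weights, total_reads)
assigned = []
for y in counts:
    posterior = responsibilities(y, total_reads, psi_by_state, weights)
    assigned.append(posterior.index(max(posterior)))

accuracy = sum(a == s for a, s in zip(assigned, states)) / len(states)
print(f"fraction of cells assigned to their simulated state: {accuracy:.2f}")
```

In a real analysis the usage ratios and mixing proportions are unknown and must be inferred jointly across many junctions; the responsibility computation above corresponds only to the assignment step of such a procedure.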
How does it work?
The full method is described in our paper, and the graphical model is shown below:
[Figure: graphical model]
If you use Leaflet, please cite our paper:
@unpublished{Isaev2023-bf,
title = "Investigating RNA splicing as a source of cellular diversity using a binomial mixture model",
author = "Isaev, Keren and Knowles, David A",
journal = "bioRxiv",
pages = "2023.10.17.562774",
month = oct,
year = 2023,
language = "en"
}
To-do:
- Add documentation and some tests for how to run the simulation code
- Add 10X/Split-seq mode in addition to Smart-Seq2
- Extend framework to Seurat/Scanpy AnnData objects
- Add notes on generative model and inference method
- Clean up dependencies