A Python-reimplementation of the Pairs algorithm described by A. Scialdone et al. (2015)
# PyPairs - A Python scRNA-Seq classifier
This is a Python reimplementation of the _Pairs_ algorithm as described by A. Scialdone et al. (2015). The original paper is available at: https://doi.org/10.1016/j.ymeth.2015.06.021
The algorithm aims to classify single cells based on their transcriptomic signal. Initially created to predict cell cycle phase from scRNA-Seq data, this algorithm can be used for various applications.
It is a supervised machine learning algorithm and as such it consists of two components: training (sandbag) and prediction (cyclone).
## sandbag
This function implements the training step of the pair-based prediction method. Pairs of genes _(A, B)_ are identified from a training data set in which the category of each sample is known. A pair is selected as a marker pair for category 1 if the fraction of category 1 cells with expression of A > B (based on the expression values in the data set) exceeds a set threshold fraction, and the fraction of cells with B > A exceeds that threshold in each other category. This is repeated for each category to obtain a separate set of marker pairs.
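The selection rule above can be sketched in plain Python. This is a minimal illustration and not the PyPairs implementation: the function name `find_marker_pairs`, the dict-of-lists input layout, the toy data, and the use of `>=` when comparing against the threshold are all assumptions made for this example.

```python
from itertools import permutations


def find_marker_pairs(expr, labels, fraction=0.5):
    """Hypothetical helper: select marker pairs per category.

    expr:     dict mapping gene name -> list of expression values (one per cell)
    labels:   list with the known category of each cell
    fraction: threshold fraction (compared with >=, an assumption)
    """
    categories = sorted(set(labels))
    cells_by_cat = {
        c: [i for i, label in enumerate(labels) if label == c] for c in categories
    }

    def frac_greater(a, b, cells):
        # Fraction of the given cells in which gene a is expressed above gene b.
        return sum(expr[a][i] > expr[b][i] for i in cells) / len(cells)

    markers = {c: [] for c in categories}
    for a, b in permutations(sorted(expr), 2):
        for c in categories:
            # A > B must hold in enough cells of category c ...
            if frac_greater(a, b, cells_by_cat[c]) < fraction:
                continue
            # ... and B > A must hold in enough cells of every other category.
            if all(
                frac_greater(b, a, cells_by_cat[other]) >= fraction
                for other in categories
                if other != c
            ):
                markers[c].append((a, b))
    return markers


# Toy training data (made up): four cells, three genes, two categories.
labels = ["G1", "G1", "S", "S"]
expr = {
    "g1": [5, 6, 1, 2],
    "g2": [1, 2, 5, 6],
    "g3": [3, 3, 3, 3],
}
markers = find_marker_pairs(expr, labels, fraction=0.75)
```

Testing every ordered gene pair scales quadratically with the number of genes, so a real implementation would need to be considerably more efficient than this sketch.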
## cyclone
This function implements the classification step. To illustrate, consider the classification of cells into G1 phase (one category of the cell cycle). Pairs of marker genes are identified with sandbag, where the expression of the first gene in the training data is greater than the second in G1 phase but less than the second in all other phases. For each cell, cyclone calculates the proportion of all marker pairs where the expression of the first gene is greater than the second in the new data. A high proportion suggests that the cell is likely to belong to G1 phase, as the expression ranking in the new data is consistent with that in the training data. Proportions are not directly comparable between phases because a different set of gene pairs is used for each phase. Instead, proportions are converted into scores that account for the size and precision of the proportion estimate. The same process is repeated for all phases, using the corresponding set of marker pairs.
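The per-cell proportion described above can be sketched as follows. This is an illustration only, not the PyPairs API: the name `pair_proportion`, the marker pairs, and the cell values are made up, and the sketch assumes every marker gene is present in the cell. The conversion of proportions into scores is not shown.

```python
def pair_proportion(cell, marker_pairs):
    """Hypothetical helper: fraction of marker pairs (A, B) with A > B in one cell.

    cell:         dict mapping gene name -> expression value in the new data
    marker_pairs: list of (first_gene, second_gene) tuples for one phase
    """
    hits = sum(cell[a] > cell[b] for a, b in marker_pairs)
    return hits / len(marker_pairs)


# Made-up marker pairs, standing in for the output of the training step.
marker_pairs = {
    "G1": [("g1", "g2"), ("g1", "g3"), ("g3", "g2")],
    "S": [("g2", "g1"), ("g2", "g3"), ("g3", "g1")],
}

# One new cell with made-up expression values.
cell = {"g1": 4, "g2": 1, "g3": 5}

proportions = {
    phase: pair_proportion(cell, pairs) for phase, pairs in marker_pairs.items()
}
```

Here two of the three G1 pairs and one of the three S pairs are in the expected order, so the cell looks more like G1 than S. With marker sets of different sizes the raw proportions would not be directly comparable, which is why the text converts them into scores.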
## example