A Python reimplementation of the Pairs algorithm described by A. Scialdone et al. (2015)

# PyPairs - A python scRNA-Seq classifier

This is a Python reimplementation of the _Pairs_ algorithm described by A. Scialdone et al. (2015). The original paper is available at: https://doi.org/10.1016/j.ymeth.2015.06.021

The algorithm aims to classify single cells based on their transcriptomic signal. Although it was originally created to predict cell cycle phase from scRNA-Seq data, it can be applied to other classification tasks where labelled training data is available.

It is a supervised machine learning algorithm and as such it consists of two components: training (sandbag) and prediction (cyclone).

## sandbag

This function implements the training step of the pair-based prediction method. Pairs of genes _(A, B)_ are identified from a training data set in which the category of each sample is known. A pair is selected for category 1 if the fraction of category 1 cells with expression of A > B (based on the expression values in the data set) exceeds a set threshold fraction and, in every other category, the fraction of cells with B > A also exceeds that threshold. These pairs are defined as the marker pairs for category 1. The procedure is repeated for each category to obtain a separate set of marker pairs.
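The selection rule above can be sketched in a few lines of NumPy. This is a minimal illustration and not the PyPairs implementation: the function names `find_marker_pairs` and `train_marker_pairs`, the genes-by-cells array layout, and reading "exceeds" as "at least `fraction`" are all assumptions made for the example.

```python
import numpy as np


def find_marker_pairs(expr, labels, category, fraction=0.5):
    """Return (A, B) gene index pairs that mark `category`.

    expr   : 2-D array, genes x cells (layout assumed for this sketch)
    labels : one known category label per cell
    A pair is kept if A > B in at least `fraction` of the cells of
    `category` and B > A in at least `fraction` of the cells of every
    other category (the ">=" comparison is an assumption).
    """
    expr = np.asarray(expr)
    labels = np.asarray(labels)
    in_cat = labels == category
    others = [c for c in np.unique(labels) if c != category]

    pairs = []
    n_genes = expr.shape[0]
    for a in range(n_genes):
        for b in range(n_genes):
            if a == b:
                continue
            # Fraction of cells in the target category with A > B.
            if np.mean(expr[a, in_cat] > expr[b, in_cat]) < fraction:
                continue
            # Fraction of cells with B > A in each other category.
            if all(
                np.mean(expr[b, labels == c] > expr[a, labels == c]) >= fraction
                for c in others
            ):
                pairs.append((a, b))
    return pairs


def train_marker_pairs(expr, labels, fraction=0.5):
    """Repeat the selection for every category found in `labels`."""
    return {
        str(c): find_marker_pairs(expr, labels, c, fraction)
        for c in np.unique(labels)
    }


# Toy training data: 3 genes x 4 cells, two cells per category.
toy_expr = np.array([[5, 6, 1, 0],
                     [1, 2, 4, 5],
                     [3, 3, 3, 3]])
toy_labels = ["G1", "G1", "S", "S"]
toy_pairs = train_marker_pairs(toy_expr, toy_labels, fraction=0.5)
```

The brute-force double loop tests every ordered gene pair, which is fine for a toy matrix but scales quadratically with the number of genes; a real implementation would need to vectorise or parallelise this step.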

## cyclone

This function implements the classification step. To illustrate, consider assigning cells to G1 phase (one category of the cell cycle). Pairs of marker genes are identified with sandbag, where the expression of the first gene in the training data is greater than the second in G1 phase but less than the second in all other phases. For each cell, cyclone calculates the proportion of all marker pairs where the expression of the first gene is greater than the second in the new data. A high proportion suggests that the cell is likely to belong to G1 phase, as the expression ranking in the new data is consistent with that in the training data. Proportions are not directly comparable between phases because a different set of gene pairs is used for each phase. Instead, proportions are converted into scores that account for the size and precision of the proportion estimate. The same process is repeated for all phases, each with its own set of marker pairs.
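The per-cell proportion described above can be sketched as follows. This is an illustrative example only, not the PyPairs API: `pair_proportions` is a hypothetical helper, the genes-by-cells layout is assumed, and ties between the two genes are simply counted as "not greater". The conversion of proportions into scores is not shown, because its details are not given here.

```python
import numpy as np


def pair_proportions(expr, marker_pairs):
    """Per cell, the fraction of marker pairs with first gene > second gene.

    expr         : 2-D array, genes x cells (layout assumed for this sketch)
    marker_pairs : non-empty list of (first, second) gene index pairs
    """
    if len(marker_pairs) == 0:
        raise ValueError("marker_pairs must not be empty")
    expr = np.asarray(expr)
    first = np.array([a for a, _ in marker_pairs])
    second = np.array([b for _, b in marker_pairs])
    # Boolean matrix (pairs x cells): True where the ranking matches.
    hits = expr[first, :] > expr[second, :]
    return hits.mean(axis=0)


# Hypothetical G1 marker pairs and new data (3 genes x 3 cells).
g1_pairs = [(0, 1), (0, 2), (2, 1)]
new_expr = np.array([[7, 0, 5],
                     [2, 6, 1],
                     [4, 3, 6]])
g1_props = pair_proportions(new_expr, g1_pairs)
```

A cell whose proportion is close to 1 ranks the marker genes the same way the G1 training cells did. Running the same helper with each phase's own marker pairs gives one proportion per phase for every cell.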

## example
