Doublet prediction in single-cell RNA-sequencing data

Scrublet

Single-Cell Remover of Doublets

Python code for identifying doublets in single-cell RNA-seq data. For details and validation of the method, see our paper in Cell Systems or the preprint on bioRxiv.

Quick start:

For a typical workflow, including interpretation of predicted doublet scores, see the example notebook.

Given a raw (unnormalized) UMI counts matrix counts_matrix with cells as rows and genes as columns, calculate a doublet score for each cell:

import scrublet as scr
scrub = scr.Scrublet(counts_matrix)
doublet_scores, predicted_doublets = scrub.scrub_doublets()

scrub.scrub_doublets() simulates doublets from the observed data and uses a k-nearest-neighbor classifier to calculate a continuous doublet_score (between 0 and 1) for each transcriptome. The score is then thresholded automatically to produce predicted_doublets, a boolean array that is True for predicted doublets and False otherwise.
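
To make the idea of simulating doublets and scoring with nearest neighbors concrete, here is a minimal NumPy sketch. It is not Scrublet's implementation: the function name toy_doublet_scores, its parameters (sim_ratio, k, seed), and the normalization are assumptions chosen for illustration, and it leaves out whatever pre-processing and score calibration the real method applies (see the paper for those).

```python
import numpy as np


def toy_doublet_scores(counts, sim_ratio=2.0, k=10, seed=0):
    """Fraction of each observed cell's k nearest neighbors that are simulated doublets.

    Hypothetical helper for illustration only; not part of the scrublet package.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_obs = counts.shape[0]
    n_sim = int(sim_ratio * n_obs)

    # Simulate doublets by summing the raw counts of randomly paired observed cells.
    pairs = rng.integers(0, n_obs, size=(n_sim, 2))
    simulated = counts[pairs[:, 0]] + counts[pairs[:, 1]]

    # Scale every profile to the same total and log-transform (an illustrative choice).
    combined = np.vstack([counts, simulated])
    totals = np.maximum(combined.sum(axis=1, keepdims=True), 1.0)
    combined = np.log1p(1e4 * combined / totals)
    is_simulated = np.arange(n_obs + n_sim) >= n_obs

    # Squared Euclidean distance from each observed cell to every profile.
    sq_norms = (combined ** 2).sum(axis=1)
    dist2 = sq_norms[:n_obs, None] + sq_norms[None, :] - 2.0 * (combined[:n_obs] @ combined.T)
    dist2[np.arange(n_obs), np.arange(n_obs)] = np.inf  # a cell is not its own neighbor

    # Indices of the k smallest distances per row (order within the k is arbitrary).
    neighbors = np.argpartition(dist2, k - 1, axis=1)[:, :k]
    return is_simulated[neighbors].mean(axis=1)
```

A cell that sits among simulated doublets gets a score near 1, while a cell surrounded by other observed cells gets a lower score. In this toy version the raw fraction depends on sim_ratio, so scores are only comparable within a single run.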

Best practices:

  • When working with data from multiple samples, run Scrublet on each sample separately. Because Scrublet is designed to detect technical doublets formed by the random co-encapsulation of two cells, it may perform poorly on merged datasets where the cell type proportions are not representative of any single sample.
  • Check that the doublet score threshold is reasonable (in an ideal case, separating the two peaks of a bimodal simulated doublet score histogram, as in this example), and adjust manually if necessary.
  • Visualize the doublet predictions in a 2-D embedding (e.g., UMAP or t-SNE). Predicted doublets should mostly co-localize (possibly in multiple clusters). If they do not, you may need to adjust the doublet score threshold, or change the pre-processing parameters to better resolve the cell states present in your data.
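
The first two practices above can be sketched as a small loop that builds one detector per sample and, where needed, replaces the automatic call with a manual cutoff. The helper score_samples_separately and its arguments (counts_by_sample, make_scrublet, manual_thresholds) are hypothetical names introduced here; the only Scrublet calls assumed are the constructor and scrub_doublets() shown in the quick start.

```python
def score_samples_separately(counts_by_sample, make_scrublet, manual_thresholds=None):
    """Run doublet detection once per sample instead of on a merged matrix.

    counts_by_sample:  dict mapping sample name -> raw counts matrix (cells x genes)
    make_scrublet:     callable that builds a detector from one counts matrix,
                       e.g. scr.Scrublet after `import scrublet as scr`
    manual_thresholds: optional dict mapping sample name -> score cutoff that
                       overrides the automatic threshold for that sample
    """
    manual_thresholds = manual_thresholds or {}
    results = {}
    for name, counts in counts_by_sample.items():
        scrub = make_scrublet(counts)  # one detector per sample
        scores, predicted = scrub.scrub_doublets()
        if name in manual_thresholds:
            # Re-call doublets from the continuous scores with the chosen cutoff.
            cutoff = manual_thresholds[name]
            predicted = [score > cutoff for score in scores]
        results[name] = (scores, predicted)
    return results
```

For example, score_samples_separately(samples, scr.Scrublet, {"sample_2": 0.25}) would keep the automatic threshold for every sample except the hypothetical "sample_2". The cutoff value itself should come from inspecting that sample's simulated doublet score histogram, not from a fixed default.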

Installation:

To install from PyPI:

pip install scrublet

To install from source:

git clone https://github.com/swolock/scrublet.git
cd scrublet
pip install -r requirements.txt
pip install --upgrade .

Old versions:

Previous versions can be found here.

Other doublet detection tools:

DoubletFinder
DoubletDecon
DoubletDetection


Download files

Source Distribution

scrublet-0.2.3.tar.gz (15.3 kB)

Uploaded Source

Built Distribution

scrublet-0.2.3-py3-none-any.whl (15.5 kB)

Uploaded Python 3

File details

Details for the file scrublet-0.2.3.tar.gz.

File metadata

  • Download URL: scrublet-0.2.3.tar.gz
  • Size: 15.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.0.0.post20201207 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.5

File hashes

Hashes for scrublet-0.2.3.tar.gz
Algorithm Hash digest
SHA256 2185f63070290267f82a36e5b4cae8c321f10415d2d0c9f7e5e97b1126bf653a
MD5 8be4c654b26bf5c54eb0c922c48a5125
BLAKE2b-256 bff852cecc93d2ac7b7ffe53662b60c34b2ad7f97eed7360e3d264080f8b1608

File details

Details for the file scrublet-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: scrublet-0.2.3-py3-none-any.whl
  • Size: 15.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.0.0.post20201207 requests-toolbelt/0.9.1 tqdm/4.55.0 CPython/3.8.5

File hashes

Hashes for scrublet-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 92b8a0206fc710b397c8dd535ac75d26242dea0976d8aa632e3765438b60478a
MD5 e9e2763b95ffbcf30a67b9623dd00900
BLAKE2b-256 217482308f7bdcbda730b772a6d1afb6f55b9706601032126c4359afb3fb8986
