Python implementation of the SCENIC pipeline for transcription factor inference from single-cell transcriptomics experiments.

Project description

pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.

The original implementation was written in R, and the results were published in Nature Methods [1].

pySCENIC can be run on a single desktop machine but also scales to multi-core clusters to analyze thousands of cells quickly. This scaling is achieved via the dask framework for distributed computing [2].

Full documentation is available on Read the Docs.

News

2020-02-27

0.10.0 release

  • Added a helper script arboreto_with_multiprocessing.py that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask for compatibility.

  • Ability to set a fixed seed in both the AUCell step and in the calculation of regulon thresholds (CLI parameter --seed; aucell function parameter seed).

  • (since 0.9.18) In the modules_from_adjacencies function, the default value of rho_mask_dropouts is changed to False. This now matches the behavior of the R version of SCENIC. The CLI version has an additional option to turn dropout masking back on (--mask_dropouts).
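
To illustrate why the dropout-masking default matters, here is a minimal sketch of a TF-target correlation computed with and without masking. It assumes that masking means skipping cells in which either gene has zero expression. The helper names (pearson, tf_target_correlation) and the toy values are hypothetical and are not part of the pySCENIC API.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def tf_target_correlation(tf_expr, target_expr, mask_dropouts=False):
    """Correlate a TF with one target gene across cells.

    Assumption: with mask_dropouts=True, cells in which either gene
    has zero expression are left out of the calculation.
    """
    pairs = list(zip(tf_expr, target_expr))
    if mask_dropouts:
        pairs = [(t, g) for t, g in pairs if t != 0 and g != 0]
    if not pairs:
        return 0.0
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Toy expression values for six cells (hypothetical data).
tf = [0, 0, 0, 2, 4, 6]
target = [5, 0, 4, 1, 2, 3]

rho_all = tf_target_correlation(tf, target)
rho_masked = tf_target_correlation(tf, target, mask_dropouts=True)
print(f"all cells: {rho_all:.2f}, dropouts masked: {rho_masked:.2f}")
```

In this toy data the correlation is negative over all cells but perfectly positive once the zero-expression cells are masked, so the two settings can assign the same TF-target pair to different modules.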

Overview

The pipeline has three steps:

  1. First, transcription factors (TFs) and their target genes, together defining a regulon, are derived using gene inference methods that rely solely on correlations between the expression of genes across cells. The arboreto package is used for this step.

  2. These regulons are refined by pruning targets that do not have an enrichment for a corresponding motif of the TF, effectively separating direct from indirect targets based on the presence of cis-regulatory footprints.

  3. Finally, the original cells are differentiated and clustered based on the activity of these discovered regulons.
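
The idea behind step 3 can be sketched in plain Python: rank the genes of each cell by expression and score a regulon by the area under its recovery curve over the top-ranked genes. This is a simplified illustration and not the pySCENIC implementation. The function names, the top_fraction value and the toy data are assumptions, and the seed is assumed to control only the random tie-breaking between equally expressed genes.

```python
import random


def rank_genes(expression, seed=None):
    """Return gene names ordered from highest to lowest expression.

    Genes are shuffled first so that ties are broken at random;
    a fixed seed makes the tie order reproducible.
    """
    genes = sorted(expression)  # deterministic starting order
    random.Random(seed).shuffle(genes)
    # sorted() is stable, so tied genes keep their shuffled order
    return sorted(genes, key=lambda g: expression[g], reverse=True)


def regulon_auc(expression, regulon, top_fraction=0.5, seed=None):
    """Normalized area under the recovery curve of a regulon in one cell."""
    ranking = rank_genes(expression, seed)
    cutoff = max(1, int(top_fraction * len(ranking)))
    hits = 0
    area = 0
    for gene in ranking[:cutoff]:
        if gene in regulon:
            hits += 1
        area += hits  # cumulative number of recovered regulon genes
    # Best case: all regulon genes occupy the very top ranks.
    max_hits = min(len(set(regulon)), cutoff)
    max_area = sum(min(k, max_hits) for k in range(1, cutoff + 1))
    return area / max_area if max_area else 0.0


# Two toy cells and one regulon (hypothetical data).
regulon = {"G1", "G2"}
cell_a = {"G1": 9, "G2": 7, "G3": 0, "G4": 0, "G5": 1, "G6": 3}
cell_b = {"G1": 0, "G2": 0, "G3": 8, "G4": 6, "G5": 5, "G6": 1}

auc_a = regulon_auc(cell_a, regulon, seed=42)
auc_b = regulon_auc(cell_b, regulon, seed=42)
print(auc_a, auc_b)
```

Cells in which the regulon's genes rank near the top receive a high score and cells in which they rank low receive a score near zero. The resulting cell-by-regulon matrix is what the clustering operates on.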

The largest speed improvement comes from the arboreto package used in step 1, which provides GRNBoost2 as an alternative to GENIE3 [3]. The arboreto package can be controlled from within pySCENIC.

All the functionality of the original R implementation is available and in addition:

  1. You can leverage multi-core and multi-node clusters using dask and its distributed scheduler.

  2. We implemented a version of the recovery of input genes that takes into account weights associated with these genes.

  3. Regulons, i.e. the regulatory networks that connect a TF with its target genes, are now also derived for targets that are repressed, and these are used in the cell enrichment analysis as well.
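
As a rough illustration of the last item, the sketch below separates the targets of a TF into an activated and a repressed set by the sign of the TF-target correlation. The function name split_regulon, the "(+)"/"(-)" labels, the threshold value and the correlation values are all hypothetical and chosen for illustration only; they are not taken from the pySCENIC API.

```python
def split_regulon(tf, target_rhos, rho_threshold=0.03):
    """Split the targets of a TF by the sign of their correlation with it.

    Targets whose absolute correlation does not exceed rho_threshold
    are treated as uncorrelated and dropped from both sets.
    """
    activated = sorted(g for g, rho in target_rhos.items() if rho > rho_threshold)
    repressed = sorted(g for g, rho in target_rhos.items() if rho < -rho_threshold)
    return {f"{tf}(+)": activated, f"{tf}(-)": repressed}


# Hypothetical TF-target correlations.
regulons = split_regulon("TF1", {"A": 0.6, "B": -0.4, "C": 0.01})
print(regulons)
```

Keeping the two sets separate means that each can be scored on its own in the cell enrichment step.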

Website

For more information, please visit LCB, SCENIC (R version), or SCENICprotocol (for a Nextflow implementation).

Acknowledgments

We are grateful to all providers of TF-annotated position weight matrices, in particular Martha Bulyk (UNIPROBE), Wyeth Wasserman and Albin Sandelin (JASPAR), BioBase (TRANSFAC), Scot Wolfe and Michael Brodsky (FlyFactorSurvey) and Timothy Hughes (cisBP).

References
