Skip to main content

Modelling CRISPR dropout data

Project description

Crispy logo

License PyPI version DOI

Module with utility functions to process CRISPR-based screens and method to correct gene independent copy-number effects.


Crispy uses Sklearn implementation of Gaussian Process Regression, fitting each sample independently.


Install pybedtools and then install Crispy

conda install -c bioconda pybedtools

pip install cy


Support to library imports:

from crispy.CRISPRData import Library

# Master Library, standardised assembly of KosukeYusa V1.1, Avana, Brunello and TKOv3 
# CRISPR-Cas9 libraries.
master_lib = Library.load_library("MasterLib_v1.csv.gz")

# Genome-wide minimal CRISPR-Cas9 library. 
minimal_lib = Library.load_library("MinLibCas9.csv.gz")

# Some of the most broadly adopted CRISPR-Cas9 libraries:
# 'Avana_v1.csv.gz', 'Brunello_v1.csv.gz', 'GeCKO_v2.csv.gz', 'Manjunath_Wu_v1.csv.gz', 
# 'TKOv3.csv.gz', 'Yusa_v1.1.csv.gz'
brunello_lib = Library.load_library("Brunello_v1.csv.gz")

Select sgRNAs (across multiple CRISPR-Cas9 libraries) for a given gene:

from crispy.GuideSelection import GuideSelection

# sgRNA selection class
gselection = GuideSelection()

# Select 5 optimal sgRNAs for MCL1 across multiple libraries 
gene_guides = gselection.select_sgrnas(
    "MCL1", n_guides=5, offtarget=[1, 0], jacks_thres=1, ruleset2_thres=.4

# Perform different rounds of sgRNA selection with increasingly relaxed efficiency thresholds 
gene_guides = gselection.selection_rounds("TRIM49", n_guides=5, do_amber_round=True, do_red_round=True)

Copy-number correction:

import crispy as cy
import matplotlib.pyplot as plt

# Import data
rawcounts, copynumber = cy.Utils.get_example_data()

# Import CRISPR-Cas9 library
lib = cy.Utils.get_crispr_lib()

# Instantiate Crispy
crispy = cy.Crispy(
    raw_counts=rawcounts, copy_number=copynumber, library=lib

# Fold-changes and correction integrated funciton.
# Output is a modified/expanded BED formated data-frame with sgRNA and segments information
bed_df = crispy.correct(x_features='ratio', y_feature='fold_change')

# Gaussian Process Regression is stored
crispy.gpr.plot(x_feature='ratio', y_feature='fold_change')


Credits and License

Developed at the Wellcome Sanger Institue (2017-2020).

For citation please refer to:

Gonçalves E, Behan FM, Louzada S, Arnol D, Stronach EA, Yang F, Yusa K, Stegle O, Iorio F, Garnett MJ (2019) Structural rearrangements generate cell-specific, gene-independent CRISPR-Cas9 loss of fitness effects. Genome Biol 20: 27

Gonçalves E, Thomas M, Behan FM, Picco G, Pacini C, Allen F, Parry-Smith D, Iorio F, Parts L, Yusa K, Garnett MJ (2019) Minimal genome-wide human CRISPR-Cas9 library. bioRxiv

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for cy, version 0.4.12
Filename, size File type Python version Upload date Hashes
Filename, size cy-0.4.12-py3-none-any.whl (38.1 MB) File type Wheel Python version py3 Upload date Hashes View
Filename, size cy-0.4.12.tar.gz (52.8 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page