Modelling CRISPR dropout data
Module with utility functions to process CRISPR-based screens and a method to correct gene-independent copy-number effects.
Description
Crispy uses the scikit-learn implementation of Gaussian Process Regression, fitting each sample independently.
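To make the per-sample fitting concrete, here is a minimal sketch on synthetic data. It is not Crispy's own code: the kernel choice (ConstantKernel * RBF + WhiteKernel), the helper name fit_sample, the synthetic samples, and the step of subtracting the predicted mean from the fold-changes are all assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel


def fit_sample(ratio, fold_change, seed=0):
    """Fit one GPR for a single sample: fold-change as a function of copy-number ratio."""
    # Assumed kernel: smooth trend (Constant * RBF) plus independent noise (White)
    kernel = ConstantKernel() * RBF() + WhiteKernel()
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=seed)
    gpr.fit(ratio.reshape(-1, 1), fold_change)
    return gpr


# Synthetic data: two hypothetical samples with different copy-number effects
rng = np.random.default_rng(0)
samples = {}
for name, slope in [("sample_A", -0.5), ("sample_B", -0.3)]:
    ratio = rng.uniform(0.5, 4.0, size=60)
    fold_change = slope * ratio + rng.normal(0, 0.1, size=60)
    samples[name] = (ratio, fold_change)

# Each sample gets its own model; no information is shared between samples
corrected = {}
for name, (ratio, fold_change) in samples.items():
    gpr = fit_sample(ratio, fold_change)
    cn_effect = gpr.predict(ratio.reshape(-1, 1))
    # Assumed correction step: remove the trend explained by copy-number ratio
    corrected[name] = fold_change - cn_effect
```

Fitting each sample separately lets the estimated copy-number effect differ between samples, at the cost of one model fit per sample.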
Install
Install pybedtools and then install Crispy:
conda install -c bioconda pybedtools
pip install cy
Examples
Support for importing CRISPR-Cas9 libraries:
from crispy.CRISPRData import Library
# Master Library, standardised assembly of KosukeYusa V1.1, Avana, Brunello and TKOv3
# CRISPR-Cas9 libraries.
master_lib = Library.load_library("MasterLib_v1.csv.gz")
# Genome-wide minimal CRISPR-Cas9 library.
minimal_lib = Library.load_library("MinLibCas9.csv.gz")
# Some of the most broadly adopted CRISPR-Cas9 libraries:
# 'Avana_v1.csv.gz', 'Brunello_v1.csv.gz', 'GeCKO_v2.csv.gz', 'Manjunath_Wu_v1.csv.gz',
# 'TKOv3.csv.gz', 'Yusa_v1.1.csv.gz'
brunello_lib = Library.load_library("Brunello_v1.csv.gz")
Select sgRNAs (across multiple CRISPR-Cas9 libraries) for a given gene:
from crispy.GuideSelection import GuideSelection
# sgRNA selection class
gselection = GuideSelection()
# Select 5 optimal sgRNAs for MCL1 across multiple libraries
gene_guides = gselection.select_sgrnas(
    "MCL1", n_guides=5, offtarget=[1, 0], jacks_thres=1, ruleset2_thres=0.4
)
# Perform different rounds of sgRNA selection with increasingly relaxed efficiency thresholds
gene_guides = gselection.selection_rounds("TRIM49", n_guides=5, do_amber_round=True, do_red_round=True)
Copy-number correction:
import crispy as cy
import matplotlib.pyplot as plt
# Import data
rawcounts, copynumber = cy.Utils.get_example_data()
# Import CRISPR-Cas9 library
lib = cy.Utils.get_crispr_lib()
# Instantiate Crispy
crispy = cy.Crispy(
    raw_counts=rawcounts, copy_number=copynumber, library=lib
)
# Integrated function that computes fold-changes and applies the correction.
# Output is a modified/expanded BED-formatted data-frame with sgRNA and segment information
bed_df = crispy.correct(x_features='ratio', y_feature='fold_change')
print(bed_df.head())
# The fitted Gaussian Process Regression model is stored in crispy.gpr
crispy.gpr.plot(x_feature='ratio', y_feature='fold_change')
plt.show()
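For readers unfamiliar with the fold-change step mentioned above, the following is a rough, standard-library-only sketch of how sgRNA log2 fold-changes are commonly derived from raw counts. It is an illustration under stated assumptions, not Crispy's implementation: the function log2_fold_changes, the counts-per-million scaling, the pseudocount of 1 and the toy counts are all hypothetical.

```python
import math


def log2_fold_changes(counts, control_counts, pseudocount=1.0):
    """Log2 fold-change of each sgRNA relative to a control (e.g. plasmid) sample."""
    total = sum(counts.values())
    control_total = sum(control_counts.values())
    fold_changes = {}
    for sgrna, control in control_counts.items():
        # Scale to counts per million so samples with different depths are comparable
        norm = counts.get(sgrna, 0) / total * 1e6
        norm_control = control / control_total * 1e6
        # The pseudocount avoids taking the log of zero for sgRNAs with no reads
        fold_changes[sgrna] = math.log2((norm + pseudocount) / (norm_control + pseudocount))
    return fold_changes


# Toy counts (hypothetical): sg1 is depleted relative to the plasmid control
plasmid = {"sg1": 500, "sg2": 500, "sg3": 500, "sg4": 500}
sample = {"sg1": 100, "sg2": 300, "sg3": 300, "sg4": 300}
fc = log2_fold_changes(sample, plasmid)
```

Negative values indicate sgRNAs that dropped out relative to the control; these fold-changes are the kind of values a copy-number correction would then adjust.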
Credits and License
Developed at the Wellcome Sanger Institute (2017-2020).
For citation please refer to: