Modelling CRISPR dropout data
A module with utility functions to process CRISPR-based screens and a method to correct gene-independent copy-number effects.
Description
Crispy uses the scikit-learn implementation of Gaussian Process Regression, fitting each sample independently.
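To make the "one fit per sample" idea concrete, here is a minimal sketch. It does not call scikit-learn or Crispy: it reimplements only the Gaussian Process posterior mean in pure Python, with a fixed squared-exponential kernel and fixed noise, whereas scikit-learn also optimises the kernel hyperparameters. The toy data, the sample names and the choice of modelling fold change as a function of copy-number ratio are assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, length_scale=2.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def cholesky_solve(lower, rhs):
    """Solve (L L^T) x = rhs by forward then backward substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(lower[i][k] * z[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def gp_posterior_mean(x_train, y_train, x_new, noise=0.1):
    """Posterior mean of a GP with an RBF kernel and fixed (assumed) noise."""
    y_mean = sum(y_train) / len(y_train)
    n = len(x_train)
    cov = [
        [rbf_kernel(x_train[i], x_train[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    weights = cholesky_solve(cholesky(cov), [y - y_mean for y in y_train])
    return [
        y_mean + sum(rbf_kernel(x, xt) * w for xt, w in zip(x_train, weights))
        for x in x_new
    ]


# Toy data (made up): copy-number ratio and mean fold change per segment.
samples = {
    "sample_A": ([1, 1, 2, 2, 4, 4, 8, 8], [0.0, 0.1, -0.2, -0.3, -0.8, -0.9, -1.5, -1.6]),
    "sample_B": ([1, 1, 2, 4, 4, 8], [0.2, 0.1, 0.0, -0.4, -0.5, -1.0]),
}

# Each sample gets its own fit; nothing is shared between samples.
fits = {name: gp_posterior_mean(x, y, x_new=[1, 8]) for name, (x, y) in samples.items()}
for name, (low, high) in fits.items():
    print(f"{name}: ratio=1 -> {low:.2f}, ratio=8 -> {high:.2f}")
```

Because every sample is fitted separately, the estimated copy-number effect can differ between samples, which is the point of fitting them independently.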
Install
Install pybedtools:
conda install -c bioconda pybedtools
and then install Crispy:
pip install cy
Examples
Support for importing libraries:
from crispy.CRISPRData import Library
# Master Library, standardised assembly of KosukeYusa V1.1, Avana, Brunello and TKOv3
# CRISPR-Cas9 libraries.
master_lib = Library.load_library("MasterLib_v1.csv.gz")
# Genome-wide minimal CRISPR-Cas9 library.
minimal_lib = Library.load_library("MinLibCas9.csv.gz")
# Some of the most broadly adopted CRISPR-Cas9 libraries:
# 'Avana_v1.csv.gz', 'Brunello_v1.csv.gz', 'GeCKO_v2.csv.gz', 'Manjunath_Wu_v1.csv.gz',
# 'TKOv3.csv.gz', 'Yusa_v1.1.csv.gz'
brunello_lib = Library.load_library("Brunello_v1.csv.gz")
Select sgRNAs (across multiple CRISPR-Cas9 libraries) for a given gene:
from crispy.GuideSelection import GuideSelection
# sgRNA selection class
gselection = GuideSelection()
# Select 5 optimal sgRNAs for MCL1 across multiple libraries
gene_guides = gselection.select_sgrnas(
    "MCL1", n_guides=5, offtarget=[1, 0], jacks_thres=1, ruleset2_thres=0.4
)
# Perform different rounds of sgRNA selection with increasingly relaxed efficiency thresholds
gene_guides = gselection.selection_rounds("TRIM49", n_guides=5, do_amber_round=True, do_red_round=True)
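The idea of selection rounds with increasingly relaxed thresholds can be sketched independently of Crispy. Everything below is hypothetical: the `Guide` class, the `passes` rule (a guide passes when its JACKS score is within `jacks_thres` of 1 and its Rule Set 2 score is at least `ruleset2_thres`) and the amber and red threshold values are assumptions for illustration, not the library's implementation. Only the parameter names and the first-round values are taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    sgrna_id: str
    jacks: float     # assumed efficacy score, values near 1 preferred
    ruleset2: float  # assumed on-target score, higher is better


# Hypothetical thresholds per round, from strictest to most relaxed.
ROUNDS = [
    ("green", {"jacks_thres": 1.0, "ruleset2_thres": 0.4}),
    ("amber", {"jacks_thres": 2.5, "ruleset2_thres": 0.2}),
    ("red", {"jacks_thres": float("inf"), "ruleset2_thres": 0.0}),
]


def passes(guide, jacks_thres, ruleset2_thres):
    """Assumed filter: JACKS close enough to 1 and Rule Set 2 high enough."""
    return abs(guide.jacks - 1.0) <= jacks_thres and guide.ruleset2 >= ruleset2_thres


def selection_rounds(guides, n_guides=5, do_amber_round=True, do_red_round=True):
    """Fill up to n_guides, relaxing thresholds only when guides are still missing."""
    enabled = {"green": True, "amber": do_amber_round, "red": do_red_round}
    selected, seen = [], set()
    for name, thresholds in ROUNDS:
        if not enabled[name]:
            continue
        for guide in guides:
            if len(selected) == n_guides:
                return selected
            if guide.sgrna_id not in seen and passes(guide, **thresholds):
                seen.add(guide.sgrna_id)
                selected.append((guide.sgrna_id, name))
    return selected


# Made-up candidate guides for a single gene.
candidates = [
    Guide("sg1", jacks=1.0, ruleset2=0.6),
    Guide("sg2", jacks=1.2, ruleset2=0.5),
    Guide("sg3", jacks=2.8, ruleset2=0.3),
    Guide("sg4", jacks=0.9, ruleset2=0.1),
]

print(selection_rounds(candidates, n_guides=3))
```

With three guides requested, the two guides passing the strict round are kept and the third slot is filled in the amber round; disabling the relaxed rounds simply returns fewer guides.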
Copy-number correction:
import crispy as cy
import matplotlib.pyplot as plt
from crispy.CRISPRData import ReadCounts, Library
"""
Import sample data
"""
rawcounts, copynumber = cy.Utils.get_example_data()
"""
Import CRISPR-Cas9 library
Important:
Library must have the following columns: "Chr", "Start", "End", "Approved_Symbol"
Library and segments must use consistent "Chr" formatting: "Chr1" or "chr1" or "1"
Guarantee that the "Start" and "End" columns are int
"""
lib = Library.load_library("Yusa_v1.1.csv.gz")
lib = lib.rename(
    columns=dict(start="Start", end="End", chr="Chr", Gene="Approved_Symbol")
).dropna(subset=["Chr", "Start", "End"])
lib["Chr"] = "chr" + lib["Chr"]
lib["Start"] = lib["Start"].astype(int)
lib["End"] = lib["End"].astype(int)
"""
Calculate fold-change
"""
plasmids = ["ERS717283"]
rawcounts = ReadCounts(rawcounts).remove_low_counts(plasmids)
sgrna_fc = rawcounts.norm_rpm().foldchange(plasmids)
"""
Correct CRISPR-Cas9 sgRNA fold changes
"""
crispy = cy.Crispy(
    sgrna_fc=sgrna_fc.mean(1), copy_number=copynumber, library=lib.loc[sgrna_fc.index]
)
# Integrated function for fold-changes and correction.
# Output is a modified/expanded BED-formatted data-frame with sgRNA and segment information.
# n_sgrna: the minimum number of sgRNAs required per segment for it to be considered in the fit.
# Recommended default values range between 4 and 10.
bed_df = crispy.correct(n_sgrna=10)
print(bed_df.head())
# The fitted Gaussian Process Regression is stored in crispy.gpr
crispy.gpr.plot(x_feature="ratio", y_feature="fold_change")
plt.show()
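The fold-change step in the example above can be illustrated without Crispy. The sketch below shows the usual arithmetic behind reads-per-million normalisation followed by a log2 fold change against the plasmid control. The function bodies, the nested-dict data layout, the `pseudocount` parameter and the toy counts are assumptions for illustration; Crispy's own `norm_rpm()` and `foldchange()` may differ in detail.

```python
import math


def norm_rpm(counts, scale=1e6):
    """Scale each sample's counts so they sum to `scale` (reads per million)."""
    return {
        sample: {sgrna: c / sum(col.values()) * scale for sgrna, c in col.items()}
        for sample, col in counts.items()
    }


def foldchange(rpm, control, pseudocount=1.0):
    """log2 ratio of each sample to the control; the pseudocount avoids log(0)."""
    ctrl = rpm[control]
    return {
        sample: {
            sgrna: math.log2((value + pseudocount) / (ctrl[sgrna] + pseudocount))
            for sgrna, value in col.items()
        }
        for sample, col in rpm.items()
        if sample != control
    }


# Toy read counts (made up): one plasmid control and one screened sample.
counts = {
    "ERS717283": {"sg1": 500, "sg2": 500},
    "sample_A": {"sg1": 100, "sg2": 900},
}

rpm = norm_rpm(counts)
fc = foldchange(rpm, control="ERS717283")
print(fc["sample_A"])
```

A negative value means the sgRNA is depleted relative to the plasmid and a positive value means it is enriched. These sgRNA-level fold changes are what the copy-number correction takes as input.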
Credits and License
Developed at the Wellcome Sanger Institute (2017-2020).
For citation please refer to: