
Base Editor screen analysis [Bayesian Estimation of variant effect] with guide Activity Normalization


crispr-bean


bean improves CRISPR pooled screen analysis by 1) unconfounding variable per-guide editing outcomes using the genotypic outcomes observed in the reporter sequence and 2) accurately modeling the screen procedure.


Overview

bean supports end-to-end analysis of pooled sorting screens, with or without reporter.


bean subcommands include the following. Click on the links to see the full documentation.

  1. count, count-samples: Base-editing-aware mapping of guides, optionally with reporter, from .fastq files.
    • create-screen creates a minimal ReporterScreen object from a flat gRNA count file. Note that allele counts are not included this way, so many functionalities involving allele and edit counts are not supported.
  2. profile: Profile editing preferences of your editor.
  3. qc: Quality control report and filtering out / masking of aberrant samples and guides.
  4. filter: Filter reporter alleles; essential for tiling mode, which allows for all alleles generated from a gRNA.
  5. run: Quantify targeted variants' effect sizes from screen data.
  • Screen data is saved as a ReporterScreen object in the pipeline. BEAN stores mapped gRNA and allele counts in the ReporterScreen object, which is compatible with AnnData.

Installation

First install PyTorch. Then download from PyPI:

pip install crispr-bean[model]

The following installation, without the PyTorch dependency, does not include the variant effect size quantification (bean run) functionality.

pip install crispr-bean
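
To see which of the two installs is in effect, it can help to check whether PyTorch is importable. The following is a minimal sketch using only the standard library. `has_module` and `bean_run_available` are hypothetical helper names, not part of bean, and the check assumes that bean run needs nothing beyond the `bean` and `torch` packages being importable.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def bean_run_available() -> bool:
    # Assumption: `bean run` needs both the `bean` package and PyTorch (`torch`).
    return has_module("bean") and has_module("torch")


if __name__ == "__main__":
    print("bean installed:", has_module("bean"))
    print("torch installed:", has_module("torch"))
    print("bean run usable (assumed):", bean_run_available())
```

If `torch` is reported as missing, installing PyTorch and then `crispr-bean[model]` as described above should enable bean run.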

For the latest version of bean (and for the test files in tests/data), install from GitHub:

git clone https://github.com/pinellolab/crispr-bean.git
cd crispr-bean
pip install -e .

Documentation

See the documentation for tutorials and API references.

Tutorials

Library design                       | Selection                | Reporter | Tutorial link
GWAS variant library                 | FACS sorting             | Yes/No   | GWAS variant screen
Coding sequence tiling library       | FACS sorting             | Yes/No   | Coding sequence tiling screen
GWAS variant library                 | Survival / Proliferation | Yes/No   | GWAS variant screen
Coding sequence tiling library       | Survival / Proliferation | Yes/No   | Coming soon!
Perturbation library without reporter | FACS sorting            | Yes/No   | No reporter screen

Library design: variant or tiling?

The bean filter and bean run steps depend on the type of gRNA library design. BEAN supports two modes:

  1. variant library: Several gRNAs tile each of the targeted variants. Only the editing rate of the target variant is considered and the bystander effects are ignored.

    • :heavy_plus_sign: Increase power for your target variant, as the signal is not distributed across likely no-effect bystanders.
    • :heavy_minus_sign: Ignores potential bystander effect
    • :heavy_check_mark: Suitable for noncoding GWAS variant screens.
  2. tiling library: gRNA densely tiles a long region (e.g. gene(s), exon(s), coding sequence(s)). Bystander edits are considered to obtain alleles with significant fractions. Edited alleles can be "translated" to output coding variants.

    • :heavy_plus_sign: Considers bystander effect
    • :heavy_minus_sign: If the library results in alleles that are not diverse enough across gRNAs, the signal will likely be diluted across all variants in those alleles (e.g. the allele "GGGGG" with a single gRNA score will distribute the score across all 5 G's).
    • :heavy_check_mark: Suitable for coding variant screens with tiling design.
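
The dilution in the "GGGGG" example can be made concrete with a toy calculation. This is only an illustrative sketch, not BEAN's model: it assumes an allele's score is split equally among its edited positions, and `split_allele_score` is a hypothetical helper introduced here.

```python
from collections import defaultdict


def split_allele_score(alleles):
    """Naively attribute each allele's score equally to its edited positions.

    `alleles` is a list of (edited_positions, score) pairs.
    Returns a dict mapping position -> summed share of the scores.
    """
    per_variant = defaultdict(float)
    for positions, score in alleles:
        if not positions:
            continue  # an unedited allele carries no variant to credit
        share = score / len(positions)
        for pos in positions:
            per_variant[pos] += share
    return dict(per_variant)


# One allele editing all five G's: the score is spread over five variants.
single = split_allele_score([((1, 2, 3, 4, 5), 1.0)])

# Diverse alleles that separate position 3 from its neighbours:
# the score stays concentrated on the one position that carries it.
diverse = split_allele_score([((3,), 1.0), ((1, 2), 0.0), ((4, 5), 0.0)])

if __name__ == "__main__":
    print(single)
    print(diverse)
```

In the first case every position receives the same share of the score, so no single variant stands out. In the second, allele diversity lets the score stay on position 3.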

Using BEAN as Python module

import bean as be
cdata = be.read_h5ad("bean_counts_sample.h5ad")

The bean Python package supports multiple data wrangling functionalities for ReporterScreen objects. See the ReporterScreen API tutorial for more detail.

Run time

  • Installation takes 14.4 min, after PyTorch is installed, on a Dell XPS 13 running Ubuntu on WSL.
  • bean run takes 4.6 min with the --scale-by-acc flag on a Dell XPS 13 running Ubuntu on WSL, for a variant screen dataset with 3455 guides and 6 replicates with 4 sorting bins.
  • The full pipeline takes 90.1 s in GitHub Actions for a toy dataset of 2 replicates and 30 guides.

Contributing

If you have questions or feature requests, please open an issue. Pull requests are welcome.

Citation

If you have used BEAN for your analysis, please cite:
Ryu, J. et al. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. medRxiv (2023) doi:10.1101/2023.09.08.23295253

