
Base Editor screen analysis with guide Activity Normalization

crispr-bean

License: AGPL v3

bean improves CRISPR pooled screen analysis by 1) unconfounding variable per-guide editing outcomes using the genotypic outcome observed in the reporter sequence, and 2) accurately modeling the screen procedure.

[Figure: Reporter construct]

Overview

bean supports end-to-end analysis of pooled sorting screens, with or without a reporter.

[Figure: bean pipeline diagram, dag_bean_v2.svg]

bean provides the following subcommands. See the documentation for full details on each.

  1. count, count-samples: Base-editing-aware mapping of guides, optionally with reporters, from .fastq files.
    • create-screen creates a minimal ReporterScreen object from a flat gRNA count file. Allele counts are not included this way, so many functions that rely on allele and edit counts are not supported.
  2. profile: Profile the editing preferences of your editor.
  3. qc: Generate a quality control report and filter out or mask aberrant samples and guides.
  4. filter: Filter reporter alleles; essential for tiling mode, which allows for all alleles generated from a gRNA.
  5. run: Quantify the effect sizes of targeted variants from screen data.
  • Screen data is saved as a ReporterScreen object throughout the pipeline. BEAN stores mapped gRNA and allele counts in the ReporterScreen object, which is compatible with AnnData.
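
To explore the options of each subcommand from Python, a small helper along these lines may be convenient. It is only a sketch: the subcommand names come from the list above, while the `--help` flag is an assumption based on common command-line conventions and the helper names `help_commands` and `print_help_if_installed` are hypothetical.

```python
import shutil
import subprocess

# Subcommand names in the order they are listed above.
BEAN_SUBCOMMANDS = [
    "count",
    "count-samples",
    "create-screen",
    "profile",
    "qc",
    "filter",
    "run",
]


def help_commands(executable="bean"):
    """Return one `<executable> <subcommand> --help` argument list per subcommand."""
    return [[executable, sub, "--help"] for sub in BEAN_SUBCOMMANDS]


def print_help_if_installed(executable="bean"):
    """Run each help command if the executable is on PATH; return False otherwise."""
    if shutil.which(executable) is None:
        return False
    for cmd in help_commands(executable):
        # Assumption: each subcommand accepts the conventional --help flag.
        subprocess.run(cmd, check=False)
    return True
```

Calling `print_help_if_installed()` prints the help text of every subcommand when `bean` is on the PATH and returns `False` without running anything when it is not.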

Installation

First install PyTorch. Then download from PyPI:

pip install crispr-bean

For the latest version of bean (and for the test files in tests/data), install from GitHub:

git clone https://github.com/pinellolab/crispr-bean.git
cd crispr-bean
pip install -e .
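
After either installation route, a quick check that PyTorch is importable and that the `crispr-bean` distribution is registered can catch environment problems early. The sketch below uses only the Python standard library; the helper names `installed_version` and `check_bean_environment` are hypothetical and not part of bean.

```python
from importlib import metadata, util


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_bean_environment():
    """Report whether PyTorch is importable and which crispr-bean version is installed."""
    return {
        "torch_importable": util.find_spec("torch") is not None,
        "crispr_bean_version": installed_version("crispr-bean"),
    }
```

Running `print(check_bean_environment())` shows `None` for the version if the package is missing from the active environment.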

Documentation

See the documentation for tutorials and API references.

Tutorials

| Library design                        | Selection                | Reporter | Tutorial link                |
|---------------------------------------|--------------------------|----------|------------------------------|
| GWAS variant library                  | FACS sorting             | Yes/No   | GWAS variant screen          |
| Coding sequence tiling library        | FACS sorting             | Yes/No   | Coding sequence tiling screen |
| GWAS variant library                  | Survival / Proliferation | Yes/No   | GWAS variant screen          |
| Coding sequence tiling library        | Survival / Proliferation | Yes/No   | Coding sequence tiling screen |
| Perturbation library without reporter | FACS sorting             | No       | No reporter screen           |

Also see the notebook that visualizes screen analysis results.

Library design: variant or tiling?

The bean filter and bean run steps depend on the type of gRNA library design. BEAN supports two modes:

  1. variant library: Several gRNAs tile each of the targeted variants. Only the editing rate of the target variant is considered, and bystander effects are ignored.

    • ➕ Increases power for your target variant, as the signal is not distributed across bystanders that likely have no effect.
    • ➖ Ignores potential bystander effects.
    • ✔ Suitable for noncoding GWAS variant screens.
  2. tiling library: gRNAs densely tile a long region (e.g., gene(s), exon(s), coding sequence(s)). Bystander edits are considered to obtain alleles with significant fractions. Edited alleles can be "translated" to output coding variants.

    • ➕ Considers bystander effects.
    • ➖ If the library results in alleles that are not diverse enough across gRNAs, the signal will likely be diluted across all variants in that allele (e.g., an allele "GGGGG" scored by a single gRNA will distribute its score across the 5 G's).
    • ✔ Suitable for coding variant screens with a tiling design.
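
The dilution caveat for tiling libraries can be made concrete with a toy calculation. The sketch below is purely illustrative and does not reproduce BEAN's model: it assumes, for simplicity, that a single gRNA-level score is split evenly across the edits of one allele, and the function name `dilute_score` and the edit labels are hypothetical.

```python
def dilute_score(allele_edits, guide_score):
    """Split one gRNA-level score evenly across the edits in its allele.

    Toy illustration only: an even split is an assumption made for this
    example, not the way BEAN assigns effect sizes.
    """
    if not allele_edits:
        return {}
    share = guide_score / len(allele_edits)
    return {edit: share for edit in allele_edits}


# Hypothetical allele with five edited G positions, observed through a single gRNA.
edits = ["G1", "G2", "G3", "G4", "G5"]
per_edit = dilute_score(edits, guide_score=1.0)
```

With only one gRNA reporting on the allele, nothing distinguishes the five positions, so each receives one fifth of the score. More diverse alleles across overlapping gRNAs are what allow the individual edits to be separated.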

Using BEAN as Python module

import bean as be
# Load the screen data stored in an .h5ad file
cdata = be.read_h5ad("bean_counts_sample.h5ad")

The bean Python package supports several data-wrangling functions for ReporterScreen objects. See the ReporterScreen API tutorial for more detail.

Run time

  • Installation takes 14.4 min, after PyTorch is already installed, on a Dell XPS 13 running Ubuntu under WSL.
  • bean run takes 4.6 min with the --scale-by-acc flag on a Dell XPS 13 running Ubuntu under WSL, for a variant screen dataset with 3455 guides and 6 replicates with 4 sorting bins.
  • The full pipeline takes 90.1 s in GitHub Actions for a toy dataset of 2 replicates and 30 guides.

Contributing

See CHANGELOG for recent updates. If you have questions or feature requests, please open an issue. Pull requests are welcome.

Citation

If you have used BEAN for your analysis, please cite:
Ryu, J., Barkal, S., Yu, T. et al. Joint genotypic and phenotypic outcome modeling improves base editing variant effect quantification. Nat Genet (2024). https://doi.org/10.1038/s41588-024-01726-6

