Chip-based CRISPR analysis
Introduction
---
title: Workflow
---
flowchart TD
    SP[(synthesize plasmids)] --> FP[(functional plasmids)] --> E[editing] --> SR[(sample reads)] --> A[alignment]
    FP --> A --> OEO[(observed editing outcomes)] --> C[correction] --> CEO[(corrected editing outcomes)]
    SP --> NP[(nonfunctional plasmids)] --> SR
    NP[(nonfunctional plasmids)] --> NCR[(negative control reads)] --> A2[alignment]
    FP --> A2 --> NCEO[(negative control editing outcomes)] --> C
We designed the naapam workflow to decouple CRISPR/Cas9 editing outcomes from the synthesis errors of reference plasmids. We first give an overview of the workflow and leave the technical details to the sections Discriminate functional and nonfunctional plasmids, Sequence alignment, and Correction observed editing outcomes by negative control.
As shown in the diagram above, we apply a hard classification to the synthesized plasmids to separate functional from nonfunctional plasmids. We assume that only functional plasmids can be edited by the CRISPR/Cas9 system: nonfunctional plasmids are transferred into the cell lines but are not edited. Based on this assumption, we use functional plasmids as references to analyze editing outcomes.
For cell lines expressing Cas9, both nonfunctional plasmids and edited functional plasmids contribute to the apparent editing outcomes in NGS reads. For the negative control (cell lines without Cas9), only nonfunctional plasmids contribute. Plasmid synthesis is error-prone. A naive analysis often attributes these synthesis errors to the CRISPR/Cas9 system, and therefore overestimates the editing efficiency and distorts the overall editing profile. A reasonable assumption is that the abundance of nonfunctional plasmids is similar in cell lines with and without Cas9. We can therefore correct the editing profiles of the cell lines with Cas9 using those of the negative controls.
Discriminate functional and nonfunctional plasmids
block
block:ID
R1B["R1 barcode"]
R1P["R1 primer"]
TSS["G"]
SG["sgRNA"]
SC["scaffold"]
TES["G"]
T["target"]
SEP["CAG"]
B["barcode"]
RCR2P["R2 primer'"]
RCR2B["R2 barcode'"]
end
We parse the components of each read in the control samples. We discriminate functional and nonfunctional plasmids based on the integrity and conservation of:
- primer;
- sgRNA;
- scaffold;
- barcode;
- protospacer;
- PAM;
- transcription start and end sites marked by G.
We also require that the barcode, sgRNA, and protospacer are consistent (that is, they come from the same plasmid design).
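The checks above can be sketched as a hard (all-or-nothing) classifier. This is only an illustration, not naapam's implementation: it assumes each read has already been split into named components, it interprets "integrity and conservation" as exact string equality, and the `Design` record, the `is_functional` function, and the toy sequences are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    """One plasmid design (hypothetical record, not naapam's own type)."""
    barcode: str
    sgrna: str
    protospacer: str
    pam: str


def is_functional(parts, designs, primer, scaffold):
    """Return True only if every parsed component is intact and consistent.

    parts:   dict with keys primer, tss, sgrna, scaffold, tes,
             protospacer, pam, barcode (assumed to be parsed upstream).
    designs: dict mapping barcode -> Design.
    """
    # The barcode selects the design that sgRNA and protospacer must match.
    design = designs.get(parts["barcode"])
    if design is None:
        return False
    return (
        parts["primer"] == primer
        and parts["scaffold"] == scaffold
        and parts["tss"] == "G"  # transcription start site marker
        and parts["tes"] == "G"  # transcription end site marker
        and parts["sgrna"] == design.sgrna
        and parts["protospacer"] == design.protospacer
        and parts["pam"] == design.pam
    )


# Toy data (assumed sequences, for illustration only).
PRIMER, SCAFFOLD = "ACGT", "GTTTTAGA"
designs = {
    "AAAC": Design("AAAC", "GATTACAGATTACAGATTAC", "GATTACAGATTACAGATTAC", "TGG"),
    "CCCA": Design("CCCA", "TTGACCATTGACCATTGACC", "TTGACCATTGACCATTGACC", "AGG"),
}
good = {
    "primer": PRIMER, "tss": "G", "sgrna": "GATTACAGATTACAGATTAC",
    "scaffold": SCAFFOLD, "tes": "G", "protospacer": "GATTACAGATTACAGATTAC",
    "pam": "TGG", "barcode": "AAAC",
}
# Same read, but the barcode points at a different design: inconsistent.
swapped = dict(good, barcode="CCCA")

print(is_functional(good, designs, PRIMER, SCAFFOLD))
print(is_functional(swapped, designs, PRIMER, SCAFFOLD))
```

A read that fails any single check is treated as nonfunctional here; a real pipeline might tolerate a bounded number of mismatches instead.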
Sequence alignment
We use the bioconda package rearr (version 1.0.11) to align the NGS reads to the functional plasmids and determine their editing types. rearr is packaged together with naapam, so you do not need to download it separately. rearr uses an efficient and accurate chimeric alignment engine to call editing outcomes from raw reads. It is especially good at extracting predictable templated insertions resulting from staggered cleavage by the CRISPR/Cas9 system (Precise and Predictable CRISPR Chromosomal Rearrangements Reveal Principles of Cas9-Mediated Nucleotide Insertion). See the documentation for more details about rearr.
Correction observed editing outcomes by negative control
Let $w_0$ be the observed wild-type frequency of a functional plasmid in the control sample, and let $e_0^{(i)}$ be the observed frequency of editing outcome $i$ in the control sample (these outcomes actually come from nonfunctional plasmids). Then $w_0 + \sum_i e_0^{(i)} = 1$. Similarly, let $w$ be the observed wild-type frequency of a functional plasmid in the sample with Cas9, and let $e^{(i)}$ be the observed frequency of editing outcome $i$ in that sample. Then $w + \sum_i e^{(i)} = 1$.

By the assumption that the abundance of nonfunctional plasmids is similar in cell lines with and without Cas9, the expected frequency of functional plasmids (wild type + edited) in the sample with Cas9 is $w_0$. For editing outcome $i$, of its observed frequency $e^{(i)}$ in the cell line with Cas9, we expect $e_0^{(i)}$ to come from nonfunctional plasmids. The corrected frequency of editing outcome $i$ is therefore

$$ \frac{e^{(i)} - e_0^{(i)}}{w_0} $$

if the wild type is included, and

$$ \frac{e^{(i)} - e_0^{(i)}}{\sum_j (e^{(j)} - e_0^{(j)})} $$

if the wild type is excluded.
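As a worked example, the two formulas can be written as a few lines of Python. This is a minimal sketch rather than naapam's own code: the function name `correct_frequencies`, the dictionary-based input, and the outcome labels are assumptions made for illustration, and $w_0$ is derived from the control frequencies via $w_0 = 1 - \sum_i e_0^{(i)}$.

```python
def correct_frequencies(sample, control, include_wild_type=True):
    """Correct observed editing-outcome frequencies by a negative control.

    sample:  dict outcome -> observed frequency e^(i) in the Cas9 sample.
    control: dict outcome -> observed frequency e_0^(i) in the control.
    Wild type is not a key in either dict.
    """
    outcomes = sorted(set(sample) | set(control))
    # Numerator e^(i) - e_0^(i); an outcome absent from a dict has frequency 0.
    diff = {i: sample.get(i, 0.0) - control.get(i, 0.0) for i in outcomes}
    if include_wild_type:
        # Denominator w_0 = 1 - sum_i e_0^(i).
        denom = 1.0 - sum(control.values())
    else:
        # Denominator sum_j (e^(j) - e_0^(j)).
        denom = sum(diff.values())
    if denom <= 0:
        raise ValueError("non-positive denominator; correction is undefined")
    return {i: d / denom for i, d in diff.items()}


# Toy numbers (assumed): 10% of control reads look "edited", so w_0 = 0.9.
control = {"del1": 0.05, "ins1": 0.05}
sample = {"del1": 0.25, "ins1": 0.15}

print(correct_frequencies(sample, control))
print(correct_frequencies(sample, control, include_wild_type=False))
```

With these numbers, `del1` is corrected to 0.2 / 0.9 ≈ 0.222 when the wild type is included and to 0.2 / 0.3 ≈ 0.667 when it is excluded. The sketch does not clip negative differences, which can occur when an outcome is rarer in the Cas9 sample than in the control; how to treat them is left open here.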
Install
$ pip install naapam
Dependencies
- bowtie2
- gawk
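Before running the notebooks, it may help to confirm that both tools are on the `PATH`. The following is a minimal sketch using only the standard library; the helper name `missing_tools` is hypothetical, and the check assumes the executables are installed under the names `bowtie2` and `gawk`.

```python
import shutil


def missing_tools(tools=("bowtie2", "gawk")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```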
Usage
Follow the notebooks in order:
- align.ipynb
- analysis.ipynb
Copy them out of the package with:
from importlib import resources
import shutil
for file in ["align.ipynb", "analyze.ipynb"]:
    shutil.copyfile(src=resources.files("naapam.notebooks") / file, dst=file)
You need to configure the directories and the plasmid file in the first block of each notebook:
- data_dir: contains the raw NGS reads.
- root_dir: root directory for outputs.
- plasmid_file: the design file of plasmids.
- config_dir: directory for config files.
- correct_dir: output directory of the corrected alignment results.
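For orientation, a first notebook block might look like the sketch below. Every path is a placeholder, not a default shipped with naapam; placing `correct_dir` under `root_dir` and using `pathlib.Path` values are assumptions, and `missing_inputs` is a hypothetical helper for catching typos in input paths early.

```python
from pathlib import Path

# Placeholder locations (assumptions); replace them with your own.
data_dir = Path("raw_reads")            # raw NGS reads
root_dir = Path("naapam_output")        # root directory for outputs
plasmid_file = Path("final_hgsgrna_libb_all_0811-NGG.csv")  # plasmid design file
config_dir = Path("filter_configs")     # config files
correct_dir = root_dir / "correct"      # corrected alignment results


def missing_inputs(*paths):
    """Return the given input paths that do not exist on disk."""
    return [path for path in paths if not Path(path).exists()]


# Inputs must exist before alignment; output directories need not.
print(missing_inputs(data_dir, plasmid_file, config_dir))
```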
Copy examples of plasmid files out of the package for reference.
from importlib import resources
import shutil
for file in [
    "final_hgsgrna_libb_all_0811-NGG.csv",
    "final_hgsgrna_libb_all_0811_NAA_scaffold_nbt.csv",
]:
    shutil.copyfile(src=resources.files("naapam.plasmids") / file, dst=file)
Copy an example config directory out of the package for reference.
from importlib import resources
import shutil
shutil.copytree(src=resources.files("naapam.filter_configs"), dst="filter_configs")