Python package to analyze the results of base editor screens

poola_be

Python package for base editor screens

Install

pip install poola_be

How to use

To demonstrate these functions, we first load a base editor tiling library whose guides tile the BRCA2 transcript ENST00000380152. Each guide is annotated with the edits predicted for a C>T base editor with an editing window spanning nucleotides 4-8.

from poola_be import core as pool_be
import pandas as pd

design_df = pd.read_csv('sample_input/crisprbe-guides.txt', sep='\t')
design_df.head()

| | Input | CRISPR Enzyme | Edit Type | Edit Window | Target Assembly | Target Genome Sequence | Target Gene ID | Target Gene Symbol | Target Gene Strand | Target Transcript ID | ... | PAM Sequence | sgRNA Target Sequence Start Pos. (global) | sgRNA Orientation | Nucleotide Edits (global) | Guide Edits | Nucleotide Edits | Amino Acid Edits | Mutation Category | Constraint Violations | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ENST00000380152 | SpyoCas9 | C-T | 4..8 | GRCh38 (9606) | NC_000013.11 | ENSG00000139618 | BRCA2 | + | ENST00000380152.8 | ... | TGG | 32316449 | sense | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | ENST00000380152 | SpyoCas9 | C-T | 4..8 | GRCh38 (9606) | NC_000013.11 | ENSG00000139618 | BRCA2 | + | ENST00000380152.8 | ... | AGG | 32316462 | sense | 32316465C>T | C_4 | 5C>T | Pro2Leu | Missense | NaN | NaN |
| 2 | ENST00000380152 | SpyoCas9 | C-T | 4..8 | GRCh38 (9606) | NC_000013.11 | ENSG00000139618 | BRCA2 | + | ENST00000380152.8 | ... | AGG | 32316467 | antisense | 32316479G>A;32316481G>A, 32316483G>A | C_8_6, C_4 | 19G>A;21G>A, 23G>A | Glu7Lys, Arg8Lys | Missense, Missense | NaN | NaN |
| 3 | ENST00000380152 | SpyoCas9 | C-T | 4..8 | GRCh38 (9606) | NC_000013.11 | ENSG00000139618 | BRCA2 | + | ENST00000380152.8 | ... | TGG | 32316477 | antisense | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | ENST00000380152 | SpyoCas9 | C-T | 4..8 | GRCh38 (9606) | NC_000013.11 | ENSG00000139618 | BRCA2 | + | ENST00000380152.8 | ... | TGG | 32316488 | antisense | NaN | NaN | NaN | NaN | NaN | NaN | NaN |

5 rows × 23 columns
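
The "Edit Window" and "Guide Edits" columns can be related with a small sketch. The helper below is hypothetical and not part of poola_be; it assumes guide positions are counted from 1 at the 5' end of the sgRNA target sequence and that the window bounds are inclusive, which is consistent with the `C_4` and `C_8_6, C_4` entries shown for guides 1 and 2. The guide sequences are taken from the "sgRNA Target Sequence" column of this library.

```python
def cytosines_in_window(guide, window=(4, 8), target_base="C"):
    """Return the 1-indexed positions of target_base inside the editing window.

    Hypothetical helper for illustration only; assumes 1-based positions
    counted from the 5' end of the guide and an inclusive window.
    """
    start, end = window
    return [
        pos
        for pos, base in enumerate(guide.upper(), start=1)
        if start <= pos <= end and base == target_base
    ]


# First three guides of the BRCA2 tiling library
for guide in (
    "TCGTAGGTAAAAATGCCTAT",
    "TGCCTATTGGATCCAAAGAG",
    "GGCCTCTCTTTGGATCCAAT",
):
    print(guide, cytosines_in_window(guide))
```

A guide with no C in the window, such as guide 0, has no predicted edits and therefore `NaN` in the edit columns.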

Assign severe mutation bin

As noted in the "Mutation Category" column, each guide is predicted to make one or more types of mutations when Cs are present in the editing window. We can then annotate each guide with its most severe mutation bin, ranked in the order Nonsense > Splice site > Missense > Intron > Silent > UTR > No edits.

design_df['Mutation Bin'] = design_df['Mutation Category'].apply(pool_be.get_most_severe_mutation_type)
design_df[['sgRNA Target Sequence','Mutation Category','Mutation Bin']].head()

| | sgRNA Target Sequence | Mutation Category | Mutation Bin |
|---|---|---|---|
| 0 | TCGTAGGTAAAAATGCCTAT | NaN | No edits |
| 1 | TGCCTATTGGATCCAAAGAG | Missense | Missense |
| 2 | GGCCTCTCTTTGGATCCAAT | Missense, Missense | Missense |
| 3 | AAAAAATGTTGGCCTCTCTT | NaN | No edits |
| 4 | TTAAAAATTTCAAAAAATGT | NaN | No edits |
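
The severity ranking can be illustrated with a minimal sketch. This is not the implementation of `get_most_severe_mutation_type`; the function name `most_severe_bin`, the exact label spellings in `SEVERITY_ORDER`, and the handling of `,` and `;` separators are assumptions made for illustration.

```python
import math
import re

# Most to least severe, following the ordering described above.
# The exact label spellings are an assumption.
SEVERITY_ORDER = ["Nonsense", "Splice site", "Missense", "Intron", "Silent", "UTR"]


def most_severe_bin(mutation_category):
    """Return the most severe label found in a 'Mutation Category' value.

    Hypothetical helper: missing values (None or NaN) and values with no
    recognized label map to 'No edits'.
    """
    if mutation_category is None:
        return "No edits"
    if isinstance(mutation_category, float) and math.isnan(mutation_category):
        return "No edits"
    # Split on commas and semicolons, dropping empty fragments
    labels = {part.strip() for part in re.split(r"[;,]", mutation_category)}
    for label in SEVERITY_ORDER:
        if label in labels:
            return label
    return "No edits"


print(most_severe_bin("Missense, Missense"))
print(most_severe_bin("Silent, Nonsense"))
print(most_severe_bin(float("nan")))
```

Walking the ordered list and returning the first match keeps the ranking in one place, so changing the priority only requires reordering `SEVERITY_ORDER`.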

Calculate median residue

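We can then get the median residue position of each guide's predicted amino acid edits. For example, a guide predicted to produce Glu7Lys and Arg8Lys is assigned 7.5, and guides with no predicted edits are assigned NaN.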

design_df['Median Residue'] = design_df.apply(lambda x: pool_be.get_median_residues(x['Mutation Bin'], x['Amino Acid Edits']), axis=1)
design_df[['sgRNA Target Sequence','Amino Acid Edits','Mutation Category','Mutation Bin','Median Residue']].head(15)

| | sgRNA Target Sequence | Amino Acid Edits | Mutation Category | Mutation Bin | Median Residue |
|---|---|---|---|---|---|
| 0 | TCGTAGGTAAAAATGCCTAT | NaN | NaN | No edits | NaN |
| 1 | TGCCTATTGGATCCAAAGAG | Pro2Leu | Missense | Missense | 2.0 |
| 2 | GGCCTCTCTTTGGATCCAAT | Glu7Lys, Arg8Lys | Missense, Missense | Missense | 7.5 |
| 3 | AAAAAATGTTGGCCTCTCTT | NaN | NaN | No edits | NaN |
| 4 | TTAAAAATTTCAAAAAATGT | NaN | NaN | No edits | NaN |
| 5 | AAGACACGCTGCAACAAAGC | Thr17Ile, Arg18Cys | Missense, Missense | Missense | 17.5 |
| 6 | TTTTTTTTTTAAATAGATTT | NaN | NaN | No edits | NaN |
| 7 | TAGGACCAATAAGTCTTAAT | Pro26Leu | Missense | Missense | 26.0 |
| 8 | TCAAACCAATTAAGACTTAT | Trp31Ter | Nonsense | Nonsense | 31.0 |
| 9 | GCAGGTTCAGAATTATAGGG | Glu45Lys | Missense | Missense | 45.0 |
| 10 | TCTGCAGGTTCAGAATTATA | Ala47Thr | Missense | Missense | 47.0 |
| 11 | TTCTGCAGGTTCAGAATTAT | Ala47Thr | Missense | Missense | 47.0 |
| 12 | TTATGTTCAGATTCTTCTGC | Glu51Lys | Missense | Missense | 51.0 |
| 13 | TGTGGAGTTTTAAATAGGTT | NaN | NaN | No edits | NaN |
| 14 | ACCTATTTAAAACTCCACAA | NaN | NaN | No edits | NaN |
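
The values in the "Median Residue" column can be reproduced by hand with a short sketch. This is not the implementation of `get_median_residues`: the helper `median_residue` is hypothetical, it assumes edits are written as three-letter codes around a residue number (as in `Glu7Lys`), and unlike the package function it ignores the mutation bin, so results may differ for guides that mix mutation categories.

```python
import math
import re
from statistics import median


def median_residue(aa_edits):
    """Return the median residue position in an 'Amino Acid Edits' value.

    Hypothetical helper: non-string input (e.g. NaN) or a string with no
    recognizable edit yields NaN.
    """
    if not isinstance(aa_edits, str):
        return math.nan
    # Capture the residue number between two three-letter codes, e.g. Glu7Lys -> 7
    positions = [int(p) for p in re.findall(r"[A-Za-z]{3}(\d+)[A-Za-z]{3}", aa_edits)]
    return float(median(positions)) if positions else math.nan


print(median_residue("Pro2Leu"))
print(median_residue("Glu7Lys, Arg8Lys"))
print(median_residue(float("nan")))
```

With an even number of edits, `statistics.median` averages the two middle positions, which is why guides 2 and 5 receive half-integer values.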
