
Provides DNA overhang misannealing data (Potapov 2018).




Tatapov is a Python library that makes the DNA overhang misannealing data from the following paper (available on arXiv) accessible and easy to explore:

Optimization of Golden Gate assembly through application of ligation sequence-dependent fidelity and bias profiling. Vladimir Potapov, Jennifer L. Ong, Rebecca B. Kucera, Bradley W. Langhorst, Katharina Bilotti, John M. Pryor, Eric J. Cantor, Barry Canton, Thomas F. Knight, Thomas C. Evans Jr., Gregory Lohman. May 2018.

The Supplementary Material of this paper provides tables of inter-overhang annealing data in four conditions (01h or 18h incubations at 25C or 37C). Tatapov provides these tables as Pandas dataframes (it downloads them automatically from arXiv on first use), so that they are easy to manipulate.

It also provides simple methods to build and plot subsets of the data (plotting requires Matplotlib installed).
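Because the tables are ordinary Pandas dataframes, standard dataframe operations apply to them. The sketch below uses a small made-up table rather than the real data, and assumes a square layout with overhangs as both row and column labels; the overhangs and counts are illustrative only.

```python
import pandas as pd

# Hypothetical toy table standing in for one of the annealing tables.
# Assumed layout: rows and columns are overhangs, values are annealing counts.
overhangs = ["ACGA", "TCGT", "AAAT", "ATTT"]
toy = pd.DataFrame(
    [[0, 90, 0, 2],
     [90, 0, 1, 0],
     [0, 1, 0, 40],
     [2, 0, 40, 0]],
    index=overhangs,
    columns=overhangs,
)

# Preferred partner of each overhang (column with the largest count per row)
best_partner = toy.idxmax(axis=1)

# Share of each row's total count that goes to each partner
row_fractions = toy.div(toy.sum(axis=1), axis=0)

print(best_partner)
print(row_fractions.round(3))
```

In this toy table each overhang's preferred partner is its reverse complement, which is the pattern expected of a well-behaved overhang.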

Usage Example


import tatapov

# Get a subset of the data at 25C (1h incubation)
data = tatapov.annealing_data["25C"]["01h"] # a pandas dataframe
overhangs = ["ACGA", "AAAT", "AGAG"]
subset = tatapov.data_subset(data, overhangs, add_reverse=True)

# Plot the data subset
ax, _ = tatapov.plot_data(subset, figwidth=5)

In the resulting plot, anything other than the square pairs around the diagonal indicates cross-talk between your overhangs (and therefore a risk of misannealing). If one of these diagonal square pairs appears lighter than the others, the corresponding overhang has weak self-annealing (a risk of getting no assembly).
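The same reading can be done numerically. As a rough sketch, assuming a table with overhangs on both axes and taking the expected partner of an overhang to be its reverse complement, the hypothetical helper `off_target_pairs` below lists every entry above a threshold that falls outside the expected pairs. The toy counts are made up and the helper is not part of Tatapov.

```python
import pandas as pd

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.translate(COMPLEMENT)[::-1]


def off_target_pairs(table, threshold=0):
    """List (row, column, value) entries above threshold whose column
    is not the reverse complement of the row overhang."""
    hits = []
    for row in table.index:
        expected = reverse_complement(row)
        for column in table.columns:
            value = table.at[row, column]
            if column != expected and value > threshold:
                hits.append((row, column, int(value)))
    return hits


# Hypothetical toy table (assumed layout: overhangs on both axes)
overhangs = ["ACGA", "TCGT", "AAAT", "ATTT"]
toy = pd.DataFrame(
    [[0, 90, 0, 2],
     [90, 0, 1, 0],
     [0, 1, 0, 40],
     [2, 0, 40, 0]],
    index=overhangs,
    columns=overhangs,
)

hits = off_target_pairs(toy)
print(hits)
```

Raising `threshold` ignores faint cross-talk, which corresponds to disregarding very light squares in the plot.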

Identifying weak self-annealing overhangs

import tatapov

annealing_data = tatapov.annealing_data['37C']['01h']

# Compute a dictionary {overhang: self-annealing score in 0-1}
relative_self_annealing = tatapov.relative_self_annealings(annealing_data)

# Keep the overhangs whose self-annealing score is below 0.4
weak_self_annealing_overhangs = [
    overhang
    for overhang, self_annealing in relative_self_annealing.items()
    if self_annealing < 0.4
]

Identifying overhang pairs with significant cross-talking

import tatapov

annealing_data = tatapov.annealing_data['37C']['01h']

# Compute a dictionary {overhang_pair: cross-talking score in 0-1}
cross_annealings = tatapov.cross_annealings(annealing_data)

# Keep the overhang pairs whose cross-talking score is above 0.08
high_cross_annealing_pairs = [
    overhang_pair
    for overhang_pair, cross_annealing in cross_annealings.items()
    if cross_annealing > 0.08
]
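The self-annealing and cross-talking checks can be combined to screen a planned overhang set. This is a minimal sketch on made-up scores, not real Tatapov output: it assumes the self-annealing dictionary is keyed by overhang string and the cross-talking dictionary by a tuple of two overhang strings, and `screen_overhangs` is a hypothetical helper name. The default thresholds are the ones used in the examples above.

```python
def screen_overhangs(planned, self_scores, cross_scores,
                     min_self=0.4, max_cross=0.08):
    """Return (weak overhangs, cross-talking pairs) restricted to a planned set."""
    planned = set(planned)
    # Overhangs of the planned set with a low self-annealing score
    weak = sorted(o for o in planned if self_scores[o] < min_self)
    # Pairs with a high cross-talking score whose two members are both planned
    risky = sorted(
        pair
        for pair, score in cross_scores.items()
        if score > max_cross and set(pair) <= planned
    )
    return weak, risky


# Made-up scores for illustration only
self_scores = {"ACGA": 0.95, "AAAT": 0.31, "AGAG": 0.72}
cross_scores = {
    ("ACGA", "AAAT"): 0.01,
    ("ACGA", "AGAG"): 0.12,
    ("AAAT", "AGAG"): 0.02,
    ("ACGA", "TTTT"): 0.50,  # ignored below: TTTT is not in the planned set
}

weak, risky = screen_overhangs(["ACGA", "AAAT", "AGAG"], self_scores, cross_scores)
print(weak)
print(risky)
```

Restricting the pairs to the planned set keeps the report focused on the overhangs actually used in an assembly.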


Installation

You can install Tatapov through pip:

sudo pip install tatapov

Alternatively, you can unzip the sources in a folder and type:

sudo python setup.py install

License = MIT

Tatapov is open-source software originally written at the Edinburgh Genome Foundry by Zulko and released on GitHub under the MIT licence (© Edinburgh Genome Foundry). Everyone is welcome to contribute!

Please contact us if there is any issue regarding copyright (there shouldn’t be as the repository does not contain any data, and the paper data is free to download).

More biology software

Tatapov is part of the EGF Codons synthetic biology software suite for DNA design, manufacturing and validation.
