Representing double stranded DNA and functions for simulating cloning and homologous recombination between DNA molecules.
pydna
Planning genetic constructs with many parts and assembly steps, such as recombinant metabolic pathways :petri_dish:, is often difficult to document properly, as is evident from the state of such documentation in the scientific literature :radioactive:.
The pydna Python package provides human-readable, formal descriptions of :dna: cloning and genetic assembly strategies in Python :snake:, which allow for simulation and verification.
Pydna can perhaps be thought of as executable documentation for cloning.
A cloning strategy expressed in pydna is complete, unambiguous and stable.
Pydna provides simulation of:
- Restriction digestion
- Ligation
- PCR
- Primer design
- Gibson assembly
- Golden gate assembly
- Homologous recombination
- Gel electrophoresis of DNA with generation of gel images
Virtually any sub-cloning experiment can be described in pydna, and its execution yields the sequences of intermediate and final DNA molecules.
Pydna has been designed to be understandable for biologists with only some basic understanding of Python.
Pydna can formalize planning and sharing of cloning strategies and is especially useful for complex or combinatorial DNA molecule constructions.
To get started, we have compiled some simple examples. For more elaborate use, look at the assembly strategies for D-xylose metabolic pathways in MetabolicEngineeringGroupCBMA/ypk-xylose-pathways.
There is an open access paper in BMC Bioinformatics describing pydna. Please reference it if you use pydna in a scientific publication:
Pereira, F., Azevedo, F., Carvalho, Â., Ribeiro, G. F., Budde, M. W., & Johansson, B. (2015). Pydna: a simulation and documentation tool for DNA assembly strategies using python. BMC Bioinformatics, 16, 142.
Usage
Most pydna functionality is implemented as methods for the double stranded DNA sequence record classes Dseq and Dseqrecord, which are subclasses of the Biopython Seq and SeqRecord classes.
These classes make cut and paste cloning and PCR very simple:
>>> from pydna.dseq import Dseq
>>> seq = Dseq("GGATCCAAA","TTTGGATCC",ovhg=0)
>>> seq
Dseq(-9)
GGATCCAAA
CCTAGGTTT
>>> from Bio.Restriction import BamHI
>>> a,b = seq.cut(BamHI)
>>> a
Dseq(-5)
G
CCTAG
>>> b
Dseq(-8)
GATCCAAA
    GTTT
>>> a+b
Dseq(-9)
GGATCCAAA
CCTAGGTTT
>>> b+a
Dseq(-13)
GATCCAAAG
    GTTTCCTAG
>>> b+a+b
Dseq(-17)
GATCCAAAGGATCCAAA
    GTTTCCTAGGTTT
>>> b+a+a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pydna/dsdna.py", line 217, in __add__
    raise TypeError("sticky ends not compatible!")
TypeError: sticky ends not compatible!
>>>
As the example above shows, pydna keeps track of sticky ends.
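The compatibility rule behind that error can be sketched in plain Python. This is a simplified illustration, not pydna's implementation: the `End` type and the `compatible` function are hypothetical helpers, and the sketch ignores phosphorylation and other details of real ligation. Two ends are treated as compatible when they have the same overhang type and the overhangs are reverse complements of each other.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class End:
    """One end of a double-stranded fragment (hypothetical helper type)."""

    kind: str  # "5'", "3'" or "blunt"
    overhang: str = ""  # single-stranded part read 5'->3'; empty if blunt


def compatible(right_end_of_first: End, left_end_of_second: End) -> bool:
    """True if the two ends have matching type and complementary overhangs."""
    if right_end_of_first.kind != left_end_of_second.kind:
        return False
    a = right_end_of_first.overhang.upper()
    b = left_end_of_second.overhang.upper()
    return a == reverse_complement(b)


# Ends of the two BamHI fragments a and b from the session above.
a_left, a_right = End("blunt"), End("5'", "GATC")
b_left, b_right = End("5'", "GATC"), End("blunt")

print(compatible(a_right, b_left))  # a+b     -> True
print(compatible(b_right, a_left))  # b+a     -> True
print(compatible(a_right, a_left))  # (b+a)+a -> False
```

The last call mirrors the failing `b+a+a` expression: the right end of `b+a` carries a 5' GATC overhang, while the left end of `a` is blunt.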
Notably, homologous recombination and Gibson assembly between linear DNA fragments can be easily simulated without any additional information besides the primary sequence of the fragments.
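To illustrate why the primary sequence is enough, here is a minimal sketch that joins two linear fragments through a shared terminal sequence. It is not how pydna does it: the function names and the `min_len` parameter are assumptions made for this example, only one strand is considered, and only an overlap between the end of the first fragment and the start of the second is detected.

```python
def terminal_overlap(left: str, right: str, min_len: int = 15) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    `min_len` must be 1 or larger.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_len - 1, -1):
        if left[-size:].upper() == right[:size].upper():
            return size
    return 0


def join_by_overlap(left: str, right: str, min_len: int = 15) -> str:
    """Fuse two fragments, keeping a single copy of the shared sequence."""
    size = terminal_overlap(left, right, min_len)
    if size == 0:
        raise ValueError("no terminal overlap of at least %d bp" % min_len)
    return left + right[size:]


# Toy sequences invented for this example.
homology = "GGATCCGATTACAGGCTA"
left = "ATGCGTACGTTAGC" + homology
right = homology + "TTTGACCATGA"

print(terminal_overlap(left, right))  # 18
print(join_by_overlap(left, right))
```

A real assembly simulation also has to consider both strands, internal overlaps and several fragments at once, which this sketch leaves out.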
Gel electrophoresis of DNA fragments can be simulated using the included gel module:
Jupyter QtConsole 4.7.7
Python 3.8.5 | packaged by conda-forge | (default, Aug 29 2020, 01:22:49)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from pydna.gel import gel
In [2]: from pydna.ladders import PennStateLadder
In [3]: from pydna.dseqrecord import Dseqrecord
In [4]: gel([PennStateLadder,[Dseqrecord("A"*2000)]])
Out[4]: [image of the simulated gel]
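For intuition about what a gel simulation has to compute, the sketch below places a fragment relative to ladder bands by interpolating linearly in log10 of the fragment size, a common rule of thumb for agarose gels. This is an assumption-laden illustration and not the model used by pydna.gel; the function name is hypothetical and the ladder sizes and distances are invented numbers, not measurements.

```python
import math


def migration_distance(size_bp, ladder):
    """Interpolate a migration distance for a fragment of `size_bp`.

    `ladder` is a list of (size_bp, distance_mm) pairs with distinct sizes.
    Interpolation is linear in log10(size) between neighbouring bands.
    """
    points = sorted((math.log10(size), dist) for size, dist in ladder)
    x = math.log10(size_bp)
    if not points[0][0] <= x <= points[-1][0]:
        raise ValueError("fragment size outside the ladder range")
    for (x0, d0), (x1, d1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return d0 + (d1 - d0) * (x - x0) / (x1 - x0)


# Invented calibration values, for illustration only.
toy_ladder = [(100, 60.0), (1000, 40.0), (10000, 20.0)]

print(migration_distance(2000, toy_ladder))
```

With this toy ladder, a 2000 bp fragment lands between the 1000 bp and 10000 bp bands, closer to the former.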
Pydna can be very compact. The eleven lines of Python below simulate the construction of a recombinant plasmid. DNA sequences are downloaded from Genbank by accession numbers that are guaranteed to be stable over time.
from pydna.genbank import Genbank
gb = Genbank("myself@email.com") # Tell Genbank who you are!
gene = gb.nucleotide("X06997") # Kluyveromyces lactis LAC12 gene for lactose permease.
from pydna.parsers import parse_primers
primer_f,primer_r = parse_primers(''' >759_KlLAC12_fw (19-mer)
aaatggcagatcattcgag
>760_KlLAC12_rv (20-mer)
ttaaacagattctgcctctg ''')
from pydna.amplify import pcr
pcr_prod = pcr(primer_f,primer_r, gene)
vector = gb.nucleotide("AJ001614") # pCAPs cloning vector
from Bio.Restriction import EcoRV
lin_vector = vector.linearize(EcoRV)
rec_vec = ( lin_vector + pcr_prod ).looped()
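The PCR step in the script above can be pictured with a small stand-alone sketch. It is a deliberately naive model, not pydna's algorithm: `simple_pcr` is a hypothetical function that requires both primers to match the template perfectly over their full length, and the template is a short invented sequence rather than the Genbank record X06997. Only the two primer sequences are taken from the example.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def simple_pcr(forward: str, reverse: str, template: str) -> str:
    """Return the product for perfectly matching primers on a linear template.

    The forward primer must occur in the given strand; the reverse primer
    must anneal downstream of it, so its reverse complement is searched for.
    """
    top = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse).upper()
    start = top.find(fwd)
    if start == -1:
        raise ValueError("forward primer does not anneal")
    site = top.find(rev_site, start + len(fwd))
    if site == -1:
        raise ValueError("reverse primer does not anneal downstream")
    return top[start:site + len(rev_site)]


# Toy template: flanks and insert are invented for this example.
insert = "GCTAGCTAGGATCGATCGTA"
template = "CCGG" + "AAATGGCAGATCATTCGAG" + insert + "CAGAGGCAGAATCTGTTTAA" + "TTAA"

product = simple_pcr("aaatggcagatcattcgag", "ttaaacagattctgcctctg", template)
print(len(product))  # 59
```

The product starts with the forward primer sequence and ends with the reverse complement of the reverse primer, which is why the flanking CCGG and TTAA bases are absent from it.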
Pydna can automate the simulation of sub-cloning experiments using Python. This is helpful for generating examples for teaching purposes.
Read the documentation (below) or the cookbook with example files for further information.
Please post a message in the google group for pydna if you need help or have problems, questions or comments :sos:.
Feedback & suggestions are very welcome!
Who is using pydna?
Taylor, L. J., & Strebel, K. (2017). Pyviko: an automated Python tool to design gene knockouts in complex viruses with overlapping genes. BMC Microbiology, 17(1), 12.
Wang, Y., Xue, H., Pourcel, C., Du, Y., & Gautheret, D. (2021). 2-kupl: mapping-free variant detection from DNA-seq data of matched samples. In Cold Spring Harbor Laboratory (p. 2021.01.17.427048).
An Automated Protein Synthesis Pipeline with Transcriptic and Snakemake
and other projects on github
Documentation
Documentation is built using Sphinx from docstrings in the code and displayed at Read the Docs.
The numpy docstring format is used.
Installation using pip
Pip is included in recent Python versions and is the officially recommended tool.
Pip installs the minimal installation requirements automatically, but not the optional requirements (see below).
pip install pydna
or use the --pre switch to also consider pre-release versions and get the latest version of pydna:
pip install pydna --pre
Windows:
You should be able to pip install pydna from the Windows terminal, as Biopython can now be installed with pip as well.
C:\> pip install pydna
By default python and pip are not on the PATH. You can re-install Python and select this option during installation, or give the full path for pip. Try something like this, depending on where your copy of Python is installed:
C:\Python37\Scripts\pip install pydna
Installing requirements
If you want to install requirements before installing pydna, you can do:
pip install -r requirements.txt
And for the optional requirements:
pip install -r requirements_optional.txt
For testing:
pip install -r requirements_test.txt
or
conda install --file requirements.txt
Source Code
Pydna is developed on GitHub :octocat:.
Minimal installation dependencies
Pydna versions before 1.0.0 were compatible with Python 2.7 only. The list below gives the minimal requirements for installing pydna. Biopython has C extensions, but the other modules are pure Python.
- Python 3.8, 3.9, 3.10 or 3.11
- appdirs >= 1.3.0
- biopython >= 1.80
- networkx >= 1.8.1
- prettytable >= 0.7.2
Optional dependencies
If the modules listed below in the first column are installed, they will provide the functionality listed in the second column.
Dependency | Function in pydna |
---|---|
pyparsing | fix corrupt Genbank files with pydna.genbankfixer |
requests | download sequences with pydna.download |
CAI | codon adaptation index calculations in several modules |
numpy | gel simulation with pydna.gel |
scipy | gel simulation with pydna.gel |
matplotlib | gel simulation with pydna.gel |
pillow | gel simulation with pydna.gel |
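To see which of these optional packages are present in the current environment, a short standard-library check along the following lines may help. It is a sketch and not part of pydna: the function name is hypothetical, and the import names in the mapping are assumptions (pillow is imported as `PIL`; the CAI package is assumed to be importable as `CAI`).

```python
from importlib.util import find_spec

# Distribution name -> top-level import name (import names are assumptions).
OPTIONAL_DEPENDENCIES = {
    "pyparsing": "pyparsing",
    "requests": "requests",
    "CAI": "CAI",
    "numpy": "numpy",
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "pillow": "PIL",
}


def available_optional_dependencies():
    """Map each optional dependency to True if its module can be found."""
    return {
        name: find_spec(module) is not None
        for name, module in OPTIONAL_DEPENDENCIES.items()
    }


for name, present in available_optional_dependencies().items():
    print(f"{name:12s} {'installed' if present else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays cheap even for large packages such as scipy.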
Requirements for running tests and analyzing code coverage
Changelog
See the change log for recent changes.
Automatic testing & Release process
There are three GitHub Actions workflows associated with this package.
The pydna_test_and_coverage_workflow.yml workflow is triggered on all pushed non-tagged commits.
It runs tests, doctests and a series of Jupyter notebooks using pytest.
The two other workflows build a setuptools wheel and packages for different Python versions on Linux, Windows and macOS.
These are triggered by publishing a GitHub release manually from the GitHub interface.
Building a PyPI package
poetry build # run this command in the root directory where the pyproject.toml file is located
History
Pydna was made public in 2012 on Google Code.
:microbe:
:portugal: