Predict the minimum free energy structure of nucleic acids
pip install seqfold
from seqfold import calc_dg

# a bifurcated DNA structure
calc_dg("GGGAGGTCGTTACATCTGGGTAACACCGGTACTGATCCGGTGACCTCCC")  # -12.94
$ seqfold TAGCTCAGCTGGGAGAGCGCCTGCTTTGCACGCAGGAGGT -t 32
-6.58
While UNAFold and mfold are the most widely used applications for nucleic acid secondary structure prediction, their format and license are restrictive.
seqfold is meant to be a more open, but minimal, application for predicting minimum free energy secondary structures.
|Feature||seqfold||mfold||UNAFold|
|OS||Linux, MacOS, Windows||Linux, MacOS||Linux, MacOS, Windows|
|Format||python, CLI python||CLI binary||CLI binary|
|Dependencies||none||(mfold_util)||Perl, (gnuplot, glut/OpenGL)|
|Graphical||no||yes (output)||yes (output)|
The papers and other sources used to develop this library are listed below. Each is accompanied by a note on how it relates to this library.
Nussinov, Ruth, and Ann B. Jacobson. "Fast algorithm for predicting the secondary structure of single-stranded RNA." Proceedings of the National Academy of Sciences 77.11 (1980): 6309-6313.
Framework for the dynamic programming approach. It has a conceptually helpful "Maximal Matching" example that demonstrates the approach on a simple sequence with only matched or unmatched bp.
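The maximal matching idea can be sketched in a few lines of Python. This is a minimal illustration of the dynamic programming recurrence, not seqfold's implementation: the function name `max_pairs`, the `min_loop` parameter, and the set of allowed pairs are all assumptions made for the example. It only counts matched bases and ignores energies entirely.

```python
def max_pairs(seq: str, min_loop: int = 0) -> int:
    """Return the maximum number of nested (non-crossing) base pairs in seq.

    Hypothetical helper for illustration. min_loop is the minimum number
    of unpaired bases required between the two bases of a pair.
    """
    # Watson-Crick pairs only; both U and T are accepted so the sketch
    # works on RNA and DNA strings alike.
    pairs = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"),
             ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0

    # best[i][j] holds the best pair count for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]

    # Fill the table from short subsequences to long ones.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: base j is left unpaired.
            score = best[i][j - 1]
            # Case 2: base j pairs with some base k, splitting the
            # problem into the part left of k and the part inside (k, j).
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    score = max(score, left + inner + 1)
            best[i][j] = score

    return best[0][n - 1]


print(max_pairs("GGGAAACCC"))
```

The triple loop makes the cubic cost of this style of algorithm visible: two indices for the subsequence and one for the split point.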
Zuker, Michael, and Patrick Stiegler. "Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information." Nucleic acids research 9.1 (1981): 133-148.
The most cited paper in this space. Goes further than Nussinov, 1980 with a nearest neighbor approach to energies and consideration of stacks, bulges, internal loops, and hairpins. Their data structure and traceback method are both more intuitive than
Jaeger, John A., Douglas H. Turner, and Michael Zuker. "Improved predictions of secondary structures for RNA." Proceedings of the National Academy of Sciences 86.20 (1989): 7706-7710.
Zuker and colleagues expand on the 1981 paper to incorporate penalties for multibranched loops and dangling ends.
SantaLucia Jr, John, and Donald Hicks. "The thermodynamics of DNA structural motifs." Annu. Rev. Biophys. Biomol. Struct. 33 (2004): 415-440.
The paper from which almost every DNA energy function in seqfold comes (with the exception of multibranch loops). Provides neighbor entropies and enthalpies for stacks, mismatching stacks, terminal stacks, and dangling stacks. Ditto for bulges, internal loops, and hairpins.
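Enthalpy and entropy parameters like these are combined into a free energy with the standard relation ΔG = ΔH - TΔS. The sketch below shows that arithmetic only. The function name `gibbs_free_energy` is hypothetical, the unit convention (kcal/mol for ΔH, cal/(mol·K) for ΔS) is an assumption, and the numbers passed in are illustrative inputs rather than values taken from seqfold's data.

```python
def gibbs_free_energy(dh_kcal: float, ds_cal: float, temp_c: float = 37.0) -> float:
    """Return delta G in kcal/mol from delta H and delta S.

    Hypothetical helper for illustration.
    dh_kcal: enthalpy change in kcal/mol (assumed unit)
    ds_cal:  entropy change in cal/(mol*K) (assumed unit)
    temp_c:  temperature in degrees Celsius
    """
    temp_k = temp_c + 273.15  # convert Celsius to Kelvin
    # Divide by 1000 so the entropy term is in kcal/(mol*K) like delta H.
    return dh_kcal - temp_k * (ds_cal / 1000.0)


# Illustrative inputs only: a stack with delta H = -7.6 and delta S = -21.3.
print(round(gibbs_free_energy(-7.6, -21.3), 2))
print(round(gibbs_free_energy(-7.6, -21.3, temp_c=32.0), 2))
```

Because the entropy term scales with temperature, the same parameters give a different ΔG at 32 °C than at 37 °C, so a temperature input changes the predicted energy.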
Turner, Douglas H., and David H. Mathews. "NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure." Nucleic acids research 38.suppl_1 (2009): D280-D282.
Source of RNA nearest neighbor change in entropy and enthalpy parameter data. In
Ward, M., Datta, A., Wise, M., & Mathews, D. H. (2017). Advanced multi-loop algorithms for RNA secondary structure prediction reveal that the simplest model is best. Nucleic acids research, 45(14), 8541-8550.
An investigation of energy functions for multibranch loops that validates the simple linear approach employed by Jaeger, 1989 that keeps runtime at
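A linear multibranch loop penalty is commonly written as a constant plus a term per branch plus a term per unpaired base. The sketch below assumes that form. The function name `multibranch_penalty` and the coefficients `a`, `b`, and `c` are placeholders for illustration, not parameters taken from seqfold or from the papers above.

```python
def multibranch_penalty(branches: int, unpaired: int,
                        a: float, b: float, c: float) -> float:
    """Return a linear multibranch loop penalty.

    Hypothetical helper for illustration.
    a: fixed cost for opening a multibranch loop (placeholder)
    b: cost per branching helix (placeholder)
    c: cost per unpaired base in the loop (placeholder)
    """
    return a + b * branches + c * unpaired


# Placeholder coefficients chosen only to make the arithmetic easy to follow.
print(multibranch_penalty(branches=3, unpaired=4, a=1.0, b=0.5, c=0.25))
```

Because the penalty is a sum of independent per-branch and per-base terms, it can be accumulated piece by piece inside the dynamic programming recurrence instead of requiring the whole loop to be known in advance.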