
Optimizing Codon Usage with a Quasispecies Model

Project description

Summary

We provide a library for fitting codon usage to a selected set of reference genes. A variable number of fitness factors can be supplied as input, for example the translation speed of codons or tRNA abundance. Given these factors, the library reports the strength of each fitness factor that yields the best resemblance between simulated and reference codon usage. In a next step, the strengths can be tuned and a codon usage generated, which can then be used to adapt a gene sequence with classic codon optimization tools such as OPTIMIZER.

Example

In an example workflow, you first select a FASTA file that contains the genes you want to use. The file can be loaded from a local path or from a URL. In both cases, a histogram of codon usage and amino acid usage is generated.
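The counting step behind such a histogram can be sketched in plain Python. This is only an illustration and not the cobilib API: the helper names `read_fasta` and `count_codons` and the two toy genes are assumptions made for this example.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a dict {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def count_codons(sequence):
    """Count in-frame codons; a trailing partial codon is ignored."""
    usable = len(sequence) - len(sequence) % 3
    return Counter(sequence[i:i + 3] for i in range(0, usable, 3))


# Hypothetical toy input; real input would come from a file or a URL.
example = ">gene1\nATGGCTGCTAAA\nTAA\n>gene2\nATGGCCTAA\n"
genes = read_fasta(example)
codon_counts = {name: count_codons(seq) for name, seq in genes.items()}
```

The resulting `Counter` objects hold the absolute codon counts per gene, which is the data a codon usage histogram would display.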

You can then (optionally) load a list of highly expressed genes; the format of the HEG database is supported. To check, for example, whether the codon usage bias (CUB) is as you expect, you can visualize it by plotting the results of various dimensionality reduction methods.

If you do not want to use all genes, you can enter a number n. Only the first n genes will then be analysed.

You now have to select a fitness matrix, which gives the probability of one amino acid being represented by another.

Additionally, you can select a number of fitness functions, each of which assigns a fitness to every codon. These functions will be normalized! If you want to perform a test run, you have to enter the parameters alpha, beta, selection and t_i for every test function. alpha and beta are parameters of the <todo> model of codon substitution and are related to transition/transversion bias. Input values may be separated by commas, whitespace/tabs, or a combination of those.
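The accepted input format and the normalization can be illustrated with a small sketch. It is not taken from cobilib: `parse_numbers` and `normalize` are hypothetical helpers, and dividing by the sum is only one plausible reading of "normalized", since the description does not say which normalization is applied.

```python
import re


def parse_numbers(text):
    """Split on commas, whitespace/tabs, or any mix of them, and convert to floats."""
    tokens = [tok for tok in re.split(r"[,\s]+", text.strip()) if tok]
    return [float(tok) for tok in tokens]


def normalize(fitness):
    """Scale codon fitness values so that they sum to 1 (assumed normalization)."""
    total = sum(fitness.values())
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")
    return {codon: value / total for codon, value in fitness.items()}


# Mixed separators, as the description allows.
params = parse_numbers("0.5, 1.0\t2 3,4")

# Hypothetical fitness values for three alanine codons.
scaled = normalize({"GCT": 2.0, "GCC": 1.0, "GCA": 1.0})
```

How the parsed numbers map onto alpha, beta, selection and t_i is left open here, because the description does not specify their order per test function beyond the list itself.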

You can compare the absolute codon usage and the relative codon usage (normalized for each amino acid) in a comparison plot. As a first check of the distance optimization, you can optimize the first gene only and inspect the comparison again to see whether the algorithm works at all.
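The difference between absolute and relative usage, and one possible distance between two usages, can be sketched as follows. The sketch assumes the standard genetic code and uses a Euclidean distance purely for illustration; the description does not state which distance the library optimizes, and `relative_usage` and `euclidean_distance` are hypothetical names.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def relative_usage(counts):
    """Divide each codon count by the total count of its amino acid."""
    totals = defaultdict(int)
    for codon, n in counts.items():
        if codon in CODON_TABLE:
            totals[CODON_TABLE[codon]] += n
    return {
        codon: n / totals[CODON_TABLE[codon]]
        for codon, n in counts.items()
        if codon in CODON_TABLE and n > 0
    }


def euclidean_distance(usage_a, usage_b):
    """Euclidean distance over all 64 codons; missing codons count as 0."""
    return math.sqrt(
        sum((usage_a.get(c, 0.0) - usage_b.get(c, 0.0)) ** 2 for c in CODON_TABLE)
    )


absolute = Counter({"GCT": 3, "GCC": 1, "AAA": 2})  # hypothetical counts
relative = relative_usage(absolute)
```

In this toy example GCT accounts for three of the four alanine codons, so its relative usage is 0.75 even though its absolute count depends on how often alanine occurs at all.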

In a last step, you can optimize all genes you have read in. The result consists of the optimal parameters, a goodness of fit, and the relative synonymous codon usage (RSCU), which you can use for optimization with, e.g., OPTIMIZER.
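For orientation, RSCU in its commonly used definition is the observed count of a codon divided by the count expected if all synonymous codons of its amino acid were used equally. The sketch below follows that definition with the standard genetic code; whether cobilib computes its RSCU in exactly this way is an assumption, and the function name `rscu` is hypothetical.

```python
from collections import defaultdict

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(counts):
    """Observed codon count divided by the count expected under equal synonymous usage."""
    synonymous = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        synonymous[amino_acid].append(codon)

    values = {}
    for amino_acid, codons in synonymous.items():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU is undefined for its codons
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


values = rscu({"GCT": 3, "GCC": 1, "AAA": 2})  # hypothetical counts
```

A value above 1 means the codon is used more often than expected under equal usage, a value below 1 means it is used less often.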

Authors and License

License: GPLv3

Authors: Jan-Hendrik Trösemeier, Susanne Lipp, Christel Kamp

Contact: name.lastname at pei.de

Download files

cobilib-1.0.0.tar.gz (74.1 kB), source distribution
