Routines for the extraction of degenerate sites and the estimation of numbers of neutral substitutions from sequences and alignments.
Project description
codon-degeneracy
This Python package provides routines for extracting degenerate sites from sequences and alignments. Extraction from alignments is particularly useful for estimating rates of neutral evolution.
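To make the term concrete: a site is fourfold degenerate when any of the four bases at the third codon position yields the same amino acid. The following is a minimal, standalone sketch of that idea, assuming the standard genetic code (NCBI table 1) and an in-frame input sequence. The helper names `fourfold_prefixes` and `fourfold_sites` are hypothetical and are not part of this package's API.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI table 1), with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def fourfold_prefixes(table=CODON_TABLE):
    """Return the two-base prefixes whose four codons all encode one amino acid."""
    prefixes = set()
    for first, second in product(BASES, repeat=2):
        encoded = {table[first + second + third] for third in BASES}
        if len(encoded) == 1 and "*" not in encoded:
            prefixes.add(first + second)
    return prefixes


def fourfold_sites(orf, table=CODON_TABLE):
    """Return 0-based indices of third codon positions that are fourfold degenerate.

    Assumes `orf` is already in frame; trailing bases that do not form a
    complete codon are ignored.
    """
    prefixes = fourfold_prefixes(table)
    sites = []
    for start in range(0, len(orf) - len(orf) % 3, 3):
        if orf[start:start + 2] in prefixes:
            sites.append(start + 2)
    return sites


# Codons: ATG (Met), GCC (Ala, fourfold degenerate), AAC (Asn)
print(fourfold_sites("ATGGCCAAC"))  # [5]
```

Under the standard code this finds eight fourfold degenerate codon families (prefixes TC, CT, CC, CG, AC, GT, GC and GG).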
Dependencies
This code uses biopython and scikit-bio internally. In order to install via pip, numpy has to be installed first.
Installing
Simply clone this repo:
git clone https://github.com/nickmachnik/degenerate-sites.git [TARGET DIR]
and then install it via pip:
pip install [TARGET DIR]
Testing
Test the cloned package:
cd [TARGET DIR]
python -m unittest
Getting started
One of the main features of the package is the counting of neutral substitutions at fourfold degenerate sites.
This is best done with known orthologue pairs between species.
The function substitution_rate_at_ffds provides that functionality and can be used as follows:
from codon_degeneracy import substitution_rate_at_ffds as nsr

seq_a = (
    "ATACCCATGGCCAACCTCCTACTCCTCATTGTACCCATTC"
    "TAATCGCAATGGCATTCCTAATGCTTACCGAACGA")
seq_b = (
    "ATGACCACAGTAAATCTCCTACTTATAATCATACCCACAT"
    "TAGCCGCCATAGCATTTCTCACACTCGTTGAACGA")

(number_of_substitutions, number_of_sites), (orf_a, orf_b) = nsr(
    # the input sequences
    seq_a,
    seq_b,
    # NCBI codon table names as used in Bio.Data.CodonTable
    "Vertebrate Mitochondrial",
    "Vertebrate Mitochondrial")
The ORFs returned are there for sanity checks. The default behaviour is to select the first ATG codon as start.
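As a rough illustration of what "select the first ATG codon as start" means, here is a standalone sketch. It is not the package's implementation: the helper name `first_atg_orf` is hypothetical, and ending the ORF at the first in-frame stop codon, as well as the default stop codon set (standard code), are assumptions made only for this example.

```python
def first_atg_orf(seq, stop_codons=("TAA", "TAG", "TGA")):
    """Return the reading frame that begins at the first ATG in `seq`.

    Assumption for this sketch: the ORF ends before the first in-frame
    stop codon, or at the last complete codon if no stop is found.
    Returns an empty string when the sequence contains no ATG.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return ""
    codons = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in stop_codons:
            break
        codons.append(codon)
    return "".join(codons)


# The first ATG is at index 2; TAA terminates the frame.
print(first_atg_orf("CCATGGCCTAAGG"))  # ATGGCC
```

Comparing the returned ORFs against an expectation like this is one way to use them as a sanity check, for example to confirm that the reading frame starts where you think it does.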
NOTE: The number of neutral substitutions per site reported by this function is merely a lower bound, as it does not account for multiple substitutions at the same site.
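One common way to account for multiple substitutions per site is the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below applies it to a pair of counts such as those returned above. This correction is not part of the package; the function name `jukes_cantor_distance` is hypothetical, and the Jukes-Cantor model (equal base frequencies and equal substitution rates) is an assumption that may not suit every data set.

```python
import math


def jukes_cantor_distance(n_substitutions, n_sites):
    """Estimate substitutions per site from an observed difference count.

    Applies the Jukes-Cantor formula d = -3/4 * ln(1 - 4p/3), where
    p = n_substitutions / n_sites. Returns infinity when p >= 0.75,
    where the formula is undefined (sites are saturated).
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    p = n_substitutions / n_sites
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Illustrative counts only, not output of the package:
# 3 observed differences at 20 fourfold degenerate sites.
observed = 3 / 20
corrected = jukes_cantor_distance(3, 20)
print(f"observed: {observed:.4f}, corrected: {corrected:.4f}")
```

The corrected estimate is always at least as large as the observed proportion, which is consistent with the raw value being a lower bound.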
There are more useful and well-documented functions under the hood, which I encourage you to explore by browsing the code.
License
MIT license (LICENSE or https://opensource.org/licenses/MIT)
Hashes for codon_degeneracy-0.1.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f421dfb908761e9c37a9dc10f4094bd03722574fd69b34989053bcd410f58787
MD5 | 7e418b59279b8f3ccf303d8ab8904399
BLAKE2b-256 | 93300170e6367ef2138a152bbb5816039a0e55469680a656a0de4e0d24d8500e