comp_dnds
Efficiently computing dN/dS according to the Nei-Gojobori (1986)[^1] method.
```python
from comp_dnds import dnds

d = dnds()

# Codons are separated by spaces for readability; the spaces are removed before use.
ref_seq = "ATG AAA CCC GGG TTT TAA".replace(" ", "")
obs_seq = "ATG AAA CGC GGC TAC TAA".replace(" ", "")

dn, ds, z, p = d.compute(ref_seq, obs_seq)
ω = dn / ds
print(round(ω, 2))
# 0.12
```
For longer, more realistic sequences, the significance of ω can also be computed using bootstrap resampling. The Z-score and the p-value are computed using the method described by Nei & Kumar (2000)[^2].
```python
from comp_dnds import dnds

d = dnds()

ref_seq = "GCC GGG GGA AGG ACA TAT CTC GCT CCA CCT AAT GGA ATC ATC GGT".replace(" ", "")
obs_seq = "GAC GGC CGA GGG GCA AAT CGC ACT ACA ACT ACT GAA GTC ATC AGT".replace(" ", "")

# bootstrap=1000 requests 1000 bootstrap resamples for the Z-score and p-value
dn, ds, z, p = d.compute(ref_seq, obs_seq, bootstrap=1000)
print(f"ω= {round(dn/ds,2)} - z-score= {round(z,2)} - p-val= {round(p, 5)}")
# ω= 6.91 - z-score= 3.72 - p-val= 0.0002
```
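The bootstrap test can be sketched in plain Python. This is an illustration of the Nei & Kumar (2000) Z-test, Z = (dN − dS) / sqrt(Var(dN) + Var(dS)), and not the code used inside `comp_dnds`. The function names `resample_codons`, `z_score` and `two_sided_p` are hypothetical, and the sketch assumes that the variances are estimated from bootstrap replicates obtained by resampling codon columns with replacement.

```python
import math
import random
import statistics


def resample_codons(ref_seq, obs_seq, rng):
    """Draw codon columns with replacement, keeping ref/obs codons paired."""
    n_codons = len(ref_seq) // 3
    picks = [rng.randrange(n_codons) for _ in range(n_codons)]
    ref = "".join(ref_seq[3 * i:3 * i + 3] for i in picks)
    obs = "".join(obs_seq[3 * i:3 * i + 3] for i in picks)
    return ref, obs


def z_score(dn, ds, boot_dn, boot_ds):
    """Z = (dN - dS) / sqrt(Var(dN) + Var(dS)), variances from bootstrap replicates."""
    std_err = math.sqrt(statistics.variance(boot_dn) + statistics.variance(boot_ds))
    return (dn - ds) / std_err


def two_sided_p(z):
    """Two-sided p-value of a Z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

Each resampled pair would be passed to a dN/dS routine to obtain one bootstrap replicate of dN and dS. A two-sided p-value of this form is consistent with the numbers reported above: `two_sided_p(3.72)` rounds to 0.0002.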
Installation
```shell
pip install comp_dnds
```
Background
To compute the dN/dS ratio using the Nei-Gojobori (1986)[^1] method, one needs an observed sequence and a reference (or ancestral) sequence. The observed sequence is typically the one from your sample or experiment, while the reference or ancestral sequence can be inferred with ancestral state reconstruction, for example using IQ-TREE.
The dN/dS ratio, often represented as ω (omega), is a metric used in molecular biology and evolutionary biology to measure the selective pressure acting on a protein-coding gene. It compares the rate of non-synonymous substitutions (dN) to the rate of synonymous substitutions (dS) in coding sequences.
Non-synonymous substitutions (dN): These are mutations that change the amino acid in a protein. They can potentially affect the protein's structure and function, and therefore they can be subject to natural selection.
Synonymous substitutions (dS): These are mutations that do not change the amino acid in a protein. Because they don't change the protein's structure or function, they are often considered to be selectively neutral, or at least under less selective pressure than non-synonymous changes.
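The distinction between the two kinds of substitution can be illustrated with the standard genetic code. The sketch below is not part of `comp_dnds`; `CODON_TABLE` and `classify_codon_changes` are hypothetical names, and the sequences are assumed to be in frame, of equal length, and free of ambiguous bases. It labels whole codons, whereas the Nei-Gojobori method counts individual nucleotide differences, so a codon with two changes (such as TTT to TAC) is reported here as a single non-synonymous change.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(ref_seq, obs_seq):
    """Return (codon_index, ref_codon, obs_codon, kind) for every differing codon."""
    changes = []
    for i in range(0, len(ref_seq) - 2, 3):
        ref_codon, obs_codon = ref_seq[i:i + 3], obs_seq[i:i + 3]
        if ref_codon == obs_codon:
            continue
        same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[obs_codon]
        kind = "synonymous" if same_aa else "non-synonymous"
        changes.append((i // 3, ref_codon, obs_codon, kind))
    return changes


for change in classify_codon_changes("ATGAAACCCGGGTTTTAA", "ATGAAACGCGGCTACTAA"):
    print(change)
# (2, 'CCC', 'CGC', 'non-synonymous')
# (3, 'GGG', 'GGC', 'synonymous')
# (4, 'TTT', 'TAC', 'non-synonymous')
```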
The dN/dS ratio provides insight into the evolutionary forces acting on a gene:
- dN/dS = 1: The rate of non-synonymous substitutions is about the same as the rate of synonymous substitutions. This suggests that the gene is evolving neutrally.
- dN/dS < 1: The rate of non-synonymous changes is lower than the rate of synonymous changes. This suggests that the gene is under purifying (negative) selection, which acts to remove deleterious mutations.
- dN/dS > 1: The rate of non-synonymous changes is higher than the rate of synonymous changes. This suggests that the gene is under positive (adaptive) selection: non-synonymous changes provide some advantage and are being selected for.
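The three cases above can be written as a small helper. In the Nei-Gojobori method, the proportions of non-synonymous and synonymous differences (pN and pS) are converted to distances with the Jukes-Cantor correction, d = −(3/4) ln(1 − 4p/3), before the ratio is taken. The sketch below is an illustration with hypothetical function names and is not the `comp_dnds` API. It assumes pN and pS have already been computed.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion of differences p (undefined for p >= 0.75)."""
    if p >= 0.75:
        return math.nan
    return -0.75 * math.log(1 - 4 * p / 3)


def omega_from_proportions(p_n, p_s):
    """dN/dS from the proportions of non-synonymous and synonymous differences."""
    d_n, d_s = jukes_cantor(p_n), jukes_cantor(p_s)
    if d_s == 0:
        return math.nan  # no synonymous change observed, ratio undefined
    return d_n / d_s


def interpret_omega(omega):
    """Map a dN/dS value onto the three cases described in the list above."""
    if math.isnan(omega):
        return "undefined"
    if omega < 1:
        return "purifying (negative) selection"
    if omega > 1:
        return "positive (adaptive) selection"
    return "neutral evolution"
```

A ratio on its own is only suggestive. On real data, the Z-score and p-value returned by `compute` indicate whether the departure from 1 is significant.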
Benchmark
Figure 1: Average time taken by biopython `cal_dn_ds` and `comp_dnds` to compute the dN and dS values for a sequence of 999 nucleotides. Lower is better.
`comp_dnds` produces identical results to the `cal_dn_ds` implementation of biopython, while being on average 32 times faster. The detailed benchmark is in notebooks/benchmark.ipynb.
[^1]: Nei, M., & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution, 3(5), 418-426.
[^2]: Nei, M., & Kumar, S. (2000). Molecular evolution and phylogenetics. Oxford University Press, USA. Page 55