Efficiently computing dN/dS according to the NG86 method

comp_dnds

Efficiently compute dN/dS according to the Nei-Gojobori (1986)[^1] method.

from comp_dnds import dnds

d = dnds()
ref_seq = "ATG AAA CCC GGG TTT TAA".replace(" ", "")
obs_seq = "ATG AAA CGC GGC TAC TAA".replace(" ", "")
dn, ds, z, p = d.compute(ref_seq, obs_seq)
ω = dn/ds
print(round(ω, 2))
# 0.12

For longer, more realistic sequences, the significance of ω can also be assessed with bootstrap resampling by passing the bootstrap argument to compute. The Z-score and the p-value are computed using the method described by Nei & Kumar (2000)[^2].

d = dnds()
ref_seq = "GCC GGG GGA AGG ACA TAT CTC GCT CCA CCT AAT GGA ATC ATC GGT".replace(" ", "")
obs_seq = "GAC GGC CGA GGG GCA AAT CGC ACT ACA ACT ACT GAA GTC ATC AGT".replace(" ", "")
dn, ds, z, p = d.compute(ref_seq, obs_seq, bootstrap=1000)
print(f"ω= {round(dn/ds,2)} - z-score= {round(z,2)} - p-val= {round(p, 5)}")
# ω= 6.91 - z-score= 3.72 - p-val= 0.0002
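The idea behind the bootstrap test can be sketched in plain Python. This is a minimal illustration, not the code comp_dnds runs: the helper names `resample_codons` and `z_and_p` are hypothetical, and it assumes that codons are resampled with replacement, that the Z-score is dN - dS divided by the bootstrap standard deviation of that difference, and that the p-value is two-sided under a normal approximation.

```python
import math
import random
from statistics import stdev


def resample_codons(ref_seq, obs_seq, rng):
    """Draw codon columns with replacement, keeping ref/obs codons paired.

    Hypothetical helper: both sequences are assumed to be in frame and of
    equal length.
    """
    n = len(ref_seq) // 3
    picks = [rng.randrange(n) for _ in range(n)]
    ref = "".join(ref_seq[3 * i:3 * i + 3] for i in picks)
    obs = "".join(obs_seq[3 * i:3 * i + 3] for i in picks)
    return ref, obs


def z_and_p(dn, ds, boot_diffs):
    """Z-score of dN - dS and a two-sided normal p-value.

    boot_diffs holds one (dN - dS) value per bootstrap replicate; its
    standard deviation is used as the standard error (an assumption).
    """
    se = stdev(boot_diffs)
    z = (dn - ds) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p
```

In such a scheme, each replicate produced by `resample_codons` would be passed to a dN/dS estimator, and the resulting differences collected into `boot_diffs`. A Z-score of 3.72 maps to a two-sided p-value of roughly 0.0002 under this approximation, consistent with the output shown above.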

Installation

pip install comp_dnds

Background

To compute the dN/dS ratio using the Nei-Gojobori (1986)[^1] method, one needs an observed sequence and a reference (or ancestral) sequence. The observed sequence is typically the one from your sample or experiment, while the reference or ancestral sequence can be inferred with ancestral state reconstruction, for example using IQ-TREE.
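Because the comparison is made codon by codon, the two sequences have to line up in the same reading frame. The following sketch shows one way to check this before any computation. The function `paired_codons` is a hypothetical helper, and the checks it performs (equal length, length divisible by three) are assumptions about sensible input, not documented behaviour of comp_dnds.

```python
def paired_codons(ref_seq, obs_seq):
    """Split two aligned coding sequences into (ref, obs) codon pairs.

    Hypothetical helper: raises ValueError when the sequences cannot be
    compared codon by codon.
    """
    ref_seq, obs_seq = ref_seq.upper(), obs_seq.upper()
    if len(ref_seq) != len(obs_seq):
        raise ValueError("sequences must have the same length")
    if len(ref_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [
        (ref_seq[i:i + 3], obs_seq[i:i + 3])
        for i in range(0, len(ref_seq), 3)
    ]
```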

The dN/dS ratio, often represented as ω (omega), is a metric used in molecular biology and evolutionary biology to measure the selective pressure acting on a protein-coding gene. It compares the rate of non-synonymous substitutions (dN) to the rate of synonymous substitutions (dS) in coding sequences.

Non-synonymous substitutions (dN): These are mutations that change the amino acid in a protein. They can potentially affect the protein's structure and function, and therefore they can be subject to natural selection.

Synonymous substitutions (dS): These are mutations that do not change the amino acid in a protein. Because they don't change the protein's structure or function, they are often considered to be selectively neutral, or at least under less selective pressure than non-synonymous changes.
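The two definitions above can be made concrete with the standard genetic code. The sketch below is only an illustration, with hypothetical names (`classify`, `synonymous_sites`). It compares codons at the amino-acid level and counts the synonymous sites of a codon as the summed fraction of single-base changes that leave the amino acid unchanged. It treats a stop codon as just another translation symbol, which is a simplification, and it does not average over mutational pathways for codons that differ at more than one position.

```python
# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(ref_codon, obs_codon):
    """Label a codon pair as identical, synonymous or non-synonymous."""
    if ref_codon == obs_codon:
        return "identical"
    same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[obs_codon]
    return "synonymous" if same_aa else "non-synonymous"


def synonymous_sites(codon):
    """Sum, over the 3 positions, the fraction of base changes that are silent."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                s += 1 / 3
    return s
```

For the short example at the top of this page, `classify("CCC", "CGC")` gives "non-synonymous" (Pro to Arg) and `classify("GGG", "GGC")` gives "synonymous" (Gly to Gly). The number of non-synonymous sites of a codon is then 3 minus its synonymous sites.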

The dN/dS ratio provides insight into the evolutionary forces acting on a gene:

  • dN/dS = 1: The rate of non-synonymous substitutions is about the same as the rate of synonymous substitutions. This suggests that the gene is evolving neutrally.

  • dN/dS < 1: The rate of non-synonymous changes is lower than the rate of synonymous changes. This suggests that the gene is under purifying or negative selection. Negative selection acts to remove deleterious mutations.

  • dN/dS > 1: The rate of non-synonymous changes is higher than the rate of synonymous changes. This suggests that the gene is under positive or adaptive selection. This means that non-synonymous changes provide some advantage and are being selected for.
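As a rough illustration of how the ratio is formed and read, the sketch below applies the Jukes-Cantor correction used by the Nei-Gojobori method to the proportions of non-synonymous and synonymous differences, then maps ω onto the three cases above. The function names and the tolerance around 1 are hypothetical choices made for this example, and a label from `interpret` is no substitute for a significance test.

```python
import math


def jukes_cantor(p):
    """Convert a proportion of differences p into a Jukes-Cantor distance."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


def omega_from_proportions(p_n, p_s):
    """Return dN/dS from the proportions of non-synonymous and synonymous differences."""
    dn = jukes_cantor(p_n)
    ds = jukes_cantor(p_s)
    if ds == 0:
        # Without synonymous differences the ratio is unbounded or undefined.
        return math.inf if dn > 0 else math.nan
    return dn / ds


def interpret(omega, tol=0.05):
    """Map omega onto a verbal label; tol is an arbitrary band around 1."""
    if math.isnan(omega):
        return "undefined"
    if abs(omega - 1) <= tol:
        return "neutral evolution"
    return "purifying selection" if omega < 1 else "positive selection"
```

With this sketch, `interpret(0.12)` returns "purifying selection" and `interpret(6.91)` returns "positive selection", matching the reading of the two usage examples above.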

Benchmark

Figure 1: Average time taken by biopython cal_dn_ds and comp_dnds to compute the dN and dS values for a 999-nucleotide sequence. Lower is better.

comp_dnds produces results identical to the cal_dn_ds implementation of biopython, while being on average 32× faster. A detailed benchmark is available in notebooks/benchmark.ipynb.

[^1]: Nei, M., & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution, 3(5), 418-426.
[^2]: Nei, M., & Kumar, S. (2000). Molecular evolution and phylogenetics. Oxford University Press, USA. Page 55
