Computes Tajima's D, the Pi- or Watterson-Estimator for multiple sequences.
Compute Tajima's D, the Pi-Estimator, or the Watterson-Estimator for multiple sequences.
This module is now part of the bfx suite. See https://py-bfx.readthedocs.io for more information.
Tajima's D is a population genetic test statistic based on the difference between two diversity estimates: one derived from the mean number of pairwise differences and one derived from the number of segregating sites. It is used to assess whether a population is expanding or shrinking.
Tajima's D
Tajima's D is defined as follows: $\theta_\text{Tajima}=\frac{\theta_{\pi}-\theta_{W}}{\sqrt{\text{Var}(\theta_{\pi}-\theta_{W})}}$
If $\theta_\text{Tajima}<0$, there are many rare variants, indicating an expanding population.
Conversely, $0<\theta_\text{Tajima}$ indicates a declining population, as there are many intermediate-frequency variants.
A result is considered significant if $\theta_\text{Tajima}<-2$ or $2<\theta_\text{Tajima}$.
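The definition above leaves the variance term unspecified. As a minimal sketch, the following uses the variance constants from Tajima's original formulation (commonly written a1, a2, b1, b2, c1, c2, e1, e2). The function name `tajimas_d_sketch` and its arguments are hypothetical and are not this package's API; the package's own implementation may differ in detail.

```python
import math


def tajimas_d_sketch(pi, seg_sites, n):
    """Illustrative Tajima's D (hypothetical helper, not the package API).

    pi        -- mean number of pairwise differences
    seg_sites -- number of segregating sites S
    n         -- number of sequences
    """
    # For n < 4 the variance constants collapse to zero, so D is undefined.
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")

    # Harmonic sums over 1..n-1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))

    # Variance constants from Tajima's formulation.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    theta_w = seg_sites / a1  # Watterson-style estimate
    return (pi - theta_w) / math.sqrt(variance)


# Four sequences, 2 segregating sites, 7 differences over 6 pairs.
d = tajimas_d_sketch(pi=7 / 6, seg_sites=2, n=4)
```

The guard for fewer than four sequences is a design choice of this sketch: with two or three sequences the variance evaluates to zero and the ratio cannot be computed.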
Pi-Estimator
The π estimator is the average number of pairwise differences between any two sequences:
$\theta_{\pi}=\frac{\text{Nr. of pairwise differences}}{\binom{n}{2}}$
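The formula can be sketched in plain Python as below. This is an illustration only, assuming aligned sequences of equal length with no special handling of gaps or ambiguous bases; `pi_estimator_sketch` is a hypothetical name and not the package's own function.

```python
from itertools import combinations


def pi_estimator_sketch(sequences):
    """Mean number of pairwise differences (hypothetical helper)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least 2 sequences")

    # Count mismatching positions for every unordered pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    )

    # Number of unordered pairs, i.e. n choose 2.
    n_pairs = n * (n - 1) // 2
    return total_differences / n_pairs


theta_pi = pi_estimator_sketch(["AAAA", "AAAT", "AAGT", "AAGT"])
```

For these four sequences there are 7 differences across 6 pairs, so the sketch returns 7/6.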
Watterson-Estimator
The Watterson estimator is the number of segregating sites, scaled by the harmonic sum over $n-1$ for $n$ sequences:
$\theta_{W}=\frac{\text{Nr. of segregating sites}}{\sum_{i=1}^{n-1}\frac{1}{i}}$
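A minimal sketch of this formula follows, assuming aligned sequences of equal length and treating a site as segregating when its column contains more than one distinct character. `watterson_estimator_sketch` is a hypothetical name, not the package's own function.

```python
def watterson_estimator_sketch(sequences):
    """Segregating sites divided by the harmonic sum (hypothetical helper)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least 2 sequences")

    # zip(*sequences) iterates over alignment columns.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)

    # Harmonic sum 1/1 + 1/2 + ... + 1/(n-1).
    a1 = sum(1.0 / i for i in range(1, n))
    return seg_sites / a1


theta_w = watterson_estimator_sketch(["AAAA", "AAAT", "AAGT", "AAGT"])
```

For these four sequences, two columns vary and the harmonic sum is 11/6, so the sketch returns 12/11.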
Installation
Using pip / pip3:
pip install tajimas_d
Using conda:
conda install -c bioconda tajimas_d
Or from source:
git clone git@github.com:not-a-feature/tajimas_d.git
cd tajimas_d
pip install .
How to use
from tajimas_d import tajimas_d, watterson_estimator, pi_estimator
sequences = ["AAAA", "AAAT", "AAGT", "AAGT"]
theta_tajima = tajimas_d(sequences)
theta_pi = pi_estimator(sequences)
theta_w = watterson_estimator(sequences)
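A returned Tajima's D value can be read against the thresholds described earlier. The helper below is a hypothetical convenience, not part of the package; it only encodes the sign and the ±2 significance rule stated in this README.

```python
def interpret_tajimas_d(d, threshold=2.0):
    """Map a Tajima's D value to a short description (hypothetical helper)."""
    if d < -threshold:
        return "significant: many rare variants (expanding population)"
    if d > threshold:
        return "significant: many intermediate variants (declining population)"
    if d < 0:
        return "not significant: tendency towards rare variants"
    if d > 0:
        return "not significant: tendency towards intermediate variants"
    return "not significant: no deviation"


print(interpret_tajimas_d(-2.4))
print(interpret_tajimas_d(0.6))
```

The `threshold` parameter defaults to 2 to match the rule above and is exposed only so a different cutoff can be tried.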
Standalone version
The standalone version requires miniFasta>=2.2 to be installed.
usage: tajimas_d [-h] -f PATH [-p] [-t] [-w]
tajimas_d: Compute Tajima's D, the Pi- or Watterson-Estimator for multiple
sequences.
optional arguments:
-h, --help show this help message and exit
-f PATH, --file PATH Path to fasta file with all sequences.
-p, --pi Compute the Pi-Estimator score.
-t, --tajima Compute the Tajima's D score. (default)
-w, --watterson Compute the Watterson-Estimator score.
License
Copyright (C) 2024 by Jules Kreuer - @not_a_feature
This piece of software is published under the GNU General Public License v3.0. TL;DR:
Permissions | Conditions | Limitations |
---|---|---|
✓ Commercial use | Disclose source | ✕ Liability |
✓ Distribution | License and copyright notice | ✕ Warranty |
✓ Modification | Same license | |
✓ Patent use | State changes | |
✓ Private use | | |
Go to LICENSE.md to see the full version.
Hashes for tajimas_d-2.0.2-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | a802fbb3997733284420f0dc803907485209ffcd5ac1b4b05b80c3fd42737060 |
MD5 | 3f553aaa53b216f5c68951c4a4a99dce |
BLAKE2b-256 | 8bcf195fbbc26bdd004c3e5cce8fc3d831a071d3438e934ae2ea692c3260c085 |