Modern Python implementation of the McDonald-Kreitman test toolkit
MKado 御門
A modern Python implementation of the McDonald-Kreitman test toolkit for detecting selection in molecular evolution.
Features
- Standard MK test: Classic 2x2 contingency table with Fisher's exact test
- Polarized MK test: Uses a second outgroup (a third species) to assign mutations to lineages
- Asymptotic MK test: Frequency-bin α estimates with exponential extrapolation (Messer & Petrov 2013)
- Tarone-Greenland α_TG: Weighted multi-gene estimator that corrects for sample size heterogeneity (Stoletzki & Eyre-Walker 2011)
- Batch processing: Process multiple genes with parallel execution and Benjamini-Hochberg correction for multiple testing
- Volcano plots: Visualize batch results with publication-ready volcano plots
- Multiple output formats: Pretty-print, TSV, and JSON
Installation
# Install with uv (recommended)
uv pip install mkado
# Or install with pip
pip install mkado
Development Installation
# Clone the repository
git clone https://github.com/andrewkern/mkado.git
cd mkado
# Install with uv
uv sync
Quick Start
# Standard MK test (combined alignment file)
mkado test alignment.fa -i "dmel" -o "dsim"
# Asymptotic MK test
mkado test alignment.fa -i "dmel" -o "dsim" -a
# Polarized MK test
mkado test alignment.fa -i "dmel" -o "dsim" --polarize-match "dyak"
# Batch process a directory
mkado batch alignments/ -i "dmel" -o "dsim"
# Batch with asymptotic test and 8 parallel workers
mkado batch alignments/ -i "dmel" -o "dsim" -a -w 8
# Get file info
mkado info sequences.fa
Usage Modes
mkado supports two modes for specifying ingroup/outgroup sequences:
Combined File Mode (Recommended)
Use -i and -o to filter sequences by name pattern from a single alignment file:
mkado test alignment.fa -i "speciesA" -o "speciesB"
mkado batch alignments/ -i "speciesA" -o "speciesB"
Separate Files Mode
Provide separate FASTA files for ingroup and outgroup:
mkado test ingroup.fa outgroup.fa
mkado batch genes/ --ingroup-pattern "*_in.fa" --outgroup-pattern "*_out.fa"
Commands
mkado test
Run MK test on a single alignment.
mkado test FASTA [OUTGROUP_FILE] [OPTIONS]
Key Options:
| Option | Short | Description |
|---|---|---|
| --ingroup-match | -i | Ingroup sequence name pattern (combined mode) |
| --outgroup-match | -o | Outgroup sequence name pattern (combined mode) |
| --asymptotic | -a | Use asymptotic MK test |
| --polarize | -p | Second outgroup file (separate files mode) |
| --polarize-match | | Second outgroup pattern (combined mode) |
| --bins | -b | Frequency bins for asymptotic test (default: 10) |
| --plot-asymptotic | | Generate alpha(x) plot for asymptotic test (PNG, PDF, or SVG) |
| --format | -f | Output format: pretty, tsv, json |
| --reading-frame | -r | Reading frame 1-3 (default: 1) |
Examples:
# Combined file mode
mkado test alignment.fa -i "dmel" -o "dsim"
mkado test alignment.fa -i "dmel" -o "dsim" -a -b 20
mkado test alignment.fa -i "dmel" -o "dsim" -a --plot-asymptotic alpha_fit.png
mkado test alignment.fa -i "dmel" -o "dsim" --polarize-match "dyak"
# Separate files mode
mkado test ingroup.fa outgroup.fa
mkado test ingroup.fa outgroup.fa -a
mkado test ingroup.fa outgroup.fa -p outgroup2.fa
mkado batch
Run MK test on multiple alignment files.
mkado batch DIRECTORY [OPTIONS]
Key Options:
| Option | Short | Description |
|---|---|---|
| --ingroup-match | -i | Ingroup pattern (enables combined file mode) |
| --outgroup-match | -o | Outgroup pattern (required with -i) |
| --asymptotic | -a | Use asymptotic MK test |
| --alpha-tg | | Compute weighted α_TG (Stoletzki & Eyre-Walker 2011) |
| --aggregate/--per-gene | | Aggregate results or per-gene (asymptotic) |
| --pattern | | File glob pattern (default: auto-detect *.fa, *.fasta, *.fna) |
| --workers | -w | Parallel workers (0=auto, 1=sequential) |
| --bins | -b | Frequency bins for asymptotic test |
| --format | -f | Output format: pretty, tsv, json |
| --volcano | | Generate volcano plot (PNG, PDF, or SVG) |
| --plot-asymptotic | | Generate alpha(x) plot for aggregated asymptotic test |
Examples:
# Combined file mode (recommended)
mkado batch alignments/ -i "dmel" -o "dsim"
mkado batch alignments/ -i "dmel" -o "dsim" -a
mkado batch alignments/ -i "dmel" -o "dsim" -a --per-gene
mkado batch alignments/ -i "dmel" -o "dsim" --alpha-tg
mkado batch alignments/ -i "dmel" -o "dsim" -w 8
# Generate a volcano plot
mkado batch alignments/ -i "dmel" -o "dsim" --volcano results.png
# Generate asymptotic alpha(x) plot
mkado batch alignments/ -i "dmel" -o "dsim" -a --plot-asymptotic alpha_fit.png
# Separate files mode
mkado batch genes/ --ingroup-pattern "*_in.fa" --outgroup-pattern "*_out.fa"
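Batch mode reports p-values corrected with the Benjamini-Hochberg procedure. The sketch below shows the standard procedure on a hypothetical list of per-gene p-values; it is not taken from mkado's source, and the function name `benjamini_hochberg` is an assumption made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene Fisher's exact p-values
pvals = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(pvals)
print([round(q, 4) for q in adj])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the backward pass keeps the adjusted values non-decreasing in the raw p-value.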
mkado info
Display information about a FASTA file.
mkado info FASTA [-r READING_FRAME]
Example Output
$ mkado test alignment.fa -i "kreitman" -o "mauritiana"
Found 11 ingroup, 1 outgroup sequences
MK Test Results:
Divergence: Dn=6, Ds=8
Polymorphism: Pn=1, Ps=8
Fisher's exact p-value: 0.176
Neutrality Index (NI): 0.1667
Alpha (α): 0.8333
DoS: 0.3175
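The statistics in this output can be recomputed by hand from the four counts. The following is a minimal standard-library sketch, not mkado's implementation; the helper `fisher_exact_two_sided` is a hypothetical name introduced here, and it sums the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed table
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts from the example output
dn, ds, pn, ps = 6, 8, 1, 8

ni = (pn / ps) / (dn / ds)               # Neutrality Index
alpha = 1 - ni                           # proportion of adaptive substitutions
dos = dn / (dn + ds) - pn / (pn + ps)    # Direction of Selection
p_value = fisher_exact_two_sided(dn, ds, pn, ps)

print(f"NI={ni:.4f} alpha={alpha:.4f} DoS={dos:.4f} p={p_value:.3f}")
# prints: NI=0.1667 alpha=0.8333 DoS=0.3175 p=0.176
```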
Python API
from mkado import mk_test, asymptotic_mk_test, SequenceSet
# Run MK test
result = mk_test("ingroup.fa", "outgroup.fa")
print(f"Alpha: {result.alpha}")
print(f"P-value: {result.p_value}")
# Run asymptotic MK test
result = asymptotic_mk_test("ingroup.fa", "outgroup.fa")
print(f"Asymptotic Alpha: {result.alpha_asymptotic}")
print(f"95% CI: {result.ci_low} - {result.ci_high}")
# Combined file mode - filter by sequence name
all_seqs = SequenceSet.from_fasta("combined.fa")
ingroup = all_seqs.filter_by_name("dmel")
outgroup = all_seqs.filter_by_name("dsim")
result = mk_test(ingroup, outgroup)
Interpretation
Neutrality Index (NI)
- NI = 1: Neutral evolution
- NI > 1: Excess nonsynonymous polymorphism (segregating weakly deleterious variants)
- NI < 1: Excess nonsynonymous divergence (positive selection)
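For several genes, the Tarone-Greenland estimator of Stoletzki & Eyre-Walker (2011) combines per-gene counts with weights 1/(Ps + Ds), and α_TG is 1 minus the result. The sketch below follows the published formula on hypothetical counts; the function name `ni_tg` and the handling of genes with no synonymous sites are assumptions, not mkado's API.

```python
def ni_tg(genes):
    """Tarone-Greenland NI from an iterable of (Dn, Ds, Pn, Ps) tuples."""
    num = 0.0
    den = 0.0
    for dn, ds, pn, ps in genes:
        weight = ps + ds
        if weight == 0:
            # Assumption: skip genes with no synonymous information
            continue
        num += ds * pn / weight
        den += ps * dn / weight
    if den == 0:
        raise ValueError("NI_TG is undefined when the denominator sums to zero")
    return num / den


# Hypothetical per-gene counts (Dn, Ds, Pn, Ps)
genes = [(6, 8, 1, 8), (3, 10, 4, 12), (0, 5, 2, 7)]

ni_weighted = ni_tg(genes)
alpha_tg = 1 - ni_weighted
print(f"NI_TG={ni_weighted:.4f} alpha_TG={alpha_tg:.4f}")
```

With a single gene the weights cancel and the estimator reduces to the ordinary NI. Note that the third gene has Dn = 0, so its own NI would be undefined, yet it still contributes to the pooled estimate.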
Alpha (α)
- α = 0: No adaptive substitutions
- α > 0: Estimated proportion of nonsynonymous substitutions driven by positive selection
- α < 0: Excess nonsynonymous polymorphism relative to divergence
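The asymptotic test (Messer & Petrov 2013) evaluates α as a function of derived allele frequency x, using α(x) = 1 - (Ds/Dn) · (Pn(x)/Ps(x)), and then extrapolates a fitted exponential curve to x = 1. The sketch below covers only the per-bin step, on hypothetical counts; how mkado bins frequencies and fits the curve is not shown here, and `alpha_by_bin` is an illustrative name.

```python
def alpha_by_bin(dn, ds, pn_bins, ps_bins):
    """Per-bin alpha(x) = 1 - (Ds/Dn) * (Pn(x)/Ps(x)); None where Ps(x) is 0."""
    out = []
    for pn_x, ps_x in zip(pn_bins, ps_bins):
        if ps_x == 0:
            out.append(None)  # alpha(x) is undefined without synonymous sites
        else:
            out.append(1 - (ds / dn) * (pn_x / ps_x))
    return out


# Hypothetical divergence counts and polymorphism counts per frequency bin,
# ordered from low-frequency to high-frequency variants
dn, ds = 60, 80
pn_bins = [40, 12, 6, 4, 3]
ps_bins = [50, 20, 12, 9, 8]

alphas = alpha_by_bin(dn, ds, pn_bins, ps_bins)
for i, a in enumerate(alphas, start=1):
    print(f"bin {i}: alpha(x) = {a:.3f}")
```

In this made-up data α(x) rises with frequency, which is the pattern expected when weakly deleterious variants inflate Pn in the low-frequency bins.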
Direction of Selection (DoS)
From Stoletzki & Eyre-Walker (2011), DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps):
- DoS = 0: Neutral evolution
- DoS > 0: Positive selection (excess adaptive substitutions)
- DoS < 0: Slightly deleterious polymorphisms
DoS is bounded [-1, +1] and symmetric around 0, making it easier to interpret than NI.
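One practical consequence is that DoS stays defined for sparse tables where NI does not. The sketch below compares the two on a hypothetical gene with no nonsynonymous divergence; the function names are illustrative, and returning `None` for undefined values is an assumption of this example, not a statement about mkado's behavior.

```python
def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps) / (Dn/Ds); None when Ps, Dn, or Ds is zero."""
    if ps == 0 or dn == 0 or ds == 0:
        return None
    return (pn / ps) / (dn / ds)


def direction_of_selection(dn, ds, pn, ps):
    """DoS = Dn/(Dn+Ds) - Pn/(Pn+Ps); None only if a whole row is empty."""
    if dn + ds == 0 or pn + ps == 0:
        return None
    return dn / (dn + ds) - pn / (pn + ps)


# Hypothetical gene with Dn = 0: NI is undefined, DoS is not
counts = (0, 8, 1, 8)
ni = neutrality_index(*counts)
dos = direction_of_selection(*counts)
print(f"NI={ni} DoS={dos:.4f}")
```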
Development
# Install dev dependencies
uv sync
# Run tests
uv run pytest
# Run linter
uv run ruff check src/
# Run formatter
uv run ruff format src/
Examples
Example data and tutorials are available in the examples/ directory:
# Run batch MK test on example data
mkado batch examples/anopheles_batch/ -i gamb -o afun
# Run asymptotic MK test
mkado batch examples/anopheles_batch/ -i gamb -o afun -a
See the documentation for detailed tutorials and API reference.
References
- Haller, B. C., & Messer, P. W. (2017). asymptoticMK: A web-based tool for the asymptotic McDonald–Kreitman test. G3: Genes, Genomes, Genetics, 7(5), 1569-1575. https://doi.org/10.1534/g3.117.039693
- McDonald, J. H., & Kreitman, M. (1991). Adaptive protein evolution at the Adh locus in Drosophila. Nature, 351(6328), 652-654. https://doi.org/10.1038/351652a0
- Messer, P. W., & Petrov, D. A. (2013). Frequent adaptation and the McDonald–Kreitman test. PNAS, 110(21), 8615-8620. https://doi.org/10.1073/pnas.1220835110
- Rand, D. M., & Kann, L. M. (1996). Excess amino acid polymorphism in mitochondrial DNA: contrasts among genes from Drosophila, mice, and humans. Molecular Biology and Evolution, 13(6), 735-748. https://doi.org/10.1093/oxfordjournals.molbev.a025634
- Smith, N. G. C., & Eyre-Walker, A. (2002). Adaptive protein evolution in Drosophila. Nature, 415(6875), 1022-1024. https://doi.org/10.1038/4151022a
- Stoletzki, N., & Eyre-Walker, A. (2011). Estimation of the Neutrality Index. Molecular Biology and Evolution, 28(1), 63-70. https://doi.org/10.1093/molbev/msq249
License
MIT License