BABAPPAi: diagnostic framework for identifiability of episodic branch-site structure

BABAPPAi (babappai)

License: MIT

BABAPPAi is the renamed continuation of the BABAPPAΩ codebase. It is a diagnostic framework for branch-site recoverability/identifiability under matched neutral calibration.

1) Scope

BABAPPAi reports:

  • raw dispersion diagnostics: eii_z_raw, eii_01_raw
  • empirically calibrated identifiability probabilities: ceii_gene, ceii_site
  • empirical matched-neutral significance: p_emp, q_emp, significant_bool

BABAPPAi does not perform classical dN/dS likelihood-ratio testing, and significance does not prove adaptive substitution.

2) Installation

pip install babappai

3) Quickstart

babappai model fetch
babappai example write --outdir demo
babappai run --alignment demo/aln.fasta --tree demo/tree.nwk --outdir demo_out

4) Statistical outputs

For each gene-level run:

  • D_obs: observed dispersion statistic, the sample variance (ddof=1) of site-level site_logit_mean across codon sites
  • mu0: matched neutral mean
  • sigma0_raw: raw neutral SD
  • sigma0_final: SD after floor application
  • eii_z_raw = (D_obs - mu0) / sigma0_final
  • eii_01_raw = sigmoid(eii_z_raw) (diagnostic magnitude scale)
  • ceii_gene: calibrated P(I_gene=1 | data)
  • ceii_site: calibrated P(I_site=1 | data)
  • ceii_gene_class, ceii_site_class: calibration-derived decision bands
  • ceii_ci: bootstrap calibration interval (if calibration asset provides it)
  • calibration_version, domain_shift_or_applicability
  • p_emp = (1 + count(D0 >= D_obs)) / (M + 1), where D0 is the dispersion statistic of each of the M matched neutral replicates
  • q_emp: BH-adjusted p_emp across tested genes in an analysis set
  • significant_bool = (q_emp <= alpha) (default alpha=0.05)
  • significance_label ∈ {not_significant, significant}
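
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: the function names are hypothetical, the floor is assumed to be applied as max(sigma0_raw, sigma_floor), and sigma0_raw is assumed to be a sample SD (ddof=1) of the neutral replicate statistics.

```python
import math
import statistics


def dispersion_stat(site_logit_means):
    """D_obs: sample variance (ddof=1) of site-level values across codon sites."""
    return statistics.variance(site_logit_means)


def stable_sigmoid(z):
    """Logistic function written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def eii_scores(d_obs, neutral_d, sigma_floor=0.05):
    """Raw dispersion diagnostics from matched neutral replicate statistics."""
    mu0 = statistics.fmean(neutral_d)
    sigma0_raw = statistics.stdev(neutral_d)      # assumption: ddof=1
    sigma0_final = max(sigma0_raw, sigma_floor)   # assumption: floor applied via max()
    eii_z_raw = (d_obs - mu0) / sigma0_final
    return {
        "mu0": mu0,
        "sigma0_raw": sigma0_raw,
        "sigma0_final": sigma0_final,
        "eii_z_raw": eii_z_raw,
        "eii_01_raw": stable_sigmoid(eii_z_raw),
    }


def empirical_p(d_obs, neutral_d):
    """p_emp = (1 + count(D0 >= D_obs)) / (M + 1)."""
    exceed = sum(1 for d0 in neutral_d if d0 >= d_obs)
    return (1 + exceed) / (len(neutral_d) + 1)
```

The +1 in the numerator and denominator means p_emp can never be zero: with M neutral replicates its smallest possible value is 1 / (M + 1), so the number of replicates bounds the attainable significance.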

ceii_* and q_emp are intentionally distinct layers:

  • ceii_*: recoverability/identifiability probability calibration
  • q_emp: excess-dispersion significance under matched-neutral calibration
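
For the q_emp layer, the Benjamini-Hochberg adjustment across the genes in an analysis set can be sketched as follows. This is a generic step-up BH procedure written for illustration; the function names are hypothetical and the package may compute q_emp differently in detail.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def significance_calls(pvalues, alpha=0.05):
    """significant_bool = (q_emp <= alpha) for each gene in the analysis set."""
    return [q <= alpha for q in bh_adjust(pvalues)]
```

Because q_emp depends on every p_emp in the set, adding or removing genes from an analysis set can change which genes are called significant.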

5) Core CLI

babappai run --alignment aln.fasta --tree tree.nwk --outdir results
babappai run --alignment aln.fasta --tree tree.nwk --outdir results --alpha 0.05 --pvalue-mode empirical_monte_carlo --neutral-reps 200 --sigma-floor 0.05
babappai model fetch
babappai model status
babappai model verify
babappai doctor
babappai version
babappai version --ceii-asset babappai/data/ceii_calibration_v1.json

Significance and calibration options

  • --alpha (default 0.05)
  • --pvalue-mode (empirical_monte_carlo default; frozen_reference legacy fallback)
  • --neutral-reps (Monte Carlo neutral replicates)
  • --min-neutral-group-size
  • --sigma-floor
  • --retain-eii-bands / --no-retain-eii-bands
  • --report-threshold-bands / --no-report-threshold-bands
  • --ceii-enabled / --no-ceii-enabled
  • --ceii-asset

6) Validation workflows

babappai validate orthogroups select --input ORTHOGROUP_DIR --outdir selection_out
babappai validate orthogroups run --input selection_out --outdir empirical_out --alpha 0.05 --pvalue-mode empirical_monte_carlo

babappai validate synthetic run \
  --simulator scripts/simulator.py \
  --outdir synthetic_out \
  --alpha 0.05 \
  --pvalue-mode empirical_monte_carlo

babappai validate report --input validation_root --outdir report_out

Full-pipeline manuscript validation helper:

python scripts/run_full_pipeline_validation.py \
  --outdir results/validation/full_pipeline_v2 \
  --n_per_regime 100 \
  --n_replicates_per_scenario 5 \
  --bootstrap_reps 1000 \
  --alpha 0.05 \
  --pvalue_mode empirical_monte_carlo \
  --neutral_reps 200 \
  --sigma_floor 0.05 \
  --seed 123

cEII benchmark + calibration helper:

python scripts/run_ceii_calibration_benchmark.py \
  --outdir results/validation/ceii_benchmark_v1 \
  --n-per-regime 12 \
  --n-replicates-per-scenario 2 \
  --pvalue-mode frozen_reference \
  --bootstrap-reps 150 \
  --write-package-asset

7) Output files

babappai run writes:

  • results.json
  • branch_summary.tsv
  • site_summary.tsv
  • neutral_calibration_replicates.tsv
  • interpretation.txt
  • run_metadata.json
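
A short post-run check can confirm that an output directory contains the files listed above before downstream analysis. The sketch below uses only the file names from this list; the helper names are hypothetical, and no assumption is made about the internal schema of results.json beyond it being valid JSON.

```python
import json
from pathlib import Path

# File names taken from the list of `babappai run` outputs.
EXPECTED_OUTPUTS = [
    "results.json",
    "branch_summary.tsv",
    "site_summary.tsv",
    "neutral_calibration_replicates.tsv",
    "interpretation.txt",
    "run_metadata.json",
]


def check_outputs(outdir):
    """Map each expected file name to whether it exists in outdir."""
    outdir = Path(outdir)
    return {name: (outdir / name).is_file() for name in EXPECTED_OUTPUTS}


def missing_outputs(outdir):
    """List the expected files that are absent from outdir."""
    return [name for name, ok in check_outputs(outdir).items() if not ok]


def load_results(outdir):
    """Parse results.json; the key layout is not assumed here."""
    with open(Path(outdir) / "results.json", encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `missing_outputs("demo_out")` returns an empty list when the quickstart run completed and wrote every file.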

Validation runs add calibration/significance tables, bootstrap summaries, figures, and manuscript-facing markdown blocks.

8) Reproducibility and provenance

  • neutral replicate distributions used for p_emp are written to disk
  • run metadata includes software version, model DOI/SHA, calibration settings, and command provenance
  • BABAPPAi remains the renamed continuation of BABAPPAΩ; legacy model assets are explicitly labeled as such

9) Citation and interpretation guardrails

  • cite BABAPPAi software version
  • cite legacy model DOI while legacy frozen assets are used
  • interpret significance as excess dispersion relative to matched neutral calibration, not as proof of adaptation

10) Development

pip install -e .[test]
pip install ruff build twine
ruff check .
pytest
python -m build --sdist --wheel
python -m twine check dist/*

11) Reproducibility and project policy

  • large generated outputs policy: docs/reproducibility_artifacts.md
  • citation metadata: CITATION.cff
  • contribution guidelines: CONTRIBUTING.md
  • code of conduct: CODE_OF_CONDUCT.md
  • security reporting: SECURITY.md
  • maintainer release process: RELEASE_CHECKLIST.md

Release files (babappai 2.1.0)

  • Source distribution: babappai-2.1.0.tar.gz (89.1 kB)
      SHA256 32c73474a13a53cb08ae162a987d1270313999cac816e4754553b6f22194c6cb
      MD5 92ed8a008b3574090d04282af52fdeac
      BLAKE2b-256 26124eaf891df855a706515374c75224636c3cbccecd7b342ddf501a79b50a08
  • Built distribution: babappai-2.1.0-py3-none-any.whl (94.5 kB, Python 3)
      SHA256 bcf2aad6f2ecd76c9b6a07189f4077f58b3d41f1e0e744b77bbc879dee8b911e
      MD5 79a2910709363becd12bd7448c243575
      BLAKE2b-256 932cc006e2111d4afe8d7ffa85f517b684de604e656d900e745e9f9285c7cada
  • Both files were uploaded using Trusted Publishing (twine/6.1.0, CPython/3.13.7), with attestation bundles from publisher publish.yml on sinhakrishnendu/BABAPPAi