BABAPPAi: diagnostic framework for identifiability of episodic branch-site structure

BABAPPAi (babappai)

License: MIT

BABAPPAi is the renamed continuation of the BABAPPAΩ codebase. It is a diagnostic framework for branch-site recoverability/identifiability under matched neutral calibration.

1) Scope

BABAPPAi reports:

  • raw dispersion diagnostics: eii_z_raw, eii_01_raw
  • empirically calibrated identifiability probabilities: ceii_gene, ceii_site
  • empirical matched-neutral significance: p_emp, q_emp, significant_bool

BABAPPAi does not perform classical dN/dS likelihood-ratio testing, and significance does not prove adaptive substitution.

2) Installation

pip install babappai

3) Quickstart

babappai model fetch
babappai example write --outdir demo
babappai run --alignment demo/aln.fasta --tree demo/tree.nwk --outdir demo_out

4) Statistical outputs

For each gene-level run:

  • D_obs: observed dispersion statistic, the sample variance (ddof=1) of site_logit_mean across codon sites
  • mu0: matched neutral mean
  • sigma0_raw: raw neutral SD
  • sigma0_final: SD after floor application
  • eii_z_raw = (D_obs - mu0) / sigma0_final
  • eii_01_raw = sigmoid(eii_z_raw) (diagnostic magnitude scale)
  • ceii_gene: calibrated P(I_gene=1 | data)
  • ceii_site: calibrated P(I_site=1 | data)
  • ceii_gene_class, ceii_site_class: calibration-derived decision bands
  • ceii_ci: bootstrap calibration interval (if calibration asset provides it)
  • calibration_version, domain_shift_or_applicability
  • p_emp = (1 + count(D0 >= D_obs)) / (M + 1) from matched neutral replicates
  • q_emp: BH-adjusted p_emp across tested genes in an analysis set
  • significant_bool = (q_emp <= alpha) (default alpha=0.05)
  • significance_label ∈ {not_significant, significant}
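
As an illustration only, the gene-level formulas above can be sketched in plain Python. The function name gene_level_stats is hypothetical and not part of the babappai API. Two details are assumptions rather than documented behaviour: that mu0 and sigma0_raw are the mean and sample SD of the neutral replicate statistics D0, and that the floor is applied as max(sigma0_raw, sigma_floor).

```python
import math
import statistics


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gene_level_stats(site_logit_mean, neutral_d0, sigma_floor=0.05):
    """Hypothetical helper mirroring the documented formulas (not the package API)."""
    # D_obs: sample variance (ddof=1) of the per-site values.
    d_obs = statistics.variance(site_logit_mean)
    # ASSUMPTION: mu0 and sigma0_raw are the mean and sample SD of the
    # matched neutral replicate statistics D0.
    mu0 = statistics.fmean(neutral_d0)
    sigma0_raw = statistics.stdev(neutral_d0)
    # ASSUMPTION: the floor is applied as a simple lower bound on the SD.
    sigma0_final = max(sigma0_raw, sigma_floor)
    eii_z_raw = (d_obs - mu0) / sigma0_final
    eii_01_raw = sigmoid(eii_z_raw)
    # p_emp = (1 + count(D0 >= D_obs)) / (M + 1), as documented.
    m = len(neutral_d0)
    p_emp = (1 + sum(d0 >= d_obs for d0 in neutral_d0)) / (m + 1)
    return {
        "D_obs": d_obs,
        "mu0": mu0,
        "sigma0_raw": sigma0_raw,
        "sigma0_final": sigma0_final,
        "eii_z_raw": eii_z_raw,
        "eii_01_raw": eii_01_raw,
        "p_emp": p_emp,
    }
```

Because of the +1 terms, p_emp can never be smaller than 1 / (M + 1), so the number of neutral replicates bounds the smallest attainable empirical p-value.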

ceii_* and q_emp are intentionally distinct layers:

  • ceii_*: recoverability/identifiability probability calibration
  • q_emp: excess-dispersion significance under matched-neutral calibration
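
For the significance layer, a minimal sketch of the standard Benjamini-Hochberg step-up adjustment shows how q_emp and significant_bool could be derived from a set of p_emp values. The function names are hypothetical, and the package's own implementation may differ in details such as tie handling.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted in ascending order.
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        q[i] = running_min
    return q


def label_significance(p_emp, alpha=0.05):
    """Attach q_emp, significant_bool and significance_label to each gene."""
    rows = []
    for p, q in zip(p_emp, bh_adjust(p_emp)):
        significant = q <= alpha
        rows.append({
            "p_emp": p,
            "q_emp": q,
            "significant_bool": significant,
            "significance_label": "significant" if significant else "not_significant",
        })
    return rows
```

Since q_emp depends on every p_emp in the analysis set, the same gene can change label when the set of tested genes changes, whereas ceii_* values are computed per gene.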

5) Core CLI

babappai run --alignment aln.fasta --tree tree.nwk --outdir results
babappai run --alignment aln.fasta --tree tree.nwk --outdir results --alpha 0.05 --pvalue-mode empirical_monte_carlo --neutral-reps 200 --sigma-floor 0.05
babappai model fetch
babappai model status
babappai model verify
babappai doctor
babappai version
babappai version --ceii-asset babappai/data/ceii_calibration_v1.json

Significance- and calibration-related options

  • --alpha (default 0.05)
  • --pvalue-mode (empirical_monte_carlo default; frozen_reference legacy fallback)
  • --neutral-reps (Monte Carlo neutral replicates)
  • --min-neutral-group-size
  • --sigma-floor
  • --retain-eii-bands / --no-retain-eii-bands
  • --report-threshold-bands / --no-report-threshold-bands
  • --ceii-enabled / --no-ceii-enabled
  • --ceii-asset
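
When many genes are run from a script, it can help to assemble the command line programmatically. The sketch below only builds an argument list from the flags documented above; build_run_command is a hypothetical helper, not part of babappai, and it does not invoke the tool itself.

```python
PVALUE_MODES = ("empirical_monte_carlo", "frozen_reference")


def build_run_command(alignment, tree, outdir, alpha=None, pvalue_mode=None,
                      neutral_reps=None, sigma_floor=None, ceii_enabled=None):
    """Return an argv list for `babappai run` using documented flags only."""
    cmd = ["babappai", "run",
           "--alignment", str(alignment),
           "--tree", str(tree),
           "--outdir", str(outdir)]
    if alpha is not None:
        cmd += ["--alpha", str(alpha)]
    if pvalue_mode is not None:
        if pvalue_mode not in PVALUE_MODES:
            raise ValueError(f"unknown pvalue mode: {pvalue_mode!r}")
        cmd += ["--pvalue-mode", pvalue_mode]
    if neutral_reps is not None:
        cmd += ["--neutral-reps", str(neutral_reps)]
    if sigma_floor is not None:
        cmd += ["--sigma-floor", str(sigma_floor)]
    if ceii_enabled is not None:
        # Boolean flag pair: --ceii-enabled / --no-ceii-enabled.
        cmd.append("--ceii-enabled" if ceii_enabled else "--no-ceii-enabled")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True). Leaving an argument as None omits the flag, so the tool's own default applies.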

6) Validation workflows

babappai validate orthogroups select --input ORTHOGROUP_DIR --outdir selection_out
babappai validate orthogroups run --input selection_out --outdir empirical_out --alpha 0.05 --pvalue-mode empirical_monte_carlo

babappai validate synthetic run \
  --simulator scripts/simulator.py \
  --outdir synthetic_out \
  --alpha 0.05 \
  --pvalue-mode empirical_monte_carlo

babappai validate report --input validation_root --outdir report_out

Full-pipeline manuscript validation helper:

python scripts/run_full_pipeline_validation.py \
  --outdir results/validation/full_pipeline_v2 \
  --n_per_regime 100 \
  --n_replicates_per_scenario 5 \
  --bootstrap_reps 1000 \
  --alpha 0.05 \
  --pvalue_mode empirical_monte_carlo \
  --neutral_reps 200 \
  --sigma_floor 0.05 \
  --seed 123

cEII benchmark + calibration helper:

python scripts/run_ceii_calibration_benchmark.py \
  --outdir results/validation/ceii_benchmark_v1 \
  --n-per-regime 12 \
  --n-replicates-per-scenario 2 \
  --pvalue-mode frozen_reference \
  --bootstrap-reps 150 \
  --write-package-asset

7) Output files

babappai run writes:

  • results.json
  • branch_summary.tsv
  • site_summary.tsv
  • neutral_calibration_replicates.tsv
  • interpretation.txt
  • run_metadata.json
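
A quick post-run check can confirm that an output directory contains the files listed above. This is a sketch with a hypothetical helper name; it assumes only the file names given in this section and makes no assumption about their contents.

```python
from pathlib import Path

# File names as listed for `babappai run` in this README.
EXPECTED_OUTPUTS = (
    "results.json",
    "branch_summary.tsv",
    "site_summary.tsv",
    "neutral_calibration_replicates.tsv",
    "interpretation.txt",
    "run_metadata.json",
)


def missing_outputs(outdir):
    """Return the expected output files that are absent from outdir."""
    outdir = Path(outdir)
    return [name for name in EXPECTED_OUTPUTS if not (outdir / name).is_file()]
```

An empty list means every listed file is present, for example missing_outputs("demo_out") after the quickstart run.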

Validation runs add calibration/significance tables, bootstrap summaries, figures, and manuscript-facing markdown blocks.

8) Reproducibility and provenance

  • neutral replicate distributions used for p_emp are written to disk
  • run metadata includes software version, model DOI/SHA, calibration settings, and command provenance
  • BABAPPAi remains the renamed continuation of BABAPPAΩ; legacy model assets are explicitly labeled as such
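
Because the neutral replicate distribution is written to disk, a reported p_emp can in principle be re-derived independently. The sketch below assumes a tab-separated file with one row per replicate and a header row. The column name "D0" is a placeholder assumption, since the actual schema of neutral_calibration_replicates.tsv is not described here; check the file header and pass the real column name.

```python
import csv


def p_emp_from_replicates(tsv_path, d_obs, column="D0"):
    """Recompute p_emp = (1 + count(D0 >= D_obs)) / (M + 1) from a TSV file.

    `column` is a HYPOTHETICAL column name; inspect the file header first.
    """
    with open(tsv_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        d0 = [float(row[column]) for row in reader]
    if not d0:
        raise ValueError("no neutral replicates found")
    exceed = sum(value >= d_obs for value in d0)
    return (1 + exceed) / (len(d0) + 1)
```

Comparing the recomputed value against the p_emp stored in the run outputs gives a simple audit of the significance layer.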

9) Citation and interpretation guardrails

  • cite BABAPPAi software version
  • cite legacy model DOI while legacy frozen assets are used
  • interpret significance as excess dispersion relative to matched neutral calibration, not as proof of adaptation

10) Development

pip install -e ".[test]"
pip install ruff build twine
ruff check .
pytest
python -m build --sdist --wheel
python -m twine check dist/*

11) Reproducibility and project policy

  • large generated outputs policy: docs/reproducibility_artifacts.md
  • citation metadata: CITATION.cff
  • contribution guidelines: CONTRIBUTING.md
  • code of conduct: CODE_OF_CONDUCT.md
  • security reporting: SECURITY.md
  • maintainer release process: RELEASE_CHECKLIST.md
