BABAPPAi: diagnostic framework for identifiability of episodic branch-site structure
BABAPPAi (babappai)
BABAPPAi is the renamed continuation of the BABAPPAΩ codebase. It is a diagnostic framework for branch-site recoverability/identifiability under matched neutral calibration.
1) Scope
BABAPPAi reports:
- raw dispersion diagnostics: eii_z_raw, eii_01_raw
- empirically calibrated identifiability probabilities: ceii_gene, ceii_site
- empirical matched-neutral significance: p_emp, q_emp, significant_bool
BABAPPAi does not perform classical dN/dS likelihood-ratio testing, and significance does not prove adaptive substitution.
2) Installation
pip install babappai
3) Quickstart
babappai model fetch
babappai example write --outdir demo
babappai run --alignment demo/aln.fasta --tree demo/tree.nwk --outdir demo_out
4) Statistical outputs
For each gene-level run:
- D_obs: observed dispersion statistic, the sample variance (ddof=1) of site-level site_logit_mean across codon sites
- mu0: matched neutral mean
- sigma0_raw: raw neutral SD
- sigma0_final: SD after floor application
- eii_z_raw = (D_obs - mu0) / sigma0_final
- eii_01_raw = sigmoid(eii_z_raw) (diagnostic magnitude scale)
- ceii_gene: calibrated P(I_gene=1 | data)
- ceii_site: calibrated P(I_site=1 | data)
- ceii_gene_class, ceii_site_class: calibration-derived decision bands
- ceii_ci: bootstrap calibration interval (if calibration asset provides it)
- calibration_version, domain_shift_or_applicability
- p_emp = (1 + count(D0 >= D_obs)) / (M + 1) from matched neutral replicates
- q_emp: BH-adjusted p_emp across tested genes in an analysis set
- significant_bool = (q_emp <= alpha) (default alpha=0.05)
- significance_label ∈ {not_significant, significant}
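The raw dispersion formulas above can be sketched in a few lines of Python. This is an illustration of the stated definitions, not the package's implementation: the function names are hypothetical, the floor is assumed to be a simple lower bound on the neutral SD, and the 0.05 default is only the value shown in the example command.

```python
import math
import statistics


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def eii_raw(site_logit_mean, mu0, sigma0_raw, sigma_floor=0.05):
    """Return (D_obs, eii_z_raw, eii_01_raw) from per-site values and neutral moments.

    Hypothetical helper: names and the floor rule are assumptions for illustration.
    """
    # D_obs: sample variance (ddof=1) of the per-site logit means.
    d_obs = statistics.variance(site_logit_mean)
    # Assumption: "floor application" means sigma0_final = max(sigma0_raw, floor).
    sigma0_final = max(sigma0_raw, sigma_floor)
    eii_z = (d_obs - mu0) / sigma0_final
    return d_obs, eii_z, sigmoid(eii_z)


d_obs, eii_z, eii_01 = eii_raw([0.0, 1.0, 2.0], mu0=0.5, sigma0_raw=0.01)
print(d_obs, eii_z, eii_01)
```

The floor keeps eii_z_raw from exploding when the matched neutral replicates happen to have almost no spread.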
ceii_* and q_emp are intentionally distinct layers:
- ceii_*: recoverability/identifiability probability calibration
- q_emp: excess-dispersion significance under matched-neutral calibration
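To make the significance layer concrete, the following is a minimal sketch of Benjamini-Hochberg adjustment applied to a set of p_emp values, followed by the q_emp <= alpha decision. The helper names are hypothetical and the package may organize this step differently.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        q[i] = running_min
    return q


def significant(q_emp, alpha=0.05):
    """significant_bool = (q_emp <= alpha) for each gene."""
    return [q <= alpha for q in q_emp]


p_emp = [0.01, 0.04, 0.03, 0.5]
q_emp = bh_adjust(p_emp)
print(q_emp, significant(q_emp))
```

Because q_emp depends on every gene in the analysis set, the same gene can change significance status when the set of tested genes changes.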
5) Core CLI
babappai run --alignment aln.fasta --tree tree.nwk --outdir results
babappai run --alignment aln.fasta --tree tree.nwk --outdir results --alpha 0.05 --pvalue-mode empirical_monte_carlo --neutral-reps 200 --sigma-floor 0.05
babappai model fetch
babappai model status
babappai model verify
babappai doctor
babappai version
babappai version --ceii-asset babappai/data/ceii_calibration_v1.json
Significance-related options
- --alpha (default 0.05)
- --pvalue-mode (empirical_monte_carlo default; frozen_reference legacy fallback)
- --neutral-reps (Monte Carlo neutral replicates)
- --min-neutral-group-size
- --sigma-floor
- --retain-eii-bands / --no-retain-eii-bands
- --report-threshold-bands / --no-report-threshold-bands
- --ceii-enabled / --no-ceii-enabled
- --ceii-asset
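When runs are launched from a script, it can help to assemble the command in one place. The sketch below builds an argument list for babappai run using only flags listed in this README. The function name is hypothetical, and the values for --neutral-reps and --sigma-floor are copied from the example command rather than confirmed defaults.

```python
import shlex


def build_run_command(alignment, tree, outdir, alpha=0.05,
                      pvalue_mode="empirical_monte_carlo",
                      neutral_reps=200, sigma_floor=0.05):
    """Assemble the argv list for `babappai run` (hypothetical helper)."""
    # Only the two modes named in the README are accepted here.
    if pvalue_mode not in ("empirical_monte_carlo", "frozen_reference"):
        raise ValueError(f"unknown p-value mode: {pvalue_mode}")
    return [
        "babappai", "run",
        "--alignment", str(alignment),
        "--tree", str(tree),
        "--outdir", str(outdir),
        "--alpha", str(alpha),
        "--pvalue-mode", pvalue_mode,
        "--neutral-reps", str(neutral_reps),
        "--sigma-floor", str(sigma_floor),
    ]


cmd = build_run_command("aln.fasta", "tree.nwk", "results")
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True), which requires babappai on PATH.
```

Passing the command as a list avoids shell quoting problems when paths contain spaces.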
6) Validation workflows
babappai validate orthogroups select --input ORTHOGROUP_DIR --outdir selection_out
babappai validate orthogroups run --input selection_out --outdir empirical_out --alpha 0.05 --pvalue-mode empirical_monte_carlo
babappai validate synthetic run \
--simulator scripts/simulator.py \
--outdir synthetic_out \
--alpha 0.05 \
--pvalue-mode empirical_monte_carlo
babappai validate report --input validation_root --outdir report_out
Full-pipeline manuscript validation helper:
python scripts/run_full_pipeline_validation.py \
--outdir results/validation/full_pipeline_v2 \
--n_per_regime 100 \
--n_replicates_per_scenario 5 \
--bootstrap_reps 1000 \
--alpha 0.05 \
--pvalue_mode empirical_monte_carlo \
--neutral_reps 200 \
--sigma_floor 0.05 \
--seed 123
cEII benchmark + calibration helper:
python scripts/run_ceii_calibration_benchmark.py \
--outdir results/validation/ceii_benchmark_v1 \
--n-per-regime 12 \
--n-replicates-per-scenario 2 \
--pvalue-mode frozen_reference \
--bootstrap-reps 150 \
--write-package-asset
7) Output files
babappai run writes:
- results.json
- branch_summary.tsv
- site_summary.tsv
- neutral_calibration_replicates.tsv
- interpretation.txt
- run_metadata.json
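A short sketch of pulling headline values out of a finished run is shown here. The file name results.json comes from the list above, but its schema is not documented in this README. The sketch assumes the statistical output names appear as top-level keys of a JSON object, which should be checked against a real output file.

```python
import json
from pathlib import Path

# Field names taken from the statistical outputs in this README.
# Assumption: results.json stores them as top-level keys.
FIELDS = ("eii_z_raw", "eii_01_raw", "ceii_gene", "p_emp", "q_emp", "significant_bool")


def summarize_results(outdir):
    """Return the listed fields from <outdir>/results.json (missing ones are None)."""
    text = Path(outdir, "results.json").read_text(encoding="utf-8")
    data = json.loads(text)
    return {name: data.get(name) for name in FIELDS}
```

For example, summarize_results("demo_out") would read the output directory used in the quickstart.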
Validation runs add calibration/significance tables, bootstrap summaries, figures, and manuscript-facing markdown blocks.
8) Reproducibility and provenance
- neutral replicate distributions used for p_emp are written to disk
- run metadata includes software version, model DOI/SHA, calibration settings, and command provenance
- BABAPPAi remains the renamed continuation of BABAPPAΩ; legacy model assets are explicitly labeled as such
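Because the neutral replicates are saved, p_emp can in principle be rechecked by hand. The sketch below applies the p_emp formula from the statistical outputs to one column of a tab-separated table such as neutral_calibration_replicates.tsv. The column name D0 is hypothetical, since the file's layout is not described in this README.

```python
import csv
import io


def empirical_p(d_obs, neutral_d0):
    """p_emp = (1 + count(D0 >= D_obs)) / (M + 1) over M neutral replicates."""
    m = len(neutral_d0)
    exceed = sum(1 for d0 in neutral_d0 if d0 >= d_obs)
    return (1 + exceed) / (m + 1)


def read_neutral_column(handle, column="D0"):
    """Read one numeric column from a tab-separated table.

    The default column name is a placeholder, not the documented header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [float(row[column]) for row in reader]


# Toy table standing in for the replicate file.
toy = io.StringIO("D0\n0.1\n0.2\n0.3\n0.4\n")
neutral = read_neutral_column(toy)
print(empirical_p(0.25, neutral))
```

The added 1 in the numerator and denominator means p_emp can never fall below 1 / (M + 1), so the number of neutral replicates bounds the smallest attainable p-value.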
9) Citation and interpretation guardrails
- cite BABAPPAi software version
- cite legacy model DOI while legacy frozen assets are used
- interpret significance as excess dispersion relative to matched neutral calibration, not as proof of adaptation
10) Development
pip install -e .[test]
pip install ruff build twine
ruff check .
pytest
python -m build --sdist --wheel
python -m twine check dist/*
11) Reproducibility and project policy
- large generated outputs policy: docs/reproducibility_artifacts.md
- citation metadata: CITATION.cff
- contribution guidelines: CONTRIBUTING.md
- code of conduct: CODE_OF_CONDUCT.md
- security reporting: SECURITY.md
- maintainer release process: RELEASE_CHECKLIST.md