BABAPPAi: diagnostic framework for identifiability of episodic branch-site structure
BABAPPAi (babappai)
BABAPPAi is the software framework built around the canonical frozen BABAPPAΩ model.
- BABAPPAΩ: fixed neural inference artifact (weights are immutable in this project).
- BABAPPAi: operational package around that model (raw EII, matched-neutral significance, applicability/abstention, optional cEII calibration, reporting, and packaging).
0) Final Submission Folder
For manuscript submission handoff, use the consolidated folder:
final/
It contains the cleaned main manuscript files, supplementary files, cover letter, reviewer/metadata files, and packaged zip assembled for NAR Genomics and Bioinformatics submission.
1) Scope
BABAPPAi reports:
- raw dispersion diagnostics: eii_z_raw, eii_01_raw
- empirically calibrated identifiability probabilities (conditional): ceii_gene, ceii_site
- empirical matched-neutral significance: p_emp, q_emp, significant_bool
Important interpretation contract:
- raw EII and matched-neutral significance are universally reportable outputs
- cEII is an auxiliary post-inference calibration layer and may be withheld (null) under inapplicable/unstable regimes
- calibration updates do not imply model-weight changes
BABAPPAi does not perform classical dN/dS likelihood-ratio testing, and significance does not prove adaptive substitution.
2) Installation
pip install babappai
3) Quickstart
babappai model fetch
babappai example write --outdir demo
babappai run --alignment demo/aln.fasta --tree demo/tree.nwk --outdir demo_out
4) Statistical outputs
For each gene-level run:
- D_obs: observed dispersion statistic, the sample variance (ddof=1) of site-level site_logit_mean across codon sites
- mu0: matched neutral mean
- sigma0_raw: raw neutral SD
- sigma0_final: SD after floor application
- eii_z_raw = (D_obs - mu0) / sigma0_final
- eii_01_raw = sigmoid(eii_z_raw) (diagnostic magnitude scale)
- ceii_gene: calibrated P(I_gene=1 | data)
- ceii_site: calibrated P(I_site=1 | data)
- ceii_gene_class, ceii_site_class: calibration-derived decision bands
- ceii_ci: bootstrap calibration interval (if calibration asset provides it)
- calibration_version, domain_shift_or_applicability
- p_emp = (1 + count(D0 >= D_obs)) / (M + 1) from matched neutral replicates
- q_emp: BH-adjusted p_emp across tested genes in an analysis set
- significant_bool = (q_emp <= alpha) (default alpha=0.05)
- significance_label ∈ {not_significant, significant}
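As a minimal sketch of how the raw quantities listed above relate to one another, the following Python recomputes them from a vector of site-level values and a set of matched-neutral dispersion replicates. The function name, its arguments, the use of the replicate mean and sample SD for mu0 and sigma0_raw, and the max() form of the sigma floor are assumptions made for illustration; the package's internal implementation may differ.

```python
import math
import statistics


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def raw_eii_sketch(site_logit_mean, neutral_d0, sigma_floor=0.05):
    """Illustrative recomputation of the raw EII quantities (hypothetical helper)."""
    # D_obs: sample variance (ddof=1) of the site-level values.
    d_obs = statistics.variance(site_logit_mean)
    # Assumption: mu0 and sigma0_raw are the mean and sample SD of the replicates.
    mu0 = statistics.fmean(neutral_d0)
    sigma0_raw = statistics.stdev(neutral_d0)
    # Assumption: the floor is applied as a simple lower bound on the SD.
    sigma0_final = max(sigma0_raw, sigma_floor)
    eii_z_raw = (d_obs - mu0) / sigma0_final
    # p_emp = (1 + count(D0 >= D_obs)) / (M + 1)
    m = len(neutral_d0)
    p_emp = (1 + sum(d0 >= d_obs for d0 in neutral_d0)) / (m + 1)
    return {
        "D_obs": d_obs,
        "mu0": mu0,
        "sigma0_raw": sigma0_raw,
        "sigma0_final": sigma0_final,
        "eii_z_raw": eii_z_raw,
        "eii_01_raw": _sigmoid(eii_z_raw),
        "p_emp": p_emp,
    }
```

With M neutral replicates the smallest attainable p_emp is 1 / (M + 1), so 200 replicates cannot yield a value below about 0.005.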
ceii_* and q_emp are intentionally distinct layers:
- ceii_*: recoverability/identifiability probability calibration
- q_emp: excess-dispersion significance under matched-neutral calibration
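Since q_emp is defined as the BH-adjusted p_emp across the tested genes, a generic Benjamini-Hochberg step-up adjustment can be sketched as follows. This is a textbook implementation written for illustration, and the function name is hypothetical; it is not taken from the BABAPPAi code base.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_significant(pvalues, alpha=0.05):
    """significant_bool = (q_emp <= alpha) for each gene in the analysis set."""
    return [q <= alpha for q in bh_adjust(pvalues)]
```

Because the adjustment depends on how many genes are tested together, the same p_emp can map to different q_emp values in different analysis sets.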
ceii_* is conditional and abstention-aware:
- if applicability/null checks fail, ceii_gene and ceii_site are withheld (null)
- this does not invalidate raw EII or matched-neutral significance
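Downstream code therefore needs to treat a null cEII as an abstention rather than as a zero or a failure. The sketch below formats one gene-level record while keeping the raw and significance fields reportable; it assumes a flat mapping keyed by the field names listed above, which is a hypothetical layout and not a documented schema of results.json.

```python
def describe_gene(record):
    """Format one gene-level record; `record` is a hypothetical flat mapping."""
    parts = [
        f"eii_z_raw={record['eii_z_raw']:.3f}",
        f"q_emp={record['q_emp']:.3g}",
        f"significant={record['significant_bool']}",
    ]
    # A JSON null loads as None: cEII was withheld, the other fields still stand.
    ceii = record.get("ceii_gene")
    if ceii is None:
        parts.append("ceii_gene=withheld")
    else:
        parts.append(f"ceii_gene={ceii:.3f}")
    return "; ".join(parts)
```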
5) Core CLI
babappai run --alignment aln.fasta --tree tree.nwk --outdir results
babappai run --alignment aln.fasta --tree tree.nwk --outdir results --alpha 0.05 --pvalue-mode empirical_monte_carlo --neutral-reps 200 --sigma-floor 0.05
babappai model fetch
babappai model status
babappai model verify
babappai doctor
babappai version
babappai version --ceii-asset babappai/data/ceii_calibration_v3_2.json
Significance-related options
- --alpha (default 0.05)
- --pvalue-mode (default empirical_monte_carlo; frozen_reference selects the static reference-table mode)
- --neutral-reps (Monte Carlo neutral replicates)
- --min-neutral-group-size
- --sigma-floor
- --retain-eii-bands / --no-retain-eii-bands
- --report-threshold-bands / --no-report-threshold-bands
- --ceii-enabled / --no-ceii-enabled
- --ceii-asset
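For scripted runs, the documented flags can be assembled into an argument list before the command is handed to a process launcher. The helper below is a hypothetical convenience wrapper, not part of the package; its default values for neutral_reps and sigma_floor are copied from the example invocation above and are not confirmed CLI defaults.

```python
import shlex

# The two p-value modes named in the option list above.
PVALUE_MODES = ("empirical_monte_carlo", "frozen_reference")


def build_run_command(alignment, tree, outdir, alpha=0.05,
                      pvalue_mode="empirical_monte_carlo",
                      neutral_reps=200, sigma_floor=0.05):
    """Return the argv list for a `babappai run` call (hypothetical helper)."""
    if pvalue_mode not in PVALUE_MODES:
        raise ValueError(f"unknown pvalue mode: {pvalue_mode!r}")
    return [
        "babappai", "run",
        "--alignment", str(alignment),
        "--tree", str(tree),
        "--outdir", str(outdir),
        "--alpha", str(alpha),
        "--pvalue-mode", pvalue_mode,
        "--neutral-reps", str(neutral_reps),
        "--sigma-floor", str(sigma_floor),
    ]


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.run to execute it.
    print(shlex.join(build_run_command("aln.fasta", "tree.nwk", "results")))
```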
6) Validation workflows
babappai validate orthogroups select --input ORTHOGROUP_DIR --outdir selection_out
babappai validate orthogroups run --input selection_out --outdir empirical_out --alpha 0.05 --pvalue-mode empirical_monte_carlo
babappai validate synthetic run \
--simulator scripts/simulator.py \
--outdir synthetic_out \
--alpha 0.05 \
--pvalue-mode empirical_monte_carlo
babappai validate report --input validation_root --outdir report_out
Full-pipeline manuscript validation helper:
python scripts/run_full_pipeline_validation.py \
--outdir results/validation/full_pipeline_v2 \
--n_per_regime 100 \
--n_replicates_per_scenario 5 \
--bootstrap_reps 1000 \
--alpha 0.05 \
--pvalue_mode empirical_monte_carlo \
--neutral_reps 200 \
--sigma_floor 0.05 \
--seed 123
cEII benchmark + calibration helper:
python scripts/run_ceii_calibration_benchmark.py \
--outdir results/validation/ceii_benchmark_v1 \
--n-per-regime 12 \
--n-replicates-per-scenario 2 \
--pvalue-mode frozen_reference \
--bootstrap-reps 150 \
--write-package-asset
7) Output files
babappai run writes:
- results.json
- branch_summary.tsv
- site_summary.tsv
- neutral_calibration_replicates.tsv
- interpretation.txt
- run_metadata.json
Validation runs add calibration/significance tables, bootstrap summaries, figures, and manuscript-facing markdown blocks.
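A quick way to confirm that a babappai run completed is to check the output directory for the six files named above. The following sketch does only that; the helper name is hypothetical and it does not inspect file contents.

```python
from pathlib import Path

# File names as listed for `babappai run` in this section.
EXPECTED_OUTPUTS = (
    "results.json",
    "branch_summary.tsv",
    "site_summary.tsv",
    "neutral_calibration_replicates.tsv",
    "interpretation.txt",
    "run_metadata.json",
)


def missing_outputs(outdir):
    """Return the expected output files that are absent from `outdir`."""
    root = Path(outdir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]
```

An empty list from missing_outputs("results") indicates that every expected file is present.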
8) Reproducibility and provenance
- neutral replicate distributions used for p_emp are written to disk
- run metadata includes software version, model DOI/SHA, calibration settings, and command provenance
- model provenance identifies BABAPPAΩ as the canonical frozen inference backbone used by BABAPPAi
9) Citation and interpretation guardrails
- cite BABAPPAi software version
- cite the BABAPPAΩ model DOI as the canonical frozen model artifact
- interpret significance as excess dispersion relative to matched neutral calibration, not as proof of adaptation
10) Development
pip install -e .[test]
pip install ruff build twine
ruff check .
pytest
python -m build --sdist --wheel
python -m twine check dist/*
11) Reproducibility and project policy
- large generated outputs policy: docs/reproducibility_artifacts.md
- citation metadata: CITATION.cff
- contribution guidelines: CONTRIBUTING.md
- code of conduct: CODE_OF_CONDUCT.md
- security reporting: SECURITY.md
- maintainer release process: RELEASE_CHECKLIST.md