Statistical validity auditor for A/B tests — because significant != trustworthy.
abaudit
Statistical Validity Auditor for A/B Tests
A significant p-value answers the wrong question.
abaudit asks: given that the result is significant, how likely is it to actually be real?
Most A/B testing tools tell you whether your result is significant.
abaudit tells you whether you should trust it.
The Problem
Your A/B test returned p = 0.031. The team is ready to ship. But:
- You tested 8 metrics and reported the best one
- Someone peeked at the results on Day 3 and almost stopped the test
- The traffic split is 52/48 instead of 50/50
- Your prior belief that this variant would work was maybe 15%
Given all of that, what is the actual probability this effect is real?
abaudit computes that number.
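To make the first bullet concrete, here is a minimal sketch of how reporting the best of 8 metrics erodes a p-value of 0.031. It assumes the metrics are independent and uses a Bonferroni correction; the README does not say which correction abaudit applies, so the helper names `family_wise_error_rate` and `bonferroni` are illustrative and are not part of the package.

```python
def family_wise_error_rate(alpha: float, n_metrics: int) -> float:
    """Chance of at least one false positive across independent metrics."""
    return 1.0 - (1.0 - alpha) ** n_metrics


def bonferroni(p_value: float, n_metrics: int) -> float:
    """Bonferroni-adjusted p-value, capped at 1 (an assumed correction)."""
    return min(1.0, p_value * n_metrics)


# Eight metrics tested at alpha = 0.05, best raw p-value 0.031.
fwer = family_wise_error_rate(0.05, 8)
adjusted = bonferroni(0.031, 8)

print(f"Chance of a spurious 'win' somewhere: {fwer:.3f}")  # about 0.337
print(f"Bonferroni-adjusted p-value: {adjusted:.3f}")       # 0.248
```

Under these assumptions, roughly one test in three would show a significant metric by chance alone, and the adjusted p-value no longer clears 0.05.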
Quickstart
pip install abaudit
import abaudit as ab
result = ab.audit(
    control=control_data,
    treatment=treatment_data,
    metrics=['conversion', 'revenue', 'time_on_site'],
    primary='conversion',
    prior_f=0.2,  # your belief the effect exists
    alpha=0.05,
    peeking_log=p_value_history,
)
result.summary()
# ┌─────────────────────────────┬────────┬────────┐
# │ Check                       │ Result │ Status │
# ├─────────────────────────────┼────────┼────────┤
# │ p-value                     │ 0.031  │ ✅     │
# │ PPV (prob. effect is real)  │ 0.41   │ ⚠️     │
# │ Sample Ratio Mismatch       │ 0.892  │ ✅     │
# │ Multiple metrics correction │ 0.093  │ ❌     │
# │ Optional stopping           │ 3 peeks│ ⚠️     │
# │ Effect size plausibility    │ d=0.8  │ ⚠️     │
# └─────────────────────────────┴────────┴────────┘
# Bias score: 0.42 / 1.0 ⚠️ Moderate concern
result.report("audit_report.html") # full HTML report
result.ppv # 0.41
result.bias_score # 0.42
result.flags # list of warnings
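The "Optional stopping" row matters because checking for significance repeatedly, and stopping at the first look that clears alpha, inflates the false-positive rate. The sketch below is a standalone simulation using only the standard library, not abaudit's implementation. The function name `simulate_peeking`, the four equally spaced looks, and the unit-variance z-test are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def simulate_peeking(n_sims=2000, n_per_arm=200, looks=4, alpha=0.05, seed=0):
    """Simulate A/A tests (no true effect) under two stopping rules.

    Returns (rate_any_look, rate_final_only): the fraction of simulated
    tests declared significant if we stop at any look, versus only at the end.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    step = n_per_arm // looks
    hits_any_look = 0
    hits_final_only = 0
    for _sim in range(n_sims):
        sum_a = sum_b = 0.0
        significant_at_some_look = False
        z = 0.0
        for look in range(1, looks + 1):
            for _obs in range(step):
                sum_a += rng.gauss(0.0, 1.0)
                sum_b += rng.gauss(0.0, 1.0)
            n = look * step
            # z-statistic for a difference in means with known unit variance
            z = (sum_a / n - sum_b / n) / (2.0 / n) ** 0.5
            if abs(z) > z_crit:
                significant_at_some_look = True
        hits_any_look += significant_at_some_look
        hits_final_only += abs(z) > z_crit  # z now holds the final look
    return hits_any_look / n_sims, hits_final_only / n_sims


rate_any_look, rate_final_only = simulate_peeking()
print(f"False-positive rate, stop at any look: {rate_any_look:.3f}")
print(f"False-positive rate, final look only:  {rate_final_only:.3f}")
```

With these settings the any-look rate should land well above the nominal 5%, while the final-look-only rate should stay close to it.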
What abaudit Checks
| Module | Check | Answers |
|---|---|---|
| validity | PPV (Ioannidis 2005) | Given the significant result, what's the probability it's real? |
| validity | Multiple metric correction | You tested 8 things; what's the corrected p-value for the best one? |
| validity | Effect size plausibility | Is the reported effect size realistic or suspiciously large? |
| validity | Benford's Law | Do the summary statistics look fabricated? |
| runtime | Sample Ratio Mismatch | Was traffic split as intended? |
| runtime | Optional stopping | Was the test stopped early after peeking? |
| design | PPV-aware power analysis | Given your prior, how large does n need to be for results to be trustworthy? |
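As an illustration of the Sample Ratio Mismatch check, a common approach is a chi-square goodness-of-fit test of the observed arm sizes against the intended split. The sketch below assumes that approach; the function name `srm_p_value` and its parameters are hypothetical, and abaudit's own check may differ. With one degree of freedom, the chi-square tail probability reduces to `erfc(sqrt(x / 2))`, so only the standard library is needed.

```python
import math


def srm_p_value(n_control: int, n_treatment: int, expected_ratio: float = 0.5) -> float:
    """p-value for the observed split under the intended allocation.

    expected_ratio is the intended share of traffic in the control arm.
    """
    total = n_control + n_treatment
    expected_control = total * expected_ratio
    expected_treatment = total * (1.0 - expected_ratio)
    chi2 = (
        (n_control - expected_control) ** 2 / expected_control
        + (n_treatment - expected_treatment) ** 2 / expected_treatment
    )
    # Survival function of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


# The same 52/48 split at two sample sizes
print(srm_p_value(520, 480))    # small test: compatible with chance
print(srm_p_value(5200, 4800))  # larger test: very unlikely under a true 50/50 split
```

The same 52/48 ratio is unremarkable with 1,000 users and a strong warning sign with 10,000, which is why the check works on counts and not on percentages.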
Statistical Foundation
The core of abaudit is the Positive Predictive Value framework from:
Ioannidis, J.P.A. (2005). Why Most Published Research Findings Are False.
PLOS Medicine 2(8): e124.
$$\text{PPV} = \frac{(1-\beta) \cdot f}{(1-\beta) \cdot f + \alpha \cdot (1-f)}$$
Where $f$ is your prior probability that the effect exists, $1-\beta$ is your test's power, and $\alpha$ is the significance threshold. This is exactly Bayes' rule applied to hypothesis testing.
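The formula translates directly into a few lines of Python. The sketch below is not abaudit's code: `ppv` and `required_power` are illustrative names, and it implements only the plain formula above, with no adjustment for peeking or multiple metrics. `required_power` rearranges the formula to ask what power a target PPV would demand for a given prior.

```python
from typing import Optional


def ppv(prior: float, power: float, alpha: float = 0.05) -> float:
    """Positive predictive value: P(effect is real | significant result)."""
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def required_power(prior: float, target_ppv: float, alpha: float = 0.05) -> Optional[float]:
    """Power needed to reach target_ppv, or None if it would exceed 1."""
    # Solving PPV = target for (1 - beta):
    #   (1 - beta) = target * alpha * (1 - f) / ((1 - target) * f)
    needed = target_ppv * alpha * (1.0 - prior) / ((1.0 - target_ppv) * prior)
    return needed if needed <= 1.0 else None


print(round(ppv(prior=0.2, power=0.8), 3))    # 0.8
print(round(ppv(prior=0.15, power=0.8), 3))   # 0.738
print(required_power(prior=0.15, target_ppv=0.95))  # None: unreachable at alpha = 0.05
```

Even a well-powered test with a 15% prior leaves roughly a one-in-four chance that a significant result is a false positive, and no amount of power reaches a PPV of 0.95 at that prior without lowering alpha.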
Development Status
| Phase | Module | Status |
|---|---|---|
| 0 | Scaffold + _stats.py | ✅ Complete |
| 1 | validity.py: core audit | 🔄 In progress |
| 2 | design.py: pre-experiment | ⏳ Planned |
| 3 | runtime.py: health checks | ⏳ Planned |
| 4 | report.py: HTML reports | ⏳ Planned |
Contributing
git clone https://github.com/aldair-ai/abaudit.git
cd abaudit
pip install -e ".[dev]"
pytest
License