Platform-agnostic A/B experiment readout auditor with SRM, peeking, MDE, practical significance, Welch t-test, guardrail, and pre-period balance checks.
TrialCheck
Platform-agnostic A/B experiment readout auditor.
TrialCheck does not run experiments. It audits completed readouts from any experimentation platform, spreadsheet, or warehouse export and returns a structured PASS / WARN / FAIL report.
About
Most experimentation platforms surface a p-value and a lift estimate. That is not enough information to make a trustworthy ship decision.
Before shipping an experiment result, a senior data scientist checks a consistent set of questions: Did assignment work correctly? Was the result called early? Is the effect large enough to matter in practice? Did any guardrail metrics move harmfully? Were the variants balanced before the test started? These checks are well-understood, but they are rarely automated — they live in runbooks, reviewer checklists, or institutional memory.
TrialCheck packages those checks into a single library call. It accepts a structured experiment summary (assignment counts, metric data, optional guardrails and pre-period covariates) and returns a per-check PASS / WARN / FAIL / INSUFFICIENT_INPUT report with recommendations. The result is readable by humans and parseable by machines (JSON, Markdown, HTML output).
The intended use case: a data scientist or analytics lead runs TrialCheck at readout time, reviews the report, and makes a better-informed decision. TrialCheck is decision support — not a decision-maker.
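The roll-up from per-check statuses to one overall status can be sketched as follows. This is a minimal illustration, not TrialCheck's actual code: the `Status` enum and `overall_status` function are hypothetical names, the precedence FAIL > WARN > INSUFFICIENT_INPUT > PASS is taken from the Architecture diagram, and the behaviour for an empty list of checks is an assumption.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical per-check status values (names mirror the report labels)."""
    PASS = "PASS"
    INSUFFICIENT_INPUT = "INSUFFICIENT_INPUT"
    WARN = "WARN"
    FAIL = "FAIL"


# Precedence from the Architecture diagram: FAIL > WARN > INSUFFICIENT_INPUT > PASS.
_SEVERITY = {
    Status.PASS: 0,
    Status.INSUFFICIENT_INPUT: 1,
    Status.WARN: 2,
    Status.FAIL: 3,
}


def overall_status(check_statuses):
    """Return the most severe status among the individual checks."""
    statuses = list(check_statuses)
    if not statuses:
        # Assumption: with no checks run, there is nothing to audit.
        return Status.INSUFFICIENT_INPUT
    return max(statuses, key=_SEVERITY.__getitem__)


if __name__ == "__main__":
    checks = [Status.PASS, Status.INSUFFICIENT_INPUT, Status.WARN]
    print(overall_status(checks).value)
```

Under this ordering a single FAIL dominates any number of passing checks, and a missing input outranks PASS so that an incomplete audit is never reported as clean.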
Architecture
flowchart TD
IN["ExperimentSummary\nassignment counts · metric data\nguardrails · pre-period covariates"]
IN --> SRM
IN --> PMC
IN --> CMC
IN --> PSC
IN --> MDE
IN --> PKG
IN --> GRD
IN --> PPB
SRM["SRM Check\nchi-square df=1\nerfc(sqrt(x/2))"]
PMC["Primary Metric\ntwo-proportion z-test\npooled SE under H0"]
CMC["Continuous Metric\nWelch t-test\nWelch-Satterthwaite dof"]
PSC["Practical Significance\nobserved lift vs\nbusiness threshold"]
MDE["MDE Context\nobserved lift vs\nplanned MDE"]
PKG["Peeking Risk\nduration ratio\n+ interim looks"]
GRD["Guardrail Movement\nbad direction\n+ tolerance"]
PPB["Pre-period Balance\nSMD per covariate\npooled SD"]
SRM --> AGG
PMC --> AGG
CMC --> AGG
PSC --> AGG
MDE --> AGG
PKG --> AGG
GRD --> AGG
PPB --> AGG
AGG["Overall Status\nFAIL > WARN > INSUFFICIENT_INPUT > PASS"]
AGG --> OUT
OUT["TrialReport\nJSON · Markdown · HTML\nexplicit claim boundary"]
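The two metric tests named in the diagram can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not TrialCheck's implementation: the function names and argument order are hypothetical, and the formulas are the textbook two-proportion z-test (pooled standard error under H0) and Welch's t statistic with Welch-Satterthwaite degrees of freedom.

```python
import math


def two_proportion_z_test(x_c, n_c, x_t, n_t):
    """Two-sided z-test for a difference in proportions (hypothetical helper).

    x_* are conversion counts, n_* are sample sizes for control and treatment.
    """
    p_c, p_t = x_c / n_c, x_t / n_t
    pooled = (x_c + x_t) / (n_c + n_t)  # pooled rate under H0: p_c == p_t
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        raise ValueError("pooled standard error is zero; test is undefined")
    z = (p_t - p_c) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def welch_t(mean_c, var_c, n_c, mean_t, var_t, n_t):
    """Welch's t statistic and Welch-Satterthwaite dof (hypothetical helper).

    var_* are sample variances; no equal-variance assumption is made.
    """
    a, b = var_c / n_c, var_t / n_t
    if a + b == 0:
        raise ValueError("both variances are zero; test is undefined")
    t = (mean_t - mean_c) / math.sqrt(a + b)
    dof = (a + b) ** 2 / (a ** 2 / (n_c - 1) + b ** 2 / (n_t - 1))
    return t, dof


if __name__ == "__main__":
    print(two_proportion_z_test(480, 10_000, 540, 10_000))
    print(welch_t(20.0, 25.0, 400, 20.8, 36.0, 380))
```

The sketch stops at the t statistic and degrees of freedom because the Student t distribution's tail probability is not available in the standard library's `math` module; how TrialCheck obtains that p-value is not shown in this README.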
Why this exists
A p-value alone is not enough to ship an experiment. Before acting on a readout, teams should check whether the result is trustworthy and decision-ready:
- Did assignment drift? (SRM — chi-square df=1)
- Was the result called early or monitored repeatedly? (peeking risk)
- Is the lift large enough to matter? (practical significance)
- Is the lift below the planned MDE? (MDE context)
- Is the continuous metric difference statistically significant? (Welch's t-test, no equal-variance assumption)
- Did a guardrail move in the wrong direction? (guardrail movement)
- Were variants balanced before the test? (pre-period covariate balance — SMD)
TrialCheck packages those checks into one lightweight Python library with zero dependencies.
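As an illustration of how two of these checks fit in dependency-free Python, here is a minimal sketch of the SRM chi-square test and the standardized mean difference (SMD). The function names, arguments, and the sample counts are hypothetical and are not TrialCheck's API; the SRM p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom is `erfc(sqrt(x / 2))`.

```python
import math


def srm_chi_square(n_control, n_treatment, expected_control_share=0.5):
    """Chi-square (df=1) test of observed vs. planned assignment split."""
    if not 0 < expected_control_share < 1:
        raise ValueError("expected_control_share must be strictly between 0 and 1")
    total = n_control + n_treatment
    if total <= 0:
        raise ValueError("at least one user must be assigned")
    exp_c = total * expected_control_share
    exp_t = total - exp_c
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # Upper-tail probability of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def standardized_mean_difference(mean_c, sd_c, mean_t, sd_t):
    """SMD for one pre-period covariate, using the pooled standard deviation."""
    pooled_sd = math.sqrt((sd_c ** 2 + sd_t ** 2) / 2)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero; SMD is undefined")
    return (mean_t - mean_c) / pooled_sd


if __name__ == "__main__":
    # Hypothetical counts: a 56/44 split on 10,000 users against a planned 50/50.
    chi2, p = srm_chi_square(4_400, 5_600)
    print(f"chi2={chi2:.1f}, p={p:.3g}")
    print(standardized_mean_difference(10.0, 2.0, 10.5, 2.0))
```

With the hypothetical counts above, each arm is 600 users away from its expected 5,000, so the statistic is 72 + 72 = 144 and the p-value is vanishingly small. Which thresholds TrialCheck applies to turn these numbers into PASS, WARN, or FAIL is not specified here.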
Claim boundary
TrialCheck is an audit helper. It does not:
- run experiments
- assign users
- replace an experimentation platform
- prove causal validity
- perform CUPED or sequential testing
- make automatic ship/no-ship decisions
It surfaces readout risks so a data scientist, experiment owner, or analytics lead can make a better decision.
Install locally
cd trialcheck_v0
python -m pip install -e .
Quickstart
from trialcheck import TrialCheck, write_report
from trialcheck.io import load_experiment_json
experiment = load_experiment_json("sample_data/checkout_experiment_summary.json")
report = TrialCheck(experiment).run()
print(report.overall_status.value)
print(report.interpretation)
write_report(report, "outputs/trialcheck_report.json")
write_report(report, "outputs/trialcheck_report.md")
write_report(report, "outputs/trialcheck_report.html")
Run the demo
cd trialcheck_v0
set -e
python -m pip install -e .
python scripts/generate_demo_reports.py
open outputs/trialcheck_report.html
Run tests
cd trialcheck_v0
set -e
python -m unittest discover -s tests -v
Four canonical scenarios
The demo ships four pre-built scenarios: a clean baseline and three that each exercise a different failure mode:
| Scenario | Overall | What fires |
|---|---|---|
| clean_pass | PASS | All checks green; full data supplied |
| srm_fail | FAIL | 56/44 split observed vs 50/50 planned |
| peeking_warn | WARN | 36% of planned duration, 2 interim looks |
| guardrail_harm | FAIL | Revenue/user drops 4.4%; refund rate doubles |
Run all four with python scripts/generate_demo_reports.py.
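To make the guardrail_harm row concrete, the "bad direction + tolerance" rule from the Architecture diagram could look roughly like the sketch below. The function name, its parameters, and the 2% tolerance are assumptions made for illustration; the actual tolerance and rule used by TrialCheck are not documented in this README.

```python
def guardrail_moved_harmfully(relative_change, higher_is_better, tolerance):
    """Return True if a guardrail moved in its bad direction by more than tolerance.

    relative_change: treatment vs. control change as a fraction (-0.044 == -4.4%).
    higher_is_better: True for metrics like revenue/user, False for refund rate.
    tolerance: largest harmful move (as a fraction) that is still acceptable.
    """
    # A harmful move is a drop for higher-is-better metrics and a rise otherwise.
    harmful_move = -relative_change if higher_is_better else relative_change
    return harmful_move > tolerance


if __name__ == "__main__":
    TOLERANCE = 0.02  # hypothetical 2% tolerance

    # Revenue/user drops 4.4% (higher is better).
    print(guardrail_moved_harmfully(-0.044, True, TOLERANCE))
    # Refund rate doubles, i.e. +100% (lower is better).
    print(guardrail_moved_harmfully(1.0, False, TOLERANCE))
```

Movements in the good direction, or harmful movements inside the tolerance, return False under this rule.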
Public resume-safe claim
Built TrialCheck, a platform-agnostic A/B experiment readout auditor that checks completed experiment summaries for sample-ratio mismatch (chi-square), peeking risk, MDE context, practical and statistical significance (two-proportion z-test and Welch's t-test), guardrail movement, and pre-period covariate imbalance (SMD), producing JSON/Markdown/HTML audit reports with explicit decision caveats. Zero dependencies. 17 tests. 4 canonical demo scenarios.
Roadmap
- richer power/MDE utilities
- CSV batch audit mode
- optional report styling polish
License
MIT