A low-code Python library for enterprise-grade experiment design, classical DoE, and statistical analysis.

xpyrment

xpyrment is an enterprise-grade, low-code Python library designed for experiment design, classical Design of Experiments (DoE), and statistical causal inference.

It provides an elegant, object-oriented fluent API to orchestrate the entire lifecycle of digital experimentation (A/B testing) alongside the rigorous mathematical techniques of modern, enterprise-scale platforms. It features native support for CUPED (variance reduction), ratio metrics via the Delta method, multiple comparison corrections, Sample Ratio Mismatch (SRM) diagnostics, mixture SPRT continuous monitoring (mSPRT), Bayesian inference, and classical industrial DoE design matrices.

🌟 Key Features

Unified Fluent Orchestrator API: Initialize experiments, define metric structures, run statistical evaluations, and compile publication-ready summaries or plots in a clean, state-gated object-oriented pipeline.
Rigorous Variance Reduction (CUPED): Built-in support for standard CUPED (continuous metrics) and Ratio CUPED (numerator and denominator adjustment). Reduces variance and sample size requirements by up to 88% when the pre-period covariate is strongly correlated with the metric.
Ratio Metric Precision: Precise variance estimation of ratio metrics (e.g., CTR, revenue per click) where both numerator and denominator are stochastic, using first-order Taylor expansion (Delta method).
Classical Design of Experiments (DoE): Full and Fractional Factorial, Plackett-Burman, Taguchi Orthogonal Arrays, Definitive Screening Designs (DSD), Response Surface Methodologies (CCD & Box-Behnken), and D-Optimal coordinate exchange.
Continuous Monitoring & Early Stopping: Always-valid confidence intervals and sequential monitoring boundaries via mixture SPRT (mSPRT) and Pocock/O'Brien-Fleming alpha-spending functions.
Experimental Diagnostics: Built-in automated Chi-square tests to detect Sample Ratio Mismatch (SRM), pre-experiment covariate balance validation with Standardized Mean Differences (SMD), and time-series novelty/primacy effect detectors.
Multi-Testing Correction: Guard against Type I error inflation by automatically adjusting p-values for multiple metrics using Holm-Bonferroni, Bonferroni, or Benjamini-Hochberg (FDR).
Multi-Armed Bandits & Adaptive Traffic: Dynamically allocate traffic using Beta-Binomial / Normal-Normal Thompson Sampling, standard/decaying $\varepsilon$-Greedy, and classical UCB1 optimistic exploration. Supports sliding-window and discounted Thompson Sampling for drifting baselines.
Heterogeneous Treatment Effects (HTE): Personalize variant targeting using CATE estimators (S-Learner, T-Learner, and propensity-weighted X-Learner) alongside custom bootstrapped Causal Forests.
Synthetic Controls & Quasi-Experiments: Analyze unrandomized policy deployments using Abadie SLSQP-constrained Synthetic Controls, multi-variable Difference-in-Differences (DiD) regressions, and Synthetic DiD (SDID).
Premium Standalone Reports: Instantly export summaries into beautiful, portable, responsive CSS-styled HTML dashboards and GitHub-compatible Markdown summary tables.
Audit Trail Security: Cryptographically chain and sign state updates via a SHA-256 tamper-evident ledger, ensuring experiment metadata and configuration parameters remain auditable.
Interactive CLI Toolchain: Perform analytical power sizing, calculate Standardized Mean Differences (SMD) on pre-period covariates, and run rapid ordinary least squares regressions directly from your terminal.
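
The tamper-evident ledger mentioned in the feature list can be illustrated with a plain hash chain. This is a minimal sketch under stated assumptions, not xpyrment's implementation: the function names `append_entry` and `verify_ledger` and the entry layout are hypothetical, and it only chains entries with SHA-256 (it does not sign them).

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, payload):
    # Serialize deterministically so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger, payload):
    """Append a payload, linking it to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append(
        {"prev_hash": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    )


def verify_ledger(ledger):
    """Return True only if every entry still matches its recomputed hash."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"event": "setup", "alpha": 0.05})
append_entry(ledger, {"event": "add_metric", "name": "revenue"})
print(verify_ledger(ledger))       # chain is intact

ledger[0]["payload"]["alpha"] = 0.10  # tamper with an early entry
print(verify_ledger(ledger))       # the change is detected
```

Because each hash covers the previous hash, editing any earlier entry invalidates that entry and breaks the link for every entry after it.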

⚙️ Installation

To install the stable release of xpyrment from PyPI, simply run:

```
pip install xpyrment
```

For development and contributor setups (including pytest, black, and mypy), clone the repository and install in editable mode:

```
git clone https://github.com/sadatian/xpyrment.git
cd xpyrment
pip install -e ".[dev]"
```

🚀 Quickstart Tutorial

This quickstart guides you through the entire A/B testing lifecycle: designing, simulating, configuring, and analyzing.

1. Experiment Design (Power Analysis)

Before launching your test, calculate the sample size required to detect a 5% relative lift in a key continuous metric (e.g., Average Order Value of \$100 with a standard deviation of \$35).

```python
import xpyrment as xp

# Calculate sample size for a standard t-test
design = xp.design_experiment(
    metric_type="mean",
    baseline_value=100.0,
    standard_deviation=35.0,
    mde=0.05,                  # 5% relative lift
    mde_type="relative",
    alpha=0.05,                # Significance level (Type I error)
    power=0.80,                # Target power (1 - Type II error)
    pre_post_correlation=0.75, # Optional: Pre-Post correlation to calculate CUPED savings!
    daily_traffic=5000         # Optional: Daily user traffic to calculate duration
)

print(design)
```

Output:

```
=========================================
       Experiment Design Summary
=========================================
Metric Type                   : Mean
Baseline Value                : 100.0000
Target MDE (Absolute)         : 5.0000
Target MDE (Relative)         : 5.00%
Significance Level (Alpha)    : 5.00%
Statistical Power (1-Beta)    : 80.00%
Sample Size Per Variant       : 1,537
Total Sample Size Required    : 3,074
Pre-Post Correlation          : 0.75
CUPED Sample Size Per Variant : 672
CUPED Total Sample Size       : 1,344
CUPED Sample Size Savings     : 43.8%
Daily Traffic                 : 5,000/day
Estimated Duration (Standard) : 0.6 days
Estimated Duration (CUPED)    : 0.3 days
=========================================
```
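
For readers who want to see the arithmetic behind a power calculation of this kind, here is a minimal sketch using the textbook normal-approximation formula for a two-sided, two-sample comparison with equal allocation. It is an illustration, not xpyrment's internal method: the helper name `sample_size_per_arm` is hypothetical, and conventions differ between tools (for example, per-arm versus total counts), so the result need not match the summary above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(sigma, delta, alpha=0.05, power=0.80):
    """Per-arm n from n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / delta^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


baseline, sigma, relative_mde, rho = 100.0, 35.0, 0.05, 0.75
delta = baseline * relative_mde  # absolute MDE, here 5.0

n_plain = sample_size_per_arm(sigma, delta)

# CUPED scales the variance by (1 - rho^2), so the standard deviation
# scales by sqrt(1 - rho^2).
n_cuped = sample_size_per_arm(sigma * sqrt(1 - rho ** 2), delta)

print("per-arm n without CUPED:", n_plain)
print("per-arm n with CUPED:   ", n_cuped)
print("fraction of sample saved:", round(1 - n_cuped / n_plain, 3))
```

With a pre/post correlation of 0.75 the variance is multiplied by $1 - 0.75^2 = 0.4375$, so the required sample shrinks to roughly 44% of its original size.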

Visualizing Power Curves

Generate coordinates and plot required sample sizes against a range of MDEs to see the impact of CUPED:

```python
# Generate power curve coordinates
curve_data = xp.generate_power_curve_data(
    metric_type="mean",
    baseline_value=100.0,
    standard_deviation=35.0,
    pre_post_correlation=0.75
)

# Plot standard vs. CUPED required sample sizes
xp.plot_power_curve(curve_data)
```

2. Generate Synthetic A/B Test Data

Let's generate simulated experimental data of 10,000 users split 50/50, complete with pre-period covariates so we can demonstrate CUPED and ratio metric evaluations:

```python
df = xp.generate_ab_data(
    n_samples=10000,
    treatment_effect_revenue=2.5,        # +$2.50 absolute lift
    treatment_effect_conversion=0.015,   # +1.5 percentage points (absolute lift)
    treatment_effect_clicks=0.06,        # +6% relative lift in click ratios
    pre_period_correlation=0.82,         # Correlation between pre- and post-period
    random_seed=42
)

print(df.head())
```

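| user_id | variant | pre_revenue | revenue | converted | pre_clicks | pre_impressions | clicks | impressions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| USER_000001 | control | 4.47 | 4.42 | 0 | 4 | 93 | 5 | 96 |
| USER_000002 | treatment | 56.78 | 61.22 | 1 | 6 | 112 | 8 | 108 |
| USER_000003 | control | 51.12 | 48.91 | 0 | 5 | 105 | 3 | 99 |
| USER_000004 | treatment | 32.54 | 36.90 | 0 | 3 | 82 | 4 | 88 |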

3. Setup and Run Analysis

Initialize the experiment environment using the setup function, define your metrics (with pre-period specifications for automatic CUPED), and run your analysis!

```python
# 1. Initialize experiment setup
exp = xp.setup(
    data=df,
    treatment_col="variant",
    id_col="user_id"
)

# 2. Define your metrics
# Continuous metric (Average revenue) with automatic CUPED!
revenue = xp.MeanMetric(
    name="Average Revenue per User",
    value_col="revenue",
    pre_period_col="pre_revenue"
)

# Proportion metric (Conversion rate)
conversion = xp.ProportionMetric(
    name="Purchase Conversion Rate",
    value_col="converted"
)

# Ratio metric (Click-Through-Rate = sum(clicks)/sum(impressions)) with ratio CUPED!
ctr = xp.RatioMetric(
    name="Click-Through-Rate (CTR)",
    numerator_col="clicks",
    denominator_col="impressions",
    pre_numerator_col="pre_clicks",
    pre_denominator_col="pre_impressions"
)

# 3. Add metrics to the experiment container
exp.add_metrics([revenue, conversion, ctr])

# 4. Run Analysis (optionally apply multi-test corrections like 'fdr_bh')
results = exp.run_analysis(
    control="control",
    treatment="treatment",
    multi_test_correction="fdr_bh"
)
```
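
The `"fdr_bh"` option is the conventional short name for the Benjamini-Hochberg procedure. As a rough illustration of what such a correction does to a set of p-values, here is a standalone sketch; the function `benjamini_hochberg` is a hypothetical helper written for this example, not part of xpyrment's API, and the three input p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.001, 0.0112, 0.04]  # illustrative p-values for three metrics
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.4f}  ->  BH-adjusted p = {q:.4f}")
```

Each p-value is multiplied by the number of tests divided by its rank, then capped so that a smaller raw p-value never ends up with a larger adjusted value than a bigger one.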

4. Review and Visualize Results

Standard Summary DataFrame

Call .summary() to get a polished, publication-ready pandas DataFrame with automatic statistical significance annotations (* for $p < 0.05$, ** for $p < 0.01$, *** for $p < 0.001$).

```python
summary_df = results.summary()
print(summary_df)
```

| Metric | Type | Control Mean | Treatment Mean | Relative Lift | 95% CI (Rel) | p-value | Post-hoc Power | CUPED | Var Reduction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Average Revenue per User | Mean | 49.9542 | 52.4712 | +5.04% | [+3.78%, +6.30%] | 0.0000*** | 100.0% | Yes | 68.3% |
| Purchase Conversion Rate | Proportion | 0.0990 | 0.1172 | +18.42% | [+4.12%, +32.72%] | 0.0112* | 73.1% | No | - |
| Click-Through-Rate (CTR) | Ratio | 0.0498 | 0.0528 | +5.95% | [+4.11%, +7.78%] | 0.0000*** | 100.0% | Yes | 71.2% |

Tip: CUPED was automatically applied to both Average Revenue and Click-Through-Rate, reducing their variance by 68.3% and 71.2% respectively. This narrowed the confidence intervals and increased statistical power.

Forest Plot Visualization

Call .plot() to render a gorgeous forest plot representing confidence intervals. Statistically significant lifts are automatically rendered in vibrant teal, while others are shown in subtle gray.

```python
# Render the forest plot
results.plot()
```

Covariate Balance Verification (Love Plot)

```python
# Print an ASCII love plot directly in the console
print(results.love_plot())
```
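
A love plot displays one Standardized Mean Difference (SMD) per covariate. The quantity itself is simple to compute; the sketch below uses the common pooled-standard-deviation definition and is only an illustration. The helper name `standardized_mean_difference` and the sample values are assumptions, and xpyrment may use a different variant of the formula.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(control, treatment):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(control) + variance(treatment)) / 2)
    return (mean(treatment) - mean(control)) / pooled_sd


# Illustrative pre-period revenue values for each arm.
pre_revenue_control = [4.47, 51.12, 30.25, 48.10, 22.90]
pre_revenue_treatment = [56.78, 32.54, 41.00, 18.75, 27.30]

smd = standardized_mean_difference(pre_revenue_control, pre_revenue_treatment)
print(f"SMD for pre_revenue: {smd:+.3f}")
```

A frequently used rule of thumb treats an absolute SMD below 0.1 as acceptable balance, although the right threshold depends on the application.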

5. Generate Standalone HTML Reports

With the v1 release, you can export beautiful standalone HTML dashboards or Markdown cards representing your experimental results, complete with embedded modern styling, KPI metrics, and covariate balance logs.

```python
from xpyrment.report.generator import ExperimentReportGenerator

# Initialize the report generator with the analysis results
reporter = ExperimentReportGenerator(results, experiment_name="Mobile Landing Page Redesign")

# Save a premium responsive HTML dashboard (fully styled, self-contained)
reporter.save_html("reports/ab_experiment_dashboard.html")

# Save a GitHub-compatible Markdown summary card
reporter.save_markdown("reports/ab_experiment_summary.md")
```

🔬 Subpackage Taxonomy & Dependency Flow

To support industrial-scale digital tests and classical DoE, the package has been structured under src/xpyrment following a one-way dependency gating layout to avoid circular references:

```
metrics/      ← Houses core metric taxonomy and guardrail thresholds.
core/         ← Powers the phase gating lifecycle & spec registries.
plan/         ← Computes pre-registration power/durations.
design/       ← Handles randomizations, splits & DoE matrices.
validate/     ← Houses SRM checks and covariate balance tests.
run/          ← Handles ingestion & mSPRT monitors.
analyze/      ← Orchestrates frequentist/Bayesian engines.
interactions/ ← Decomposes multi-factor ANOVA interaction terms.
interpret/    ← Infers ship/no-ship decisions.
report/       ← Terminal consumer of all phases. Compiles audit trails & exportable reports.
```

📖 Mathematical Framework

Welch's t-test

For continuous metrics without a pre-period covariate, the standard error of the mean difference is:

$$ SE = \sqrt{\frac{s_C^2}{n_C} + \frac{s_T^2}{n_T}} $$

Degrees of freedom are computed via the Welch-Satterthwaite equation to handle unequal sample sizes and variances.
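
As a small worked illustration of the standard error and the Welch-Satterthwaite degrees of freedom, the following sketch computes both from two raw samples. The function name `welch_statistics` and the sample data are hypothetical; this is not xpyrment's code, and it stops short of the p-value, which would need a Student's t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_statistics(control, treatment):
    """Return (standard error, t statistic, Welch-Satterthwaite df)."""
    n_c, n_t = len(control), len(treatment)
    # Per-arm variance of the sample mean: s^2 / n.
    v_c = variance(control) / n_c
    v_t = variance(treatment) / n_t
    se = sqrt(v_c + v_t)
    t_stat = (mean(treatment) - mean(control)) / se
    df = (v_c + v_t) ** 2 / (v_c ** 2 / (n_c - 1) + v_t ** 2 / (n_t - 1))
    return se, t_stat, df


se, t_stat, df = welch_statistics([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"SE = {se:.4f}, t = {t_stat:.4f}, df = {df:.3f}")
```

When the two arms have unequal variances the degrees of freedom fall below $n_C + n_T - 2$, which widens the resulting confidence interval relative to a pooled-variance t-test.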

Delta Method (Ratio Metrics)

Because click-through-rates or revenue ratios are calculated as:

$$ R = \frac{\sum_i X_i}{\sum_i Y_i} = \frac{\bar{X}}{\bar{Y}} $$

the variance of the ratio cannot be computed using standard methods because the denominator $Y$ is a random variable. We employ a first-order Taylor expansion (Delta method) to estimate variance:

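$$ Var(R) \approx \frac{1}{\mu_Y^2} Var(\bar{X}) + \frac{\mu_X^2}{\mu_Y^4} Var(\bar{Y}) - 2\frac{\mu_X}{\mu_Y^3} Cov(\bar{X}, \bar{Y}) $$

The variance and covariance terms refer to the sample means, so for $n$ independent users $Var(\bar{X}) = Var(X)/n$, and likewise for $Var(\bar{Y})$ and $Cov(\bar{X}, \bar{Y})$.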
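
The Delta-method estimate can be sketched in a few lines of plain Python. The example below plugs sample moments into the formula and divides by $n$ because the ratio is built from sample means. The helper name `delta_method_ratio` is hypothetical, the four rows reuse the clicks and impressions shown in the synthetic data preview, and this is an illustration rather than xpyrment's implementation.

```python
from statistics import mean


def delta_method_ratio(numerators, denominators):
    """Return (ratio of means, Delta-method variance of that ratio)."""
    n = len(numerators)
    mx, my = mean(numerators), mean(denominators)
    # Unbiased per-user variances and covariance.
    var_x = sum((x - mx) ** 2 for x in numerators) / (n - 1)
    var_y = sum((y - my) ** 2 for y in denominators) / (n - 1)
    cov_xy = sum(
        (x - mx) * (y - my) for x, y in zip(numerators, denominators)
    ) / (n - 1)
    per_user = var_x / my ** 2 + mx ** 2 * var_y / my ** 4 - 2 * mx * cov_xy / my ** 3
    # Dividing by n converts per-user moments into moments of the sample means.
    return mx / my, per_user / n


clicks = [5, 8, 3, 4]
impressions = [96, 108, 99, 88]
ratio, var_ratio = delta_method_ratio(clicks, impressions)
print(f"CTR = {ratio:.4f}, standard error = {var_ratio ** 0.5:.4f}")
```

A useful sanity check is that the variance collapses to zero when the numerator is an exact multiple of the denominator for every user, since the ratio then has no randomness left.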

CUPED (Controlled-experiments Using Pre-Experiment Data)

CUPED adjusts post-period metrics by subtracting the portion of variance explained by pre-period performance:

$$ Y_i^* = Y_i - \theta (X_i - \mu_{X, global}) $$

where $X_i$ is the pre-period covariate and $\theta = \frac{Cov(Y, X)}{Var(X)}$ is computed across the pooled data. The variance of the CUPED-adjusted metric is scaled by a factor of $1 - \rho^2$ (where $\rho$ is the correlation coefficient between $X$ and $Y$):

$$ Var(Y^*) = Var(Y) (1 - \rho^2) $$

For ratio metrics, xpyrment applies CUPED adjustment separately to the numerator and denominator before applying the Delta method on the adjusted vectors, a technique pioneered by Netflix and Uber.
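
The adjustment for a single continuous metric can be sketched as follows. The data are simulated with Python's `random` module, the helper name `cuped_adjust` is hypothetical, and the snippet illustrates the formula rather than reproducing xpyrment's internals.

```python
import random
from statistics import mean, pvariance


def cuped_adjust(post, pre):
    """Return (adjusted post-period values, theta) using pooled data."""
    m_pre, m_post = mean(pre), mean(post)
    cov = sum((y - m_post) * (x - m_pre) for y, x in zip(post, pre))
    var_pre = sum((x - m_pre) ** 2 for x in pre)
    theta = cov / var_pre  # normalising constants cancel in the ratio
    adjusted = [y - theta * (x - m_pre) for y, x in zip(post, pre)]
    return adjusted, theta


rng = random.Random(42)
pre = [rng.gauss(50, 10) for _ in range(2000)]
post = [0.8 * x + rng.gauss(10, 5) for x in pre]  # correlated with pre

adjusted, theta = cuped_adjust(post, pre)
reduction = 1 - pvariance(adjusted) / pvariance(post)
print(f"theta = {theta:.3f}")
print(f"mean before = {mean(post):.3f}, mean after = {mean(adjusted):.3f}")
print(f"variance reduction = {reduction:.1%}")
```

The adjustment leaves the mean unchanged, and the in-sample variance reduction equals the squared sample correlation between the pre- and post-period values.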

Sample Ratio Mismatch (SRM) Goodness-of-Fit

A Pearson Chi-square test is calculated on the observed sample counts against the expected design weights to flag assignment bugs early:

$$ \chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} $$

If the test p-value $< 0.001$, an SRMError is raised.
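
For a two-arm experiment the check can be written with the standard library alone, because a chi-square statistic with one degree of freedom has the closed-form tail probability $\mathrm{erfc}(\sqrt{\chi^2/2})$. The sketch below makes that assumption explicit: it handles only two arms, the function name `srm_check` is hypothetical, and it returns a flag instead of raising the library's `SRMError`.

```python
from math import erfc, sqrt


def srm_check(n_control, n_treatment, control_share=0.5, threshold=0.001):
    """Return (chi-square statistic, p-value, mismatch flag) for two arms."""
    total = n_control + n_treatment
    observed = (n_control, n_treatment)
    expected = (total * control_share, total * (1 - control_share))
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 degree of freedom (two arms only).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value, p_value < threshold


for counts in [(5000, 5000), (4950, 5050), (4800, 5200)]:
    chi2, p_value, flagged = srm_check(*counts)
    print(counts, f"chi2 = {chi2:.2f}", f"p = {p_value:.5f}", "SRM" if flagged else "ok")
```

With more than two variants the statistic has more degrees of freedom, and a general chi-square distribution function would be needed in place of the `erfc` shortcut.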

DerSimonian-Laird Random-Effects Meta-Analysis

To pool historical experiment estimates $\hat{\theta}_j$ with study variances $v_j$ across $k$ independent studies, the DerSimonian-Laird random-effects model accounts for between-study variance $\tau^2$:

$$ \tau^2 = \max\left(0, \ \frac{Q - (k - 1)}{\sum w_j - \frac{\sum w_j^2}{\sum w_j}}\right) $$

where $w_j = \frac{1}{v_j}$ are inverse-variance fixed weights, $\bar{\theta}_F = \frac{\sum w_j \hat{\theta}_j}{\sum w_j}$ is the fixed-effect pooled estimate, and $Q = \sum w_j (\hat{\theta}_j - \bar{\theta}_F)^2$ is Cochran's $Q$ heterogeneity statistic. Random-effects weights $w_j^* = \frac{1}{v_j + \tau^2}$ are then applied to yield the pooled random-effects estimate:

$$ \bar{\theta}_R = \frac{\sum w_j^* \hat{\theta}_j}{\sum w_j^*} $$
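
The estimator translates directly into code. The following sketch follows the formulas above step by step; the function name `dersimonian_laird` and the three example studies are made up for illustration and do not come from xpyrment.

```python
def dersimonian_laird(estimates, variances):
    """Return (random-effects estimate, tau^2, Cochran's Q)."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # fixed-effect weights
    sum_w = sum(w)
    theta_fixed = sum(wi * ti for wi, ti in zip(w, estimates)) / sum_w
    q = sum(wi * (ti - theta_fixed) ** 2 for wi, ti in zip(w, estimates))
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale)  # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    theta_random = sum(wi * ti for wi, ti in zip(w_star, estimates)) / sum(w_star)
    return theta_random, tau2, q


# Three hypothetical historical lifts with their sampling variances.
lifts = [0.020, 0.050, 0.035]
lift_variances = [0.0001, 0.0004, 0.0002]
theta_r, tau2, q = dersimonian_laird(lifts, lift_variances)
print(f"pooled lift = {theta_r:.4f}, tau^2 = {tau2:.6f}, Q = {q:.3f}")
```

When $Q$ does not exceed $k - 1$ the between-study variance is truncated to zero, and the random-effects estimate coincides with the fixed-effect one.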

Simonsohn P-Curve Distribution Audits

To detect p-hacking, early peeking, or selective publication bias across independent experiments, the p-curve binomial test calculates the proportion of significant p-values ($p < 0.05$) lying in the low half ($p \le 0.025$):

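True Evidential Power (Right-Skewed):

$$ p_{\text{right-skew}} = 1 - F_{binom}(N_{low} - 1; N_{total}, 0.5) $$

Reporting Bias / Selective Stopping (Left-Skewed):

$$ p_{\text{left-skew}} = F_{binom}(N_{low}; N_{total}, 0.5) $$

where $N_{total}$ is the number of significant p-values, $N_{low}$ is the number of those with $p \le 0.025$, and $F_{binom}(k; n, 0.5)$ is the binomial cumulative distribution function.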
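
Both binomial tests can be reproduced with `math.comb` from the standard library. The sketch below is an illustration under stated assumptions: the function names `binom_cdf` and `p_curve_binomial_tests` are hypothetical, the input p-values are invented, and only p-values below the significance level are counted.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def p_curve_binomial_tests(p_values, alpha=0.05):
    """Return (N_low, N_total, right-skew p-value, left-skew p-value)."""
    significant = [p for p in p_values if p < alpha]
    n_total = len(significant)
    n_low = sum(1 for p in significant if p <= alpha / 2)
    p_right = 1.0 - binom_cdf(n_low - 1, n_total)  # P(X >= N_low)
    p_left = binom_cdf(n_low, n_total)             # P(X <= N_low)
    return n_low, n_total, p_right, p_left


history = [0.001, 0.004, 0.010, 0.020, 0.030, 0.200]  # 0.200 is not significant
n_low, n_total, p_right, p_left = p_curve_binomial_tests(history)
print(f"{n_low} of {n_total} significant p-values fall at or below 0.025")
print(f"right-skew p = {p_right:.4f}, left-skew p = {p_left:.4f}")
```

A small right-skew p-value indicates that low p-values dominate, as expected under a true effect, while a small left-skew p-value indicates clustering just under the significance threshold.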

🛠️ Local Development & Testing

We use pytest for unit testing. To set up your local environment:

Create a virtual environment and activate it:

```
python -m venv .venv
.venv\Scripts\activate     # On Windows
source .venv/bin/activate  # On macOS/Linux
```

Install the package in editable mode with development dependencies:
```
pip install -e ".[dev]"
```
Run the unit test suite:
```
pytest
```

📄 License

Distributed under the AI Slop License.

Release history

| Version | Release date |
| --- | --- |
| 1.6.1.2 (this version) | May 29, 2026 |
| 1.6.1.0 | May 28, 2026 |
| 1.5.1.3 | May 23, 2026 |
| 1.5.0.0 | May 23, 2026 |
| 1.3.0.0 | May 14, 2026 |
| 1.1.2.9 | May 13, 2026 |
| 1.1.2.8 | May 13, 2026 |
| 1.1.2.5 | May 13, 2026 |

Download files

Source Distribution

xpyrment-1.6.1.2.tar.gz (272.3 kB view details)

Uploaded May 29, 2026 Source

Built Distribution

xpyrment-1.6.1.2-py3-none-any.whl (337.9 kB view details)

Uploaded May 29, 2026 Python 3

File details

Details for the file xpyrment-1.6.1.2.tar.gz.

File metadata

Download URL: xpyrment-1.6.1.2.tar.gz
Upload date: May 29, 2026
Size: 272.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.4.1 CPython/3.12.3 Windows/11

File hashes

Hashes for xpyrment-1.6.1.2.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `571615521e53b254b0f8b55e6a58e9a9ef8be341c17c28ec91063f6e96d4667d` |
| MD5 | `0dc6b523a40fc5fcefe019b1c72fd16b` |
| BLAKE2b-256 | `3273df4c3a57d363682bc668250f0b3c43d3e2c012cbd88c96eb0f11d3b7012f` |

File details

Details for the file xpyrment-1.6.1.2-py3-none-any.whl.

File metadata

Download URL: xpyrment-1.6.1.2-py3-none-any.whl
Upload date: May 29, 2026
Size: 337.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.4.1 CPython/3.12.3 Windows/11

File hashes

Hashes for xpyrment-1.6.1.2-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `cf44e18623ec565f9829cacd641ac066f5616c866a2ba77a411b5939841408dd` |
| MD5 | `fe599c114d2f24c76a746ad1b00db803` |
| BLAKE2b-256 | `3aad07af1f3533ee3341c37d830238e6a5ed35605a4e4fddc6e52a4e657bdec6` |
