No project description provided
Project description
ABayesTest
ABayesTest is a lightweight Python package for performing Bayesian AB testing.
Computations are run using Stan via cmdstanpy, and jinja2 is used
as a backend templating engine
to construct the Stan model files.
Installation
To install, use:
python3.10 -m pip install git+ssh://git@github.com/cmgoold/abayestest.git
Installing ABayesTest will also create a local cache folder for storing
Stan model objects, which is .abayes in the repository root.
CmdStan
ABayesTest requires a working cmdstan installation. The easiest
way to download cmdstan is via cmdstanpy.
Simple API
The simplest use-case is running a comparison between two sets of approximately normally-distributed data sets. First, let's sample some fake data, where we have two groups with the following data generating process:
$$ \begin{align} y_{ij} &\sim \mathrm{Normal}(\mu_{j}, \sigma_{j})\ \mu_{A} &= 0, \quad \sigma_{A} = 0.2 \ \mu_{B} &= 1, \quad \sigma_{B} = 1 \end{align} $$
That is, both groups' data are normally distributed
with locations, 0 and 1, and scales
0.2 and 1, respectively.
Thus, there is a true difference of means of -1 and
a true difference of scales of -0.8. Here's the Python
code:
import numpy as np
from abayestest import ABayesTest
SEED = 1234
rng = np.random.default_rng(SEED)
N = 100
mu = [0, 1]
sigma = [0.2, 1]
y_a = rng.normal(size=N, loc=mu[0], scale=sigma[0])
y_b = rng.normal(size=N, loc=mu[1], scale=sigma[1])
We then initialize an ABayesTest object with the default options
(normal likelihood, default priors) and fit the model, passing
the data in as a tuple:
ab = ABayesTest(seed=SEED)
ab.fit(data=(y_a, y_b))
The model will run in Stan and return self.
You can access the cmdstanpy.CmdStanMCMC object
itself using ab.cmdstan_mcmc.
For instance, we can use cmdstanpy's diagnostic
function to check for any convergence problems:
ab.diagnose()
which returns:
Checking sampler transitions treedepth.
Treedepth satisfactory for all transitions.
Checking sampler transitions for divergences.
No divergent transitions found.
Checking E-BFMI - sampler transitions HMC potential energy.
E-BFMI satisfactory.
Effective sample size satisfactory.
Split R-hat values satisfactory all parameters.
Processing complete, no problems detected.
indicating no problems.
To inspect the results, run ab.summary(), which returns
a summary Pandas DataFrame straight from Arviz:
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
mu[0] 0.026 0.023 -0.018 0.067 0.000 0.000 4655.0 3215.0 1.0
mu[1] 1.059 0.105 0.851 1.249 0.001 0.001 5554.0 3166.0 1.0
mu_diff -1.033 0.108 -1.222 -0.820 0.001 0.001 5566.0 3225.0 1.0
mu_star[0] 0.026 0.023 -0.018 0.067 0.000 0.000 4655.0 3215.0 1.0
mu_star[1] 1.059 0.105 0.851 1.249 0.001 0.001 5554.0 3166.0 1.0
mu_star_diff -1.033 0.108 -1.222 -0.820 0.001 0.001 5566.0 3225.0 1.0
sigma[0] 0.229 0.016 0.199 0.259 0.000 0.000 4938.0 3202.0 1.0
sigma[1] 1.046 0.077 0.904 1.190 0.001 0.001 4530.0 2968.0 1.0
sigma_diff -0.817 0.078 -0.973 -0.681 0.001 0.001 4504.0 3051.0 1.0
sigma_star[0] -1.478 0.071 -1.616 -1.349 0.001 0.001 4938.0 3202.0 1.0
sigma_star[1] 0.042 0.073 -0.101 0.174 0.001 0.001 4530.0 2968.0 1.0
sigma_star_diff -1.520 0.101 -1.709 -1.334 0.001 0.001 4755.0 3271.0 1.0
ABayesTest always uses the parameter mu to refer to
the vector of group-specific locations, or other non-normal
distribution's canonincal parameters (e.g. the Poisson
rate parameter; see below). Dispersion parameters,
such as the normal distribution's scale parameter,
are referred to as sigma.
The parameters suffixed with _star are unconstrained
parameters, which ABayesTest uses for estimation under-the-hood.
More details about the parameter transformations and
likelihood parameterisations are given below, but
for the normal distribution, mu = mu_star and sigma_star = log(sigma).
Conditions A and B are always indexed as 0 and 1
in the Python outputs.
The additional variables mu_diff and sigma_diff (and
the _star companions) give
the difference in posterior distributions between groups 1 and 2
(i.e. mu[0] - mu[1] using Python's zero-indexing).
As we can see, these recover the data-generating assumptions above,
with posterior means close to -1 and -0.8 for the means
and standard deviations, respectively.
Using the estimated quantities, users can calculate
any quantities or metrics that are meaningful
to the AB test being performed. For instance,
the probability that condition B scores greater than
A is the proportion of the posterior distribution
of mu[1] - mu[0] that is greater than zero,
which in this case is 100%, as can be inferred
from the mu_diff distribution directly:
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
def density(x):
limits = x.min(), x.max()
grid = np.linspace(*limits, 1000)
return grid, gaussian_kde(x)(grid)
mu_diff = ab.draws()["mu_diff"]
plt.plot(*density(mu_diff), color="#0492c2", lw=4)
plt.axvline(0, ls=":", color="gray")
plt.xlabel("score")
plt.ylabel("density")
plt.title("posterior of condition A - B")
The ABayesTest class also contains a handy method
to report the distribution of
differences in the posteriors
between conditions called
compare_conditions, which
tells us that:
100.00% of the posterior differences for mu favour condition B.
100.00% of the posterior differences for sigma favour condition B.
Posterior predictive distribution
ABayesTest automatically calculates the posterior predictive
distribution of the data, which is accessible in
the posterior draws object under the key y_rep.
This array is in long form, where group A and B's
predictions are stacked on top of each other.
Using the example above, we can inspect this
distribution using some small manipulation
of the posterior draws:
y_rep_raw = ab.draws()["y_rep"]
y_reps = y_rep_raw[:, :N], y_rep_raw[:, N:]
ys = y_a, y_b
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
for i in range(2):
a_or_b = (1 - i) * "A" + i * "B"
grid, samples = density(y_reps[i].flatten())
ax[i].plot(grid, samples, color="#0492c2", lw=3, label="Posterior predictive")
ax[i].plot(ys[i], [0.01]*len(ys[i]), '|', color="black", label="raw")
ax[i].set_title(a_or_b)
ax[i].set_xlabel("score")
ax[i].set_ylabel("density")
if not i:
ax[i].legend(frameon=False, loc="upper right")
The rug plots show that the observed data fall within the posterior predictive densities.
Likelihood functions
Currently, ABayesTest supports normal, lognormal, gamma, Bernoulli, binomial, and Poisson distributions.
For non-normal likelihood functions, ABayesTest calculates the differences in canonical parameters on both unconstrained and original scales. The table below illustrates how each likelihood distribution is parameterised, what link functions are used to transform the parameters to the unconstrained scale, and the name of the unconstrained and original-scale parameters, for reference.
| Distribution | Parameterization | Link function transforms |
|---|---|---|
| normal | mean, sd | mean := mu = mu_starsd := sigma = exp(sigma_star) |
| lognormal | log-scale mean, log-scale sd | mean := mu = mu_starsd := sigma = exp(sigma_star) |
| gamma | shape, rate | shape := mu^2 / sigma^2 = exp(mu_star)^2 / exp(sigma_star)^2rate := shape / mu = shape / exp(mu_star) |
| Poisson | rate | rate := mu = exp(mu_star) |
| Bernoulli | probability | probability := mu = logit^-1(mu_star) |
| binomial | probability | probability := mu = logit^-1(mu_star) |
ABayesTest will always return the mu, mu_star, sigma and sigma_star parameters,
and their posterior differences, as standard.
Additional variables appended with _j indicate the long-form parameter vectors,
i.e. the value of the parameters at each index or case in the data.
All but the binomial likelihood require the same data format as above. That is,
the normal, lognormal, gamma, poisson, and bernoulli models just require
the y data vectors as a tuple, or alternatively as a dictionary.
The binomial likelihoods require an additional data vector for the n
parameter in the binomial PMF. It's assumed that the data for binomial models
enter as a tuple or dictionary of tuples, in the
form of data=( (n1, y1), (n2, y2) ).
Taking a specific example, below we simulate binomial data and it a model:
N = 500
mu = [0.6, 0.9]
n = rng.choice(range(70, 100), N)
y1 = rng.binomial(n=n, size=N, p=mu[0])
y2 = rng.binomial(n=n, size=N, p=mu[1])
data = (n, y1), (n, y2)
binomial = ABayesTest(likelihood="binomial", seed=SEED)
binomial.fit(data)
binomial.summary()
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
mu[0] 0.596 0.002 0.592 0.601 0.0 0.0 3597.0 2533.0 1.0
mu[1] 0.898 0.001 0.896 0.901 0.0 0.0 3186.0 2305.0 1.0
mu_diff -0.302 0.003 -0.307 -0.297 0.0 0.0 3570.0 2501.0 1.0
mu_star[0] 0.391 0.010 0.371 0.408 0.0 0.0 3597.0 2533.0 1.0
mu_star[1] 2.179 0.016 2.149 2.209 0.0 0.0 3186.0 2305.0 1.0
mu_star_diff -1.788 0.019 -1.823 -1.753 0.0 0.0 3374.0 2422.0 1.0
Here, the mu_diff parameter tells us that the mean posterior differences
is -0.3, which is exactly what we simulated.
Priors and prior predictive simulations
The default priors are all standard normals on the unconstrained scales, which can be inspected using:
from abayes import DEFAULT_PRIORS
DEFAULT_PRIORS
returning:
'normal': {'mu_star': 'normal(0, 1)', 'sigma_star': 'normal(0, 1)'},
'lognormal': {'mu_star': 'normal(0, 1)', 'sigma_star': 'normal(0, 1)'},
'gamma': {'mu_star': 'normal(0, 1)', 'sigma_star': 'normal(0, 1)'},
'poisson': {'mu_star': 'normal(0, 1)'},
'bernoulli': {'mu_star': 'normal(0, 1)'},
'binomial': {'mu_star': 'normal(0, 1)'}}
These priors are generally weakly informative, but can be changed to any Stan probability distributions you like. At the moment, different priors for each group, or hierarchical structures, are not supported.
ABayesTest also supports running prior
predictive simulations using the
prior_only flag passed to the
class constructor:
rng = np.random.default_rng(SEED)
N = 100
mu = [0, 1]
sigma = [0.2, 1]
y1 = rng.normal(size=N, loc=mu[0], scale=sigma[0])
y2 = rng.normal(size=N, loc=mu[1], scale=sigma[1])
prior = ABayesTest(prior_only=True, seed=SEED)
prior.fit((y1, y2))
y_rep_raw_prior = prior.draws()["y_rep"]
y_reps_prior = y_rep_raw_prior[:, :N], y_rep_raw_prior[:, N:]
ys = y1, y2
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
for i in range(2):
a_or_b = (1 - i) * "A" + i * "B"
grid, samples = density(y_reps[i].flatten())
prior_grid, prior_samples = density(y_reps_prior[i].flatten())
ax[i].plot(grid, samples, color="green", lw=3, label="Posterior predictive")
ax[i].plot(prior_grid, prior_samples, color="#0492c2", lw=3, label="Prior predictive")
ax[i].plot(ys[i], [0.01]*len(ys[i]), '|', color="black", label="raw")
ax[i].set_title(a_or_b)
ax[i].set_xlabel("score")
ax[i].set_ylabel("density")
ax[i].set_xlim((-20, 20))
if not i:
ax[i].legend(frameon=False, loc="upper right")
The above plot shows the prior predictive distribution in blue and posterior predictive distribution from the first example above in green.
Raw Stan code
The 'private' attribute _render_model
can be used, if interested, to see the raw Stan code:
ab._render_model()
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file abayestest-0.0.4.tar.gz.
File metadata
- Download URL: abayestest-0.0.4.tar.gz
- Upload date:
- Size: 159.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
91ff8f986522442bec3f43cd26b57d97b3a01380a25ecfe2aad297b8c249b5b8
|
|
| MD5 |
9c83e0264e9ee043af087d4b0823fc24
|
|
| BLAKE2b-256 |
98f0386ff87c41ec962c5ac3a5dacbd8b72ee1ded4b032b96e18ccda278d7a61
|
File details
Details for the file abayestest-0.0.4-py3-none-any.whl.
File metadata
- Download URL: abayestest-0.0.4-py3-none-any.whl
- Upload date:
- Size: 27.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.10
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
64177a074f3e469e18b6b09ef78a09b9b18d9c323e7bc8deb5e36fd87ff98b88
|
|
| MD5 |
2d8d4526ad127fbbfbf349219f899a0b
|
|
| BLAKE2b-256 |
e084af3f5d9431d8e3ef21e2bad5f1c7338de4fb5778af7cd86edd34a656fbd6
|