A library for BaseMath's test, a novel Group Sequential Testing approach that enables the user to stop early.

Basemath

Welcome to Basemath, an open-source implementation of the statistical test of the same name, designed for analyzing A/B experiments.

Basemath employs a one-sided testing approach, where the null hypothesis posits that the treatment performs equally to or worse than the control on the target metric. The test has a predetermined maximum runtime determined by the input parameters. It also ensures that the type I and type II error rates remain below the specified thresholds α and β, respectively. Basemath assesses the experiment in batches and terminates early if it can reject the alternative hypothesis of a significant uplift. The β-spending function employed is O’Brien-Fleming-like. Because the majority of experiments yield flat or negative results, stopping early in this scenario saves more running time than stopping early in the less common case of a significant uplift.
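In symbols, the setup described above can be sketched as follows. The notation is an assumption introduced here for illustration: μ_A and μ_B stand for the true means of the target metric in the control and treatment groups, and mde is the minimal relative uplift passed to the test. Reading the type II bound as applying at an uplift of at least mde is the usual convention, not something the description above states explicitly.

```latex
H_0:\ \mu_B \le \mu_A \qquad \text{vs.} \qquad H_1:\ \mu_B > \mu_A

\Pr\bigl(\text{reject } H_0 \,\bigm|\, \mu_B \le \mu_A\bigr) \le \alpha

\Pr\bigl(\text{do not reject } H_0 \,\bigm|\, \mu_B \ge (1 + \mathrm{mde})\,\mu_A\bigr) \le \beta
```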

What sets Basemath apart is that it avoids recurrent numerical integration, which results in a straightforward and fast implementation. For a detailed exploration of Basemath's mathematical foundations, refer to our article Basemath’s Test: Group Sequential Testing Without Recurrent Numerical Integration.

Installation

Simply install the library using your package manager of choice.

For instance, with pip:

pip install basemath-analysis

Usage

The Binary Case

Assume you are conducting an experiment on your platform. You have modified the user experience (UX) and aim to demonstrate that this change improves the conversion of visitors to customers. To achieve this, you present the current UX (control group) to half of your visitors and the new UX (treatment group) to the other half. Data on the number of people visiting your platform and the number of visitors converting to customers are processed on a daily basis. Your expectation is that the UX change will lead to a relative increase in the conversion rate (number of customers / number of visitors) by at least 1%.
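Because the uplift is relative, it is worth checking what it means in absolute terms. The sketch below uses a hypothetical helper, treatment_rate_at_mde, and an assumed baseline conversion rate of 5%; neither is part of the library. A 1% relative uplift on a 5% baseline corresponds to a treatment conversion rate of 5.05%, not 6%.

```python
def treatment_rate_at_mde(control_rate: float, relative_mde: float) -> float:
    """Conversion rate the treatment must reach to hit the relative uplift."""
    return control_rate * (1.0 + relative_mde)


# Assumed values for illustration only.
control_rate = 0.05  # 5% of visitors currently convert
relative_mde = 0.01  # 1% relative uplift, as in the example above

target_rate = treatment_rate_at_mde(control_rate, relative_mde)
absolute_uplift = target_rate - control_rate

print(f"target conversion rate: {target_rate:.4f}")
print(f"absolute uplift: {absolute_uplift:.4f}")
```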

You initialize Basemath using the following Python code:

import basemath_analysis.basemath as bm
bm_test = bm.BaseMathsTest(cr_A, mde, alpha, beta, seed="experiment_name")

The parameters are as follows:

  • cr_A: The estimated conversion rate of your control group.
  • mde: The minimal relative uplift you are aiming for (1% in our example).
  • alpha: The maximal type I error you are willing to tolerate (α is often set to 5%).
  • beta: The maximal type II error you are willing to tolerate (β is often set to 20%).
  • seed: As the algorithm contains a random element, set a seed to ensure consistent outcomes when running the algorithm repeatedly on the same data. The seed is generated from a string such as the unique experiment name.

After initialization, you can check the maximum number of required samples for each variation to estimate the running time:

print(bm_test.required_samples)
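To turn the required sample count into a running time, divide it by the daily traffic each variation receives. The helper below is a minimal sketch with a hypothetical name, and the numbers are made up for illustration; in practice the first argument would come from bm_test.required_samples.

```python
import math


def estimated_runtime_days(required_samples: int, daily_visitors_per_variation: int) -> int:
    """Upper bound on the number of days until the maximum sample size is reached."""
    if daily_visitors_per_variation <= 0:
        raise ValueError("daily_visitors_per_variation must be positive")
    # Round up: a partially filled last day still has to be run.
    return math.ceil(required_samples / daily_visitors_per_variation)


# Made-up figures: 1,000,000 samples per variation, 15,000 visitors per variation per day.
print(estimated_runtime_days(1_000_000, 15_000))
```

Since the test can stop early, this is the maximum duration rather than the expected one.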

Enter the most recent data daily into the instance to determine if there is a significant outcome:

bm_test.evaluate_experiment(
    previous_customer_delta,
    customer_delta_since_yesterday,
    previous_visitor_number,
    visitors_since_yesterday,
)

The parameters for this method are:

  • previous_customer_delta: The difference between the overall number of customers in the treatment and control groups as of the last check-in, i.e., as of yesterday in our example.
  • customer_delta_since_yesterday: The difference between the number of new customers in the treatment and control groups since the last check-in, i.e., since yesterday in our example.
  • previous_visitor_number: The number of visitors per variation as of the last check-in.
  • visitors_since_yesterday: The number of visitors per variation since the last check-in.
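If your reporting stores cumulative totals per day, the four arguments can be derived from two consecutive snapshots. The sketch below is an illustration only: CumulativeCounts and evaluation_arguments are hypothetical names, and the sign convention (treatment minus control) is an assumption, since the description above only says "difference between the treatment and control groups".

```python
from typing import NamedTuple, Tuple


class CumulativeCounts(NamedTuple):
    """Running totals as of one check-in (hypothetical container)."""
    customers_treatment: int
    customers_control: int
    visitors_per_variation: int


def evaluation_arguments(
    previous: CumulativeCounts, current: CumulativeCounts
) -> Tuple[int, int, int, int]:
    """Return the four values in the order evaluate_experiment is called above."""
    # Assumption: deltas are computed as treatment minus control.
    previous_delta = previous.customers_treatment - previous.customers_control
    current_delta = current.customers_treatment - current.customers_control
    return (
        previous_delta,                  # previous_customer_delta
        current_delta - previous_delta,  # customer_delta_since_yesterday
        previous.visitors_per_variation, # previous_visitor_number
        current.visitors_per_variation - previous.visitors_per_variation,  # visitors_since_yesterday
    )


# Made-up snapshots for illustration.
yesterday = CumulativeCounts(customers_treatment=520, customers_control=500, visitors_per_variation=10_000)
today = CumulativeCounts(customers_treatment=580, customers_control=545, visitors_per_variation=11_000)
print(evaluation_arguments(yesterday, today))
```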

The outcome of this method is 0, 1, or -1. If Basemath hasn't reached a significant conclusion yet, it returns 0. If the alternative hypothesis that the treatment is significantly better than the control can be rejected, it returns -1. This can occur at any evaluation step. If Basemath finds a significant uplift for the treatment group, it returns 1. This can only happen once the maximum number of required samples has been reached.

Once the outcome is no longer 0, the test has concluded, and the experiment can be stopped. Further evaluation beyond this point is futile.
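The stopping logic can be sketched as a loop that feeds one batch at a time and stops at the first non-zero outcome. Everything named here is hypothetical: run_until_conclusion accepts any callable with the same positional arguments as the evaluate_experiment call shown above, and stub_evaluate is a stand-in so the snippet runs without the library. The stub's rule is arbitrary and is not Basemath's decision rule.

```python
from typing import Callable, Iterable, Tuple

# (customer delta in this batch, visitors per variation in this batch)
Batch = Tuple[int, int]


def run_until_conclusion(
    evaluate: Callable[[int, int, int, int], int], batches: Iterable[Batch]
) -> Tuple[int, int]:
    """Feed batches in order; return (outcome, number of batches evaluated)."""
    customer_delta_so_far = 0
    visitors_so_far = 0
    outcome = 0
    batches_used = 0
    for batch_delta, batch_visitors in batches:
        outcome = evaluate(customer_delta_so_far, batch_delta, visitors_so_far, batch_visitors)
        batches_used += 1
        if outcome != 0:
            break  # the test has concluded; later batches are not evaluated
        customer_delta_so_far += batch_delta
        visitors_so_far += batch_visitors
    return outcome, batches_used


def stub_evaluate(prev_delta: int, new_delta: int, prev_visitors: int, new_visitors: int) -> int:
    # Stand-in only: reports -1 once 3,000 visitors per variation have been seen.
    return -1 if prev_visitors + new_visitors >= 3000 else 0


outcome, batches_used = run_until_conclusion(
    stub_evaluate, [(2, 1000), (-1, 1000), (0, 1000), (3, 1000)]
)
print(outcome, batches_used)
```

With the real library, bm_test.evaluate_experiment would be passed in place of stub_evaluate, assuming it takes the four arguments in the order shown earlier.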

The Continuous Case

Some changes may not result in more visitors converting to customers. The number of customers might remain the same; however, customers in the treatment group might spend more money on your platform. In this case, change your target metric from conversion rate to revenue per visitor (sum of revenue / number of visitors).

For a continuous target variable, the initialization of Basemath only slightly changes:

bm_test = bm.BaseMathsTest(rpv_A, mde, alpha, beta, var_A=var_A, seed="experiment_name")

Instead of the estimated conversion rate, enter the estimated average revenue per visitor (rpv_A) for the control group. Additionally, estimate the variance of the revenue per visitor, considering visitors who do not convert (i.e., assuming a revenue of 0 for them). The other parameters remain the same as in the binary case.
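Both estimates can be taken from historical control-group data with one revenue value per visitor, keeping the zeros for visitors who did not buy. The sketch below uses Python's statistics module; the function name and the data are made up for illustration. It uses the sample variance, which is an assumption, as the text above does not say whether the sample or population variance is expected. For realistic sample sizes the difference is negligible.

```python
import statistics
from typing import Sequence, Tuple


def estimate_rpv_and_variance(revenue_per_visitor: Sequence[float]) -> Tuple[float, float]:
    """Return (mean revenue per visitor, sample variance), zeros included."""
    rpv_A = statistics.fmean(revenue_per_visitor)
    var_A = statistics.variance(revenue_per_visitor)  # needs at least two visitors
    return rpv_A, var_A


# Made-up history: eight visitors, two of whom bought something.
historical_revenue = [0.0, 0.0, 0.0, 20.0, 0.0, 0.0, 40.0, 0.0]
rpv_A, var_A = estimate_rpv_and_variance(historical_revenue)
print(rpv_A, var_A)
```

Dropping the zeros would inflate the mean and give the wrong variance, which is why non-converting visitors stay in the list.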

Basemath only works with two variations and a 50/50 split in traffic.

More detailed examples are available here.

Contributing

To contribute, you'll need a working Python 3.8+ installation. We also recommend setting up a virtual environment for the project. You'll also need to install Poetry if it's not already present.

Once you have the dependencies installed (with poetry install), you can set up pre-commit with pre-commit install. We run pre-commit in our CI as well, but it's recommended to install it locally so that you get immediate feedback from our various linters.

Beyond that, there's not really anything else you need to know to start contributing! Please create a pull request with whatever changes you'd like to propose and, if necessary, increment the version and update the changelog.
