Pyabtest

A simple tool to calculate P-value after conducting an A/B experiment

A/B experiment & Hypothesis testing

We typically run an A/B experiment to see whether a new model improves production metrics. After running the experiment for a fixed period, we use hypothesis testing to decide whether to accept the new model. A hypothesis test has the following components:

Null hypothesis: New model does not bring any improvement
Alternative hypothesis: New model does bring some improvement

This tool calculates the P-value used to decide whether the null hypothesis can be rejected.
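The decision step can be sketched in a few lines. This is a minimal illustration, not code taken from pyabtest: the function name `decide`, the strict `<` comparison, and the "Reject null hypothesis" string are assumptions made for the example.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a P-value with the significance level alpha (hypothetical helper)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # A small P-value means the observed data would be unlikely if the
    # null hypothesis were true, so the null hypothesis is rejected.
    if p_value < alpha:
        return "Reject null hypothesis"
    return "Do not reject null hypothesis"


print(decide(0.58718))  # prints "Do not reject null hypothesis"
print(decide(0.01))     # prints "Reject null hypothesis"
```

A smaller alpha makes the test stricter: fewer false positives, at the cost of needing stronger evidence before the new model is accepted.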

Installation

Use the package manager pip to install pyabtest:

pip install pyabtest

Usage

The package exposes the following functions:

1. Test for Sample Ratio Mismatch (SRM)

This test checks whether the audience was split between control and variant in a truly random manner. If there is an SRM, the A/B test results should be discarded, because control and variant contain different types of audience. For example, we can pass the following counts for control and variant to check for SRM:

  1. Number of male vs Number of female
  2. Number of users of age < 40 vs Number of users of age >= 40
  3. Number of active users vs Number of inactive users
  4. Number of English-speaking users vs Number of non-English-speaking users
  5. Number of mobile users vs Number of desktop users

Input: Control group 1 size, Control group 2 size, Variant group 1 size, Variant group 2 size

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> pyabtest.test_for_sample_ratio_mismatch(control_group1_size=1000, control_group2_size=2000, variant_group1_size=1010, variant_group2_size=1990, alpha=0.05)
{'P-value': 0.78445, 'Alpha value (significance level)': 0.05, 'Decision': "Don't discard A/B test results"}

Test used: Chi-squared Test
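To make the test concrete, here is a minimal standard-library sketch of Pearson's chi-squared test on a 2x2 table. It assumes no continuity correction and is not taken from pyabtest's source; the function name `chi_squared_2x2` is hypothetical. For the example counts above it gives a P-value close to the reported 0.78445.

```python
import math


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """P-value of Pearson's chi-squared test on the table [[a, b], [c, d]].

    Assumptions of this sketch: no continuity correction, and every row
    and column total is greater than zero.
    """
    total = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if group membership is independent of the split.
            expected = rows[i] * cols[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X >= x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))


# Rows: control, variant. Columns: group 1 size, group 2 size.
p_value = chi_squared_2x2(1000, 2000, 1010, 1990)
print(round(p_value, 3))
```

A large P-value here means the group proportions in control and variant are consistent with a random split, so there is no evidence of SRM.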

2. Test for Binary Metric

This test can be used when the result/action/feedback is binary and we want to see whether the variant observations come from a different population than the control observations. For example, this test can be used in the following situations:

  1. Clicks vs No clicks
  2. Cart vs No cart
  3. Order vs No order
  4. Number of zero search results vs Number of non-zero search results
  5. Number of successful sessions vs Number of non-successful sessions
  6. Number of positive reviews vs Number of negative reviews
  7. Number of converted users vs Number of non-converted users

Input: No. of successes in control, No. of failures in control, No. of successes in variant, No. of failures in variant

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> pyabtest.test_for_binary_metric(control_success=50, control_failures=1000, variant_success=40, variant_failures=900, alpha=0.05)
{'P-value': 0.58718, 'Alpha value (significance level)': 0.05, 'Decision': 'Do not reject null hypothesis'}

Test used: Chi-squared Test
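For a binary metric, the uncorrected chi-squared test on a 2x2 table is equivalent to a two-sided pooled two-proportion z-test (the chi-squared statistic equals z squared), which can be easier to reason about in terms of conversion rates. The sketch below uses only the standard library; the function name `two_proportion_p_value` is hypothetical, and whether pyabtest computes its result this way internally is an assumption. For the example counts above it gives a P-value close to the reported 0.58718.

```python
import math


def two_proportion_p_value(success_a: int, failures_a: int,
                           success_b: int, failures_b: int) -> float:
    """Two-sided P-value of a pooled two-proportion z-test (illustrative)."""
    n_a = success_a + failures_a
    n_b = success_b + failures_b
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Under the null hypothesis both groups share one success rate.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


p_value = two_proportion_p_value(50, 1000, 40, 900)
print(round(p_value, 3))
```

The sketch assumes the pooled rate is strictly between 0 and 1; otherwise the standard error is zero and the statistic is undefined.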

3. Test for Numeric Metric

This test can be used for any generic numeric metric (count or fraction). Because it is a non-parametric test, it makes no assumption about the distribution of the observations, so it can be used even when they are not normally distributed. Example metrics include:

  1. Number of clicks per unique user
  2. Number of carts per unique user
  3. Number of orders per unique user
  4. Clicks/Views per unique user
  5. Orders/Views per unique user
  6. Orders/Session per unique user
  7. Revenue per unique user
  8. Session time per unique user
  9. Order value per unique user
  10. Successful sessions per unique user

Input: Control array (e.g., an array containing the no. of clicks for each user in control; order does not matter), Variant array (e.g., an array containing the no. of clicks for each user in variant; order does not matter)

Output: P-value, Alpha, Decision

>>> import pyabtest
>>> from numpy import random
>>> pyabtest.test_for_numeric_metric(control_observations=random.randint(100, size=(20)), variant_observations=random.randint(100, size=(20)), alpha=0.05, no_of_samples=1000)
{'P-value': 0.7411, 'Alpha value (significance level)': 0.05, 'Decision': 'Do not reject null hypothesis'}

Test used: Bootstrap Test
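One common way to run a bootstrap test on a difference in means is sketched below. The README does not say which bootstrap variant pyabtest uses, so the pooled-resampling procedure, the function name `bootstrap_p_value`, and the `seed` parameter are all assumptions made for illustration.

```python
import random
from statistics import mean


def bootstrap_p_value(control, variant, no_of_samples=1000, seed=None):
    """Two-sided bootstrap P-value for a difference in means (illustrative)."""
    rng = random.Random(seed)
    control, variant = list(control), list(variant)
    observed = abs(mean(variant) - mean(control))
    # Under the null hypothesis both groups come from one population,
    # so resample both groups from the pooled observations.
    pooled = control + variant
    extreme = 0
    for _ in range(no_of_samples):
        resampled_control = rng.choices(pooled, k=len(control))
        resampled_variant = rng.choices(pooled, k=len(variant))
        diff = abs(mean(resampled_variant) - mean(resampled_control))
        # Count resamples at least as extreme as the observed difference.
        if diff >= observed:
            extreme += 1
    return extreme / no_of_samples


rng = random.Random(0)
control = [rng.randrange(100) for _ in range(20)]
variant = [rng.randrange(100) for _ in range(20)]
print(bootstrap_p_value(control, variant, no_of_samples=1000, seed=1))
```

Because the procedure resamples at random, the P-value varies slightly from run to run unless a seed is fixed; a larger number of resamples reduces that variation.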

License

MIT

Author

Rama Badrinath

Email: ramab1988@gmail.com
