ABalytics: Advanced A/B Testing Statistical Analytics

ABalytics is a Python package for statistical analysis, in particular for assessing the significance of A/B testing results. It selects the appropriate statistical tests based on the type of variable being analyzed, and offers a suite of tools for significance tests and posthoc analyses on experimental data.

Features

  • Boolean and Numeric Analysis: Supports analysis of both boolean and numeric data types, ensuring the use of correct statistical methods for each.
  • Significance Tests: Includes a variety of significance tests such as Chi-Square, Welch's ANOVA, and Kruskal-Wallis, to accurately determine the significance of results.
  • Posthoc Analysis: Offers posthoc analysis methods like Tukey's HSD, Dunn's test, and Games-Howell, for detailed examination following significance tests.
  • Normality and Homogeneity Checks: Checks whether the data follow a Gaussian distribution and, using Levene's test, whether the group variances are homogeneous. Both checks are critical for selecting the right tests.
  • Data Transformation: Provides functionality to convert data from long to wide format, facilitating analysis of dependent groups.
  • Pretty Text Output: Generates a formatted text output with the results of the statistical tests, facilitating interpretation and reporting.
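
To illustrate why the normality and homogeneity checks matter, here is a minimal sketch of how such checks can drive the choice of test for numeric data. It uses SciPy directly and is not the ABalytics implementation: the function name choose_omnibus_test, the use of Shapiro-Wilk as the normality check, and the exact decision rule are assumptions made for illustration.

```python
import numpy as np
from scipy import stats


def choose_omnibus_test(groups, alpha=0.05):
    """Suggest an omnibus test name from assumption checks (illustrative only)."""
    # Shapiro-Wilk: a small p-value is evidence against normality
    all_normal = all(stats.shapiro(g).pvalue > alpha for g in groups)
    if not all_normal:
        return "Kruskal-Wallis"
    # Levene: a small p-value is evidence of unequal variances
    equal_variances = stats.levene(*groups).pvalue > alpha
    return "one-way ANOVA" if equal_variances else "Welch's ANOVA"


# Deterministic toy data: evenly spaced normal quantiles and a skewed series
normal_like = stats.norm.ppf(np.linspace(0.01, 0.99, 50))
skewed = np.exp(np.linspace(0, 5, 50))

same_spread = choose_omnibus_test([normal_like, normal_like + 1])
wider_spread = choose_omnibus_test([normal_like, 5 * normal_like + 1])
non_normal = choose_omnibus_test([normal_like, skewed])
print(same_spread, wider_spread, non_normal, sep=" | ")
```

The posthoc test would then be chosen to match the omnibus test, for example Tukey's HSD after a classic ANOVA, Games-Howell after Welch's ANOVA, and Dunn's test after Kruskal-Wallis.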

Installation

To install ABalytics, use pip:

pip install abalytics

Usage

Analyzing Results

ABalytics provides two main functions for analyzing A/B testing results: analyze_independent_groups and analyze_dependent_groups.

Independent Groups Analysis

analyze_independent_groups is used for analyzing data where the groups are independent of each other. It takes a pandas DataFrame, the name of the column containing the variable to analyze, the name of the column containing the grouping variable, and optional parameters for p-value threshold and minimum sample size.

Example:

from abalytics import analyze_independent_groups
import pandas as pd

# Load your data into a pandas DataFrame
df = pd.read_csv('your_data.csv')

# Analyze the results
analysis_results = analyze_independent_groups(
    df,
    variable_to_analyze="order_value",
    group_column="ab_test_group",
)
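
If you do not have a CSV file at hand, the sketch below builds a synthetic DataFrame in the shape the example above expects: one row per observation, one column with the measured value and one with the group label. The group names, group sizes, and the gamma-distributed order values are made up for illustration, and the snippet only prepares and summarizes the data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# Hypothetical group labels and sizes
group_sizes = {"old_implementation": 200, "new_cta_1": 200, "new_cta_2": 200}

frames = []
for name, size in group_sizes.items():
    frames.append(
        pd.DataFrame(
            {
                "ab_test_group": name,
                # Skewed, positive values as a stand-in for order values
                "order_value": rng.gamma(shape=2.0, scale=25.0, size=size).round(2),
            }
        )
    )
df = pd.concat(frames, ignore_index=True)

# Sanity check before analysis: observations and mean per group
summary = df.groupby("ab_test_group")["order_value"].agg(["count", "mean"])
print(summary)
```

The resulting df can be passed to analyze_independent_groups with variable_to_analyze="order_value" and group_column="ab_test_group", as in the example above.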

Dependent Groups Analysis

analyze_dependent_groups is used for analyzing data where the groups are dependent, such as repeated measures on the same subjects. It requires data in wide format. If your data is in long format, you can use the convert_long_to_wide function in abalytics.utils to convert it. The analyze_dependent_groups function takes a pandas DataFrame, the names of the columns to compare, and optional parameters for p-value threshold and minimum sample size.

Example:

from abalytics import analyze_dependent_groups
import pandas as pd

# Load your data into a pandas DataFrame
df = pd.read_csv('your_data.csv')

# Analyze the results
analysis_results = analyze_dependent_groups(
    df,
    variables_to_compare=["pre_test_score", "post_test_score"],
)

Data Transformation

The convert_long_to_wide function in abalytics.utils is designed to transform data from long format to wide format, with an option to keep multi-level columns or flatten them. analyze_dependent_groups requires data in wide format to operate correctly.

Example:

from abalytics.utils import convert_long_to_wide
import pandas as pd

# Assuming 'df_long' is your pandas DataFrame in long format
df_wide = convert_long_to_wide(
    df_long,
    index_col="subject_id",
    columns_col="condition",
    flatten_columns=True # Set to False if you wish to keep multi-level columns
)
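
To make the long and wide formats concrete, the following sketch performs a comparable reshaping with plain pandas. It is not the implementation of convert_long_to_wide; the toy data and the underscore-joined names for the flattened columns are assumptions made for illustration.

```python
import pandas as pd

# Long format: one row per subject and condition
df_long = pd.DataFrame(
    {
        "subject_id": [1, 1, 2, 2, 3, 3],
        "condition": ["pre", "post", "pre", "post", "pre", "post"],
        "score": [10, 12, 11, 15, 9, 9],
    }
)

# Pivoting without `values` keeps every remaining column,
# which produces multi-level columns such as ("score", "pre")
df_multi = df_long.pivot(index="subject_id", columns="condition")

# Flatten the two levels into single names, e.g. "score_pre"
df_wide = df_multi.copy()
df_wide.columns = ["_".join(col) for col in df_multi.columns]
print(df_wide)
```

In the wide result each subject occupies one row and each condition one column, which is the layout needed to compare repeated measures on the same subjects.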

Generating Pretty Text Output

To get a formatted text output of your results, use the format_results_as_table function in abalytics.utils.

Example:

from abalytics.utils import format_results_as_table
from abalytics import analyze_independent_groups
import pandas as pd

# Load your data into a pandas DataFrame
df = pd.read_csv('your_data.csv')

# Analyze the results
analysis_results = analyze_independent_groups(
    df,
    variable_to_analyze="order_value",
    group_column="ab_test_group",
)

# Generate pretty text output
pretty_text = format_results_as_table(
    abalytics_results=[analysis_results],
    identifiers_list=[{"Test name": "A/B Test 1", "Channel": "Mobile"}],
)
print(pretty_text)

Running this code prints a formatted table with the outcomes of the statistical significance tests, including the sample size and the test results. By default, only significant results are displayed; set show_only_significant_results to False to show all results. Optionally, set show_details to True to include additional details such as the a priori and posthoc tests used.

Example output (for several analyses passed together, each with its own identifiers):

Test name             Channel         n  Result                                              p-value
--------------------  ----------  -----  ------------------------------------------------  ---------
A/B Test 1            Mobile       5009  new_cta_1 (0.16) > new_cta_2 (0.15)                   0.007
A/B Test 1            Tablet       2887  new_cta_1 (0.22) > new_cta_2 (0.20)                   0.000
A/B Test 1            Tablet       2887  new_cta_1 (0.22) > old_implementation (0.20)          0.005
A/B Test 1            Desktop     20014  new_cta_1 (0.18) > new_cta_2 (0.17)                   0.000
A/B Test 1            Desktop     20014  new_cta_1 (0.18) > old_implementation (0.17)          0.000
A/B Test 2            Mobile        268  new_cta_1 (0.10) > new_cta_2 (0.06)                   0.006
A/B Test 2            Mobile        268  new_cta_1 (0.10) > old_implementation (0.06)          0.014
A/B Test 2            Desktop      5609  new_cta_1 (0.13) > new_cta_2 (0.12)                   0.025

Further examples of how to use ABalytics can be found in examples/example.py.

Contributing

Contributions to ABalytics are welcome. If you have suggestions for improvements or find any issues, please open an issue or submit a pull request.
