Do causal inference more casually

casual_inference

casual_inference is a Python package that provides a simple interface for causal inference. Causal analysis is complicated, and doing it properly requires attention to many details. casual_inference aims to reduce that effort.

Installation

pip install casual-inference

Overview

This package provides several types of evaluators. Each has an evaluate() method and some summary_xxx() methods. The evaluate() method assesses treatment impact by calculating several statistics, and the summary_xxx() methods present those statistics in different forms (e.g., as a table or a bar chart).
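As a rough illustration of the kind of statistics such an evaluation involves, the sketch below compares one treatment group against control using only the standard library. The function name compare_to_control and the choice of an unpooled z-test are assumptions made for this example; they are not taken from the package's implementation.

```python
import math
from statistics import NormalDist, fmean, variance


def compare_to_control(control, treatment):
    """Return absolute difference, relative lift, and a two-sided p-value.

    Hypothetical helper for illustration; not part of casual_inference.
    """
    mean_c, mean_t = fmean(control), fmean(treatment)
    # Standard error of the difference in means (unpooled variances).
    se = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    z = (mean_t - mean_c) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "abs_diff": mean_t - mean_c,
        "rel_lift": (mean_t - mean_c) / mean_c,
        "p_value": p_value,
    }


# Made-up binary metric: 40% conversion in control, 60% in treatment.
control = [0, 1, 0, 0, 1, 0, 1, 0, 0, 1] * 100
treatment = [1, 1, 0, 0, 1, 0, 1, 0, 1, 1] * 100
result = compare_to_control(control, treatment)
print(result)
```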

The evaluate() method expects data with a schema like the following:

unit  variant  metric_A  metric_B  ...
1     1        0         0.01      ...
2     1        1         0.05      ...
3     2        0         0.02      ...
...   ...      ...       ...       ...

The table must already be aggregated by the unit column (i.e., the unit column should be the primary key).

Columns

  • unit: The unit of analysis. In the web service domain this is typically user_id, session_id, and so on.
  • variant: The intervention group. This package always treats variant 1 as the control group.
  • metrics: The metrics you want to evaluate, e.g., the number of purchases or the conversion rate.
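If the raw data is event-level (several rows per unit), it has to be collapsed to one row per unit first. The following is a minimal sketch of that step in plain Python. The event fields (user_id, amount), the output column names, and the helper aggregate_by_unit are all hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical raw log: one row per purchase event, not per user.
events = [
    {"user_id": "u1", "variant": 1, "amount": 12.0},
    {"user_id": "u1", "variant": 1, "amount": 3.0},
    {"user_id": "u2", "variant": 2, "amount": 7.5},
    {"user_id": "u3", "variant": 1, "amount": 0.0},
]


def aggregate_by_unit(rows, unit_col, variant_col, value_col):
    """Collapse event rows into one row per unit (illustrative helper)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    variants = {}
    for row in rows:
        unit = row[unit_col]
        # A unit must belong to exactly one variant.
        if variants.setdefault(unit, row[variant_col]) != row[variant_col]:
            raise ValueError(f"unit {unit!r} appears in more than one variant")
        totals[unit] += row[value_col]
        counts[unit] += 1
    return [
        {
            unit_col: unit,
            variant_col: variants[unit],
            "n_events": counts[unit],
            "total_amount": totals[unit],
        }
        for unit in variants
    ]


table = aggregate_by_unit(events, "user_id", "variant", "amount")

# The unit column should now behave like a primary key: no duplicates.
units = [row["user_id"] for row in table]
assert len(units) == len(set(units))
```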

Quick Start

casual_inference supports not only the evaluation of standard A/B tests and A/A tests, but also advanced causal inference techniques.

A/B test evaluation

from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import ABTestEvaluator

data = create_sample_ab_result(n_variant=3, sample_size=1000000, simulated_lift=[-0.01, 0.01])

evaluator = ABTestEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    variant_col="variant",
    metrics=["metric_bin", "metric_cont"]
)

evaluator.summary_plot()

It checks for Sample Ratio Mismatch (SRM) automatically. When it detects an SRM, it displays a warning in the output so that the analyst can interpret the result with care.
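To make the idea of an SRM check concrete, here is a minimal sketch of a chi-square goodness-of-fit test for two variants, written with the standard library only. It is an assumption about how such a check can be done in general, not a description of the package's internals. The function name srm_p_value and the 0.001 threshold are illustrative choices.

```python
import math


def srm_p_value(n_control, n_treatment, expected_control_share=0.5):
    """p-value of a chi-square goodness-of-fit test with 1 degree of freedom.

    Hypothetical helper; handles the two-variant case only.
    """
    total = n_control + n_treatment
    expected_c = total * expected_control_share
    expected_t = total - expected_c
    chi2 = (
        (n_control - expected_c) ** 2 / expected_c
        + (n_treatment - expected_t) ** 2 / expected_t
    )
    # Survival function of the chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


balanced = srm_p_value(50_000, 50_000)
skewed = srm_p_value(50_700, 49_300)

# Illustrative threshold: a very small p-value suggests the observed split
# is unlikely under the intended 50/50 allocation.
for label, p in [("balanced", balanced), ("skewed", skewed)]:
    print(label, p, "SRM suspected" if p < 0.001 else "ok")
```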

See the example notebook for a more detailed example.

A/A test evaluation

from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import AATestEvaluator

data = create_sample_ab_result(n_variant=2, sample_size=1000000, simulated_lift=[0.0])

evaluator = AATestEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    metrics=["metric_bin", "metric_cont"]
)

evaluator.summary_plot()

See the example notebook for a more detailed example.
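In an A/A test both groups receive the same experience, so a sound setup should report a significant difference only about as often as the chosen significance level. The sketch below illustrates that idea by repeatedly splitting one simulated metric at random and counting false positives. The helper names, the simulated data, and the z-test are assumptions for this example and may differ from what AATestEvaluator does internally.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def two_sided_p(a, b):
    """Two-sided p-value of an unpooled z-test on the difference in means."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (fmean(b) - fmean(a)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def aa_false_positive_rate(values, n_simulation=500, alpha=0.05, seed=0):
    """Share of random 50/50 splits that look 'significant' at level alpha."""
    rng = random.Random(seed)
    values = list(values)
    half = len(values) // 2
    hits = 0
    for _ in range(n_simulation):
        rng.shuffle(values)
        if two_sided_p(values[:half], values[half:]) < alpha:
            hits += 1
    return hits / n_simulation


# Simulated metric with no treatment effect at all.
data_rng = random.Random(42)
metric = [data_rng.gauss(10.0, 2.0) for _ in range(400)]
rate = aa_false_positive_rate(metric)
print(rate)  # should land near alpha if the test is well calibrated
```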

Sample Size evaluation

from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import SampleSizeEvaluator

data = create_sample_ab_result(n_variant=2, sample_size=1000000)

evaluator = SampleSizeEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    metrics=["metric_bin", "metric_cont"]
)

evaluator.summary_plot()

See the example notebook for a more detailed example.
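For intuition about what a sample size evaluation computes, the following sketch applies the standard normal-approximation formulas for a two-sided, two-sample test with equal group sizes. The function names and default values (alpha of 0.05, power of 0.8) are assumptions for illustration, not the package's API.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, mde, alpha=0.05, power=0.8):
    """Units needed per group to detect an absolute difference of `mde`."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 * sigma**2 / mde**2)


def minimal_detectable_effect(sigma, n_per_group, alpha=0.05, power=0.8):
    """Smallest absolute difference detectable with `n_per_group` units."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sigma * math.sqrt(2 / n_per_group)


# Metric with standard deviation 1.0; we want to detect a shift of 0.1.
n = required_sample_size(sigma=1.0, mde=0.1)
mde = minimal_detectable_effect(sigma=1.0, n_per_group=n)
print(n, mde)
```

The two functions are inverses of each other up to rounding: plugging the returned sample size back in gives a minimal detectable effect just below the requested one.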

Advanced causal inference techniques

It also supports advanced causal inference techniques.

  • Linear Regression

Other evaluation methods, such as Propensity Score Matching, are planned for future releases.

References

  • Ron Kohavi, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. https://experimentguide.com/
    • A great book covering a comprehensive range of topics in practical A/B testing. Recommended for everyone who works on A/B testing.
  • Alex Deng, Ulf Knoblich, and Jiannan Lu. 2018. Applying the Delta Method in Metric Analytics: A Practical Guide with Novel Ideas. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '18). Association for Computing Machinery, New York, NY, USA, 233–242. https://doi.org/10.1145/3219819.3219919
    • Describes how to approximate the variance of a relative difference, and how to handle cases where the analysis unit is more granular than the randomization unit.
  • Lucile Lu. 2016. Power, minimal detectable effect, and bucket size estimation in A/B tests. Twitter Engineering Blog.
    • Describes the concepts of Type I and Type II errors and power analysis (sample size calculation).
  • Aleksander Fabijan, Jayant Gupchup, Somit Gupta, Jeff Omhover, Wen Qin, Lukas Vermeer, and Pavel Dmitriev. 2019. Diagnosing Sample Ratio Mismatch in Online Controlled Experiments: A Taxonomy and Rules of Thumb for Practitioners. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19). Association for Computing Machinery, New York, NY, USA, 2156–2164. https://doi.org/10.1145/3292500.3330722
    • Introduces Sample Ratio Mismatch (SRM), describes various examples of how SRM occurs, and provides a taxonomy that helps with debugging when an SRM is detected.
  • Shota Yasui. 2020. 効果検証入門. 技術評論社. https://gihyo.jp/book/2020/978-4-297-11117-5
    • A great introductory book on practical causal inference techniques, written in Japanese.

Download files

Source Distribution

casual_inference-0.7.1.tar.gz (13.7 kB)

Uploaded Source

Built Distribution

casual_inference-0.7.1-py3-none-any.whl (17.6 kB)

Uploaded Python 3
