Do causal inference more casually
casual_inference
casual_inference is a Python package that provides a simple interface for causal inference.
Causal analysis is complicated, and doing it properly means paying attention to many details.
casual_inference is developed to reduce that effort.
Installation
pip install casual-inference
Overview
This package provides several types of evaluator. Each has an evaluate() method and some summary_xxx() methods. The evaluate() method evaluates treatment impact by calculating several statistics, and the summary_xxx() methods summarize those statistics in various ways (e.g., table style, bar chart style).
The evaluate() method expects data with a schema like the following.
unit | variant | metric_A | metric_B | ... |
---|---|---|---|---|
1 | 1 | 0 | 0.01 | ... |
2 | 1 | 1 | 0.05 | ... |
3 | 2 | 0 | 0.02 | ... |
... | ... | ... | ... | ... |
The table must already be aggregated by the unit column (i.e., the unit column should be the primary key).
Columns
- unit: The unit you want to conduct the analysis on. In the web service domain this is typically user_id, session_id, and so on.
- variant: The intervention group. This package always assumes that 1 is the control group.
- metrics: The metrics you want to evaluate, e.g., the number of purchases, conversion rate, ...
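As a rough illustration of the "one row per unit" requirement, the sketch below collapses a raw event log into the schema above using only the standard library. The event tuples, the `aggregate_by_unit` helper, and the choice to sum each metric are assumptions made for this example; they are not part of casual_inference.

```python
from collections import defaultdict

# Hypothetical raw event log: (unit, variant, metric_A, metric_B).
# Values are made up for illustration only.
events = [
    (1, 1, 0, 0.00),
    (1, 1, 1, 0.01),
    (2, 1, 1, 0.05),
    (3, 2, 0, 0.02),
]


def aggregate_by_unit(rows):
    """Collapse event rows into one row per unit, so unit is a primary key."""
    totals = defaultdict(lambda: {"variant": None, "metric_A": 0, "metric_B": 0.0})
    for unit, variant, metric_a, metric_b in rows:
        row = totals[unit]
        # A unit must belong to exactly one variant.
        if row["variant"] not in (None, variant):
            raise ValueError(f"unit {unit} appears in more than one variant")
        row["variant"] = variant
        row["metric_A"] += metric_a
        row["metric_B"] += metric_b
    return [{"unit": unit, **row} for unit, row in sorted(totals.items())]


table = aggregate_by_unit(events)
for row in table:
    print(row)
```

In practice the same aggregation would usually be done in SQL or with a DataFrame group-by before the data is handed to an evaluator.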
Quick Start
casual_inference supports not only the evaluation of standard A/B tests and A/A tests, but also advanced causal inference techniques.
A/B test evaluation
from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import ABTestEvaluator
data = create_sample_ab_result(n_variant=3, sample_size=1000000, simulated_lift=[-0.01, 0.01])
evaluator = ABTestEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    variant_col="variant",
    metrics=["metric_bin", "metric_cont"]
)
evaluator.summary_plot()
The evaluator diagnoses Sample Ratio Mismatch (SRM) automatically. When it detects SRM, it displays a warning in the output so that the analyst can interpret the result carefully.
See the example notebook for a more detailed example.
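To give a feel for what an SRM check involves, here is a minimal sketch of a chi-squared goodness-of-fit test for a two-variant experiment. It is not the package's implementation: the function names, the equal expected split, and the 0.001 threshold are assumptions chosen for the example.

```python
import math


def srm_p_value(n_control, n_treatment, expected_ratio=0.5):
    """Chi-squared goodness-of-fit p-value for a two-variant split (df = 1)."""
    total = n_control + n_treatment
    expected = [total * expected_ratio, total * (1 - expected_ratio)]
    observed = [n_control, n_treatment]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def has_srm(n_control, n_treatment, threshold=0.001):
    """Flag SRM when the observed split is very unlikely under the expected ratio."""
    return srm_p_value(n_control, n_treatment) < threshold


print(has_srm(5000, 5050))  # small imbalance, consistent with chance
print(has_srm(5000, 5500))  # large imbalance, flagged
```

A strict threshold is used here because an SRM warning questions the whole experiment, so false alarms are costly.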
A/A test evaluation
from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import AATestEvaluator
data = create_sample_ab_result(n_variant=2, sample_size=1000000, simulated_lift=[0.0])
evaluator = AATestEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    metrics=["metric_bin", "metric_cont"]
)
evaluator.summary_plot()
See the example notebook for a more detailed example.
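One common way to run an A/A check is to split the same population at random many times and confirm that a significance test fires at roughly the nominal rate. The sketch below illustrates that idea with the standard library; it is an assumption about the general technique, not a description of how AATestEvaluator works internally, and all names in it are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def two_sided_p_value(a, b):
    """Large-sample z-test p-value for a difference in means."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(values, n_simulation=200, alpha=0.05, seed=0):
    """Share of random A/A splits that look significant at level alpha."""
    rng = random.Random(seed)
    values = list(values)
    half = len(values) // 2
    hits = 0
    for _ in range(n_simulation):
        rng.shuffle(values)  # random re-assignment with no real treatment
        if two_sided_p_value(values[:half], values[half:]) < alpha:
            hits += 1
    return hits / n_simulation


# Synthetic metric values, generated only for this example.
data_rng = random.Random(42)
metric = [data_rng.gauss(10.0, 2.0) for _ in range(2000)]
rate = false_positive_rate(metric)
print(f"false positive rate: {rate:.3f}")
```

If the rate is far above alpha, the metric or the variance estimate is likely unsuitable for the test being used.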
Sample Size evaluation
from casual_inference.dataset import create_sample_ab_result
from casual_inference.evaluator import SampleSizeEvaluator
data = create_sample_ab_result(n_variant=2, sample_size=1000000)
evaluator = SampleSizeEvaluator()
evaluator.evaluate(
    data=data,
    unit_col="rand_unit",
    metrics=["metric_bin", "metric_cont"]
)
evaluator.summary_plot()
See the example notebook for a more detailed example.
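For intuition about sample size evaluation, the sketch below applies the textbook formula for a two-sided, two-sample comparison of means with equal group sizes, n = 2 (z_{1-α/2} + z_{power})² σ² / δ². The function name and defaults are assumptions for illustration, and SampleSizeEvaluator may compute its results differently.

```python
import math
from statistics import NormalDist


def required_sample_size(std, min_detectable_diff, alpha=0.05, power=0.8):
    """Per-variant sample size needed to detect an absolute difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    n = 2 * (z_alpha + z_beta) ** 2 * std ** 2 / min_detectable_diff ** 2
    return math.ceil(n)


# Detecting a 0.1 shift in a metric with standard deviation 1.0.
print(required_sample_size(std=1.0, min_detectable_diff=0.1))
```

Because the required size scales with 1 / δ², halving the minimal detectable effect roughly quadruples the number of units needed.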
Advanced causal inference techniques
It also supports advanced causal inference techniques.
- Linear Regression
Other evaluation methods, such as Propensity Score Matching, are planned for future releases.
References
- Kohavi, Ron, Diane Tang, and Ya Xu. 2020. Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. Cambridge University Press. https://experimentguide.com/
- A great book covering a comprehensive range of topics in practical A/B testing. I recommend it to everyone who works on A/B testing.
- Alex Deng, Ulf Knoblich, and Jiannan Lu. 2018. Applying the Delta Method in Metric Analytics: A Practical Guide with Novel Ideas. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '18). Association for Computing Machinery, New York, NY, USA, 233–242. https://doi.org/10.1145/3219819.3219919
- Describes how to approximate the variance of a relative difference, and how to handle the case where the analysis unit is more granular than the randomization unit.
- Lucile Lu. 2016. Power, minimal detectable effect, and bucket size estimation in A/B tests. Twitter Engineering Blog. link
- Describes the concepts of Type I error, Type II error, and power analysis (sample size calculation).
- Aleksander Fabijan, Jayant Gupchup, Somit Gupta, Jeff Omhover, Wen Qin, Lukas Vermeer, and Pavel Dmitriev. 2019. Diagnosing Sample Ratio Mismatch in Online Controlled Experiments: A Taxonomy and Rules of Thumb for Practitioners. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19). Association for Computing Machinery, New York, NY, USA, 2156–2164. https://doi.org/10.1145/3292500.3330722
- Introduces Sample Ratio Mismatch (SRM), describes various examples of how SRM happens, and provides a taxonomy that helps with debugging when SRM occurs.
- Shota Yasui. 2020. 効果検証入門. 技術評論社. https://gihyo.jp/book/2020/978-4-297-11117-5
- A great introductory book, written in Japanese, about practical causal inference techniques.