Implementations of common offline policy evaluation methods.
Offline policy evaluation
Implementations and examples of common offline policy evaluation methods in Python. For more information on offline policy evaluation see this tutorial.
Installation
pip install offline-evaluation
Usage
import pandas as pd
from ope.methods import doubly_robust
Get some historical logs generated by a previous policy:
df = pd.DataFrame([
{"context": {"p_fraud": 0.08}, "action": "blocked", "action_prob": 0.90, "reward": 0},
{"context": {"p_fraud": 0.03}, "action": "allowed", "action_prob": 0.90, "reward": 20},
{"context": {"p_fraud": 0.02}, "action": "allowed", "action_prob": 0.90, "reward": 10},
{"context": {"p_fraud": 0.01}, "action": "allowed", "action_prob": 0.90, "reward": 20},
{"context": {"p_fraud": 0.09}, "action": "allowed", "action_prob": 0.10, "reward": -20},
{"context": {"p_fraud": 0.40}, "action": "allowed", "action_prob": 0.10, "reward": -10},
])
Define a function that computes P(action | context) under the new policy:
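def action_probabilities(context):
    # Block likely-fraudulent transactions (p_fraud > 0.10) with probability
    # 1 - epsilon; otherwise allow with probability 1 - epsilon.
    epsilon = 0.10
    if context["p_fraud"] > 0.10:
        return {"allowed": epsilon, "blocked": 1 - epsilon}
    return {"allowed": 1 - epsilon, "blocked": epsilon}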
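Before running an evaluation, it can help to sanity-check the policy function: each returned distribution should sum to 1 and should assign a probability to every action that appears in the logs. The sketch below is one way to do that; `check_policy` is a hypothetical helper written for this example and is not part of the package.

```python
import math


def action_probabilities(context):
    epsilon = 0.10
    if context["p_fraud"] > 0.10:
        return {"allowed": epsilon, "blocked": 1 - epsilon}
    return {"allowed": 1 - epsilon, "blocked": epsilon}


def check_policy(policy, contexts, actions):
    """Hypothetical helper: validate a policy's output on sample contexts."""
    for context in contexts:
        probs = policy(context)
        # Every logged action needs a probability under the new policy.
        missing = set(actions) - set(probs)
        if missing:
            raise ValueError(f"no probability for actions {sorted(missing)}")
        # The probabilities must form a valid distribution.
        if not math.isclose(sum(probs.values()), 1.0):
            raise ValueError(f"probabilities do not sum to 1 for {context}")
    return True


sample_contexts = [{"p_fraud": 0.02}, {"p_fraud": 0.40}]
policy_ok = check_policy(action_probabilities, sample_contexts, ["allowed", "blocked"])
```

Checking one context on each side of the 0.10 threshold exercises both branches of the policy.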
Conduct the evaluation:
doubly_robust.evaluate(df, action_probabilities)
> {'expected_reward_logging_policy': 3.33, 'expected_reward_new_policy': -28.47}
The estimated expected reward of the new policy (-28.47) is far below that of the logging policy (3.33), so the new policy looks considerably worse. Instead of A/B testing this new policy online, it would be better to test some other policies offline first.
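To build intuition for where such numbers come from, here is a minimal hand-rolled sketch of plain inverse propensity scoring (IPS) on the same six logged rows. It does not use the package's code; `ips_estimate` is a hypothetical name, and the logs are held in a plain list of dicts rather than a DataFrame so the example runs without pandas. IPS omits the reward model that the doubly robust method adds, so its estimate is not expected to match the value above.

```python
logs = [
    {"context": {"p_fraud": 0.08}, "action": "blocked", "action_prob": 0.90, "reward": 0},
    {"context": {"p_fraud": 0.03}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.02}, "action": "allowed", "action_prob": 0.90, "reward": 10},
    {"context": {"p_fraud": 0.01}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.09}, "action": "allowed", "action_prob": 0.10, "reward": -20},
    {"context": {"p_fraud": 0.40}, "action": "allowed", "action_prob": 0.10, "reward": -10},
]


def action_probabilities(context):
    epsilon = 0.10
    if context["p_fraud"] > 0.10:
        return {"allowed": epsilon, "blocked": 1 - epsilon}
    return {"allowed": 1 - epsilon, "blocked": epsilon}


def ips_estimate(logs, policy):
    """Hypothetical helper: mean reward reweighted by new / logging probability."""
    total = 0.0
    for row in logs:
        # Importance weight: how much more (or less) likely the new policy
        # is to take the logged action than the logging policy was.
        weight = policy(row["context"])[row["action"]] / row["action_prob"]
        total += weight * row["reward"]
    return total / len(logs)


# The logging policy's value is simply the average logged reward.
logging_value = sum(row["reward"] for row in logs) / len(logs)
new_value = ips_estimate(logs, action_probabilities)

print(round(logging_value, 2))  # 3.33
print(round(new_value, 2))      # -23.33
```

Most of the negative estimate comes from a single row: the transaction with `p_fraud` 0.09 was allowed with logging probability 0.10, while the new policy would allow it with probability 0.90, giving a weight of 9 on a reward of -20. With so few rows, one heavily weighted sample can dominate the result.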
See examples for more detailed tutorials.
Supported methods
- Inverse propensity scoring
- Direct method
- Doubly robust (paper)
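The three methods differ in how they use the logs. As a rough sketch under stated assumptions: the direct method averages a reward model's predictions under the new policy, and the doubly robust method adds an importance-weighted correction for the model's error on the logged action. The code below is hand-rolled, not the package's implementation. The reward model (mean logged reward per action, ignoring context) is a deliberately crude assumption, and `fit_mean_reward_model`, `dm_estimate`, and `dr_estimate` are hypothetical names, so the results are not expected to match the package's output.

```python
from collections import defaultdict

logs = [
    {"context": {"p_fraud": 0.08}, "action": "blocked", "action_prob": 0.90, "reward": 0},
    {"context": {"p_fraud": 0.03}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.02}, "action": "allowed", "action_prob": 0.90, "reward": 10},
    {"context": {"p_fraud": 0.01}, "action": "allowed", "action_prob": 0.90, "reward": 20},
    {"context": {"p_fraud": 0.09}, "action": "allowed", "action_prob": 0.10, "reward": -20},
    {"context": {"p_fraud": 0.40}, "action": "allowed", "action_prob": 0.10, "reward": -10},
]


def action_probabilities(context):
    epsilon = 0.10
    if context["p_fraud"] > 0.10:
        return {"allowed": epsilon, "blocked": 1 - epsilon}
    return {"allowed": 1 - epsilon, "blocked": epsilon}


def fit_mean_reward_model(logs):
    """Toy reward model (an assumption): mean logged reward per action."""
    totals, counts = defaultdict(float), defaultdict(int)
    for row in logs:
        totals[row["action"]] += row["reward"]
        counts[row["action"]] += 1
    means = {action: totals[action] / counts[action] for action in totals}
    return lambda context, action: means.get(action, 0.0)


def dm_estimate(logs, policy, reward_model):
    """Direct method: expected predicted reward under the new policy."""
    total = 0.0
    for row in logs:
        probs = policy(row["context"])
        total += sum(p * reward_model(row["context"], a) for a, p in probs.items())
    return total / len(logs)


def dr_estimate(logs, policy, reward_model):
    """Doubly robust: direct-method term plus a weighted residual correction."""
    total = 0.0
    for row in logs:
        context, action = row["context"], row["action"]
        probs = policy(context)
        model_term = sum(p * reward_model(context, a) for a, p in probs.items())
        weight = probs[action] / row["action_prob"]
        residual = row["reward"] - reward_model(context, action)
        total += model_term + weight * residual
    return total / len(logs)


reward_model = fit_mean_reward_model(logs)
dm_value = dm_estimate(logs, action_probabilities, reward_model)
dr_value = dr_estimate(logs, action_probabilities, reward_model)

print(round(dm_value, 2))  # 3.07
print(round(dr_value, 2))  # -28.93
```

The direct method alone is only as good as its reward model: here the context-blind model predicts a small positive reward, while the doubly robust correction pulls the estimate down using the observed rewards.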