Python package of estimators to perform off-policy evaluation

Estimators Library

In contextual bandits, a learning algorithm repeatedly observes a context, takes an action, and observes a reward for the chosen action. An example is content personalization: the context describes a user, actions are candidate stories, and the reward measures how much the user liked the recommended story. In essence, the algorithm is a policy that picks the best action given a context.
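The interaction loop described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the uniform-random policy, the simulated reward, and every name here (`uniform_policy`, `simulated_reward`, `collect_logs`) are hypothetical and are not part of this package's API.

```python
import random

def uniform_policy(context, actions, rng):
    """Hypothetical policy: pick an action uniformly at random.

    Returns the chosen action and the probability with which it was chosen.
    """
    action = rng.choice(actions)
    return action, 1.0 / len(actions)

def simulated_reward(context, action):
    """Hypothetical environment: reward 1.0 if the story matches the user's interest."""
    return 1.0 if action == context["interest"] else 0.0

def collect_logs(n_rounds, actions, seed=0):
    """Run the observe-context / take-action / observe-reward loop and log each round."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n_rounds):
        context = {"interest": rng.choice(actions)}            # observe a context
        action, prob = uniform_policy(context, actions, rng)   # take an action
        reward = simulated_reward(context, action)             # observe its reward
        logs.append({"context": context, "action": action,
                     "prob": prob, "reward": reward})
    return logs

logs = collect_logs(1000, ["sports", "politics", "music"])
```

Recording the probability of the chosen action alongside each round is what later makes it possible to reweight the logged rewards for a different policy.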

Given several candidate policies, the metric of interest is the expected reward each one would earn. One way to measure it is to deploy a policy online and let it choose actions (for example, recommend stories to users). However, such online evaluation can be costly for two reasons: it exposes users to an untested, experimental policy, and it doesn't scale to evaluating multiple target policies.

The alternative is off-policy evaluation: given logs collected under a logging policy, off-policy evaluation estimates the expected reward of different target policies and provides confidence intervals around those estimates.
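As a rough illustration of the idea, the sketch below implements inverse propensity scoring (IPS), one standard off-policy estimator, with a normal-approximation confidence interval. It is a minimal sketch and not necessarily how this package implements its estimators: the function `ips_estimate`, the log tuple layout, and the toy data are all assumptions made for the example.

```python
import math

def ips_estimate(logs, target_policy, z=1.96):
    """Estimate a target policy's expected reward from logged data.

    logs: list of (context, action, logging_prob, reward) tuples (assumed layout).
    target_policy(context, action): probability that the target policy
        would choose `action` in `context`.
    Returns (estimate, (lower, upper)) using a normal approximation.
    """
    if len(logs) < 2:
        raise ValueError("need at least two logged rounds")
    # Reweight each logged reward by how much more (or less) likely the
    # target policy is to take the logged action than the logging policy was.
    weighted = [target_policy(c, a) / p * r for c, a, p, r in logs]
    n = len(weighted)
    mean = sum(weighted) / n
    var = sum((w - mean) ** 2 for w in weighted) / (n - 1)  # sample variance
    half_width = z * math.sqrt(var / n)
    return mean, (mean - half_width, mean + half_width)

# Toy logs from a logging policy that chose between "a" and "b" uniformly.
toy_logs = [
    ("u1", "a", 0.5, 1.0),
    ("u2", "b", 0.5, 0.0),
    ("u3", "a", 0.5, 0.0),
    ("u4", "b", 0.5, 1.0),
]

def always_a(context, action):
    """Hypothetical target policy: always recommend story "a"."""
    return 1.0 if action == "a" else 0.0

estimate, (lower, upper) = ips_estimate(toy_logs, always_a)
```

With only four logged rounds the interval is very wide, which shows why off-policy evaluation needs enough logged data and why the logging policy must give non-zero probability to the actions the target policy would take.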

This repo collects estimators to perform such off-policy evaluation.

Tests

Run tests with:

python3 -m pytest
