Implementation of Exponential weighting for Exploration and Exploitation with Experts.
EXP4
A Python implementation of Exponential weighting for Exploration and Exploitation with Experts. Based on this blog post.
This algorithm is useful for non-stochastic contextual multi-armed bandits.
Installation
If you only need to use exp4, run:
$ pip install exp4
For developers, note that this project uses the poetry Python package/dependency management tool. Please familiarize yourself with it and then run:
$ poetry install
Usage
exp4 is centered around the exp4.exp4 function, which creates a coroutine for selecting arms given expert advice.
The protocol is as follows:
- The expert constructs an expert advice matrix.
  - Each row contains the corresponding expert's advice vector.
  - The advice vector provides probabilities for each arm.
- The expert sends a tuple of loss and advice.
  - The loss corresponds to the previous round.
  - The first round's loss is ignored.
  - The advice corresponds to the current round.
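Since each row of the advice matrix is a probability distribution over arms, it can help to check the matrix before sending it and to see how several rows combine into one distribution. The sketch below is only an illustration: `validate_advice` and `mix_advice` are hypothetical helper names, not part of the exp4 package, and the weighted average shown is the standard mixing step for this family of algorithms rather than a confirmed detail of the library.

```python
import math


def validate_advice(advice, tol=1e-9):
    """Check that every row is a probability vector; return the arm count.

    Hypothetical helper for illustration, not part of the exp4 package.
    """
    if not advice:
        raise ValueError("advice needs at least one expert row")
    n_arms = len(advice[0])
    for i, row in enumerate(advice):
        if len(row) != n_arms:
            raise ValueError(f"expert {i}: expected {n_arms} entries, got {len(row)}")
        if any(p < 0 for p in row):
            raise ValueError(f"expert {i}: negative probability")
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(f"expert {i}: probabilities do not sum to 1")
    return n_arms


def mix_advice(weights, advice):
    """Average the experts' advice rows, weighted by per-expert weights."""
    n_arms = validate_advice(advice)
    total = sum(weights)
    return [
        sum(w * row[arm] for w, row in zip(weights, advice)) / total
        for arm in range(n_arms)
    ]


advice = [
    [1/3, 1/3, 1/3],  # Expert 1
    [2/3, 1/3, 0],    # Expert 2
]
# With equal weights, each arm's probability is the mean of its column.
probs = mix_advice([1.0, 1.0], advice)
```

With equal weights the result is the column-wise mean of the matrix, so arm 0 gets probability 1/2, arm 1 gets 1/3, and arm 2 gets 1/6.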
An example is given below.
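from exp4 import exp4

player = exp4()

# Round 1: there is no previous round, so the loss is a placeholder.
loss = None  # Will be ignored.
advice = [
    [1/3, 1/3, 1/3],  # Expert 1
    [2/3, 1/3, 0],    # Expert 2
]
arm = player.send((loss, advice))
assert arm in range(3)

# Round 2: report the loss for the arm chosen in round 1.
loss = 1 / (1 + arm)  # Arbitrary loss assigned to arm.
advice = [
    [0, 0, 1],  # Expert 1
    [0, 0, 1],  # Expert 2
]
arm = player.send((loss, advice))
assert arm == 2  # Both experts put all probability on arm 2.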
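To make the protocol above more concrete, here is a minimal, self-contained sketch of a coroutine that follows the same (loss, advice) convention. It is not the exp4 package's implementation: the name `exp4_sketch`, the learning rate `eta`, the `rng` parameter, and the exact update rule (an importance-weighted loss estimate with exponential weights and no extra exploration term) are all assumptions made for illustration.

```python
import math
import random


def exp4_sketch(eta=0.1, rng=None):
    """Illustrative coroutine: receives (loss, advice), yields an arm index.

    NOT the exp4 package's code. `eta` (learning rate) and `rng` are
    assumed parameters for this sketch.
    """
    rng = rng or random.Random()
    weights = None  # one weight per expert, created on the first round
    prev = None     # (arm, probs, advice) remembered from the previous round
    arm = None
    while True:
        loss, advice = yield arm
        if weights is None:
            # First round: start with uniform weights; the loss is ignored.
            weights = [1.0] * len(advice)
        else:
            prev_arm, prev_probs, prev_advice = prev
            # Importance-weighted loss estimate for the arm that was played.
            estimate = loss / prev_probs[prev_arm]
            # Each expert is charged in proportion to the probability it
            # gave to the played arm.
            weights = [
                w * math.exp(-eta * row[prev_arm] * estimate)
                for w, row in zip(weights, prev_advice)
            ]
        total = sum(weights)
        n_arms = len(advice[0])
        # Arm distribution: weight-averaged mixture of the experts' advice.
        probs = [
            sum(w * row[a] for w, row in zip(weights, advice)) / total
            for a in range(n_arms)
        ]
        arm = rng.choices(range(n_arms), weights=probs)[0]
        prev = (arm, probs, advice)


player = exp4_sketch(rng=random.Random(0))
next(player)  # a plain generator must be advanced once before send()

arm = player.send((None, [[1/3, 1/3, 1/3], [2/3, 1/3, 0]]))
assert arm in range(3)

loss = 1 / (1 + arm)
arm = player.send((loss, [[0, 0, 1], [0, 0, 1]]))
assert arm == 2  # both experts put all probability on arm 2
```

The explicit `next(player)` call is needed only because this sketch is a plain generator. The usage example above calls `player.send` directly on the object returned by `exp4()`, without that step.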