A Pythonic microframework for Multi-Armed Bandit algorithms.
bayesianbandits
bayesianbandits is a Pythonic framework for building agents to maximize rewards in multi-armed bandit (MAB) problems. These agents can handle a number of MAB subproblems, such as contextual, restless, and delayed reward bandits.
Building an agent is as simple as defining arms and subclassing `Bandit` with a learner and policy. For example, to create an agent for a Bernoulli bandit:
```python
import numpy as np
from bayesianbandits import Bandit, Arm, epsilon_greedy, DirichletClassifier

def reward_func(x):
    return np.take(x, 0, axis=-1)

clf = DirichletClassifier({"yes": 1.0, "no": 1.0})
policy = epsilon_greedy(0.1)

class Agent(Bandit, learner=clf, policy=policy):
    arm1 = Arm("action 1", reward_func)
    arm2 = Arm("action 2", reward_func)

agent = Agent()

agent.pull()  # choose an arm and take its action
agent.update("yes")  # update with the observed reward
```
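For intuition, the epsilon-greedy policy used above can be sketched in plain NumPy. This is a toy simulation under assumed arm payoff probabilities, not the library's implementation: with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the highest estimated reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy_select(estimates, epsilon=0.1):
    """Explore a random arm with probability epsilon;
    otherwise exploit the arm with the highest estimate."""
    if rng.random() < epsilon:
        return int(rng.integers(len(estimates)))
    return int(np.argmax(estimates))

# Toy Bernoulli bandit: arm 1 pays off more often (assumed probabilities).
true_probs = [0.3, 0.7]
counts = np.zeros(2)
rewards = np.zeros(2)

for _ in range(1000):
    # Empirical mean reward per arm (avoid division by zero).
    estimates = rewards / np.maximum(counts, 1)
    arm = epsilon_greedy_select(estimates, epsilon=0.1)
    reward = float(rng.random() < true_probs[arm])
    counts[arm] += 1
    rewards[arm] += reward
```

After enough pulls, the better arm dominates the pull counts; `bayesianbandits` replaces the empirical means here with Bayesian posterior estimates from the learner.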
Getting Started
Install this package from PyPI.
```shell
pip install -U bayesianbandits
```
Usage
Check out the documentation for examples and an API reference.