Bayesian Multi Armed Bandit
🚀 Overview
A Bayesian Multi-Armed Bandit is a statistical model used in decision-making processes under uncertainty. It is a variation of the classic multi-armed bandit problem, where you have multiple options (each represented as an arm of a bandit, or slot machine) and you must choose which to pursue to maximize your rewards. The Bayesian aspect of this model comes into play by using Bayesian inference to update the probability distribution of the rewards of each arm based on prior knowledge and observed outcomes. This approach allows for a more nuanced and dynamically adaptive decision-making process, as the model continuously updates its beliefs and predictions about the performance of each option in real time. It's especially useful in scenarios where the environment changes or when dealing with limited information.
Among its use cases, we can highlight:
- Online advertising
- Website optimization
- Personalization
- Clinical trials
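To make the Bayesian updating described above concrete, here is a minimal, self-contained sketch of Thompson sampling with Beta-Bernoulli arms, the standard Bayesian treatment of binary rewards. This is illustrative NumPy code, not the bayesian_mab implementation; the `true_probs` values are invented purely to simulate reward draws.

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = np.array([0.9, 0.3, 0.9])  # unknown to the agent; only used to simulate rewards
alpha = np.ones(3)  # Beta posterior parameter per arm: prior 1 + observed successes
beta = np.ones(3)   # Beta posterior parameter per arm: prior 1 + observed failures

for _ in range(3000):
    samples = rng.beta(alpha, beta)    # one posterior draw per arm
    arm = int(np.argmax(samples))      # play the arm with the highest sampled rate
    reward = rng.binomial(1, true_probs[arm])
    alpha[arm] += reward               # conjugate update: Beta(alpha + r, beta + 1 - r)
    beta[arm] += 1 - reward

print("Posterior means:", alpha / (alpha + beta))
```

Each iteration samples a plausible reward rate from every arm's posterior and plays the arm whose sample is highest, so exploration fades away naturally as the posteriors sharpen around the true rates.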
💻 Example Usage
For Bayesian Multi-Armed Bandits (MAB), first initialize the BayesianMAB and provide it with the arm objects, each given an index and a name; this will be useful for determining the winner later on.
We provide an example of for loops giving positive (1) and negative (0) rewards to each arm; you can adapt it as you want.
At any moment, we can check whether we already have a winner, using the BayesianMAB.check_for_end method.
```python
from bayesian_mab import BayesianMAB, BayesianArm, BinaryReward
import numpy as np

binary_reward = BinaryReward()
bayesian_mab = BayesianMAB(
    arms=[
        BayesianArm(index=0, arm_name="Ad #1"),
        BayesianArm(index=1, arm_name="Ad #2"),
        BayesianArm(index=2, arm_name="Ad #3"),
    ]
)

# Arm 0: only a few pulls, high success rate
for i in range(4):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=0, reward_agent=binary_reward)

# Arm 1: many pulls, low success rate
for i in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.3))
    bayesian_mab.update_arm(chosen_arm=1, reward_agent=binary_reward)

# Arm 2: many pulls, high success rate
for i in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=2, reward_agent=binary_reward)

# Declare a winner once an arm's probability of being best exceeds 80%
flg_end, winner_arm = bayesian_mab.check_for_end(winner_prob_threshold=0.80)
print("Is there a winner? {}. Winner: {}".format(flg_end, winner_arm))
```
Acknowledgments and References
- Cook, J., 2005. Exact calculation of beta inequalities. Houston: University of Texas, MD Anderson Cancer Center.
- Slivkins, A., 2019. Introduction to multi-armed bandits. Foundations and Trends® in Machine Learning, 12(1-2), pp. 1-286.
- White, J., 2013. Bandit algorithms for website optimization. O'Reilly Media, Inc.
- Bruce, P., Bruce, A. and Gedeck, P., 2020. Practical statistics for data scientists: 50+ essential concepts using R and Python. O'Reilly Media.
- Thanks to Vincenzo Lavorini for his Towards Data Science blog post.