Bayesian Multi Armed Bandit

🚀 Overview

A Bayesian Multi-Armed Bandit is a statistical model used in decision-making processes under uncertainty. It is a variation of the classic multi-armed bandit problem, where you have multiple options (each represented as an arm of a bandit, or slot machine) and you must choose which to pursue to maximize your rewards. The Bayesian aspect of this model comes into play by using Bayesian inference to update the probability distribution of the rewards of each arm based on prior knowledge and observed outcomes. This approach allows for a more nuanced and dynamically adaptive decision-making process, as the model continuously updates its beliefs and predictions about the performance of each option in real time. It's especially useful in scenarios where the environment changes or when dealing with limited information.
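
The belief update described above can be sketched for the simplest case, binary rewards with a Beta prior. This is a minimal illustration under stated assumptions, not the bayesian_mab implementation: the names BetaPosterior, thompson_select and true_rates are hypothetical, the prior is assumed to be uniform Beta(1, 1), and Thompson sampling is used here as one common way to choose an arm from the current beliefs.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BetaPosterior:
    """Beta(alpha, beta) belief about one arm's success probability."""

    alpha: float = 1.0  # prior pseudo-count of successes (uniform prior, an assumption)
    beta: float = 1.0   # prior pseudo-count of failures

    def update(self, reward: int) -> None:
        # Conjugate update: a success increments alpha, a failure increments beta.
        if reward not in (0, 1):
            raise ValueError("reward must be 0 or 1")
        self.alpha += reward
        self.beta += 1 - reward

    def mean(self) -> float:
        # Posterior mean of the success probability.
        return self.alpha / (self.alpha + self.beta)


def thompson_select(posteriors, rng):
    """Draw one sample per arm and return the index of the largest draw."""
    draws = [rng.beta(p.alpha, p.beta) for p in posteriors]
    return int(np.argmax(draws))


rng = np.random.default_rng(0)
true_rates = [0.3, 0.9]  # hypothetical success rates, unknown to the algorithm
posteriors = [BetaPosterior() for _ in true_rates]
pulls = [0 for _ in true_rates]

for _ in range(500):
    arm = thompson_select(posteriors, rng)
    reward = int(rng.random() < true_rates[arm])  # simulated binary outcome
    posteriors[arm].update(reward)
    pulls[arm] += 1

print("pulls per arm:", pulls)
print("posterior means:", [round(p.mean(), 3) for p in posteriors])
```

Because each arm is chosen in proportion to how plausible it is that the arm is the best one, most pulls tend to drift toward the better arm as evidence accumulates, while weaker arms are still tried occasionally.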

Typical use cases include:

  • Online advertising
  • Website optimization
  • Personalization
  • Clinical trials

💻 Example Usage

To use a Bayesian Multi-Armed Bandit (MAB), first initialize BayesianMAB with a list of BayesianArm objects, each given an index and a name. The name makes it easy to identify the winning arm later on.

The example below uses for loops to feed positive (1) and negative (0) rewards to each arm; you can supply rewards in whatever way suits your application.

At any moment, you can check whether there is already a winner with the BayesianMAB.check_for_end method.

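# BayesianArm is used below, so it is imported here as well
# (assumption: it is exported from the package top level).
from bayesian_mab import BayesianArm, BayesianMAB, BinaryReward
import numpy as np

# Reward agent that holds the latest binary (0/1) outcome
binary_reward = BinaryReward()

# One arm per ad; the name is used to report the winner
bayesian_mab = BayesianMAB(
    arms=[
        BayesianArm(index=0, arm_name="Ad #1"),
        BayesianArm(index=1, arm_name="Ad #2"),
        BayesianArm(index=2, arm_name="Ad #3"),
    ]
)

# Arm 0: only 4 observations, simulated success rate 0.9
for _ in range(4):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=0, reward_agent=binary_reward)

# Arm 1: 1500 observations, simulated success rate 0.3
for _ in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.3))
    bayesian_mab.update_arm(chosen_arm=1, reward_agent=binary_reward)

# Arm 2: 1500 observations, simulated success rate 0.9
for _ in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=2, reward_agent=binary_reward)

# Ask whether any arm is the winner with probability of at least 0.80
flg_end, winner_arm = bayesian_mab.check_for_end(winner_prob_threshold=0.80)

print("Is there a winner? {}. Winner: {}".format(flg_end, winner_arm))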
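
How check_for_end decides on a winner is not shown here. As a rough sketch of the underlying idea, the probability that each arm is the best one can be estimated by sampling from Beta posteriors and counting how often each arm comes out on top. The function prob_each_arm_is_best, the uniform Beta(1, 1) prior and the success/failure counts below are all hypothetical; the library may use a different method, such as the exact beta inequality calculation cited in the references.

```python
import numpy as np


def prob_each_arm_is_best(successes, failures, n_samples=20_000, seed=0):
    """Monte Carlo estimate of P(arm k has the highest success rate)."""
    rng = np.random.default_rng(seed)
    # Posterior parameters under an assumed uniform Beta(1, 1) prior.
    alpha = 1.0 + np.asarray(successes, dtype=float)
    beta = 1.0 + np.asarray(failures, dtype=float)
    # One row per joint draw, one column per arm.
    samples = rng.beta(alpha, beta, size=(n_samples, alpha.size))
    best = samples.argmax(axis=1)
    # Fraction of draws in which each arm had the largest sampled rate.
    return np.bincount(best, minlength=alpha.size) / n_samples


# Hypothetical counts loosely mirroring the example: arm 0 has only 4 pulls.
probs = prob_each_arm_is_best(
    successes=[4, 450, 1350],
    failures=[0, 1050, 150],
)
winner = int(probs.argmax())
has_winner = bool(probs[winner] >= 0.80)  # same threshold as the example

print("P(best) per arm:", probs)
print("Is there a winner? {}. Winner index: {}".format(has_winner, winner))
```

With these counts, the sparsely tested arm 0 remains a plausible best arm (under Beta(5, 1), the probability that its rate exceeds 0.9 is 1 − 0.9^5 ≈ 0.41), so no arm reaches the 0.80 threshold in this sketch. Collecting more observations for an under-sampled arm is what resolves that uncertainty.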

Acknowledgments and References

  • Cook, J., 2005. Exact calculation of beta inequalities. Houston: University of Texas, MD Anderson Cancer Center.
  • Slivkins, A., 2019. Introduction to multi-armed bandits. Foundations and Trends® in Machine Learning, 12(1-2), pp.1-286.
  • White, J., 2013. Bandit algorithms for website optimization. O'Reilly Media.
  • Bruce, P., Bruce, A. and Gedeck, P., 2020. Practical statistics for data scientists: 50+ essential concepts using R and Python. O'Reilly Media.
  • Thanks to Vincenzo Lavorini for a Towards Data Science blog post.

