Multi-armed bandit algorithms

The multi-armed bandit (MAB) problem is one in which a fixed, limited set of resources must be allocated among competing (alternative) choices so as to maximize the expected gain, when each choice's properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to that choice.

The name comes from a gambler facing a row of slot machines ("one-armed bandits") who must decide which machines to play. In this formulation, each machine provides a random reward drawn from a probability distribution specific to that machine. The objective of the gambler is to maximize the sum of rewards earned through a sequence of lever pulls. The crucial trade-off the gambler faces at each trial is between "exploitation" of the machine with the highest expected payoff so far and "exploration" to get more information about the expected payoffs of the other machines. The same trade-off between exploration and exploitation arises in machine learning.
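To make the trade-off concrete, here is a minimal sketch of a simulated gambler playing machines with Bernoulli (win or lose) rewards. The names `pull` and `play`, the payoff probabilities, and the `explore_rate` parameter are illustrative assumptions and are not part of mabalgs. The random-exploration rule used here is a simple epsilon-greedy scheme, chosen only to illustrate the idea.

```python
import random


def pull(probability, rng):
    """Return a reward of 1 with the given probability, else 0."""
    return 1 if rng.random() < probability else 0


def play(payoffs, n_pulls, explore_rate, seed=0):
    """Play n_pulls rounds and return (total reward, pulls per machine).

    With probability explore_rate the gambler explores a random machine;
    otherwise the gambler exploits the machine with the best observed mean.
    Untried machines are given an estimated mean of 0.0 (an assumption).
    """
    rng = random.Random(seed)
    counts = [0] * len(payoffs)   # pulls per machine
    totals = [0] * len(payoffs)   # summed rewards per machine
    for _ in range(n_pulls):
        if rng.random() < explore_rate:
            arm = rng.randrange(len(payoffs))  # exploration
        else:
            means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
            arm = means.index(max(means))      # exploitation
        reward = pull(payoffs[arm], rng)
        counts[arm] += 1
        totals[arm] += reward
    return sum(totals), counts


# Hypothetical machines: the second one pays off far more often.
greedy_total, greedy_counts = play([0.2, 0.8], 1000, explore_rate=0.0)
mixed_total, mixed_counts = play([0.2, 0.8], 1000, explore_rate=0.1)
```

With `explore_rate=0.0` the gambler never leaves the first machine, because the estimate for the untried machine stays at 0. A small amount of exploration is what lets the better machine be discovered.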

The main problem that MAB algorithms help to solve is how to split the population in online experiments.

Installing

pip install mabalgs

Algorithms (Bandit strategies)

UCB1 (Upper Confidence Bound)

UCB1 is an algorithm for the multi-armed bandit problem whose regret grows only logarithmically with the number of actions taken. It requires no prior knowledge of the reward distributions.
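As a rough illustration of how UCB1 works, the sketch below implements the standard UCB1 index for rewards in [0, 1]: each arm is played once, and afterwards the arm with the largest value of (observed mean + sqrt(2 ln n / n_j)) is chosen, where n is the total number of plays and n_j is the number of plays of arm j. The class `UCB1Sketch`, its `update` method, and the `simulate` helper are hypothetical names written for this example. This is not the mabalgs implementation, whose internals may differ.

```python
import math
import random


class UCB1Sketch:
    """Illustrative UCB1 for rewards in [0, 1]; not the mabalgs code."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms     # times each arm was played
        self.totals = [0.0] * n_arms   # summed rewards per arm

    def select(self):
        # Play every arm once before using the confidence bound.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total_plays = sum(self.counts)
        scores = [
            self.totals[arm] / self.counts[arm]
            + math.sqrt(2.0 * math.log(total_plays) / self.counts[arm])
            for arm in range(len(self.counts))
        ]
        return scores.index(max(scores))

    def update(self, arm, reward):
        # Record one play of the arm and the reward it produced.
        self.counts[arm] += 1
        self.totals[arm] += reward


def simulate(payoffs, n_rounds, seed=0):
    """Run UCB1 against simulated Bernoulli arms; return plays per arm."""
    rng = random.Random(seed)
    bandit = UCB1Sketch(len(payoffs))
    for _ in range(n_rounds):
        arm = bandit.select()
        reward = 1.0 if rng.random() < payoffs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.counts


# Hypothetical payoff probabilities for two arms.
plays_per_arm = simulate([0.2, 0.8], 2000)
```

The bonus term shrinks as an arm is played more often, so rarely played arms keep getting occasional attention while most plays go to the arm with the best observed mean.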

Get a selected arm

from mab import algs

# Create a UCB1 bandit with two arms.
ucb_with_two_arms = algs.UCB1(2)
# Ask the algorithm which arm to play next.
selected_arm = ucb_with_two_arms.select()

Reward an arm

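from mab import algs

# Create a UCB1 bandit with two arms and select one of them.
ucb_with_two_arms = algs.UCB1(2)
my_arm = ucb_with_two_arms.select()
# Record a reward for the selected arm.
ucb_with_two_arms.reward(my_arm)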
