Solve the multi-armed bandit problem with Thompson Sampling, Upper Confidence Bound (UCB), and randomized sampling.

Project description

Multi-armed bandit

  • thompson is a Python package for evaluating the multi-armed bandit problem. In addition to Thompson sampling, it implements the Upper Confidence Bound (UCB) algorithm and a randomized baseline.
  • In probability theory, the multi-armed bandit problem is one in which a fixed, limited set of resources must be allocated between competing (alternative) choices so as to maximize the expected gain. Each choice's properties are only partially known at the time of allocation, and may become better understood as time passes or as resources are allocated to that choice. It is a classic reinforcement learning problem that exemplifies the exploration-exploitation tradeoff (Wikipedia).
  • In the problem, each machine provides a random reward from a probability distribution specific to that machine. The gambler's objective is to maximize the sum of rewards earned through a sequence of lever pulls. At each trial the gambler must choose between "exploitation" of the machine with the highest expected payoff and "exploration" to learn more about the expected payoffs of the other machines. The same tradeoff arises in machine learning. In practice, multi-armed bandits have been used to model problems such as managing research projects in a large organization like a science foundation or a pharmaceutical company (Wikipedia).
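The exploration-exploitation tradeoff described above can be sketched with a minimal Beta-Bernoulli Thompson sampler. This is an illustrative sketch using only the standard library, not the package's own implementation; the function name `thompson_sampling`, the simulated payoff rates, and the seeds are all assumptions made for the example.

```python
import random

def thompson_sampling(rewards, seed=0):
    """Replay a rounds-by-arms table of 0/1 rewards with Thompson sampling."""
    rng = random.Random(seed)
    n_arms = len(rewards[0])
    wins = [0] * n_arms    # rewards of 1 observed per arm
    losses = [0] * n_arms  # rewards of 0 observed per arm
    pulls = [0] * n_arms   # how often each arm was selected
    for row in rewards:
        # Draw one plausible payoff rate per arm from its Beta posterior.
        # Uncertain arms produce widely spread draws (exploration), while
        # well-known good arms produce consistently high draws (exploitation).
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1)
                   for a in range(n_arms)]
        arm = samples.index(max(samples))
        if row[arm] == 1:
            wins[arm] += 1
        else:
            losses[arm] += 1
        pulls[arm] += 1
    return pulls, wins, losses

# Hypothetical simulated data: three arms with payoff rates 5%, 10% and 30%.
data_rng = random.Random(42)
true_rates = [0.05, 0.10, 0.30]
rewards = [[1 if data_rng.random() < p else 0 for p in true_rates]
           for _ in range(2000)]

pulls, wins, losses = thompson_sampling(rewards)
print("pulls per arm:", pulls)
```

Only the reward of the selected arm is used in each round, which mirrors the gambler seeing the outcome of a single lever pull.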

Installation

  • Install thompson from PyPI (recommended). thompson is compatible with Python 3.6+ and runs on Linux, macOS, and Windows.
  • Distributed under the MIT license.

Requirements

pip install matplotlib numpy pandas

Quick Start

pip install thompson
  • Alternatively, install thompson from the GitHub source:
git clone https://github.com/erdogant/thompson.git
cd thompson
python setup.py install

Import thompson package

import thompson as mab

Load example data:

df = mab.example_data()

Compute multi-armed bandit using thompson

out = mab.thompson(df)
fig = mab.plot(out)

Compute multi-armed bandit using Upper Confidence Bound (UCB)

out = mab.UCB(df)
fig = mab.plot(out)
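For intuition about what an upper-confidence-bound strategy does, here is a minimal sketch of the textbook UCB1 rule using only the standard library. It is not necessarily what `mab.UCB` computes internally; the function name `ucb1` and the simulated payoff rates are assumptions made for the example.

```python
import math
import random

def ucb1(rewards):
    """Replay a rounds-by-arms table of 0/1 rewards with the UCB1 rule."""
    n_arms = len(rewards[0])
    counts = [0] * n_arms  # how often each arm was selected
    sums = [0] * n_arms    # total reward collected per arm
    for t, row in enumerate(rewards):
        if t < n_arms:
            # Play every arm once so each has an initial estimate.
            arm = t
        else:
            # Score = observed mean + confidence bonus. The bonus shrinks
            # as an arm is played more often, so rarely played arms keep
            # getting occasional exploratory pulls.
            scores = [sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a])
                      for a in range(n_arms)]
            arm = scores.index(max(scores))
        counts[arm] += 1
        sums[arm] += row[arm]
    return counts, sums

# Hypothetical simulated data: three arms with payoff rates 5%, 10% and 30%.
data_rng = random.Random(42)
true_rates = [0.05, 0.10, 0.30]
rewards = [[1 if data_rng.random() < p else 0 for p in true_rates]
           for _ in range(3000)]

counts, sums = ucb1(rewards)
print("pulls per arm:", counts)
```

Unlike Thompson sampling, this rule is deterministic given the data: all exploration comes from the confidence bonus rather than from random draws.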

Compute multi-armed bandit using randomized data

out = mab.UCB_random(df)
fig = mab.plot(out)
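A random strategy is useful mainly as a baseline to compare the other two against. The sketch below assumes the baseline picks an arm uniformly at random in every round; whether `mab.UCB_random` does exactly this is an assumption, and the function name `random_selection` and the simulated data are hypothetical.

```python
import random

def random_selection(rewards, seed=0):
    """Replay a rounds-by-arms table of 0/1 rewards, picking arms at random."""
    rng = random.Random(seed)
    n_arms = len(rewards[0])
    counts = [0] * n_arms  # how often each arm was selected
    total = 0              # total reward collected
    for row in rewards:
        arm = rng.randrange(n_arms)  # no learning: every arm equally likely
        counts[arm] += 1
        total += row[arm]
    return counts, total

# Hypothetical simulated data: three arms with payoff rates 5%, 10% and 30%.
data_rng = random.Random(42)
true_rates = [0.05, 0.10, 0.30]
rewards = [[1 if data_rng.random() < p else 0 for p in true_rates]
           for _ in range(3000)]

counts, total = random_selection(rewards)
print("pulls per arm:", counts)
print("total reward:", total)
```

Because the pulls are spread roughly evenly across arms, the total reward tracks the average payoff rate of all arms rather than that of the best one.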

df looks like this:

      Ad 1  Ad 2  Ad 3  Ad 4  Ad 5  Ad 6  Ad 7  Ad 8  Ad 9  Ad 10
0        1     0     0     0     1     0     0     0     1      0
1        0     0     0     0     0     0     0     0     1      0
2        0     0     0     0     0     0     0     0     0      0
3        0     1     0     0     0     0     0     1     0      0
4        0     0     0     0     0     0     0     0     0      0
   ...   ...   ...   ...   ...   ...   ...   ...   ...    ...
9995     0     0     1     0     0     0     0     1     0      0
9996     0     0     0     0     0     0     0     0     0      0
9997     0     0     0     0     0     0     0     0     0      0
9998     1     0     0     0     0     0     0     1     0      0
9999     0     1     0     0     0     0     0     0     0      0

[10000 rows x 10 columns]

Contribute

  • All kinds of contributions are welcome!
  • If you would like to buy me a coffee for this work, it is much appreciated :)

License

See LICENSE for details.
