Multi-armed bandit analysis using Thompson sampling, Upper Confidence Bound (UCB), and randomized sampling.

Multi-armed bandit

  • thompson is a Python package for evaluating the multi-armed bandit problem. In addition to Thompson sampling, it implements the Upper Confidence Bound (UCB) algorithm and a randomized variant.
  • In probability theory, the multi-armed bandit problem is one in which a fixed, limited set of resources must be allocated among competing (alternative) choices so as to maximize the expected gain, when each choice's properties are only partially known at the time of allocation and may become better understood as time passes or as resources are allocated to that choice. It is a classic reinforcement learning problem that exemplifies the exploration-exploitation tradeoff (source: Wikipedia).
  • In the problem, each machine provides a random reward drawn from a probability distribution specific to that machine. The gambler's objective is to maximize the sum of rewards earned through a sequence of lever pulls. The crucial tradeoff at each trial is between "exploitation" of the machine with the highest expected payoff and "exploration" to learn more about the expected payoffs of the other machines. The same tradeoff arises in machine learning. In practice, multi-armed bandits have been used to model problems such as managing research projects in a large organization, like a science foundation or a pharmaceutical company (source: Wikipedia).
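
The stakes of the tradeoff can be made concrete with a little arithmetic. The sketch below is illustrative only and is not part of the thompson package: the function name `expected_total_reward` and the payout probabilities are hypothetical, and it assumes each machine pays a reward of 1 with a fixed probability. It compares always pulling the best machine with pulling a machine chosen uniformly at random.

```python
def expected_total_reward(payout_probs, n_pulls, arm=None):
    """Expected summed reward over n_pulls lever pulls.

    arm=None  -> every pull picks a machine uniformly at random.
    arm=k     -> machine k is pulled every time.
    """
    if arm is None:
        per_pull = sum(payout_probs) / len(payout_probs)
    else:
        per_pull = payout_probs[arm]
    return n_pulls * per_pull


# Hypothetical payout probabilities; in a real bandit these are unknown.
payout_probs = [0.1, 0.3, 0.6]
best_arm = max(range(len(payout_probs)), key=payout_probs.__getitem__)

print("always best machine:", expected_total_reward(payout_probs, 1000, arm=best_arm))
print("uniformly at random:", expected_total_reward(payout_probs, 1000))
```

The difference between the two numbers is what a bandit algorithm tries to recover: it has to find the best machine without knowing the probabilities, while wasting as few pulls as possible on the others.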

Installation

  • Install thompson from PyPI (recommended). thompson is compatible with Python 3.6+ and runs on Linux, macOS and Windows.
  • Distributed under the MIT license.

Requirements

pip install matplotlib numpy pandas

Quick Start

pip install thompson
  • Alternatively, install thompson from the GitHub source:
git clone https://github.com/erdogant/thompson.git
cd thompson
python setup.py install

Import the thompson package

import thompson as mab

Load example data:

df = mab.example_data()

Compute the multi-armed bandit using Thompson sampling

out = mab.thompson(df)
fig = mab.plot(out)
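
For readers who want to see the idea behind Thompson sampling, here is a minimal sketch using only the Python standard library. It is not the package's implementation, and the function name `thompson_sampling` is hypothetical. It assumes the input is a table of 0/1 rewards like the example data (one row per round, one column per arm) and keeps a Beta posterior per arm: each round it draws one sample per arm, plays the arm with the largest sample, and updates that arm with the observed reward.

```python
import random


def thompson_sampling(rows, seed=0):
    """Sketch of Beta-Bernoulli Thompson sampling on a table of 0/1 rewards.

    rows: list of rounds, each a list with one 0/1 reward per arm.
    Returns (list of chosen arm indices, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(rows[0])
    wins = [0] * n_arms    # rewards of 1 observed per arm
    losses = [0] * n_arms  # rewards of 0 observed per arm
    chosen = []
    total_reward = 0

    for row in rows:
        # Draw one plausible success rate per arm from its Beta posterior
        # (Beta(1, 1), i.e. uniform, before any observation).
        samples = [rng.betavariate(wins[i] + 1, losses[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)

        # Only the reward of the arm that was played is observed.
        reward = row[arm]
        if reward == 1:
            wins[arm] += 1
        else:
            losses[arm] += 1

        chosen.append(arm)
        total_reward += reward

    return chosen, total_reward
```

With a pandas DataFrame such as the example data, `thompson_sampling(df.values.tolist())` would feed the rows in. Arms that rarely pay off end up with posteriors concentrated near zero, so they are sampled less and less often.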

Compute the multi-armed bandit using Upper Confidence Bound (UCB)

out = mab.UCB(df)
fig = mab.plot(out)
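
As a rough illustration of the UCB idea, the sketch below implements the textbook UCB1 rule with the standard library. It is an assumption that this resembles what `mab.UCB` does; the package may use a different confidence term. The function name `ucb1` is hypothetical. Each round, every arm gets a score equal to its average reward so far plus a bonus that shrinks as the arm is played more often, and the arm with the highest score is played.

```python
import math


def ucb1(rows):
    """Sketch of the UCB1 rule on a table of 0/1 rewards.

    rows: list of rounds, each a list with one reward per arm.
    Returns (list of chosen arm indices, total reward collected).
    """
    n_arms = len(rows[0])
    counts = [0] * n_arms  # how often each arm was played
    sums = [0] * n_arms    # summed reward per arm
    chosen = []
    total_reward = 0

    for t, row in enumerate(rows, start=1):
        bounds = []
        for i in range(n_arms):
            if counts[i] == 0:
                # Unplayed arms get an infinite bound so each is tried once.
                bounds.append(math.inf)
            else:
                mean = sums[i] / counts[i]
                bonus = math.sqrt(2 * math.log(t) / counts[i])
                bounds.append(mean + bonus)

        # Ties are broken in favour of the lowest arm index.
        arm = max(range(n_arms), key=bounds.__getitem__)
        reward = row[arm]

        counts[arm] += 1
        sums[arm] += reward
        chosen.append(arm)
        total_reward += reward

    return chosen, total_reward
```

Unlike Thompson sampling, this rule is deterministic: given the same table it always produces the same sequence of choices.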

Compute the multi-armed bandit using randomized sampling

out = mab.UCB_random(df)
fig = mab.plot(out)
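
What `mab.UCB_random` does internally is not described here. A common reference point for bandit algorithms is a policy that picks an arm uniformly at random every round, and the sketch below shows that baseline under this assumption. The function name `random_selection` is hypothetical, and the code uses only the standard library.

```python
import random


def random_selection(rows, seed=0):
    """Sketch of a uniform random baseline on a table of 0/1 rewards.

    rows: list of rounds, each a list with one reward per arm.
    Returns (list of chosen arm indices, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(rows[0])
    chosen = []
    total_reward = 0

    for row in rows:
        arm = rng.randrange(n_arms)  # no learning: every arm is equally likely
        chosen.append(arm)
        total_reward += row[arm]

    return chosen, total_reward
```

Because this policy never learns, its total reward gives a floor against which a learning strategy can be compared on the same data.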

df looks like this:

      Ad 1  Ad 2  Ad 3  Ad 4  Ad 5  Ad 6  Ad 7  Ad 8  Ad 9  Ad 10
0        1     0     0     0     1     0     0     0     1      0
1        0     0     0     0     0     0     0     0     1      0
2        0     0     0     0     0     0     0     0     0      0
3        0     1     0     0     0     0     0     1     0      0
4        0     0     0     0     0     0     0     0     0      0
   ...   ...   ...   ...   ...   ...   ...   ...   ...    ...
9995     0     0     1     0     0     0     0     1     0      0
9996     0     0     0     0     0     0     0     0     0      0
9997     0     0     0     0     0     0     0     0     0      0
9998     1     0     0     0     0     0     0     1     0      0
9999     0     1     0     0     0     0     0     0     0      0

[10000 rows x 10 columns]
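
Each row appears to be one round and each column one ad, with 1 presumably meaning that the ad was clicked in that round. To experiment with tables of the same shape, something like the following sketch could generate synthetic data. The function name `simulate_clicks` and the click rates are hypothetical, and the real example data may have been produced differently.

```python
import random


def simulate_clicks(click_rates, n_rounds, seed=0):
    """Generate a table of 0/1 rewards, one row per round, one column per ad.

    click_rates: assumed probability of a click for each ad.
    """
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p else 0 for p in click_rates]
        for _ in range(n_rounds)
    ]


# Hypothetical click rates for three ads.
rows = simulate_clicks([0.05, 0.10, 0.25], n_rounds=5)
for row in rows:
    print(row)
```

If pandas is installed, `pandas.DataFrame(rows, columns=["Ad 1", "Ad 2", "Ad 3"])` turns the result into a DataFrame with the same layout as the table above.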

Citation

Please cite thompson in your publications if it is useful for your research. Here is an example BibTeX entry:

@misc{erdogant2019thompson,
  title={thompson},
  author={Erdogan Taskesen},
  year={2019},
  howpublished={\url{https://github.com/erdogant/thompson}},
}

Contribute

  • All kinds of contributions are welcome!

© Copyright

See LICENSE for details.

This version: 0.1.2
