MABWiser: Parallelizable Contextual Multi-Armed Bandits

MABWiser (IJAIT 2021, ICTAI 2019) is a research library written in Python for rapid prototyping of multi-armed bandit algorithms. It supports context-free, parametric, and non-parametric contextual bandit models, and provides built-in parallelization for both training and testing.

The library also provides a simulation utility for comparing different policies and performing hyper-parameter tuning. MABWiser follows a scikit-learn style public interface, adheres to PEP-8 standards, and is heavily tested.
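
As a rough illustration of the simulation utility, the sketch below compares two policies on a small synthetic log. It assumes the Simulator in mabwiser.simulator accepts a list of (name, MAB) pairs together with historical decisions and rewards; the data values are made up for demonstration.

# A minimal sketch of the simulation utility (assumed API: Simulator takes
# (name, MAB) pairs plus a historical log of decisions and rewards).
from mabwiser.mab import MAB, LearningPolicy
from mabwiser.simulator import Simulator

# Hypothetical historical log: the arm shown and the reward observed
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1', 'Arm2', 'Arm2']
rewards = [20, 17, 25, 9, 11, 23]

# Candidate policies to compare offline
bandits = [('Random', MAB(['Arm1', 'Arm2'], LearningPolicy.Random())),
           ('UCB1', MAB(['Arm1', 'Arm2'], LearningPolicy.UCB1(alpha=1.25)))]

# Replay the log on a train/test split and collect per-policy statistics
sim = Simulator(bandits, decisions, rewards, test_size=0.5, batch_size=0)
sim.run()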

MABWiser is developed by the Artificial Intelligence Center of Excellence at Fidelity Investments. Documentation is available at fidelity.github.io/mabwiser.

Bandit-based Recommender Systems

To solve personalized recommendation problems, MABWiser is integrated into our Mab2Rec library. Mab2Rec enables building content- and context-aware recommender systems, where MABWiser is used to select the next best item (arm).

Bandit-based Large-Neighborhood Search

To solve combinatorial optimization problems, MABWiser is integrated into Adaptive Large Neighborhood Search. The ALNS library enables building metaheuristics for complex optimization problems, where MABWiser is used to select the next best destroy and repair operations (arms).

Quick Start

# An example that shows how to use the UCB1 learning policy
# to choose between two arms based on their expected rewards.

# Import MABWiser Library
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy

# Data
arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

# Model 
mab = MAB(arms, LearningPolicy.UCB1(alpha=1.25))

# Train
mab.fit(decisions, rewards)

# Test
mab.predict()
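
The same interface extends to contextual bandits. Below is a minimal sketch that pairs UCB1 with a Radius neighborhood policy; the context vectors are made-up example data.

# Contextual extension of the example above (context vectors are illustrative)
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy

arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]
contexts = [[0, 1, 1], [1, 1, 0], [0, 0, 0], [0, 1, 0]]

# Combine a learning policy with a neighborhood policy for contextual decisions
contextual_mab = MAB(arms,
                     LearningPolicy.UCB1(alpha=1.25),
                     NeighborhoodPolicy.Radius(radius=5))

# Train with contexts and predict the best arm for a new context
contextual_mab.fit(decisions, rewards, contexts)
contextual_mab.predict([[1, 1, 0]])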

Available Bandit Policies

Available Learning Policies:

  • Epsilon Greedy [1, 2]
  • LinGreedy [1, 2]
  • LinTS [3]. See [11] for a formal treatment of reproducibility in LinTS
  • LinUCB [4]
  • Popularity [2]
  • Random [2]
  • Softmax [2]
  • Thompson Sampling (TS) [5]
  • Upper Confidence Bound (UCB1) [2]

Available Neighborhood Policies:

  • Clusters [6]
  • K-Nearest [7, 8]
  • LSH Nearest [9]
  • Radius [7, 8]
  • TreeBandit [10]
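
For illustration, the sketch below shows how a few of the policies listed above are instantiated through the LearningPolicy and NeighborhoodPolicy namespaces; the constructor names are assumed to mirror the list above, and the hyper-parameter values are chosen arbitrarily.

# Illustrative instantiations only; hyper-parameter values are arbitrary and
# constructor names are assumed to match the policy names listed above.
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy

arms = ['Arm1', 'Arm2']

# Context-free learning policies
greedy = MAB(arms, LearningPolicy.EpsilonGreedy(epsilon=0.1))
thompson = MAB(arms, LearningPolicy.ThompsonSampling())

# A learning policy wrapped by a neighborhood policy for contextual data;
# n_jobs is assumed to control the built-in training/prediction parallelism
knearest = MAB(arms,
               LearningPolicy.UCB1(alpha=1.25),
               NeighborhoodPolicy.KNearest(k=3),
               n_jobs=-1)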

Installation

MABWiser requires Python 3.8+ and can be installed from PyPI using pip install mabwiser or by building from source as shown in the installation instructions.

Support

Please submit bug reports and feature requests as Issues.

Citation

If you use MABWiser in a publication, please cite it as:

    @article{DBLP:journals/ijait/StrongKK21,
      author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
      title     = {{MABWiser:} Parallelizable Contextual Multi-armed Bandits},
      journal   = {Int. J. Artif. Intell. Tools},
      volume    = {30},
      number    = {4},
      pages     = {2150021:1--2150021:19},
      year      = {2021},
      url       = {https://doi.org/10.1142/S0218213021500214},
      doi       = {10.1142/S0218213021500214},
    }

    @inproceedings{DBLP:conf/ictai/StrongKK19,
    author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
    title     = {MABWiser: {A} Parallelizable Contextual Multi-Armed Bandit Library for Python},
    booktitle = {31st {IEEE} International Conference on Tools with Artificial Intelligence, {ICTAI} 2019, Portland, OR, USA, November 4-6, 2019},
    pages     = {909--914},
    publisher = {{IEEE}},
    year      = {2019},
    url       = {https://doi.org/10.1109/ICTAI.2019.00129},
    doi       = {10.1109/ICTAI.2019.00129},
    }

License

MABWiser is licensed under the Apache License 2.0.

References

  1. John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits
  2. Volodymyr Kuleshov and Doina Precup. Algorithms for multi-armed bandit problems
  3. Shipra Agrawal and Navin Goyal. Thompson sampling for contextual bandits with linear payoffs
  4. Wei Chu, Lihong Li, Lev Reyzin, and Robert Schapire. Contextual bandits with linear payoff functions
  5. Ian Osband, Daniel Russo, and Benjamin Van Roy. More efficient reinforcement learning via posterior sampling
  6. Trong T. Nguyen and Hady W. Lauw. Dynamic clustering of contextual multi-armed bandits
  7. Melody Y. Guan and Heinrich Jiang. Nonparametric stochastic contextual bandits
  8. Philippe Rigollet and Assaf Zeevi. Nonparametric bandits with covariates
  9. Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, and Santosh Vempala. Locality-preserving hashing in multidimensional spaces
  10. Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, and Marek Petrik. A practical method for solving contextual bandit problems using decision trees
  11. Doruk Kilitcioglu and Serdar Kadioglu. Non-deterministic behavior of Thompson sampling with linear payoffs and how to avoid it

Download files

Source Distribution

mabwiser-2.7.4.tar.gz (95.4 kB)

Built Distribution

mabwiser-2.7.4-py3-none-any.whl (61.1 kB)

File details

Details for the file mabwiser-2.7.4.tar.gz.

File metadata

  • Download URL: mabwiser-2.7.4.tar.gz
  • Upload date:
  • Size: 95.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.19

File hashes

Hashes for mabwiser-2.7.4.tar.gz:

  • SHA256: 82055ada833cd81617e67c87630fc9895f05790ebe4679fdf6b1eafccdcf821c
  • MD5: f9c01089262e319dfff6ff7d0369bc87
  • BLAKE2b-256: 5e051ee61041c3f73e0a0df945eda76f29fe6181443475bb9d5d95b4d453112c

File details

Details for the file mabwiser-2.7.4-py3-none-any.whl.

File metadata

  • Download URL: mabwiser-2.7.4-py3-none-any.whl
  • Upload date:
  • Size: 61.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.8.19

File hashes

Hashes for mabwiser-2.7.4-py3-none-any.whl:

  • SHA256: 4ac5a5d2d959b55685248645675d5f84f7c620eba270925bf86edce8c5712b49
  • MD5: 81b700845c00e3e3167a109cab7096f4
  • BLAKE2b-256: ad85400f1740996be9afa3e2a84e96b77f1858d7025085d0441da988abc075da
