
Framework for multi-armed bandits with support for contextual, GLM, NN, GP-based and delayed methods


BanditLab

A modular framework for experimenting with multi-armed bandits (MAB).

  • 20+ algorithms (from classical to state-of-the-art)
  • unified API
  • plug-and-play models and datasets
  • config-driven experiments

BanditLab is designed for both research and practical experimentation. It provides a unified interface for combining:

  • bandit algorithms (UCB, Thompson Sampling, Neural, GP-based, etc.)
  • predictive models (linear, GLM, neural networks, Gaussian processes)
  • environments (real datasets or simulators)

Installation

pip install BanditLab

Quick Start (Python API)

from mab_framework.algorithms import ThompsonSampling
from mab_framework.environments import DatasetEnvironment

env = DatasetEnvironment("data/mushroom_bandit_5000.csv")

bandit = ThompsonSampling(...)

for context in env:
    arm = bandit.select_arm(context)
    reward = env.pull(arm)
    bandit.update(context, arm, reward)
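To make the loop concrete without installing anything, here is a self-contained sketch of the same interaction pattern, using a toy two-arm Bernoulli environment and a Beta-Bernoulli Thompson Sampling agent. These toy classes are illustrative only, not BanditLab's actual DatasetEnvironment or ThompsonSampling:

```python
import random

class ToyBernoulliEnv:
    """Two arms with fixed success probabilities (a stand-in for DatasetEnvironment)."""
    def __init__(self, probs, steps, seed=0):
        self.probs, self.steps = probs, steps
        self.rng = random.Random(seed)

    def __iter__(self):
        for _ in range(self.steps):
            yield None  # context-free for simplicity

    def pull(self, arm):
        return 1 if self.rng.random() < self.probs[arm] else 0

class BetaTS:
    """Beta-Bernoulli Thompson Sampling: sample each arm's posterior, pick the max."""
    def __init__(self, n_arms, seed=0):
        self.a = [1] * n_arms  # successes + 1 (Beta prior alpha)
        self.b = [1] * n_arms  # failures + 1 (Beta prior beta)
        self.rng = random.Random(seed)

    def select_arm(self, context):
        samples = [self.rng.betavariate(self.a[i], self.b[i]) for i in range(len(self.a))]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, context, arm, reward):
        self.a[arm] += reward
        self.b[arm] += 1 - reward

env = ToyBernoulliEnv(probs=[0.3, 0.7], steps=2000)
bandit = BetaTS(n_arms=2)
pulls = [0, 0]
for context in env:
    arm = bandit.select_arm(context)
    reward = env.pull(arm)
    bandit.update(context, arm, reward)
    pulls[arm] += 1
print(pulls)  # the better arm (index 1) should receive most pulls
```

The loop body is identical to the Quick Start above; only the environment and agent are toy stand-ins.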

Config-Based Experiments (Recommended)

BanditLab supports fully declarative experiment setup via configs.

Example config:

experiment:
  name: "pool_test"
  steps: 200
  n_runs: 5

environment:
  name: "DatasetEnvironment"
  params:
    dataset_path: "data/E1_dataset.npz"

algorithms:
  - name: "ThompsonSampling"
    display_name: "Thompson Sampling (TS)"
    params: {}
    model:
      name: "OnlineRidgeRegression"
      params: { l2_reg: 1.0 }
      one_model_per_arm: true

  - name: "UCBAlgorithm"
    display_name: "LinUCB (alpha=1.0)"
    params: { alpha: 1.0 }
    model:
      name: "OnlineRidgeRegression"
      params: { l2_reg: 1.0 }
      one_model_per_arm: true

metrics:
  - cumulative_regret
  - average_regret

output:
  save_path: "./results/pool_test"

Run via:

python banditlab config.yaml

This allows running experiments without writing Python code and ensures full reproducibility.
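The loader that turns these name strings into objects isn't shown on this page; a common pattern behind this kind of config-driven setup is a class registry, sketched below with stand-in classes (BanditLab's real loader may differ):

```python
# Hypothetical sketch of a config-driven loader; the stand-in classes here
# only mirror the names used in the example config above.
REGISTRY = {}

def register(cls):
    """Make a class instantiable by its name string from a parsed config."""
    REGISTRY[cls.__name__] = cls
    return cls

@register
class OnlineRidgeRegression:
    def __init__(self, l2_reg=1.0):
        self.l2_reg = l2_reg

@register
class ThompsonSampling:
    def __init__(self, model=None, **params):
        self.model = model
        self.params = params

def build_algorithm(spec):
    """Instantiate an algorithm (and its reward model) from a parsed config dict."""
    model_spec = spec.get("model")
    model = REGISTRY[model_spec["name"]](**model_spec.get("params", {})) if model_spec else None
    return REGISTRY[spec["name"]](model=model, **spec.get("params", {}))

# The dict below is what a YAML parser would produce for one "algorithms" entry.
spec = {
    "name": "ThompsonSampling",
    "params": {},
    "model": {"name": "OnlineRidgeRegression", "params": {"l2_reg": 1.0}},
}
algo = build_algorithm(spec)
print(type(algo).__name__, algo.model.l2_reg)  # ThompsonSampling 1.0
```

A registry like this is what makes the config fully declarative: adding a new algorithm or model only requires registering its class under the name used in the YAML.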


Key Features

  • 20+ algorithms — from classical (UCB, TS) to neural and GP-based methods
  • Model–Algorithm decoupling — combine any algorithm with any reward model
  • Config-driven experiments — easy experimentation without coding
  • Contextual bandits support
  • Delayed feedback — built-in support for bandits with delayed rewards
  • Extensible — easily implement new algorithms or models
  • Reproducible experiments — runner, logging, and metrics included

Core Design

BanditLab separates decision-making from prediction:

  • Models learn to predict rewards from context
  • Algorithms decide which arm to pull using model outputs

This enables flexible combinations:

  • Thompson Sampling + Linear Model
  • Thompson Sampling + GLM
  • UCB + Neural Network
  • UCB + Gaussian Process
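A minimal sketch of what this decoupling buys (toy classes, not the library's): the UCB-style rule below only assumes its models expose predict/update, so any reward model can be dropped in:

```python
import math

class MeanModel:
    """Toy reward model: running mean (a stand-in for ridge / GLM / NN / GP models)."""
    def __init__(self):
        self.n, self.mean = 0, 0.0
    def predict(self, context):
        return self.mean
    def update(self, context, reward):
        self.n += 1
        self.mean += (reward - self.mean) / self.n

class UCB:
    """Decision rule: model prediction plus an exploration bonus.
    Works with ANY model exposing predict/update — that is the decoupling."""
    def __init__(self, models, alpha=1.0):
        self.models, self.alpha = models, alpha
        self.counts = [0] * len(models)
        self.t = 0
    def select_arm(self, context):
        self.t += 1
        scores = []
        for i, m in enumerate(self.models):
            if self.counts[i] == 0:
                scores.append(float("inf"))  # force one pull of every arm
            else:
                bonus = self.alpha * math.sqrt(math.log(self.t) / self.counts[i])
                scores.append(m.predict(context) + bonus)
        return max(range(len(scores)), key=scores.__getitem__)
    def update(self, context, arm, reward):
        self.counts[arm] += 1
        self.models[arm].update(context, reward)

algo = UCB([MeanModel(), MeanModel()], alpha=1.0)
first = algo.select_arm(None); algo.update(None, first, 0.0)   # arm 0, reward 0
second = algo.select_arm(None); algo.update(None, second, 1.0) # arm 1, reward 1
third = algo.select_arm(None)
print(third)  # 1 — arm 1 has the higher estimated mean, equal bonus
```

Swapping MeanModel for any other predict/update object changes the reward estimates but leaves the decision rule untouched.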

Architecture Overview

The framework is built around four components:

  • Environments — provide contexts and rewards
  • Models — estimate reward (typically one per arm)
  • Algorithms — handle exploration vs exploitation
  • Runner — executes experiment loops

Example: Running a Benchmark

python scripts/run_mushrooms.py

This runs multiple algorithms on a real dataset and produces:

  • cumulative regret plots
  • average regret curves
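For reference, both metrics are cheap to compute from a trace of expected rewards; a sketch (not the framework's own metric code): cumulative regret at step t sums the gap between the best arm's expected reward and the chosen arm's, and average regret divides that by t.

```python
def regret_curves(chosen_means, best_mean):
    """Given the expected reward of each chosen arm and the best arm's expected
    reward, return cumulative and average regret at every step."""
    cumulative, total = [], 0.0
    for mu in chosen_means:
        total += best_mean - mu
        cumulative.append(total)
    average = [c / (t + 1) for t, c in enumerate(cumulative)]
    return cumulative, average

# One bad pull (mean 0.25) followed by three optimal pulls (mean 0.75):
cum, avg = regret_curves([0.25, 0.75, 0.75, 0.75], best_mean=0.75)
print(cum)  # [0.5, 0.5, 0.5, 0.5] — regret stops growing once play is optimal
print(avg)  # average regret decays toward zero
```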

Supported Methods

Algorithms

Includes 20+ implementations, such as:

  • Epsilon-Greedy
  • UCB / LinUCB
  • Thompson Sampling
  • Neural UCB
  • GP-based methods
  • GLM-based bandits

Models

  • Linear / Ridge Regression
  • GLM (Laplace approximation)
  • Gaussian Processes (RFF)
  • Neural Networks
  • LASSO-based models

Project Structure

mab_framework/
├── algorithms/
├── models/
├── environments/
├── experiment/
└── scripts/

Extending the Framework

Custom Model

A custom model implements two methods:

predict(context)
update(context, reward)

Custom Algorithm

A custom algorithm implements:

select_arm(context)
update(context, arm, reward)

All components inherit from base classes, making extension straightforward.
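As an illustration, here are standalone versions of a custom model and a custom algorithm with exactly those interfaces (in BanditLab they would additionally inherit from the framework's base classes, whose names aren't shown on this page):

```python
import random

class ConstantModel:
    """Custom model: predict(context) / update(context, reward)."""
    def __init__(self):
        self.value = 0.0
    def predict(self, context):
        return self.value
    def update(self, context, reward):
        self.value = reward  # keep only the latest reward (deliberately naive)

class EpsilonGreedy:
    """Custom algorithm: select_arm(context) / update(context, arm, reward)."""
    def __init__(self, models, epsilon=0.1, seed=0):
        self.models, self.epsilon = models, epsilon
        self.rng = random.Random(seed)
    def select_arm(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.models))  # explore uniformly
        preds = [m.predict(context) for m in self.models]
        return max(range(len(preds)), key=preds.__getitem__)  # exploit best prediction
    def update(self, context, arm, reward):
        self.models[arm].update(context, reward)

algo = EpsilonGreedy([ConstantModel(), ConstantModel()], epsilon=0.0)
algo.update(None, 1, 1.0)
print(algo.select_arm(None))  # 1 — arm 1's model now predicts 1.0
```

Because the algorithm only calls predict/update on its models, this EpsilonGreedy could be paired with any of the framework's model classes just as easily.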


Reproducibility

BanditLab includes:

  • experiment runner
  • logging utilities
  • regret metrics

Designed for fair comparison of algorithms across datasets.


Documentation

Detailed developer documentation is available in:

docs/DEVELOPMENT.md

License

MIT License


Citation

If you use BanditLab in research, please consider citing the repository.
