Framework for multi-armed bandits with support for contextual, GLM, NN, GP-based and delayed methods

BanditLab

A modular framework for experimenting with multi-armed bandits (MAB).

20+ algorithms (from classical to state-of-the-art)
unified API
plug-and-play models and datasets
config-driven experiments

BanditLab is designed for both research and practical experimentation. It provides a unified interface for combining:

bandit algorithms (UCB, Thompson Sampling, Neural, GP-based, etc.)
predictive models (linear, GLM, neural networks, Gaussian processes)
environments (real datasets or simulators)

Installation

pip install BanditLab

Quick Start (Python API)

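from mab_framework.algorithms import ThompsonSampling
from mab_framework.environments import DatasetEnvironment

# Wrap a dataset file as an environment that yields one context per step.
env = DatasetEnvironment("data/mushroom_bandit_5000.csv")

# Constructor arguments are omitted here.
bandit = ThompsonSampling(...)

for context in env:
    arm = bandit.select_arm(context)     # choose an arm for this context
    reward = env.pull(arm)               # observe the reward of the chosen arm
    bandit.update(context, arm, reward)  # feed the observation back to the algorithm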

Config-Based Experiments (Recommended)

BanditLab supports fully declarative experiment setup via configs.

Example config:

experiment:
  name: "pool_test"
  steps: 200
  n_runs: 5

environment:
  name: "DatasetEnvironment"
  params:
    dataset_path: "data/E1_dataset.npz"

algorithms:
  - name: "ThompsonSampling"
    display_name: "Thompson Sampling (TS)"
    params: {}
    model:
      name: "OnlineRidgeRegression"
      params: { l2_reg: 1.0 }
      one_model_per_arm: true

  - name: "UCBAlgorithm"
    display_name: "LinUCB (alpha=1.0)"
    params: { alpha: 1.0 }
    model:
      name: "OnlineRidgeRegression"
      params: { l2_reg: 1.0 }
      one_model_per_arm: true

metrics:
  - cumulative_regret
  - average_regret

output:
  save_path: "./results/pool_test"

Run via:

python banditlab config.yaml

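This allows running experiments without writing Python code. Because the environment, algorithms, models, and metrics are all declared in one file, a run is easier to reproduce from the config alone.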
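One way a runner might turn an `algorithms` entry from the example config into objects is a name-to-class registry. The sketch below is illustrative only and not BanditLab's actual loader: the two classes are stand-ins that merely store their parameters, `MODELS`, `ALGORITHMS`, and `build_algorithm` are hypothetical names, and the interpretation of `one_model_per_arm` is an assumption.

```python
class OnlineRidgeRegression:
    """Stand-in for the model named in the config; only stores its parameter."""

    def __init__(self, l2_reg=1.0):
        self.l2_reg = l2_reg


class UCBAlgorithm:
    """Stand-in for the algorithm named in the config; only stores its inputs."""

    def __init__(self, models, alpha=1.0):
        self.models = models
        self.alpha = alpha


# Hypothetical registries mapping config names to classes.
MODELS = {"OnlineRidgeRegression": OnlineRidgeRegression}
ALGORITHMS = {"UCBAlgorithm": UCBAlgorithm}


def build_algorithm(spec, n_arms):
    """Instantiate an algorithm and its reward model(s) from one config entry."""
    model_spec = spec["model"]
    model_cls = MODELS[model_spec["name"]]
    # Assumed meaning of one_model_per_arm: an independent model per arm,
    # otherwise a single shared model.
    n_models = n_arms if model_spec.get("one_model_per_arm", False) else 1
    models = [model_cls(**model_spec.get("params", {})) for _ in range(n_models)]
    return ALGORITHMS[spec["name"]](models=models, **spec.get("params", {}))


# Dict equivalent of the "LinUCB (alpha=1.0)" entry in the example config.
spec = {
    "name": "UCBAlgorithm",
    "display_name": "LinUCB (alpha=1.0)",
    "params": {"alpha": 1.0},
    "model": {
        "name": "OnlineRidgeRegression",
        "params": {"l2_reg": 1.0},
        "one_model_per_arm": True,
    },
}
algo = build_algorithm(spec, n_arms=3)
```

Keeping construction behind names like this is what lets a config swap a model or algorithm without touching code.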

Key Features

20+ algorithms — from classical (UCB, TS) to neural and GP-based methods
Model–Algorithm decoupling — combine any algorithm with any reward model
Config-driven experiments — easy experimentation without coding
Contextual bandits support
Delayed feedback support — built-in support for bandits with delays
Extensible — easily implement new algorithms or models
Reproducible experiments — runner, logging, and metrics included
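This README does not describe how delays are handled internally. A common pattern, sketched here under that caveat, is to queue each observation together with the step at which it becomes visible and to pass it to `update` only once that step is reached. `DelayedFeedbackBuffer` and its methods are hypothetical names, not part of the documented API.

```python
import heapq


class DelayedFeedbackBuffer:
    """Holds (context, arm, reward) observations until their due step."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so heap entries never compare contexts

    def push(self, due_step, context, arm, reward):
        """Schedule an observation to become visible at due_step."""
        heapq.heappush(self._heap, (due_step, self._counter, context, arm, reward))
        self._counter += 1

    def pop_due(self, step):
        """Return all observations whose due step is <= step, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= step:
            _, _, context, arm, reward = heapq.heappop(self._heap)
            due.append((context, arm, reward))
        return due


buffer = DelayedFeedbackBuffer()
buffer.push(due_step=3, context=None, arm=0, reward=1.0)
buffer.push(due_step=1, context=None, arm=1, reward=0.0)
ready_at_2 = buffer.pop_due(2)  # only the observation due at step 1
ready_at_5 = buffer.pop_due(5)  # the remaining observation
```

In an experiment loop, each item returned by `pop_due` would be forwarded to the algorithm's `update(context, arm, reward)`.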

Core Design

BanditLab separates decision-making from prediction:

Models learn to predict rewards from context
Algorithms decide which arm to pull using model outputs

This enables flexible combinations:

Thompson Sampling + Linear Model
Thompson Sampling + GLM
UCB + Neural Network
UCB + Gaussian Process
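The decoupling behind these combinations can be illustrated with a minimal, self-contained sketch. It is not BanditLab's implementation: `RunningMeanModel`, its `uncertainty` helper, `UCB`, `EpsilonGreedy`, and `run` are hypothetical stand-ins that only follow the `predict`/`update` and `select_arm`/`update` method names listed in this README, and the toy model ignores the context.

```python
import math
import random


class RunningMeanModel:
    """Toy per-arm reward model: ignores context, tracks the mean reward."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self, context):
        return self.mean

    def uncertainty(self, context):
        # Assumed helper, not a documented BanditLab method: an exploration
        # width that shrinks as the arm is observed more often.
        return 1.0 / math.sqrt(self.n + 1)

    def update(self, context, reward):
        self.n += 1
        self.mean += (reward - self.mean) / self.n


class UCB:
    """Picks the arm with the highest optimistic estimate."""

    def __init__(self, models, alpha=1.0):
        self.models = models
        self.alpha = alpha

    def select_arm(self, context):
        scores = [m.predict(context) + self.alpha * m.uncertainty(context)
                  for m in self.models]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, context, arm, reward):
        self.models[arm].update(context, reward)


class EpsilonGreedy:
    """Explores uniformly with probability epsilon, otherwise exploits."""

    def __init__(self, models, epsilon=0.1, seed=0):
        self.models = models
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select_arm(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.models))
        preds = [m.predict(context) for m in self.models]
        return max(range(len(preds)), key=preds.__getitem__)

    def update(self, context, arm, reward):
        self.models[arm].update(context, reward)


def run(algorithm, arm_rewards, steps):
    """Context-free loop with a fixed, deterministic reward per arm."""
    counts = [0] * len(arm_rewards)
    for _ in range(steps):
        arm = algorithm.select_arm(None)
        algorithm.update(None, arm, arm_rewards[arm])
        counts[arm] += 1
    return counts


# The same model class plugs into either algorithm unchanged.
ucb_counts = run(UCB([RunningMeanModel(), RunningMeanModel()]), [0.2, 0.8], 200)
eps_counts = run(EpsilonGreedy([RunningMeanModel(), RunningMeanModel()]), [0.2, 0.8], 200)
```

Neither algorithm knows how the model forms its estimate, so replacing `RunningMeanModel` with a contextual model would not require changing the algorithm classes.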

Architecture Overview

The framework is built around four components:

Environments — provide contexts and rewards
Models — estimate reward (typically one per arm)
Algorithms — handle exploration vs exploitation
Runner — executes experiment loops

Example: Running a Benchmark

python scripts/run_mushrooms.py

This runs multiple algorithms on a real dataset and produces:

cumulative regret plots
average regret curves

Supported Methods

Algorithms

Includes 20+ implementations, such as:

Epsilon-Greedy
UCB / LinUCB
Thompson Sampling
Neural UCB
GP-based methods
GLM-based bandits

Models

Linear / Ridge Regression
GLM (Laplace approximation)
Gaussian Processes (RFF)
Neural Networks
LASSO-based models

Project Structure

mab_framework/
├── algorithms/
├── models/
├── environments/
├── experiment/
└── scripts/

Extending the Framework

Custom Model

predict(context)
update(context, reward)

Custom Algorithm

select_arm(context)
update(context, arm, reward)

All components inherit from base classes, making extension straightforward.
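As a rough illustration of what such an extension might look like, the sketch below defines stand-in abstract base classes with the four methods listed above and one small subclass of each. `BaseModel` and `BaseAlgorithm` are hypothetical names defined locally so the example runs on its own; the real base classes and their import paths may differ.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Stand-in base class: a reward model must implement predict and update."""

    @abstractmethod
    def predict(self, context): ...

    @abstractmethod
    def update(self, context, reward): ...


class BaseAlgorithm(ABC):
    """Stand-in base class: an algorithm must implement select_arm and update."""

    @abstractmethod
    def select_arm(self, context): ...

    @abstractmethod
    def update(self, context, arm, reward): ...


class ExponentialAverageModel(BaseModel):
    """Custom model: exponentially weighted average of rewards (ignores context)."""

    def __init__(self, step_size=0.1):
        self.step_size = step_size
        self.estimate = 0.0

    def predict(self, context):
        return self.estimate

    def update(self, context, reward):
        self.estimate += self.step_size * (reward - self.estimate)


class GreedyAlgorithm(BaseAlgorithm):
    """Custom algorithm: always pulls the arm with the highest prediction."""

    def __init__(self, models):
        self.models = models  # one model per arm

    def select_arm(self, context):
        preds = [m.predict(context) for m in self.models]
        return max(range(len(preds)), key=preds.__getitem__)

    def update(self, context, arm, reward):
        self.models[arm].update(context, reward)
```

Because the base classes are abstract, a subclass that forgets one of the required methods fails at instantiation rather than midway through an experiment.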

Reproducibility

BanditLab includes:

experiment runner
logging utilities
regret metrics

These components are designed to support fair comparison of algorithms across datasets.

Documentation

Detailed developer documentation is available in:

docs/DEVELOPMENT.md

License

MIT License

Citation

If you use BanditLab in research, please consider citing the repository.

This version

0.1.0 (released Apr 20, 2026)
