A toolkit for preference-based online learning with dueling bandits

Project description

Dueling Bandit Toolkit

The Dueling Bandit Toolkit is a Python package for preference-based online learning with dueling bandit algorithms. It provides implementations of several dueling bandit algorithms, support for synthetic and real-world datasets, and a set of evaluation metrics, and is aimed at researchers and practitioners working on machine learning and decision-making systems.

Features

  • Algorithms: Includes Double Thompson Sampling, PARWiS, Contextual PARWiS, and a Random Pair baseline.
  • Environment: Supports the Bradley-Terry model with optional contextual features for flexible experimentation.
  • Datasets: Compatible with synthetic data and real-world datasets like Jester and MovieLens.
  • Metrics: Evaluates performance with cumulative regret, recovery fraction, true/reported ranks, and separation (Δ₁,₂).
  • Visualization: Offers plotting functions to visualize experiment results using Matplotlib.
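For readers new to the Bradley-Terry model mentioned above, the following is a minimal standalone sketch of how it turns item scores into pairwise win probabilities and noisy duel outcomes. It does not use the toolkit's API; the function names `bt_win_probability` and `sample_duel` and the example scores are hypothetical and chosen only for illustration.

```python
import random


def bt_win_probability(score_i: float, score_j: float) -> float:
    """Probability that item i beats item j under the Bradley-Terry model.

    Scores must be positive; a higher score means a stronger item.
    """
    return score_i / (score_i + score_j)


def sample_duel(scores, i, j, rng):
    """Return the index of the winner of one noisy comparison between i and j."""
    return i if rng.random() < bt_win_probability(scores[i], scores[j]) else j


rng = random.Random(0)
scores = [4.0, 2.0, 1.0, 1.0]  # hypothetical item scores, not from any dataset

# Item 0 should beat item 1 with probability 4 / (4 + 2) = 2/3.
p = bt_win_probability(scores[0], scores[1])

# The empirical win rate over many duels should be close to p.
n_duels = 10_000
wins = sum(sample_duel(scores, 0, 1, rng) == 0 for _ in range(n_duels))
empirical = wins / n_duels
print(f"model: {p:.3f}, empirical: {empirical:.3f}")
```

A dueling bandit agent only ever observes the winner returned by a comparison like `sample_duel`, never the underlying scores, which is why it has to balance exploring uncertain pairs against exploiting pairs it already believes are strong.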

Installation

Install the package via pip from PyPI:

pip install dueling-bandit

Ensure you have Python 3.8 or higher.

Quick Start

Here's a simple example to get started with the toolkit:

from dueling_bandit.environment import DuelingBanditEnv
from dueling_bandit.agents import DoubleThompsonSamplingAgent
from dueling_bandit.experiments import run_simulation
from dueling_bandit.plotting import plot_metric

# Create a synthetic Bradley-Terry environment
env = DuelingBanditEnv.random_bt(k=20, d=5, seed=42)

# Initialize the Double Thompson Sampling agent
agent = DoubleThompsonSamplingAgent(k=20, seed=42)

# Run a simulation for 500 duels
results = run_simulation(env, agent, horizon=500)

# Visualize the cumulative regret
plot_metric({'500': {'Double TS': results}}, budget=500, dataset='synthetic', metric='mean_regret')

This code creates a synthetic Bradley-Terry environment with 20 arms, runs the Double Thompson Sampling agent for 500 duels, and plots the resulting regret curve.
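For intuition about what a simulation of this kind does, the loop below is a standalone sketch of a random-pair baseline on a Bradley-Terry environment, tracking cumulative regret. It is not the toolkit's implementation and does not call its API: the names `bt_prob` and `run_random_pair` are hypothetical, and the regret definition used here (the average of the best arm's preference margin over the two arms played) is a common convention in the dueling bandit literature that is assumed rather than taken from the package.

```python
import random


def bt_prob(scores, i, j):
    """Bradley-Terry probability that arm i beats arm j."""
    return scores[i] / (scores[i] + scores[j])


def run_random_pair(scores, horizon, seed=0):
    """Play uniformly random pairs and record cumulative regret after each duel."""
    rng = random.Random(seed)
    k = len(scores)
    best = max(range(k), key=scores.__getitem__)  # arm with the highest score
    wins = [[0] * k for _ in range(k)]            # wins[a][b]: times a beat b
    cumulative, total = [], 0.0

    for _ in range(horizon):
        i, j = rng.sample(range(k), 2)            # two distinct arms
        if rng.random() < bt_prob(scores, i, j):
            winner, loser = i, j
        else:
            winner, loser = j, i
        wins[winner][loser] += 1

        # Assumed regret: mean margin by which the best arm beats i and j.
        # It is zero only when the best arm is compared with itself, which
        # cannot happen here because the two arms are distinct.
        margin_i = bt_prob(scores, best, i) - 0.5
        margin_j = bt_prob(scores, best, j) - 0.5
        total += 0.5 * (margin_i + margin_j)
        cumulative.append(total)

    return cumulative, wins


scores = [5.0, 3.0, 2.0, 1.0]  # hypothetical arm scores
regret, wins = run_random_pair(scores, horizon=500, seed=42)
print(f"cumulative regret after 500 duels: {regret[-1]:.2f}")
```

Because the random-pair baseline never adapts to the observed outcomes, its cumulative regret grows roughly linearly with the number of duels. An adaptive agent is expected to flatten that curve over time, which is the kind of comparison a regret plot makes visible.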

Requirements

  • Python >= 3.8
  • Dependencies: numpy, matplotlib, scipy, pandas

From a clone of the repository, install all dependencies:

pip install -r requirements.txt

Development

To contribute or experiment with the toolkit:

  1. Clone the repository:
    git clone https://github.com/shailendrabhandari/dueling_bandit.git
    cd dueling_bandit
    
  2. Install in editable mode:
    pip install -e .
    
  3. Run tests to ensure everything works:
    pytest tests/
    

Documentation

Comprehensive documentation is available at ReadTheDocs. It includes detailed API references, tutorials, and examples to help you get the most out of the toolkit.

License

This project is licensed under the MIT License.

Contributing

Contributions are welcome! Whether it's bug fixes, new features, or documentation improvements, please:

  1. Open an issue to discuss your idea.
  2. Submit a pull request with your changes.

See the Contributing Guidelines for more details.

Contact

For questions or support, please open an issue on GitHub or contact Shailendra at shailendra.bhandari@oslomet.no.


Happy dueling!
