A toolkit for preference-based online learning with dueling bandits
Dueling Bandit Toolkit
The Dueling Bandit Toolkit is a Python package for preference-based online learning with dueling bandit algorithms. It provides implementations of several dueling bandit algorithms, support for synthetic and real-world datasets, and a set of evaluation metrics. It is aimed at researchers and practitioners working on machine learning and decision-making systems.
Features
- Algorithms: Includes Double Thompson Sampling, PARWiS, Contextual PARWiS, and a Random Pair baseline.
- Environment: Supports the Bradley-Terry model with optional contextual features for flexible experimentation.
- Datasets: Compatible with synthetic data and real-world datasets like Jester and MovieLens.
- Metrics: Evaluates performance with cumulative regret, recovery fraction, true/reported ranks, and separation (Δ₁,₂).
- Visualization: Offers plotting functions to visualize experiment results using Matplotlib.
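For readers new to the Bradley-Terry model named in the feature list, the following is a minimal standalone sketch of how it turns item scores into duel outcomes. It does not use the toolkit's API: the function names `bt_win_probability`, `duel`, and `top_two_separation` are hypothetical, scores are assumed to be log-strengths, and the separation is assumed here to mean the score gap between the two highest-scored items, which may differ from how the toolkit defines Δ₁,₂.

```python
import math
import random


def bt_win_probability(score_i: float, score_j: float) -> float:
    """Bradley-Terry probability that item i beats item j.

    Assumes scores are log-strengths, so the probability is
    sigmoid(score_i - score_j).
    """
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def duel(scores, i, j, rng):
    """Simulate one noisy comparison and return the index of the winner."""
    return i if rng.random() < bt_win_probability(scores[i], scores[j]) else j


def top_two_separation(scores):
    """Score gap between the best and second-best items.

    Assumption: this is only one plausible reading of the separation metric.
    """
    if len(scores) < 2:
        raise ValueError("need at least two items")
    first, second = sorted(scores, reverse=True)[:2]
    return first - second


# Example: three items with hypothetical scores; item 1 is the strongest.
rng = random.Random(42)
scores = [0.2, 1.5, 1.0]
winner = duel(scores, 1, 2, rng)
print("P(item 1 beats item 2):", bt_win_probability(scores[1], scores[2]))
print("winner of one duel:", winner)
print("separation:", top_two_separation(scores))
```

A small separation means the top two items are hard to tell apart from pairwise outcomes, so more duels are typically needed to identify the winner.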
Installation
Install the package via pip from PyPI:
pip install dueling-bandit
Ensure you have Python 3.8 or higher.
Quick Start
Here's a simple example to get started with the toolkit:
from dueling_bandit.environment import DuelingBanditEnv
from dueling_bandit.agents import DoubleThompsonSamplingAgent
from dueling_bandit.experiments import run_simulation
from dueling_bandit.plotting import plot_metric
# Create a synthetic Bradley-Terry environment
env = DuelingBanditEnv.random_bt(k=20, d=5, seed=42)
# Initialize the Double Thompson Sampling agent
agent = DoubleThompsonSamplingAgent(k=20, seed=42)
# Run a simulation for 500 duels
results = run_simulation(env, agent, horizon=500)
# Visualize the cumulative regret
plot_metric({'500': {'Double TS': results}}, budget=500, dataset='synthetic', metric='mean_regret')
This code sets up a synthetic environment, runs a simulation with the Double Thompson Sampling algorithm, and plots the cumulative regret.
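To make concrete what a simulation of this kind measures, here is a standalone sketch of a random-pair baseline under a Bradley-Terry model, written without the toolkit. Everything in it is an assumption for illustration: the function names are hypothetical, scores are treated as log-strengths, and regret per duel uses a common dueling-bandit definition (the average amount by which the best item is preferred over each of the two chosen items). The toolkit's own `run_simulation` and its regret definition may differ.

```python
import math
import random


def bt_prob(score_i, score_j):
    """Bradley-Terry probability that item i beats item j (log-strength scores)."""
    return 1.0 / (1.0 + math.exp(score_j - score_i))


def run_random_pair_baseline(scores, horizon, seed=0):
    """Duel uniformly random pairs and track cumulative regret.

    Returns (cumulative_regret, wins), where cumulative_regret[t] is the
    regret accumulated after t + 1 duels and wins[a] counts duels won by item a.
    """
    rng = random.Random(seed)
    k = len(scores)
    best = max(range(k), key=lambda a: scores[a])
    wins = [0] * k
    cumulative_regret = []
    total = 0.0
    for _ in range(horizon):
        # Pick two distinct items uniformly at random.
        i, j = rng.sample(range(k), 2)
        # Observe a noisy preference between them.
        winner = i if rng.random() < bt_prob(scores[i], scores[j]) else j
        wins[winner] += 1
        # Assumed regret: zero only when the best item duels itself,
        # which a distinct-pair baseline never does.
        total += 0.5 * (
            (bt_prob(scores[best], scores[i]) - 0.5)
            + (bt_prob(scores[best], scores[j]) - 0.5)
        )
        cumulative_regret.append(total)
    return cumulative_regret, wins


# Example with hypothetical scores for five items.
regret, wins = run_random_pair_baseline([0.0, 0.5, 1.0, 1.5, 2.0], horizon=500, seed=42)
print("final cumulative regret:", regret[-1])
print("wins per item:", wins)
```

Because the random baseline never adapts, its cumulative regret grows roughly linearly with the number of duels. An adaptive algorithm such as Double Thompson Sampling is expected to flatten that curve as it concentrates duels on the strongest items.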
Requirements
- Python >= 3.8
- Dependencies:
numpy, matplotlib, scipy, pandas
From a clone of the repository, install all dependencies:
pip install -r requirements.txt
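If you want to confirm that the requirements above are met before running experiments, a small check along these lines may help. It is a sketch using only the standard library. The helper names `missing_modules` and `python_is_supported` are hypothetical and not part of the toolkit, and the sketch assumes each dependency is imported under the same name it is installed as.

```python
import importlib.util
import sys

# Import names of the dependencies listed in the Requirements section.
REQUIRED_MODULES = ("numpy", "matplotlib", "scipy", "pandas")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 8)):
    """Check the running interpreter against the minimum Python version."""
    return sys.version_info >= minimum


print("Python version OK:", python_is_supported())
print("Missing dependencies:", missing_modules())
```

An empty list of missing dependencies only shows that the modules can be located. It does not verify their versions.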
Development
To contribute or experiment with the toolkit:
- Clone the repository:
git clone https://github.com/shailendrabhandari/dueling_bandit.git
cd dueling_bandit
- Install in editable mode:
pip install -e .
- Run tests to ensure everything works:
pytest tests/
Documentation
Comprehensive documentation is available at ReadTheDocs. It includes detailed API references, tutorials, and examples to help you get the most out of the toolkit.
License
This project is licensed under the MIT License.
Contributing
Contributions are welcome! Whether it's bug fixes, new features, or documentation improvements, please:
- Open an issue to discuss your idea.
- Submit a pull request with your changes.
See the Contributing Guidelines for more details.
Contact
For questions or support, please open an issue on GitHub or contact Shailendra at shailendra.bhandari@oslomet.no.
Happy dueling!
File details
Details for the file dueling_bandit-0.1.1.tar.gz.
File metadata
- Download URL: dueling_bandit-0.1.1.tar.gz
- Upload date:
- Size: 11.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 03199a1d84f21afc74608cb5d48a3d1e75097e3f8d446139880fbf2e08902893 |
| MD5 | 3f2b8c475cc39efb31dc4b4e73ddc9ea |
| BLAKE2b-256 | 65aef7b959c3a5b1d50286def5d8c12055a2b93780264f697f82a37ce6192e0c |
File details
Details for the file dueling_bandit-0.1.1-py3-none-any.whl.
File metadata
- Download URL: dueling_bandit-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6b4c532f60c665cdc4f42dce59e42308237c2e5f857af2cc0f28a6b19e588271 |
| MD5 | 957f4484a9806edbf94e95fec4b9ee7f |
| BLAKE2b-256 | 36ef8a167bd01e6cedd38a9bcaab70249f0c9667ea18914b640148a54b8131ad |