GuardBench: A Large-Scale Benchmark for Guardrail Models
⚡️ Introduction
GuardBench is a Python library for evaluating guardrail models.
It provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability.
It also allows you to quickly compare results and export LaTeX tables for scientific publications.
GuardBench's benchmarking pipeline can also be leveraged on custom datasets.
You can find the list of supported datasets here. A few of them require authorization. Please see here.
If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.
✨ Features
- 40 datasets for guardrail model evaluation.
- Automated evaluation pipeline.
- User-friendly.
- Extendable.
- Reproducible and sharable evaluation.
- Exportable evaluation reports.
🔌 Requirements
python>=3.10
💾 Installation
pip install guardbench
💡 Usage
```python
from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional `kwargs` as needed
) -> list[float]:
    # do moderation
    # return a list of floats (unsafe probabilities)
    ...

benchmark(
    moderate=moderate,  # User-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all",
    # Note: you can pass additional `kwargs` for `moderate`
)
```
📖 Examples
- Follow our tutorial on benchmarking Llama Guard with GuardBench.
- More examples are available in the scripts folder.
📚 Documentation
Browse the documentation for more details about:
- The datasets and how to obtain them.
- The data format used by GuardBench.
- How to use the Report class to compare models and export results as LaTeX tables.
- How to leverage GuardBench's benchmarking pipeline on custom datasets.
🏆 Leaderboard
You can find GuardBench's leaderboard here.
All results can be reproduced using the provided scripts.
If you want to submit your results, please contact us.
👨‍💻 Authors
- Elias Bassani (European Commission - Joint Research Centre)
🎓 Citation
@inproceedings{guardbench,
title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
author = "Bassani, Elias and
Sanchez, Ignacio",
editor = "Al-Onaizan, Yaser and
Bansal, Mohit and
Chen, Yun-Nung",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.emnlp-main.1022.pdf",
pages = "18393--18409",
}
🎁 Feature Requests
Would you like to see other features implemented? Please open a feature request.
📄 License
GuardBench is provided as open-source software licensed under EUPL v1.2.
File details
Details for the file guardbench-1.0.0.tar.gz.
File metadata
- Download URL: guardbench-1.0.0.tar.gz
- Upload date:
- Size: 52.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.14
File hashes

Algorithm | Hash digest
---|---
SHA256 | 9b775a8b1fc306342a1465d7f877492fe7e5def8833291a474cfeecb7dc443b7
MD5 | 06ad3ad1d421e64e79305cbdf53a9c69
BLAKE2b-256 | d9cb3c80b3da7bc96b4431f5e3553595184ac95b890a8a2222d42d5daac61861
File details
Details for the file guardbench-1.0.0-py3-none-any.whl.
File metadata
- Download URL: guardbench-1.0.0-py3-none-any.whl
- Upload date:
- Size: 77.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.14
File hashes

Algorithm | Hash digest
---|---
SHA256 | e98d6a1a367b37308035d70c0c728cfd4dc9564d67be1a785b4d18a9b70539ad
MD5 | 8859c9ea381a61099e79cf091550029d
BLAKE2b-256 | 0c000f58e72e2222b7a0bd9f685d775816289a58fb9d8f0faee42ee82445dd3f