Skip to main content

GuardBench: A Large-Scale Benchmark for Guardrail Models

Project description

PyPI version Documentation Status License: EUPL-1.2

GuardBench

⚡️ Introduction

GuardBench is a Python library for guardrail models evaluation. It provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability. It also allows to quickly compare results and export LaTeX tables for scientific publications. GuardBench's benchmarking pipeline can also be leveraged on custom datasets.

You can find the list of supported datasets here. A few of them requires authorization. Please, see here.

If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.

✨ Features

🔌 Requirements

python>=3.10

💾 Installation

pip install guardbench

💡 Usage

from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional `kwargs` as needed
) -> list[float]:
    # do moderation
    # return list of floats (unsafe probabilities)

benchmark(
    moderate=moderate,  # User-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all", 
    # Note: you can pass additional `kwargs` for `moderate`
)

📖 Examples

📚 Documentation

Browse the documentation for more details about:

🏆 Leaderboard

You can find GuardBench's leaderboard here. All results can be reproduced using the provided scripts.
If you want to submit your results, please contact us.

👨‍💻 Authors

  • Elias Bassani (European Commission - Joint Research Centre)

🎓 Citation

@inproceedings{guardbench,
    title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
    author = "Bassani, Elias  and
      Sanchez, Ignacio",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.1022.pdf",
    pages = "18393--18409",
}

🎁 Feature Requests

Would you like to see other features implemented? Please, open a feature request.

📄 License

GuardBench is provided as open-source software licensed under EUPL v1.2.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

guardbench-1.0.0.tar.gz (52.2 kB view details)

Uploaded Source

Built Distribution

guardbench-1.0.0-py3-none-any.whl (77.5 kB view details)

Uploaded Python 3

File details

Details for the file guardbench-1.0.0.tar.gz.

File metadata

  • Download URL: guardbench-1.0.0.tar.gz
  • Upload date:
  • Size: 52.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for guardbench-1.0.0.tar.gz
Algorithm Hash digest
SHA256 9b775a8b1fc306342a1465d7f877492fe7e5def8833291a474cfeecb7dc443b7
MD5 06ad3ad1d421e64e79305cbdf53a9c69
BLAKE2b-256 d9cb3c80b3da7bc96b4431f5e3553595184ac95b890a8a2222d42d5daac61861

See more details on using hashes here.

File details

Details for the file guardbench-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: guardbench-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 77.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for guardbench-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e98d6a1a367b37308035d70c0c728cfd4dc9564d67be1a785b4d18a9b70539ad
MD5 8859c9ea381a61099e79cf091550029d
BLAKE2b-256 0c000f58e72e2222b7a0bd9f685d775816289a58fb9d8f0faee42ee82445dd3f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page