GQR-Bench (Guarded Query Routing Benchmark)

A benchmark and evaluation toolkit for developing and testing guarded query routing models for AI systems.

Installation

# Install from PyPI
pip install gqr

Quick Start

import gqr

# Load development dataset for initial experimentation
dev_train_data, dev_eval_data = gqr.load_dev_dataset()

# Load training dataset for model development
train_data, eval_data = gqr.load_train_dataset()

# Load test datasets for final evaluation
domain_test_data = gqr.load_id_test_dataset()  # In-domain test data
ood_test_data = gqr.load_ood_test_dataset()    # Out-of-domain test data

Domain Labels

The repository provides mappings between numerical labels and domain names:

# Get label mappings
print(gqr.label2domain)  # Maps numerical labels to domain names
print(gqr.domain2label)  # Maps domain names to numerical labels
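
Because the evaluation functions expect string labels, a model that emits numerical labels needs a conversion step first. The following is a minimal sketch of that step. It uses a hypothetical stand-in dictionary in place of gqr.label2domain so that it runs without the package; the placeholder names (domain_a, domain_b, domain_c) and the helper labels_to_domains are assumptions for illustration, not part of gqr.

```python
# Hypothetical stand-in for gqr.label2domain; the real keys and domain names may differ.
label2domain = {0: "domain_a", 1: "domain_b", 2: "domain_c"}

# Inverse mapping, built the same way gqr.domain2label is described (name -> label).
domain2label = {name: label for label, name in label2domain.items()}


def labels_to_domains(labels, mapping):
    """Convert numerical labels to domain-name strings, failing loudly on unknown labels."""
    unknown = sorted(set(labels) - set(mapping))
    if unknown:
        raise KeyError(f"labels missing from mapping: {unknown}")
    return [mapping[label] for label in labels]


numeric_preds = [2, 0, 1, 0]  # e.g. argmax outputs of a classifier
string_preds = labels_to_domains(numeric_preds, label2domain)
print(string_preds)
```

With the real package, the same helper could presumably be called with gqr.label2domain as the mapping.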

Evaluation

The module provides functions for overall and per-dataset evaluation. Important: these functions expect predictions and ground-truth values as strings, not numerical labels.

# Evaluate on in-domain test set
results = gqr.evaluate(
    predictions=pred_id_labels,
    ground_truth=true_id_labels
)

# Evaluate on out-of-domain test set
ood_results = gqr.evaluate(
    predictions=pred_ood_labels,
    ground_truth=true_ood_labels
)

# Evaluate by dataset (grouped evaluation)
dataset_results = gqr.evaluate_by_dataset(
    ood_test_data,
    pred_col='pred',
    true_col='true',
    dataset_col='dataset'
)
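
To make the string-label requirement and the idea of grouped evaluation concrete, here is a small standalone sketch. It is not gqr's implementation: the metrics gqr reports and the data type of ood_test_data are not stated here, so the sketch uses plain accuracy over a list of dictionaries. The helper names require_strings and accuracy_by_dataset and the example rows are hypothetical; only the column names pred, true, and dataset come from the call above.

```python
from collections import defaultdict


def require_strings(values, name):
    """Raise if any label is not a string (for example a leftover numerical label)."""
    bad = [v for v in values if not isinstance(v, str)]
    if bad:
        raise TypeError(f"{name} contains non-string labels, e.g. {bad[0]!r}")
    return list(values)


def accuracy_by_dataset(rows, pred_col="pred", true_col="true", dataset_col="dataset"):
    """Plain accuracy per dataset; an illustration only, not gqr's metric."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row[dataset_col]] += 1
        hits[row[dataset_col]] += int(row[pred_col] == row[true_col])
    return {name: hits[name] / totals[name] for name in totals}


# Hypothetical rows; dataset and domain names are placeholders.
rows = [
    {"dataset": "set_a", "pred": "domain_a", "true": "domain_a"},
    {"dataset": "set_a", "pred": "domain_b", "true": "domain_a"},
    {"dataset": "set_b", "pred": "domain_c", "true": "domain_c"},
]

# Check label types before scoring, mirroring the string requirement above.
require_strings([r["pred"] for r in rows], "predictions")
require_strings([r["true"] for r in rows], "ground truth")

per_dataset = accuracy_by_dataset(rows)
print(per_dataset)
```

Running a check like require_strings before calling the evaluate functions catches numerical labels early instead of producing silently mismatched comparisons.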

Paper and Citations

If you use GQR-Bench in your research, please cite our paper:

Contributing

Contributions to GQR-Bench are welcome! Please feel free to submit a Pull Request with improvements, additional evaluation metrics, or dataset enhancements.
