RewardUQ: Uncertainty-Aware Reward Models

Updates

12.11.2025: We have been accepted to the EIML workshop @ EurIPS 2025!

Introduction

RewardUQ is a unified framework for training and evaluating uncertainty-aware reward models. Built on top of the Hugging Face ecosystem using 🤗 TRL, 🤗 Transformers, and PyTorch, it provides a variety of state-of-the-art uncertainty quantification methods alongside easily accessible training pipelines.

This repository is designed to function simultaneously as an importable library and a standalone research framework. We want to encourage both usage styles to foster adoption and contribution from the community.

📦 As a library: Import specific components (models, functional APIs, utilities) into external projects or production inference pipelines.
🧪 As a research framework: Use Hydra configurations and entry points for rapid experimentation with version-controllable configs and seamless hyperparameter sweeps.

Available Methods

Method	Description	Config Path
MLP Head Ensemble	Multiple independent MLP heads on a shared frozen backbone. Uncertainty from prediction variance across ensemble members.	`mlp_head_ensemble/`
LoRA Ensemble	Ensemble of independent LoRA adapters, each with its own linear head.	`lora_ensemble/`
DPO-based MC Dropout	Monte Carlo dropout applied to DPO's implicit reward model. Uncertainty from stochastic forward passes during inference.	`dpo_head_dropout_ensemble/`
Bayesian Linear Head	Single linear head with Gaussian posterior via Laplace approximation.	`bayesian_linear_head/`
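
To make "uncertainty from prediction variance across ensemble members" concrete, here is a minimal sketch in plain Python. It is not RewardUQ's implementation: the function name `ensemble_bounds`, the width factor `k`, and the example rewards are all assumptions made for illustration. It averages the per-member rewards and uses their standard deviation to form a lower and an upper bound.

```python
from statistics import fmean, pstdev


def ensemble_bounds(member_rewards: list[float], k: float = 1.0) -> tuple[float, float, float]:
    """Summarize one input's per-member rewards as (reward, lower, upper)."""
    mean = fmean(member_rewards)
    # Population standard deviation across members: how much the ensemble disagrees.
    std = pstdev(member_rewards)
    # k is a hypothetical scaling factor for the width of the interval.
    return mean, mean - k * std, mean + k * std


# Hypothetical rewards from four ensemble heads for a single completion.
agreeing = ensemble_bounds([0.9, 1.0, 1.1, 1.0])
disagreeing = ensemble_bounds([-1.0, 3.0, 0.0, 2.0])

print(agreeing)
print(disagreeing)
```

Both examples have the same mean reward, but the second interval is much wider because its members disagree more.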

Installation

Python package

Install the package via pip:

pip install rewarduq

Repository

Clone the repository:

git clone https://github.com/lasgroup/rewarduq.git
cd rewarduq

We recommend uv to manage dependencies:

uv sync

Alternatively, use pip with the requirements.txt:

pip install -r requirements.txt

After installation, verify that PyTorch recognizes your device. If you have a CUDA-capable GPU, the following command should print True. If it prints False, install the PyTorch build that matches your system from pytorch.org:

uv run python -c "import torch; print(torch.cuda.is_available())"

Development setup

# Install dev dependencies
uv sync --dev

# Or, install all extras
uv sync --all-extras

# Optionally, install pre-commit hooks (recommended)
uv run pre-commit install

# Optionally, install nbstripout hooks (recommended)
uv run nbstripout --install

Run the pre-commit hooks manually with:

uv run pre-commit run --all-files

Quick start

Using the library

from rewarduq import load_pipeline
from rewarduq.utils import get_config

# Load the config
config = get_config("configs/<config_file>.yaml")

# Load the pipeline
rm_pipeline = load_pipeline(config)

# Train the reward model
rm_pipeline.train(train_dataset, eval_dataset)

All forward passes of RewardUQ models return a tensor of shape (batch_size, 3) where the first column is the reward, the second column is the lower bound, and the third column is the upper bound.
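
As a sketch of how this three-column layout might be consumed downstream, the snippet below uses plain Python lists as a stand-in for the output tensor. With a real tensor `out`, the same columns would be `out[:, 0]`, `out[:, 1]`, and `out[:, 2]`. The values and the helper names `interval_widths` and `best_by_lower_bound` are hypothetical.

```python
# Each row mimics one model output: (reward, lower bound, upper bound).
outputs = [
    [1.2, 0.2, 2.2],     # highest reward, but a wide interval
    [1.0, 0.8, 1.2],     # slightly lower reward, narrow interval
    [-0.5, -0.9, -0.1],  # low reward
]


def interval_widths(rows):
    """Width of the uncertainty interval (upper - lower) for each row."""
    return [upper - lower for _, lower, upper in rows]


def best_by_lower_bound(rows):
    """Index of the row with the highest lower bound (a conservative choice)."""
    return max(range(len(rows)), key=lambda i: rows[i][1])


widths = interval_widths(outputs)
print(widths)
print(best_by_lower_bound(outputs))
```

Ranking by the lower bound rather than the reward is one possible way to prefer completions the model is confident about. Here it selects the second row even though the first has the higher reward.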

Using the CLI

RewardUQ uses Hydra for configuration management. For example, to train a model you can use the following command:

uv run python ./scripts/train.py \
  dataset/train=ultrafeedback_binarized \
  dataset/eval=ultrafeedback_binarized \
  method=mlp_head_ensemble/default \
  model.base_model_name_or_path=Qwen/Qwen3-0.6B

By default, this uses the configs/ folder in the repository and the train.yaml file. If you run experiments outside of the repository, you must specify the config directory with the --config-path <absolute_config_path> parameter. You can also select a different config file with the --config-name <config_name> parameter, for instance to manage multiple experiments at once.
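
When launching several runs from a script, it can help to assemble the command programmatically. The helper below is a hypothetical convenience, not part of RewardUQ. It only builds the argument list from the override keys and the `--config-path` / `--config-name` flags described above, and the config path shown is a placeholder.

```python
import shlex


def build_train_command(overrides, config_path=None, config_name=None):
    """Assemble an argv list for scripts/train.py from Hydra overrides."""
    argv = ["uv", "run", "python", "./scripts/train.py"]
    if config_path is not None:
        argv += ["--config-path", config_path]
    if config_name is not None:
        argv += ["--config-name", config_name]
    # Hydra overrides are plain key=value tokens.
    argv += [f"{key}={value}" for key, value in overrides.items()]
    return argv


argv = build_train_command(
    {
        "dataset/train": "ultrafeedback_binarized",
        "dataset/eval": "ultrafeedback_binarized",
        "method": "mlp_head_ensemble/default",
        "model.base_model_name_or_path": "Qwen/Qwen3-0.6B",
    },
    config_path="/abs/path/to/configs",  # placeholder, replace with your own path
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` to start the run.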

RewardBench evaluation

In our paper, we use RewardBench as our primary evaluation benchmark. To evaluate on RewardBench with automatic weighted averaging over categories, use the following config override:

dataset/eval=reward_bench
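
The exact weighting used for this override is not spelled out here, so the following is only a generic sketch of what a weighted average over categories means. The category names, scores, weights, and the function `weighted_average` are hypothetical and are not taken from RewardBench or RewardUQ.

```python
def weighted_average(scores, weights):
    """Weighted mean of per-category scores; weights need not sum to 1."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total


# Hypothetical per-category accuracies and weights, for illustration only.
scores = {"category_a": 0.90, "category_b": 0.60, "category_c": 0.80}
weights = {"category_a": 1.0, "category_b": 1.0, "category_c": 2.0}

overall = weighted_average(scores, weights)
print(round(overall, 4))
```

Compared with an unweighted mean, categories with larger weights pull the overall score toward their own accuracy.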

Scripts

Training: `scripts/train.py`

Train uncertainty-aware reward models using Hydra configuration.

Usage:

uv run python ./scripts/train.py \
  dataset/train=<train_dataset> \
  dataset/eval=<eval_dataset> \
  method=<method_config> \
  [additional_overrides...]

Key Parameters:

dataset/train: Training dataset configuration (e.g., ultrafeedback_binarized, tulu_3_8b_preference_mixture)
dataset/eval: Evaluation dataset configuration (can be null for no evaluation)
method: UQ method configuration (e.g., mlp_head_ensemble/default)
model.base_model_name_or_path: Hugging Face model identifier or local path
resume: Resume from checkpoint (path or True for latest)

Examples:

# Train MLP Head Ensemble on Qwen3-0.6B
uv run python ./scripts/train.py \
  dataset/train=ultrafeedback_binarized \
  dataset/eval=ultrafeedback_binarized \
  method=mlp_head_ensemble/qwen3_0.6b

To enable Weights & Biases logging, set trainer.report_to=wandb. You can also override the entity and project directly in the command:

wandb login

# Run training
uv run python ./scripts/train.py \
  trainer.report_to=wandb \
  wandb.entity=your-entity \
  wandb.project=your-project

To run hyperparameter sweeps, for example with the configs in configs/sweeps/:

# Create sweep
wandb sweep \
  --entity <your-entity> \
  --project <your-project> \
  --name <your-sweep-name> \
  ./configs/sweeps/sweep_ens_mlp.yaml

# Run sweep
wandb agent --count 1 "<your-entity>/<your-project>/<sweep-id>"

Inference: `scripts/run_inference.py`

Run inference on custom prompts and completions.

Usage:

python ./scripts/run_inference.py \
  --model <model_path> \
  --dataset <dataset_name> <split> \
  [--batch-size BATCH_SIZE] \
  [--out OUTPUT_DIR] \
  [--debug]

Parameters:

--model: Path to a trained model or Hugging Face identifier (required)
--dataset: Dataset containing prompts and completions (required)
--batch-size: Inference batch size (default: 16)
--out: Output directory for predictions (default: current directory)
--debug: Limit dataset size for quick testing

Output: Saves predictions as .npy files containing reward scores with uncertainty bounds.

Configuration Structure

Configs are organized in the configs/ directory:

configs/
├── train.yaml              # Main training config with defaults
├── dataset/
│   ├── train/              # Training dataset configs
│   │   ├── ultrafeedback_binarized.yaml
│   │   ├── tulu_3_8b_preference_mixture.yaml
│   │   └── ...
│   └── eval/               # Evaluation dataset configs
│       ├── reward_bench.yaml
│       └── ...
├── method/
│   ├── base.yaml           # Base config for all methods
│   ├── mlp_head_ensemble/
│   │   ├── default.yaml    # Default config
│   │   ├── qwen3_14b.yaml  # Model-specific tuned config
│   │   └── ...
│   ├── lora_ensemble/
│   ├── dpo_head_dropout_ensemble/
│   └── bayesian_linear_head/
├── accelerate/             # Examples for distributed training configs
│   ├── default.yaml
│   └── fsdp.yaml
├── paths/                  # Path configurations
│   └── default.yaml
└── hydra/                  # Hydra-specific settings
    └── default.yaml

Architecture

Models inherit from transformers.PreTrainedModel:

transformers.PreTrainedModel
    └── rewarduq.methods.base.RewardUQModel (base class for all UQ models)
        ├── MLPHeadEnsembleModel
        ├── LoraEnsembleModel
        ├── DPOHeadDropoutEnsembleModel
        └── BayesianLinearHeadModel

Trainers extend TRL's specialized trainers:

transformers.Trainer
    └── trl.RewardTrainer / trl.DPOTrainer
        └── rewarduq.trainers.TrainerExtension (adds UQ-specific features)
            ├── rewarduq.trainers.RewardUQTrainer (extends RewardTrainer)
            └── rewarduq.trainers.DPORewardUQTrainer (extends DPOTrainer)

Contributing

We welcome and encourage contributions from the community! Whether you want to add a new uncertainty quantification method, improve existing ones, or fix bugs, your help is appreciated. If you have an idea for a new feature or improvement:

1. Check existing Issues or open a new one to discuss your idea.
2. Fork the repository and create a feature branch.
3. Implement your changes (don't forget tests :D).
4. Submit a Pull Request.

Citation

@misc{yang2025rewarduq,
  author = {Daniel Yang and Samuel Stante and Florian Redhardt and Lena Libon and Barna Pasztor and Parnian Kassraie and Ido Hakimi and Andreas Krause},
  title = {RewardUQ: A Unified Framework for Uncertainty-Aware Reward Models},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lasgroup/rewarduq}}
}

License

This repository's source code is available under the Apache-2.0 License.

Release history

0.1.0 (Jan 10, 2026)

Download files

Source Distribution

rewarduq-0.1.0.tar.gz (64.3 kB)

Uploaded Jan 10, 2026 Source

Built Distribution

rewarduq-0.1.0-py3-none-any.whl (74.7 kB)

Uploaded Jan 10, 2026 Python 3

File details

Details for the file rewarduq-0.1.0.tar.gz.

File metadata

Download URL: rewarduq-0.1.0.tar.gz
Upload date: Jan 10, 2026
Size: 64.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rewarduq-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f3780aa6948d9e913e9d1c055944acb2f9c50d4cd23386ca92d537ee016fc3bd`
MD5	`9335008c73e7b48d4b9c0d0a984e41b1`
BLAKE2b-256	`96b2f06a28f55e6f9136a253fce2e129f3f5ce9af649535f540c0fecf91b9627`

Provenance

The following attestation bundles were made for rewarduq-0.1.0.tar.gz:

Publisher: publish.yml on lasgroup/rewarduq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rewarduq-0.1.0.tar.gz
- Subject digest: f3780aa6948d9e913e9d1c055944acb2f9c50d4cd23386ca92d537ee016fc3bd
- Sigstore transparency entry: 813329720
- Sigstore integration time: Jan 10, 2026
Source repository:
- Permalink: lasgroup/rewarduq@c64690585293dd5dd6b9b6baea0f2fb0bf804d46
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/lasgroup
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c64690585293dd5dd6b9b6baea0f2fb0bf804d46
- Trigger Event: release

File details

Details for the file rewarduq-0.1.0-py3-none-any.whl.

File metadata

Download URL: rewarduq-0.1.0-py3-none-any.whl
Upload date: Jan 10, 2026
Size: 74.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rewarduq-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`17aa1c4ed6a23ce2217f09452eb50036b3a82e76cd2c006e99a275052442c989`
MD5	`b378451f329e5c94dfe9919b3bd18ed5`
BLAKE2b-256	`550bf19551a8b7e7074aaa0d46a68d9d4dc910430eb5ff873a5315a5e0b73ff2`

Provenance

The following attestation bundles were made for rewarduq-0.1.0-py3-none-any.whl:

Publisher: publish.yml on lasgroup/rewarduq

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rewarduq-0.1.0-py3-none-any.whl
- Subject digest: 17aa1c4ed6a23ce2217f09452eb50036b3a82e76cd2c006e99a275052442c989
- Sigstore transparency entry: 813329721
- Sigstore integration time: Jan 10, 2026
Source repository:
- Permalink: lasgroup/rewarduq@c64690585293dd5dd6b9b6baea0f2fb0bf804d46
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/lasgroup
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@c64690585293dd5dd6b9b6baea0f2fb0bf804d46
- Trigger Event: release
