A unified hub for reward models in AI alignment

RewardHub

RewardHub is an end-to-end library for annotating data with state-of-the-art (SoTA) reward models, critic functions, and related processes. It is designed to help generate preference training data and to define acceptance criteria for agentic or inference-scaling systems, for example Best-of-N sampling or beam search.

Getting Started

Installation

Clone the repository and install the necessary dependencies:

git clone https://github.com/Red-Hat-AI-Innovation-Team/reward_hub.git
cd reward_hub
pip install -e .

Usage Examples

RewardHub supports multiple types of reward models and serving methods. Here are the main ways to use the library:

Process Reward Models (PRM)

PRMs evaluate a response by scoring its intermediate reasoning steps rather than only its final answer:

from reward_hub import AutoRM

# Load a math-focused PRM using HuggingFace backend
model = AutoRM.load("Qwen/Qwen2.5-Math-PRM-7B", load_method="hf", device=0)

# Example conversation
messages = [
    [
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "Let me solve this step by step:\n1) 2 + 2 = 4\nTherefore, 4"}
    ]
]

# Get scores with full PRM results
results = model.score(messages, return_full_prm_result=True)
# Or just get the scores
scores = model.score(messages, return_full_prm_result=False)
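
Because a PRM judges individual reasoning steps, downstream selection usually needs those step scores reduced to a single number per response. The following is a minimal sketch of three common reductions, assuming step scores are plain floats between 0 and 1. The `aggregate_step_scores` helper and the example values are hypothetical and not part of RewardHub, and nothing here assumes the structure returned by `return_full_prm_result=True`.

```python
import math
from typing import List


def aggregate_step_scores(step_scores: List[float], method: str = "min") -> float:
    """Reduce per-step PRM scores to one score for the whole response.

    Hypothetical helper for illustration; not a RewardHub function.
    """
    if not step_scores:
        raise ValueError("step_scores must not be empty")
    if method == "min":
        # A chain of reasoning is only as good as its weakest step.
        return min(step_scores)
    if method == "prod":
        # Treat step scores as independent probabilities of correctness.
        return math.prod(step_scores)
    if method == "last":
        # Use the score assigned to the final step only.
        return step_scores[-1]
    raise ValueError(f"unknown method: {method}")


# Made-up step scores for a three-step solution.
steps = [0.9, 0.5, 0.8]
for name in ("min", "prod", "last"):
    print(name, aggregate_step_scores(steps, name))
```

Which reduction works best depends on the model and the task, so it is worth comparing them on held-out data.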

Outcome Reward Models (ORM)

ORMs evaluate the quality of the final response as a whole:

from reward_hub import AutoRM

# Load an ORM using HuggingFace backend
model = AutoRM.load("internlm/internlm2-7b-reward", load_method="hf", device=0)

scores = model.score([
    [
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "The answer is 4."}
    ]
])
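
Outcome scores like these are what Best-of-N sampling consumes: generate N candidate responses, score each one, and keep the highest-scoring candidate. The sketch below shows only the selection step, using made-up scores. The `best_of_n` helper is hypothetical and not part of RewardHub, and it assumes one float score per candidate.

```python
from typing import Sequence


def best_of_n(candidates: Sequence[str], scores: Sequence[float]) -> str:
    """Return the candidate with the highest reward score.

    Hypothetical helper for illustration; not a RewardHub function.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return candidates[best_index]


candidates = ["The answer is 5.", "The answer is 4.", "I am not sure."]
scores = [-1.2, 2.7, -0.3]  # made-up reward scores, one per candidate
print(best_of_n(candidates, scores))  # prints "The answer is 4."
```

In practice the scores would presumably come from a call such as `model.score(...)` with one conversation per candidate, provided the call returns one score per conversation.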

DrSow Reward Model

DrSow evaluates responses using the density ratio between a strong model and a weak model.

Launch the strong and weak models first:

bash scripts/launch_drsow.sh Qwen/Qwen2.5-32B-instruct Qwen/Qwen2.5-32B

Then you can launch client reward servers to access the DrSow reward model:

from reward_hub import AutoRM
from reward_hub.openai import DrSowConfig


drsow_config = DrSowConfig(
    strong_model_name="Qwen/Qwen2.5-32B-instruct",
    strong_port=8305,
    weak_model_name="Qwen/Qwen2.5-32B",
    weak_port=8306
)

model = AutoRM.load("drsow", load_method="openai", drsow_config=drsow_config)

# Get scores for responses
scores = model.score([
    [
        {"role": "user", "content": "What is 2+2?"},
        {"role": "assistant", "content": "The answer is 4."}
    ]
])
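
The density-ratio idea can be illustrated without any running servers. The sketch below assumes the reward is the log-likelihood of a response under the strong model minus its log-likelihood under the weak model, each computed by summing per-token log-probabilities. The function name and the numbers are hypothetical, and the actual DrSow implementation may differ, for example in normalization or prompting.

```python
from typing import Sequence


def log_density_ratio(
    strong_token_logprobs: Sequence[float],
    weak_token_logprobs: Sequence[float],
) -> float:
    """Return log p_strong(response) - log p_weak(response).

    Hypothetical helper for illustration; not a RewardHub function.
    Both sequences hold log-probabilities for the same response tokens.
    """
    if len(strong_token_logprobs) != len(weak_token_logprobs):
        raise ValueError("both models must score the same tokens")
    return sum(strong_token_logprobs) - sum(weak_token_logprobs)


# Made-up per-token log-probabilities for a three-token response.
strong = [-0.1, -0.2, -0.1]
weak = [-0.5, -0.9, -0.4]
reward = log_density_ratio(strong, weak)
print(reward > 0)  # True: the strong model finds this response more likely
```

A positive value means the strong model assigns the response a higher likelihood than the weak model does, which is the signal used as a reward under this assumption.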

Supported Backends

RewardHub supports multiple serving backends:

  • HuggingFace (load_method="hf"): Direct local model loading
  • vLLM (load_method="vllm"): Optimized local serving
  • OpenAI API (load_method="openai"): Remote API access

Supported Models

We support various reward models including:

Model                                   Type   HuggingFace   vLLM   OpenAI
Qwen/Qwen2.5-Math-PRM-7B                PRM
internlm/internlm2-7b-reward            ORM
RLHFlow/Llama3.1-8B-PRM-Deepseek-Data   PRM
RLHFlow/ArmoRM-Llama3-8B-v0.1           ORM
drsow                                   ORM

Research

RewardHub serves as the official implementation of the paper:
Dr. SoW: Density Ratio of Strong-over-weak LLMs for Reducing the Cost of Human Annotation in Preference Tuning

The paper introduces CDR, a novel approach to generating high-quality preference annotations using density ratios tailored to domain-specific needs.
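
To make "preference annotations" concrete, here is a minimal sketch of turning reward scores for several responses to one prompt into a chosen/rejected pair. The `build_preference_pair` helper, the record keys, and the scores are hypothetical. The paper's actual annotation pipeline is not reproduced here.

```python
from typing import Dict, Sequence


def build_preference_pair(
    prompt: str,
    responses: Sequence[str],
    scores: Sequence[float],
) -> Dict[str, str]:
    """Pick the highest- and lowest-scoring responses as a preference pair.

    Hypothetical helper for illustration; not a RewardHub function.
    """
    if len(responses) != len(scores):
        raise ValueError("responses and scores must have the same length")
    if len(responses) < 2:
        raise ValueError("need at least two responses to form a pair")
    # Sort by score only, ascending: first is rejected, last is chosen.
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0])
    return {"prompt": prompt, "chosen": ranked[-1][1], "rejected": ranked[0][1]}


pair = build_preference_pair(
    "What is 2+2?",
    ["The answer is 4.", "The answer is 5.", "It depends."],
    [1.4, -0.8, -0.2],  # made-up reward scores, one per response
)
print(pair["chosen"])    # prints "The answer is 4."
print(pair["rejected"])  # prints "The answer is 5."
```

Records in this prompt/chosen/rejected shape are a common input format for preference-tuning methods.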
