Lean, modular reward functions for RL training with LLMs
LLM Rewards
Lean, modular reward functions for RLHF training with LLMs. The design is framework-agnostic, with built-in support for trlx, trl, and custom training loops.
Install
From a local checkout of the repository:

    pip install -e .

Quick Start

    from llm_rewards import RewardModel, SimpleThinkReward, LengthReward, XMLReward, create_reward_fn

    # Create reward stack
    rewards = [
        LengthReward(target_length=1024, weight=0.1),
        XMLReward(weight=0.5, partial_credit=True),
        RewardModel("your/reward/model", weight=1.0),
        SimpleThinkReward(weight=0.5),
    ]

    # Get framework-agnostic reward function
    reward_fn = create_reward_fn(rewards, normalize=True)

    # Use with trlx
    from trlx import Trainer
    trainer = Trainer(reward_fn=reward_fn)
    trainer.train(...)

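To make the idea of a weighted, normalized reward stack concrete, here is a minimal standard-library sketch. It does not use or reproduce `llm_rewards` internals: `combine_rewards` and the `(function, weight)` component pairs are hypothetical names, and z-score normalization over the batch is an assumption about what `normalize=True` might mean.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: each component is a (scoring function, weight) pair.
Component = Tuple[Callable[[str], float], float]


def combine_rewards(
    texts: Sequence[str],
    components: Sequence[Component],
    normalize: bool = True,
) -> List[float]:
    """Weighted sum of component scores, optionally z-scored over the batch."""
    totals = [
        sum(weight * fn(text) for fn, weight in components) for text in texts
    ]
    if not normalize or len(totals) < 2:
        return totals
    mu = mean(totals)
    sigma = pstdev(totals)
    if sigma == 0.0:
        # All rewards are identical: no signal to rescale, return zeros.
        return [0.0 for _ in totals]
    return [(total - mu) / sigma for total in totals]
```

Normalizing per batch keeps the combined reward on a comparable scale when components with very different ranges are mixed, at the cost of making each reward depend on the other samples in the batch.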
Key Features
- Transformer reward models
- Reasoning validation (ThinkingReward)
- Length, format, XML validation
- Reference similarity
- Prompt relevance
- Framework adapters
- Batched inference
- Reward normalization
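As an illustration of what length and XML format rewards could compute, the following sketch uses only the standard library. It is not the package's implementation: `length_score` and `xml_score` are hypothetical helpers, measuring length in characters is an assumption, and the `think` and `answer` tag names are chosen only for the example.

```python
import re
from typing import Sequence


def length_score(text: str, target_length: int = 1024) -> float:
    """1.0 at the target length, falling linearly to 0.0 (characters assumed)."""
    deviation = abs(len(text) - target_length) / target_length
    return max(0.0, 1.0 - deviation)


def xml_score(
    text: str,
    tags: Sequence[str] = ("think", "answer"),  # assumed tag names
    partial_credit: bool = True,
) -> float:
    """Score how many of the expected <tag>...</tag> pairs appear in the text."""
    found = sum(
        1
        for tag in tags
        if re.search(rf"<{re.escape(tag)}>.*?</{re.escape(tag)}>", text, re.DOTALL)
    )
    if partial_credit:
        # Fraction of expected tag pairs that are present.
        return found / len(tags)
    # All-or-nothing: every expected tag pair must be present.
    return 1.0 if found == len(tags) else 0.0
```

With partial credit, a response that contains only one of the two tag pairs scores 0.5 instead of 0.0, which gives a denser training signal early on.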
Example Training Script
See example/train_example.py for a full Qwen-2.5 0.5B training example.
Custom Rewards

    import torch

    from llm_rewards import RewardFunction, RewardOutput

    class MyReward(RewardFunction):
        def compute(self, texts, **kwargs) -> RewardOutput:
            # score() is a placeholder for your own per-text scoring function
            rewards = [score(text) for text in texts]
            return RewardOutput(values=torch.tensor(rewards))

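The custom reward example calls a `score` function that the README leaves undefined. One hypothetical choice, shown here purely as an illustration, is a distinct-word ratio that penalizes repetitive text. It is not part of `llm_rewards`.

```python
def score(text: str) -> float:
    """Fraction of distinct words in the text (hypothetical example metric).

    Returns 1.0 when no word repeats and approaches 0.0 for highly
    repetitive text. Empty input scores 0.0.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

Any function that maps a string to a float can fill the same role, as long as the list it produces can be converted to a tensor.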
License
MIT
File details
Details for the file llm_rewards-0.0.1.tar.gz.
File metadata
- Download URL: llm_rewards-0.0.1.tar.gz
- Upload date:
- Size: 12.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ff76c39da6c6ec3010a41df2ee9f1466b2daf5ee6e0aab158110e620ca15310b |
| MD5 | 04e20464fdd9b95cf7d89e73bb4f107f |
| BLAKE2b-256 | 0de35a8f721d6de20a5dd8a51e17a2800397496e07c19a2f941684ccc7ea7d16 |
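A downloaded archive can be checked against the published SHA256 digest before installing. The sketch below uses Python's standard `hashlib`. The helper names `sha256_of` and `matches_published_hash` are illustrative, and the path passed in is assumed to point at a local copy of llm_rewards-0.0.1.tar.gz.

```python
import hashlib

# SHA256 digest listed above for llm_rewards-0.0.1.tar.gz
EXPECTED_SDIST_SHA256 = (
    "ff76c39da6c6ec3010a41df2ee9f1466b2daf5ee6e0aab158110e620ca15310b"
)


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(
    path: str, expected: str = EXPECTED_SDIST_SHA256
) -> bool:
    """True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected
```

pip can perform the same check itself when a requirements file pins hashes and is installed with `--require-hashes`.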
Provenance
The following attestation bundles were made for llm_rewards-0.0.1.tar.gz:
Publisher: publish.yml on dotpyu/llm-rewards
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_rewards-0.0.1.tar.gz
- Subject digest: ff76c39da6c6ec3010a41df2ee9f1466b2daf5ee6e0aab158110e620ca15310b
- Sigstore transparency entry: 165720143
- Sigstore integration time:
- Permalink: dotpyu/llm-rewards@6a02e4c4caa2621cd11dcfba06487b4e90eefbf0
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/dotpyu
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6a02e4c4caa2621cd11dcfba06487b4e90eefbf0
- Trigger Event: release
File details
Details for the file llm_rewards-0.0.1-py3-none-any.whl.
File metadata
- Download URL: llm_rewards-0.0.1-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a5ac23bd8081e927c670f0852b5858f367dc861c46e7160e9792fc59b6e863cf |
| MD5 | 7fe01d6a39ca3179b4288d81d567cc05 |
| BLAKE2b-256 | ea5e1f2a20f19497833958535d35efd8bf2695141f9f017c5837d0c82c651cfa |
Provenance
The following attestation bundles were made for llm_rewards-0.0.1-py3-none-any.whl:
Publisher: publish.yml on dotpyu/llm-rewards
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_rewards-0.0.1-py3-none-any.whl
- Subject digest: a5ac23bd8081e927c670f0852b5858f367dc861c46e7160e9792fc59b6e863cf
- Sigstore transparency entry: 165720144
- Sigstore integration time:
- Permalink: dotpyu/llm-rewards@6a02e4c4caa2621cd11dcfba06487b4e90eefbf0
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/dotpyu
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6a02e4c4caa2621cd11dcfba06487b4e90eefbf0
- Trigger Event: release