
RLVR training framework for LLMs


retrain

retrain is a TOML-first RLVR (Reinforcement Learning with Verifiable Rewards) trainer for LLMs.

If you are new, follow this path: install, explore the CLI commands, then run a tiny config.

Install

Requires Python 3.11+.

# CLI + docs exploration
uv tool install retrain

# Local GPU training (adds torch)
uv tool install "retrain[local]"

# Remote Tinker backend
uv tool install "retrain[tinker]"

If you are developing this repo directly:

pip install -e ".[dev]"

Explore the CLI

Use these first to understand what exists before you train:

retrain --help
retrain man
retrain man --topic quickstart
retrain man --list-topics
retrain backends
retrain doctor

Useful inspection commands while iterating:

retrain explain retrain.toml   # dry-run: what this config would do
retrain status logs            # summarize runs/campaigns under logs/
retrain plugins                # list built-ins + discovered plugins
retrain init-plugin --kind transform --name my_transform --with-test
retrain man --json --topic quickstart
retrain man --path             # editable bundled manual source

Tiny TOML Demo

Create mini.toml:

[model]
model = "Qwen/Qwen3-4B-Instruct-2507"

[algorithm]
advantage_mode = "grpo"
transform_mode = "none"

[training]
max_steps = 20
batch_size = 2
group_size = 8
max_tokens = 1024
lr = 4e-5

[backend]
backend = "local"
adapter_path = "adapters/mini"

[logging]
log_dir = "logs/mini"

Run it:

retrain mini.toml

Override fields from CLI without editing TOML:

retrain mini.toml --seed 42 --max-steps 40 --wandb-project my-project
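As a sanity check on the numbers in mini.toml, here is a back-of-envelope rollout budget. This sketch assumes the common GRPO convention (not confirmed by this page) that batch_size counts prompts per step and group_size counts completions sampled per prompt:

```python
# Rollout budget implied by mini.toml, assuming batch_size = prompts
# per step and group_size = completions per prompt (GRPO convention).
max_steps = 20
batch_size = 2
group_size = 8
max_tokens = 1024

rollouts_per_step = batch_size * group_size     # 16 completions per step
total_rollouts = max_steps * rollouts_per_step  # 320 completions overall
worst_case_tokens = total_rollouts * max_tokens # 327,680 generated tokens max

print(rollouts_per_step, total_rollouts, worst_case_tokens)
```

At 16 completions per step this stays small enough to smoke-test on a single GPU before scaling any field up.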

Quick Start from Template

retrain init --template quickstart
retrain retrain.toml

Other templates:

retrain init --list
retrain init --template experiment
retrain init --template campaign
retrain init --interactive

Why retrain

  • Composable advantage pipeline: GRPO/MaxRL + GTPO/HICRA/SEPA
  • Pluggable backends and inference engines
  • Pluggable rewards (match, math, judge, custom)
  • Campaign sweeps from one TOML
  • LoRA-Squeeze rank analysis/compression
  • Checkpoint resume and run status tooling
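To make the first bullet concrete, here is a minimal sketch of the group-relative advantage at the heart of GRPO: each completion's reward is normalized against the mean and standard deviation of its group. This is the textbook formula, not retrain's actual implementation; all names are illustrative:

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-6):
    """Textbook GRPO normalization: (r_i - mean(r)) / (std(r) + eps).

    Illustrative sketch only; not retrain's internal code.
    """
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    return [(r - mean) / (std + eps) for r in group_rewards]

# One group: 4 completions for the same prompt, scored 0/1 by a
# verifiable reward. Above-mean completions get positive advantage.
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Variants like GTPO, HICRA, and SEPA reshape how this per-group signal is distributed over tokens, which is why retrain treats advantages and transforms as separately composable stages.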

Common Config Patterns

Use verifiers environments from TOML:

[environment]
provider = "verifiers"
id = "primeintellect/gsm8k"
args = { split = "train" }
auto_install = true
max_turns = 8

Use custom advantage + transform plugins from TOML:

[algorithm]
advantage_mode = "my_advantages.hipa_like_advantages"
transform_mode = "my_transforms.make_transform_spec"
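As a rough sketch of what a module like my_advantages might contain: a function mapping a group's rewards to per-completion advantages. The signature below is purely illustrative (consult the bundled manual via retrain man for the real plugin interface); the body shows one plausible variant, a simple mean-baseline advantage without std normalization:

```python
def hipa_like_advantages(group_rewards):
    """Illustrative only: retrain's actual plugin signature may differ;
    see `retrain man` for the documented interface.

    Computes a mean-baseline advantage: r_i - mean(r).
    """
    baseline = sum(group_rewards) / len(group_rewards)
    return [r - baseline for r in group_rewards]

advs = hipa_like_advantages([2.0, 0.0, 1.0, 1.0])
# baseline = 1.0, so advantages are [1.0, -1.0, 0.0, 0.0]
```

retrain init-plugin --kind transform (shown above under CLI exploration) scaffolds this kind of module with a test, so you do not have to guess the interface by hand.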

Use a full algorithm plugin (replaces the composable advantage + transform path entirely):

[algorithm]
algorithm_mode = "my_algorithms.my_algorithm"

Documentation

Full docs: retrain.readthedocs.io

Contributor note: run retrain man --check in CI to detect stale auto-generated manual blocks, and retrain man --sync locally to refresh them.

Download files


Source Distribution

  • retrain-0.2.1.tar.gz (508.8 kB), uploaded as Source

Built Distribution

  • retrain-0.2.1-py3-none-any.whl (134.0 kB), uploaded as Python 3

File details

Details for the file retrain-0.2.1.tar.gz.

File metadata

  • Download URL: retrain-0.2.1.tar.gz
  • Size: 508.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.16

File hashes

  • SHA256: 033ea57f8edebd52a452acb53ba3e3b3a588c40e74c3dcf9884a7d92f9dc21a2
  • MD5: db02f5f46d1f903d74e18eb252522400
  • BLAKE2b-256: 349f6bc492bb525d82aa95a36f285322d2f656c9f9d243143e48b345ed0b88fd


File details

Details for the file retrain-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: retrain-0.2.1-py3-none-any.whl
  • Size: 134.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.16

File hashes

  • SHA256: 941027719849063bf9e389de9eac15b8a863b5e3212065d224c553aea6fbba6a
  • MD5: 4a60d8655117b623fa9d7b43b806c2c4
  • BLAKE2b-256: e2101b3de04ee418576db0f530e139f15c049e36bf8093676a331dfe043232d2

