RLVR training framework for LLMs

retrain

retrain is a TOML-first RLVR (Reinforcement Learning with Verifiable Rewards) trainer for LLMs, built to make experiments easier to run, compare, and repeat.

If you are new, start with install -> explore commands -> run a tiny config.

Install

Requires Python 3.11+.

# CLI + docs exploration
uv tool install retrain

# Local GPU training (adds torch)
uv tool install "retrain[local]"

# Remote Tinker backend
uv tool install "retrain[tinker]"

If you are developing this repo directly:

pip install -e ".[dev]"

Explore the CLI

Use these first to understand what exists before you train:

retrain --help
retrain man
retrain man --topic quickstart
retrain man --list-topics
retrain backends
retrain doctor

Useful inspection commands while iterating:

retrain explain retrain.toml   # dry-run: what this config would do
retrain status logs            # summarize runs/campaigns under logs/
retrain plugins                # list built-ins + discovered plugins
retrain init-plugin --kind transform --name my_transform --with-test
retrain man --json --topic quickstart
retrain man --path             # editable bundled manual source

Tiny TOML Demo

Create mini.toml:

Note: max_tokens = 1024 below is an intentional smoke-test setting. The standard default for full runs is max_tokens = 10240.

[model]
model = "Qwen/Qwen3-4B-Instruct-2507"

[algorithm]
advantage_mode = "grpo"
transform_mode = "none"

[training]
max_steps = 20
batch_size = 2
group_size = 8
max_tokens = 1024
lr = 4e-5

[backend]
backend = "local"
adapter_path = "adapters/mini"

[logging]
log_dir = "logs/mini"

Run it:

retrain mini.toml

Override fields from the CLI without editing the TOML:

retrain mini.toml --seed 42 --max-steps 40 --wandb-project my-project

Quick Start from Template

retrain init --template quickstart
retrain retrain.toml

Other templates:

retrain init --list
retrain init --template experiment
retrain init --template campaign
retrain init --interactive

retrain Workflow

The normal retrain loop is:

  1. Define TOML config (retrain.toml or campaign.toml)
  2. Dry-run with retrain explain ...
  3. Train with retrain ...
  4. Inspect with retrain status logs

Use retrain man --topic capacity only when you are sizing longer runs.
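When sizing a run, a back-of-the-envelope token budget helps. With the mini.toml values above, each step samples batch_size × group_size completions. The sketch below is an upper bound that assumes every completion runs to the full max_tokens; it is not retrain's own capacity accounting (see retrain man --topic capacity for that):

```python
# Worst-case completion-token budget per step for the mini.toml profile.
# Upper bound only: assumes every sampled completion hits max_tokens.
batch_size = 2      # prompts per step
group_size = 8      # completions sampled per prompt (the GRPO group)
max_tokens = 1024   # completion length cap

completions_per_step = batch_size * group_size             # 16 completions
max_completion_tokens = completions_per_step * max_tokens  # 16384 tokens

print(completions_per_step, max_completion_tokens)
```

Scaling max_tokens to the full-run default of 10240 multiplies this bound tenfold, which is why the smoke-test profile keeps it small.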

Why retrain

  • Experiment-first workflow: config -> explain -> run -> compare
  • Composable advantage pipeline: GRPO/MaxRL + GTPO/HICRA/SEPA
  • Pluggable backends and inference engines
  • Pluggable rewards (match, math, judge, custom)
  • Campaign sweeps from one TOML
  • LoRA-Squeeze rank analysis/compression
  • Checkpoint resume and run status tooling
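The campaign-sweep bullet above can be sketched as a single TOML file. This is a hypothetical shape for illustration only: the [sweep] table name and its list-valued keys are assumptions, and the real schema should be generated with retrain init --template campaign.

```toml
# Hypothetical campaign.toml sketch -- key names under [sweep] are assumptions.
# Generate the actual schema with: retrain init --template campaign
[model]
model = "Qwen/Qwen3-4B-Instruct-2507"

[training]
max_steps = 20

[sweep]
lr = [1e-5, 4e-5]
seed = [1, 2, 3]
```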

Common Config Patterns

Use verifiers environments from TOML:

[environment]
provider = "verifiers"
id = "primeintellect/gsm8k"
args = { split = "train" }
auto_install = true
max_turns = 8

Use custom advantage + transform plugins from TOML:

[algorithm]
advantage_mode = "my_advantages.hipa_like_advantages"
transform_mode = "my_transforms.make_transform_spec"
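For orientation, a custom advantage plugin such as my_advantages.hipa_like_advantages ultimately reduces to a function over one sampling group's rewards. The sketch below shows standard GRPO-style group normalization; the function name and signature are illustrative assumptions, not retrain's actual plugin interface (use retrain init-plugin and the manual for that):

```python
# Sketch of the computation behind a GRPO-style advantage plugin.
# Name and signature are assumptions; retrain's real plugin interface
# is documented via `retrain init-plugin` and `retrain man`.
def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against its sampling group's mean and std."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    # eps guards against a zero-variance group (all rewards identical)
    return [(r - mean) / (std + eps) for r in rewards]

print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))
```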

Use a full algorithm plugin (overrides composable advantage+transform path):

[algorithm]
algorithm_mode = "my_algorithms.my_algorithm"

Documentation

Full docs: retrain.readthedocs.io

Contributor note:

  • Run retrain man --check in CI to detect stale auto-generated manual blocks.
  • Run retrain man --sync locally to refresh them.
  • Run uv run mkdocs build --strict before publishing docs changes.
  • Run make chaos-backend-workflow before pushing backend/orchestrator changes.

Download files

Source distribution: retrain-0.3.1.tar.gz (688.5 kB)
Built distribution: retrain-0.3.1-py3-none-any.whl (208.7 kB)

File details: retrain-0.3.1.tar.gz

  • Size: 688.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

Hashes for retrain-0.3.1.tar.gz:

SHA256      d3a5c9e5373385b84533ddffda9474bdc5e7707eeaae8dde1788291a4686352b
MD5         df0078d1c100bfc26de7d5cf54f122a1
BLAKE2b-256 9ab9f65652f0f3c80eb2d1fa8372e481a7e59b296c8f7cdd517aeb89b3b0e40d

Provenance: attestation bundle published by publish.yml on teilomillet/retrain.

File details: retrain-0.3.1-py3-none-any.whl

  • Size: 208.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing: Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

Hashes for retrain-0.3.1-py3-none-any.whl:

SHA256      a832c45197167b6e7f531d85ada86b53e780097a7c4f18d1b33ce944a7bfa89c
MD5         62df3f0630b4624b2f9b1447b9132c78
BLAKE2b-256 8fbf5e6156fd4e27b7a554c68b26739a0b09d9a4261ae64ddd4415594ef1a3d1

Provenance: attestation bundle published by publish.yml on teilomillet/retrain.
