
Verifiers: Environments for LLM Reinforcement Learning


News & Updates

  • [05/07/26] v0.1.14 is released, featuring the v1 Taskset/Harness API, shared eval and training config shape, model-family starter configs, OpenAI Responses and renderer-backed clients, per-turn timing, GEPA prompt artifacts, Lean guard markers, and release/infrastructure hardening.
  • [04/28/26] v0.1.13.dev8 is released, featuring per-rollout wall-clock timeouts for MultiTurnEnv, CLI timeout config, sandbox timeout propagation, and smaller CliAgentEnv and RLM fixes.
  • [04/17/26] v0.1.12 is released, featuring upstreamed opencode and RLM harnesses/tasksets, major RLMEnv improvements (context dropping, prompt builder, hardened transport), multi-worker env server support, expanded vf-tui capabilities, and richer eval configuration.
  • [03/12/26] v0.1.11 is released, featuring a unified client stack, major RLMEnv and env server reliability improvements, a substantially refined eval TUI, new pass@k and ablation sweep support, and bundled opencode environments.
  • [02/10/26] v0.1.10 is released, featuring OpenEnv and BrowserEnv integrations, resumed evals, improved rollout and token tracking, safer sandbox lifecycle behavior, refreshed workspace setup, and opencode harbor improvements.
  • [01/08/26] v0.1.9 is released, featuring a number of new experimental environment class types, monitor rubrics for automatic metric collection, improved workspace setup flow, improved error handling, bug fixes, and a documentation overhaul.
  • [11/19/25] v0.1.8 is released, featuring a major refactor of the rollout system to use trajectory-based tracking for token-in token-out training across turns, as well as support for truncated or branching rollouts.
  • [11/07/25] v0.1.7 is released, featuring an improved quickstart configuration for training with prime-rl, a newly bundled "nano" trainer (vf.RLTrainer, replacing vf.GRPOTrainer), and a number of bug fixes and documentation improvements.
  • [10/27/25] A new iteration of the Prime Intellect Environments Program is live!

Overview

Verifiers is our library for creating environments to train and evaluate LLMs.

Environments contain everything required to run and evaluate a model on a particular task:

  • A dataset of task inputs
  • A harness for the model (tools, sandboxes, context management, etc.)
  • A reward function or rubric to score the model's performance
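To make the three parts concrete, here is a minimal plain-Python sketch. It does not use the verifiers API: `run_environment`, `reversing_model`, and `exact_match` are hypothetical names chosen for illustration, and the "model" is a toy function rather than an LLM.

```python
from typing import Callable

# Hypothetical stand-ins for illustration only; this is not the verifiers API.
Dataset = list[dict[str, str]]


def exact_match(completion: str, answer: str) -> float:
    """Reward function: 1.0 when the stripped completion equals the answer."""
    return 1.0 if completion.strip() == answer else 0.0


def run_environment(
    dataset: Dataset,
    model: Callable[[str], str],
    reward: Callable[[str, str], float],
) -> float:
    """Harness: send each task input to the model, score it, return the mean reward."""
    scores = [reward(model(row["question"]), row["answer"]) for row in dataset]
    return sum(scores) / len(scores)


def reversing_model(question: str) -> str:
    """Toy 'model' that reverses the last word of the prompt."""
    return question.rstrip(".").split()[-1][::-1]


# Dataset of task inputs with reference answers; the last answer is deliberately wrong.
dataset = [
    {"question": "Reverse abc.", "answer": "cba"},
    {"question": "Reverse hello.", "answer": "olleh"},
    {"question": "Reverse level.", "answer": "nope"},
]

mean_reward = run_environment(dataset, reversing_model, exact_match)
print(f"mean reward: {mean_reward:.3f}")
```

Two of the three toy tasks score 1.0, so the mean reward is two thirds. In a real environment the harness would also own tools, sandboxes, and context management.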

Environments can be used for training models with reinforcement learning (RL), evaluating capabilities, generating synthetic data, experimenting with agent harnesses, and more.

Verifiers is tightly integrated with the Environments Hub, as well as our training framework prime-rl and our Hosted Training platform.

Getting Started

Ensure you have uv installed, as well as the prime CLI tool:

# install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# install the prime CLI
uv tool install prime
# log in to the Prime Intellect platform
prime login

To set up a new workspace for developing environments, do:

# run inside your workspace directory, e.g. ~/dev/my-lab
prime lab setup

This sets up a Python project if needed (with uv init), installs verifiers (with uv add verifiers), creates the recommended workspace structure, and downloads useful starter files:

configs/
├── endpoints.toml      # OpenAI-compatible API endpoint configuration
├── rl/                 # Example configs for Hosted Training
├── eval/               # Example multi-environment eval configs
└── gepa/               # Example configs for prompt optimization
.prime/
└── skills/             # Bundled workflow skills for create/browse/review/eval/GEPA/train/brainstorm
environments/
└── AGENTS.md           # Documentation for AI coding agents
AGENTS.md               # Top-level documentation for AI coding agents
CLAUDE.md               # Claude-specific pointer to AGENTS.md

Alternatively, add verifiers to an existing project:

uv add verifiers && prime lab setup --skip-install

Environments built with Verifiers are self-contained Python modules. To initialize a fresh environment template, do:

prime env init my-env # creates a new template in ./environments/my_env

Add an explicit harness loader when the environment owns harness behavior:

prime env init my-env --with-harness

For OpenEnv integration, use:

prime env init my-openenv --openenv

Then copy your OpenEnv project into environments/my_openenv/proj/ and build the image with:

uv run vf-build my-openenv

The prime env init my-env command above creates a new module called my_env with a basic environment template:

environments/my_env/
├── my_env.py           # Main implementation
├── pyproject.toml      # Dependencies and metadata
└── README.md           # Documentation
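The examples above pair the hyphenated ID my-env with the module directory my_env. The sketch below reproduces that mapping under the assumption that it is a plain hyphen-to-underscore substitution; `module_name_for` and `template_files` are hypothetical helpers, not part of the prime CLI.

```python
from pathlib import PurePosixPath


def module_name_for(env_id: str) -> str:
    """Map an environment ID to a module name.

    Assumption: hyphens become underscores, as my-env -> my_env suggests.
    """
    return env_id.replace("-", "_")


def template_files(env_id: str, root: str = "environments") -> list[str]:
    """List the three template files shown in the tree above."""
    module = module_name_for(env_id)
    base = PurePosixPath(root) / module
    return [str(base / name) for name in (f"{module}.py", "pyproject.toml", "README.md")]


files = template_files("my-env")
print("\n".join(files))
```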

Environment modules should expose a load_environment function which returns an environment object. For simple legacy environments, this can still be a direct constructor:

# my_env.py
import verifiers as vf

def load_environment(dataset_name: str = 'gsm8k') -> vf.Environment:
    dataset = vf.load_example_dataset(dataset_name)  # rows include a 'question' column
    # Reward 1.0 only when the final message matches the reference answer exactly.
    async def correct_answer(completion, answer) -> float:
        completion_ans = completion[-1]['content']
        return 1.0 if completion_ans == answer else 0.0
    rubric = vf.Rubric(funcs=[correct_answer])
    env = vf.SingleTurnEnv(dataset=dataset, rubric=rubric)
    return env
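The reward function in this example can be exercised on its own, without a model or the library installed. The sketch below copies its logic and assumes the completion is a list of chat messages with the newest last, which is what the `completion[-1]['content']` lookup implies; the sample messages are made up.

```python
import asyncio


async def correct_answer(completion, answer) -> float:
    # Same logic as the rubric function above: compare the final message to the answer.
    completion_ans = completion[-1]["content"]
    return 1.0 if completion_ans == answer else 0.0


# Assumed message shape: a list of chat messages, newest last.
exact = [{"role": "assistant", "content": "42"}]
verbose = [{"role": "assistant", "content": "The answer is 42."}]

exact_score = asyncio.run(correct_answer(exact, "42"))
verbose_score = asyncio.run(correct_answer(verbose, "42"))
print(exact_score, verbose_score)
```

Exact string comparison is strict: the verbose reply contains the right number but still scores 0.0, so a real rubric would usually extract or normalize the answer before comparing.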

For new environments with reusable tasksets, toolsets, custom programs, or custom harnesses, use the v1 Taskset/Harness path:

# my_env.py
import verifiers as vf


class MyTasksetConfig(vf.TasksetConfig):
    system_prompt: vf.SystemPrompt = "Reverse text exactly."


class MyTaskset(vf.Taskset[MyTasksetConfig]):
    def load_tasks(self, split: vf.TaskSplit = "train") -> vf.Tasks:
        rows = [
            {
                "prompt": [{"role": "user", "content": "Reverse abc."}],
                "answer": "cba",
                "split": "train",
                "max_turns": 1,
            }
        ]
        return [row for row in rows if row["split"] == split]

    @vf.reward(weight=1.0)
    async def contains_answer(self, task, state) -> float:
        return float(task["answer"] in str(state.get("completion") or ""))


def load_taskset(config: MyTasksetConfig) -> MyTaskset:
    return MyTaskset(config=config)


def load_environment(config: vf.EnvConfig) -> vf.Env:
    """Loader pattern for all Taskset/Harness environments."""
    return vf.Env(
        taskset=vf.load_taskset(config=config.taskset),
        harness=vf.load_harness(config=config.harness),
    )
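The split filter and the reward in `MyTaskset` are ordinary Python, so their behavior can be checked in isolation. The sketch below restates both as free functions. It assumes the state can be modeled as a plain dict with a string completion, and it uses "eval" as a hypothetical second split name; neither is confirmed by the library.

```python
ROWS = [
    {
        "prompt": [{"role": "user", "content": "Reverse abc."}],
        "answer": "cba",
        "split": "train",
        "max_turns": 1,
    }
]


def load_tasks(split: str = "train") -> list[dict]:
    """Keep only the rows tagged with the requested split."""
    return [row for row in ROWS if row["split"] == split]


def contains_answer(task: dict, state: dict) -> float:
    """1.0 when the answer appears anywhere in the completion text."""
    return float(task["answer"] in str(state.get("completion") or ""))


train_tasks = load_tasks("train")
eval_tasks = load_tasks("eval")  # hypothetical split name; no rows carry it

task = train_tasks[0]
hit = contains_answer(task, {"completion": "Reversed: cba"})
miss = contains_answer(task, {"completion": "abc"})
empty = contains_answer(task, {})  # missing completion falls back to ""
print(len(train_tasks), len(eval_tasks), hit, miss, empty)
```

Unlike an exact-match reward, the substring check accepts answers wrapped in extra text, and the `or ""` fallback keeps a missing completion from raising.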

The child loader annotation defines the taskset config shape; root load_environment stays typed as vf.EnvConfig. See BYO Harness for the advanced v1 taskset/harness API.

Reusable taskset and harness packages live in tasksets and harnesses. Install them with uv add "verifiers[packages]", or with the narrower verifiers[tasksets], verifiers[harnesses], and backend-specific extras. For example, Harbor task directories can run through the bundled OpenCode CLI harness with:

import verifiers as vf
from harnesses import OpenCode, OpenCodeConfig
from tasksets import HarborTaskset, HarborTasksetConfig

# Harbor tasks run through the bundled OpenCode CLI harness.
env = vf.Env(
    taskset=HarborTaskset(config=HarborTasksetConfig(bundle_package=__name__)),
    harness=OpenCode(config=OpenCodeConfig()),
)

The same environment package is the unit used by evals and prime-rl. The trainer owns model, endpoint, sampling, and rollout count; v1-specific options stay on the taskset or harness config that owns them:

# configs/rl/my-v1-env.toml
model = "Qwen/Qwen3-30B-A3B-Instruct-2507"
max_steps = 100
batch_size = 256
rollouts_per_example = 8

[sampling]
max_tokens = 4096

[[env]]
id = "my-env"

[env.harness]
max_turns = 1

[env.taskset]
system_prompt = "Reverse text exactly."

[env.taskset.scoring.contains_answer]
weight = 1.0

To install the environment module into your project, do:

prime env install my-env

For self-managed training launch commands, use the prime-rl documentation.

To run a local evaluation with any OpenAI-compatible model, do:

prime eval run my-env -m openai/gpt-5-nano # run and save eval results locally

Evaluations use Prime Inference by default; configure your own API endpoints in ./configs/endpoints.toml.
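For context, "OpenAI-compatible" usually means the endpoint accepts the chat completions request shape. The sketch below builds such a request with the standard library and does not send it. The base URL and API key are placeholders, `build_chat_request` is a hypothetical helper, and the layout of endpoints.toml is not shown here, so nothing below reflects that file's format or the exact fields verifiers sends.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder endpoint and key; replace with values for your own provider.
request = build_chat_request(
    "https://example.invalid/v1", "sk-placeholder", "openai/gpt-5-nano", "Reverse abc."
)
print(request.get_method(), request.full_url)
```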

View local evaluation results in the terminal UI:

prime eval view

To publish the environment to the Environments Hub, do:

prime env push --path ./environments/my_env

To run an evaluation directly from the Environments Hub, do:

prime eval run primeintellect/math-python

Documentation

Environments - Create datasets, rubrics, and custom multi-turn interaction protocols.

BYO Harness - Build v1 Taskset/Harness environments with custom tools, sandboxes, users, and custom programs.

Evaluation - Evaluate models using your environments.

Training - Train models in your environments with reinforcement learning.

Development - Contributing to verifiers.

API Reference - Understanding the API and data structures.

FAQs - Other frequently asked questions.

Citation

Originally created by Will Brown (@willccbb).

If you use this code in your research, please cite:

@misc{brown_verifiers_2025,
  author       = {William Brown},
  title        = {{Verifiers}: Environments for LLM Reinforcement Learning},
  howpublished = {\url{https://github.com/PrimeIntellect-ai/verifiers}},
  note         = {Commit abcdefg • accessed DD Mon YYYY},
  year         = {2025}
}
