
The OpenPipe Agent Reinforcement Training (ART) library

Reason this release was yanked:

Unsloth upgrade causing issues

Project description


Agent Reinforcement Trainer

Train multi-step agents for real-world tasks using GRPO.


🚀 W&B Training: Serverless RL

W&B Training (Serverless RL) is the first publicly available service for flexibly training models with reinforcement learning. It manages your training and inference infrastructure automatically, letting you focus on defining your data, environment, and reward function. The result: faster feedback cycles, lower costs, and far less DevOps.

Key Benefits:

  • 40% lower cost - Multiplexing on shared production-grade inference cluster
  • 28% faster training - Scale to 2000+ concurrent requests across many GPUs
  • Zero infra headaches - Fully managed infrastructure that stays healthy
  • Instant deployment - Every checkpoint instantly available via W&B Inference
```python
# Before: hours of GPU setup and infra management
# RuntimeError: CUDA error: out of memory 😢

# After: Serverless RL with instant feedback
import art
from art.serverless.backend import ServerlessBackend

model = art.TrainableModel(
    project="voice-agent",
    name="agent-001",
    base_model="OpenPipe/Qwen3-14B-Instruct",
)

backend = ServerlessBackend(api_key="your_wandb_api_key")
model.register(backend)
# Edit and iterate in minutes, not hours!
```

📖 Learn more about W&B Training →

ART Overview

ART is an open-source RL framework that improves agent reliability by allowing LLMs to learn from experience. ART provides an ergonomic harness for integrating GRPO into any Python application. For a quick hands-on introduction, run one of the notebooks below. When you're ready to learn more, check out the docs.
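Under GRPO, several rollouts of the same task are scored as a group, and each rollout's reward is normalized against its peers. A minimal sketch of that group-relative advantage computation in plain Python (illustrative only, not ART's internal code):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages for one group of rollouts.

    Each rollout's advantage is its reward normalized against the group's
    mean and standard deviation, so the policy is pushed toward rollouts
    that beat their peers on the same task.
    """
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # Identical rewards carry no learning signal for this group.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Four rollouts of one task: the best rollout gets a positive advantage,
# the worst a negative one, and the advantages sum to zero.
advantages = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because advantages are relative within a group, GRPO needs no separate value model: a rollout only has to outperform its siblings, not hit an absolute reward target.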

📒 Notebooks

| Agent Task | Example Notebook | Description | Comparative Performance |
|---|---|---|---|
| ART•E [Serverless] | 🏋️ Train agent | Qwen3 14B learns to search emails using RULER | benchmarks |
| 2048 [Serverless] | 🏋️ Train agent | Qwen3 14B learns to play 2048 | benchmarks |
| ART•E LangGraph | 🏋️ Train agent | Qwen 2.5 7B learns to search emails using LangGraph | [Link coming soon] |
| MCP•RL | 🏋️ Train agent | Qwen 2.5 3B masters the NWS MCP server | [Link coming soon] |
| Temporal Clue | 🏋️ Train agent | Qwen 2.5 7B learns to solve Temporal Clue | [Link coming soon] |
| Tic Tac Toe | 🏋️ Train agent | Qwen 2.5 3B learns to play Tic Tac Toe | benchmarks |
| Codenames | 🏋️ Train agent | Qwen 2.5 3B learns to play Codenames | benchmarks |
| AutoRL [RULER] | 🏋️ Train agent | Train Qwen 2.5 7B to master any task | [Link coming soon] |
| Distillation (SFT) | 🏋️ Train model | Distill text-to-SQL from Qwen 3 235B to Qwen 3 30B | [Link coming soon] |
| Summarizer (SFT + RL) | 🏋️ Train model | Train a document summarizer with SFT warmup then RL | [Link coming soon] |
| SFT from a dataset | 🏋️ Train model | Fine-tune Qwen 3 30B on text-to-SQL from a dataset | [Link coming soon] |

📰 ART News

Explore our latest research and updates on building SOTA agents.

📖 See all blog posts →

Why ART?

  • ART provides convenient wrappers for introducing RL training into existing applications. The training server is abstracted into a modular service, so your application code never has to manage it directly.
  • Train from anywhere. Run the ART client on your laptop and let the ART server kick off an ephemeral GPU-enabled environment, or run on a local GPU.
  • Integrations with hosted platforms like W&B, Langfuse, and OpenPipe provide flexible observability and simplify debugging.
  • ART is customizable with intelligent defaults. You can configure training parameters and inference engine configurations to meet specific needs, or take advantage of the defaults, which have been optimized for training efficiency and stability.
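As an illustration of the "intelligent defaults you can override" pattern, here is a hypothetical settings object. The field names below are invented for this example and are not ART's actual configuration API; consult the docs for the real parameters:

```python
from dataclasses import dataclass

@dataclass
class TrainSettings:
    # Hypothetical knobs, named only for illustration.
    learning_rate: float = 1e-5    # default tuned for stability
    groups_per_step: int = 2       # trajectory groups per GRPO update
    max_seq_length: int = 8192     # context budget for rollouts

# Defaults work out of the box...
defaults = TrainSettings()
# ...but any field can be overridden for a specific run.
custom = TrainSettings(learning_rate=5e-6)
```

The design choice being illustrated: every parameter has a sensible default, so the zero-config path is always valid, while each run can still override exactly the fields it cares about.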

Installation

ART agents can be trained from any client machine that runs Python. To add ART to an existing project, run this command:

pip install openpipe-art

🤖 ART•E Agent

Curious about how to use ART for a real-world task? Check out the ART•E Agent blog post, where we detail how we trained Qwen 2.5 14B to beat o3 at email retrieval!

🔁 Training Loop Overview

ART's functionality is divided into a client and a server. The OpenAI-compatible client is responsible for interfacing between ART and your codebase. Using the client, you can pass messages and get completions from your LLM as it improves. The server runs independently on any machine with a GPU. It abstracts away the complexity of the inference and training portions of the RL loop while allowing for some custom configuration. An outline of the training loop is shown below:

  1. Inference

    1. Your code uses the ART client to perform an agentic workflow (usually executing several rollouts in parallel to gather data faster).
    2. Completion requests are routed to the ART server, which runs the model's latest LoRA in vLLM.
    3. As the agent executes, each system, user, and assistant message is stored in a Trajectory.
    4. When a rollout finishes, your code assigns a reward to its Trajectory, indicating the performance of the LLM.
  2. Training

    1. Once all rollouts have finished, their Trajectories are grouped and sent to the server. Inference is blocked while training executes.
    2. The server trains your model using GRPO, initializing from the latest checkpoint (or an empty LoRA on the first iteration).
    3. The server saves the newly trained LoRA to a local directory and loads it into vLLM.
    4. Inference is unblocked and the loop resumes at step 1.

This training loop runs until a specified number of inference and training iterations have completed.
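The loop above can be sketched end-to-end with stand-in functions. Everything here (`rollout`, `train_step`, the reward formula) is a placeholder chosen to show the control flow, not ART's real client/server API:

```python
import random

random.seed(0)

def rollout(policy_version):
    """Stand-in for one agentic workflow: returns a (trajectory, reward) pair."""
    trajectory = [
        {"role": "user", "content": "task"},
        {"role": "assistant", "content": f"answer@v{policy_version}"},
    ]
    # Placeholder reward; in a real run your code scores the finished rollout.
    reward = random.random() + 0.1 * policy_version
    return trajectory, reward

def train_step(group, policy_version):
    """Stand-in for the server's GRPO update: consumes a group of
    trajectories and produces the next checkpoint version."""
    assert len(group) > 0
    return policy_version + 1

policy_version = 0
for iteration in range(3):
    # 1. Inference: gather several rollouts for the same task
    #    (ART typically runs these in parallel; sequential here).
    group = [rollout(policy_version) for _ in range(4)]
    # 2. Training: send the grouped trajectories to the server; inference
    #    blocks until the new LoRA is saved and loaded back into vLLM.
    policy_version = train_step(group, policy_version)

print(policy_version)  # → 3
```

Each pass through the loop corresponds to one inference-plus-training iteration in the outline above, with the checkpoint version advancing once per GRPO update.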

🧩 Supported Models

ART should work with most vLLM/HuggingFace-transformers compatible causal language models, or at least the ones supported by Unsloth. Gemma 3 does not appear to be supported for the time being. If any other model isn't working for you, please let us know on Discord or open an issue on GitHub!

🤝 Contributing

ART is in active development, and contributions are most welcome! Please see the CONTRIBUTING.md file for more information.

📖 Citation

@misc{hilton2025art,
  author = {Brad Hilton and Kyle Corbitt and David Corbitt and Saumya Gandhi and Angky William and Bohdan Kovalevskyi and Andie Jones},
  title = {ART: Agent Reinforcement Trainer},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/openpipe/art}}
}

⚖️ License

This repository's source code is available under the Apache-2.0 License.

🙏 Credits

ART stands on the shoulders of giants. While we owe many of the ideas and early experiments that led to ART's development to the open source RL community at large, we're especially grateful to the authors of the following projects:

Finally, thank you to our partners who've helped us test ART in the wild! We're excited to see what you all build with it.


Download files

Source Distribution

openpipe_art-0.5.14.tar.gz (8.2 MB)

Uploaded: Source

Built Distribution

openpipe_art-0.5.14-py3-none-any.whl (23.3 kB)

Uploaded: Python 3

File details

Details for the file openpipe_art-0.5.14.tar.gz.

File metadata

  • Download URL: openpipe_art-0.5.14.tar.gz
  • Upload date:
  • Size: 8.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openpipe_art-0.5.14.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | 380fa1c1e39c8275a68979d12b3a6be583d06dc4c292c09c8d04e0ab946568d2 |
| MD5 | 9e75c79afc62004269a3e192a270eedd |
| BLAKE2b-256 | 129f9c91165f6d65bacb07875e2b61a5f9efb8001e6ddb00c2bb84b17b3ae526 |


File details

Details for the file openpipe_art-0.5.14-py3-none-any.whl.

File metadata

  • Download URL: openpipe_art-0.5.14-py3-none-any.whl
  • Upload date:
  • Size: 23.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for openpipe_art-0.5.14-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | 053bf12850f55b6ba691409872df5f4b91d79b83addac404d105ad0082bbee4d |
| MD5 | adba2850a25c3c87d6d350c564e26795 |
| BLAKE2b-256 | a8a85cff1f9ce8be91f6e067e391707169225a6fba178c8b6d760f6f30668f99 |

