A framework for optimizing DSPy programs with RL.
🚀 Installation
Install Arbor via uv (recommended) or pip:
uv pip install -U arbor-ai
# or: pip install -U arbor-ai
If you need the latest DSPy features that haven't landed on PyPI yet, install directly from the main branch:
uv pip install -U git+https://github.com/stanfordnlp/dspy.git@main
Optionally, you can also install flash attention to speed up inference. Note that it can take 15+ minutes to build on some setups:
uv pip install flash-attn --no-build-isolation
⚡ Quick Start
Optimize a DSPy Program with RL
import random
import dspy
from datasets import load_dataset
import arbor
from arbor import ArborGRPO, ArborProvider
# Start Arbor server (starts in background)
arbor_server_info = arbor.init()
# Load a small English→French dataset
raw_dataset = load_dataset("Helsinki-NLP/opus_books", "en-fr")
raw_data = [
dspy.Example(english=ex["translation"]["en"], french=ex["translation"]["fr"]).with_inputs("english")
for ex in raw_dataset["train"]
][:2000]
# Train / validation split
random.Random(43).shuffle(raw_data)
trainset = raw_data[:1000]
valset = raw_data[1000:1100]
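The split above is deterministic thanks to the fixed seed. A standalone sketch with placeholder data (plain integers standing in for the `dspy.Example` objects) shows the resulting sizes:

```python
import random

data = list(range(2000))           # placeholder for the 2,000 dspy.Example objects
random.Random(43).shuffle(data)    # fixed seed -> reproducible shuffle
trainset = data[:1000]
valset = data[1000:1100]
print(len(trainset), len(valset))  # 1000 100
```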
# Define the task
translate_program = dspy.Predict("english -> french")
# Connect DSPy to Arbor
provider = ArborProvider()
lm_name = "Qwen/Qwen2.5-1.5B-Instruct"
lm = dspy.LM(
model=f"openai/arbor:{student_lm_name}",
provider=provider,
api_base=arbor_server_info["base_url"],
api_key="arbor",
# Arbor checks to make sure these match the training config
temperature=1.0,
top_p=1.0,
top_k=-1,
repetition_penalty=1.0,
max_tokens=2048,
)
translate_program.set_lm(lm)
# Simple reward: number of unique letters in the French output
def unique_letter_reward(example, pred, trace=None) -> float:
letters = [ch.lower() for ch in pred.french if ch.isalpha()]
return float(len(set(letters)))
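As a quick sanity check, the reward can be exercised standalone; here `types.SimpleNamespace` is used as a stand-in for the `dspy.Prediction` object, which only needs a `french` attribute:

```python
from types import SimpleNamespace

def unique_letter_reward(example, pred, trace=None) -> float:
    letters = [ch.lower() for ch in pred.french if ch.isalpha()]
    return float(len(set(letters)))

# "Bonjour le monde!" -> {b, o, n, j, u, r, l, e, m, d} -> 10 unique letters
pred = SimpleNamespace(french="Bonjour le monde!")
print(unique_letter_reward(None, pred))  # 10.0
```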
# NOTE: This config assumes 4 GPUs (3 for training + 1 for inference).
train_kwargs = {
"per_device_train_batch_size": 2,
"gradient_accumulation_steps": 24/6, # 21 (rollouts per example) * 6 (num dspy examples per grpo step) / 6 (gpus * per device batch size)
"temperature": 1.0,
"top_k": -1,
"top_p": 1.0,
"repetition_penalty": 1.0,
"beta": 0.00,
"learning_rate": 1e-6,
"gradient_checkpointing": True,
"fp16": True,
"lr_scheduler_type": "constant_with_warmup",
"loss_type": "dapo",
"max_steps": 1000,
"report_to": "wandb",
"log_completions": True,
"logging_steps": 1,
"max_prompt_length": None,
"max_completion_length": None,
"scale_rewards": False,
"max_grad_norm": 1.0,
"lora_config": {
"lora_alpha": 16,
"lora_dropout": 0.05,
"r": 8,
"target_modules": ["q_proj", "k_proj", "v_proj", "o_proj", "up_proj", "down_proj", "gate_proj"],
},
"num_training_gpus": 3,
"num_inference_gpus": 1,
"weight_decay": 0.001,
}
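The `gradient_accumulation_steps` value follows from the batch arithmetic in its comment. A minimal sketch of that calculation, assuming the 24 rollouts per GRPO step are spread evenly across the 3 training GPUs with a per-device batch of 2:

```python
num_rollouts_per_grpo_step = 24   # matches num_rollouts_per_grpo_step below
num_training_gpus = 3
per_device_train_batch_size = 2

# Each optimizer step must consume all rollouts from one GRPO step,
# so accumulation makes up the difference between the global batch
# (gpus * per-device batch) and the rollout count.
grad_accum = num_rollouts_per_grpo_step // (num_training_gpus * per_device_train_batch_size)
print(grad_accum)  # 4
```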
# Optimize with Arbor's GRPO trainer
compiler = ArborGRPO(
metric=unique_letter_reward,
num_dspy_examples_per_grpo_step=6,
num_rollouts_per_grpo_step=24,
exclude_demos=True,
num_train_steps=1000,
num_threads=16,
use_train_as_val=False,
num_steps_for_val=50,
train_kwargs=train_kwargs,
checkpoint="single-best",
)
# Run optimization
optimized_translate = compiler.compile(
student=translate_program,
trainset=trainset,
valset=valset,
)
print(optimized_translate(english="hello"))
Troubleshooting
NCCL Errors: Certain GPU setups, particularly with newer GPUs, can hit NCCL issues that cause Arbor to crash. These can often be fixed with the following environment variables:
export NCCL_P2P_DISABLE=1
export NCCL_IB_DISABLE=1
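If you prefer, the same flags can be set from Python at the top of your training script, before torch or NCCL are initialized:

```python
import os

# Must be set before any CUDA/NCCL initialization takes place
os.environ["NCCL_P2P_DISABLE"] = "1"
os.environ["NCCL_IB_DISABLE"] = "1"
```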
NVCC: If you run into build issues, double-check that nvcc is installed:
nvcc --version
Community
- Join our Discord for help, updates, and discussions: Arbor Discord
- Join the DSPy Discord for help, updates, and discussion on DSPy: DSPy Discord
🙏 Acknowledgements
Arbor builds on the shoulders of great work. We extend our thanks to:
📚 Citation
If you use this code in your research, please cite:
@article{ziems2025multi,
title={Multi-module GRPO: Composing policy gradients and prompt optimization for language model programs},
author={Ziems, Noah and Soylu, Dilara and Agrawal, Lakshya A and Miller, Isaac and Lai, Liheng and Qian, Chen and Song, Kaiqiang and Jiang, Meng and Klein, Dan and Zaharia, Matei and others},
journal={arXiv preprint arXiv:2508.04660},
year={2025}
}
@article{agrawal2025gepa,
title={GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning},
author={Agrawal, Lakshya A and Tan, Shangyin and Soylu, Dilara and Ziems, Noah and Khare, Rishi and Opsahl-Ong, Krista and Singhvi, Arnav and Shandilya, Herumb and Ryan, Michael J and Jiang, Meng and others},
journal={arXiv preprint arXiv:2507.19457},
year={2025}
}
File details

Details for the file arbor_ai-0.2.17.tar.gz.

File metadata

- Download URL: arbor_ai-0.2.17.tar.gz
- Size: 92.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21 (Ubuntu 24.04)

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 5da78af62bdd6e6e62dd12d3264eb7700920299453f6ae4bcf88b948c87bf75e |
| MD5 | b5aae499d4621337bd1a50eae21b91e5 |
| BLAKE2b-256 | 812b986a6057aa9e8142af925e524a57ecab42576cb9a9f494022e01bf7d293b |
File details

Details for the file arbor_ai-0.2.17-py3-none-any.whl.

File metadata

- Download URL: arbor_ai-0.2.17-py3-none-any.whl
- Size: 105.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.21 (Ubuntu 24.04)

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 4314e811e7505220e663b153dd4cbfda54cef7e6727c5c1373258a4f22261b42 |
| MD5 | 58f4af3d1ebcd3e98ab9a0388cffa647 |
| BLAKE2b-256 | 1281ec935d1cbcaa5d01a9831ad165bc3a27b40e99a470da2821a1cf26aa5153 |