
Trinity-RFT: A Framework for Training Large Language Models with Reinforcement Fine-Tuning


中文主页 | Tutorial | FAQ

Trinity-RFT

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models


💡 What is Trinity-RFT?

Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It decouples the RFT process into three key components: Explorer, Trainer, and Buffer, and provides functionality for users with different backgrounds and objectives.
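To make the decoupling concrete, here is a minimal conceptual sketch of how the three components interact; all class and method names below are illustrative assumptions, not Trinity-RFT's actual API:

```python
# Conceptual sketch of the Explorer / Trainer / Buffer decoupling.
# Names here are illustrative assumptions, NOT Trinity-RFT's real API.

class Buffer:
    """Holds experiences produced by the Explorer until the Trainer consumes them."""
    def __init__(self):
        self._experiences = []

    def put(self, experience):
        self._experiences.append(experience)

    def get_batch(self, batch_size):
        batch = self._experiences[:batch_size]
        self._experiences = self._experiences[batch_size:]
        return batch

class Explorer:
    """Rolls out tasks with the current policy and writes experiences to the buffer."""
    def rollout(self, task, buffer):
        # A real explorer would call the LLM and compute a reward here.
        buffer.put({"task": task, "response": f"answer to {task}", "reward": 1.0})

class Trainer:
    """Reads experience batches from the buffer and updates the policy."""
    def train_step(self, buffer, batch_size=2):
        batch = buffer.get_batch(batch_size)
        return len(batch)  # stand-in for a gradient update

# Because Explorer and Trainer communicate only through the Buffer, they can
# run synchronously or asynchronously, on the same or different devices.
buffer = Buffer()
explorer, trainer = Explorer(), Trainer()
for task in ["q1", "q2", "q3"]:
    explorer.rollout(task, buffer)
trained = trainer.train_step(buffer)
```

This separation is what enables the synchronous/asynchronous and on-policy/off-policy modes described below: the Trainer never needs to wait for a specific rollout, only for the Buffer to have data.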

🌟 Key Features

  • Flexible RFT Modes:

    • Supports synchronous/asynchronous, on-policy/off-policy, and online/offline RL.
    • Rollout and training can run separately and scale independently across devices.
    • Boosts sample and time efficiency via experience replay.
    (Figure: RFT modes supported by Trinity-RFT)
  • Agentic RL Support:

    • Supports both concatenated and general multi-step agentic workflows.
    • Can directly train agent applications developed with agent frameworks such as AgentScope.
    (Figure: agentic workflows)
  • Full-Lifecycle Data Pipelines:

    • Enables pipelined processing of rollout tasks and experience samples.
    • Supports active data management (e.g., prioritization, cleaning, augmentation) throughout the RFT lifecycle.
    • Native support for multi-task joint learning.
    (Figure: data pipeline design)
  • User-Friendly Design:

    • Plug-and-play modules and a decoupled architecture that ease adoption and development.
    • Rich graphical user interfaces enable low-code usage.
    (Figure: system architecture)
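The experience-replay feature mentioned above can be understood with a small generic sketch of standard RL replay (bounded storage, re-sampling of past experiences); this is illustrative only, not Trinity-RFT's actual Buffer implementation:

```python
import random

class ReplayBuffer:
    """Generic experience replay: keep a bounded window of past experiences
    and re-sample them for training, improving sample efficiency.
    Illustrative sketch only; not Trinity-RFT's actual Buffer."""
    def __init__(self, capacity=1000, seed=0):
        self.capacity = capacity
        self._data = []
        self._rng = random.Random(seed)

    def add(self, experience):
        if len(self._data) >= self.capacity:
            self._data.pop(0)  # evict the oldest experience when full
        self._data.append(experience)

    def sample(self, batch_size):
        # Sample with replacement so past experiences can be reused many times.
        return [self._rng.choice(self._data) for _ in range(batch_size)]

buf = ReplayBuffer(capacity=3)
for r in [0.1, 0.5, 0.9, 1.0]:  # the fourth add evicts the oldest entry
    buf.add({"reward": r})
batch = buf.sample(2)
```

Replaying experiences lets the trainer perform more gradient updates per rollout, which is the source of the sample- and time-efficiency gains claimed above.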

🔨 Tutorials and Guidelines

  • Run diverse RFT modes
    + Quick example: GRPO on GSM8k
    + Off-policy RFT
    + Fully asynchronous RFT
    + Offline learning by DPO or SFT
  • Multi-step agentic scenarios
    + Concatenated multi-turn workflow
    + General multi-step workflow
    + ReAct workflow with an agent framework
  • Advanced data pipelines
    + Rollout task mixing and selection
    + Online task curriculum (paper)
    + Experience replay
    + Advanced data processing & human-in-the-loop
  • Algorithm development / research
    + RL algorithm development with Trinity-RFT (paper)
    + Non-verifiable domains: RULER, trainable RULER, rubric-as-reward
    + Research project: group-relative REINFORCE (paper)
  • Going deeper into Trinity-RFT
    + Full configurations
    + Benchmark toolkit for quick verification and experimentation
    + Understanding the coordination between explorer and trainer

[!NOTE] For more tutorials, please refer to the Trinity-RFT documentation.

🚀 News

  • [2025-11] Introducing BOTS: online RL task selection for efficient LLM fine-tuning (paper).
  • [2025-11] [Release Notes] Trinity-RFT v0.3.2 released: bug fixes and advanced task selection & scheduling.
  • [2025-10] [Release Notes] Trinity-RFT v0.3.1 released: multi-stage training support, improved agentic RL examples, LoRA support, debug mode and new RL algorithms.
  • [2025-09] [Release Notes] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
  • [2025-08] Introducing CHORD: dynamic SFT + RL integration for advanced LLM fine-tuning (paper).
  • [2025-08] [Release Notes] Trinity-RFT v0.2.1 released.
  • [2025-07] [Release Notes] Trinity-RFT v0.2.0 released.
  • [2025-07] Technical report (arXiv v2) updated with new features, examples, and experiments: link.
  • [2025-06] [Release Notes] Trinity-RFT v0.1.1 released.
  • [2025-05] [Release Notes] Trinity-RFT v0.1.0 released, plus technical report.
  • [2025-04] Trinity-RFT open sourced.


Quick Start

[!NOTE] This project is currently under active development. Comments and suggestions are welcome!

Step 1: installation

Before installing, make sure your system meets the following requirements:

  • Python: version 3.10 to 3.12 (inclusive)
  • CUDA: version >= 12.6
  • GPUs: at least 2 GPUs
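The Python requirement can be checked programmatically; a small sketch whose bounds mirror the list above:

```python
import sys

def meets_python_requirement(version=None):
    """Return True if the interpreter version falls within Trinity-RFT's
    supported range of Python 3.10 to 3.12 (inclusive)."""
    major, minor = (version or sys.version_info)[:2]
    return (3, 10) <= (major, minor) <= (3, 12)

# Check the currently running interpreter:
ok = meets_python_requirement()
```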

From Source (Recommended)

If you plan to customize or contribute to Trinity-RFT, this is the best option.

1. Clone the Repository
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT
2. Set Up a Virtual Environment

Choose one of the following options:

Using Conda
conda create -n trinity python=3.10
conda activate trinity

pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
Using venv
python3.10 -m venv .venv
source .venv/bin/activate

pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
Using uv

uv is a modern Python package installer.

uv sync --extra dev --extra flash_attn

Via PyPI

If you just want to use the package without modifying the code:

pip install trinity-rft
pip install flash-attn==2.8.1

Or with uv:

uv pip install trinity-rft
uv pip install flash-attn==2.8.1

Using Docker

We provide a Docker setup for hassle-free environment configuration.

git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# Build the Docker image
## Tip: You can modify the Dockerfile to add mirrors or set API keys
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

# Run the container, replacing <path_to_your_data_and_checkpoints> with your actual path
docker run -it \
  --gpus all \
  --shm-size="64g" \
  --rm \
  -v $PWD:/workspace \
  -v <path_to_your_data_and_checkpoints>:/data \
  trinity-rft:latest

For training with Megatron-LM, please refer to Megatron-LM Backend.

Step 2: prepare dataset and model

Trinity-RFT supports most datasets and models from Huggingface and ModelScope.

Prepare the model in the local directory $MODEL_PATH/{model_name}:

# Using Huggingface
huggingface-cli download {model_name} --local-dir $MODEL_PATH/{model_name}

# Using Modelscope
modelscope download {model_name} --local_dir $MODEL_PATH/{model_name}

For more details about model downloading, see Huggingface or ModelScope.

Prepare the dataset in the local directory $DATASET_PATH/{dataset_name}:

# Using Huggingface
huggingface-cli download {dataset_name} --repo-type dataset --local-dir $DATASET_PATH/{dataset_name}

# Using Modelscope
modelscope download --dataset {dataset_name} --local_dir $DATASET_PATH/{dataset_name}

For more details about dataset downloading, see Huggingface or ModelScope.
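Both commands above place the downloaded artifact under a directory named after the repo. One way to derive those paths is sketched below, assuming you keep only the final component of a Hub repo ID; the repo IDs shown are examples matching the GSM8k quick start, not requirements:

```python
from pathlib import Path

def local_download_dir(root, repo_id):
    """Map a Hub repo ID like 'Qwen/Qwen2.5-1.5B-Instruct' to the local
    layout used above: {root}/{final component of the repo ID}.
    Helper name and layout choice are illustrative assumptions."""
    return str(Path(root) / repo_id.split("/")[-1])

model_dir = local_download_dir("/data/models", "Qwen/Qwen2.5-1.5B-Instruct")
dataset_dir = local_download_dir("/data/datasets", "openai/gsm8k")
```

These derived paths are what you would then reference as the model and dataset locations in your config file.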

Step 3: configurations

Trinity-RFT provides a web interface for configuring your RFT process.

[!NOTE] This is an experimental feature, and we will continue to improve it.

To launch the web interface for minimal configurations, you can run

trinity studio --port 8080

Then configure your RFT process on the web page and generate a config file, which you can save for later use or run directly as described in the next section.

Advanced users can also edit the config file directly. We provide example config files in examples.

For complete GUI features, please refer to the monorepo for Trinity-Studio.

(Figure: config manager GUI)

Step 4: run the RFT process

Start a ray cluster:

# On master node
ray start --head

# On worker nodes (use the address printed by the head node)
ray start --address=<master_address>

(Optional) You may use Wandb, TensorBoard, or MLflow for better monitoring. Refer to this documentation for the corresponding configurations. For example, to log in to Wandb:

export WANDB_API_KEY=<your_api_key>
wandb login

For command-line users, run the RFT process:

trinity run --config <config_path>

For example, below is the command for fine-tuning Qwen2.5-1.5B-Instruct on GSM8k with GRPO:

trinity run --config examples/grpo_gsm8k/gsm8k.yaml

For studio users, click "Run" in the web interface.

Contribution Guide

This project is currently under active development, and we welcome contributions from the community!

See CONTRIBUTING.md for detailed contribution guidelines.

Acknowledgements

This project is built upon many excellent open-source projects.

Citation

@misc{trinity-rft,
      title={Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models},
      author={Xuchen Pan and Yanxi Chen and Yushuo Chen and Yuchang Sun and Daoyuan Chen and Wenhao Zhang and Yuexiang Xie and Yilun Huang and Yilei Zhang and Dawei Gao and Yaliang Li and Bolin Ding and Jingren Zhou},
      year={2025},
      eprint={2505.17826},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.17826},
}
