trinity-rft

Trinity-RFT: A Framework for Training Large Language Models with Reinforcement Fine-Tuning

Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models

🚀 News

[2025-08] ✨ Trinity-RFT v0.2.1 is released with enhanced features for Agentic RL and Async RL.
[2025-08] 🎵 We introduce CHORD, a dynamic integration of SFT and RL for enhanced LLM fine-tuning (paper).
[2025-08] We now support training on general multi-step workflows! Please check out examples for ALFWorld and ReAct.
[2025-07] Trinity-RFT v0.2.0 is released.
[2025-07] We update the technical report (arXiv v2) with new features, examples, and experiments.
[2025-06] Trinity-RFT v0.1.1 is released.
[2025-05] We release Trinity-RFT v0.1.0 and a technical report.
[2025-04] The initial codebase of Trinity-RFT is open-sourced.

💡 What is Trinity-RFT?

Trinity-RFT is a general-purpose, flexible, and easy-to-use framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It is designed to support diverse application scenarios and to serve as a unified platform for exploring advanced RL paradigms in the era of experience.

✨ Key Features

Unified RFT Core:

Supports synchronous/asynchronous, on-policy/off-policy, and online/offline training. Rollout and training can run separately and scale independently on different devices.

First-Class Agent-Environment Interaction:

Handles lagged feedback, long-tailed latencies, and agent/env failures gracefully. Supports multi-turn agent-env interaction.

Optimized Data Pipelines:

Treats rollout tasks and experiences as dynamic assets, enabling active management (prioritization, cleaning, augmentation) throughout the RFT lifecycle.

User-Friendly Design:

Modular and decoupled architecture for easy adoption and development, plus rich graphical user interfaces for low-code usage.
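
The idea that rollout and training run separately can be pictured with a small producer/consumer sketch. This is only a toy illustration built on Python's standard library, not Trinity-RFT's implementation: the names `explorer`, `trainer`, and `run_decoupled`, the shape of the experience dictionaries, and the sentinel convention are all assumptions made for this example.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not a Trinity-RFT concept


def explorer(buffer, num_tasks):
    """Toy rollout side: produce one fake experience per task."""
    for task_id in range(num_tasks):
        buffer.put({"task_id": task_id, "reward": float(task_id % 2)})
    buffer.put(SENTINEL)


def trainer(buffer, batch_size):
    """Toy training side: drain the buffer and group experiences into batches."""
    batches, batch = [], []
    while True:
        item = buffer.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
    if batch:  # keep the final partial batch
        batches.append(batch)
    return batches


def run_decoupled(num_tasks=10, batch_size=4):
    """Run explorer and trainer concurrently; they share only the buffer."""
    buffer = queue.Queue(maxsize=8)  # bounded, so the explorer cannot run far ahead
    result = {}

    def train():
        result["batches"] = trainer(buffer, batch_size)

    threads = [
        threading.Thread(target=explorer, args=(buffer, num_tasks)),
        threading.Thread(target=train),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return result["batches"]
```

Because the two sides communicate only through the buffer, either one could in principle be moved to another process or machine without changing the other, which is the property the feature list describes.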

Figure: The high-level design of Trinity-RFT

Figure: The architecture of RFT-core

Figure: Some RFT modes supported by Trinity-RFT

Figure: Concatenated and general multi-step workflows

Figure: The architecture of data processors

Figure: The high-level design of data pipelines in Trinity-RFT

🛠️ What can I use Trinity-RFT for?

Adaptation to New Scenarios:

Implement agent-environment interaction logic in a single Workflow or MultiTurnWorkflow class. (Example)

RL Algorithm Development:

Develop custom RL algorithms (loss design, sampling, data processing) in compact, plug-and-play classes. (Example)

Low-Code Usage:

Use graphical interfaces for easy monitoring and tracking of the learning process.
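
To make "agent-environment interaction logic in a single class" more concrete, here is a self-contained toy sketch. It does not use or reproduce Trinity-RFT's Workflow or MultiTurnWorkflow API; `Experience`, `GuessNumberEnv`, `BinarySearchPolicy`, and `ToyMultiTurnWorkflow` are hypothetical names invented for this illustration, and a scripted policy stands in for the LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Experience:
    """One (prompt, response, reward) record. Hypothetical, for illustration only."""
    prompt: str
    response: str
    reward: float


class GuessNumberEnv:
    """Toy environment: the agent must guess a hidden integer."""

    def __init__(self, target: int, max_turns: int = 10):
        self.target = target
        self.max_turns = max_turns
        self.turn = 0

    def reset(self) -> str:
        self.turn = 0
        return "Guess the number."

    def step(self, action: str):
        """Return (observation, reward, done) for one guess."""
        self.turn += 1
        guess = int(action)
        if guess == self.target:
            return "Correct!", 1.0, True
        hint = "Too low." if guess < self.target else "Too high."
        return hint, 0.0, self.turn >= self.max_turns


class BinarySearchPolicy:
    """Scripted stand-in for an LLM policy."""

    def __init__(self, low: int = 0, high: int = 100):
        self.low, self.high, self.last = low, high, None

    def __call__(self, history: List[str]) -> str:
        if history[-1] == "Too low.":
            self.low = self.last + 1
        elif history[-1] == "Too high.":
            self.high = self.last - 1
        self.last = (self.low + self.high) // 2
        return str(self.last)


class ToyMultiTurnWorkflow:
    """Keeps the whole interaction loop in one place and returns experiences."""

    def __init__(self, policy, env):
        self.policy = policy
        self.env = env

    def run(self) -> List[Experience]:
        history = [self.env.reset()]
        experiences, done = [], False
        while not done:
            prompt = "\n".join(history)
            action = self.policy(history)
            observation, reward, done = self.env.step(action)
            experiences.append(Experience(prompt, action, reward))
            history += [action, observation]
        return experiences
```

Calling `ToyMultiTurnWorkflow(BinarySearchPolicy(), GuessNumberEnv(target=37)).run()` returns one experience per turn, with a nonzero reward only on the final, correct guess.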

Table of contents

Getting started
Further tutorials
Upcoming features
Contribution guide
Acknowledgements
Citation

Getting started

Note: This project is currently under active development. Comments and suggestions are welcome!

Step 1: installation

Requirements:

Python version >= 3.10, <= 3.12
CUDA version >= 12.4, <= 12.8
At least 2 GPUs
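
The version bounds above can be checked programmatically before installing. The helper names below are hypothetical and not part of Trinity-RFT; the sketch assumes that "<= 3.12" and "<= 12.8" include every patch release of those minor versions, and that the CUDA version is passed in as a string such as "12.4".

```python
import sys

# Bounds copied from the requirements list above.
MIN_PYTHON, MAX_PYTHON = (3, 10), (3, 12)
MIN_CUDA, MAX_CUDA = (12, 4), (12, 8)


def python_version_supported(version_info=None):
    """Return True if the (major, minor) Python version is within the stated range."""
    if version_info is None:
        version_info = sys.version_info
    return MIN_PYTHON <= tuple(version_info[:2]) <= MAX_PYTHON


def cuda_version_supported(version):
    """Return True if a 'major.minor[.patch]' CUDA version string is within range."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return MIN_CUDA <= major_minor <= MAX_CUDA
```

For example, `python_version_supported()` with no argument checks the running interpreter.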

Installation from source (recommended):

# Pull the source code from GitHub
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# Create a new environment using Conda or venv
# Option 1: Conda
conda create -n trinity python=3.10
conda activate trinity

# Option 2: venv
python3.10 -m venv .venv
source .venv/bin/activate

# Install the package in editable mode
# Quoting the extras makes the same command work in both bash and zsh
pip install -e ".[dev]"

# Install flash-attn after all dependencies are installed
# Note: flash-attn takes a long time to compile; please be patient.
pip install -e ".[flash_attn]"
# If flash-attn fails to install, try the following command instead
# pip install flash-attn==2.8.0.post2 -v --no-build-isolation

Installation using pip:

pip install trinity-rft==0.2.0
# install flash-attn separately
pip install flash-attn==2.8.0.post2

Installation using Docker: a Dockerfile is provided for Trinity-RFT.

git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# Build the Docker image
# Note: you can edit the Dockerfile to customize the environment,
# e.g., to use pip mirrors or set an API key
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

# Run the Docker image
docker run -it --gpus all --shm-size="64g" --rm -v "$PWD":/workspace -v <root_path_of_data_and_checkpoints>:/data trinity-rft:latest

Step 2: prepare dataset and model

Trinity-RFT supports most datasets and models from Hugging Face and ModelScope.

Prepare the model in the local directory $MODEL_PATH/{model_name}:

# Using Hugging Face
huggingface-cli download {model_name} --local-dir $MODEL_PATH/{model_name}

# Using ModelScope
modelscope download {model_name} --local_dir $MODEL_PATH/{model_name}

For more details about model downloading, see Hugging Face or ModelScope.

Prepare the dataset in the local directory $DATASET_PATH/{dataset_name}:

# Using Hugging Face
huggingface-cli download {dataset_name} --repo-type dataset --local-dir $DATASET_PATH/{dataset_name}

# Using ModelScope
modelscope download --dataset {dataset_name} --local_dir $DATASET_PATH/{dataset_name}

For more details about dataset downloading, see Hugging Face or ModelScope.
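
When many models and datasets need to be fetched, the four commands above can be generated from a script. The sketch below only assembles the same argument lists shown above; `build_download_command` is a hypothetical helper, and the repository IDs in the usage note are assumptions chosen for illustration.

```python
def build_download_command(hub, name, root, is_dataset=False):
    """Return the argv list for one of the download commands shown above.

    hub is "huggingface" or "modelscope"; root plays the role of
    $MODEL_PATH or $DATASET_PATH.
    """
    target = f"{root}/{name}"
    if hub == "huggingface":
        cmd = ["huggingface-cli", "download", name]
        if is_dataset:
            cmd += ["--repo-type", "dataset"]
        return cmd + ["--local-dir", target]
    if hub == "modelscope":
        cmd = ["modelscope", "download"]
        cmd += ["--dataset", name] if is_dataset else [name]
        return cmd + ["--local_dir", target]
    raise ValueError(f"unknown hub: {hub!r}")
```

A command built this way can be executed with `subprocess.run(cmd, check=True)`, for example `build_download_command("huggingface", "Qwen/Qwen2.5-1.5B-Instruct", "/models")`, where the repository ID and the `/models` root are assumed values.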

Step 3: configurations

Trinity-RFT provides a web interface for configuring your RFT process.

Note: This is an experimental feature, and we will continue to improve it.

To launch the web interface for minimal configuration, run:

trinity studio --port 8080

Then you can configure your RFT process in the web page and generate a config file. You can save the config file for later use or run it directly as described in the following section.

Advanced users can also edit the config file directly. Example config files are provided in the examples directory.

For complete GUI features, please refer to the monorepo for Trinity-Studio.

Figure: Example of the config manager GUI

Step 4: run the RFT process

Start a Ray cluster:

# On master node
ray start --head

# On worker nodes
ray start --address=<master_address>
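
Once the cluster is up, it can be worth confirming that it exposes the two GPUs listed in the requirements. The helper below is a hypothetical sketch rather than part of Trinity-RFT; it takes a resource dictionary of the kind returned by Ray's `ray.cluster_resources()`, which reports GPUs under the "GPU" key.

```python
def gpu_shortfall(resources, required_gpus=2):
    """Return how many GPUs are missing relative to the stated minimum.

    resources is a mapping such as {"CPU": 16.0, "GPU": 2.0}.
    A missing "GPU" key is treated as zero GPUs.
    """
    available = int(resources.get("GPU", 0))
    return max(0, required_gpus - available)


# Typical use on a node that has already joined the cluster (requires Ray):
#   import ray
#   ray.init(address="auto")
#   missing = gpu_shortfall(ray.cluster_resources())
```

A return value of 0 means the cluster meets the minimum; any positive value is the number of GPUs still needed.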

(Optional) Log in to wandb for better monitoring:

export WANDB_API_KEY=<your_api_key>
wandb login

For command-line users, run the RFT process:

trinity run --config <config_path>

For example, below is the command for fine-tuning Qwen2.5-1.5B-Instruct on GSM8k with GRPO:

trinity run --config examples/grpo_gsm8k/gsm8k.yaml

For studio users, click "Run" in the web interface.
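
For readers unfamiliar with GRPO, its central step is to score each sampled response relative to the other responses for the same prompt. The sketch below shows that group-relative normalization in plain Python. It is a simplified illustration, not the code Trinity-RFT runs: the function name is hypothetical, and the use of the population standard deviation and the `eps` term are assumptions made here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's sampled responses.

    Each advantage is (reward - group mean) / (group std + eps), so responses
    that beat the group average get positive values.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(reward - mu) / (sigma + eps) for reward in rewards]
```

With binary correctness rewards such as `[1.0, 0.0, 0.0, 1.0]`, correct answers receive advantages close to +1 and incorrect ones close to -1; if every response in a group gets the same reward, all advantages are zero and that group contributes no learning signal.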

Further tutorials

Tutorials for running different RFT modes:

Tutorials for adapting Trinity-RFT to a new multi-turn agentic scenario:

Concatenated Multi-turn tasks

Tutorials for adapting Trinity-RFT to a general multi-step agentic scenario:

Tutorials for data-related functionalities:

Advanced data processing & human-in-the-loop

Tutorials for RL algorithm development/research with Trinity-RFT:

RL algorithm development with Trinity-RFT

Guidelines for full configurations: see this document

Guidelines for developers and researchers:

Upcoming features

A tentative roadmap: #51

Contribution guide

This project is currently under active development, and we welcome contributions from the community!

Code style check:

pre-commit run --all-files

Unit tests:

python -m pytest tests

Acknowledgements

This project is built upon many excellent open-source projects, including:

verl and PyTorch's FSDP for LLM training;
vLLM for LLM inference;
Data-Juicer for data processing pipelines;
AgentScope for agentic workflows;
Ray for distributed systems;
RL frameworks such as OpenRLHF, TRL, and ChatLearn, from which we have also drawn inspiration.

Citation

@misc{trinity-rft,
      title={Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models},
      author={Xuchen Pan and Yanxi Chen and Yushuo Chen and Yuchang Sun and Daoyuan Chen and Wenhao Zhang and Yuexiang Xie and Yilun Huang and Yilei Zhang and Dawei Gao and Yaliang Li and Bolin Ding and Jingren Zhou},
      year={2025},
      eprint={2505.17826},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.17826},
}

Release history

0.5.2: Apr 8, 2026
0.5.1: Feb 12, 2026
0.5.0: Feb 5, 2026
0.4.1: Jan 16, 2026
0.4.0: Dec 30, 2025
0.3.3: Nov 27, 2025
0.3.2: Nov 6, 2025
0.3.1: Oct 17, 2025
0.3.0: Sep 9, 2025
0.2.1: Aug 20, 2025
0.2.1.dev0 (pre-release, this version): Aug 20, 2025
0.2.0: Jul 15, 2025
0.1.1: Jun 20, 2025
0.1.0: May 23, 2025

Download files

Source Distribution

trinity_rft-0.2.1.dev0.tar.gz (184.4 kB), uploaded Aug 20, 2025

Built Distribution

trinity_rft-0.2.1.dev0-py3-none-any.whl (239.7 kB), uploaded Aug 20, 2025

File details

Details for the file trinity_rft-0.2.1.dev0.tar.gz.

File metadata

Download URL: trinity_rft-0.2.1.dev0.tar.gz
Upload date: Aug 20, 2025
Size: 184.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for trinity_rft-0.2.1.dev0.tar.gz
Algorithm	Hash digest
SHA256	`3d624861fdfbf693247e028382dc4b02812de180922ec0c389b1c6a009597973`
MD5	`d3858001bb26c08640f4af4cc3478d33`
BLAKE2b-256	`9bb5334237f7d97ca5c00ce4c81512a45ada7822e6bc5bc28831242af2954760`

File details

Details for the file trinity_rft-0.2.1.dev0-py3-none-any.whl.

File metadata

Download URL: trinity_rft-0.2.1.dev0-py3-none-any.whl
Upload date: Aug 20, 2025
Size: 239.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for trinity_rft-0.2.1.dev0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58361630e2dcd374171bd92950b3553fe34efecd82bc09e067277648f4eca338`
MD5	`18e2a5ddd1d5928f7f3fecd31ed3d19d`
BLAKE2b-256	`abdcb5c99e94cc6d8067d06b67a80cc876e3edaf32cb81e008f5a393531721bc`
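
A downloaded file can be checked against the SHA256 digests listed above with Python's standard `hashlib` module. The function names below are hypothetical; the two digests in `PUBLISHED_SHA256` are copied from the tables above.

```python
import hashlib

# SHA256 digests copied from the "File hashes" tables above.
PUBLISHED_SHA256 = {
    "trinity_rft-0.2.1.dev0.tar.gz":
        "3d624861fdfbf693247e028382dc4b02812de180922ec0c389b1c6a009597973",
    "trinity_rft-0.2.1.dev0-py3-none-any.whl":
        "58361630e2dcd374171bd92950b3553fe34efecd82bc09e067277648f4eca338",
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 with an expected hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For instance, `matches_published_hash("trinity_rft-0.2.1.dev0.tar.gz", PUBLISHED_SHA256["trinity_rft-0.2.1.dev0.tar.gz"])` should return True for an intact copy of the source distribution saved under that name in the current directory.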
