TRL Jobs.

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

🏭 TRL Jobs

TRL Jobs is a simple wrapper around Hugging Face Jobs that makes it easy to run TRL (Transformer Reinforcement Learning) workflows directly on 🤗 Hugging Face infrastructure.

Think of it as the quickest way to kick off Supervised Fine-Tuning (SFT) and more, without worrying about all the boilerplate setup.

📦 Installation

Get started with a single command:

pip install trl-jobs

⚡ Quick Start

Run your first supervised fine-tuning job in just one line:

trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara

The training is tracked with Trackio and the fine-tuned model is automatically pushed to the 🤗 Hub.

🛠 Available Commands

Right now, only SFT (Supervised Fine-Tuning) is supported. More workflows will be added soon!

🔹 SFT (Supervised Fine-Tuning)

trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara

Required arguments

--model_name → Model to fine-tune (e.g. Qwen/Qwen3-0.6B)
--dataset_name → Dataset to train on (e.g. trl-lib/Capybara)

Optional arguments

--peft → Use PEFT (LoRA) (default: False)
--flavor → Hardware flavor (default: a100-large, only option for now)
--timeout → Max runtime (default: 1h). Supports the unit suffixes s, m, h, d
-d, --detach → Run in background and print job ID
--namespace → Namespace where the job will run (default: your user namespace)
--token → Hugging Face token (only needed if not logged in)
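
The `--timeout` value pairs a number with one of the unit suffixes listed above. As a rough illustration of how such a value maps to seconds, here is a minimal sketch. `parse_timeout` is a hypothetical helper, not part of trl-jobs, and accepting fractional numbers such as `1.5h` is an assumption rather than documented behavior.

```python
import re

# Unit suffixes listed for --timeout, mapped to seconds.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_timeout(value: str) -> float:
    """Convert a string such as "1h" or "90m" into seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd])", value.strip())
    if match is None:
        raise ValueError(f"Unsupported timeout: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_SECONDS[unit]


print(parse_timeout("1h"))   # the default timeout, expressed in seconds
print(parse_timeout("90m"))
```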

➡️ You can also pass any arguments supported by trl sft. For example:

trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara --learning_rate 3e-5

For the full list, see the TRL CLI docs.
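
If you launch jobs from a script, the same command line can be assembled programmatically. The sketch below only builds the argument list and prints it. `build_sft_command` and its parameters are hypothetical names chosen for illustration; the flags themselves are the ones documented above.

```python
import shlex


def build_sft_command(model_name, dataset_name, peft=False, timeout=None,
                      detach=False, extra_args=None):
    """Assemble a `trl-jobs sft` argument list (hypothetical helper)."""
    cmd = ["trl-jobs", "sft", "--model_name", model_name, "--dataset_name", dataset_name]
    if peft:
        cmd.append("--peft")
    if timeout is not None:
        cmd += ["--timeout", timeout]
    if detach:
        cmd.append("--detach")
    # Anything else is forwarded unchanged, e.g. arguments accepted by `trl sft`.
    for name, value in (extra_args or {}).items():
        cmd += [f"--{name}", str(value)]
    return cmd


cmd = build_sft_command(
    "Qwen/Qwen3-0.6B",
    "trl-lib/Capybara",
    extra_args={"learning_rate": "3e-5"},
)
print(shlex.join(cmd))
```

To actually launch the job, the list could be handed to `subprocess.run(cmd, check=True)`, assuming `trl-jobs` is installed and on your `PATH`.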

Dataset format

SFT supports four dataset formats.

Standard language modeling
```
example = {"text": "The sky is blue."}
```

Standard prompt-completion

example = {"prompt": "The sky is", "completion": " blue."}

Conversational language modeling

example = {"messages": [
    {"role": "user", "content": "What color is the sky?"},
    {"role": "assistant", "content": "It is blue."}
]}

Conversational prompt-completion

example = {"prompt": [{"role": "user", "content": "What color is the sky?"}],
           "completion": [{"role": "assistant", "content": "It is blue."}]}

[!IMPORTANT] When using a conversational dataset, ensure that the model has a chat template.

[!NOTE] When using a prompt-completion dataset, the loss is computed only on the completion part.
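
To make the completion-only loss concrete, the sketch below masks the prompt positions in the label sequence so that only completion tokens would contribute to a cross-entropy loss. It illustrates the idea and is not TRL's actual implementation, which may represent the mask differently. The token ids are made up, and `build_labels` is a hypothetical helper.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def build_labels(prompt_ids, completion_ids):
    """Concatenate prompt and completion, masking prompt positions in the labels."""
    input_ids = list(prompt_ids) + list(completion_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(completion_ids)
    return input_ids, labels


# Made-up token ids standing in for "The sky is" and " blue."
input_ids, labels = build_labels([101, 102, 103], [104, 105])
print(input_ids)
print(labels)
```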

For more details, see the TRL docs - Dataset formats.
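
The four formats described in this section differ only in their column names and in whether the values are plain strings or lists of messages. As a quick sanity check before launching a job, a small helper along these lines can report which format an example matches. `detect_format` is a hypothetical function written for this illustration, not part of TRL or trl-jobs.

```python
def detect_format(example: dict) -> str:
    """Name which of the four SFT dataset formats an example matches (hypothetical helper)."""
    if "text" in example:
        return "standard language modeling"
    if "messages" in example:
        return "conversational language modeling"
    if "prompt" in example and "completion" in example:
        # A list of {"role", "content"} dicts marks the conversational variant.
        if isinstance(example["prompt"], list):
            return "conversational prompt-completion"
        return "standard prompt-completion"
    raise ValueError(f"Unrecognized columns: {sorted(example)}")


print(detect_format({"prompt": "The sky is", "completion": " blue."}))
```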

📊 Supported Configurations

Here are some ready-to-go setups you can use out of the box.

🦙 Meta LLaMA 3

Model	Max context length	Tokens / batch	Example command
Meta-Llama-3-8B	4,096	262,144	`trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B --dataset_name ...`
Meta-Llama-3-8B-Instruct	4,096	262,144	`trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B-Instruct --dataset_name ...`

🦙 Meta LLaMA 3 with PEFT

Model	Max context length	Tokens / batch	Example command
Meta-Llama-3-8B	24,576	196,608	`trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B --peft --dataset_name ...`
Meta-Llama-3-8B-Instruct	24,576	196,608	`trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B-Instruct --peft --dataset_name ...`

🐧 Qwen3

Model	Max context length	Tokens / batch	Example command
Qwen3-0.6B-Base	32,768	65,536	`trl-jobs sft --model_name Qwen/Qwen3-0.6B-Base --dataset_name ...`
Qwen3-0.6B	32,768	65,536	`trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name ...`
Qwen3-1.7B-Base	24,576	98,304	`trl-jobs sft --model_name Qwen/Qwen3-1.7B-Base --dataset_name ...`
Qwen3-1.7B	24,576	98,304	`trl-jobs sft --model_name Qwen/Qwen3-1.7B --dataset_name ...`
Qwen3-4B-Base	20,480	163,840	`trl-jobs sft --model_name Qwen/Qwen3-4B-Base --dataset_name ...`
Qwen3-4B	20,480	163,840	`trl-jobs sft --model_name Qwen/Qwen3-4B --dataset_name ...`
Qwen3-8B-Base	4,096	262,144	`trl-jobs sft --model_name Qwen/Qwen3-8B-Base --dataset_name ...`
Qwen3-8B	4,096	262,144	`trl-jobs sft --model_name Qwen/Qwen3-8B --dataset_name ...`

🐧 Qwen3 with PEFT

Model	Max context length	Tokens / batch	Example command
Qwen3-8B-Base	24,576	196,608	`trl-jobs sft --model_name Qwen/Qwen3-8B-Base --peft --dataset_name ...`
Qwen3-8B	24,576	196,608	`trl-jobs sft --model_name Qwen/Qwen3-8B --peft --dataset_name ...`
Qwen3-14B-Base	20,480	163,840	`trl-jobs sft --model_name Qwen/Qwen3-14B-Base --peft --dataset_name ...`
Qwen3-14B	20,480	163,840	`trl-jobs sft --model_name Qwen/Qwen3-14B --peft --dataset_name ...`
Qwen3-32B	4,096	131,072	`trl-jobs sft --model_name Qwen/Qwen3-32B --peft --dataset_name ...`

SmolLM3

Model	Max context length	Tokens / batch	Example command
HuggingFaceTB/SmolLM3-3B-Base	28,672	114,688	`trl-jobs sft --model_name HuggingFaceTB/SmolLM3-3B-Base --dataset_name ...`
HuggingFaceTB/SmolLM3-3B	28,672	114,688	`trl-jobs sft --model_name HuggingFaceTB/SmolLM3-3B --dataset_name ...`
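
One way to read the two numeric columns together: dividing "Tokens / batch" by "Max context length" gives the number of full-length sequences that fit in one batch. That interpretation is an assumption on our part, since the tables do not define the columns. The figures below are copied from a few rows of the tables above.

```python
# (max context length, tokens per batch) copied from the tables above
configs = {
    "Meta-Llama-3-8B": (4_096, 262_144),
    "Meta-Llama-3-8B with PEFT": (24_576, 196_608),
    "Qwen3-0.6B": (32_768, 65_536),
    "Qwen3-32B with PEFT": (4_096, 131_072),
    "SmolLM3-3B": (28_672, 114_688),
}

sequences_per_batch = {}
for name, (max_length, tokens_per_batch) in configs.items():
    sequences, remainder = divmod(tokens_per_batch, max_length)
    # Every listed configuration divides evenly.
    assert remainder == 0
    sequences_per_batch[name] = sequences
    print(f"{name}: {sequences} full-length sequences per batch")
```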

🤖 OpenAI GPT-OSS (with PEFT)

🚧 Coming soon!

💡 Want support for another model?

Open an issue or submit a PR. We’d love to hear from you!

🔑 Authentication

You’ll need a Hugging Face token to run jobs. You can provide it in any of these ways:

Log in with huggingface-cli login
Set the environment variable HF_TOKEN
Pass it directly with --token
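
The three options above can be thought of as a lookup chain. The sketch below shows one plausible precedence: an explicit `--token` first, then `HF_TOKEN`, then the saved login. This ordering is an assumption made for illustration and is not documented behavior of trl-jobs. `resolve_token` is a hypothetical helper, and `hf_example` is a placeholder value.

```python
import os


def resolve_token(cli_token=None, env=None):
    """Return (token, source). The precedence shown here is an assumption."""
    env = os.environ if env is None else env
    if cli_token:
        return cli_token, "--token"
    if env.get("HF_TOKEN"):
        return env["HF_TOKEN"], "HF_TOKEN"
    # Nothing explicit: rely on the credentials saved by `huggingface-cli login`.
    return None, "saved login"


print(resolve_token(env={"HF_TOKEN": "hf_example"}))
```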

📜 License

This project is under the MIT License. See the LICENSE file for details.

🤝 Contributing

We welcome contributions! Please open an issue or a PR on GitHub.

Before committing, run formatting checks:

ruff check . --fix && ruff format . --line-length 119

Release history

Version	Release date
0.4.0	Nov 13, 2025
0.3.0	Nov 12, 2025
0.2.0	Sep 10, 2025
0.1.14 (this version)	Sep 10, 2025
0.1.13	Sep 9, 2025
0.1.12	Sep 8, 2025
0.1.11	Sep 6, 2025
0.1.10	Sep 6, 2025
0.1.9	Sep 5, 2025
0.1.8	Aug 30, 2025
0.1.7	Aug 30, 2025
0.1.6	Aug 30, 2025
0.1.5	Aug 29, 2025
0.1.4	Aug 29, 2025
0.1.3	Aug 29, 2025
0.1.2	Aug 29, 2025
0.1.0	Jun 9, 2025
0.0.0	Aug 29, 2025

Download files

Source Distribution

trl_jobs-0.1.14.tar.gz (7.2 kB)

Uploaded Sep 10, 2025 Source

Built Distribution

trl_jobs-0.1.14-py3-none-any.whl (21.0 kB)

Uploaded Sep 10, 2025 Python 3

File details

Details for the file trl_jobs-0.1.14.tar.gz.

File metadata

Download URL: trl_jobs-0.1.14.tar.gz
Upload date: Sep 10, 2025
Size: 7.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for trl_jobs-0.1.14.tar.gz
Algorithm	Hash digest
SHA256	`601f338eb4f5372b28f1588c4104500cb789909c8fd224155d7551628f22c620`
MD5	`83aeef5dbccf927f47eb1b7323993c54`
BLAKE2b-256	`2dbe6fd1cdc204f7d5c1bca798918f61928b4c7823d211d3b3e6b7a858ba2184`

File details

Details for the file trl_jobs-0.1.14-py3-none-any.whl.

File metadata

Download URL: trl_jobs-0.1.14-py3-none-any.whl
Upload date: Sep 10, 2025
Size: 21.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for trl_jobs-0.1.14-py3-none-any.whl
Algorithm	Hash digest
SHA256	`17bb3531081caced61ffafa36fd648d160e109591e118b872112dc06737e16cc`
MD5	`ca53fd719d15e9d08c2330f853ed18a7`
BLAKE2b-256	`a5bc4aac09b9a15ba1d240604f3ec6c9dc746b37eee2def01596de83306332ad`
