TRL Jobs

A convenient wrapper around hfjobs for running TRL (Transformer Reinforcement Learning) specific workflows on Hugging Face infrastructure.

Installation

pip install trl-jobs

Quick Start

trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara

Available Commands

For now, only SFT (Supervised Fine-Tuning) is supported.

SFT (Supervised Fine-Tuning)

trl-jobs sft --flavor a100-large --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara

Required Arguments

  • --model_name: Model name (e.g., Qwen/Qwen3-0.6B)
  • --dataset_name: Dataset name (e.g., trl-lib/Capybara)

Optional Arguments

  • --flavor: Hardware flavor (default: a100-large)
  • -d, --detach: Run job in background and print job ID
  • --token: Hugging Face access token

Any other argument supported by trl sft is also accepted. Please refer to the TRL documentation for the full list.
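As a rough sketch of how the options above combine, the helper below prints a detached launch command that also forwards one extra argument to trl sft. The helper name build_sft_cmd is hypothetical, and --learning_rate 2e-5 is only an illustrative pass-through value, not a recommended setting. Nothing is submitted; the command is only printed so it can be reviewed first.

```shell
# build_sft_cmd is a hypothetical helper, not part of trl-jobs.
# It prints the command instead of running it, so no job is launched here.
build_sft_cmd() {
  # $1 = model name, $2 = dataset name
  # --learning_rate is shown as an example of an argument forwarded to trl sft (assumed value).
  printf 'trl-jobs sft --flavor a100-large --detach --model_name %s --dataset_name %s --learning_rate 2e-5\n' "$1" "$2"
}

build_sft_cmd Qwen/Qwen3-0.6B trl-lib/Capybara
```

Once the printed command looks right, it can be run directly. With --detach, the job runs in the background and the job ID is printed.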

Supported Configurations

OpenAI GPT-OSS with PEFT

Coming soon!

Meta LLaMA 3

Model | Maximum context length | # of tokens per effective batch size | Command
meta-llama/Meta-Llama-3-8B | 4096 | 262144 | trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B --dataset_name ...
meta-llama/Meta-Llama-3-8B-Instruct | 4096 | 262144 | trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B-Instruct --dataset_name ...

Meta LLaMA 3 with PEFT

Model | Maximum context length | # of tokens per effective batch size | Command
meta-llama/Meta-Llama-3-8B | 24576 | 196608 | trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B --peft --dataset_name ...
meta-llama/Meta-Llama-3-8B-Instruct | 24576 | 196608 | trl-jobs sft --model_name meta-llama/Meta-Llama-3-8B-Instruct --peft --dataset_name ...

Qwen3

Model | Maximum context length | # of tokens per effective batch size | Command
Qwen/Qwen3-0.6B | 32768 | 65536 | trl-jobs sft --model_name Qwen/Qwen3-0.6B --dataset_name ...
Qwen/Qwen3-1.7B | 24576 | 98304 | trl-jobs sft --model_name Qwen/Qwen3-1.7B --dataset_name ...
Qwen/Qwen3-4B | 20480 | 163840 | trl-jobs sft --model_name Qwen/Qwen3-4B --dataset_name ...
Qwen/Qwen3-8B | 4096 | 262144 | trl-jobs sft --model_name Qwen/Qwen3-8B --dataset_name ...

Qwen3 with PEFT

Model | Maximum context length | # of tokens per effective batch size | Command
Qwen/Qwen3-8B | 24576 | 196608 | trl-jobs sft --model_name Qwen/Qwen3-8B --peft --dataset_name ...
Qwen/Qwen3-14B | 20480 | 163840 | trl-jobs sft --model_name Qwen/Qwen3-14B --peft --dataset_name ...
Qwen/Qwen3-32B | 4096 | 131072 | trl-jobs sft --model_name Qwen/Qwen3-32B --peft --dataset_name ...
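To get a feel for the numbers in these tables, one can divide the tokens per effective batch size by the maximum context length. Assuming every sequence fills the whole context (an assumption, not something the tables state), the result is the number of full-length sequences per effective batch. The helper name seqs_per_batch is hypothetical.

```shell
# seqs_per_batch is a hypothetical helper for illustration only.
# $1 = tokens per effective batch, $2 = maximum context length
seqs_per_batch() {
  echo $(( $1 / $2 ))
}

seqs_per_batch 262144 4096    # Meta-Llama-3-8B, full fine-tuning: prints 64
seqs_per_batch 196608 24576   # Meta-Llama-3-8B with PEFT: prints 8
seqs_per_batch 65536 32768    # Qwen3-0.6B: prints 2
```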

Authentication

You can provide your Hugging Face token in several ways:

  1. Using huggingface-hub login: huggingface-cli login
  2. Setting the HF_TOKEN environment variable
  3. Using the --token argument
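As a small sketch related to the options above, the snippet below checks whether HF_TOKEN is set in the current shell without printing its value. The helper name token_source is hypothetical, and the snippet does not say anything about which method takes precedence when several are used, since that is not documented here.

```shell
# token_source is a hypothetical helper, not part of trl-jobs.
# It reports whether HF_TOKEN is set, without echoing the token itself.
token_source() {
  if [ -n "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is set"
  else
    echo "HF_TOKEN is not set; use huggingface-cli login or --token"
  fi
}

token_source
```

If the variable is not set, it can be exported for the session (for example `export HF_TOKEN=...` with your own token), or one of the other two methods can be used instead.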

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

Run the following command to check and format the code:

ruff check . --fix && ruff format . --line-length 119
