FMS HF Tuning

Installation
Tuning Techniques
Training and Training Parameter Selection
Supported Models
Data format support
Additional Frameworks

This repo provides basic tuning scripts with support for specific models. It relies on Hugging Face SFTTrainer and PyTorch FSDP. Our approach to tuning is:

- Models are loaded from Hugging Face transformers or the foundation-model-stack. Models are optimized to use Flash Attention v2 either directly or through SDPA.
- Hugging Face SFTTrainer drives the training loop.
- FSDP is the backend for multi-GPU training.

Installation

Refer to our Installation guide for details on how to install the library.

Tuning Techniques:

Please refer to our tuning techniques document for details on how to perform each supported tuning technique.

Training and Training Parameters:

Please refer to our document on training to see how to start single-GPU or multi-GPU runs with fms-hf-tuning.
A different section of the same document gives tips on setting various training arguments.

Debug recommendation:

If you encounter flash-attn errors such as undefined symbol while training, follow the steps below for a clean installation of the flash-attn binaries. These errors can occur when multiple environments share the pip cache directory or when the torch version has been updated.

pip uninstall flash-attn
pip cache purge
pip install fms-hf-tuning[flash-attn]
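Before or after reinstalling, it can help to confirm whether the flash-attn binaries actually load. The following is a minimal sketch, assuming the package is importable under the name flash_attn; the helper probe_import is hypothetical and not part of fms-hf-tuning.

```python
import importlib


def probe_import(module_name):
    """Try to import a module and report whether it loaded.

    Returns (ok, detail). When the import fails, detail carries the error
    text, for example an "undefined symbol" message raised while loading a
    compiled extension.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError is typical; other errors are reported too
        return False, f"{type(exc).__name__}: {exc}"
    return True, "import succeeded"


if __name__ == "__main__":
    # Assumption: flash-attn installs under the import name "flash_attn".
    ok, detail = probe_import("flash_attn")
    print("flash_attn:", detail)
```

If the check still reports an undefined symbol after the clean installation, the installed torch version and the flash-attn build are likely still mismatched.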

Supported Models

While we expect most Hugging Face decoder models to work, we have primarily tested fine-tuning for the following model families:
- IBM Granite
- Meta Llama
- Mistral AI
- OpenAI GPT-OSS

LoRA layers supported: all the linear layers of a model plus the output lm_head layer. Users can specify layers as a list or use all-linear as a shortcut. Layers are specific to a model architecture and can be specified as noted here.

An extended list of tested models is maintained in the supported models document, but it might contain outdated information.

Data Support

Users can pass training data as either a single file or a Hugging Face dataset ID using the --training_data_path argument, along with other arguments required for various use cases. If you choose to pass a file, it can be in any of the supported formats. Alternatively, you can use our data preprocessing backend to preprocess datasets on the fly.

Below, we list the data use cases supported via the --training_data_path argument. For details of our advanced data preprocessing, see Advanced Data Preprocessing.

EOS tokens are added to all data formats listed below, except for the pretokenized data format at this time. The EOS token is appended to the end of each data point, such as a sentence or paragraph within the dataset. For more info, see pretokenized.
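To make the EOS behaviour concrete, here is a small illustration of what appending the token to each data point means. It is only a sketch: the function append_eos is hypothetical, the "</s>" string is a stand-in for the tokenizer's real EOS token (commonly exposed as tokenizer.eos_token in Hugging Face Transformers), and the library's own implementation may differ.

```python
def append_eos(texts, eos_token):
    """Return a new list with eos_token appended to the end of each data point."""
    return [text + eos_token for text in texts]


if __name__ == "__main__":
    # Assumption: "</s>" stands in for the EOS token of the model's tokenizer.
    samples = ["First data point.", "Second data point."]
    for line in append_eos(samples, "</s>"):
        print(line)
```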

Offline Data Preprocessing

We also provide an interface for the user to perform standalone data preprocessing. This is especially useful if:

- The user is working with a large dataset and wants to perform the processing in one shot and then train the model directly on the processed dataset.
- The user wants to test out the data preprocessing outcome before training.

Please refer to this document for details on how to perform offline data processing.

Additional Frameworks

Inference

Currently, we do not offer inference support as part of the library, but we provide a standalone script for running inference on tuned models for testing purposes. For a full list of options run python scripts/run_inference.py --help. Note that no data formatting / templating is applied at inference time.

Running a single example

If you want to run a single example through a model, you can pass it with the --text flag.

python scripts/run_inference.py \
--model my_checkpoint \
--text "This is a text the model will run inference on" \
--max_new_tokens 50 \
--out_file result.json

Running multiple examples

To run multiple examples, pass a path to a file containing each source text as its own line. Example:

Contents of source_texts.txt

This is the first text to be processed.
And this is the second text to be processed.

python scripts/run_inference.py \
--model my_checkpoint \
--text_file source_texts.txt \
--max_new_tokens 50 \
--out_file result.json
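If the source texts live in a Python list, a file in the one-text-per-line layout described above could be produced as follows. This is a sketch under stated assumptions: write_source_texts is a hypothetical helper, and flattening embedded newlines into spaces is a choice made here so that each text stays on its own line.

```python
from pathlib import Path


def write_source_texts(texts, path):
    """Write one source text per line and return the number of lines written.

    Newlines inside a text are replaced with spaces so that each text
    occupies exactly one line of the file.
    """
    lines = [" ".join(text.splitlines()) for text in texts]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


if __name__ == "__main__":
    count = write_source_texts(
        [
            "This is the first text to be processed.",
            "And this is the second text to be processed.",
        ],
        "source_texts.txt",
    )
    print(f"wrote {count} texts")
```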

Inference Results Format

After running the inference script, the file specified by --out_file will be a JSON file in which each entry holds the original input string and the predicted output string, as follows. Note that, because of how .generate() is implemented in Transformers, the output string will generally contain the input string as well.

[
    {
        "input": "{{Your input string goes here}}",
        "output": "{{Generate result of processing your input string goes here}}"
    },
    ...
]
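Because the output usually repeats the input, a common post-processing step is to keep only the newly generated text. The sketch below assumes the output begins with the input verbatim and leaves a record unchanged when it does not; strip_prompt and load_continuations are hypothetical helper names, not part of the inference script.

```python
import json


def strip_prompt(records):
    """Return each record's output with the echoed input prefix removed."""
    continuations = []
    for record in records:
        text, output = record["input"], record["output"]
        if output.startswith(text):
            output = output[len(text):]
        continuations.append(output)
    return continuations


def load_continuations(path):
    """Read a results file in the format shown above and strip the inputs."""
    with open(path, encoding="utf-8") as handle:
        return strip_prompt(json.load(handle))
```

For example, load_continuations("result.json") would return one string per entry in the file written by --out_file.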

Changing the Base Model for Inference

If you tuned a model using a local base model, then a machine-specific path will be saved into your checkpoint by PEFT, specifically in adapter_config.json. This can be problematic if you run inference on a different machine from the one you used for tuning.

As a workaround, the inference CLI provides a --base_model_name_or_path argument, through which a new base model may be passed for inference. This patches the base_model_name_or_path in your checkpoint's adapter_config.json while loading the model and restores the original value after completion. Alternatively, you can change the config's value yourself.

NOTE: This can also be an issue for tokenizers (with the tokenizer_name_or_path config entry). We currently do not allow tokenizer patching since the tokenizer can also be explicitly configured within the base model and checkpoint model, but may choose to expose an override for the tokenizer_name_or_path in the future.
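Changing the config's value yourself amounts to editing one key in adapter_config.json. The following is a minimal sketch of that manual edit; set_base_model_path is a hypothetical helper, and the paths passed to it are placeholders for your own checkpoint and base model locations.

```python
import json
from pathlib import Path


def set_base_model_path(checkpoint_dir, new_base_model):
    """Rewrite base_model_name_or_path in a checkpoint's adapter_config.json.

    Returns the previous value so the caller can restore it later.
    """
    config_path = Path(checkpoint_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    previous = config.get("base_model_name_or_path")
    config["base_model_name_or_path"] = new_base_model
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return previous
```

Unlike the CLI workaround, this edit is permanent, so keep the returned value if you need to restore the original path.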

Validation

For examples of how to run inference on models trained via fms-hf-tuning, see the Inference document.

We can use lm-evaluation-harness from EleutherAI to evaluate the generated model. For example, for the Llama-13B model, using the above command and the model at the end of epoch 5, we measured an MMLU score of 53.9, compared to 52.8 for the base model.

How to run the validation:

pip install -U transformers
pip install -U datasets
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
python main.py \
--model hf-causal \
--model_args pretrained=$MODEL_PATH \
--output_path $OUTPUT_PATH/results.json \
--tasks boolq,piqa,hellaswag,winogrande,arc_easy,arc_challenge,hendrycksTest-*

The above runs several tasks, with hendrycksTest-* being MMLU.
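Since MMLU is spread over many hendrycksTest-* subtasks, a single number has to be aggregated from results.json. The sketch below takes an unweighted mean of per-task accuracy. It assumes the file has a top-level "results" mapping from task name to a dictionary of metrics that includes "acc"; that layout is an assumption about the harness version used here, so check your own output file, and note that a published MMLU score may be aggregated differently.

```python
import json


def mean_task_accuracy(results, prefix="hendrycksTest-", metric="acc"):
    """Average one metric over all tasks whose name starts with prefix."""
    scores = [
        metrics[metric]
        for task, metrics in results.get("results", {}).items()
        if task.startswith(prefix) and metric in metrics
    ]
    if not scores:
        raise ValueError(f"no tasks starting with {prefix!r} report {metric!r}")
    return sum(scores) / len(scores)


def mean_accuracy_from_file(path, prefix="hendrycksTest-"):
    """Load a results file and average accuracy over the matching tasks."""
    with open(path, encoding="utf-8") as handle:
        return mean_task_accuracy(json.load(handle), prefix=prefix)
```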

Trainer Controller Framework

Trainer controller is a framework for controlling the trainer loop using user-defined rules and metrics. For details on how to set custom stopping criteria and perform custom operations, see examples/trainercontroller_configs/Readme.md.

More Examples

A good simple example can be found here; it launches a Kubernetes-native PyTorchJob using the Kubeflow Training Operator, with Kueue managing the queue of tuning jobs.

Project details

This version: 3.1.0 (Nov 11, 2025)

