Fine-tuning LLMs for instruction following using QLoRA.
Project description
tLLM is a comprehensive Python library that simplifies the fine-tuning of LLMs for instruction-tuning on custom datasets. It is specifically designed to abstract away complexities associated with various libraries from the Hugging Face ecosystem, offering a more encapsulated and user-friendly approach.
Our goal with tLLM is to simplify the process of fine-tuning LLMs, making it more accessible, especially for individuals who are new to the realm of AI. The aim is to lower the barrier of entry using the state-of-the-art open source stack 🚀🚀.
The purpose of fine-tuning LLMs is to tailor or constrain their output to a downstream task's format. The more specific the format requirements of your task, the more beneficial fine-tuning will be.
Features
- LLM Fine-Tuning: Fine-tunes LLMs using HF's Trainer for custom datasets stored in the Hub.
- Bits and Bytes: Loads the model with 4-bit quantization.
- PEFT: Uses LoRA under the hood, a popular and lightweight training technique that significantly reduces the number of trainable parameters. We combine LoRA with 4-bit quantization, lowering the amount of memory required during training (QLoRA).
- Dataset Preprocessing: Converts dataset into a prompt template for fine-tuning.
- Weights & Biases Integration: Track and log your experiments using wandb.
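For readers curious what 4-bit loading involves under the hood, here is an illustrative configuration using the transformers and bitsandbytes stack. This is a sketch of the general QLoRA loading recipe, not necessarily tllm's exact internal settings:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Typical QLoRA-style 4-bit quantization settings.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits on load
    bnb_4bit_quant_type="nf4",              # NormalFloat4, the QLoRA default
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    quantization_config=bnb_config,
)
```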
TODO
- Surface the prompt templates.
- Model eval functionality post-training.
- Add full list of training args to yaml.
- Provide inference snippet for testing after training.
- Fully Sharded Data Parallel (FSDP): Utilizes efficient training across distributed systems.
Install
pip install trainllm
Dataset Format
The trainer expects to ingest a train and a validation split from your dataset prior to training, in a specific format. More specifically, your custom dataset requires schema, input, and output headers. Here is an example of a placeholder dataset, designed to demonstrate the expected format.
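A minimal sketch of one row in that layout; the column names (schema, input, output) come from above, while the values are invented placeholders:

```python
import json

# One training example in the expected three-column layout.
# The values below are placeholders, not real data.
row = {
    "schema": "CREATE TABLE users (id INT, name TEXT, signup_date DATE)",
    "input": "How many users signed up in 2023?",
    "output": "SELECT COUNT(*) FROM users WHERE signup_date "
              "BETWEEN '2023-01-01' AND '2023-12-31';",
}

# Each split (train/validation) is a collection of such rows.
print(json.dumps(row, indent=2))
```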
The following tasks are supported for downstream text generation:
- text2sql: generates a SQL query from a question in natural language
- text2cypher: generates a Cypher query from a question in natural language
- input2output: custom task for generating custom output. If this is selected, you must include a context parameter in the Trainer constructor. This context provides guidance to the LLM on interpreting the input and output texts. For instance, if your task is to summarize news articles, the context would be defined as: "Given a news article, construct a summary paragraph ..." etc.
Prompt templates for supported tasks can be found in the PromptHandler class.
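As a hypothetical illustration (the real templates live in the PromptHandler class, and the exact section markers here are assumed), a text2sql-style template might be assembled from one dataset row like this:

```python
# Hypothetical prompt template; tllm's actual templates are defined
# in its PromptHandler class and may differ.
TEMPLATE = (
    "### Schema:\n{schema}\n\n"
    "### Question:\n{input}\n\n"
    "### SQL:\n{output}"
)

def build_prompt(schema: str, question: str, answer: str) -> str:
    """Fill the template with the fields of one dataset row."""
    return TEMPLATE.format(schema=schema, input=question, output=answer)

prompt = build_prompt(
    "CREATE TABLE users (id INT)",
    "How many users are there?",
    "SELECT COUNT(*) FROM users;",
)
print(prompt)
```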
Start Training
To start training, the only requirements are a project_name, a task, your Hugging Face model/dataset stubs, and the path to your YAML config_file. This file includes the essential tokenizer, LoRA, and training arguments for fine-tuning. Before beginning the training process, download the YAML file from this repository using either the curl or wget command below. As previously stated, if your task is input2output, you'll need to pass an additional context parameter to the constructor.
curl -o config.yml https://raw.githubusercontent.com/InquestGeronimo/tllm/main/config.yml
wget -O config.yml https://raw.githubusercontent.com/InquestGeronimo/tllm/main/config.yml
Run the trainer:
from tllm import Trainer
tllm = Trainer(
project_name="tllm-training-run1",
task="text2cypher", # text2sql or input2output
model_id="codellama/CodeLlama-7b-Instruct-hf",
dataset_id="zeroshot/text-2-cypher",
config_file="path/to/config.yml"
)
tllm.train()
After training completes, the adapter will be saved in your output directory. The pre-trained model will not be saved.
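Until the planned inference snippet lands (see TODO), here is a minimal sketch for smoke-testing the saved adapter, assuming the output directory and base model from the example above and using the peft and transformers libraries directly:

```python
# Load the base model with the LoRA adapter merged in at inference time.
# Assumes the adapter was saved to ./output by the training run above.
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model = AutoPeftModelForCausalLM.from_pretrained("./output", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-Instruct-hf")

prompt = "How many users signed up in 2023?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```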
HyperParameter Knowledge
Configuring hyperparameters and LoRA settings can be a complex task, especially for those new to AI engineering. Our repository is designed to lessen the burden of hyperparameter selection. We hope to establish a set of baseline hyperparameters known to yield good results on specific tasks and save them to this repository, thereby streamlining the fine-tuning process for users. However, a thorough understanding of these hyperparameters is still advantageous, particularly if you intend to modify them yourself.
Three key factors affect hyperparameters during training:
- The type and size of the model.
- The type and quantity of hardware.
- The size of the dataset.
For tllm's hyperparameters, refer to the config_file. The parameters currently in it serve as placeholders.
The first set of parameters pertains to the LoRA settings:
# LoRA configuration settings
r=8, # The rank of the LoRA update matrices. Opting for a higher rank can negate the efficiency benefits of using LoRA; the higher the rank, the larger the checkpoint file.
lora_alpha=16, # This is the scaling factor for LoRA. It controls the magnitude of the adjustments made by LoRA.
target_modules=[ # Specifies the parts of the model where LoRA is applied. These can be components of the transformer architecture.
"q_proj",
"k_proj",
"v_proj",
"o_proj",
"gate_proj",
"up_proj",
"down_proj",
"lm_head",
],
bias="none", # Indicates that biases are not adapted as part of the LoRA process.
lora_dropout=0.05, # The dropout rate for LoRA layers. It's a regularization technique to prevent overfitting.
task_type="CAUSAL_LM" # Specifies the type of task. Here, it indicates the model is for causal language modeling.
For further details, refer to the PEFT documentation or read the blogs found at the end of this README.
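To see why a small r keeps training light: for a frozen weight matrix of shape (d_out, d_in), LoRA trains two low-rank factors of shapes (d_out, r) and (r, d_in) instead of the full matrix. A back-of-the-envelope check for a single 4096x4096 projection (dimensions chosen purely for illustration):

```python
d_out, d_in, r = 4096, 4096, 8

full_params = d_out * d_in        # parameters in the frozen base weight
lora_params = r * (d_out + d_in)  # parameters in the two trainable factors

print(full_params)                 # 16777216
print(lora_params)                 # 65536
print(full_params // lora_params)  # 256x fewer trainable parameters
```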
The following set pertains specifically to the training arguments:
# Trainer configuration settings
output_dir="./output", # Directory where the training outputs and model checkpoints will be written.
warmup_steps=1, # Number of steps to perform learning rate warmup.
per_device_train_batch_size=32, # Batch size per device during training.
gradient_accumulation_steps=1, # Number of update steps to accumulate before performing a backward/update pass.
gradient_checkpointing=True, # Enables gradient checkpointing to save memory at the expense of slower backward pass.
max_steps=1000, # Total number of training steps to perform.
learning_rate=1e-4, # Initial learning rate for the optimizer.
bf16=True, # Use bfloat16 mixed precision training instead of the default fp32.
optim="paged_adamw_8bit", # The optimizer to use, here it's a variant of AdamW optimized for 8-bit computing.
logging_dir="./logs", # Directory to store logs.
save_strategy="steps", # Strategy to use for saving a model checkpoint ('steps' means saving at every specified number of steps).
save_steps=25, # Number of steps to save a checkpoint after.
evaluation_strategy="steps", # Strategy to use for evaluation ('steps' means evaluating at every specified number of steps).
eval_steps=50, # Number of training steps between evaluations.
do_eval=True, # Whether to run evaluation on the validation set.
report_to="wandb", # Tool to use for logging and tracking (Weights & Biases in this case).
remove_unused_columns=True, # Whether to remove columns not used by the model when using a dataset.
run_name="run-name", # Name of the experiment run, usually containing the project name and timestamp.
The provided parameters, while not comprehensive, cover the most critical ones for fine-tuning. In particular, per_device_train_batch_size and learning_rate are the most sensitive and influential during this process.
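Note that the batch size the optimizer effectively sees also depends on gradient accumulation and device count. With the placeholder values from the config above (and a hypothetical single-GPU setup):

```python
per_device_train_batch_size = 32  # from the config above
gradient_accumulation_steps = 1   # from the config above
num_devices = 1                   # hypothetical single-GPU setup

# The number of examples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 32
```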
Resources