Intel extension for PEFT with PyTorch and DE-NAS
Deltatuner
Deltatuner is an extension for PEFT that improves LLM fine-tuning speed through multiple optimizations: it leverages the compact model constructor DE-NAS to construct/modify compact delta layers in a hardware-aware and train-free manner, and it adds new delta-tuning algorithms.
Introduction
Key Components
- Supported parameter-efficient fine-tuning algorithms: LoRA and SSF
- DE-NAS: automatically construct compact and optimal delta layers in a train-free and hardware-aware mode (more details here); a conceptual sketch of the flow follows this list
  - Step 1: generate the search space for the delta layers
  - Step 2: the search algorithm populates the delta layers for the LM
  - Step 3: a train-free score evaluates the LM with the adaptive delta layers
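The following is a conceptual sketch of how these three steps fit together; the helper names (build_delta_search_space, sample_delta_layers, train_free_score) are hypothetical placeholders and are not part of the deltatuner API.

# Conceptual sketch of the DE-NAS flow described above; helper names are hypothetical.
def denas_search(model, max_trials=100):
    # Step 1: generate the search space for the delta layers
    # (e.g., which layers receive adapters and at what size/rank).
    search_space = build_delta_search_space(model)      # hypothetical helper

    best_candidate, best_score = None, float("-inf")
    for _ in range(max_trials):
        # Step 2: the search algorithm samples a candidate set of delta layers.
        candidate = sample_delta_layers(search_space)   # hypothetical helper

        # Step 3: a train-free score (no gradient updates) evaluates the model
        # with the candidate delta layers attached, respecting the hardware budget.
        score = train_free_score(model, candidate)      # hypothetical helper

        if score > best_score:
            best_candidate, best_score = candidate, score

    return best_candidate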
Features
- Easy-to-use: delivered as a pip package; only a few lines of code need to be added to the original fine-tuning script
- Auto-tuning: automatically selects the best algorithm and delta-layer structure for the fine-tuned model
Values
- Saving computation power: reduce the computation power and time required to fine-tune a model by reducing parameter size as well as memory footprint.
- Improve accuracy: ensure the same accuracy with no regression.
Get Started
Installation
- Install the Python package:
pip install deltatuner
Fast Fine-tuning on Base models
Below is an example of optimizing an MPT model by adding a few lines of code to enable the deltatuner optimizations. It uses DE-NAS in deltatuner to convert an LLM with LoRA layers into an LLM with compact LoRA layers, which improves the LLM fine-tuning process in terms of peak-memory reduction and time speedup.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from deltatuner import deltatuner, deltatuner_args
# import model from huggingface
model_id = "mosaicml/mpt-7b"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# add the LoRA components with peft
config = LoraConfig()
lora_model = get_peft_model(model, config)
# deltatuner optimizes the model with the best LoRA layer configuration
deltatuning_args = deltatuner_args.DeltaTunerArguments()
deltatuner_model = deltatuner.optimize(model=lora_model, tokenizer=tokenizer, deltatuning_args=deltatuning_args)
...
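After deltatuner.optimize returns, the model can be fine-tuned like any other PEFT model. The snippet below is a minimal sketch that continues the example with the standard Hugging Face Trainer; the tiny placeholder dataset and training arguments are illustrative only and are not part of deltatuner.

from datasets import Dataset
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

# The MPT tokenizer may not define a pad token; reuse the EOS token if needed.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Tiny placeholder dataset just to keep the sketch self-contained;
# replace it with your real fine-tuning corpus.
texts = ["Deltatuner optimizes LLM fine-tuning with compact delta layers."]
train_dataset = Dataset.from_dict(dict(tokenizer(texts)))

training_args = TrainingArguments(
    output_dir="mpt-7b-deltatuner",
    per_device_train_batch_size=1,
    num_train_epochs=1,
)
trainer = Trainer(
    model=deltatuner_model,  # the DeltaTunerModel returned by deltatuner.optimize
    args=training_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()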
API reference
In the above example, deltatuner.optimize is a Python function that applies the optimization algorithms supported by deltatuner to the model.
def optimize(model, tokenizer, algo: str="auto", deltatuning_args: DeltaTunerArguments=None) -> DeltaTunerModel:
    '''
    Parameters:
        model - a PreTrainedModel or LoraModel; specifies the model to be optimized
        tokenizer - the tokenizer used to preprocess text
        algo (str, optional) - the adapter algorithm to apply (default: "auto")
            "auto" - if the input model is MPT, the algorithm is ssf; otherwise, the algorithm is lora
            "lora" - use the lora algorithm
            "ssf" - use the ssf algorithm
        deltatuning_args (optional) - the deltatuner configuration; in particular, deltatuning_args.denas controls whether DE-NAS is used in the optimization (default: True)
    Return:
        DeltaTunerModel - a wrapper of the model, composed of the original properties/functions together with the advanced properties/functions provided by deltatuner
    '''
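As a minimal usage sketch based on the parameter descriptions above, the call below explicitly selects the lora algorithm and disables the DE-NAS search via deltatuning_args.denas, reusing lora_model and tokenizer from the earlier example.

from deltatuner import deltatuner, deltatuner_args

deltatuning_args = deltatuner_args.DeltaTunerArguments()
deltatuning_args.denas = False  # skip the DE-NAS search (enabled by default)
optimized_model = deltatuner.optimize(
    model=lora_model,        # the peft LoraModel from the earlier example
    tokenizer=tokenizer,
    algo="lora",             # explicitly choose lora instead of "auto"
    deltatuning_args=deltatuning_args,
)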
Detailed examples
Please refer to the example page for more use cases of fine-tuning other LLMs with the help of DeltaTuner.
Model supported matrix
Causal Language Modeling
| Model | LoRA | SSF |
|---|---|---|
| GPT-2 | ✅ | |
| GPT-J | ✅ | ✅ |
| Bloom | ✅ | ✅ |
| OPT | ✅ | ✅ |
| GPT-Neo | ✅ | ✅ |
| Falcon | ✅ | ✅ |
| LLaMA | ✅ | ✅ |
| MPT | ✅ | ✅ |