Skip to main content

Fine-tune an LLM with Unsloth

Project description

local_llm_finetune

Fine-tune an LLM with Unsloth.

Installation

  • install unsloth with:
conda create --name unsloth_env \
    python=3.11 \
    pytorch-cuda=12.1 \
    pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \
    -y
conda activate unsloth_env

pip install unsloth
  • install the library with pip install local-llm-finetune

Usage

Create training data

The library uses a Pandas DataFrame as the basis of the training data. This dataframe should have two columns, query, and response, corresponding to what you are asking the model and what its response should be. If you don't have your data set up in this way and only have raw documents, you can use the library to get it into this format.

First, convert your documents into .txt format. Then, create a metadata.csv file. This file can contain any columns desired, but must at least have one column named filepath. This column should contain the filenames of the .txt files.

There are two methods for converting the documents to query-response pairs: 1. Synthetic LLM responses 2. Text completion responses

In both cases the documents will be split into chunks as long as specified in the chunk_size parameter.

The first method will query an existing LLM with the prompt format specified in the prompt_format parameter. This string should be in a format like "Metadata: {}, instruction: {}". Where the first brackets will be filled in with the document's metadata, and the second with the chunk's content. This prompt will become the query column in the eventual dataset output. The LLM's response will become the response column.

The second method will split the chunk into two sections, with the ratio determined by the completion_ratio parameter. E.g., if the parameter is 0.75, 75% of the chunk will go into the prompt, and the remaining 25% will go in as the response. Below is a code example of both methods.

from local_llm_finetune.data_prep import process_data
import pandas as pd

metadata = pd.read_csv("path/metadata.csv")
files_path = "path_to_txt_files/"

# LLM dataset
dataset = process_data(
    metadata=metadata,
    files_path=files_path,
    chunk_size=500,
    chunk_overlap=150,
    prompt_format="This is an excerpt from the document with the the following metadata: {}. It is currently about 500 words long. Summarize the information to about 100 words. Here is the excerpt:\n\n'{}'",
    llm_url=llm_url, # this should be the url of the LLM to use, for instance, 'http://localhost:8081/v1/' for a llama.cpp local LLM, or 'https://generativelanguage.googleapis.com/v1beta/openai/' for google
    llm_model=model_name, # can be anything for a local LLM, for a cloud provider, the name of the model, e.g., 'gemini-2.0-flash'
    api_Key=api_key, # api key for the cloud provider, 'sk-no-key-required' for a local llama.cpp server
)

# text completion dataset
dataset = process_data(
    metadata=metadata,
    files_path=files_path,
    chunk_size=1000,
    chunk_overlap=200,
    prompt_format="This is an excerpt from the document with the the following metadata: {}. It is currently about 750 words long. Complete the excerpt with a new text about one third as long. Here is the excerpt: '{}'",
    completion_ratio=0.75, # what % of each chunk to have as the question, 1-ratio = what proportion to have as the answer
)

Fine-tuning a model

from local_llm_finetune.modelling import format_training_data, initialize_model, save_model, setup_training, train

# set up the model training
unsloth_model, tokenizer, max_seq_length = initialize_model(
    base_model_name="unsloth/Llama-3.1-8B-Instruct-bnb-4bit", # set to whatever base model to finetune on HuggingFace
    max_seq_length=2048,
)

# train the model
train_dataset = format_training_data(
    tokenizer,
    dataset,
    input_split=" Here is the excerpt:", # how to split the question into 'Instruction' and 'Input'
)

trainer = setup_training(
    unsloth_model,
    tokenizer,
    train_dataset,
    max_seq_length=2048,
    output_dir="outputs",
)

trainer_stats = train(trainer)

# save the output to GGUF
save_model(
    unsloth_model, tokenizer, quantization_method="q5_k_m", output_dir="trained_model"
)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

local_llm_finetune-0.0.2.tar.gz (6.4 kB view details)

Uploaded Source

File details

Details for the file local_llm_finetune-0.0.2.tar.gz.

File metadata

  • Download URL: local_llm_finetune-0.0.2.tar.gz
  • Upload date:
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for local_llm_finetune-0.0.2.tar.gz
Algorithm Hash digest
SHA256 f950778f861f9fa65f63113548ee0084c39db16612baf3b3d2fb3be7db6b0c16
MD5 9862eb8c4e643bfcb661d8b84f68c889
BLAKE2b-256 3e97a36a1a13deb4fee7dd64206b9cae115dc59e9db2910d1ea25110aa85db6f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page