
neotune

Simple LLM fine-tuning with LoRA and DeepSpeed.

Three inputs: a model, your datasets, and hyperparameters.

Installation

pip install neotune

With optional extras:

pip install "neotune[ray]"       # Ray distributed training
pip install "neotune[logging]"   # MLflow + Weights & Biases
pip install "neotune[all]"       # everything

Quick Start

from neotune import finetune

results = finetune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds},
    hyperparameters={"learning_rate": 2e-4, "num_train_epochs": 3},
)

Each dataset is a HuggingFace Dataset with either a "text" column containing fully formatted prompt/response strings, or pre-tokenized columns (input_ids, attention_mask, labels).

Preparing Your Datasets

neotune expects you to bring your own HuggingFace Dataset objects.

From a HuggingFace dataset

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

ds = load_dataset("tatsu-lab/alpaca", split="train")

def format_example(example):
    messages = [
        {"role": "user", "content": example["instruction"]},
        {"role": "assistant", "content": example["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

ds = ds.map(format_example, remove_columns=ds.column_names)
splits = ds.train_test_split(test_size=0.1, seed=42)

# splits["train"] and splits["test"] each have a "text" column
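For intuition, the single string that apply_chat_template produces can be approximated by a toy formatter (the tag format below is made up for illustration; real chat templates are model-specific, so always use the tokenizer's own template for training):

```python
def simple_chat_format(messages):
    # Toy stand-in for tokenizer.apply_chat_template(..., tokenize=False):
    # flattens a list of role/content dicts into one training string.
    return "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)

text = simple_chat_format([
    {"role": "user", "content": "What is 2+2?"},
    {"role": "assistant", "content": "4"},
])
```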

From a CSV file

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

ds = load_dataset("csv", data_files="data.csv", split="train")

def format_example(example):
    messages = [
        {"role": "user", "content": example["prompt"]},
        {"role": "assistant", "content": example["response"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

ds = ds.map(format_example, remove_columns=ds.column_names)
splits = ds.train_test_split(test_size=0.1, seed=42)
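The CSV is assumed to have prompt and response columns matching the field names used above, e.g.:

```csv
prompt,response
"What is the capital of France?","Paris"
"Translate 'hello' to Spanish.","Hola"
```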

API Reference

finetune(model, datasets, hyperparameters) -> dict

One-call convenience function. Returns test-set metrics if a "test" split was provided, otherwise an empty dict.

from neotune import finetune

results = finetune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds, "test": test_ds},
    hyperparameters={"learning_rate": 2e-4},
)

NeoTune(model, datasets, hyperparameters)

Class-based API.

from neotune import NeoTune

nt = NeoTune(
    model="meta-llama/Llama-3.1-8B-Instruct",
    datasets={"train": train_ds, "validation": val_ds},
    hyperparameters={"num_train_epochs": 5, "output_dir": "./my-adapter"},
)

results = nt.train()

Parameters

model -- str HuggingFace model ID or local path.

datasets -- dict[str, Dataset] A dict of HuggingFace Dataset objects. "train" is required. "validation" and "test" are optional.

hyperparameters -- dict, optional Override any default. All keys are optional:

Key                           Default              Description

Training
  learning_rate               1e-4                 Learning rate
  num_train_epochs            3                    Number of training epochs
  batch_size                  1                    Per-device batch size
  gradient_accumulation_steps 4                    Gradient accumulation steps
  warmup_ratio                0.03                 Warmup ratio
  weight_decay                0.01                 Weight decay
  bf16                        True                 bfloat16 mixed precision
  gradient_checkpointing      False                Gradient checkpointing
  logging_steps               10                   Log every N steps
  eval_steps                  50                   Evaluate every N steps
  save_steps                  100                  Checkpoint every N steps
  save_total_limit            3                    Max checkpoints to keep

LoRA
  lora_r                      16                   LoRA rank
  lora_alpha                  32                   LoRA alpha
  lora_dropout                0.05                 LoRA dropout
  lora_target_modules         "all-linear"         Target modules (auto-detects all linear layers)

Output
  output_dir                  "./adapter-output"   Where to save the adapter
  hf_repo                     None                 HuggingFace Hub repo ID to push the adapter to

DeepSpeed
  ds_config                   None                 None = auto (ZeRO-2 on multi-GPU, off on single GPU); "auto" = force ZeRO-2; False = force off; or a DeepSpeed config file path or dict

Data
  max_len                     2048                 Max sequence length
  dataset_text_field          "text"               Column name for training text
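Override semantics can be illustrated with a plain dict merge: keys you pass replace the matching defaults, and everything else keeps its default value (the DEFAULTS dict below is a hypothetical subset for illustration, not neotune's actual source):

```python
# Hypothetical subset of the defaults table above.
DEFAULTS = {
    "learning_rate": 1e-4,
    "num_train_epochs": 3,
    "batch_size": 1,
    "lora_r": 16,
    "output_dir": "./adapter-output",
}

# User-supplied hyperparameters override matching keys; the rest keep defaults.
overrides = {"learning_rate": 2e-4, "lora_r": 8}
hyperparameters = {**DEFAULTS, **overrides}
```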

Methods

  • .train() -> dict -- Fine-tune and return test metrics (if test split provided).
  • .tokenizer -- Access the underlying tokenizer.

GPU & DeepSpeed

neotune auto-detects your GPU setup (single-node):

GPUs     Default behavior
0 (CPU)  Trains on CPU
1        Standard single-GPU training (no DeepSpeed)
2+       DeepSpeed ZeRO-2 auto-enabled

Override with ds_config:

# Force DeepSpeed off (e.g. multi-GPU notebook without mpi4py)
finetune(model, datasets, {"ds_config": False})

# Force DeepSpeed on (even on single GPU)
finetune(model, datasets, {"ds_config": "auto"})

# Custom DeepSpeed config
finetune(model, datasets, {"ds_config": "my_ds_config.json"})
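The selection rules above can be sketched as a small resolver (a hedged illustration of the documented behavior, not neotune's actual code; num_gpus stands in for the detected GPU count):

```python
def resolve_ds_config(num_gpus, ds_config=None):
    """Illustrative resolver for the documented ds_config rules."""
    if ds_config is None:
        # Auto mode: ZeRO-2 only when more than one GPU is visible.
        return "zero2" if num_gpus >= 2 else None
    if ds_config == "auto":
        return "zero2"          # force ZeRO-2 even on a single GPU
    if ds_config is False:
        return None             # force DeepSpeed off
    return ds_config            # file path or dict, passed through as-is
```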

Notebook users: If you have multiple GPUs but get ModuleNotFoundError: No module named 'mpi4py', either install it (pip install mpi4py) or disable DeepSpeed with "ds_config": False.
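One way to make that choice automatic in a notebook, using only the standard library (a sketch, not a neotune feature):

```python
import importlib.util

# Disable DeepSpeed up front when mpi4py is missing, rather than
# hitting ModuleNotFoundError mid-run.
has_mpi4py = importlib.util.find_spec("mpi4py") is not None
ds_config = None if has_mpi4py else False  # None = auto, False = force off
```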

Advanced Usage

Generative evaluation

from neotune.eval import generate_and_evaluate

results = generate_and_evaluate(
    model_id="meta-llama/Llama-3.1-8B-Instruct",
    adapter_dir="./my-adapter",
    test_ds=test_ds,
    prompt_col="instruction",
    label_col="expected_output",
)

Distributed training with DeepSpeed (CLI)

deepspeed --num_gpus 4 -m neotune.train --config config.yaml --mode train

Distributed training with Ray

python -m neotune.ray_train --config config.yaml --num_workers 4

Kubernetes (KubeRay)

See k8s/rayjob-lora-sft.yaml for a KubeRay RayJob template.

Environment Variables

Variable        Description
HF_TOKEN        HuggingFace access token (for gated models)
WANDB_API_KEY   Weights & Biases API key (optional)

Create a .env file in your working directory:

HF_TOKEN=your_token_here
WANDB_API_KEY=your_wandb_key_here
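If you want to load such a file yourself, python-dotenv is the usual tool; for reference, a minimal stdlib parser covering simple KEY=value lines like the ones above looks like this (a sketch, not neotune's loader):

```python
import os

def load_env(path=".env"):
    # Minimal .env reader: KEY=value lines; '#' comments and blanks skipped.
    # python-dotenv also handles quoting and interpolation; this sketch does not.
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```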

License

MIT
