
xlmtec

Production-grade LLM fine-tuning, distillation, and pruning from the command line.



What it does

xlmtec is a modular Python framework for fine-tuning, distilling, and pruning large language models. It wraps HuggingFace Transformers and PEFT behind a clean CLI, a validated config system, a composable trainer stack, an interactive TUI, and a full test suite whose unit tests run entirely on CPU.


Install

git clone https://github.com/Abdur-azure/xlmtec.git
cd xlmtec
pip install -e .

5-minute quickstart

# 1. Generate sample training data (no network required)
python examples/generate_sample_data.py

# 2. Not sure which method to use? Ask
xlmtec recommend gpt2 --output my_config.yaml

# 3. Train with the generated config
xlmtec train --config my_config.yaml

# 4. Or use a ready-made config
xlmtec train --config examples/configs/lora_gpt2.yaml

# 5. Launch the interactive TUI
xlmtec tui

CLI commands

| Command | What it does |
| --- | --- |
| `xlmtec train` | Fine-tune using a YAML config or inline flags (LoRA / QLoRA / Full / Instruction / DPO / Distillation) |
| `xlmtec evaluate` | Score a saved checkpoint (ROUGE, BLEU, Perplexity) |
| `xlmtec benchmark` | Before/after comparison: base vs fine-tuned |
| `xlmtec merge` | Merge LoRA adapter into base model → standalone model |
| `xlmtec upload` | Push adapter or merged model to HuggingFace Hub |
| `xlmtec recommend` | Inspect model size + VRAM, output optimal YAML config |
| `xlmtec prune` | Structured pruning — zero lowest-magnitude attention heads |
| `xlmtec wanda` | WANDA unstructured pruning — zero weights by \|W\| × activation score |
| `xlmtec tui` | Interactive Textual TUI — all commands via a terminal UI |

Training methods

| Method | Flag | Notes |
| --- | --- | --- |
| LoRA | `--method lora` | Default. Adapter-based, memory-efficient |
| QLoRA | `--method qlora` | 4-bit quantised LoRA — large models on limited VRAM |
| Full Fine-Tuning | `--method full_finetuning` | All parameters — small models only |
| Instruction Tuning | `--method instruction_tuning` | Alpaca-style `{instruction, input, response}` data |
| DPO | `--method dpo` | Direct Preference Optimization — requires `pip install trl` |
| Response Distillation | `--method vanilla_distillation` | Student mimics teacher logits (KL + CE loss) |
| Feature Distillation | `--method feature_distillation` | Student mimics teacher hidden states (MSE + KL + CE) |
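The two distillation rows combine soft and hard targets. As a rough NumPy sketch of the vanilla (response) distillation objective, the student is trained on temperature-scaled KL divergence against the teacher's logits plus cross-entropy on the ground-truth labels. The weighting scheme and constants here are illustrative, not necessarily what xlmtec uses internally:

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """alpha * KL(teacher || student) at temperature T  +  (1 - alpha) * CE on hard labels."""
    t_probs = softmax(teacher_logits / temperature)
    s_log_probs = np.log(softmax(student_logits / temperature))
    # KL term, rescaled by T^2 so gradients keep a consistent magnitude
    kl = (t_probs * (np.log(t_probs) - s_log_probs)).sum(axis=-1).mean() * temperature ** 2
    # Standard cross-entropy against the ground-truth token ids
    ce = -np.log(softmax(student_logits))[np.arange(len(labels)), labels].mean()
    return alpha * kl + (1 - alpha) * ce
```

When student and teacher logits are identical the KL term vanishes, so the loss reduces to the plain cross-entropy weighted by `1 - alpha`.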

Pruning commands

# Structured pruning — zero lowest-magnitude attention heads
xlmtec prune ./outputs/gpt2_lora \
    --output ./outputs/gpt2_pruned \
    --sparsity 0.3 \
    --method heads
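The `heads` method scores whole attention heads and zeros the weakest ones. Below is a hedged NumPy sketch of the magnitude criterion; the function name and the assumption that heads occupy contiguous row slices of a projection matrix are illustrative, not xlmtec's actual internals:

```python
import numpy as np

def prune_heads_by_magnitude(proj_weight: np.ndarray, num_heads: int,
                             sparsity: float) -> np.ndarray:
    """Zero the lowest-L2-norm attention heads of a projection matrix (illustrative)."""
    d_model = proj_weight.shape[0]
    head_dim = d_model // num_heads
    # L2 norm of each head's contiguous slice of rows
    head_norms = np.array([
        np.linalg.norm(proj_weight[h * head_dim:(h + 1) * head_dim])
        for h in range(num_heads)
    ])
    n_prune = int(num_heads * sparsity)
    pruned = proj_weight.copy()
    for h in np.argsort(head_norms)[:n_prune]:
        pruned[h * head_dim:(h + 1) * head_dim] = 0.0
    return pruned
```

With `--sparsity 0.3` on a 12-head GPT-2 layer, this criterion would zero the 3 heads with the smallest weight norms.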

# WANDA unstructured pruning — weight × activation scoring, zero-shot
xlmtec wanda ./outputs/gpt2_lora \
    --output ./outputs/gpt2_wanda \
    --sparsity 0.5 \
    --dataset ./data/sample.jsonl
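WANDA ranks each weight by |W_ij| · ‖X_j‖₂, the product of its magnitude and the norm of the calibration activations feeding it, then zeros the lowest-scoring weights within each output row. A self-contained NumPy sketch of that scoring rule (not xlmtec's implementation, which operates layer by layer on real calibration batches):

```python
import numpy as np

def wanda_prune(W: np.ndarray, X: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the lowest-scoring weights per output row.

    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration activations.
    """
    act_norm = np.linalg.norm(X, axis=0)        # per-input-feature L2 norm
    score = np.abs(W) * act_norm                # broadcasts across output rows
    k = int(W.shape[1] * sparsity)              # weights to drop in each row
    pruned = W.copy()
    if k > 0:
        idx = np.argsort(score, axis=1)[:, :k]  # lowest-score columns per row
        np.put_along_axis(pruned, idx, 0.0, axis=1)
    return pruned
```

The activation term is what makes the method "zero-shot": no retraining is needed, only a forward pass over a small calibration set such as `data/sample.jsonl`.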

Example configs

| Config | Method | Model | Data |
| --- | --- | --- | --- |
| `lora_gpt2.yaml` | LoRA | GPT-2 | `data/sample.jsonl` |
| `qlora_llama.yaml` | QLoRA | LLaMA-3.2-1B | HF Hub (needs token) |
| `instruction_tuning.yaml` | Instruction | GPT-2 | `data/instructions.jsonl` |
| `full_finetuning.yaml` | Full | GPT-2 | `data/sample.jsonl` |
| `dpo.yaml` | DPO | GPT-2 | `data/dpo_sample.jsonl` |
| `response_distillation.yaml` | Response Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `feature_distillation.yaml` | Feature Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `structured_pruning.yaml` | Structured Pruning | GPT-2 | — |
| `wanda.yaml` | WANDA Pruning | GPT-2 | `data/sample.jsonl` (calibration) |

Python API

from xlmtec.core.config import ConfigBuilder
from xlmtec.core.types import TrainingMethod, DatasetSource
from xlmtec.models.loader import load_model_and_tokenizer
from xlmtec.data import prepare_dataset
from xlmtec.trainers import TrainerFactory

config = (
    ConfigBuilder()
    .with_model("gpt2")
    .with_dataset("./data/sample.jsonl", source=DatasetSource.LOCAL_FILE)
    .with_tokenization(max_length=256)
    .with_training(TrainingMethod.LORA, "./output", num_epochs=3)
    .with_lora(r=8, lora_alpha=16)
    .build()
)

model, tokenizer = load_model_and_tokenizer(config.model.to_config())
dataset = prepare_dataset(config.dataset.to_config(), config.tokenization.to_config(), tokenizer)
result = TrainerFactory.train(
    model, tokenizer, dataset,
    config.training.to_config(),
    config.lora.to_config(),
)
print(f"Done. Loss: {result.train_loss:.4f}. Saved to {result.output_dir}")
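
The same run can be written declaratively. The fragment below is a hypothetical sketch that mirrors the builder calls above; the authoritative schema lives in `examples/configs/` (e.g. `lora_gpt2.yaml`), and the field names here are guesses:

```yaml
# Hypothetical config sketch mirroring the ConfigBuilder calls above.
# Field names are assumptions; consult examples/configs/ for the real schema.
model:
  name: gpt2
dataset:
  path: ./data/sample.jsonl
  source: local_file
tokenization:
  max_length: 256
training:
  method: lora
  output_dir: ./output
  num_epochs: 3
lora:
  r: 8
  lora_alpha: 16
```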


Tests

# Unit tests (no GPU needed)
pytest tests/ -v --ignore=tests/test_integration.py

# Integration tests (CPU ok, ~30s — downloads GPT-2 once)
pytest tests/test_integration.py -v -s

# Full suite
pytest tests/ -v

Project status

| Aspect | Status |
| --- | --- |
| Version | 3.13.0 |
| Tests | 200+ unit + integration, all green |
| CI | pytest on Python 3.10 / 3.11 / 3.12 |
| Platform | Windows / macOS / Linux |
| License | MIT |

Download files

Download the file for your platform.

Source Distribution

xlmtec-3.17.0.tar.gz (103.0 kB)

Built Distribution

xlmtec-3.17.0-py3-none-any.whl (101.8 kB)

File details

Details for the file xlmtec-3.17.0.tar.gz.

File metadata

  • Download URL: xlmtec-3.17.0.tar.gz
  • Upload date:
  • Size: 103.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `6f1ed14202750645a54000521013e877871fed06f4d090028c2f39765e7af292` |
| MD5 | `c57c568bc993f4d03fc3b7d7e02d9be0` |
| BLAKE2b-256 | `f278f7cf200978292b21bba07c5dbbca3555c2dd7d18d5c247cc4ad865469795` |

File details

Details for the file xlmtec-3.17.0-py3-none-any.whl.

File metadata

  • Download URL: xlmtec-3.17.0-py3-none-any.whl
  • Upload date:
  • Size: 101.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `aaaee042377bdee0c39c62c2bd3a6cffe7fa1d7f96da91680741a062698822f0` |
| MD5 | `a2275ee1f2d66862b03dab647e07052b` |
| BLAKE2b-256 | `8343a587da31ba2e458f0fcfcadea265a6db8d826c0342f2c9b0e67783e86745` |
