
xlmtec

Production-grade LLM fine-tuning, distillation, and pruning from the command line.



What it does

xlmtec is a modular Python framework for fine-tuning, distilling, and pruning large language models. It wraps HuggingFace Transformers and PEFT in a clean CLI, with a validated config system, a composable trainer stack, an interactive TUI, and a full test suite; the unit tests run entirely on CPU.


Install

git clone https://github.com/Abdur-azure/xlmtec.git
cd xlmtec
pip install -e .

5-minute quickstart

# 1. Generate sample training data (no network required)
python examples/generate_sample_data.py

# 2. Not sure which method to use? Ask
xlmtec recommend gpt2 --output my_config.yaml

# 3. Train with the generated config
xlmtec train --config my_config.yaml

# 4. Or use a ready-made config
xlmtec train --config examples/configs/lora_gpt2.yaml

# 5. Launch the interactive TUI
xlmtec tui
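The config consumed by `xlmtec train` is a validated YAML file. As a rough illustration only, a minimal config might look like the sketch below; the field names here are guessed from the `ConfigBuilder` calls in the Python API section, and the actual schema is defined by the project's config system:

```yaml
# Hypothetical sketch; consult the shipped examples/configs/ for the real schema
model:
  name: gpt2
dataset:
  path: ./data/sample.jsonl
  source: local_file
tokenization:
  max_length: 256
training:
  method: lora
  output_dir: ./output
  num_epochs: 3
lora:
  r: 8
  lora_alpha: 16
```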

CLI commands

| Command | What it does |
| --- | --- |
| `xlmtec train` | Fine-tune using a YAML config or inline flags (LoRA / QLoRA / Full / Instruction / DPO / Distillation) |
| `xlmtec evaluate` | Score a saved checkpoint (ROUGE, BLEU, Perplexity) |
| `xlmtec benchmark` | Before/after comparison: base vs. fine-tuned |
| `xlmtec merge` | Merge a LoRA adapter into the base model → standalone model |
| `xlmtec upload` | Push an adapter or merged model to the HuggingFace Hub |
| `xlmtec recommend` | Inspect model size and VRAM, output an optimal YAML config |
| `xlmtec prune` | Structured pruning: zero the lowest-magnitude attention heads |
| `xlmtec wanda` | WANDA unstructured pruning: zero weights by \|W\| × activation score |
| `xlmtec tui` | Interactive Textual TUI: all commands via a terminal UI |

Training methods

| Method | Flag | Notes |
| --- | --- | --- |
| LoRA | `--method lora` | Default. Adapter-based, memory-efficient |
| QLoRA | `--method qlora` | 4-bit quantised LoRA; fits large models on limited VRAM |
| Full Fine-Tuning | `--method full_finetuning` | All parameters; small models only |
| Instruction Tuning | `--method instruction_tuning` | Alpaca-style `{instruction, input, response}` data |
| DPO | `--method dpo` | Direct Preference Optimization; requires `pip install trl` |
| Response Distillation | `--method vanilla_distillation` | Student mimics teacher logits (KL + CE loss) |
| Feature Distillation | `--method feature_distillation` | Student mimics teacher hidden states (MSE + KL + CE) |
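The distillation losses in the table reduce to a simple recipe: a temperature-scaled KL term against the teacher's soft targets plus a standard cross-entropy term against the hard labels. A minimal NumPy sketch of that combination (generic technique only, not xlmtec's actual implementation; the `alpha` and `T` names are illustrative):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """alpha * KL(teacher || student, temperature T) + (1 - alpha) * CE(labels)."""
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T) + 1e-12)
    # KL term, scaled by T^2 as in standard knowledge distillation
    kl = np.mean(np.sum(p_t * (np.log(p_t + 1e-12) - log_p_s), axis=-1)) * T * T
    # Hard-label cross-entropy at temperature 1
    log_p = np.log(softmax(student_logits) + 1e-12)
    ce = -np.mean(log_p[np.arange(len(labels)), labels])
    return alpha * kl + (1 - alpha) * ce
```

When student and teacher logits coincide, the KL term vanishes and only the cross-entropy contribution remains, which is why distillation degrades gracefully toward plain fine-tuning.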

Pruning commands

# Structured pruning — zero lowest-magnitude attention heads
xlmtec prune ./outputs/gpt2_lora \
    --output ./outputs/gpt2_pruned \
    --sparsity 0.3 \
    --method heads

# WANDA unstructured pruning — weight × activation scoring, zero-shot
xlmtec wanda ./outputs/gpt2_lora \
    --output ./outputs/gpt2_wanda \
    --sparsity 0.5 \
    --dataset ./data/sample.jsonl
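The WANDA scoring rule above is straightforward to state: each weight is ranked by its magnitude times the L2 norm of the corresponding input activation from the calibration set, and the lowest-scoring weights in each output row are zeroed. A NumPy sketch of that rule (illustrative only, not xlmtec's code path):

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Zero the lowest |W| * ||x||_2 scored weights per output row.

    W: (out_features, in_features) weight matrix
    X: (n_samples, in_features) calibration activations
    """
    act_norm = np.linalg.norm(X, axis=0)   # per-input-channel L2 norm
    score = np.abs(W) * act_norm           # WANDA importance score
    k = int(W.shape[1] * sparsity)         # weights to drop per row
    pruned = W.copy()
    for i in range(W.shape[0]):
        idx = np.argsort(score[i])[:k]     # indices of the lowest scores
        pruned[i, idx] = 0.0
    return pruned
```

Because the score uses calibration activations rather than weight magnitude alone, channels that rarely fire are pruned first even when their weights are large, which is the point of the method.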

Example configs

| Config | Method | Model | Data |
| --- | --- | --- | --- |
| `lora_gpt2.yaml` | LoRA | GPT-2 | `data/sample.jsonl` |
| `qlora_llama.yaml` | QLoRA | LLaMA-3.2-1B | HF Hub (needs token) |
| `instruction_tuning.yaml` | Instruction | GPT-2 | `data/instructions.jsonl` |
| `full_finetuning.yaml` | Full | GPT-2 | `data/sample.jsonl` |
| `dpo.yaml` | DPO | GPT-2 | `data/dpo_sample.jsonl` |
| `response_distillation.yaml` | Response Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `feature_distillation.yaml` | Feature Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `structured_pruning.yaml` | Structured Pruning | GPT-2 | |
| `wanda.yaml` | WANDA Pruning | GPT-2 | `data/sample.jsonl` (calibration) |

Python API

from xlmtec.core.config import ConfigBuilder
from xlmtec.core.types import TrainingMethod, DatasetSource
from xlmtec.models.loader import load_model_and_tokenizer
from xlmtec.data import prepare_dataset
from xlmtec.trainers import TrainerFactory

config = (
    ConfigBuilder()
    .with_model("gpt2")
    .with_dataset("./data/sample.jsonl", source=DatasetSource.LOCAL_FILE)
    .with_tokenization(max_length=256)
    .with_training(TrainingMethod.LORA, "./output", num_epochs=3)
    .with_lora(r=8, lora_alpha=16)
    .build()
)

model, tokenizer = load_model_and_tokenizer(config.model.to_config())
dataset = prepare_dataset(config.dataset.to_config(), config.tokenization.to_config(), tokenizer)
result = TrainerFactory.train(
    model, tokenizer, dataset,
    config.training.to_config(),
    config.lora.to_config(),
)
print(f"Done. Loss: {result.train_loss:.4f}. Saved to {result.output_dir}")

Tests

# Unit tests (no GPU needed)
pytest tests/ -v --ignore=tests/test_integration.py

# Integration tests (CPU ok, ~30s — downloads GPT-2 once)
pytest tests/test_integration.py -v -s

# Full suite
pytest tests/ -v

Project status

| Aspect | Status |
| --- | --- |
| Version | 3.13.0 |
| Tests | 200+ unit and integration tests, all green |
| CI | pytest on Python 3.10 / 3.11 / 3.12 |
| Platform | Windows / macOS / Linux |
| License | MIT |

Download files


Source Distribution

xlmtec-3.15.0.tar.gz (100.5 kB, source)

Built Distribution


xlmtec-3.15.0-py3-none-any.whl (94.2 kB, Python 3)

File details

Details for the file xlmtec-3.15.0.tar.gz.

File metadata

  • Download URL: xlmtec-3.15.0.tar.gz
  • Size: 100.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.6

File hashes

Hashes for xlmtec-3.15.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `747ddbba45957584fcff858370b47186b742195d8e7926cd16050f17364e1af6` |
| MD5 | `c6cbed9997f178b0856aabaf4df90140` |
| BLAKE2b-256 | `c40e67a0257cf0fa6801d0a3d020344925d48789db2517355ebdf4406aa74fcc` |

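The published digests can be checked against a downloaded file before installing. A small sketch using Python's standard `hashlib` (generic verification code, not part of xlmtec; the filename is the sdist listed above):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

# Compare against the published digest, e.g.:
# sha256_of("xlmtec-3.15.0.tar.gz") == \
#     "747ddbba45957584fcff858370b47186b742195d8e7926cd16050f17364e1af6"
```

Reading in chunks keeps memory flat regardless of archive size; `pip hash <file>` offers the same check from the command line.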

File details

Details for the file xlmtec-3.15.0-py3-none-any.whl.

File metadata

  • Download URL: xlmtec-3.15.0-py3-none-any.whl
  • Size: 94.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.6

File hashes

Hashes for xlmtec-3.15.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `5f00ac845924a424af9e3c2cd916ef70c45bf9041eb66067554d8820fefff6e3` |
| MD5 | `05aef23eec45284e2a672701d63be256` |
| BLAKE2b-256 | `2a5a6b4c6fe7be57b749dcd829d97257f9b46950cff32ade73d9c1cda1a6bc4a` |

