# xlmtec

Production-grade LLM fine-tuning, distillation, and pruning from the command line.
## What it does

xlmtec is a modular Python framework for fine-tuning, distilling, and pruning large language models. It wraps HuggingFace Transformers and PEFT in a clean CLI, with a validated config system, a composable trainer stack, an interactive TUI, and a full test suite whose unit tests run entirely on CPU.
## Install

```bash
git clone https://github.com/Abdur-azure/xlmtec.git
cd xlmtec
pip install -e .
```
## 5-minute quickstart

```bash
# 1. Generate sample training data (no network required)
python examples/generate_sample_data.py

# 2. Not sure which method to use? Ask:
xlmtec recommend gpt2 --output my_config.yaml

# 3. Train with the generated config
xlmtec train --config my_config.yaml

# 4. Or use a ready-made config
xlmtec train --config examples/configs/lora_gpt2.yaml

# 5. Launch the interactive TUI
xlmtec tui
```
## CLI commands

| Command | What it does |
|---|---|
| `xlmtec train` | Fine-tune using a YAML config or inline flags (LoRA / QLoRA / Full / Instruction / DPO / Distillation) |
| `xlmtec evaluate` | Score a saved checkpoint (ROUGE, BLEU, perplexity) |
| `xlmtec benchmark` | Before/after comparison: base vs. fine-tuned |
| `xlmtec merge` | Merge a LoRA adapter into its base model, producing a standalone model |
| `xlmtec upload` | Push an adapter or merged model to the HuggingFace Hub |
| `xlmtec recommend` | Inspect model size and VRAM, then emit an optimal YAML config |
| `xlmtec prune` | Structured pruning: zero the lowest-magnitude attention heads |
| `xlmtec wanda` | WANDA unstructured pruning: zero weights by weight magnitude × activation score |
| `xlmtec tui` | Interactive Textual TUI exposing all commands in a terminal UI |
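As a reminder of what the perplexity figure reported by `evaluate` means, here is a minimal NumPy sketch (an illustration of the standard metric, not xlmtec's code): perplexity is the exponential of the mean per-token negative log-likelihood, so a model that assigns every token probability 1/k has perplexity k.

```python
import numpy as np

def perplexity(token_nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return float(np.exp(np.mean(token_nlls)))

# Uniform probability 0.25 per token -> perplexity 4
nlls = [-np.log(0.25)] * 10
assert abs(perplexity(nlls) - 4.0) < 1e-9
```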
## Training methods

| Method | Flag | Notes |
|---|---|---|
| LoRA | `--method lora` | Default. Adapter-based, memory-efficient |
| QLoRA | `--method qlora` | 4-bit quantised LoRA; fits large models on limited VRAM |
| Full Fine-Tuning | `--method full_finetuning` | Updates all parameters; small models only |
| Instruction Tuning | `--method instruction_tuning` | Alpaca-style `{instruction, input, response}` data |
| DPO | `--method dpo` | Direct Preference Optimization; requires `pip install trl` |
| Response Distillation | `--method vanilla_distillation` | Student mimics teacher logits (KL + CE loss) |
| Feature Distillation | `--method feature_distillation` | Student mimics teacher hidden states (MSE + KL + CE) |
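The distillation methods combine a distillation term with the usual cross-entropy loss. The toy NumPy sketch below illustrates the response-distillation objective (KL between temperature-softened teacher and student distributions, plus CE against hard labels); it is illustrative only — the actual trainer works on PyTorch tensors, and the temperature `T` and mixing weight `alpha` names here are assumptions, not xlmtec's parameter names.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Cross-entropy of the student against the hard labels
    p_student = softmax(student_logits)
    ce = -np.mean(np.log(p_student[np.arange(len(labels)), labels]))
    # KL(teacher || student) on temperature-softened distributions,
    # rescaled by T^2 as in standard knowledge distillation
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.mean(np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)) * T * T
    return alpha * kl + (1 - alpha) * ce

# Identical logits -> zero KL term; mismatched logits -> positive KL
assert distillation_loss([[1.0, 0.0]], [[1.0, 0.0]], [0], alpha=1.0) < 1e-9
assert distillation_loss([[1.0, 0.0]], [[0.0, 1.0]], [0], alpha=1.0) > 0
```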
## Pruning commands

```bash
# Structured pruning: zero the lowest-magnitude attention heads
xlmtec prune ./outputs/gpt2_lora \
  --output ./outputs/gpt2_pruned \
  --sparsity 0.3 \
  --method heads

# WANDA unstructured pruning: weight × activation scoring, zero-shot
xlmtec wanda ./outputs/gpt2_lora \
  --output ./outputs/gpt2_wanda \
  --sparsity 0.5 \
  --dataset ./data/sample.jsonl
```
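WANDA scores each weight by its magnitude times the norm of the calibration activations flowing into it, then zeroes the lowest-scoring fraction of weights in each output row. A toy NumPy sketch of that criterion (illustrative only, not xlmtec's implementation; real layers are large and the calibration data comes from the `--dataset` file):

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.5):
    """Zero the lowest-scoring weights per output row.

    Score_ij = |W_ij| * ||X_j||_2, where X is (tokens, in_features)
    calibration activations — the WANDA criterion.
    """
    act_norm = np.linalg.norm(X, axis=0)        # (in_features,)
    scores = np.abs(W) * act_norm               # broadcast over rows
    k = int(W.shape[1] * sparsity)              # weights to drop per row
    if k == 0:
        return W.copy()
    idx = np.argsort(scores, axis=1)[:, :k]     # lowest-k per row
    Wp = W.copy()
    np.put_along_axis(Wp, idx, 0.0, axis=1)
    return Wp

W = np.array([[1.0, 2.0, 3.0, 4.0],
              [4.0, 3.0, 2.0, 1.0]])
X = np.ones((16, 4))                            # toy calibration activations
print(wanda_prune(W, X, 0.5))                   # half of each row zeroed
```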
## Example configs

| Config | Method | Model | Data |
|---|---|---|---|
| `lora_gpt2.yaml` | LoRA | GPT-2 | `data/sample.jsonl` |
| `qlora_llama.yaml` | QLoRA | LLaMA-3.2-1B | HF Hub (needs token) |
| `instruction_tuning.yaml` | Instruction | GPT-2 | `data/instructions.jsonl` |
| `full_finetuning.yaml` | Full | GPT-2 | `data/sample.jsonl` |
| `dpo.yaml` | DPO | GPT-2 | `data/dpo_sample.jsonl` |
| `response_distillation.yaml` | Response Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `feature_distillation.yaml` | Feature Distillation | GPT-2 (student) ← GPT-2-medium | `data/sample.jsonl` |
| `structured_pruning.yaml` | Structured Pruning | GPT-2 | — |
| `wanda.yaml` | WANDA Pruning | GPT-2 | `data/sample.jsonl` (calibration) |
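For a rough sense of what these files contain, here is a hypothetical sketch of a LoRA config. Every field name below is an assumption inferred from the builder methods in the Python API (`with_model`, `with_dataset`, `with_tokenization`, `with_training`, `with_lora`) — consult the Configuration Reference and the shipped examples for the real schema.

```yaml
# Hypothetical sketch only — field names mirror the ConfigBuilder methods;
# see the Configuration Reference for the actual schema.
model:
  name: gpt2
dataset:
  path: ./data/sample.jsonl
  source: local_file
tokenization:
  max_length: 256
training:
  method: lora
  output_dir: ./output
  num_epochs: 3
lora:
  r: 8
  lora_alpha: 16
```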
## Python API

```python
from xlmtec.core.config import ConfigBuilder
from xlmtec.core.types import TrainingMethod, DatasetSource
from xlmtec.models.loader import load_model_and_tokenizer
from xlmtec.data import prepare_dataset
from xlmtec.trainers import TrainerFactory

config = (
    ConfigBuilder()
    .with_model("gpt2")
    .with_dataset("./data/sample.jsonl", source=DatasetSource.LOCAL_FILE)
    .with_tokenization(max_length=256)
    .with_training(TrainingMethod.LORA, "./output", num_epochs=3)
    .with_lora(r=8, lora_alpha=16)
    .build()
)

model, tokenizer = load_model_and_tokenizer(config.model.to_config())
dataset = prepare_dataset(config.dataset.to_config(), config.tokenization.to_config(), tokenizer)

result = TrainerFactory.train(
    model, tokenizer, dataset,
    config.training.to_config(),
    config.lora.to_config(),
)
print(f"Done. Loss: {result.train_loss:.4f} → {result.output_dir}")
```
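For intuition on what `with_lora(r=8, lora_alpha=16)` actually trains: a LoRA-adapted layer computes `W x + (alpha / r) * B (A x)`, where the frozen base weight `W` is untouched and only the small matrices `A` and `B` are updated. A toy NumPy sketch (illustrative only; shapes chosen arbitrarily, not xlmtec code):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 32, 32, 8, 16

W = rng.normal(size=(d_out, d_in))      # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-init
x = rng.normal(size=d_in)

# Adapted forward pass: base output plus scaled low-rank update
y = W @ x + (alpha / r) * (B @ (A @ x))

# Because B starts at zero, the adapter is a no-op at initialisation
assert np.allclose(y, W @ x)
# Far fewer trainable parameters than full fine-tuning
assert A.size + B.size < W.size
```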
## Docs
- Usage Guide — all 9 commands with examples
- Configuration Reference — YAML config fields for all methods
- API Reference — Python API for all trainers and pruners
- TUI Guide — interactive terminal interface
- Architecture — module design
- Contributing — how to add trainers or commands
## Tests

```bash
# Unit tests (no GPU needed)
pytest tests/ -v --ignore=tests/test_integration.py

# Integration tests (CPU ok, ~30 s; downloads GPT-2 once)
pytest tests/test_integration.py -v -s

# Full suite
pytest tests/ -v
```
## Project status
| Aspect | Status |
|---|---|
| Version | 3.15.0 |
| Tests | 200+ unit + integration, all green |
| CI | pytest on Python 3.10 / 3.11 / 3.12 |
| Platform | Windows / macOS / Linux |
| License | MIT |
## Download files
### Source distribution: xlmtec-3.15.0.tar.gz

- Size: 100.5 kB
- Tags: Source
- Uploaded via: twine/6.2.0 (CPython/3.10.6)
- Uploaded using Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `747ddbba45957584fcff858370b47186b742195d8e7926cd16050f17364e1af6` |
| MD5 | `c6cbed9997f178b0856aabaf4df90140` |
| BLAKE2b-256 | `c40e67a0257cf0fa6801d0a3d020344925d48789db2517355ebdf4406aa74fcc` |
### Built distribution: xlmtec-3.15.0-py3-none-any.whl

- Size: 94.2 kB
- Tags: Python 3
- Uploaded via: twine/6.2.0 (CPython/3.10.6)
- Uploaded using Trusted Publishing: No

| Algorithm | Hash digest |
|---|---|
| SHA256 | `5f00ac845924a424af9e3c2cd916ef70c45bf9041eb66067554d8820fefff6e3` |
| MD5 | `05aef23eec45284e2a672701d63be256` |
| BLAKE2b-256 | `2a5a6b4c6fe7be57b749dcd829d97257f9b46950cff32ade73d9c1cda1a6bc4a` |