Zen Trainer
Training framework for Zen Coder models - fine-tune 4B to 1T parameter models on agentic coding data.
Supported Models
| Model | Base | Size | VRAM (QLoRA) | Context | License |
|---|---|---|---|---|---|
| Zen Coder 4B | Qwen3-4B-Instruct | 4B | 8 GB | 32K | Apache 2.0 |
| Zen Coder 24B | Devstral-Small-2-24B | 24B | 24 GB | 256K | Apache 2.0 |
| Zen Coder 123B | Devstral-2-123B | 123B | 128 GB | 256K | Mistral Research |
| Zen Coder MAX | GLM-4.7 (358B MoE) | 358B | 180 GB | 200K | GLM-4 License |
| Zen Coder ULTRA | Kimi-K2 (1T MoE) | 1T | 400 GB | 128K | MIT |
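To see which rows fit a given memory budget, the QLoRA VRAM column can be filtered directly. The sketch below copies the figures from the table into a plain dictionary; `QLORA_VRAM_GB` and `models_fitting` are illustrative names, not part of zen-trainer.

```python
# QLoRA VRAM requirements in GB, copied from the Supported Models table.
# These names are illustrative and not part of the zen-trainer API.
QLORA_VRAM_GB = {
    "Zen Coder 4B": 8,
    "Zen Coder 24B": 24,
    "Zen Coder 123B": 128,
    "Zen Coder MAX": 180,
    "Zen Coder ULTRA": 400,
}


def models_fitting(vram_gb: float) -> list[str]:
    """Return the models whose QLoRA requirement is at most vram_gb."""
    return [name for name, need in QLORA_VRAM_GB.items() if need <= vram_gb]


print(models_fitting(24))
print(models_fitting(128))
```

With a 128 GB budget this lists the 4B, 24B, and 123B models; MAX and ULTRA need more.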
Installation
# Basic installation
pip install zen-trainer
# Extras are quoted so that shells such as zsh do not expand the brackets
# With MLX support (Apple Silicon)
pip install "zen-trainer[mlx]"
# With Unsloth (2x faster NVIDIA training)
pip install "zen-trainer[unsloth]"
# With DeepSpeed (multi-GPU)
pip install "zen-trainer[deepspeed]"
# With evaluation suite
pip install "zen-trainer[eval]"
# Everything
pip install "zen-trainer[all]"
Quick Start
Training
from zen_trainer import ZenTrainer

# Train Zen Coder 4B on your data
trainer = ZenTrainer(
    model_key="qwen3-4b",
    dataset_path="path/to/your/dataset",
    output_dir="./output/zen-coder-4b",
)
trainer.train()
Benchmarking
from zen_trainer import ZenBenchmark

# Benchmark against SoTA models
bench = ZenBenchmark(
    model_path="./output/zen-coder-4b",
    model_key="qwen3-4b",
)
results = bench.run_all()
bench.compare_to_baseline()
Command Line
# Train
zen-train --model qwen3-4b --dataset ./data --output ./output
# Benchmark
zen-benchmark --model ./output/zen-coder-4b --suite all
Training Costs
Estimated training cost for 3.35M samples (8.47B tokens) on 8xH200 at $35/hr:
| Model | Cloud Hours | Cloud Cost | Local (Mac Studio 512GB) |
|---|---|---|---|
| Zen Coder 4B | 9h | $326 | 2 days (FREE) |
| Zen Coder 24B | 23h | $814 | 5 days (FREE) |
| Zen Coder 123B | 62h | $2,171 | 13 days (FREE) |
| Zen Coder MAX | 116h | $4,071 | 19 days (FREE) |
| Zen Coder ULTRA | 310h | $10,856 | Too large |
Legend: ✓ = fits in 128 GB (M3 Ultra or a single GPU node); ◆ = fits a Mac Studio 512 GB or 8xH200.
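The cloud cost column is cloud hours multiplied by the $35/hr rate. The hours in the table appear to be rounded, so recomputing from them gives slightly lower figures than the ones printed (about $315 rather than $326 for the 4B row). A minimal sketch of the arithmetic, with `cloud_cost` as an illustrative helper rather than a zen-trainer function:

```python
RATE_USD_PER_HOUR = 35  # 8xH200 rate quoted in the Training Costs section

# Cloud hours as printed in the table (whole hours).
CLOUD_HOURS = {"4B": 9, "24B": 23, "123B": 62, "MAX": 116, "ULTRA": 310}


def cloud_cost(hours: float, rate: float = RATE_USD_PER_HOUR) -> float:
    """Cost in USD for a run of the given length at an hourly rate."""
    return hours * rate


for model, hours in CLOUD_HOURS.items():
    print(f"{model}: {hours}h -> ${cloud_cost(hours):,.0f}")
```

The same function can be used to re-estimate costs at a different hourly rate by passing `rate=`.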
Backends
With backend="auto" (the default), the trainer picks a backend based on the available hardware:
| Backend | Hardware | Relative Speed | Models |
|---|---|---|---|
| MLX | Apple Silicon | 1x | 4B, 24B, MAX |
| Unsloth | NVIDIA GPU | 2x | 4B, 24B, 123B, MAX |
| DeepSpeed | Multi-GPU | 1x | All (required for ULTRA) |
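The selection logic itself is not documented here. As a rough sketch of a rule that is consistent with the table, the hypothetical `pick_backend` function below maps a hardware description and a model size to a backend name. The function, its parameters, and the order of the checks are assumptions for illustration, not zen-trainer's implementation.

```python
# Model sizes each backend supports, taken from the Backends table.
# "All" in the DeepSpeed row is expanded to the five documented sizes.
SUPPORTED = {
    "mlx": {"4B", "24B", "MAX"},
    "unsloth": {"4B", "24B", "123B", "MAX"},
    "deepspeed": {"4B", "24B", "123B", "MAX", "ULTRA"},
}


def pick_backend(model: str, apple_silicon: bool, nvidia_gpus: int) -> str:
    """Hypothetical selection rule; the real 'auto' logic may differ."""
    if apple_silicon:
        candidate = "mlx"
    elif nvidia_gpus > 1:
        candidate = "deepspeed"  # assumption: multi-GPU always means DeepSpeed
    elif nvidia_gpus == 1:
        candidate = "unsloth"
    else:
        raise ValueError("no supported accelerator found")

    if model not in SUPPORTED[candidate]:
        raise ValueError(f"{candidate} does not support {model}")
    return candidate


print(pick_backend("4B", apple_silicon=True, nvidia_gpus=0))
print(pick_backend("ULTRA", apple_silicon=False, nvidia_gpus=8))
```

Under these assumptions ULTRA only resolves on a multi-GPU machine, which matches the table's note that DeepSpeed is required for it.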
12 ARC Benchmarks
The evaluation suite follows the GLM-4.5 benchmark set: 12 benchmarks grouped into Agentic, Reasoning, and Coding (ARC) tasks:
Agentic:
- TAU-Bench (tool-agent-user interaction)
- BFCL V3 (Berkeley Function Call Leaderboard)
- BrowseComp (web browsing agent)
Reasoning:
- MMLU-Pro, AIME-24, MATH-500, SciCode
- GPQA, HLE (Humanity's Last Exam)
- LiveCodeBench
Coding:
- SWE-bench Verified (real GitHub issues)
- Terminal-Bench (terminal environment tasks)
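For scripting against the suite, the grouping can be written down as a plain mapping. The dictionary name and the exact benchmark strings below are illustrative; the identifiers zen-trainer accepts may be spelled differently.

```python
# Benchmarks per ARC category, as listed in this section.
ARC_BENCHMARKS = {
    "agentic": ["TAU-Bench", "BFCL V3", "BrowseComp"],
    "reasoning": [
        "MMLU-Pro", "AIME-24", "MATH-500", "SciCode",
        "GPQA", "HLE", "LiveCodeBench",
    ],
    "coding": ["SWE-bench Verified", "Terminal-Bench"],
}

# The three groups together should account for all 12 benchmarks.
total = sum(len(names) for names in ARC_BENCHMARKS.values())
print(total)  # 12
```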
Dataset
Models are designed for the Zen Agentic Dataset:
| Metric | Value |
|---|---|
| Total Tokens | 8.47 billion |
| Training Samples | 3.35 million |
| Validation Samples | 100,000 |
| Size | ~27 GB |
| Time Span | 15 years (2010-2025) |
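A quick sanity check on these figures: dividing tokens by samples gives the average sample length, which can be compared with the context windows of the supported models. The sketch assumes the token total refers to the training samples and that "~27 GB" means decimal gigabytes; neither is stated in the table.

```python
TOTAL_TOKENS = 8.47e9   # Total Tokens
TRAIN_SAMPLES = 3.35e6  # Training Samples
SIZE_BYTES = 27e9       # "~27 GB", taken as decimal gigabytes (assumption)

tokens_per_sample = TOTAL_TOKENS / TRAIN_SAMPLES
bytes_per_token = SIZE_BYTES / TOTAL_TOKENS

print(f"{tokens_per_sample:.0f} tokens per training sample")
print(f"{bytes_per_token:.2f} bytes per token")
```

At roughly 2,500 tokens on average, a typical sample sits well inside even the 32K context of the smallest model, although individual samples may be much longer.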
Data Composition
- 29% Claude Code debug sessions (real agentic programming)
- 23% Claude conversations and interactions
- 48% Git history (commits, diffs, source files)
Model Architecture
Hyperparameters
| Model | LoRA r | LoRA α | Batch | LR | Epochs |
|---|---|---|---|---|---|
| 4B | 64 | 128 | 4 | 2e-4 | 2 |
| 24B | 32 | 64 | 2 | 1e-4 | 2 |
| 123B | 16 | 32 | 1 | 5e-5 | 1 |
| MAX | 16 | 32 | 1 | 5e-6 | 1 |
| ULTRA | 8 | 16 | 1 | 1e-6 | 1 |
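In standard LoRA the adapter update is scaled by α / r, so the two columns are best read together. In every row of the table α is twice r, which keeps that scale at 2 while the rank shrinks for larger models. A quick check, using an illustrative `LORA_SETTINGS` dictionary copied from the table:

```python
# (rank r, alpha) per model, copied from the hyperparameter table.
LORA_SETTINGS = {
    "4B": (64, 128),
    "24B": (32, 64),
    "123B": (16, 32),
    "MAX": (16, 32),
    "ULTRA": (8, 16),
}

# LoRA scales the low-rank update by alpha / r.
scales = {model: alpha / r for model, (r, alpha) in LORA_SETTINGS.items()}

for model, scale in scales.items():
    r, alpha = LORA_SETTINGS[model]
    print(f"{model}: r={r}, alpha={alpha}, scale={scale}")
```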
API Reference
ZenTrainer
from zen_trainer import ZenTrainer

trainer = ZenTrainer(
    model_key="qwen3-4b",                 # str: model identifier (qwen3-4b, devstral-24b, etc.)
    dataset_path="path/to/your/dataset",  # str: HuggingFace dataset or local path
    output_dir="./output/zen-coder-4b",   # str: output directory for checkpoints
    backend="auto",                       # str: mlx, unsloth, deepspeed, or auto (default)
    epochs=None,                          # int or None: override default epochs
    batch_size=None,                      # int or None: override default batch size
    learning_rate=None,                   # float or None: override default learning rate
)
trainer.train()            # Start training
trainer.save_model()       # Save final checkpoint
trainer.push_to_hub(repo)  # Push to HuggingFace; repo is supplied by the caller
ZenBenchmark
from zen_trainer import ZenBenchmark

bench = ZenBenchmark(
    model_path="./output/zen-coder-4b",  # str: path to model checkpoint
    model_key="qwen3-4b",                # str: model identifier for config
    benchmarks=None,                     # list or None: specific benchmarks, or None for all
)
results = bench.run_all()    # Run all 12 ARC benchmarks
bench.run_agentic()          # TAU-Bench, BFCL V3, BrowseComp
bench.run_reasoning()        # MMLU-Pro, AIME-24, MATH-500, etc.
bench.run_coding()           # SWE-bench Verified, Terminal-Bench
bench.compare_to_baseline()  # Compare to SoTA models
Model Configurations
from zen_trainer import get_model_config, list_models_by_vram, estimate_training_cost

# Get model config
cfg = get_model_config("glm47-358b")
print(cfg.vram_qlora)  # 180 GB

# List models for your hardware
models = list_models_by_vram(128)  # Models that fit 128GB

# Estimate training cost
cost = estimate_training_cost("devstral-24b", num_samples=100000)
print(cost)  # {'hours_estimate': (2.0, 4.0), 'cost_estimate_usd': (70, 140), ...}
Related Projects
- Zen Agentic Dataset - Training data
- Zen Coder Models - Fine-tuned models
- GLM Simple Evals - Evaluation toolkit
- Hanzo MCP - Model Context Protocol tools
- Hanzo AI - AI infrastructure platform
Citation
@software{zen_trainer,
  author = {Kelling, Zach},
  title = {Zen Trainer: Fine-tuning Framework for Agentic Coding Models},
  year = {2025},
  publisher = {Zoo Labs Foundation},
  url = {https://github.com/zenlm/zen-trainer}
}
License
Apache 2.0
Maintainer: z@hanzo.ai