MLX-powered LLM fine-tuning for Apple Silicon, with an Unsloth-compatible API
Unsloth-MLX
Fine-tune LLMs on your Mac with Apple Silicon
Prototype locally, scale to cloud. Same code, just change the import.
[!NOTE] Why I Built This (A Personal Note)
I rely on Unsloth for my daily fine-tuning on cloud GPUs—it's the gold standard for me. But recently, I started working on a MacBook M4 and hit a friction point: I wanted to prototype locally on my Mac, then scale up to the cloud without rewriting my entire training script.
Since Unsloth relies on Triton (which Macs don't have, yet), I couldn't use it locally. I built unsloth-mlx to solve this specific "Context Switch" problem. It wraps Apple's native MLX framework in an Unsloth-compatible API.
The goal isn't to replace Unsloth or claim superior performance. The goal is code portability: allowing you to write FastLanguageModel code once on your Mac, test it, and then push that exact same script to a CUDA cluster. It solves a workflow problem, not just a hardware one.
This is an "unofficial" project built by a fan, for fans who happen to use Macs. It's helping me personally, and if it helps others like me, then I'll have my satisfaction.
Why Unsloth-MLX?
Bringing the Unsloth experience to Mac users via Apple's MLX framework.
- 🚀 Fine-tune LLMs locally on your Mac (M1/M2/M3/M4/M5)
- 💾 Leverage unified memory (up to 512GB on Mac Studio)
- 🔄 Same API as Unsloth - your existing code just works!
- 📦 Export anywhere - HuggingFace format, GGUF for Ollama/llama.cpp
# Unsloth (CUDA)                        # Unsloth-MLX (Apple Silicon)
from unsloth import FastLanguageModel   from unsloth_mlx import FastLanguageModel
from trl import SFTTrainer              from unsloth_mlx import SFTTrainer
# Rest of your code stays exactly the same!
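If you would rather not edit the import by hand when moving between machines, one option is to choose the package at runtime. This is a minimal sketch under that assumption, not something the project ships: `pick_backend` and `load_backend` are hypothetical helper names, and the only package facts used are the import lines from the comparison above.

```python
import importlib
import platform


def pick_backend(system=None, machine=None):
    """Return the module name to import for the current (or given) platform."""
    system = system or platform.system()
    machine = machine or platform.machine()
    # Apple Silicon Macs report Darwin/arm64; everything else falls back to Unsloth.
    if system == "Darwin" and machine == "arm64":
        return "unsloth_mlx"
    return "unsloth"


def load_backend():
    """Import the selected package and return (FastLanguageModel, SFTTrainer)."""
    name = pick_backend()
    backend = importlib.import_module(name)
    if name == "unsloth_mlx":
        trainer_cls = backend.SFTTrainer
    else:
        # On CUDA, SFTTrainer comes from TRL, as in the comparison above.
        trainer_cls = importlib.import_module("trl").SFTTrainer
    return backend.FastLanguageModel, trainer_cls
```

Calling `FastLanguageModel, SFTTrainer = load_backend()` at the top of a script would then leave the rest of the code untouched on both platforms.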
What This Is (and Isn't)
This is NOT a replacement for Unsloth or an attempt to compete with it. Unsloth is incredible - it's the gold standard for efficient LLM fine-tuning on CUDA.
This IS a bridge for Mac users who want to:
- 🧪 Prototype locally - Experiment with fine-tuning before committing to cloud GPU costs
- 📚 Learn & iterate - Develop your training pipeline with fast local feedback loops
- 🔄 Then scale up - Move to cloud NVIDIA GPUs + original Unsloth for production training
| Local Mac (Unsloth-MLX) → | Cloud GPU (Unsloth) |
|---|---|
| Prototype & experiment | Full-scale training |
| Small datasets | Large datasets |
| Quick iterations | Production runs |
Project Status
🚀 v0.3.2 - Full dataset handling + multi-turn conversations!
| Feature | Status | Notes |
|---|---|---|
| SFT Training | ✅ Stable | Native MLX training |
| Model Loading | ✅ Stable | Any HuggingFace model |
| Save/Export | ✅ Stable | HF format, GGUF |
| DPO Training | ✅ Stable | Full DPO loss |
| ORPO Training | ✅ Stable | Full ORPO loss |
| GRPO Training | ✅ Stable | Multi-generation + reward |
| KTO/SimPO | ✅ Stable | Proper loss implementations |
| Chat Templates | ✅ Stable | 15 models (llama, gemma, qwen, phi, mistral) |
| Response-Only Training | ✅ Stable | train_on_responses_only() |
| Multi-turn Merging | ✅ NEW | to_sharegpt() + conversation_extension |
| Column Mapping | ✅ NEW | apply_column_mapping() auto-rename |
| Dataset Config | ✅ NEW | HFDatasetConfig structured loading |
| Vision Models | ⚠️ Beta | Via mlx-vlm |
| PyPI Package | ✅ Available | uv pip install unsloth-mlx |
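The table lists to_sharegpt() and conversation_extension for multi-turn merging but does not show what that transformation looks like. The sketch below illustrates the general idea only: Alpaca-style rows become ShareGPT-style conversations, and several single-turn conversations are concatenated into one multi-turn sample. `alpaca_to_sharegpt` and `merge_turns` are hypothetical names, and grouping consecutive rows is an assumption made for determinism; the library's own helpers may select and format rows differently.

```python
def alpaca_to_sharegpt(row):
    """Convert one Alpaca-style row into a two-message ShareGPT conversation."""
    prompt = row["instruction"]
    if row.get("input"):
        prompt = f"{prompt}\n\n{row['input']}"
    return {
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": row["output"]},
        ]
    }


def merge_turns(conversations, extension=2):
    """Concatenate every `extension` single-turn conversations into one sample."""
    merged = []
    for start in range(0, len(conversations), extension):
        turns = []
        for conv in conversations[start:start + extension]:
            turns.extend(conv["conversations"])
        merged.append({"conversations": turns})
    return merged


# Toy rows, made up for illustration.
rows = [
    {"instruction": "Add the numbers.", "input": "2 and 3", "output": "5"},
    {"instruction": "Name a prime.", "input": "", "output": "7"},
    {"instruction": "Say hi.", "input": "", "output": "Hi!"},
]
samples = merge_turns([alpaca_to_sharegpt(r) for r in rows], extension=2)
```

With three rows and `extension=2`, the first sample holds four messages (two turns) and the leftover row becomes a single-turn sample.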
Installation
# Using uv (recommended - faster and more reliable)
uv pip install unsloth-mlx
# Or using pip
pip install unsloth-mlx
# From source (for development)
git clone https://github.com/ARahim3/unsloth-mlx.git
cd unsloth-mlx
uv pip install -e .
Quick Start
from unsloth_mlx import FastLanguageModel, SFTTrainer, SFTConfig
from datasets import load_dataset
# Load any HuggingFace model (1B model for quick start)
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="mlx-community/Llama-3.2-1B-Instruct-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Add LoRA adapters
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
# Load a dataset (or create your own)
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:100]")
# Train with SFTTrainer (same API as TRL!)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    tokenizer=tokenizer,
    args=SFTConfig(
        output_dir="outputs",
        per_device_train_batch_size=2,
        learning_rate=2e-4,
        max_steps=50,
    ),
)
trainer.train()
# Save (same API as Unsloth!)
model.save_pretrained("lora_model") # Adapters only
model.save_pretrained_merged("merged", tokenizer) # Full model
model.save_pretrained_gguf("model", tokenizer, quantization_method="q4_k_m") # GGUF
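To get a feel for what r=16 and lora_alpha=16 mean in the Quick Start, the arithmetic below counts the trainable parameters a LoRA adapter adds and computes the scaling factor from the standard LoRA formulation. It is a rough sketch: the projection shapes are assumptions chosen for illustration, and how unsloth-mlx maps lora_alpha onto MLX's own scale setting is not covered here.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def lora_scale(lora_alpha, r):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / r


# Hypothetical (d_in, d_out) shapes per target module, for illustration only.
shapes = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 512),
    "v_proj": (2048, 512),
    "o_proj": (2048, 2048),
}
per_layer = sum(lora_param_count(d_in, d_out, r=16) for d_in, d_out in shapes.values())
scale = lora_scale(lora_alpha=16, r=16)
```

Under these assumed shapes, the four adapters add 212,992 trainable parameters per transformer layer, and the scale is 1.0 because lora_alpha equals r.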
Chat Templates & Response-Only Training
from unsloth_mlx import get_chat_template, train_on_responses_only
# Apply chat template (supports llama-3, gemma, qwen, phi, mistral, etc.)
tokenizer = get_chat_template(tokenizer, chat_template="llama-3")
# Or auto-detect from model name
tokenizer = get_chat_template(tokenizer, chat_template="auto")
# Compute the loss only on assistant responses, not on prompt tokens
trainer = train_on_responses_only(
    trainer,
    instruction_part="<|start_header_id|>user<|end_header_id|>\n\n",
    response_part="<|start_header_id|>assistant<|end_header_id|>\n\n",
)
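Conceptually, response-only training means prompt tokens are excluded from the loss while response tokens are kept. The sketch below shows that idea on plain token-id lists; it is not the library's implementation. `find_all` and `mask_prompt_tokens` are hypothetical names, and the `-100` ignore value is the Hugging Face/PyTorch convention, assumed here for illustration (an MLX implementation may use a boolean mask instead).

```python
IGNORE_INDEX = -100  # label value conventionally skipped by cross-entropy losses


def find_all(tokens, pattern):
    """Return every start index where `pattern` occurs in `tokens`."""
    n = len(pattern)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == pattern]


def mask_prompt_tokens(tokens, instruction_ids, response_ids):
    """Build labels that keep only the tokens following a response marker."""
    labels = [IGNORE_INDEX] * len(tokens)
    instruction_starts = find_all(tokens, instruction_ids)
    for start in find_all(tokens, response_ids):
        begin = start + len(response_ids)
        # The response runs until the next instruction marker, or to the end.
        later = [i for i in instruction_starts if i >= begin]
        end = later[0] if later else len(tokens)
        labels[begin:end] = tokens[begin:end]
    return labels


# Toy ids: 1 = user marker, 2 = assistant marker, everything else is content.
tokens = [1, 10, 11, 2, 20, 21, 1, 12, 2, 22]
labels = mask_prompt_tokens(tokens, instruction_ids=[1], response_ids=[2])
```

In the toy sequence, only the ids 20, 21 and 22 (the assistant's content) survive in `labels`; markers and user content are masked out.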
Supported Training Methods
| Method | Trainer | Implementation | Use Case |
|---|---|---|---|
| SFT | SFTTrainer | ✅ Native MLX | Instruction fine-tuning |
| DPO | DPOTrainer | ✅ Native MLX | Preference learning (proper log-prob loss) |
| ORPO | ORPOTrainer | ✅ Native MLX | Combined SFT + odds ratio preference |
| GRPO | GRPOTrainer | ✅ Native MLX | Reasoning with multi-generation (DeepSeek R1 style) |
| KTO | KTOTrainer | ✅ Native MLX | Kahneman-Tversky optimization |
| SimPO | SimPOTrainer | ✅ Native MLX | Simple preference optimization |
| VLM | VLMSFTTrainer | ⚠️ Beta | Vision-Language models |
Examples
Check examples/ for working code:
- Basic model loading and inference
- Complete SFT fine-tuning pipeline
- RL training methods (DPO, GRPO, ORPO)
Requirements
- Hardware: Apple Silicon Mac (M1/M2/M3/M4/M5)
- OS: macOS 13.0+ (15.0+ recommended for large models)
- Memory: 16GB+ unified RAM (32GB+ for 7B+ models)
- Python: 3.9+
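The memory guidance above can be sanity-checked with back-of-the-envelope arithmetic for the weights alone. This is a rough sketch: `weight_memory_gb` is a hypothetical helper, and it ignores quantization scales, activations, LoRA optimizer state, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


fp16_7b = weight_memory_gb(7e9, 16)  # 7B parameters stored as 16-bit weights
q4_7b = weight_memory_gb(7e9, 4)     # the same model with 4-bit quantized weights
```

A 7B model needs roughly 14 GB for 16-bit weights and about 3.5 GB at 4 bits, which helps explain why 4-bit checkpoints are used for local experiments and why headroom beyond the weights matters.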
Comparison with Unsloth
| Feature | Unsloth (CUDA) | Unsloth-MLX |
|---|---|---|
| Platform | NVIDIA GPUs | Apple Silicon |
| Backend | Triton Kernels | MLX Framework |
| Memory | VRAM (limited) | Unified (up to 512GB) |
| API | Original | 100% Compatible |
| Best For | Production training | Local dev, large models |
Contributing
Contributions welcome! Areas that need help:
- Custom MLX kernels for even faster training
- More comprehensive test coverage
- Documentation and examples
- Testing on different M-series chips (M1, M2, M3, M4, M5)
- VLM training improvements
License
Apache 2.0 - See LICENSE file.
Acknowledgments
- Unsloth - The original, incredible CUDA library
- MLX - Apple's ML framework
- MLX-LM - LLM utilities for MLX
- MLX-VLM - Vision model support
Community project, not affiliated with Unsloth AI or Apple.
⭐ Star this repo if you find it useful!