Apple Silicon MLX fine-tuning toolkit — SFT, DPO/ORPO, GRPO, distillation, and OpenAI-compatible serving.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

MLXSmith

Fine-tune language models on Apple Silicon. SFT, preference optimization, reinforcement learning, distillation, and serving — all native to MLX.

Status: Alpha (v0.1.9) · Validated on Qwen3-4B and Qwen3-1.7B

Features

Supervised fine-tuning — LoRA and QLoRA with configurable optimizers
Preference optimization — DPO, ORPO, IPO, CPO, SimPO, and more
Reinforcement learning — GRPO with verifier-based rewards
Knowledge distillation — Offline and online preference distillation
KTO — Kahneman-Tversky Optimization from binary feedback
Online DPO — Live preference tuning with LLM judge scoring
Self-verification training — Policy gradient from self-assessed rewards
Synthetic data generation — Generate, evolve, and filter training data
External model backends — Use Codex, Claude, Gemini CLIs or any OpenAI-compatible API for data generation and judging
Recursive training — Self-improving RLM loop with task generation and gating
Serving — OpenAI-compatible API with streaming
Web dashboard (Next.js) — Models, adapters, training, eval, chat, and serving UI
Environment plugins — Reusable task and verifier packages for RL training
Experimental mHC adapters — Optional block-local mHC patching for MLX transformer blocks (not a speedup)

Requirements

macOS with Apple Silicon (M1 or later)
Python 3.10+

Data tools, configuration, and project scaffolding work on any platform.

Install

pip install "mlxsmith[all]"

Selective install

# Core only (data tools, config, scaffolding)
pip install mlxsmith

# Apple Silicon training
pip install "mlxsmith[mlx,llm]"

# Training + serving
pip install "mlxsmith[mlx,llm,serve]"

Quickstart

# 1. Create a project
mlxsmith init myproj && cd myproj

# 2. Verify your environment
mlxsmith doctor

# 3. Pull a model
mlxsmith pull mlx-community/Qwen3-4B-Instruct-2507-4bit

# 4. Pull training data
mlxsmith data pull --preset alpaca

# 5. Fine-tune
mlxsmith sft \
  --model cache/mlx/mlx-community__Qwen3-4B-Instruct-2507-4bit \
  --data data/sft

# 6. Serve the result
mlxsmith serve --model runs/sft_0001/adapter --port 8080

See Getting Started for a complete walkthrough.

End-to-end Smoke (Qwen3-1.7B)

This repo includes an end-to-end smoke run that validates the full pipeline (SFT → Pref → RFT → RLM) on Qwen/Qwen3-1.7B-MLX-4bit.

mlxsmith pull Qwen/Qwen3-1.7B-MLX-4bit
./scripts/exp_qwen3_1.7b_mlx_4bit_e2e_smoke.sh

# Optional: also smoke-test `mlxsmith serve` + OpenAI-compatible endpoint
SMOKE_SERVE=1 ./scripts/exp_qwen3_1.7b_mlx_4bit_e2e_smoke.sh

The smoke run uses qwen3_1.7b_mlx_4bit_smoke.yaml and the tiny datasets in data/sft and data/prefs.

Repo SFT (Qwen3-1.7B)

To build a small “MLXSmith repo assistant” adapter on top of Qwen/Qwen3-1.7B-MLX-4bit, use the repo-grounded SFT script:

# 1) Generate seed prompts from the repo
python3 scripts/make_repo_seed_prompts.py --out data/mlxsmith_prompts.jsonl

# 2) Generate responses (via Codex) + train LoRA
NUM=300 BATCH=4 ITERS=2000 LR=2e-4 ./scripts/exp_qwen3_1.7b_mlx_4bit_repo_sft.sh

Notes:

The script uses codex exec by default. Override with MLXSMITH_CLI_CODEX_CMD if needed.
Qwen output sanitization is enabled in the included configs via infer.strip_think: true.

Web Dashboard (Optional)

Run the API server, then start the Next.js dashboard:

# Terminal 1: start the OpenAI-compatible API
mlxsmith serve --model cache/mlx/mlx-community__Qwen3-4B-Instruct-2507-4bit --port 8080

# Terminal 2: start the dashboard
cd apps/web
npm install
npm run dev

The dashboard defaults to http://localhost:8080 for the API base URL (change in Settings if needed).

Training Modes

Mode	Command	Input Format	Use Case
SFT	`mlxsmith sft`	`{prompt, response}`	Instruction-following via LoRA
Preference	`mlxsmith pref`	`{prompt, chosen, rejected}`	Alignment with DPO, ORPO, and others
KTO	`mlxsmith kto`	`{prompt, response, label}`	Binary good/bad feedback
GRPO	`mlxsmith rft`	Environment + verifier	Reward-driven reinforcement learning
Online DPO	`mlxsmith online-dpo`	`{prompt}`	Online preference with LLM judge
Self-verify	`mlxsmith self-verify`	`{prompt}`	Self-verification reward signal
Distillation	`mlxsmith distill`	`{prompt}`	Teacher-to-student transfer
Judge	`mlxsmith judge`	Judge-format data	Train a scoring model
Pipeline	`mlxsmith pipeline`	Combined	SFT then Pref then RFT then RLM

See Concepts for an explanation of each training mode.

Tools

Tool	Command	Description
Data	`mlxsmith data`	Import, split, validate, and pull datasets
Synthetic	`mlxsmith synthetic`	Generate and evolve training data
Eval	`mlxsmith eval`	Run evaluation suites with pass@k
Bench	`mlxsmith bench`	Benchmark inference and training throughput
Serve	`mlxsmith serve`	OpenAI-compatible model server
RLM	`mlxsmith rlm`	Recursive training loop + REPL-based inference

External Model Backends

MLXSmith can use powerful cloud models for synthetic data generation and judging while keeping fine-tuning local on Apple Silicon.

Supported backends:

cli — shell out to Codex/Claude/Gemini CLIs (or any command you provide)
openai — call any OpenAI-compatible Chat Completions endpoint

Note: training commands (sft, pref, rft, rlm loop) still require a local training backend like mlx-lm.

CLI Backend — Shell out to Codex, Claude, or Gemini CLIs:

# Use a CLI model for prompt generation
export MLXSMITH__MODEL__BACKEND=cli
export MLXSMITH_CLI_CODEX_CMD='codex exec --full-auto --model gpt-5.2'

# If your CLI expects the prompt as an argument instead of stdin:
# export MLXSMITH_CLI_PROMPT_FLAG='--prompt'

mlxsmith synthetic prompts \
  --model codex \
  --seed-prompts data/seeds.jsonl \
  --num 100 \
  --out data/prompts.jsonl

# Use a CLI model as judge for filtering
mlxsmith synthetic sft \
  --model codex \
  --judge-backend cli \
  --judge-model claude \
  --prompts data/prompts.jsonl \
  --out data/sft.jsonl

OpenAI Backend — Use any OpenAI-compatible API:

export MLXSMITH__MODEL__BACKEND=openai
export OPENAI_API_KEY="sk-..."
export MLXSMITH_API_BASE="https://api.openai.com/v1"  # or any compatible endpoint

mlxsmith synthetic prompts \
  --model gpt-4o \
  --out data/prompts.jsonl

This enables cloud-quality data generation with local training — use frontier models to create and filter training data, then fine-tune efficiently on your Mac.

Documentation

Section	Description
Getting Started	Full setup walkthrough
Concepts	Training modes explained
CLI Reference	All commands with examples
Verifiers	Verifier API and composition
Environments	Task environment plugins
Project Format	Run artifacts and layout
Configuration	Config system and options
Compatibility	Tested versions and models
Troubleshooting	Common issues and fixes
FAQ	Frequently asked questions
Contributing	How to contribute and run tests
Changelog	Release notes

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

hmbown

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.9

Feb 5, 2026

0.1.8

Feb 4, 2026

0.1.7

Feb 3, 2026

0.1.3

Feb 2, 2026

0.1.2

Feb 2, 2026

0.1.1

Feb 2, 2026

0.1.0

Feb 2, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mlxsmith-0.1.9.tar.gz (177.4 kB view details)

Uploaded Feb 5, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

mlxsmith-0.1.9-py3-none-any.whl (193.8 kB view details)

Uploaded Feb 5, 2026 Python 3

File details

Details for the file mlxsmith-0.1.9.tar.gz.

File metadata

Download URL: mlxsmith-0.1.9.tar.gz
Upload date: Feb 5, 2026
Size: 177.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlxsmith-0.1.9.tar.gz
Algorithm	Hash digest
SHA256	`6ecdb8c42c86aa893b2c3e7bffbac9d9cd549d0b75e3a66f32d64ea23a7f1849`
MD5	`3addb6c410f87134bf89239c4934fad8`
BLAKE2b-256	`5014f47eb398d353202ae46bf206a6e7602cffb879057f7d605d90922e9deb89`

See more details on using hashes here.

Provenance

The following attestation bundles were made for mlxsmith-0.1.9.tar.gz:

Publisher: publish.yml on Hmbown/MLXSmith

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlxsmith-0.1.9.tar.gz
- Subject digest: 6ecdb8c42c86aa893b2c3e7bffbac9d9cd549d0b75e3a66f32d64ea23a7f1849
- Sigstore transparency entry: 919656328
- Sigstore integration time: Feb 5, 2026
Source repository:
- Permalink: Hmbown/MLXSmith@5c33e1c2d4019bed7e7708f103dadb7cbd223d61
- Branch / Tag: refs/tags/v0.1.9
- Owner: https://github.com/Hmbown
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5c33e1c2d4019bed7e7708f103dadb7cbd223d61
- Trigger Event: release

File details

Details for the file mlxsmith-0.1.9-py3-none-any.whl.

File metadata

Download URL: mlxsmith-0.1.9-py3-none-any.whl
Upload date: Feb 5, 2026
Size: 193.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for mlxsmith-0.1.9-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9efa8f224a42e5cb44c2daae69d7f01e0817f752d43a9fb0df376a34e6b65a38`
MD5	`6be61081c0ea76e19ed44d8b5f487e43`
BLAKE2b-256	`56f96e667d7c0b1a2fcb849391ba9bc662d317211e7f779bfdb5680148dd1a87`

See more details on using hashes here.

Provenance

The following attestation bundles were made for mlxsmith-0.1.9-py3-none-any.whl:

Publisher: publish.yml on Hmbown/MLXSmith

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: mlxsmith-0.1.9-py3-none-any.whl
- Subject digest: 9efa8f224a42e5cb44c2daae69d7f01e0817f752d43a9fb0df376a34e6b65a38
- Sigstore transparency entry: 919656331
- Sigstore integration time: Feb 5, 2026
Source repository:
- Permalink: Hmbown/MLXSmith@5c33e1c2d4019bed7e7708f103dadb7cbd223d61
- Branch / Tag: refs/tags/v0.1.9
- Owner: https://github.com/Hmbown
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5c33e1c2d4019bed7e7708f103dadb7cbd223d61
- Trigger Event: release

mlxsmith 0.1.9

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

MLXSmith

Features

Requirements

Install

Quickstart

End-to-end Smoke (Qwen3-1.7B)

Repo SFT (Qwen3-1.7B)

Web Dashboard (Optional)

Training Modes

Tools

External Model Backends

Documentation

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance