lmxlab
A research platform for language model experimentation on Apple Silicon.
Why lmxlab?
Most transformer implementations optimize for production at the cost of readability. lmxlab takes the opposite approach: every layer is implemented from scratch in MLX and written to be read, so you can see exactly what each component does and iterate on ideas quickly.
The core insight is that GPT, LLaMA, DeepSeek, Mamba, and dozens of other architectures are not fundamentally different models. They are different configurations of the same building blocks: attention, SSMs, feed-forward networks, normalization, and positional encoding. lmxlab makes this concrete by using config factories instead of class hierarchies.
```python
from lmxlab.models.llama import llama_config
from lmxlab.models.deepseek import deepseek_config
from lmxlab.models.base import LanguageModel

# Same LanguageModel class, different configs
llama = LanguageModel(llama_config(d_model=512, n_heads=8, n_kv_heads=4, n_layers=6))
deepseek = LanguageModel(deepseek_config(d_model=512, n_heads=8, n_layers=6, kv_lora_rank=64))
```
No subclassing. No LlamaModel vs DeepSeekModel. One LanguageModel class, assembled from registry components based on what the config asks for.
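To make the registry idea concrete, here is a schematic sketch of what a config factory amounts to. The dataclass shape and field names below are illustrative assumptions for exposition, not lmxlab's actual config schema:

```python
from dataclasses import dataclass

# Illustrative only — lmxlab's real config schema may differ.
@dataclass
class SketchConfig:
    d_model: int
    n_layers: int
    attention: str  # registry key, e.g. "gqa", "mla", "mamba2"
    ffn: str        # registry key, e.g. "swiglu", "moe"
    norm: str       # registry key, e.g. "rmsnorm"

def llama_style_config(d_model: int, n_layers: int) -> SketchConfig:
    """LLaMA flavor = GQA attention + SwiGLU FFN + RMSNorm."""
    return SketchConfig(d_model, n_layers, attention="gqa", ffn="swiglu", norm="rmsnorm")

def deepseek_style_config(d_model: int, n_layers: int) -> SketchConfig:
    """DeepSeek flavor swaps in MLA attention; everything else is shared."""
    return SketchConfig(d_model, n_layers, attention="mla", ffn="swiglu", norm="rmsnorm")
```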
What's included
- 24 architectures as config factories: GPT, LLaMA, Gemma, Gemma 3 (sliding window), Qwen, Qwen 3 MoE, Qwen 3.5 (hybrid DeltaNet), Qwen-Next (gated attention), Mixtral (MoE), DeepSeek V2/V3 (MLA + MoE), Nemotron (hybrid Mamba-Transformer MoE), Llama 4 Scout/Maverick (iRoPE + chunked attention), Mistral Small (sliding window), OLMo 2 (QK-norm), GPT-OSS (QK-norm), Grok (SharedExpertMoE), Kimi K2.5 (DeltaNet + MoE), SmolLM3 (iRoPE), Falcon H1 (hybrid Mamba-2), Jamba (Mamba-2 + MoE), Bamba (hybrid Mamba-2), GLM-4.5 (MLA NoPE)
- Building blocks: MHA, GQA, MLA, GatedGQA, SlidingWindowGQA, ChunkedGQA, SparseGQA (DSA), Mamba-2 SSD, Mamba-3 (trapezoidal), GatedDeltaNet, MoE, SharedExpertMoE, LatentMoE, QK-norm, SwiGLU, squared ReLU
- Compiled training: mx.compile, functional gradients, gradient clipping, cosine schedules, dropout, muP parameterization (the underlying MLX idiom is sketched after this list)
- Advanced training: DPO, GRPO, multi-token prediction, curriculum learning, knowledge distillation
- LoRA & QLoRA: parameter-efficient fine-tuning with optional 4-bit quantization
- Inference: autoregressive generation, speculative decoding, best-of-N sampling, beam search, reward model scoring
- HuggingFace integration: load pretrained weights from the Hub
- Experiment framework: time/FLOP-budgeted runs, MLflow tracking, results logging, hyperparameter sweeps, MLX profiling
- 35 recipe scripts: training, fine-tuning, DPO, GRPO, MTP, distillation, curriculum learning, DeltaNet hybrid, MoE, best-of-N sampling, evaluation, quantization, callbacks, optimizer comparison, KV cache analysis, experiment sweeps, benchmarking
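For readers new to MLX, the compiled-training bullet above maps onto a standard MLX idiom: a functional loss, gradients via nn.value_and_grad, and a train step captured by mx.compile with model and optimizer state as implicit inputs/outputs. The sketch below shows that generic pattern (as in the MLX documentation) with a stand-in model; it is not lmxlab's actual Trainer code:

```python
from functools import partial
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))  # stand-in model
optimizer = optim.Adam(learning_rate=1e-3)
mx.eval(model.parameters())

def loss_fn(model, x, y):
    return nn.losses.cross_entropy(model(x), y).mean()

# Functional gradients with respect to the model's trainable parameters.
loss_and_grad_fn = nn.value_and_grad(model, loss_fn)

# Compiling the step fuses forward, backward, and optimizer update;
# model and optimizer state flow through as implicit inputs and outputs.
state = [model.state, optimizer.state]

@partial(mx.compile, inputs=state, outputs=state)
def train_step(x, y):
    loss, grads = loss_and_grad_fn(model, x, y)
    optimizer.update(model, grads)
    return loss

x = mx.random.normal((8, 16))
y = mx.random.randint(0, 16, (8,))
loss = train_step(x, y)
mx.eval(state)  # materialize the lazily evaluated update
print(loss.item())
```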
Quick start
```
pip install lmxlab
```
```python
import mlx.core as mx

from lmxlab.models.llama import llama_config
from lmxlab.models.base import LanguageModel
from lmxlab.training.config import TrainConfig
from lmxlab.training.trainer import Trainer

# Build a small LLaMA
config = llama_config(vocab_size=256, d_model=128, n_heads=4, n_kv_heads=2, n_layers=4)
model = LanguageModel(config)
mx.eval(model.parameters())
print(f"Parameters: {model.count_parameters():,}")

# Train
trainer = Trainer(model, TrainConfig(learning_rate=1e-3, max_steps=100))
```
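Before training, a quick shape check is useful. The snippet below assumes LanguageModel follows the usual MLX nn.Module convention of mapping token ids of shape (batch, seq) to logits of shape (batch, seq, vocab); treat the call signature as an assumption rather than documented API:

```python
# Assumed convention: (batch, seq) token ids -> (batch, seq, vocab) logits.
x = mx.random.randint(0, 256, (2, 16))  # dummy batch of token ids
logits = model(x)
print(logits.shape)  # expected: (2, 16, 256)
```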
See the Quickstart guide for a complete walkthrough.
Recipes
Ready-to-run scripts in recipes/:
```
uv run python recipes/train_tiny_gpt.py # Train a tiny GPT
uv run python recipes/train_llama_shakespeare.py # LLaMA on Shakespeare
uv run python recipes/compare_training.py # Compare architectures
uv run python recipes/compare_architectures.py # Side-by-side architecture comparison
uv run python recipes/ablation_gpt_to_llama.py # Feature ablation study
uv run python recipes/finetune_lora.py --rank 8 # LoRA fine-tuning
uv run python recipes/finetune_qlora.py --bits 4 # QLoRA (4-bit + LoRA)
uv run python recipes/train_dpo.py # DPO preference optimization
uv run python recipes/train_grpo.py # GRPO reward optimization
uv run python recipes/train_curriculum.py # Curriculum learning
uv run python recipes/train_mtp.py --n-predict 2 # Multi-token prediction
uv run python recipes/train_deltanet.py # Hybrid DeltaNet vs GQA
uv run python recipes/train_moe.py --experts 4 # Mixture of Experts
uv run python recipes/advanced_sampling.py # Best-of-N and majority vote
uv run python recipes/speculative_decoding.py # Draft-then-verify generation
uv run python recipes/evaluate_model.py # Evaluate with perplexity/BPB
uv run python recipes/interactive_generate.py # Streaming token-by-token generation
uv run python recipes/checkpoint_resume.py # Save and resume training
uv run python recipes/run_experiment.py # Structured experiment with logging
uv run python recipes/sweep_learning_rate.py # Hyperparameter sweep
uv run python recipes/load_pretrained.py # Load HuggingFace model
uv run python recipes/profile_models.py # Architecture profiling
uv run python recipes/benchmark_compile.py # mx.compile speedup benchmark
uv run python recipes/distill_model.py # Knowledge distillation
uv run python recipes/quantize_and_generate.py # 4-bit/8-bit quantization
uv run python recipes/train_with_callbacks.py # Logging, throughput, early stopping
uv run python recipes/train_with_datasets.py # TextDataset vs TokenDataset
uv run python recipes/compare_schedules.py # LR schedules and optimizers
uv run python recipes/compare_optimizers.py # Optimizer comparison (Experiment 3)
uv run python recipes/compare_kv_cache.py # MLA vs GQA KV cache (Experiment 4)
uv run python recipes/analyze_experiments.py # Statistical analysis tools
```
CLI
```
lmxlab list # List all architectures
lmxlab info llama --tiny # Show config details
lmxlab count deepseek --detail # Parameter breakdown
```
Design principles
- Clarity for rapid iteration. Code is written to be read and modified quickly, not for maximum production performance.
- MLX-native. Uses MLX idioms directly: nn.value_and_grad, mx.compile, unified memory.
- Config factories, not subclasses. Architecture variants are configs, not class hierarchies.
- Progressive complexity. Start with GPT-style, swap in LLaMA-style, then try MLA or Mamba. Same model class throughout (sketched after this list).
- Reproducible experiments. Time/FLOP budgets, train/val splits, MLflow tracking, and structured results logging.
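A minimal sketch of that progression, reusing the factories shown earlier. llama_config and deepseek_config appear above; gpt_config is assumed to exist by analogy with the same naming pattern, so check lmxlab.models before copying:

```python
from lmxlab.models.base import LanguageModel
from lmxlab.models.llama import llama_config
from lmxlab.models.deepseek import deepseek_config
from lmxlab.models.gpt import gpt_config  # assumed by analogy with the factories above

# One class, progressively more modern configs:
# GPT-style -> LLaMA-style (GQA) -> DeepSeek-style (MLA).
configs = [
    gpt_config(d_model=256, n_heads=8, n_layers=4),
    llama_config(d_model=256, n_heads=8, n_kv_heads=4, n_layers=4),
    deepseek_config(d_model=256, n_heads=8, n_layers=4, kv_lora_rank=32),
]
models = [LanguageModel(cfg) for cfg in configs]
```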
Requirements
- Python 3.12+
- Apple Silicon Mac (M1 or later) for GPU acceleration
- MLX also runs on Intel Macs and Linux, using the CPU only
Development
```
git clone https://github.com/michaelellis003/lmxlab.git
cd lmxlab
uv sync --extra dev
uv run pre-commit install
uv run pre-commit install --hook-type commit-msg

# Run tests
uv run pytest

# Lint
uv run ruff check src/ tests/ recipes/

# Build docs
uv run mkdocs serve
```
Documentation
Full documentation at michaelellis003.github.io/lmxlab.
- Quickstart
- Architecture Overview
- MLX Idioms
- Models Comparison
- Data Pipeline
- Training
- Inference
- Recipes
- Production Optimizations
- Experiment Methodology
- Developer Log
- API Reference
License
MIT
File details
Details for the file lmxlab-0.3.0.tar.gz.
File metadata
- Download URL: lmxlab-0.3.0.tar.gz
- Upload date:
- Size: 537.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 79899933b0fb41dde6a23ba7bce74ff1cff1fc645aca71f65e268b6b4f37e7a4 |
| MD5 | 87b0e6b4fc3257a5b76139cc9e4c04ab |
| BLAKE2b-256 | c40b74f04704d4cfc54d9d70ba54d0f88cc82d259c056de2ccc55fdfccc30f36 |
Provenance
The following attestation bundles were made for lmxlab-0.3.0.tar.gz:
Publisher: publish.yml on michaelellis003/lmxlab
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lmxlab-0.3.0.tar.gz
- Subject digest: 79899933b0fb41dde6a23ba7bce74ff1cff1fc645aca71f65e268b6b4f37e7a4
- Sigstore transparency entry: 1108152552
- Sigstore integration time:
- Permalink: michaelellis003/lmxlab@9458bddda6394de0b553aeb19882face0a269621
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/michaelellis003
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9458bddda6394de0b553aeb19882face0a269621
- Trigger Event: release
File details
Details for the file lmxlab-0.3.0-py3-none-any.whl.
File metadata
- Download URL: lmxlab-0.3.0-py3-none-any.whl
- Upload date:
- Size: 142.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 57597373276dec06df07ea6a21ac36f635a4b69e9b1e810268959fb340ae580e |
| MD5 | e6d5d9b10b39c1bf60a97176db126497 |
| BLAKE2b-256 | c0ad5ba0844b139fdaefedec3579782dccce194450ec7cd9f0b0d8d9782570ca |
Provenance
The following attestation bundles were made for lmxlab-0.3.0-py3-none-any.whl:
Publisher: publish.yml on michaelellis003/lmxlab
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lmxlab-0.3.0-py3-none-any.whl
- Subject digest: 57597373276dec06df07ea6a21ac36f635a4b69e9b1e810268959fb340ae580e
- Sigstore transparency entry: 1108152556
- Sigstore integration time:
- Permalink: michaelellis003/lmxlab@9458bddda6394de0b553aeb19882face0a269621
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/michaelellis003
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9458bddda6394de0b553aeb19882face0a269621
- Trigger Event: release