# MLXLMProbe
**TL;DR install:**

```shell
brew install scouzi1966/afm/mlxlmprobe
```

Then run `mlxlmprobe` - that's it!
A visual probing and interpretability tool for MLX language models on Apple Silicon.
Status: Work in Progress - Currently testing with GPT-OSS and other MoE models
## Features

- Universal MLX-LM Support: works with mlx-lm models (tested only on GPT-OSS so far)
- MoE Analysis: Mixture-of-Experts routing visualization, expert load distribution, top-k selection patterns
- Layer Analysis: Visualize activation norms and patterns across all layers
- FFN Analysis: Gate sparsity and activation patterns in feed-forward networks
- Embedding Visualization: PCA plots with section-based coloring (System/User/Reasoning/Response)
- Logits Analysis: Token probability distributions with histograms
- Layer Similarity: Cosine similarity heatmaps between layer representations
- Residual Stream: Track information flow through the transformer
- Token Alternatives: See what other tokens the model considered at each position
- Reasoning Model Support: Detects and separates reasoning loops from final responses
- AI Interpretation: Optional AI-powered analysis using a local model or Claude
- Export: PDF reports and interactive HTML exports
- Deep Token MoE Tracing: per-token, per-layer dive into MoE routing
- Attention Pattern Analysis
- RoPE Analysis
- Deep Response and Input Sequence Token Analysis
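The Layer Similarity feature compares layer representations via cosine similarity. As an illustration of that computation only (a toy sketch with made-up vectors and a hypothetical `layer_similarity` helper, not the tool's actual code):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def layer_similarity(layer_states):
    """Symmetric layers x layers matrix of cosine similarities,
    e.g. between pooled hidden states of each layer."""
    n = len(layer_states)
    return [[cosine(layer_states[i], layer_states[j]) for j in range(n)]
            for i in range(n)]

# Three toy "layer representations" of dimension 4.
states = [[1.0, 0.0, 0.0, 0.0],
          [0.9, 0.1, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
sim = layer_similarity(states)
```

Plotting `sim` as a heatmap gives the kind of layer-by-layer similarity view the tab shows: nearby layers tend toward 1.0, unrelated representations toward 0.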
## Requirements
- Mac with Apple Silicon (M1, M2, M3, M4, or later)
- macOS 15.0+ (Sequoia or later recommended)
- 8GB+ unified memory (16GB+ recommended for larger models, 32GB+ for 30B+ models)
## Installation

### Option 1: Homebrew (Recommended)

```shell
brew install scouzi1966/afm/mlxlmprobe
```

Then run:

```shell
mlxlmprobe
```

### Option 2: pip

```shell
pip install mlxlmprobe
```

Then run:

```shell
mlxlmprobe
```

### Option 3: From Source

```shell
git clone https://github.com/scouzi1966/MLXLMProbe.git
cd MLXLMProbe
pip install -r requirements.txt
streamlit run mlxlmprobe.py
```
## Quick Start

1. Run `mlxlmprobe` - the UI opens in your browser at http://localhost:8501
2. Select a model from the sidebar (or enter a HuggingFace model ID)
3. Enter a prompt and click "Run Probe"
4. Explore the analysis tabs
### Load a Model

**Option A: Use the sidebar to enter a HuggingFace model ID**

Popular MLX models from mlx-community:

- `mlx-community/gpt-oss-20b-MXFP4-Q8` (TESTED)
- `mlx-community/Llama-3.2-3B-Instruct-4bit` (small, fast)
- `mlx-community/Mistral-7B-Instruct-v0.3-4bit` (good quality)
- `mlx-community/Mixtral-8x7B-Instruct-v0.1-4bit` (MoE model)
- `mlx-community/Qwen2.5-7B-Instruct-4bit` (multilingual)
- `mlx-community/DeepSeek-R1-Distill-Qwen-7B-4bit` (reasoning model)

**Option B: Specify the model on the command line**

```shell
mlxlmprobe -- --model mlx-community/Llama-3.2-1B-Instruct-4bit
```

**Option C: Use a local model path**

```shell
mlxlmprobe -- --model /path/to/your/mlx-model
```
## Usage Guide

### Basic Workflow

1. Enter a prompt in the text area
2. Click "Run Probe" to generate and analyze
3. Explore tabs: Layer Activations, FFN Analysis, Tokens, Embeddings, Logits, etc.
4. For MoE models, check the "MoE Routing" tab for expert analysis
### Understanding MoE Visualizations

For Mixture-of-Experts models (like Mixtral), the MoE tab shows:

- **Top-K Expert Weights**: stacked bars showing which experts were selected
  - 🟡 Gold = Top-1 (highest weight)
  - 🟣 Magenta = Top-2
  - 🔵 Cyan = Top-3
  - 🟠 Orange = Top-4
  - Bar length = router probability assigned to that expert
  - Labels inside bars = expert ID (E0, E1, etc.)
- **Expert Load**: how many tokens each expert processed
- **Router Probabilities**: heatmap of all expert weights
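The weights behind these plots come from a standard top-k MoE routing step: softmax over the expert logits, keep the k highest-weight experts, renormalize. A minimal single-token sketch (illustrative only; `route_token` is a hypothetical helper, not part of MLXLMProbe):

```python
import math

def route_token(router_logits, top_k=4):
    """Illustrative top-k MoE routing for one token: softmax over
    expert logits, keep the top_k experts, renormalize their weights."""
    m = max(router_logits)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k expert indices, highest router probability first.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    weights = [probs[i] / norm for i in top]  # renormalized top-k weights
    return top, weights

# One token routed over 8 experts.
logits = [0.2, 1.5, -0.3, 2.1, 0.0, 0.9, -1.2, 1.1]
experts, weights = route_token(logits)
```

In the stacked-bar view, `experts[0]` would be the gold (Top-1) segment and `weights` the bar lengths; tallying `experts` across all tokens gives the Expert Load chart.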
### Command Line Options

```shell
mlxlmprobe -- --help
```

```
Options:
  --model PATH       Path or HuggingFace ID of MLX model
  --port PORT        Streamlit port (default: 8501)
  --max-tokens N     Maximum tokens to generate (default: 100)
  --max-context N    Maximum context length (default: model's max)
```
### Keyboard Shortcuts

- `Ctrl+Enter` / `Cmd+Enter` - Run probe
- `R` - Refresh page
## Troubleshooting

### "No module named 'mlx'"

MLX only works on Apple Silicon Macs. Verify with `uname -m` (the output should be `arm64`).

### Model download fails

- Check your internet connection
- Verify the model ID exists on HuggingFace
- Try a smaller model first

### Out of memory

- Try a smaller or more heavily quantized model (4-bit instead of 8-bit)
- Reduce the maximum tokens to generate
- Close other applications

### Streamlit won't start

```shell
# Kill any existing Streamlit processes
pkill -f streamlit

# Try a different port
mlxlmprobe --server.port 8502
```
## How It Works
MLXLMProbe intercepts the forward pass of transformer models to capture:
- Embeddings: Initial token representations
- Layer Outputs: Hidden states after each transformer block
- FFN/MoE Activations: Gate values and expert routing decisions
- Final Logits: Output distribution over vocabulary
- Per-token Alternatives: What other tokens were considered
These are visualized using Plotly for interactive exploration.
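As a rough illustration of the interception idea (a toy stand-in model, not MLXLMProbe's actual code), each block's output can be snapshotted as the forward pass runs:

```python
class Block:
    """Stand-in for a transformer block: just scales its input."""
    def __init__(self, scale):
        self.scale = scale

    def __call__(self, hidden):
        return [h * self.scale for h in hidden]

class Probe:
    """Wraps a stack of blocks so every forward pass records each
    block's output (the hidden state) before feeding the next block."""
    def __init__(self, blocks):
        self.blocks = blocks
        self.captured = []  # one snapshot per layer per forward pass

    def forward(self, embedding):
        self.captured.clear()
        hidden = embedding  # initial token representation
        for block in self.blocks:
            hidden = block(hidden)
            self.captured.append(list(hidden))  # snapshot layer output
        return hidden  # would feed the final projection / logits

model = Probe([Block(2.0), Block(0.5), Block(3.0)])
final = model.forward([1.0, -1.0])
```

After one forward pass, `model.captured` holds one hidden-state snapshot per layer, which is exactly the kind of per-layer data the activation and similarity tabs visualize.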
## License

MIT License - see the LICENSE file for details.

## Acknowledgments

- Built on MLX by Apple
- Uses mlx-lm for model loading
- Inspired by transformer interpretability research

## Contributing

This is a work in progress. Issues and PRs welcome!