CLI for running LLMs on Apple Silicon via MLX
Project description
ppmlx
Run LLMs on your Mac. OpenAI-compatible API powered by Apple Silicon.
Install
pip install ppmlx
Requires macOS on Apple Silicon (M1+) and Python 3.11+
Privacy note:
ppmlx never sends prompts, responses, file contents, paths, or tokens anywhere. Optional anonymous usage analytics can be disabled with `ppmlx config --no-analytics`.
Get Started
ppmlx pull qwen3.5:9b # download a model
ppmlx run qwen3.5:9b # chat in the terminal
ppmlx serve # start API server on :6767
That's it. Any OpenAI-compatible tool works out of the box:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:6767/v1", api_key="local")
response = client.chat.completions.create(
model="qwen3.5:9b",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
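Under the hood, the client above simply POSTs JSON to `/v1/chat/completions`. A minimal sketch of that request body, built with the standard library only (field names follow the OpenAI chat completions schema; no server needs to be running to construct it):

```python
import json

# Build the JSON body an OpenAI-compatible client sends to
# POST http://localhost:6767/v1/chat/completions.
payload = {
    "model": "qwen3.5:9b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,   # optional sampling controls
    "max_tokens": 2048,
    "stream": False,      # set True for server-sent-events streaming
}
body = json.dumps(payload)
print(body)
```

Any HTTP client that can send this body with an `Authorization: Bearer local` header will work against the local server.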
Commands
| Command | Description | Key Options |
|---|---|---|
| `ppmlx launch` | Interactive launcher (pick action + model) | `-m model`, `--host`, `--port`, `--flush` |
| `ppmlx serve` | Start API server on :6767 | `-m model`, `--embed-model`, `-i`, `--no-cors` |
| `ppmlx run <model>` | Interactive chat REPL | `-s system`, `-t temp`, `--max-tokens` |
| `ppmlx pull [model]` | Download model (multiselect if no arg) | `--token` |
| `ppmlx list` | Show downloaded models | `-a` all (incl. registry), `--path` |
| `ppmlx rm <model>` | Remove a model | `-f` skip confirmation |
| `ppmlx ps` | Show loaded models & memory | |
| `ppmlx quantize <model>` | Convert & quantize HF model to MLX | `-b bits`, `--group-size`, `-o output` |
| `ppmlx config` | View/set configuration | `--hf-token` |
Connect Your Tools
Point any OpenAI-compatible client at http://localhost:6767/v1 with any API key:
- Cursor — Settings > AI > OpenAI-compatible
- Continue — in config.json, set provider to `openai` and `apiBase` to the URL above
- LangChain / LlamaIndex — set `base_url` and `api_key="local"`
Config
Optional. ~/.ppmlx/config.toml:
[server]
host = "127.0.0.1"
port = 6767
[defaults]
temperature = 0.7
max_tokens = 2048
[analytics]
enabled = true
provider = "posthog"
respect_do_not_track = true
Anonymous Usage Analytics
ppmlx supports privacy-preserving anonymous product analytics, disabled by default — you are asked to opt in on first run.
What is sent:
- command and API event names such as `serve_started`, `model_pulled`, `api_chat_completions`
- app version, Python minor version, OS family, CPU architecture
- coarse booleans/counters such as `stream=true`, `tools=true`, `batch_size=4`
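For concreteness, a hypothetical event payload assembled only from the field categories listed above — the exact field names and structure are illustrative assumptions, not ppmlx's real wire format:

```python
import json
import platform
import sys

# Hypothetical analytics event; the schema here is an assumption
# built from the documented field categories, not ppmlx's actual one.
event = {
    "event": "serve_started",
    "app_version": "0.3.0",
    "python": f"3.{sys.version_info.minor}",  # minor version only
    "os_family": platform.system(),
    "arch": platform.machine(),
    "stream": True,  # coarse boolean, never request content
}
print(json.dumps(event))
```

Note what is absent: no prompts, no paths, no tokens — only names, versions, and coarse flags.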
What is never sent:
- prompts, responses, tool arguments, file contents, file paths
- HuggingFace tokens, API keys, repo IDs, model prompts, request bodies
When events are sent:
- when a CLI command starts
- when OpenAI-compatible API endpoints are hit
Why:
- understand which workflows matter most
- prioritize compatibility work across commands and API surfaces
- measure adoption without collecting user content
Opt out:
ppmlx config --no-analytics
or:
[analytics]
enabled = false
For maintainer-operated analytics, the recommended sink is self-hosted PostHog. Configure it with:
export PPMLX_ANALYTICS_HOST="https://analytics.example.com"
export PPMLX_ANALYTICS_PROJECT_API_KEY="your-posthog-project-api-key"
If you prefer, you can also set the same values in ~/.ppmlx/config.toml.
API Documentation
When the server is running, interactive API docs are available at:
- Swagger UI: http://localhost:6767/docs
- ReDoc: http://localhost:6767/redoc
Requirements
- macOS on Apple Silicon (M1 or later)
- Python 3.11+
- At least 8 GB unified memory (16 GB+ recommended for larger models)
ppmlx vs Ollama
| | ppmlx | Ollama |
|---|---|---|
| Runtime | MLX (Apple-native) | llama.cpp (cross-platform) |
| Platform | macOS Apple Silicon only | macOS, Linux, Windows |
| GPU backend | Metal (unified memory) | Metal / CUDA / ROCm |
| API | OpenAI-compatible | Ollama + OpenAI-compatible |
| Language | Python | Go + C++ |
| Quantization | MLX format | GGUF format |
Choose ppmlx if you want maximum Apple Silicon performance with a pure-Python, MLX-native stack. Choose Ollama if you need cross-platform support or GGUF models.
License
MIT
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file ppmlx-0.3.0.tar.gz.
File metadata
- Download URL: ppmlx-0.3.0.tar.gz
- Upload date:
- Size: 79.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `029da31bb6aedc70277ede2d3f8c4e49ff319ae57bc4ac70dcbb2d99b2e1cd66` |
| MD5 | `4f5b1f11b032da704e09c2e4f1db65c3` |
| BLAKE2b-256 | `2436a1018ee7f9cbf9f99f7e65949fc286b50414e34b7c5c4038b5cfd5ddc4b4` |
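To check a downloaded artifact against the SHA256 digest above, the stdlib `hashlib` suffices. A sketch over a placeholder file (the streaming-digest pattern is generic, not ppmlx-specific):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage with a placeholder file; for a real check, compare the result
# against the digest from the table above.
import os, tempfile
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
digest = sha256_of(Path(tmp.name))
os.unlink(tmp.name)
expected = hashlib.sha256(b"example bytes").hexdigest()
print(digest == expected)  # -> True
```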
Provenance
The following attestation bundles were made for ppmlx-0.3.0.tar.gz:
Publisher: release.yml on the-focus-company/ppmlx

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ppmlx-0.3.0.tar.gz
- Subject digest: 029da31bb6aedc70277ede2d3f8c4e49ff319ae57bc4ac70dcbb2d99b2e1cd66
- Sigstore transparency entry: 1194324091
- Sigstore integration time:
- Permalink: the-focus-company/ppmlx@1dba5a0b77528c42c7be17e21022ef09ce288623
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/the-focus-company
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1dba5a0b77528c42c7be17e21022ef09ce288623
- Trigger Event: push
File details
Details for the file ppmlx-0.3.0-py3-none-any.whl.
File metadata
- Download URL: ppmlx-0.3.0-py3-none-any.whl
- Upload date:
- Size: 75.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `c8a53eac1488ba610b8bc1de9ccaf5757a18de6556eda0756128b90c44abe857` |
| MD5 | `a6b5f339c460a5fea197872167f02f8e` |
| BLAKE2b-256 | `fc15d9f6d6ea4e27b9adbf37c8bf47718c116e39f394e524ada7ae235cf60193` |
Provenance
The following attestation bundles were made for ppmlx-0.3.0-py3-none-any.whl:
Publisher: release.yml on the-focus-company/ppmlx

- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ppmlx-0.3.0-py3-none-any.whl
- Subject digest: c8a53eac1488ba610b8bc1de9ccaf5757a18de6556eda0756128b90c44abe857
- Sigstore transparency entry: 1194324143
- Sigstore integration time:
- Permalink: the-focus-company/ppmlx@1dba5a0b77528c42c7be17e21022ef09ce288623
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/the-focus-company
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1dba5a0b77528c42c7be17e21022ef09ce288623
- Trigger Event: push