
CLI for running LLMs on Apple Silicon via MLX


ppmlx

Run LLMs on your Mac. OpenAI-compatible API powered by Apple Silicon.


Install

pip install ppmlx

Requires macOS on Apple Silicon (M1 or later) and Python 3.11+.

Privacy note: ppmlx never sends prompts, responses, file contents, paths, or tokens anywhere. Optional anonymous usage analytics can be disabled with ppmlx config --no-analytics.

Get Started

ppmlx pull qwen3.5:9b      # download a model
ppmlx run qwen3.5:9b       # chat in the terminal
ppmlx serve                # start API server on :6767

That's it. Any OpenAI-compatible tool works out of the box:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:6767/v1", api_key="local")
response = client.chat.completions.create(
    model="qwen3.5:9b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
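Under the hood, the SDK call above POSTs a standard chat-completions JSON body to /v1/chat/completions. A minimal sketch of the wire format (field names follow the OpenAI API; the response content shown is illustrative, not ppmlx output):

```python
import json

# Request body the SDK sends to POST /v1/chat/completions.
request_body = {
    "model": "qwen3.5:9b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

# A non-streaming response carries the reply under choices[0].message.content.
response_json = '{"choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]}'
reply = json.loads(response_json)["choices"][0]["message"]["content"]
print(reply)
```

Any HTTP client that can send this shape works; the SDK is just a convenience.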

Commands

Command                 Description                                  Key Options
ppmlx launch            Interactive launcher (pick action + model)   -m model, --host, --port, --flush
ppmlx serve             Start API server on :6767                    -m model, --embed-model, -i, --no-cors
ppmlx run <model>       Interactive chat REPL                        -s system, -t temp, --max-tokens
ppmlx pull [model]      Download model (multiselect if no arg)       --token
ppmlx list              Show downloaded models                       -a all (incl. registry), --path
ppmlx rm <model>        Remove a model                               -f skip confirmation
ppmlx ps                Show loaded models & memory
ppmlx quantize <model>  Convert & quantize HF model to MLX           -b bits, --group-size, -o output
ppmlx config            View/set configuration                       --hf-token

Connect Your Tools

Point any OpenAI-compatible client at http://localhost:6767/v1 with any API key:

  • Cursor — Settings > AI > OpenAI-compatible
  • Continue — in config.json, set provider to openai and apiBase to the URL above
  • LangChain / LlamaIndex — set base_url to the URL above and api_key="local"
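For Continue, the config.json entry might look like this (a sketch; field names follow Continue's JSON config format for the openai provider, the model name assumes you pulled qwen3.5:9b — check against the Continue version you run):

```json
{
  "models": [
    {
      "title": "ppmlx qwen3.5:9b",
      "provider": "openai",
      "model": "qwen3.5:9b",
      "apiBase": "http://localhost:6767/v1",
      "apiKey": "local"
    }
  ]
}
```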

Config

Optional. ~/.ppmlx/config.toml:

[server]
host = "127.0.0.1"
port = 6767

[defaults]
temperature = 0.7
max_tokens = 2048

[analytics]
enabled = true
provider = "posthog"
respect_do_not_track = true

Anonymous Usage Analytics

ppmlx supports privacy-preserving, anonymous product analytics. Analytics are disabled by default; you are asked whether to opt in on first run.

What is sent:

  • command and API event names such as serve_started, model_pulled, api_chat_completions
  • app version, Python minor version, OS family, CPU architecture
  • coarse booleans/counters such as stream=true, tools=true, batch_size=4

What is never sent:

  • prompts, responses, tool arguments, file contents, file paths
  • HuggingFace tokens, API keys, repo IDs, request bodies

When events are sent:

  • when a CLI command starts
  • when OpenAI-compatible API endpoints are hit

Why:

  • understand which workflows matter most
  • prioritize compatibility work across commands and API surfaces
  • measure adoption without collecting user content

Opt out:

ppmlx config --no-analytics

or:

[analytics]
enabled = false

For maintainer-operated analytics, the recommended sink is self-hosted PostHog. Configure it with:

export PPMLX_ANALYTICS_HOST="https://analytics.example.com"
export PPMLX_ANALYTICS_PROJECT_API_KEY="your-posthog-project-api-key"

If you prefer, you can also set the same values in ~/.ppmlx/config.toml.

API Documentation

When the server is running, it also serves interactive API docs.

Requirements

  • macOS on Apple Silicon (M1 or later)
  • Python 3.11+
  • At least 8 GB unified memory (16 GB+ recommended for larger models)

ppmlx vs Ollama

              ppmlx                     Ollama
Runtime       MLX (Apple-native)        llama.cpp (cross-platform)
Platform      macOS Apple Silicon only  macOS, Linux, Windows
GPU backend   Metal (unified memory)    Metal / CUDA / ROCm
API           OpenAI-compatible         Ollama + OpenAI-compatible
Language      Python                    Go + C++
Quantization  MLX format                GGUF format

Choose ppmlx if you want maximum Apple Silicon performance with a pure-Python, MLX-native stack. Choose Ollama if you need cross-platform support or GGUF models.

License

MIT

Download files

Download the file for your platform.

Source Distribution

ppmlx-0.3.0.tar.gz (79.9 kB)

Uploaded Source

Built Distribution


ppmlx-0.3.0-py3-none-any.whl (75.2 kB)

Uploaded Python 3

File details

Details for the file ppmlx-0.3.0.tar.gz.

File metadata

  • Download URL: ppmlx-0.3.0.tar.gz
  • Size: 79.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ppmlx-0.3.0.tar.gz
Algorithm Hash digest
SHA256 029da31bb6aedc70277ede2d3f8c4e49ff319ae57bc4ac70dcbb2d99b2e1cd66
MD5 4f5b1f11b032da704e09c2e4f1db65c3
BLAKE2b-256 2436a1018ee7f9cbf9f99f7e65949fc286b50414e34b7c5c4038b5cfd5ddc4b4

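To check a downloaded archive against the published digest, you can stream it through the standard library's hashlib. A minimal sketch:

```python
import hashlib

def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# e.g. compare against the SHA256 published above:
# sha256_hex("ppmlx-0.3.0.tar.gz") == "029da31bb6aedc70277ede2d3f8c4e49ff319ae57bc4ac70dcbb2d99b2e1cd66"
```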

Provenance

The following attestation bundles were made for ppmlx-0.3.0.tar.gz:

Publisher: release.yml on the-focus-company/ppmlx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ppmlx-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: ppmlx-0.3.0-py3-none-any.whl
  • Size: 75.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ppmlx-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c8a53eac1488ba610b8bc1de9ccaf5757a18de6556eda0756128b90c44abe857
MD5 a6b5f339c460a5fea197872167f02f8e
BLAKE2b-256 fc15d9f6d6ea4e27b9adbf37c8bf47718c116e39f394e524ada7ae235cf60193


Provenance

The following attestation bundles were made for ppmlx-0.3.0-py3-none-any.whl:

Publisher: release.yml on the-focus-company/ppmlx

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
