CLI and OpenAI-compatible HTTP server for Apple Foundation Models (Apple Intelligence)
Project description
apple-fm-cli
A small command-line interface and local HTTP server that drive Apple’s on-device Foundation Models (Apple Intelligence) from Python. The repo bundles a ctypes-based apple_fm_sdk wrapper around the system model plus apple_fm_cli, which adds prompting, optional tools, JSON-schema-guided output, and an OpenAI-shaped API for clients like Codex.
Requirements
- macOS with Foundation Models available (the SDK checks availability and reports a reason if not).
- Python ≥ 3.14
Install
From PyPI (macOS, Python 3.14+):
pip install apple-fm-cli
# or: uv pip install apple-fm-cli
From a git checkout (editable):
pip install -e .
# or: uv pip install -e .
Entry point: apple-fm-cli.
The published wheel includes a prebuilt native bridge (Apple silicon / arm64), typically libapple_fm_bridge.dylib (or the legacy name libFoundationModels.dylib). To rebuild it after changing foundation-models-c, run swift build -c release in src/apple_fm_sdk/foundation-models-c and copy .build/*/release/libapple_fm_bridge.dylib to src/apple_fm_sdk/lib/ before building distributions.
Publishing to PyPI (maintainers)
Automated (recommended): .github/workflows/publish.yml runs when you push a version tag v* (for example v0.1.3). It checks that the tag matches version in pyproject.toml, builds, uploads to PyPI, then creates a GitHub Release (with generated notes) if one does not already exist. One-time: configure PyPI trusted publishing for workflow publish.yml.
Typical release steps:
-
Bump
versioninpyproject.tomland runuv lockif you trackuv.lockin git. -
Rebuild and commit
src/apple_fm_sdk/lib/libapple_fm_bridge.dylib(orlibFoundationModels.dylib) if the Swift bridge changed. -
Commit and push to
main, then tag and push the tag:git tag v0.1.3 git push origin v0.1.3
Manual dispatch: Actions → Publish to PyPI → Run workflow still works for the current branch (no tag check); use sparingly.
Manual upload: uv run --with build python -m build then uv run --with twine twine upload dist/* using an API token. Prefer trusted publishing in CI over storing long-lived tokens in ~/.pypirc or GitHub secrets.
CLI
Query the model (plain text or structured JSON):
apple-fm-cli query "Summarize this idea in one sentence."
apple-fm-cli query --format json --schema '{"type":"object","properties":{"title":{"type":"string"}}}' "Name this topic."
Optional tools (comma-separated): bash (local shell), google_search (DuckDuckGo lite + page fetch).
apple-fm-cli query --tools bash,google_search "What’s in README.md in the cwd?"
Legacy-style flags are still accepted: -q / --query, --output, --output-schema.
Local embedding benchmark
From the repo root, measure end-to-end latency for 512-d English sentence embeddings (native NLEmbedding or the HTTP POST /v1/embeddings route on the local server):
uv run python scripts/benchmark_embeddings.py
uv run python scripts/benchmark_embeddings.py -n 100 -w 5
uv run python scripts/benchmark_embeddings.py --json
# With the server: apple-fm-cli server --port 8000
uv run python scripts/benchmark_embeddings.py --mode http --base-url http://127.0.0.1:8000
Use --batch to time multi-string requests (HTTP sends one POST per iteration; native runs a tight loop). Results print throughput, dimension, and latency percentiles.
Server
Starts a FastAPI app that mimics parts of the OpenAI Chat Completions, Embeddings, and Responses APIs (including SSE for streaming), so tools that expect those endpoints can point at your machine instead of a cloud provider. Large agent system prompts are truncated heuristically to fit smaller local context windows.
apple-fm-cli server --host 0.0.0.0 --port 8000
POST /v1/chat/completionsPOST /v1/embeddings— on-device 512-dimensional English sentence embeddings (NaturalLanguage/NLEmbedding); themodelfield is accepted and echoed (OpenAI compatibility) but does not select a different backend.POST /v1/responses
Codex
-
Start the server (see above), e.g. on port
8000. -
Add a provider and profile to
~/.codex/config.toml:[model_providers.apple] name = "apple" base_url = "http://localhost:8000/v1" env_key = "OPENAI_API_KEY" [profiles.apple] model = "fm" model_provider = "apple" model_context_window = 4096
-
Run Codex with that profile:
codex -p apple
env_key is the environment variable Codex uses for the bearer token. The local server does not need a real OpenAI key; set OPENAI_API_KEY to any non-empty placeholder if your Codex build requires it to be present.
Other agent harnesses
Anything that can target an OpenAI-compatible HTTP API (Chat Completions and/or Responses, including SSE) can point base_url at http://<host>:<port>/v1 and use a model id string of your choice—the server echoes the requested model name. Prefer a context window that matches what the on-device model can handle (4096 is a reasonable default for local sessions). If the client insists on an API key, keep using a dummy value in the configured env var unless you add auth in front of the server yourself.
Layout
| Path | Role |
|---|---|
src/apple_fm_sdk/ |
Session, tools, guided generation, tokenizer, native bridge bindings |
src/apple_fm_cli/ |
query / server, built-in tools, schema → Generable helpers |
scripts/ |
Helpers, e.g. benchmark_embeddings.py for local latency/throughput |
notes/ |
Design notes (Responses SSE lifecycle, native bridge, e2e testing) |
Licensing
apple_fm_sdk source files carry Apple Inc. copyright headers; refer to any accompanying license text distributed with that SDK.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file apple_fm_cli-0.3.0.tar.gz.
File metadata
- Download URL: apple_fm_cli-0.3.0.tar.gz
- Upload date:
- Size: 173.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c05a6bcdd7901be86ba3df8e917ed8c52c1eb469b6e8260d62c7258390a133ef
|
|
| MD5 |
9bb3698c3d176cfee904c8948b55cad7
|
|
| BLAKE2b-256 |
5d6cc88b177e77c8f432f0b6e61b99fc49f575116c79e83a8eec8ee0a52ccf5c
|
Provenance
The following attestation bundles were made for apple_fm_cli-0.3.0.tar.gz:
Publisher:
workflow.yml on sohail288/apple-fm-cli
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
apple_fm_cli-0.3.0.tar.gz -
Subject digest:
c05a6bcdd7901be86ba3df8e917ed8c52c1eb469b6e8260d62c7258390a133ef - Sigstore transparency entry: 1366599149
- Sigstore integration time:
-
Permalink:
sohail288/apple-fm-cli@b96072243f0375bd625c8647f3b4d8be279d051e -
Branch / Tag:
refs/tags/v0.3.0 - Owner: https://github.com/sohail288
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
workflow.yml@b96072243f0375bd625c8647f3b4d8be279d051e -
Trigger Event:
push
-
Statement type:
File details
Details for the file apple_fm_cli-0.3.0-py3-none-any.whl.
File metadata
- Download URL: apple_fm_cli-0.3.0-py3-none-any.whl
- Upload date:
- Size: 138.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5076ad51bce576d4a1b1b6dbcfb3f8e63884c202e3faff291b8f2e93c84e7500
|
|
| MD5 |
23ab024fca3ec537ec1650229eba47d7
|
|
| BLAKE2b-256 |
49da91a6666526605b55e1af319bde0114da362bd56c926e294f21feed9a1b00
|
Provenance
The following attestation bundles were made for apple_fm_cli-0.3.0-py3-none-any.whl:
Publisher:
workflow.yml on sohail288/apple-fm-cli
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
apple_fm_cli-0.3.0-py3-none-any.whl -
Subject digest:
5076ad51bce576d4a1b1b6dbcfb3f8e63884c202e3faff291b8f2e93c84e7500 - Sigstore transparency entry: 1366599395
- Sigstore integration time:
-
Permalink:
sohail288/apple-fm-cli@b96072243f0375bd625c8647f3b4d8be279d051e -
Branch / Tag:
refs/tags/v0.3.0 - Owner: https://github.com/sohail288
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
workflow.yml@b96072243f0375bd625c8647f3b4d8be279d051e -
Trigger Event:
push
-
Statement type: