llm-coreml
A plugin for the https://llm.datasette.io/ CLI tool that runs CoreML .mlpackage LLM models locally on macOS.
Point it at a model (and its corresponding HuggingFace tokenizer), then prompt it like any other llm model.
Requirements
- macOS (CoreML is Apple-only)
- Python 3.11+
Installation
llm install llm-coreml
Or for development:
git clone https://github.com/anentropic/llm-coreml.git
cd llm-coreml
llm install -e .
Quick start
Register a model with a name and a path to the .mlpackage.
The --tokenizer argument is the HuggingFace model name to load the tokenizer from. This should match the HF model your .mlpackage was derived from:
llm coreml add my-llama /path/to/llama.mlpackage \
--tokenizer meta-llama/Llama-3.2-1B-Instruct
Prompt it:
llm -m coreml/my-llama "Explain quantum computing in one sentence"
Check it shows up in llm models:
llm models | grep coreml
Usage
Prompting
# Basic prompt
llm -m coreml/my-llama "What is Rust?"
# With a system prompt
llm -m coreml/my-llama "Hello" -s "You are a pirate"
# Continue a conversation
llm -m coreml/my-llama "What is Rust?"
llm -c "Compare it to Go"
Model options
Pass options with -o:
llm -m coreml/my-llama "Write a haiku" \
-o temperature 0.7 \
-o top_p 0.9 \
-o max_tokens 50
Python API
import llm
model = llm.get_model("coreml/my-llama")
response = model.prompt("What is the capital of France?")
print(response.text())
CLI reference
llm coreml add
llm coreml add <name> <path> --tokenizer <hf_id> [--compute-units <units>]
Register a CoreML model.
| Argument | Description |
|---|---|
| `name` | Model name, used as `coreml/<name>` |
| `path` | Path to the `.mlpackage` directory (resolved to absolute) |
| `--tokenizer` | HuggingFace tokenizer model ID (required) |
| `--compute-units` | Compute units: `all`, `cpu_only`, `cpu_and_gpu`, `cpu_and_ne` (default: `all`) |
llm coreml list
llm coreml list
Lists registered models with their paths, tokenizer IDs, and compute units.
llm coreml remove
llm coreml remove <name>
Removes a registered model. Exits with code 1 if the model doesn't exist.
Model options reference
| Option | Type | Default | Description |
|---|---|---|---|
| `max_tokens` | int | 200 | Maximum tokens to generate |
| `temperature` | float | 0.0 | Sampling temperature; 0 = greedy (deterministic) |
| `top_p` | float | 1.0 | Top-p (nucleus) sampling threshold |
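To illustrate how `temperature` and `top_p` interact, here is a minimal pure-Python sketch of temperature-scaled nucleus sampling over a toy logit vector. This is an illustration of the standard technique, not the plugin's actual sampling code:

```python
import math
import random

def sample_token(logits, temperature=0.7, top_p=0.9, rng=None):
    """Temperature-scaled top-p (nucleus) sampling over a list of logits.

    temperature == 0 falls back to greedy argmax (deterministic).
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature scaling (subtract max for numerical stability)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of tokens whose cumulative probability >= top_p
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalise over the nucleus and draw
    rng = rng or random.Random()
    r = rng.random() * sum(probs[i] for i in kept)
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With `temperature 0` (the default) every run of the same prompt produces the same output, which is useful for reproducible tests.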
How it works
Format auto-detection
The plugin reads the CoreML model spec at load time and checks the input names:
- `inputIds` (camelCase) = Apple format, uses float16 causal masks
- `input_ids` (snake_case) = HuggingFace format, uses int32 attention masks
No config file needed.
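The detection rule amounts to a check on the spec's declared input names. A simplified sketch (the real plugin reads these names from the CoreML model spec; `detect_format` is a hypothetical helper, not the plugin's API):

```python
def detect_format(input_names):
    """Classify a CoreML LLM package by its declared input names.

    "apple" means camelCase inputs with float16 causal masks;
    "huggingface" means snake_case inputs with int32 attention masks.
    """
    if "inputIds" in input_names:
        return "apple"
    if "input_ids" in input_names:
        return "huggingface"
    raise ValueError(f"Unrecognised input names: {input_names}")
```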
Stateful KV-cache
If the model spec declares stateDescriptions, the plugin uses stateful inference with KV-cache. Otherwise it falls back to stateless inference, which reprocesses the full sequence each step (slower, but works with older models).
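The performance gap is easy to see with a toy cost model (illustrative only, counting forward-pass token positions rather than real latency): stateless decoding re-runs the model over the whole sequence at every step, so total work grows quadratically, while stateful decoding with a KV-cache handles one new token per step.

```python
def stateless_cost(prompt_len, new_tokens):
    """Token positions processed when each step reprocesses the full sequence."""
    return sum(prompt_len + i for i in range(1, new_tokens + 1))

def stateful_cost(prompt_len, new_tokens):
    """Token positions processed when the KV-cache stores past activations:
    the prompt is processed once, then one token per generated step."""
    return prompt_len + new_tokens

# e.g. a 100-token prompt generating 200 tokens:
# stateless -> 40100 positions, stateful -> 300 positions
```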
Tokenization
The plugin uses transformers.AutoTokenizer with apply_chat_template() to handle chat formatting. The tokenizer is downloaded and cached the first time you use a model.
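To show the message structure that `apply_chat_template()` consumes, here is a sketch of assembling a chat from a prompt, an optional system prompt, and prior conversation turns. `build_messages` is a hypothetical helper for illustration, not part of the plugin's public API:

```python
def build_messages(prompt, system=None, history=()):
    """Assemble the messages list passed to tokenizer.apply_chat_template().

    history is a sequence of (user, assistant) pairs from the conversation.
    """
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    for user, assistant in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": assistant})
    messages.append({"role": "user", "content": prompt})
    return messages

# The tokenizer then renders this with the model's own chat template, e.g.:
# input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```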
Getting CoreML models
You can get .mlpackage LLM models by:
- Converting HuggingFace models with coremltools
- Using Apple's ml-explore tools
- Downloading pre-converted models from HuggingFace (search for "coreml" tagged models)
Development
uv sync --dev
Quality gates
uv run basedpyright # Type checking (strict)
uv run ruff check # Linting
uv run ruff format # Formatting
uv run pytest # Tests
Or all at once:
prek run --all-files
License
MIT