Discover, compare, and install Core AI models for Apple Silicon.

Core AI Catalog

🌐 Live site: kevinqz.github.io/coreai-catalog — searchable web UI with model cards, filters, and benchmarks.

A compact, source-grounded catalog of Apple Core AI models, artifacts, upstreams, benchmarks, provenance and a verified Apple AI terminology layer.

Core AI Catalog maps Apple Core AI-compatible model artifacts with granular metadata, source links, Hugging Face artifact references, GitHub/Hugging Face attribution, runtime requirements, device support, benchmark records and verification status.

YAML is the source of truth. Markdown is the human view. JSON is the generated machine/API export.

Scope and disclaimer

This catalog tracks open-source models and their Apple Core AI artifacts — provenance, runtime, licenses and benchmarks — plus a verified reference layer of Apple AI terminology grounded in official Apple sources. It does not redistribute model weights, re-document Apple's APIs, or treat Apple's proprietary Foundation Models as downloadable artifacts.

This project is not affiliated with or endorsed by Apple. The commercial_use fields are triage labels, not legal advice or permissions; always verify the upstream model, code and artifact licenses yourself.

Status

Version: v2.0.3

79 Apple Core AI models with artifact provenance, benchmarks, verified terminology, readiness scores, and an MCP server for agent-native model discovery, comparison, and recommendation. Agent-ready: CLI, MCP server, JSON exports, llms.txt, openapi.yaml — all from the same engine.

Quick Start

pip install coreai-catalog

# Find the right model for your task
coreai-catalog recommend --task "private OCR on iPhone" --license likely

# Install it (downloads .aimodel from Hugging Face)
coreai-catalog install unlimited-ocr

# Compare alternatives
coreai-catalog compare unlimited-ocr qwen3-vl-2b

See examples/ for Swift integration snippets (OCR, VLM chat, embeddings/RAG).

Why this exists

Apple Core AI model artifacts are spread across upstream repositories, model cards, official recipe conversions, community ports and Hugging Face artifact repos. This project organizes that information into a compact, machine-readable catalog that can be consumed by humans, agents and automation.

The goal is not to run models directly. The goal is to know, precisely and traceably:

what model exists
where it came from
what it can do
what it receives and outputs
where the artifact is hosted
who should be credited
whether it is an official Apple recipe conversion or a community zoo port
what runtime/device constraints are known
which benchmark records exist
which fields are confirmed and which remain unknown

Current scope

Area	Count / status
Model records	79
Artifact provenance records	79
Source records	21
Main upstreams	2
Upstream taxonomy entries	66
Benchmark records	66
Terminology records	42
JSON exports	generated via script

Main upstreams:

john-rocky/coreai-model-zoo
apple/coreai-models

Primary Hugging Face artifact owner currently mapped:

mlboydaisuke

Repository structure

coreai-catalog/
├── README.md
├── AGENTS.md
├── CONTRIBUTING.md
├── CREDITS.md
├── pyproject.toml
├── catalog.yaml
├── artifacts.yaml
├── sources.yaml
├── upstreams.yaml
├── benchmarks.yaml
├── terms.yaml
├── requirements.txt
├── schema/
│   ├── model.schema.json
│   ├── artifact.schema.json
│   ├── upstream.schema.json
│   ├── benchmark.schema.json
│   └── term.schema.json
├── scripts/
│   ├── validate.py
│   ├── audit.py
│   ├── deep_audit.py
│   ├── derive_fields.py
│   ├── generate.py
│   ├── sync_upstream.py
│   └── check_sources.sh
├── coreai_catalog/
│   ├── __init__.py
│   ├── __main__.py
│   ├── cli.py
│   ├── catalog.py
│   ├── exports.py
│   └── installer.py
├── mcp_server/
│   ├── __init__.py
│   └── server.py
├── skills/
│   ├── coreai-model-selection/
│   └── coreai-license-triage/
├── llms.txt
├── docs/
│   ├── index.md
│   ├── model-registry.md
│   ├── capability-matrix.md
│   ├── runtime-matrix.md
│   ├── artifact-provenance.md
│   ├── upstream-map.md
│   ├── benchmark-map.md
│   ├── source-map.md
│   ├── apple-terminology-map.md
│   ├── data-model.md
│   ├── compare/
│   ├── v0.3-verification.md
│   ├── sota-maintenance.md
│   └── generated-files.md
└── .github/
    └── workflows/
        └── validate.yml

JSON exports are generated by scripts/generate.py and committed to dist/. They are available via raw GitHub URLs (e.g. https://raw.githubusercontent.com/kevinqz/coreai-catalog/main/dist/catalog.json) without cloning the repo.
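As an illustration of consuming an export without cloning the repo, here is a minimal Python sketch. The URL is the one quoted above. The layout of the JSON payload is not documented in this README, so the helper names (`fetch_json`, `describe`) are hypothetical and the code only inspects the top-level shape rather than assuming a schema.

```python
import json
from urllib.request import urlopen

# URL quoted in the README; the payload layout is an unknown here, so nothing
# below assumes more than "the response is valid JSON".
CATALOG_URL = (
    "https://raw.githubusercontent.com/kevinqz/coreai-catalog/main/dist/catalog.json"
)


def fetch_json(url: str = CATALOG_URL, timeout: float = 10.0):
    """Download and decode one JSON export (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def describe(payload) -> str:
    """Summarise the top-level shape of a decoded export without assuming its schema."""
    if isinstance(payload, list):
        return f"list with {len(payload)} items"
    if isinstance(payload, dict):
        return "object with keys: " + ", ".join(sorted(payload))
    return type(payload).__name__
```

Calling `describe(fetch_json())` is a reasonable first step before writing code that depends on specific keys.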

Source of truth

File	Purpose
`catalog.yaml`	Model facts: name, family, capabilities, modalities, size, runtime, device support, license status and verification status. Measurements live in `benchmarks.yaml`, not here.
`artifacts.yaml`	Converted artifact provenance: GitHub conversion source, Hugging Face owner/repo/url and official recipe status.
`sources.yaml`	Compact registry of primary/supporting sources already used by the catalog.
`upstreams.yaml`	Source taxonomy for framework, conversion, artifact host, benchmark, sample, original model and license sources.
`benchmarks.yaml`	Normalized benchmark records by model, metric, device, compute unit and source.
`terms.yaml`	Verified Apple AI terminology, tagged by ecosystem layer, each citing an official Apple source.
`CREDITS.md`	Human-readable attribution for GitHub and Hugging Face users/repositories.
`schema/*.json`	Validation contracts for model, artifact, upstream, benchmark and term records.
`docs/*.md`	Generated or curated human views.
`dist/*.json`	Generated machine-readable exports.

Core data model

A model entry in catalog.yaml represents model metadata:

- id: qwen3-5-0-8b
  name: Qwen3.5-0.8B
  family: Qwen
  source_group: zoo
  capabilities:
    - chat
    - text-generation
  modalities:
    input:
      - text
    output:
      - text
  artifact:
    format: aimodel
    availability: available
  runtime:
    runtime_name: apple-core-ai
    runner: CoreAIRunner
  status: confirmed
  confidence: medium

An artifact entry in artifacts.yaml represents converted artifact provenance and hosting:

- id: qwen3-5-0-8b
  group: zoo
  github:
    owner: john-rocky
    repo: coreai-model-zoo
    path: https://github.com/john-rocky/coreai-model-zoo/blob/main/zoo/qwen3.5.md
  huggingface:
    owner: mlboydaisuke
    repo: qwen3.5-0.8B-CoreAI
    url: https://huggingface.co/mlboydaisuke/qwen3.5-0.8B-CoreAI
  officiality:
    apple_export_recipe: false
    apple_hosted_artifact: false
    community_packaged: true

An upstream entry in upstreams.yaml represents source taxonomy:

- id: qwen
  title: Qwen original model family
  category: original_model
  platform: huggingface
  owner: Qwen
  url: https://huggingface.co/Qwen
  trust: original_model_primary
  applies_to:
    - qwen3-5-0-8b
    - qwen3-vl-2b

A benchmark entry in benchmarks.yaml represents a normalized measurement:

- id: qwen3-5-0-8b-iphone17pro-gpu-toks
  model_id: qwen3-5-0-8b
  metric: decode_throughput
  unit: tokens_per_second
  value: 71.9
  device: iPhone 17 Pro
  compute_unit: GPU
  environment: iOS 27 beta, coreai-pipelined engine
  observed: '2026-06-25'
  source: john-rocky-coreai-model-zoo
  confidence: medium

Measurements are the single source of truth in benchmarks.yaml (model records carry no inline numbers). Each row is environment-scoped and append-only: values that differ across OS/runtime versions are kept as separate dated records, and a superseded value is retained with confidence: needs_review and a superseded_by pointer rather than overwritten.
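The append-only rule can be sketched in Python as follows. This is an illustration of the described behaviour, not the project's own tooling: the function name `supersede` is hypothetical, and the replacement record in the demo is a placeholder with no measured value.

```python
import copy


def supersede(records, old_id, new_record):
    """Return a new list where old_id is flagged and new_record is appended.

    The old row is retained; it only gains `confidence: needs_review`
    and a `superseded_by` pointer, as described in the text.
    """
    if any(record["id"] == new_record["id"] for record in records):
        raise ValueError(f"duplicate benchmark id: {new_record['id']}")
    updated = []
    found = False
    for record in records:
        record = copy.deepcopy(record)  # never mutate the caller's rows
        if record["id"] == old_id:
            record["confidence"] = "needs_review"
            record["superseded_by"] = new_record["id"]
            found = True
        updated.append(record)
    if not found:
        raise KeyError(f"no benchmark record with id {old_id!r}")
    updated.append(copy.deepcopy(new_record))
    return updated


old = {"id": "qwen3-5-0-8b-iphone17pro-gpu-toks", "model_id": "qwen3-5-0-8b",
       "value": 71.9, "confidence": "medium"}
# Hypothetical placeholder row: the id is invented and no value is claimed.
new = {"id": "example-rerun", "model_id": "qwen3-5-0-8b",
       "value": None, "confidence": "medium"}
history = supersede([old], old["id"], new)
```

After the call, `history` holds both rows: the original value is still present, and only its confidence and pointer have changed.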

Source layers

Layer	File/category	Purpose
Model facts	`catalog.yaml`	What the model is and what it does.
Converted artifact	`artifacts.yaml`	Where the Core AI artifact lives and who converted/hosts it.
Framework/runtime	`upstreams.yaml > framework_sources`	Apple Core AI, Core ML and tooling context.
Original model	`upstreams.yaml > original_model_sources`	Original creators/model-family sources.
License	`upstreams.yaml > license_sources`	License documents and review flags.
Benchmarks	`benchmarks.yaml`	Measurement rows, source IDs and confidence.
Human docs	`docs/*.md`	Tables, maps and curated summaries.
Machine exports	`dist/*.json`	Generated JSON outputs for agents/APIs.

Model groups

Group	Meaning
`zoo`	Community model port from `john-rocky/coreai-model-zoo`.
`official`	Artifact described upstream as an Apple official recipe conversion from `apple/coreai-models`.
`external`	External source, not yet used by the current catalog.
`unknown`	Not classified yet.

Official Apple recipe conversions

Entries with source_group: official in catalog.yaml and officiality.apple_export_recipe: true in artifacts.yaml are treated as official Apple recipe conversion artifacts. The officiality block disambiguates official of what: apple_export_recipe (converted via an Apple recipe), apple_hosted_artifact (Apple hosts the artifact — false for all current entries), and community_packaged (packaged/hosted by the community).

These entries credit:

GitHub source: apple/coreai-models
Artifact host: mlboydaisuke on Hugging Face

Current official entries include:

gpt-oss-20B
Qwen3 0.6B
Qwen3 4B
Qwen3 8B
Gemma 3 4B IT
Gemma 3 12B IT
Mistral 7B v0.3
FLUX.2 klein 4B
SAM 3
Whisper large-v3-turbo
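The way source_group and the officiality block combine can be sketched as a small Python function. The label strings and the function name are illustrative assumptions. The flag values in the second call are also an assumption: the text confirms apple_export_recipe: true and apple_hosted_artifact: false for official entries, while community_packaged: true is inferred from the community host.

```python
def describe_officiality(source_group: str, officiality: dict) -> str:
    """Combine the model group and officiality flags into one readable label."""
    recipe = officiality.get("apple_export_recipe", False)
    hosted = officiality.get("apple_hosted_artifact", False)
    community = officiality.get("community_packaged", False)

    if source_group == "official" and recipe:
        label = "official Apple recipe conversion"
    elif source_group == "zoo":
        label = "community zoo port"
    else:
        label = "unclassified"

    if hosted:
        host = "Apple-hosted"
    elif community:
        host = "community-packaged"
    else:
        host = "host unknown"
    return f"{label}, {host}"


# Flags copied from the qwen3-5-0-8b artifact example in this README.
print(describe_officiality("zoo", {
    "apple_export_recipe": False,
    "apple_hosted_artifact": False,
    "community_packaged": True,
}))  # → community zoo port, community-packaged

# Assumed flags for an official entry (community_packaged is inferred).
print(describe_officiality("official", {
    "apple_export_recipe": True,
    "apple_hosted_artifact": False,
    "community_packaged": True,
}))  # → official Apple recipe conversion, community-packaged
```

Keeping the recipe origin and the hosting party as separate outputs mirrors the "official of what" distinction made above.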

Original model attribution

Original model creators are tracked separately from converted artifact hosts. This avoids conflating:

original model creator
Apple official recipe source
community conversion source
Hugging Face artifact host
license source

Examples:

Model family	Original upstream	Converted artifact host
Qwen	`Qwen`	`mlboydaisuke`
Gemma	`google`	`mlboydaisuke`
Mistral	`mistralai`	`mlboydaisuke`
SAM	`facebook` / Meta	`mlboydaisuke`
RF-DETR	`Roboflow`	`mlboydaisuke`

See upstreams.yaml and docs/upstream-map.md.

Capabilities covered

The catalog currently covers:

chat / text generation
instruction following
reasoning / agentic LLMs
MoE LLMs
1.58-bit ternary LLMs
vision-language models
GUI grounding / computer use
document OCR
visual document retrieval (ColBERT / MaxSim)
audio understanding
text-to-speech
speech-to-text (ASR + transducer / TDT)
embeddings
reranking
image-text similarity (CLIP)
object detection
instance segmentation
promptable segmentation
monocular depth
image generation
super-resolution
text-to-video
image-to-3D (Gaussian splatting)
text-to-audio (generative music)
diffusion LLMs (dLLM)
vision-language-action (VLA / robotics)

Devices and runtime metadata

The catalog tracks known runtime/device facts when available:

Apple Core AI artifact format
.aimodel availability
stock runtime vs community runtime
runner name
tokenizer requirement
processor requirement
custom Metal kernel requirement
patch/workaround requirement
AOT requirement
iPhone/iPad/Mac support
Mac-only status

Unknown or unverified values are intentionally kept as unknown instead of guessed.
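One way a consumer might honour the unknown convention is to treat device support as three-valued instead of boolean. The sketch below assumes a hypothetical `device_support` mapping with values `True`, `False` or `"unknown"`. The actual field names in catalog.yaml are not shown in this README, and the model ids are invented for the demo.

```python
def supports(model: dict, device: str) -> str:
    """Return 'yes', 'no' or 'unknown'; a missing value is never coerced to False."""
    value = model.get("device_support", {}).get(device, "unknown")
    if value is True:
        return "yes"
    if value is False:
        return "no"
    return "unknown"


def partition_by_device(models, device):
    """Split model ids into confirmed, excluded and needs-verification buckets."""
    buckets = {"yes": [], "no": [], "unknown": []}
    for model in models:
        buckets[supports(model, device)].append(model["id"])
    return buckets


# Hypothetical records; ids and the field layout are illustrative only.
sample = [
    {"id": "model-a", "device_support": {"iphone": True, "mac": True}},
    {"id": "model-b", "device_support": {"iphone": False, "mac": True}},
    {"id": "model-c", "device_support": {"iphone": "unknown"}},
    {"id": "model-d"},
]
print(partition_by_device(sample, "iphone"))
# → {'yes': ['model-a'], 'no': ['model-b'], 'unknown': ['model-c', 'model-d']}
```

Keeping the third bucket visible lets a caller decide whether to verify those models manually instead of silently dropping them.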

Validation and generation

Install dependencies:

pip install -r requirements.txt

Validate records:

python scripts/validate.py

Regenerate Markdown docs:

python scripts/generate.py --docs

Export JSON, search indexes, and readiness scores:

python scripts/generate.py --json

Or generate everything at once:

python scripts/generate.py

The GitHub Actions workflow runs validation, generation, CLI smoke test, and MCP assertion on every push and pull request.
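To give a feel for the kind of cross-file consistency a validation step can enforce, here is a minimal Python sketch of a referential integrity check. It is not a description of what scripts/validate.py actually does. It only assumes the `id` and `model_id` fields shown in the data model examples, and the function name is hypothetical.

```python
def check_references(models, artifacts, benchmarks):
    """Return a list of readable problems; an empty list means the check passed."""
    problems = []
    seen = set()
    for model in models:
        if model["id"] in seen:
            problems.append(f"duplicate model id: {model['id']}")
        seen.add(model["id"])
    for artifact in artifacts:
        if artifact["id"] not in seen:
            problems.append(f"artifact without model: {artifact['id']}")
    for bench in benchmarks:
        if bench["model_id"] not in seen:
            problems.append(
                f"benchmark {bench['id']} points at unknown model {bench['model_id']}"
            )
    return problems


models = [{"id": "qwen3-5-0-8b"}]
artifacts = [{"id": "qwen3-5-0-8b"}]
benchmarks = [{"id": "qwen3-5-0-8b-iphone17pro-gpu-toks", "model_id": "qwen3-5-0-8b"}]
print(check_references(models, artifacts, benchmarks))  # → []
```

Collecting every problem into a list, rather than stopping at the first, tends to make CI output more useful when several records are wrong at once.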

CLI

Install the CLI for the full experience:

pip install -e .

Commands

# Discover models
coreai-catalog search --capability vision-language --device iphone
coreai-catalog list                          # all models, sorted by readiness score
coreai-catalog scores                        # 0-100 readiness scores with grade distribution
coreai-catalog capabilities                  # list all capabilities with model counts

# Inspect a model
coreai-catalog show qwen3-vl-2b              # full details: caps, devices, runtime, provenance, benchmarks
coreai-catalog show qwen3-vl-2b -v           # verbose — full notes, not truncated
coreai-catalog compare qwen3-vl-2b unlimited-ocr  # side-by-side

# Get recommendations
coreai-catalog recommend --task "robot vision" --device iphone
coreai-catalog recommend --task "private on-device OCR" --device iphone
coreai-catalog recommend --task "voice assistant" --device mac

# Install a model (downloads from Hugging Face, writes manifest + Swift snippet)
coreai-catalog install qwen3-vl-2b           # downloads artifact, generates snippet.swift
coreai-catalog install qwen3-vl-2b --dry-run # preview download size without downloading
coreai-catalog installed                     # list locally installed models
coreai-catalog uninstall qwen3-vl-2b

# Check your environment
coreai-catalog doctor                        # checks Python, Xcode, coreai-torch, coreai-opt, HF CLI, disk

All commands support --json for programmatic consumption by agents and automation.
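A minimal Python sketch of driving the CLI from automation follows. It uses the `search` flags shown above plus `--json`. Placing `--json` at the end of the command is an assumption, the structure of the returned JSON is not documented here, and the helper names are hypothetical.

```python
import json
import subprocess


def build_search_command(capability: str, device: str) -> list[str]:
    """Build the argv for a JSON search; flag names are taken from the README."""
    return [
        "coreai-catalog", "search",
        "--capability", capability,
        "--device", device,
        "--json",  # assumed to be accepted in trailing position
    ]


def run_json(command: list[str]):
    """Run a command and decode its stdout as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit status and
    json.JSONDecodeError if the output is not valid JSON.
    """
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

With the package installed, `run_json(build_search_command("vision-language", "iphone"))` would return whatever structure the CLI emits; inspect it before relying on specific keys.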

MCP server (Agent API)

The catalog ships an MCP server that exposes 11 tools to AI agents (Claude Desktop, Cursor, any MCP-compatible client).

Setup

pip install -e ".[mcp]"

Configure in Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "coreai-catalog": {
      "command": "python",
      "args": ["mcp_server/server.py"]
    }
  }
}

Or use the installed entry point:

{
  "mcpServers": {
    "coreai-catalog": {
      "command": "coreai-catalog-mcp"
    }
  }
}
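If the configuration file already lists other MCP servers, the entry should be merged in instead of overwriting the file. The Python sketch below shows one way to do that. The function name is hypothetical, and the demo writes to a temporary directory so that no real configuration is touched.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace one server entry while keeping every other setting."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains an unrelated server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(
        '{"mcpServers": {"other": {"command": "other-server"}}}', encoding="utf-8"
    )
    merged = add_mcp_server(
        demo_path, "coreai-catalog", {"command": "coreai-catalog-mcp"}
    )
    print(sorted(merged["mcpServers"]))  # → ['coreai-catalog', 'other']
```

For real use, the path would be the Claude Desktop file named above, for example `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"`.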

Available tools

Tool	Description
`search_models`	Filter by capability, device, license, family, source, modality
`get_model`	Full model details: capabilities, runtime, provenance, benchmarks
`compare_models`	Side-by-side comparison of 2+ models
`recommend_model`	Task-based recommendations (89 task synonyms mapped)
`check_license`	License and commercial use triage for a model
`get_benchmarks`	All benchmark records for a model
`get_artifact`	Artifact provenance and download info
`explain_term`	Apple AI terminology lookup (42 verified terms)
`get_capabilities`	List all capabilities with model counts
`get_tasks`	List all supported task synonyms and their mappings
`get_version`	Catalog version, model count, last-verified date

Example agent interaction

User: I need a vision-language model that runs on iPhone for robot perception.

Agent calls: search_models(capability="vision-language", device="iphone")
→ Returns 6 candidates with readiness scores

Agent calls: compare_models(["qwen3-vl-2b", "minicpm-v-4-6"])
→ Returns side-by-side comparison

Agent calls: check_license("qwen3-vl-2b")
→ Returns Apache-2.0, commercial_use: likely

Agent recommends: Qwen3-VL 2B — benchmarked, iPhone-supported, Apache-2.0

Query and decision

All query and decision tools are built into the CLI and the MCP server, both described above. There is no separate scripts/query.py or scripts/recommend.py; the CLI is the single entry point for both humans and automation.

Documentation

Generated docs are produced from the YAML source by scripts and must not be hand-edited; curated docs are maintained manually (see docs/generated-files.md).

Doc	Type	Description
`docs/getting-started.md`	curated	60-second → 10-minute walkthrough
`docs/index.md`	generated	Docs entry point and counts (`scripts/generate.py`).
`docs/model-registry.md`	generated	Human-readable model table (`scripts/generate.py`).
`docs/artifact-provenance.md`	generated	Artifact ownership and hosting view (`scripts/generate.py`).
`docs/apple-terminology-map.md`	generated	Verified Apple AI terminology by layer (`scripts/generate.py`).
`docs/tasks/`	generated	Per-capability task pages with model tables (`scripts/generate.py`).
`docs/concepts/`	curated	Model vs artifact, runtime landscape, license risk, benchmark quality.
`docs/data-model.md`	curated	Entity model and relationship documentation.
`docs/capability-matrix.md`	curated	Models grouped by capability.
`docs/runtime-matrix.md`	curated	Runtime concepts and flags.
`docs/upstream-map.md`	curated	Framework/original-model/license upstream map.
`docs/benchmark-map.md`	curated	Benchmark registry explanation.
`docs/source-map.md`	curated	Source and upstream map.
`docs/sota-maintenance.md`	curated	Maintenance plan and data-model direction.
`docs/generated-files.md`	curated	Generated vs curated file policy.
`PROJECT_PHILOSOPHY.md`	curated	Why the project exists, design principles, non-goals.

Attribution

This project is a catalog and attribution layer. It does not claim ownership of upstream model artifacts or source repositories.

Primary credits are recorded in:

CREDITS.md
sources.yaml
artifacts.yaml
upstreams.yaml

Key credited sources include:

john-rocky/coreai-model-zoo
john-rocky/CoreML-Models
apple/coreai-models
apple/coremltools
john-rocky/apple-silicon-llm-bench
john-rocky/coreai-samples
Hugging Face user mlboydaisuke
original model creators listed in upstreams.yaml

License handling

Licenses are tracked per model when known. Some entries are marked as check_license when commercial-use terms need explicit review.

Important rule:

The repository license, upstream code license, model license and artifact-hosting license may differ.

For sensitive licenses such as Gemma Terms, Meta SAM License, LFM Open License or OpenRAIL-style licenses, treat commercial_use: check_license as requiring manual review before use.
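The triage rule can be expressed as a short Python sketch. It assumes a hypothetical top-level `commercial_use` key on each record. Only the labels `likely` and `check_license` appear in this README, and treating a missing label as `unknown` is an assumption in line with the catalog's no-guessing rule.

```python
# Labels that always require a human to read the license before use.
REVIEW_LABELS = {"check_license", "unknown"}


def triage(model: dict) -> str:
    """Turn a commercial_use label into a next step; never a legal conclusion."""
    label = model.get("commercial_use", "unknown")
    if label in REVIEW_LABELS:
        return "manual review required"
    return f"label '{label}': still verify upstream model, code and artifact licenses"


print(triage({"id": "example-a", "commercial_use": "check_license"}))
# → manual review required
print(triage({"id": "example-b", "commercial_use": "likely"}))
# → label 'likely': still verify upstream model, code and artifact licenses
```

Even the permissive branch returns a reminder instead of an approval, since the labels are triage hints and not permissions.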

Maintenance rules

One meaningful model variant should have one catalog entry.
Do not collapse variants when size, device support, runtime, quantization, license or artifact changes.
Use unknown instead of guessing.
Keep catalog.yaml focused on model facts.
Keep artifacts.yaml focused on converted artifact provenance and hosting.
Keep upstreams.yaml focused on original model, framework, license and benchmark sources.
Keep benchmarks.yaml focused on normalized measurement records.
Keep sources.yaml focused on compact source registry.
Generate Markdown and JSON views from YAML whenever possible.
Credit original model creator, conversion source and artifact host separately.
Update last_verified when a source is rechecked.

Roadmap

Current milestone:

v2.0.0 — Web UI (GitHub Pages): model explorer, task browser, filters, search.

Earlier:

v1.7.0 — Public Python library API (from coreai_catalog import Catalog), schema versioning docs.
v1.6.0 — Task-first discovery: tasks command, recommend --explain, enriched MCP get_tasks.
v1.5.0 — Structured docs (philosophy, getting-started, concepts, task pages), community templates, issue templates.
v1.4.0 — PyPI-ready, 60-second demo, Swift examples, recommend redesign.
v1.3.x — RWKV-7 Goose 1.5B, source-monitor cron, 3-round red-team, dist/ committed, docs sync.
v1.3.0 — CLI↔MCP parity, TASK_MAP expanded 40→89, version command, terminology alignment ("Core AI").
v1.2.x — Fuzzy search, capability aliases, ANSI auto-detect, recommend --license, installer hardening, DX improvements.
v1.0 — Error resilience: 8 crash fixes + 63-test suite + CI integration.
v0.6 — Technical backfill (precision, quantization, runtime flags), non-LLM benchmarks, terminology to 42 terms.
v0.5 — Expanded model coverage: ternary LLM, GUI grounding, visual retrieval, transducer ASR, video, 3D, diffusion LLM, VLA.
v0.4 — Verified Apple AI terminology layer, artifact officiality, benchmark provenance.
v0.3 — Validation depth, upstream taxonomy, benchmark registry.

Later:

Split large YAML files into data/models/*.yaml if the catalog grows significantly.
Richer model cards, per-model pages, and SEO optimization on the web UI.
Additional filters: runtime, maturity, confidence, artifact availability, modality.
Automated source verification (in progress via scripts/check_sources.sh).

Non-goals

This repository does not currently define:

model workflows
app logic
inference pipelines
benchmarking harnesses
model conversion scripts
runtime implementations

Those belong in separate repositories or future layers.

Upstream

Primary community upstream:

https://github.com/john-rocky/coreai-model-zoo

Official Apple recipe upstream:

https://github.com/apple/coreai-models

Additional upstream taxonomy:

upstreams.yaml
docs/upstream-map.md

Release history

2.2.3

Jul 4, 2026

2.2.2

Jul 4, 2026

2.2.1

Jul 3, 2026

2.2.0

Jul 3, 2026

2.1.0

Jul 2, 2026

2.0.5

Jul 1, 2026

2.0.4

Jul 1, 2026

This version

2.0.3

Jul 1, 2026

2.0.2

Jul 1, 2026

Download files

Source Distribution

coreai_catalog-2.0.3.tar.gz (76.4 kB)

Uploaded Jul 1, 2026 Source

Built Distribution

coreai_catalog-2.0.3-py3-none-any.whl (72.6 kB)

Uploaded Jul 1, 2026 Python 3

File details

Details for the file coreai_catalog-2.0.3.tar.gz.

File metadata

Download URL: coreai_catalog-2.0.3.tar.gz
Upload date: Jul 1, 2026
Size: 76.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for coreai_catalog-2.0.3.tar.gz
Algorithm	Hash digest
SHA256	`c6de37da90532bb81b620636b3f8e0a4c598f571b769983e7810f21485d9c901`
MD5	`9d78fc3e4826828d0b602932e256f81e`
BLAKE2b-256	`1475155be47e4e77a5f10d159a6177bb32898ed2bcca63633144b15e4be96191`

File details

Details for the file coreai_catalog-2.0.3-py3-none-any.whl.

File metadata

Download URL: coreai_catalog-2.0.3-py3-none-any.whl
Upload date: Jul 1, 2026
Size: 72.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for coreai_catalog-2.0.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ce43a7ca4737a68a2f62e873e78816ed213b02bd61908358942c067201ef1ab2`
MD5	`ad2e745946dddefb9d533aa970425920`
BLAKE2b-256	`753c775bf71008acac90366b9adf050c50515d9c91c2de14654715def2bdb453`
