LiteLLM Apple Foundation Models (custom provider)

LiteLLM custom provider for Apple's on-device Foundation Models (macOS 26+ with Apple Intelligence).

Pre-requisites

  • macOS 26.0+ (Tahoe) with Apple Intelligence enabled
  • Python 3.9+

Install

pip install -e .

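The command above installs from a source checkout in editable mode; the published release can be installed with pip install litellm-apple-foundation-models. The package requires Python 3.9+ and depends on apple-foundation-models, which is only available on macOS.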
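
Because the dependency exists only on macOS, a script shared across platforms may want to check the host before registering the provider. The following is a minimal sketch using only the standard library. The names meets_minimum and host_supports_provider are hypothetical helpers, not part of this package, and the check looks at the version number only; it does not verify that Apple Intelligence is enabled.

```python
import platform
import sys


def meets_minimum(version: str, minimum: tuple = (26, 0)) -> bool:
    """Return True if a dotted version string is at least `minimum`."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            return False  # empty or malformed version string
        parts.append(int(piece))
    # Pad short versions so that "26" compares as (26, 0)
    while len(parts) < len(minimum):
        parts.append(0)
    return tuple(parts) >= tuple(minimum)


def host_supports_provider() -> bool:
    """Hypothetical guard: macOS 26+ judged by version number only."""
    if sys.platform != "darwin":
        return False
    return meets_minimum(platform.mac_ver()[0])
```

Some Python builds may report a compatibility version number rather than the marketing version, so the result is best treated as a hint rather than a guarantee.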

Quick start

import litellm
from litellm_apple_foundation_models import (
    register_apple_foundation_models_provider,
)

# Register the provider once in your process
register_apple_foundation_models_provider()

resp = litellm.completion(
    model="apple_foundation_models/system",
    messages=[{"role": "user", "content": "Hello from macOS"}],
)
print(resp.choices[0].message.content)

Streaming and async are both supported. Streaming:

resp = litellm.completion(
    model="apple_foundation_models/system",
    messages=[{"role": "user", "content": "Write a haiku about objc"}],
    stream=True,
)
for chunk in resp:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Async:

import asyncio
from litellm import acompletion

async def main():
    response = await acompletion(
        model="apple_foundation_models/system",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
        max_tokens=100,
    )
    print(response)

asyncio.run(main())

Async + streaming:

import asyncio
from litellm import acompletion

async def main():
    response = await acompletion(
        model="apple_foundation_models/system",
        messages=[{"role": "user", "content": "Write a short poem about AI"}],
        stream=True,
        max_tokens=200,
    )

    async for chunk in response:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())

Tool calling

from litellm import completion
from litellm_apple_foundation_models import register_apple_foundation_models_provider

register_apple_foundation_models_provider()

def get_weather(location: str, units: str = "celsius") -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 22°{units[0].upper()}, sunny"

def calculate(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

response = completion(
    model="apple_foundation_models/system",
    messages=[{"role": "user", "content": "What's the weather in Paris and what's 5 plus 7?"}],
    tool_functions=[get_weather, calculate],
    max_tokens=200,
)

if response.choices[0].message.tool_calls:
    for tool_call in response.choices[0].message.tool_calls:
        print(f"Tool: {tool_call.function.name}({tool_call.function.arguments})")

print(response.choices[0].message.content)

You can also pass OpenAI-style schemas via tools=[...] and map implementations with tool_functions.
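
The OpenAI-style form mentioned above could look roughly like the following sketch. The dictionary layout is the standard OpenAI function-tool format. The helper build_tool_schema is hypothetical (not part of this package) and only handles simple scalar parameters. How tool_functions should be shaped when combined with tools is an assumption here: the commented call reuses the list of callables from the example above, so check the package source before relying on it.

```python
import inspect

# Maps Python annotations to JSON Schema type names (simple scalars only).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_schema(func) -> dict:
    """Hypothetical helper: derive an OpenAI-style tool schema from a function."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string"
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value, so the argument is required
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def get_weather(location: str, units: str = "celsius") -> str:
    """Get the current weather for a location."""
    return f"Weather in {location}: 22°{units[0].upper()}, sunny"


weather_tool = build_tool_schema(get_weather)

# Assumed usage, following the sentence above (not verified against the package):
# completion(
#     model="apple_foundation_models/system",
#     messages=[{"role": "user", "content": "What's the weather in Paris?"}],
#     tools=[weather_tool],
#     tool_functions=[get_weather],
# )
```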

Structured output

Pydantic:

from pydantic import BaseModel
from litellm import completion

class Person(BaseModel):
    name: str
    age: int
    city: str

response = completion(
    model="apple_foundation_models/system",
    messages=[{"role": "user", "content": "Extract person info: Alice is 30 and lives in Paris."}],
    response_format=Person,
    max_tokens=150,
)
print(response.choices[0].message.content)

JSON schema:

import json
from litellm import completion

schema = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number"},
    },
    "required": ["answer"],
}

response = completion(
    model="apple_foundation_models/system",
    messages=[{"role": "user", "content": "Is the sky blue? Return JSON with answer and confidence (0-1)."}],
    response_format={"type": "json_schema", "json_schema": {"schema": schema}},
    max_tokens=100,
)
print(response.choices[0].message.content)
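
Both examples print the content as-is. Assuming the provider returns the structured result as a JSON string in message.content (this README does not confirm the exact form), it can be parsed and checked with the standard library. In the sketch below, parse_structured is a hypothetical helper and sample_content is a made-up stand-in, not real model output.

```python
import json


def parse_structured(content: str, required: list) -> dict:
    """Parse a JSON reply and check that the required keys are present."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return data


# Stand-in for response.choices[0].message.content (not real model output)
sample_content = '{"answer": "yes", "confidence": 0.9}'
result = parse_structured(sample_content, required=["answer"])
print(result["answer"])
```

For the Pydantic variant, Person.model_validate_json(...) in Pydantic v2 plays the same role, under the same assumption about the content format.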

Supported parameters

  • temperature: float
  • max_tokens: int
  • stream: bool
  • tools / tool_functions
  • response_format (Pydantic model or JSON schema)
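
How the provider treats parameters outside this list is not documented here. One cautious option is to filter keyword arguments before calling completion. This is a minimal sketch: SUPPORTED_PARAMS and filter_params are hypothetical names, and the set simply copies the list above.

```python
# Copied from the "Supported parameters" list; not read from the package itself.
SUPPORTED_PARAMS = {
    "temperature",
    "max_tokens",
    "stream",
    "tools",
    "tool_functions",
    "response_format",
}


def filter_params(**kwargs) -> dict:
    """Keep only keyword arguments named in the supported-parameter list."""
    return {key: value for key, value in kwargs.items() if key in SUPPORTED_PARAMS}


params = filter_params(temperature=0.2, max_tokens=100, top_p=0.9)
print(params)  # top_p is dropped
```

The filtered dictionary can then be unpacked into the call, for example completion(model=..., messages=..., **params).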

How it works

  • Uses LiteLLM's CustomLLM interface to avoid touching LiteLLM core.
  • Registers itself via litellm.custom_provider_map so calls to litellm.completion with model="apple_foundation_models/*" are routed here.
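
The prefix-based routing described above can be pictured with a toy dispatcher. This is not LiteLLM's implementation: it only mirrors the shape of custom_provider_map entries (dictionaries with "provider" and "custom_handler" keys), and EchoHandler and route are hypothetical names used for illustration.

```python
class EchoHandler:
    """Stand-in for a CustomLLM subclass; it just echoes the last message."""

    def completion(self, model: str, messages: list) -> str:
        return f"[{model}] {messages[-1]['content']}"


# Same shape as entries in litellm.custom_provider_map
provider_map = [
    {"provider": "apple_foundation_models", "custom_handler": EchoHandler()},
]


def route(model: str, messages: list) -> str:
    """Split 'provider/model' and dispatch to the matching handler."""
    provider, _, model_name = model.partition("/")
    for entry in provider_map:
        if entry["provider"] == provider:
            return entry["custom_handler"].completion(model_name, messages)
    raise ValueError(f"no custom provider registered for {provider!r}")


print(route("apple_foundation_models/system", [{"role": "user", "content": "Hi"}]))
```

In the real package, register_apple_foundation_models_provider() takes care of the registration step, so user code never builds this map by hand.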

Development

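  • Tests live under tests/ and mirror the coverage from the original core PR.
  • To run tests locally (the extras are quoted so that zsh, the default macOS shell, does not treat the brackets as a glob pattern):
pip install -e ".[dev]"
pytest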
