
Run MLX-compatible HuggingFace models locally with Pydantic AI

Project description

pydantic-ai-mlx

Local MLX inference for Pydantic AI, via LM Studio or directly through mlx-lm.



Run MLX-compatible HuggingFace models locally on Apple silicon with Pydantic AI.

Two backends are provided:

  • LM Studio backend (an OpenAI-compatible server that can also use mlx-lm; the model runs in a separate process)
  • mlx-lm backend (direct integration with Apple's mlx-lm library; the model runs within your Python process; experimental)

STILL IN DEVELOPMENT, NOT RECOMMENDED FOR PRODUCTION USE YET.

Contributions are welcome!

Features

  • LM Studio backend (should be fully supported)
  • Streaming text support for mlx-lm backend
  • Tool calling support for mlx-lm backend

As of January 2025, Apple's MLX appears to outperform llama.cpp (used by Ollama) on Apple silicon.

Installation

uv add pydantic-ai-mlx
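
The package is published on PyPI, so if you are not using uv it can also be installed with pip:

pip install pydantic-ai-mlx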

Usage

LM Studio backend

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage
from pydantic_ai_lm_studio import LMStudioModel

model = LMStudioModel(model_name="mlx-community/Qwen2.5-7B-Instruct-4bit") # supports tool calling
agent = Agent(model, system_prompt="You are a chatbot.")

async def stream_response(user_prompt: str, message_history: list[ModelMessage]):
    async with agent.run_stream(user_prompt, message_history=message_history) as result:
        async for message in result.stream():
            yield message
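
The model above is noted as supporting tool calling. Continuing from the snippet above, here is a minimal sketch of registering a tool via Pydantic AI's standard decorator (the tool body and prompt are illustrative only, not part of this package):

@agent.tool_plain
def get_local_time(city: str) -> str:
    """Return the current local time for a city (stubbed for illustration)."""
    return f"It is 14:00 in {city}."

result = agent.run_sync("What time is it in Oslo?")
print(result.data)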

mlx-lm backend

from pydantic_ai import Agent
from pydantic_ai.messages import ModelMessage
from pydantic_ai_mlx_lm import MLXModel

model = MLXModel(model_name="mlx-community/Llama-3.2-3B-Instruct-4bit")
# See https://github.com/ml-explore/mlx-examples/blob/main/llms/README.md#supported-models
# also https://huggingface.co/mlx-community

agent = Agent(model, system_prompt="You are a chatbot.")

async def stream_response(user_prompt: str, message_history: list[ModelMessage]):
    async with agent.run_stream(user_prompt, message_history=message_history) as result:
        async for message in result.stream():
            yield message
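
Either stream_response generator can then be driven from an event loop. A small usage sketch (the prompt is illustrative; per Pydantic AI's streaming semantics, each yielded item is the response accumulated so far):

import asyncio

async def main():
    history: list[ModelMessage] = []
    async for text in stream_response("Tell me a joke.", history):
        print(text)  # accumulated response so far

asyncio.run(main())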



Download files

Download the file for your platform.

Source Distribution

pydantic_ai_mlx-0.2.0.tar.gz (43.3 kB)


Built Distribution


pydantic_ai_mlx-0.2.0-py3-none-any.whl (9.7 kB)


File details

Details for the file pydantic_ai_mlx-0.2.0.tar.gz.

File metadata

  • Download URL: pydantic_ai_mlx-0.2.0.tar.gz
  • Upload date:
  • Size: 43.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.24

File hashes

Hashes for pydantic_ai_mlx-0.2.0.tar.gz

  • SHA256: 2cb433ef915eafcd84abb08369d7c0ee4abac354fa3e362e88af8821b573ada5
  • MD5: 3afee07dc7cf61845732a1cb3b1ff217
  • BLAKE2b-256: f69552d87d5a055d5099310dce4d27c7ddb5e432265a865a28417337e6f6c2cf


File details

Details for the file pydantic_ai_mlx-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for pydantic_ai_mlx-0.2.0-py3-none-any.whl

  • SHA256: 5e3a0880e388dd005693ed8f0e78682962ea397dc9b7b77f142318ee66a1c310
  • MD5: 3b1c84ecae04d7955f65e6723cc0b1cd
  • BLAKE2b-256: 5a0cd7f57fcd57b44ccbfb4b0685bc55be28003c288b59b5011f6a7a11e92af6

