
Pydantic AI Ollama Wrapper

This project provides a custom OllamaModel wrapper for the Pydantic AI framework, enabling seamless integration with and fine-grained control over Ollama models. It addresses the limitations of using the generic OpenAIModel with Ollama by exposing Ollama-specific parameters and features, and the native integration can bring faster response times and fewer API calls.

Features

  • Dedicated OllamaModel class for Pydantic AI.
  • Comprehensive OllamaModelSettings for all Ollama-specific parameters, including temperature, num_predict, top_k, top_p, stop, and an advanced think (reasoning) mode (see the sketch after this list).
  • Integration with the standard Ollama Python client.
  • Easy to use with Pydantic AI's Agent.
  • Full support for streaming responses.
  • Function Calling / Tool Use: Define and use tools for structured interactions with the model.
  • Structured Output: Generate responses directly as Pydantic models (JSON format).
  • Multi-modal Input: Send images along with text prompts to supported Ollama models.
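
For example, the think (reasoning) mode mentioned above is configured through OllamaModelSettings. A minimal sketch, assuming the setting is exposed as a boolean think field (check OllamaModelSettings for the exact field name):

from pydanticai_ollama.settings.ollama import OllamaModelSettings

# Enable Ollama's "think" (reasoning) mode alongside the usual sampling
# options. The `think` field name here is an assumption based on the
# Ollama API option of the same name.
reasoning_settings = OllamaModelSettings(
    think=True,
    temperature=0.2,
    num_predict=512,
)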

Quick Start

Install using pip

pip install pydanticai-ollama

Install using uv

uv add pydanticai-ollama

Installation from Source

  1. Clone the repository (if you haven't already):

    git clone https://github.com/ariel-ml/pydanticai-ollama.git
    cd pydanticai-ollama
    
  2. Create and activate a virtual environment using uv:

    uv venv
    # On Linux/macOS:
    source .venv/bin/activate
    # On Windows:
    # .venv\Scripts\activate
    
  3. Install the project dependencies:

    uv sync
    

    This command installs the project in editable mode together with its dependencies, making the OllamaModel and related components available in your Python environment.
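
    To confirm the install worked, you can run a quick import check from the project root:

    uv run python -c "from pydanticai_ollama.models.ollama import OllamaModel; print('OK')"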

Usage

First, ensure you have an Ollama server running and a model downloaded (e.g., ollama pull qwen3:4b-instruct, or ollama pull gemma3:latest for multi-modal capabilities).
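
Before running the examples, you can check that the server is reachable with the standard ollama Python client (which this wrapper builds on). A minimal sketch:

import ollama

# Quick connectivity check: lists the models available on the local server.
# Raises an exception if the server at the given host is unreachable.
client = ollama.Client(host="http://localhost:11434")
print(client.list())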

Basic Text Generation

import asyncio
from pydantic import BaseModel
from pydantic_ai import Agent
from pydanticai_ollama.models.ollama import OllamaModel
from pydanticai_ollama.providers.ollama import OllamaProvider
from pydanticai_ollama.settings.ollama import OllamaModelSettings

# 1. Define your output type (optional, but recommended for structured responses)
class CityLocation(BaseModel):
    city: str
    country: str

# 2. Configure OllamaModelSettings with desired parameters
#    You can set any parameter defined in OllamaModelSettings (e.g., temperature, num_predict, top_k, etc.)
ollama_settings = OllamaModelSettings(
    temperature=0.7,
    num_predict=128,
    num_ctx=2048,
    main_gpu=0,
    num_gpu=1,
    num_thread=4,
    repeat_penalty=1.1,
    top_k=40,
    top_p=0.9
)

# 3. Initialize OllamaProvider with your Ollama server's base URL
#    Default is "http://localhost:11434"
ollama_provider = OllamaProvider(base_url="http://localhost:11434")

# 4. Create an OllamaModel instance
#    Replace 'qwen3:4b-instruct' with the name of the Ollama model you want to use
ollama_model = OllamaModel(
    model_name='qwen3:4b-instruct',
    provider=ollama_provider,
    settings=ollama_settings,
)

# 5. Create a Pydantic AI Agent with your OllamaModel
agent = Agent(ollama_model, output_type=CityLocation)

# 6. Run the agent
async def main():
    result = await agent.run('Where were the Olympics held in 2012?')
    print(result.output)
    print(result.usage())

if __name__ == "__main__":
    asyncio.run(main())

Streaming Responses

The OllamaModel fully supports streaming responses. Use Agent.run_stream, which returns an async context manager; inside it you can iterate over the streamed text:

import asyncio
from pydantic_ai import Agent
from pydanticai_ollama.models.ollama import OllamaModel
from pydanticai_ollama.providers.ollama import OllamaProvider

async def streaming_example():
    ollama_model = OllamaModel(
        model_name='qwen3:4b-instruct',
        provider=OllamaProvider(base_url="http://localhost:11434")
    )
    agent = Agent(ollama_model)

    print("Streaming response:")
    async with agent.run_stream('Tell me a long story about a space cat.') as result:
        async for text in result.stream_text(delta=True):
            print(text, end='', flush=True)
    print("\nStreaming finished.")

if __name__ == "__main__":
    asyncio.run(streaming_example())

Function Calling / Tool Use

Define tools using Pydantic models and let the Ollama model call them:

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent, RunContext
from pydanticai_ollama.models.ollama import OllamaModel
from pydanticai_ollama.providers.ollama import OllamaProvider

# 1. Define a Pydantic model for your tool's arguments
class GetCurrentWeatherArgs(BaseModel):
    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
    unit: str = Field(default="celsius", description="The unit of temperature (celsius or fahrenheit) to return")

async def tool_use_example():
    ollama_model = OllamaModel(model_name='qwen3:4b-instruct') # Ensure model supports function calling (e.g., qwen)
    agent = Agent(ollama_model)

    @agent.tool
    def get_current_weather(ctx: RunContext[None], args: GetCurrentWeatherArgs) -> str:
        """Get the current weather in a given location."""
        # In a real application, this would call an external weather API
        print(f"Tool args: {args}")
        if "San Francisco" in args.location:
            return f"24 degrees {args.unit} and sunny in San Francisco."
        else:
            return f"28 degrees {args.unit} and cloudy in {args.location}."

    print("Tool use example:")
    result = await agent.run("What's the weather like in San Francisco?")
    print(result.output)

if __name__ == "__main__":
    asyncio.run(tool_use_example())

Structured Output (JSON Mode)

Force the model to respond with a Pydantic model (JSON):

import asyncio
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from pydanticai_ollama.models.ollama import OllamaModel
from pydanticai_ollama.providers.ollama import OllamaProvider

class Joke(BaseModel):
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")

async def structured_output_example():
    ollama_model = OllamaModel(model_name='qwen3:4b-instruct') # Ensure model supports JSON output
    agent = Agent(ollama_model, output_type=Joke)

    print("Structured output example:")
    joke_obj = await agent.run("Tell me a joke about a computer.")
    print(f"Setup: {joke_obj.output.setup}")
    print(f"Punchline: {joke_obj.output.punchline}")

if __name__ == "__main__":
    asyncio.run(structured_output_example())

Multi-modal Input (Images)

Send an image along with your prompt (requires a multi-modal Ollama model such as gemma3 or llava):

import asyncio
from pydantic_ai import Agent, ImageUrl
from pydanticai_ollama.models.ollama import OllamaModel
from pydanticai_ollama.providers.ollama import OllamaProvider

async def multimodal_example():
    ollama_model = OllamaModel(model_name='gemma3:latest') # Use a multi-modal model such as gemma3 or LLaVA
    agent = Agent(ollama_model)

    # You can use a local file path or a URL
    image_path = "tests/assets/kiwi.png" # Ensure this path is correct or use a URL
    image_url = ImageUrl(url=image_path)

    print("Multi-modal example (describing an image):")
    result = await agent.run([image_url, "What is in this image?"])
    print(result.output)

if __name__ == "__main__":
    asyncio.run(multimodal_example())
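
If you prefer to send a local file as raw bytes rather than a URL, pydantic_ai also provides BinaryContent. A minimal sketch, assuming your pydantic_ai version exposes BinaryContent and the same test asset is present:

import asyncio
from pathlib import Path
from pydantic_ai import Agent, BinaryContent
from pydanticai_ollama.models.ollama import OllamaModel

async def local_image_example():
    agent = Agent(OllamaModel(model_name='gemma3:latest'))
    # Pass the raw image bytes and media type directly instead of a URL.
    image = BinaryContent(
        data=Path("tests/assets/kiwi.png").read_bytes(),
        media_type="image/png",
    )
    result = await agent.run([image, "What is in this image?"])
    print(result.output)

if __name__ == "__main__":
    asyncio.run(local_image_example())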

Development

Running Tests

To run the unit tests, ensure your virtual environment is activated and then execute:

.venv/bin/python -m pytest tests/

Note: If you encounter ModuleNotFoundError during testing, ensure that the project is installed in editable mode (uv sync) and that __init__.py files are present in all package directories within src/pydanticai_ollama.
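
Alternatively, you can let uv pick up the project environment without activating it (assuming pytest is declared as a dev dependency):

uv run pytest tests/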

Project Structure

. (project root)
├── src/
│   └── pydanticai_ollama/
│       ├── models/
│       │   └── ollama.py
│       ├── providers/
│       │   └── ollama.py
│       └── settings/
│           └── ollama.py
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── assets/
│   │   └── kiwi.png
│   ├── models/
│   │   ├── __init__.py
│   │   ├── mock_async_stream.py
│   │   └── test_ollama.py
│   └── providers/
│       ├── __init__.py
│       └── test_ollama.py
├── pyproject.toml
├── README.md
├── uv.lock
└── .python-version


Download files

Download the file for your platform.

Source Distribution

pydanticai_ollama-0.1.4.tar.gz (12.1 kB)

Uploaded Source

Built Distribution

pydanticai_ollama-0.1.4-py3-none-any.whl (14.3 kB)

Uploaded Python 3

File details

Details for the file pydanticai_ollama-0.1.4.tar.gz.

File metadata

  • Download URL: pydanticai_ollama-0.1.4.tar.gz
  • Upload date:
  • Size: 12.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.0

File hashes

Hashes for pydanticai_ollama-0.1.4.tar.gz
Algorithm Hash digest
SHA256 900daeae920f8ddaad5e53b44c0a2478e3c34af81421ed97e91e9e5380214587
MD5 dd6e607232f6136583e68ad4fa2e425d
BLAKE2b-256 e0d5ab49ee178e59f019cbcabe0a5d09e33648686f6aaf084db38af8b52b2665

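To verify a downloaded file against the published digests, a short standard-library sketch (filename and SHA256 taken from the table above):

import hashlib
from pathlib import Path

# Compare the downloaded sdist against its published SHA256 digest.
expected = "900daeae920f8ddaad5e53b44c0a2478e3c34af81421ed97e91e9e5380214587"
digest = hashlib.sha256(Path("pydanticai_ollama-0.1.4.tar.gz").read_bytes()).hexdigest()
print("OK" if digest == expected else "HASH MISMATCH")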

File details

Details for the file pydanticai_ollama-0.1.4-py3-none-any.whl.

File hashes

Hashes for pydanticai_ollama-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 27b64b35c6955123fbf18cb769f7e14ee52706b9114d784b555553a7fe810cb3
MD5 26a0f5fb733533dea04a64ae53557e3c
BLAKE2b-256 d9109286486eb9772940778900bb6ad09406c6910e9bc575f79b89b6dccc6144
