Skip to main content

Pydantic-AI models for AgentPool

Project description

LLMling-models

PyPI License Package status Monthly downloads Distribution format Wheel availability Python version Implementation Releases Github Contributors Github Discussions Github Forks Github Issues Github Issues Github Watchers Github Stars Github Repository size Github last commit Github release date Github language count Github commits this month Package status PyUp

llmling-models

Collection of model wrappers and adapters for use with AgentPool, but should work with the underlying pydantic-ai API without issues.

CodeModeToolset

The CodeModeToolset wraps other pydantic-ai toolsets and provides Python code execution with all their tools available as async functions. This approach is more effective than traditional tool calling because LLMs have extensive training on real-world code but limited exposure to synthetic tool-calling examples. When tools are presented as a programming API, LLMs can handle more complex scenarios, chain multiple operations efficiently without token overhead between calls, and leverage their code-writing strengths. (more info: https://blog.cloudflare.com/code-mode/)

import webbrowser
from pydantic_ai import Agent
from pydantic_ai.toolsets.function import FunctionToolset
from llmling_models import CodeModeToolset

# Create toolsets with tools
browser_toolset = FunctionToolset(tools=[webbrowser.open])

# Wrap in CodeModeToolset  
code_toolset = CodeModeToolset([browser_toolset])

# Use with agent
agent = Agent(model="openai:gpt-4", toolsets=[code_toolset])

async with agent:
    result = await agent.run("Open google.com in a new tab using Python code")
    # The LLM can now write: await open("https://google.com", new=2)

The toolset generates documentation for all available tools and handles the execution environment automatically.

Available Models

Augmented Model

Enhances prompts through pre- and post-processing steps using auxiliary language models:

from llmling_models import AugmentedModel

model = AugmentedModel(
    main_model="openai:gpt-4",
    pre_prompt={
        "text": "Expand this question: {input}",
        "model": "openai:gpt-3.5-turbo"
    },
    post_prompt={
        "text": "Summarize this response concisely: {output}",
        "model": "openai:gpt-3.5-turbo"
    }
)
agent = Agent(model)

# The question will be expanded before processing
# and the response will be summarized afterward
result = await agent.run("What is AI?")

Input Model

A model that delegates responses to human input, useful for testing, debugging, or creating hybrid human-AI workflows:

from pydantic_ai import Agent
from llmling_models import InputModel

# Basic usage with default console input
model = InputModel(
    prompt_template="🤖 Question: {prompt}",
    show_system=True,
    input_prompt="Your answer: ",
)

# Create agent with system context
agent = Agent(
    model=model,
    system_prompt="You are helping test an input model. Be concise.",
)

# Run interactive conversation
result = await agent.run("What's your favorite color?")
print(f"You responded: {result.output}")

# Supports streaming input
async with agent.run_stream("Tell me a story...") as response:
    async for chunk in response.stream():
        print(chunk, end="", flush=True)

Features:

  • Interactive console input for testing and debugging
  • Support for streaming input (character by character, but not "true" async with default handler)
  • Configurable message formatting
  • Custom input handlers for different input sources
  • System message display control
  • Full conversation context support

This model is particularly useful for:

  • Testing complex prompt chains
  • Creating hybrid human-AI workflows
  • Debugging agent behavior
  • Collecting human feedback
  • Educational scenarios where human input is needed

User Select Model

An interactive model that lets users manually choose which model to use for each prompt:

from pydantic_ai import Agent
from llmling_models import UserSelectModel

# Basic setup with model list
model = UserSelectModel(
    models=["openai:gpt-4o-mini", "openai:gpt-3.5-turbo", "anthropic:claude-3"]
)

agent = Agent(model)

# The user will be shown the prompt and available models,
# and can choose which one to use for the response
result = await agent.run("What is the meaning of life?")

Model Delegation

Dynamically selects models based on given prompt. Uses a selector model to choose the most appropriate model for each task:

from pydantic_ai import Agent
from llmling_models import DelegationMultiModel

# Basic setup with model list
delegation_model = DelegationMultiModel(
    selector_model="openai:gpt-4-turbo",
    models=["openai:gpt-4", "openai:gpt-3.5-turbo"],
    selection_prompt="Pick gpt-4 for complex tasks, gpt-3.5-turbo for simple queries."
)

# Advanced setup with model descriptions
delegation_model = DelegationMultiModel(
    selector_model="openai:gpt-4-turbo",
    models=["openai:gpt-4", "anthropic:claude-2", "openai:gpt-3.5-turbo"],
    model_descriptions={
        "openai:gpt-4": "Complex reasoning, math problems, and coding tasks",
        "anthropic:claude-2": "Long-form analysis and research synthesis",
        "openai:gpt-3.5-turbo": "Simple queries, chat, and basic information"
    },
    selection_prompt="Select the most appropriate model for the task."
)

agent = Agent(delegation_model)

# The selector model will analyze the prompt and choose the most suitable model
result = await agent.run("Solve this complex mathematical proof...")

The cost-optimized model ensures you stay within budget while getting the best possible model for your needs, while the token-optimized model automatically handles varying input lengths by selecting models with appropriate context windows.

Remote Input Model

A model that connects to a remote human operator, allowing distributed human-in-the-loop operations:

from pydantic_ai import Agent
from llmling_models import RemoteInputModel

# Basic setup with WebSocket (preferred for streaming)
model = RemoteInputModel(
    url="ws://operator:8000/v1/chat/stream",
    api_key="your-api-key"
)

# Or use REST API
model = RemoteInputModel(
    url="http://operator:8000/v1/chat",
    api_key="your-api-key"
)

agent = Agent(model)

# The request will be forwarded to the remote operator
result = await agent.run("What's the meaning of life?")
print(f"Remote operator responded: {result.output}")

# Streaming also works with WebSocket protocol
async with agent.run_stream("Tell me a story...") as response:
    async for chunk in response.stream():
        print(chunk, end="", flush=True)

Features:

  • Distributed human-in-the-loop operations
  • WebSocket support for real-time streaming
  • REST API for simpler setups
  • Full conversation context support
  • Secure authentication via API keys

Setting up a Remote Model Server

Setting up a remote model server is straightforward. You just need a pydantic-ai model and can start serving it:

from llmling_models.remote_model.server import ModelServer

# Create and start server
server = ModelServer(
    model="openai:gpt-4",
    api_key="your-secret-key",  # Optional authentication
)
server.run(port=8000)

That's it! The server now accepts both REST and WebSocket connections and handles all the message protocol details for you.

Features:

  • Simple setup - just provide a model
  • Optional API key authentication
  • Automatic handling of both REST and WebSocket protocols
  • Full pydantic-ai message protocol support
  • Usage statistics forwarding
  • Built-in error handling and logging

For development, you might want to run the server locally:

server = ModelServer(
    model="openai:gpt-4",
    api_key="dev-key"
)
server.run(host="localhost", port=8000)

For production, you'll typically want to run it on a public server with proper authentication:

server = ModelServer(
    model="openai:gpt-4",
    api_key="your-secure-key",  # Make sure to use a strong key
    title="Production GPT-4 Server",
    description="Serves GPT-4 model for production use"
)
server.run(
    host="0.0.0.0",  # Accept connections from anywhere
    port=8000,
    workers=4  # Multiple workers for better performance
)

Both REST and WebSocket protocols are supported, with WebSocket being preferred for streaming capabilities. They also maintain the full pydantic-ai message protocol, ensuring compatibility with all features of the framework.

All multi models are generically typed to follow pydantic best practices. Usefulness for that is debatable though. :P

Providers

LLMling-models extends the capabilities of pydantic-ai with additional provider implementations that make it easy to connect to various LLM API services.

Available Providers

The package includes the following provider implementations:

GitHub Copilot Provider

Connect to GitHub Copilot's API for code-focused tasks (requires token management):

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from llmling_models.providers.copilot_provider import CopilotProvider

# Requires tokonomics.CopilotTokenManager to handle token management
provider = CopilotProvider()  # Uses tokonomics for authentication
model = OpenAIModel("gpt-4o-mini", provider=provider)
agent = Agent(model=model)
result = await agent.run("Write a function to calculate Fibonacci numbers")

LM Studio Provider

Connect to local LM Studio inference server for open-source models:

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from llmling_models.providers.lm_studio_provider import LMStudioProvider

provider = LMStudioProvider(base_url="http://localhost:11434/v1")
model = OpenAIModel("model_name", provider=provider)  # Use model loaded in LM Studio
agent = Agent(model=model)
result = await agent.run("Tell me about yourself")

Provider Utility Functions

infer_provider

The infer_provider function extends pydantic-ai's provider inference to include all LLMling-models providers:

from llmling_models.providers import infer_provider

# Get provider by name
provider = infer_provider("openrouter")  # Returns OpenRouterProvider instance
provider = infer_provider("grok")        # Returns GrokProvider instance
provider = infer_provider("perplexity")  # Returns PerplexityProvider instance
provider = infer_provider("copilot")     # Returns CopilotProvider instance
provider = infer_provider("lm-studio")   # Returns LMStudioProvider instance

# Still works with standard providers too
provider = infer_provider("openai")      # Returns pydantic_ai's OpenAIProvider

Extended infer_model Function

LLMling-models provides an extended infer_model function that resolves various model notations to appropriate instances:

from llmling_models import infer_model

# Provider prefixes (requires appropriate API keys as env vars)
model = infer_model("openai:gpt-4o")             # OpenAI models
model = infer_model("openrouter:anthropic/opus") # OpenRouter (requires OPENROUTER_API_KEY)
model = infer_model("grok:grok-2-1212")          # Grok/X.AI (requires X_AI_API_KEY)
model = infer_model("perplexity:sonar-medium")   # Perplexity (requires PERPLEXITY_API_KEY)
model = infer_model("deepseek:deepseek-chat")    # DeepSeek (requires DEEPSEEK_API_KEY)
model = infer_model("copilot:gpt-4o-mini")       # GitHub Copilot (requires token management)
model = infer_model("lm-studio:model-name")      # LM Studio local models

# LLMling's special models
model = infer_model("simple-openai:gpt-4")      # Simple HTTPX-based OpenAI client
model = infer_model("input")                    # Interactive human input model
model = infer_model("remote_model:ws://url")    # Remote model proxy
model = infer_model("remote_input:ws://url")    # Remote human input
model = infer_model("import:module.path:Class") # Import model from Python path

# Testing
model = infer_model("test:Custom response")     # Test model with fixed output

The function provides a fallback to a simple HTTPX-based OpenAI client in environments where the full OpenAI library is not available (like Pyodide/WebAssembly contexts).

Environment Variable Configuration

For convenience, most providers support configuration via environment variables:

Provider Environment Variable Purpose
OpenRouter OPENROUTER_API_KEY API key for authentication
Grok (X.AI) X_AI_API_KEY or GROK_API_KEY API key for authentication
DeepSeek DEEPSEEK_API_KEY API key for authentication
Perplexity PERPLEXITY_API_KEY API key for authentication
Copilot Uses tokonomics token management -
LM Studio LM_STUDIO_BASE_URL Base URL for local server
OpenAI OPENAI_API_KEY API key for authentication


### Claude Code Model

A model that delegates to the Claude Code CLI via the Claude Agent SDK, providing access to Claude with filesystem access, code execution, and other agentic capabilities:

```python
from pydantic_ai import Agent
from llmling_models import ClaudeCodeModel, ClaudeCodeReadTool, ClaudeCodeGlobTool

# Basic usage
model = ClaudeCodeModel(model="sonnet")
agent = Agent(model=model)

# Without builtin tools, Claude has no tool access
result = await agent.run("What is 2+2?")

# With builtin tools, Claude can use them
result = await agent.run(
    "What files are in the current directory?",
    builtin_tools=[ClaudeCodeGlobTool(), ClaudeCodeReadTool()],
)

Available builtin tools:

  • ClaudeCodeReadTool - Read file contents
  • ClaudeCodeWriteTool - Write files
  • ClaudeCodeEditTool - Edit files
  • ClaudeCodeBashTool - Execute bash commands
  • ClaudeCodeGlobTool - Find files by pattern
  • ClaudeCodeGrepTool - Search file contents
  • ClaudeCodeWebSearchTool - Search the web
  • ClaudeCodeWebFetchTool - Fetch web content
  • ClaudeCodeTaskTool - Spawn subagents
  • ClaudeCodeNotebookEditTool - Edit Jupyter notebooks

Convenience functions for batch tool access:

from llmling_models import claude_code_all_tools, claude_code_read_only_tools

# All tools
agent = Agent(model=model, builtin_tools=claude_code_all_tools())

# Read-only tools (no writes, edits, or bash)
agent = Agent(model=model, builtin_tools=claude_code_read_only_tools())

Configuration options:

model = ClaudeCodeModel(
    model="opus",  # or "sonnet", "haiku", or full names like "claude-sonnet-4-5-20250929"
    cwd="/path/to/workdir",  # Working directory
    permission_mode="bypassPermissions",  # Tool permission handling
    system_prompt="Custom system prompt",
    max_turns=10,  # Max conversation turns
    max_thinking_tokens=1000,  # For extended thinking
)

Requires authentication via claude login (the CLI handles auth automatically).

Installation

uv add llmling-models

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llmling_models-1.4.4.tar.gz (71.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llmling_models-1.4.4-py3-none-any.whl (98.8 kB view details)

Uploaded Python 3

File details

Details for the file llmling_models-1.4.4.tar.gz.

File metadata

  • Download URL: llmling_models-1.4.4.tar.gz
  • Upload date:
  • Size: 71.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llmling_models-1.4.4.tar.gz
Algorithm Hash digest
SHA256 1a05f175cc4bc51c3e7e00a3b1b9fb835b2f86cfa06b03d08814c976d759bde8
MD5 320da055a854b1d8ad56f4f0203b9deb
BLAKE2b-256 f033816d08f67409d9951f075668a7b794b4fd807d9ba8b765c0c33ef3a91aef

See more details on using hashes here.

Provenance

The following attestation bundles were made for llmling_models-1.4.4.tar.gz:

Publisher: build.yml on phil65/LLMling-models

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llmling_models-1.4.4-py3-none-any.whl.

File metadata

  • Download URL: llmling_models-1.4.4-py3-none-any.whl
  • Upload date:
  • Size: 98.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for llmling_models-1.4.4-py3-none-any.whl
Algorithm Hash digest
SHA256 849c4fcc3ed1228959b3394b0366dd90c86c124155bb8366f439acc634a00036
MD5 4ec5cf70f694a2438ec8413ce5238fb5
BLAKE2b-256 cb87797af8024f404928a737d87739adf93fd160891804459f8754ae2436f3ef

See more details on using hashes here.

Provenance

The following attestation bundles were made for llmling_models-1.4.4-py3-none-any.whl:

Publisher: build.yml on phil65/LLMling-models

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page