pydantic_ai_web_models

Disclaimer: This project was created for personal learning and educational purposes only. It is not intended for production use, nor does it encourage or endorse circumventing any terms of service. The author makes no warranties and accepts no liability for any misuse. Users are solely responsible for ensuring their usage complies with the terms of service of OpenAI, Google, and any other third-party services they interact with.

A lightweight Pydantic AI model provider that routes requests to web-based LLMs (OpenAI, Google) through Temporal workflows. No API keys needed — the Temporal worker handles browser-based LLM access.

Documentation: https://eugenzor.github.io/pydantic-ai-web-models/

Important: This package provides only the Pydantic AI model provider (the client side). It does not include the Temporal workflows required to actually execute LLM requests. You must develop and deploy your own Temporal worker that implements the LLMInvokeWorkflow workflow type. Without a running worker that handles browser-based LLM access, this package cannot function on its own.
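
The worker itself is out of scope for this package, but a skeleton helps clarify the contract. The sketch below is a hypothetical worker built with the Temporal Python SDK; the LLMInvokeInput / LLMInvokeResult field names are assumptions inferred from the thread_id and error behaviour described later in this README, so adapt them to your own workflow contract.

import asyncio
from dataclasses import dataclass

from temporalio import workflow
from temporalio.client import Client
from temporalio.worker import Worker


@dataclass
class LLMInvokeInput:
    prompt: str
    model: str
    thread_id: str = ""  # empty means "start a new web chat session"


@dataclass
class LLMInvokeResult:
    response: str
    thread_id: str = ""
    error: str = ""  # non-empty signals failure to the provider


@workflow.defn(name="LLMInvokeWorkflow")
class LLMInvokeWorkflow:
    @workflow.run
    async def run(self, params: LLMInvokeInput) -> LLMInvokeResult:
        # Call your browser-automation activity here and map its output
        # into LLMInvokeResult. Left unimplemented in this sketch.
        raise NotImplementedError


async def main() -> None:
    client = await Client.connect("localhost:7233", namespace="default")
    worker = Worker(
        client,
        task_queue="ai-worker-task-queue",
        workflows=[LLMInvokeWorkflow],
    )
    await worker.run()


if __name__ == "__main__":
    asyncio.run(main())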

Installation

pip install pydantic-ai-web-models

or with uv:

uv add pydantic-ai-web-models

Prerequisites

  • A running Temporal server (default: localhost:7233)
  • A Temporal worker listening on ai-worker-task-queue that implements the LLMInvokeWorkflow
  • Python 3.11+
  • pydantic-ai and temporalio installed

Quick Start

from pydantic_ai import Agent
import pydantic_ai_web_models  # registers providers on import

agent = Agent(model="google-web:gemini-3-flash")
result = agent.run_sync("What is the capital of France?")
print(result.data)
# Paris is the capital of France...

Importing pydantic_ai_web_models automatically registers the openai-web and google-web prefixes with Pydantic AI's model inference, so you can pass model strings directly to Agent().
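
The string form is shorthand for constructing the model object yourself. As a rough equivalent, assuming the WebModel constructor shown later under Per-Model Temporal Config also works without an explicit temporal_config:

from pydantic_ai import Agent
from pydantic_ai_web_models import WebModel

# Same model as "google-web:gemini-3-flash", built explicitly.
model = WebModel(provider="google-web", model_name="gemini-3-flash")
agent = Agent(model=model)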

Available Models

Provider     Model String                           Description
google-web   google-web:gemini-3-flash              Gemini 3 Flash
google-web   google-web:gemini-3-flash-thinking     Gemini 3 Flash with thinking
google-web   google-web:gemini-3.1-pro              Gemini 3.1 Pro
openai-web   openai-web:gpt-5-3                     GPT-5-3
openai-web   openai-web:gpt-5-4-standard            GPT-5-4 Standard
openai-web   openai-web:gpt-5-4-extended            GPT-5-4 Extended

Usage Examples

Basic Text Response (Async)

import asyncio
from pydantic_ai import Agent
import pydantic_ai_web_models

async def main():
    agent = Agent(model="google-web:gemini-3-flash")
    result = await agent.run("Explain quantum computing in one paragraph.")
    print(result.data)

asyncio.run(main())

Basic Text Response (Sync)

from pydantic_ai import Agent
import pydantic_ai_web_models

agent = Agent(model="openai-web:gpt-5-3")
result = agent.run_sync("Write a haiku about Python programming.")
print(result.data)

Structured Output

Use Pydantic models as output_type to get validated, typed responses. The model is instructed to respond with JSON matching the schema, and the response is automatically parsed and validated.

from pydantic import BaseModel
from pydantic_ai import Agent
import pydantic_ai_web_models


class CityInfo(BaseModel):
    name: str
    country: str
    population: int
    famous_for: list[str]


agent = Agent(
    model="google-web:gemini-3-flash",
    output_type=CityInfo,
)
result = agent.run_sync("Tell me about Tokyo.")
city = result.data

print(f"{city.name}, {city.country}")
print(f"Population: {city.population:,}")
print(f"Famous for: {', '.join(city.famous_for)}")
# Tokyo, Japan
# Population: 13,960,000
# Famous for: cherry blossoms, Shibuya crossing, sushi, anime

Structured Output with Nested Models

from pydantic import BaseModel
from pydantic_ai import Agent
import pydantic_ai_web_models


class Address(BaseModel):
    street: str
    city: str
    country: str


class Person(BaseModel):
    name: str
    age: int
    occupation: str
    address: Address


agent = Agent(
    model="openai-web:gpt-5-4-standard",
    output_type=Person,
)
result = agent.run_sync(
    "Generate a fictional person living in Berlin who works as a software engineer."
)
person = result.data

print(f"{person.name}, age {person.age}")
print(f"Works as: {person.occupation}")
print(f"Lives at: {person.address.street}, {person.address.city}")

System Prompts

from pydantic_ai import Agent
import pydantic_ai_web_models

agent = Agent(
    model="google-web:gemini-3-flash",
    system_prompt="You are a helpful cooking assistant. Keep answers concise.",
)
result = agent.run_sync("How do I make scrambled eggs?")
print(result.data)

Multiple System Prompts with Decorators

from pydantic_ai import Agent
import pydantic_ai_web_models

agent = Agent(model="openai-web:gpt-5-3")


@agent.system_prompt
def base_prompt() -> str:
    return "You are a travel guide specializing in European destinations."


@agent.system_prompt
def style_prompt() -> str:
    return "Always include a practical tip at the end of your response."


result = agent.run_sync("What should I see in Prague?")
print(result.data)

Multi-turn Conversations

import asyncio
from pydantic_ai import Agent
import pydantic_ai_web_models


async def main():
    agent = Agent(model="google-web:gemini-3.1-pro")

    # First turn
    result1 = await agent.run("What are the three laws of thermodynamics?")
    print(result1.data)

    # Follow-up using message history
    result2 = await agent.run(
        "Can you explain the second one in simpler terms?",
        message_history=result1.all_messages(),
    )
    print(result2.data)


asyncio.run(main())

Structured Output with Enums and Optional Fields

from enum import Enum
from pydantic import BaseModel
from pydantic_ai import Agent
import pydantic_ai_web_models


class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class ReviewAnalysis(BaseModel):
    sentiment: Sentiment
    confidence: float
    key_topics: list[str]
    summary: str
    improvement_suggestion: str | None = None


agent = Agent(
    model="openai-web:gpt-5-4-standard",
    output_type=ReviewAnalysis,
)
result = agent.run_sync(
    "Analyze this review: 'The food was amazing but the service was incredibly slow. "
    "We waited 45 minutes for our appetizers. The dessert almost made up for it though.'"
)
analysis = result.data

print(f"Sentiment: {analysis.sentiment.value} ({analysis.confidence:.0%})")
print(f"Topics: {', '.join(analysis.key_topics)}")
print(f"Summary: {analysis.summary}")
if analysis.improvement_suggestion:
    print(f"Suggestion: {analysis.improvement_suggestion}")

Comparing Models Side-by-Side

import asyncio
from pydantic_ai import Agent
import pydantic_ai_web_models

MODELS = [
    "google-web:gemini-3-flash",
    "openai-web:gpt-5-3",
]


async def ask_model(model: str, prompt: str) -> str:
    agent = Agent(model=model)
    result = await agent.run(prompt)
    return result.data


async def main():
    prompt = "In exactly one sentence, what is the meaning of life?"
    tasks = [ask_model(m, prompt) for m in MODELS]
    responses = await asyncio.gather(*tasks)

    for model, response in zip(MODELS, responses):
        print(f"[{model}]: {response}\n")


asyncio.run(main())

Configuration

Via Environment Variables

All connection parameters can be set through environment variables (or a .env file in the working directory). Copy .env.example to .env and fill in the values relevant to your setup.

Local / self-hosted Temporal

TEMPORAL_ADDRESS=localhost:7233
TEMPORAL_NAMESPACE=default
TEMPORAL_TASK_QUEUE=ai-worker-task-queue
TEMPORAL_TIMEOUT_SECONDS=300

Temporal Cloud — API key authentication

Generate an API key in the Temporal Cloud UI under Settings → API Keys, then:

TEMPORAL_ADDRESS=<namespace>.tmprl.cloud:7233
TEMPORAL_NAMESPACE=<namespace>.<account-id>
TEMPORAL_API_KEY=<your-api-key>
TEMPORAL_TASK_QUEUE=ai-worker-task-queue

TEMPORAL_ADDRESS and TEMPORAL_NAMESPACE are read by the Temporal SDK's envconfig bridge; TEMPORAL_API_KEY is also consumed there and configures bearer-token auth automatically — no additional code is required.
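
For illustration, the connection the SDK bridge ends up making amounts to roughly the sketch below (assuming a temporalio release whose Client.connect accepts an api_key argument; the package does this for you, so no such code is needed in your application):

from temporalio.client import Client


async def connect_cloud() -> Client:
    # Roughly what API-key auth amounts to; values come from the variables above.
    return await Client.connect(
        "<namespace>.tmprl.cloud:7233",
        namespace="<namespace>.<account-id>",
        api_key="<your-api-key>",
        tls=True,  # Temporal Cloud requires TLS alongside API-key auth
    )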

mTLS client certificates

For self-hosted clusters that require mutual TLS, provide paths to PEM-encoded files:

TEMPORAL_ADDRESS=temporal.internal:7233
TEMPORAL_NAMESPACE=production
TEMPORAL_TLS_CERT=/path/to/client.pem
TEMPORAL_TLS_KEY=/path/to/client.key
# Optional — only needed for a custom/private CA:
TEMPORAL_TLS_CA=/path/to/ca.pem
# Optional — override TLS server name (SNI):
TEMPORAL_TLS_SERVER_NAME=temporal.internal
TEMPORAL_TASK_QUEUE=ai-worker-task-queue

TEMPORAL_TLS_CERT / TEMPORAL_TLS_KEY / TEMPORAL_TLS_CA / TEMPORAL_TLS_SERVER_NAME are not handled by the SDK bridge; they are read by TemporalConfig (via pydantic-settings) and applied when building the TLSConfig object before connecting.

Note: TEMPORAL_API_KEY and mTLS are mutually exclusive. Use one or the other depending on your cluster's auth policy.
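
For illustration, the mTLS variables map onto the SDK's TLSConfig roughly as in this sketch (the package's actual internals may differ; you do not need to write this yourself):

import os
from pathlib import Path

from temporalio.client import Client
from temporalio.service import TLSConfig


async def connect_mtls() -> Client:
    # Build a TLSConfig from the TEMPORAL_TLS_* variables described above.
    ca_path = os.environ.get("TEMPORAL_TLS_CA")
    tls = TLSConfig(
        client_cert=Path(os.environ["TEMPORAL_TLS_CERT"]).read_bytes(),
        client_private_key=Path(os.environ["TEMPORAL_TLS_KEY"]).read_bytes(),
        server_root_ca_cert=Path(ca_path).read_bytes() if ca_path else None,
        domain=os.environ.get("TEMPORAL_TLS_SERVER_NAME"),
    )
    return await Client.connect(
        os.environ.get("TEMPORAL_ADDRESS", "localhost:7233"),
        namespace=os.environ.get("TEMPORAL_NAMESPACE", "default"),
        tls=tls,
    )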

Full environment variable reference

Variable                   Default                Handled by       Description
TEMPORAL_ADDRESS           localhost:7233         SDK envconfig    Temporal server address (host:port)
TEMPORAL_NAMESPACE         default                SDK envconfig    Temporal namespace
TEMPORAL_API_KEY           (unset)                SDK envconfig    Bearer token for Temporal Cloud API key auth
TEMPORAL_TLS_CERT          (unset)                TemporalConfig   Path to PEM client certificate (mTLS)
TEMPORAL_TLS_KEY           (unset)                TemporalConfig   Path to PEM client private key (mTLS)
TEMPORAL_TLS_CA            (unset)                TemporalConfig   Path to PEM CA certificate (mTLS, optional)
TEMPORAL_TLS_SERVER_NAME   (unset)                TemporalConfig   Override TLS SNI hostname (mTLS, optional)
TEMPORAL_TASK_QUEUE        ai-worker-task-queue   TemporalConfig   Task queue the worker listens on
TEMPORAL_WORKFLOW_NAME     LLMInvokeWorkflow      TemporalConfig   Workflow type name registered on the worker
TEMPORAL_TIMEOUT_SECONDS   300                    TemporalConfig   Workflow execution timeout in seconds

Custom Temporal Connection

By default, the package connects to localhost:7233. Override this before creating any agents:

from pydantic_ai_web_models import set_default_config, TemporalConfig

set_default_config(TemporalConfig(
    host="temporal.internal:7233",
    namespace="production",
    task_queue="llm-workers",
    timeout_seconds=600,
))

Per-Model Temporal Config

Pass a config directly when constructing a model:

from pydantic_ai import Agent
from pydantic_ai_web_models import WebModel, TemporalConfig

model = WebModel(
    provider="google-web",
    model_name="gemini-3.1-pro",
    temporal_config=TemporalConfig(
        host="remote-temporal:7233",
        task_queue="gpu-workers",
    ),
)
agent = Agent(model=model)
result = agent.run_sync("Hello!")

TemporalConfig Fields

Field            Type   Default                  Description
host             str    "localhost:7233"         Temporal server address
namespace        str    "default"                Temporal namespace
task_queue       str    "ai-worker-task-queue"   Task queue the worker listens on
workflow_name    str    "LLMInvokeWorkflow"      Name of the workflow to execute
timeout_seconds  int    300                      Workflow execution timeout (seconds)

Error Handling

from pydantic_ai import Agent
from pydantic_ai_web_models import (
    TemporalConnectionError,
    WorkflowExecutionError,
    JSONParseError,
)
import pydantic_ai_web_models

agent = Agent(model="google-web:gemini-3-flash")

try:
    result = agent.run_sync("Hello!")
    print(result.data)
except TemporalConnectionError as e:
    print(f"Cannot reach Temporal server: {e}")
except WorkflowExecutionError as e:
    print(f"LLM workflow failed (id={e.workflow_id}): {e}")
except JSONParseError as e:
    # Only happens with structured output
    print(f"Failed to parse JSON: {e}")
    print(f"Raw response was: {e.raw_text[:200]}")

Architecture

Agent.run() / Agent.run_sync()
    |
    v
WebModel.request()
    |
    +-- format_messages()        # flatten messages to text prompt
    +-- build_json_schema_instruction()  # (structured output only)
    |
    v
Temporal LLMInvokeWorkflow
    |
    +-- prompt + model (+ optional thread_id) sent to Temporal worker
    +-- worker invokes web LLM (OpenAI/Google web interface)
    +-- response text returned (+ optional thread_id -> ModelResponse.metadata)
    |
    v
WebModel.request()
    |
    +-- extract_json_from_response()  # (structured output only)
    +-- wrap_as_tool_call()           # (structured output only)
    |
    v
ModelResponse (returned to pydantic-ai)
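
To make the structured-output branch concrete, the JSON-extraction step is conceptually similar to the sketch below (illustrative only; the package's actual extract_json_from_response may be more robust):

import json
import re


def extract_json_sketch(text: str) -> dict:
    # Illustrative: pull the first JSON object out of a free-form web reply,
    # tolerating surrounding prose around the object.
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))


reply = 'Sure! Here you go:\n{"name": "Tokyo", "country": "Japan"}\nLet me know if you need more.'
print(extract_json_sketch(reply))
# {'name': 'Tokyo', 'country': 'Japan'}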

Thread ID and model_settings

Optional keys in model_settings (per Agent.run / run_sync or on the agent) are forwarded by this provider:

  • thread_id (str) — When non-empty, included in the LLMInvokeWorkflow input so your worker can resume a server-side browser/chat session.
  • skip_system_prompt (bool, default False) — When True, system instructions are not embedded in the text prompt sent to Temporal (conversation turns are unchanged).

When the worker's result includes response, thread_id, and error fields (error is empty on success), as the LLMInvokeResult returned by LLMInvokeWorkflow does, the assistant ModelResponse carries metadata["thread_id"]. Read it from result.response, not from result.metadata (the latter only holds what you pass to Agent.run(..., metadata=...)):

result = agent.run_sync("Hello!")
tid = result.response.metadata["thread_id"]

If the workflow returns a non-empty error, this provider raises WorkflowExecutionError (no assistant message). Your worker may also raise before returning; handle those as usual.

An optional thread_id in model_settings is forwarded in the workflow input; omit it to start a new conversation.
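
Putting these pieces together, continuing a conversation looks like the sketch below (assuming your worker returns a thread ID as described above):

from pydantic_ai import Agent
import pydantic_ai_web_models

agent = Agent(model="google-web:gemini-3-flash")

# First turn: no thread_id, so the worker starts a new web chat session.
first = agent.run_sync("Let's plan a three-day trip to Rome.")
tid = first.response.metadata["thread_id"]

# Follow-up turn: forward the thread_id so the worker resumes that session.
second = agent.run_sync(
    "Now compress that plan into two days.",
    model_settings={"thread_id": tid},
)
print(second.data)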

Limitations

  • No streaming — responses are returned in full after the workflow completes
  • No tool/function calls — only text and structured output are supported
  • No binary content — images and files in messages are skipped
  • Estimated token counts — usage is approximated as len(text) // 4 since the workflow API does not return token counts
  • Requires Temporal infrastructure — a running Temporal server and worker are needed
