The batteries for your Pydantic AI agent

Pydantic AI Harness


Pydantic AI's capabilities and hooks API is how you give an agent its harness. Capabilities are bundles of tools, lifecycle hooks, instructions, and model settings that extend what the agent can do without any framework changes.

Pydantic AI Harness is the official capability library for Pydantic AI, maintained by the Pydantic AI team. Pydantic AI core ships capabilities that require model or framework support, and capabilities fundamental to every agent -- web search, tool search, thinking. Everything else lives here: standalone building blocks you pick and choose to turn your agent into a coding agent, a research assistant, or anything else. This is also where new capabilities start -- as they stabilize and prove themselves broadly essential, they can graduate into core.

The capability matrix tracks where we are. Tell us what to prioritize.

Contents: Installation · Quick start · Capability matrix · Help us prioritize · Build your own · Contributing · Version policy · Pydantic AI references · License

Installation

uv add pydantic-ai-harness

Extras for specific capabilities:

uv add "pydantic-ai-harness[code-mode]"   # CodeMode (adds the Monty sandbox)

Requires Python 3.10+ and pydantic-ai-slim>=1.80.0.
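
As a quick sanity check, the sketch below tests an environment against the two minimums stated above. It is an illustration, not part of the library: `parse_version`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the parser only reads the leading numeric parts of a version, so a pre-release such as 1.80.0rc1 is treated like 1.80.0.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)
MIN_SLIM = (1, 80, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Read the leading numeric components, e.g. '1.80.0' -> (1, 80, 0)."""
    parts = []
    for piece in text.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple[int, ...]) -> bool:
    """Compare versions after padding short ones with zeros ('1.80' == 1.80.0)."""
    parsed = parse_version(installed)
    padded = parsed + (0,) * (len(minimum) - len(parsed))
    return padded >= minimum


def check_environment() -> list[str]:
    """Return human-readable problems; an empty list means both minimums are met."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append('Python 3.10+ is required')
    try:
        slim = metadata.version('pydantic-ai-slim')
    except metadata.PackageNotFoundError:
        problems.append('pydantic-ai-slim is not installed')
    else:
        if not meets_minimum(slim, MIN_SLIM):
            problems.append(f'pydantic-ai-slim {slim} is older than 1.80.0')
    return problems


if __name__ == '__main__':
    print(check_environment() or 'environment looks compatible')
```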

Quick start

import logfire
from pydantic_ai import Agent
from pydantic_ai.capabilities import MCP  # from the core pydantic-ai package
from pydantic_ai_harness import CodeMode

logfire.configure()
logfire.instrument_pydantic_ai()

agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[
        MCP('https://api.githubcopilot.com/mcp/'),
        CodeMode(),
    ],
)

result = agent.run_sync('Rank the open PRs on pydantic/pydantic-ai-harness by thumbs-up reactions. Which 5 should we merge first?')
print(result.output)

MCP (from the core pydantic-ai package) connects your agent to any MCP server -- here, GitHub's official MCP server.

CodeMode wraps all tools into a single run_code tool powered by our Monty sandbox, so the model can orchestrate multiple tool calls with Python code instead of one model round-trip per call.
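
To illustrate the idea only, here is a toy model of "one run_code call replaces N tool calls". It does not use CodeMode or Monty and says nothing about how they are implemented: the two tools, their data, and this `run_code` function are hypothetical stand-ins, and plain `exec()` is not a sandbox.

```python
from typing import Any, Callable

CALLS: list[str] = []  # records every tool invocation, in order


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a function so each call is logged in CALLS."""
    def wrapper(*args: Any) -> Any:
        CALLS.append(fn.__name__)
        return fn(*args)
    return wrapper


@tool
def list_open_prs() -> list[int]:
    # Hypothetical tool: returns made-up PR numbers.
    return [101, 102, 103]


@tool
def count_thumbs_up(pr: int) -> int:
    # Hypothetical tool: returns made-up reaction counts.
    return {101: 4, 102: 9, 103: 1}[pr]


def run_code(source: str) -> Any:
    """Toy stand-in for a run_code tool. Plain exec() is NOT a sandbox."""
    namespace: dict[str, Any] = {
        '__builtins__': {'sorted': sorted},  # expose only what the snippet needs
        'list_open_prs': list_open_prs,
        'count_thumbs_up': count_thumbs_up,
    }
    exec(source, namespace)
    return namespace.get('result')


# Code a model might write: it chains several tool calls in one script.
MODEL_WRITTEN_CODE = '''
prs = list_open_prs()
result = sorted(prs, key=count_thumbs_up, reverse=True)[:2]
'''

top = run_code(MODEL_WRITTEN_CODE)
print(top, len(CALLS))
```

Here a single `run_code` invocation performs four tool calls (one listing, three reaction lookups). Issued individually, each of those calls would need its own tool-call step.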

logfire gives you a trace for every agent run. With CodeMode, you can see the run_code span with each nested tool call as a child span -- making it easy to debug what the model's code actually did. See the Pydantic AI Logfire docs for setup details.

Capability matrix

We studied leading coding agents, agent frameworks, and Claw-style assistants to map every capability area that matters for production agents. Each one is tracked as an issue in this repo.

Vote on whatever is linked in the Status column -- PRs if we're actively building it, issues if it's planned -- to help us decide what to work on next.

| Category | Capability | Description | Status | Community alternatives |
| --- | --- | --- | --- | --- |
| Tools & execution | Code mode | Sandboxed Python execution via Monty -- one run_code call replaces N tool calls | ✅ Docs | |
| | Tool search | Progressive tool discovery for large tool sets | ✅ Pydantic AI | |
| | File system | Read, write, edit, search files with path traversal prevention | 🚧 PR #177 | pydantic-ai-backend (vstorm-co) |
| | Shell | Execute commands with allowlists, denylists, and timeouts | 🚧 PR #177 | pydantic-ai-backend (vstorm-co) |
| | Repo context injection | Auto-load CLAUDE.md/AGENTS.md and repo structure | 🚧 PR #175 | pydantic-deep (vstorm-co) |
| | Verification loop | Run tests after edits, auto-fix failures | 🚧 PR #169 | |
| Context management | Sliding window | Trim conversation history to stay within token limits | 🚧 PR #191 | summarization-pydantic-ai (vstorm-co) |
| | Context compaction | LLM-powered summarization of older messages | 🚧 PR #191 | summarization-pydantic-ai (vstorm-co) |
| | Limit warnings | Warn agent before hitting context/iteration limits | 🚧 PR #191 | summarization-pydantic-ai (vstorm-co) |
| | Tool output management | Truncate, summarize, or spill large tool outputs | 🚧 PR #185 | |
| | System reminders | Inject periodic reminders to counteract instruction drift | 🚧 PR #181 | |
| Memory & persistence | Memory | Persistent key-value memory across sessions | 🚧 PR #179 | pydantic-deep (vstorm-co) |
| | Session persistence | Save and restore full conversation state | 🚧 PR #176 | |
| | Checkpointing | Save, rewind, and fork conversation state | 📝 #196 | pydantic-deep (vstorm-co) |
| Agent orchestration | Sub-agents | Delegate subtasks to specialized child agents | 🚧 PR #178 | subagents-pydantic-ai (vstorm-co) |
| | Skills | Progressive tool loading -- search, activate, deactivate | 🚧 PR #183 | pydantic-ai-skills (DougTrajano), pydantic-deep (vstorm-co) |
| | Planning | Break complex tasks into structured plans before execution | 🚧 PR #180 | |
| | Task tracking | Track tasks, subtasks, and dependencies | 📝 #65 | pydantic-ai-todo (vstorm-co) |
| | Teams | Multi-agent teams with shared state and message bus | 📝 #195 | pydantic-deep (vstorm-co) |
| Safety & guardrails | Input guardrails | Validate user input before the agent run starts | 🚧 PR #182 | pydantic-ai-shields (vstorm-co) |
| | Output guardrails | Validate model output after the run completes | 🚧 PR #182 | pydantic-ai-shields (vstorm-co) |
| | Cost/token budgets | Enforce token and cost limits per run | 🚧 PR #182 | pydantic-ai-shields (vstorm-co) |
| | Tool access control | Block tools or require approval before execution | 🚧 PR #182 | pydantic-ai-shields (vstorm-co) |
| | Async guardrails | Run validation concurrently with model requests | 🚧 PR #182 | pydantic-ai-shields (vstorm-co) |
| | Secret masking | Detect and redact secrets in agent I/O | 🚧 PR #172 | pydantic-ai-shields (vstorm-co) |
| | Approval workflows | Require human approval for sensitive operations | 🚧 PR #173 | Pydantic AI (built-in) |
| | Tool budget | Limit total tool calls or cost per run | 🚧 PR #168 | |
| Reliability | Stuck loop detection | Detect and break out of repetitive agent loops | 🚧 PR #186 | |
| | Tool error recovery | Retry failed tool calls with backoff and budget | 🚧 PR #171 | |
| | Tool orphan repair | Fix orphaned tool calls in conversation history | 🚧 PR #184 | |
| Reasoning | Adaptive reasoning | Adjust thinking effort based on task complexity | 🚧 PR #174 | |
| | Current time | Inject current date/time into system prompt | 🚧 PR #170 | |

Packages by vstorm-co are endorsed by the Pydantic AI team. We're working with them to upstream some of their implementations into this repo.
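
To make one matrix row concrete, here is a rough sketch of the idea behind the sliding window entry: keep the newest messages that fit a token budget. It is not the implementation from PR #191 or from any package listed above. `trim_history`, the `(role, text)` message shape, and the whitespace-based token estimate are all assumptions made for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def trim_history(
    messages: list[tuple[str, str]], max_tokens: int
) -> list[tuple[str, str]]:
    """Keep a leading system message plus the newest messages that fit max_tokens."""
    if not messages:
        return []
    # Always retain the system message when it comes first.
    head = [messages[0]] if messages[0][0] == 'system' else []
    tail = messages[len(head):]
    budget = max_tokens - sum(estimate_tokens(text) for _, text in head)

    kept: list[tuple[str, str]] = []
    for role, text in reversed(tail):  # walk from newest to oldest
        cost = estimate_tokens(text)
        if cost > budget:
            break  # stop at the first message that no longer fits
        kept.append((role, text))
        budget -= cost
    return head + kept[::-1]  # restore chronological order


history = [
    ('system', 'be brief'),
    ('user', 'one two three'),
    ('assistant', 'four five'),
    ('user', 'six'),
]
print(trim_history(history, max_tokens=5))
```

With a budget of five estimated tokens, the oldest user message is dropped while the system message and the two newest messages are kept.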

Help us prioritize

Vote on whatever is linked in the Status column above. If there's a PR, vote on the PR -- it means we're actively building it. If there's only an issue, vote on the issue.

Want something that's not on the list? Open a capability request.

Build your own

Capabilities are the primary extension point for Pydantic AI. Any of the existing capabilities in this repo can serve as a reference for building your own.

Publishing as a standalone package? Use the pydantic-ai-<name> naming convention. See Publishing capability packages.

Contributing

We welcome capability contributions. Here's how:

  1. Start with an issue. Open a capability request describing the behavior you want. This lets us discuss the approach and priority before code is written -- we can close an approach without closing the problem.
  2. Then open a PR. Once the issue exists, you're welcome to open a PR with an implementation. Link the issue in your PR. We review based on community interest -- upvotes on both the issue and PR count.
  3. Don't chase green CI. Get the approach working, then let us know. We'll take it from there -- we may push to your branch, rewrite, or open a follow-up PR. You'll be credited as the original author. (See the Pydantic AI contributing guide.)

Note: PRs that modify pyproject.toml or uv.lock from non-team members are auto-closed by CI to prevent supply chain risk. If you need a new dependency, open an issue.
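
If you want to check a branch before opening a PR, a sketch like the following flags changes to those two files. It is not the repository's CI logic: `protected_changes` is a hypothetical helper, and matching by file name at any directory depth is an assumption.

```python
from pathlib import PurePosixPath

# The two files named in the note above.
PROTECTED = {'pyproject.toml', 'uv.lock'}


def protected_changes(changed_paths: list[str]) -> list[str]:
    """Return the changed paths whose file name is a protected dependency file."""
    return [path for path in changed_paths if PurePosixPath(path).name in PROTECTED]


# Example input, e.g. collected from `git diff --name-only`.
changed = ['pydantic_ai_harness/code_mode.py', 'uv.lock']
print(protected_changes(changed))
```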

Development

make install   # install dependencies
make format    # ruff format
make lint      # ruff check
make typecheck # pyright strict
make test      # pytest
make testcov   # pytest with 100% branch coverage

Version policy

Pydantic AI Harness uses 0.x versioning to signal that APIs are still stabilizing. During 0.x:

  • Minor releases (0.1 → 0.2) may include breaking changes: renamed parameters, changed defaults, restructured APIs. As the library grows, and especially as capabilities gain provider-native support (starting as a local implementation, then auto-switching to the provider's built-in API when available), we may need to reshape APIs in ways the initial design couldn't fully anticipate.
  • Patch releases (0.1.0 → 0.1.1) will not intentionally break existing behavior.
  • All breaking changes are documented in release notes with migration guidance.
  • Where practical, we'll keep the previous behavior available under a deprecated name or configuration option before removing it.
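
The rules above can be read as a simple check on an upgrade. The sketch below models only the 0.x policy as stated here. `parse` and `may_break` are hypothetical helpers, and they assume plain `major.minor.patch` version strings.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'major.minor.patch' into integers."""
    major, minor, patch = (int(part) for part in version.split('.'))
    return major, minor, patch


def may_break(current: str, candidate: str) -> bool:
    """True if a 0.x upgrade crosses a minor boundary, where breaking changes are allowed."""
    cur, new = parse(current), parse(candidate)
    if cur[0] != 0 or new[0] != 0:
        raise ValueError('this sketch only models the 0.x policy')
    if new <= cur:
        raise ValueError('candidate must be newer than current')
    return new[1] != cur[1]


print(may_break('0.1.0', '0.1.1'))  # patch release
print(may_break('0.1.1', '0.2.0'))  # minor release
```

In dependency terms, staying within one minor series corresponds to a constraint such as `pydantic-ai-harness>=0.1.1,<0.2`.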

This is why Pydantic AI Harness is a separate package from Pydantic AI, which has a stricter version policy. As the core capabilities stabilize, we'll move toward 1.0 with stability guarantees to match.

Pydantic AI references

  • Capabilities -- what capabilities are, built-in capabilities, building your own
  • Hooks -- lifecycle hooks reference, ordering, error handling
  • Extensibility -- publishing packages, third-party ecosystem
  • Toolsets -- building tools for capabilities
  • API reference -- full API docs

License

MIT -- see LICENSE.
