
Guardrail capabilities for Pydantic AI — cost tracking, tool permissions, input/output guards

Pydantic AI Shields

Guardrail Capabilities for Pydantic AI Agents

Python 3.10+  •  License: MIT

Cost Tracking  •  Prompt Injection  •  PII Detection  •  Secret Redaction  •  Tool Permissions  •  Async Guardrails


Pydantic AI Shields provides ready-to-use guardrail capabilities for Pydantic AI agents. Drop them into any agent for cost control, tool permissions, and safety checks — no middleware wrappers needed.

Full framework? Check out Pydantic Deep Agents — complete agent framework with planning, filesystem, subagents, and skills.

Installation

pip install pydantic-ai-shields

Quick Start

import asyncio

from pydantic_ai import Agent
from pydantic_ai_shields import CostTracking, ToolGuard, InputGuard

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[
        CostTracking(budget_usd=5.0),
        ToolGuard(blocked=["execute"], require_approval=["write_file"]),
        InputGuard(guard=lambda prompt: "ignore all instructions" not in prompt.lower()),
    ],
)

async def main() -> None:
    # agent.run is a coroutine, so it must be awaited inside an async function
    result = await agent.run("Hello!")
    print(result.output)

asyncio.run(main())

Available Shields

CostTracking

Track token usage and API costs with optional budget enforcement:

import asyncio

from pydantic_ai import Agent
from pydantic_ai_shields import CostTracking

tracking = CostTracking(budget_usd=10.0)
agent = Agent("openai:gpt-4.1", capabilities=[tracking])

async def main() -> None:
    await agent.run("Hello")
    # Totals accumulate on the capability instance across runs
    print(f"Total cost: ${tracking.total_cost:.4f}")
    print(f"Total tokens: {tracking.total_request_tokens + tracking.total_response_tokens}")

asyncio.run(main())

Raises BudgetExceededError when the cumulative cost exceeds the budget. Pricing is auto-detected from the model via genai-prices.
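
To make the budget rule concrete, here is a minimal sketch of cumulative budget enforcement using only the standard library. It is not the library's implementation: BudgetSketch, record, the local BudgetExceeded exception, and the per-million-token prices are all illustrative assumptions.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Local stand-in for the library's BudgetExceededError."""


@dataclass
class BudgetSketch:
    budget_usd: float
    input_price_per_mtok: float   # assumed price, USD per million input tokens
    output_price_per_mtok: float  # assumed price, USD per million output tokens
    total_cost: float = 0.0

    def record(self, request_tokens: int, response_tokens: int) -> float:
        """Add one run's cost to the running total; raise once over budget."""
        cost = (
            request_tokens * self.input_price_per_mtok
            + response_tokens * self.output_price_per_mtok
        ) / 1_000_000
        self.total_cost += cost
        if self.total_cost > self.budget_usd:
            raise BudgetExceeded(
                f"spent ${self.total_cost:.4f} of ${self.budget_usd:.2f}"
            )
        return cost
```

The check runs against the running total rather than the single run, which is why a cheap request can still raise if earlier runs used up most of the budget.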

ToolGuard

Control which tools the agent can use:

import asyncio

from pydantic_ai import Agent
from pydantic_ai_shields import ToolGuard

async def ask_user(tool_name: str, args: dict) -> bool:
    # input() blocks, so run it in a worker thread to keep the event loop free
    answer = await asyncio.to_thread(input, f"Allow {tool_name}? (y/n) ")
    return answer.strip().lower() == "y"

guard = ToolGuard(
    blocked=["execute", "rm"],              # Hidden from model entirely
    require_approval=["write_file"],        # User must approve each call
    approval_callback=ask_user,
)
agent = Agent("openai:gpt-4.1", capabilities=[guard])

  • blocked tools are removed via prepare_tools, so the model never sees them
  • require_approval tools trigger the callback before execution
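
The two behaviours can be sketched in plain Python. This is only an illustration of the idea, not the library's code: visible_tools, may_call, and deny_all are hypothetical helper names, and tools are represented as bare strings for simplicity.

```python
import asyncio
from typing import Awaitable, Callable

ApprovalCallback = Callable[[str, dict], Awaitable[bool]]


def visible_tools(all_tools: list[str], blocked: set[str]) -> list[str]:
    """Drop blocked tools so they are never offered to the model."""
    return [name for name in all_tools if name not in blocked]


async def may_call(
    tool_name: str,
    args: dict,
    require_approval: set[str],
    approval_callback: ApprovalCallback,
) -> bool:
    """Ask the callback only for tools that need approval."""
    if tool_name not in require_approval:
        return True
    return await approval_callback(tool_name, args)


async def deny_all(tool_name: str, args: dict) -> bool:
    """Example callback that rejects every request."""
    return False
```

Filtering happens before the model sees the tool list, while approval happens per call, so a tool can be visible to the model and still be refused at execution time.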

InputGuard

Block or validate user input before the agent runs:

from pydantic_ai import Agent
from pydantic_ai_shields import InputGuard

# Sync guard
agent = Agent("openai:gpt-4.1", capabilities=[
    InputGuard(guard=lambda prompt: "jailbreak" not in prompt.lower()),
])

# Async guard (e.g., call moderation API)
# moderation_api is a placeholder for your own moderation client
async def check_toxicity(prompt: str) -> bool:
    result = await moderation_api.check(prompt)
    return result.is_safe

agent = Agent("openai:gpt-4.1", capabilities=[InputGuard(guard=check_toxicity)])

Raises InputBlocked when the guard returns False.
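
Because a guard may be either a plain function or a coroutine function, something has to handle both cases. One way to do that is sketched below, as an assumption about the general technique rather than a description of the library's internals; run_guard and the local Blocked exception are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, Callable


class Blocked(Exception):
    """Local stand-in for the library's InputBlocked."""


async def run_guard(guard: Callable[[str], Any], prompt: str) -> None:
    """Call a sync or async guard; raise if the verdict is falsy."""
    verdict = guard(prompt)
    # An async guard returns an awaitable that still has to be resolved
    if inspect.isawaitable(verdict):
        verdict = await verdict
    if not verdict:
        raise Blocked("prompt rejected by guard")
```

Calling the guard first and then checking whether the result is awaitable keeps a single code path for lambdas, regular functions, and async functions.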

OutputGuard

Block or validate model output after the agent runs:

from pydantic_ai import Agent
from pydantic_ai_shields import OutputGuard

agent = Agent("openai:gpt-4.1", capabilities=[
    OutputGuard(guard=lambda output: "SSN" not in output),
])

Raises OutputBlocked when the guard returns False.

AsyncGuardrail

Run a guardrail concurrently with the LLM call — if the guard fails first, the LLM is cancelled (saves cost):

from pydantic_ai import Agent
from pydantic_ai_shields import AsyncGuardrail, InputGuard

# check_policy is a placeholder for your own guard function

agent = Agent(
    "openai:gpt-4.1",
    capabilities=[AsyncGuardrail(
        guard=InputGuard(guard=check_policy),
        timing="concurrent",       # "concurrent" | "blocking" | "monitoring"
        cancel_on_failure=True,     # Cancel LLM if guard fails
        timeout=5.0,                # Guard timeout in seconds
    )],
)

Timing          Behavior
"concurrent"    Guard runs alongside LLM, fail-fast on violation
"blocking"      Guard completes before LLM starts (traditional)
"monitoring"    Guard runs after LLM, fire-and-forget (logging/audit)

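The "concurrent" mode with cancel_on_failure=True can be pictured with plain asyncio tasks. The sketch below is an illustration under stated assumptions, not the library's implementation: run_concurrently and GuardFailed are hypothetical names, and treating a guard timeout as a failure is an assumption the source does not confirm.

```python
import asyncio


class GuardFailed(Exception):
    """Local stand-in for a guardrail violation."""


async def run_concurrently(llm_coro, guard_coro, timeout: float):
    """Run a guard alongside an LLM call; cancel the LLM if the guard fails."""
    llm_task = asyncio.create_task(llm_coro)
    guard_task = asyncio.create_task(guard_coro)
    try:
        passed = await asyncio.wait_for(guard_task, timeout=timeout)
    except asyncio.TimeoutError:
        passed = False  # assumption: a timed-out guard counts as a failure
    if not passed:
        llm_task.cancel()
        # Wait for the cancellation to finish so no task is left dangling
        await asyncio.gather(llm_task, return_exceptions=True)
        raise GuardFailed("guard failed before the LLM call finished")
    return await llm_task
```

Both tasks start immediately, so a passing guard adds no latency beyond the slower of the two calls, and a failing guard stops the LLM request early.
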
Built-in Content Shields

PromptInjection

Detect and block prompt injection / jailbreak attempts:

from pydantic_ai import Agent
from pydantic_ai_shields import PromptInjection

agent = Agent("openai:gpt-4.1", capabilities=[
    PromptInjection(sensitivity="high"),  # "low" | "medium" | "high"
])

Six detection categories are covered: ignore_instructions, system_override, role_play, delimiter_injection, prompt_leaking, and jailbreak. Add custom patterns with custom_patterns=[r"my_pattern"].

PiiDetector

Detect PII (email, phone, SSN, credit card, IP) in user input:

from pydantic_ai import Agent
from pydantic_ai_shields import PiiDetector

agent = Agent("openai:gpt-4.1", capabilities=[
    PiiDetector(detect=["email", "ssn", "credit_card"]),
])

Use action="log" to let the input through while recording detections in last_detections on the capability instance (cap.last_detections).
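
Regex-based PII detection can be sketched in a few lines. The patterns and the find_pii helper below are deliberately simple, hypothetical examples and are not the patterns the library ships with.

```python
import re

# Simplified patterns for illustration; real detectors need more care
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str, detect: list[str]) -> list[tuple[str, str]]:
    """Return (category, matched_text) pairs for the requested categories."""
    hits = []
    for category in detect:
        for match in PII_PATTERNS[category].finditer(text):
            hits.append((category, match.group()))
    return hits
```

Returning the matches instead of raising immediately is what makes a log-only mode possible: the caller decides whether a non-empty result blocks the request or is only recorded.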

SecretRedaction

Block API keys, tokens, and credentials from appearing in model output:

from pydantic_ai import Agent
from pydantic_ai_shields import SecretRedaction

agent = Agent("openai:gpt-4.1", capabilities=[SecretRedaction()])

Detects: OpenAI, Anthropic, AWS, GitHub, Slack keys, JWTs, private keys, generic API keys.
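
As a rough picture of how credential shapes can be recognised, the sketch below matches three common formats with regular expressions. The pattern set and the find_secrets helper are illustrative assumptions; the library's own patterns are not shown in this document and may differ.

```python
import re

# Illustrative patterns only, covering a small subset of credential formats
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[str]:
    """Return the name of every pattern that matches somewhere in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```

Prefix-anchored patterns like these keep false positives low, at the cost of missing credentials that have no recognisable prefix.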

BlockedKeywords

Block prompts containing forbidden words or phrases:

from pydantic_ai import Agent
from pydantic_ai_shields import BlockedKeywords

agent = Agent("openai:gpt-4.1", capabilities=[
    BlockedKeywords(
        keywords=["competitor_name", "internal_only"],
        whole_words=True,
    ),
])

Supports case_sensitive, whole_words, and use_regex modes.
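
The three modes map naturally onto regular-expression options. The sketch below shows one way they could combine; build_keyword_pattern is a hypothetical helper, and the default values chosen here are assumptions, not the library's documented defaults.

```python
import re


def build_keyword_pattern(
    keywords: list[str],
    whole_words: bool = False,
    case_sensitive: bool = False,
    use_regex: bool = False,
) -> re.Pattern[str]:
    """Compile one pattern that matches any of the given keywords."""
    # Escape literal keywords so characters like "." are not treated as regex
    parts = [kw if use_regex else re.escape(kw) for kw in keywords]
    body = "|".join(f"(?:{part})" for part in parts)
    if whole_words:
        body = rf"\b(?:{body})\b"
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(body, flags)
```

With whole_words=True, a keyword such as "classified" no longer matches inside "declassified", because the word boundary requires a non-word character on each side.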

NoRefusals

Block LLM refusals — ensure the model attempts to answer:

from pydantic_ai import Agent
from pydantic_ai_shields import NoRefusals

agent = Agent("openai:gpt-4.1", capabilities=[NoRefusals()])

Use allow_partial=True to allow responses that contain refusal language but also have substance.

Composing Shields

All shields compose naturally as pydantic-ai capabilities:

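from pydantic_ai import Agent
from pydantic_ai_shields import (
    BlockedKeywords, CostTracking, NoRefusals,
    PiiDetector, PromptInjection, SecretRedaction,
)

agent = Agent(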
    "openai:gpt-4.1",
    capabilities=[
        CostTracking(budget_usd=5.0),
        PromptInjection(sensitivity="high"),
        PiiDetector(),
        SecretRedaction(),
        BlockedKeywords(keywords=["classified"]),
        NoRefusals(),
    ],
)

API Reference

Infrastructure Shields

Class             Description
CostTracking      Token/USD tracking with budget enforcement
ToolGuard         Block tools or require approval
InputGuard        Custom input validation (pluggable function)
OutputGuard       Custom output validation (pluggable function)
AsyncGuardrail    Concurrent guardrail + LLM execution

Content Shields

Class              Description
PromptInjection    Detect prompt injection / jailbreak (6 categories, 3 sensitivity levels)
PiiDetector        Detect PII: email, phone, SSN, credit card, IP (regex-based)
SecretRedaction    Block API keys, tokens, credentials in output
BlockedKeywords    Block forbidden keywords/phrases (case, word boundary, regex modes)
NoRefusals         Block LLM refusals ("I cannot help with that")

Data

Class       Description
CostInfo    Per-run and cumulative token/cost data

Exceptions

Exception              Raised by
GuardrailError         Base exception for all shields
InputBlocked           InputGuard, PromptInjection, PiiDetector, BlockedKeywords, AsyncGuardrail
OutputBlocked          OutputGuard, SecretRedaction, NoRefusals
ToolBlocked            ToolGuard
BudgetExceededError    CostTracking
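
Since GuardrailError is the base exception, callers can catch a specific exception first and fall back to the base class for everything else. The sketch below uses local stand-in classes with the same names so that it runs without the package installed; the subclass relationships are assumed from the table above, and run_guarded is a hypothetical helper.

```python
class GuardrailError(Exception):
    """Stand-in for the library's base exception."""


class InputBlocked(GuardrailError):
    """Stand-in: input rejected before the model ran."""


class OutputBlocked(GuardrailError):
    """Stand-in: output rejected after the model ran."""


def run_guarded(action) -> str:
    """Run a callable and turn guardrail failures into user-facing messages."""
    try:
        return action()
    except InputBlocked:
        # Most specific handler first
        return "Your request was blocked."
    except GuardrailError as exc:
        # Catch-all for any other shield
        return f"Guardrail stopped the run: {exc}"
```

In real code the stand-in classes would be replaced by imports from pydantic_ai_shields, and action would wrap the call to agent.run.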

Related Projects

Package                      Description
Pydantic Deep Agents         Full agent framework
pydantic-ai-todo             Task planning capability
subagents-pydantic-ai        Multi-agent delegation
pydantic-ai-backend          File storage and Docker sandbox
summarization-pydantic-ai    Context management
pydantic-ai                  The foundation: agent framework by Pydantic

License

MIT


Need help implementing this in your company?

We're Vstorm — an Applied Agentic AI Engineering Consultancy
with 30+ production AI agent implementations.

Talk to us



Made with ❤️ by Vstorm
