Whitelist NLP intent enforcement for MCP agents — pre-execution tool call validation

MCP Guardian

Agent intent enforcement for MCP tool calls — pre-execution security for AI agents.

MCP Guardian is not a firewall for MCP servers. It's a declarative intent guardrail for agent behavior. It validates every tool call against declared intent policies before execution. If the call doesn't match the policy, the MCP server never sees it.

Install

pip install mcp-guardian-ai

This pulls in all dependencies: openai-agents, pydantic, pyyaml.

Or install everything explicitly:

pip install mcp-guardian-ai openai-agents pydantic pyyaml

Set your OpenAI API key (used by the LLM intent evaluator — the fast check tier runs without it):

export OPENAI_API_KEY=sk-...

Note: The PyPI package is mcp-guardian-ai. The Python import is mcp_guardian.

For development from source:

git clone https://github.com/mcp-guardian/mcp-guardian.git
cd mcp-guardian
pip install -e ".[dev]"

Three Ways to Use It

Path 1: Pure Python (no files needed)

import asyncio
from agents import Agent, Runner
from agents.mcp import MCPServerStreamableHttp
from mcp_guardian import GuardianToolGuardrail, IntentPolicy

policy = IntentPolicy(
    name="read-only",
    description="Read files only — no writes, no shell",
    expected_workflow="Read and list files to answer user questions",
    forbidden_tools=["write_*", "execute_*", "delete_*"],
)
guardrail = GuardianToolGuardrail(policy=policy)

async def main():
    async with MCPServerStreamableHttp(
        name="my-server",
        params={"url": "https://my-mcp-server.example.com/mcp"},
    ) as server:
        tools = await guardrail.wrap_mcp_tools([server])
        agent = Agent(name="Worker", model="gpt-4o", tools=tools)
        result = await Runner.run(agent, "List all files")
        print(result.final_output)

    # Print audit log
    for entry in guardrail.audit_log:
        verdict = str(entry.verdict)
        icon = "✓" if verdict == "allow" else "✗"
        print(f"  {icon} {entry.tool_name} → {verdict.upper()} "
              f"(conf={entry.confidence:.2f}, {entry.method}, {entry.elapsed_ms:.0f}ms)")
        if verdict != "allow":
            print(f"    Reason: {entry.reason}")

asyncio.run(main())

Path 2: YAML policy file (recommended)

Define a policy.yaml:

name: read-only
description: Read-only file access
expected_workflow: Read and list files to answer user questions
allowed_tools: ["read_*", "list_*"]
forbidden_tools: ["write_*", "execute_*", "delete_*"]
allowed_transitions:
  list_directory: [read_file, list_directory]
  read_file: [read_file, list_directory]
constraints:
  - Do not access files outside the working directory
escalation_threshold: 0.7

Load it:

from mcp_guardian import GuardianToolGuardrail, IntentPolicy

policy = IntentPolicy.from_file("policy.yaml")
guardrail = GuardianToolGuardrail(policy=policy)

Path 3: guardian.yaml + policy files (multi-server / production)

A single guardian.yaml ties together multiple servers, per-server policies, auth headers, and model settings:

model: gpt-4o
guardian_model: gpt-4o
default_policy: policies/default.yaml
servers:
  - name: filesystem
    url: https://fs-server.example.com/mcp
    policy: policies/read-only.yaml
  - name: database
    url: https://db-server.example.com/mcp
    policy: policies/db-read-only.yaml
    headers:
      Authorization: "Bearer ${DB_TOKEN}"

from mcp_guardian import GuardianConfig  # assumed to be exported next to IntentPolicy

config = GuardianConfig.from_file("guardian.yaml")
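The `${DB_TOKEN}` placeholder in the headers suggests that secrets are taken from the environment rather than stored in the file. How the library resolves the placeholder is not described here, so the following is only a minimal sketch of one way such substitution could work, using the standard library's `string.Template`. The function name `expand_headers` is hypothetical and not part of mcp_guardian.

```python
import os
from string import Template
from typing import Dict, Mapping, Optional


def expand_headers(
    headers: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: replace ${NAME} placeholders with environment values."""
    env = os.environ if env is None else env
    # Template.substitute raises KeyError when a variable is missing, so a
    # forgotten token fails loudly instead of sending an empty credential.
    return {name: Template(value).substitute(env) for name, value in headers.items()}


# Example with an explicit mapping instead of the real environment.
print(expand_headers({"Authorization": "Bearer ${DB_TOKEN}"}, {"DB_TOKEN": "abc123"}))
```

Failing on a missing variable is a deliberate choice in this sketch: a request with a blank bearer token is harder to diagnose than an early error.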

See the Quick Start for complete examples of all three paths.

How It Works

Three-tier enforcement pipeline on every tool call:

Tier 1: Fast check (0ms). Forbidden tools, whitelists, glob patterns, and the transition graph. Deterministic and LLM-free, so prompt injection cannot change the result.
Tier 2: LLM intent evaluation (1–5s). Analyzes the call against policy constraints and workflow context.
Tier 3: Escalation. Low-confidence decisions are flagged for human review.
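To make the fast check concrete, here is a minimal sketch of glob-based allow/deny matching using the standard library's `fnmatch`. It is not the library's implementation: the names `matches_any` and `fast_check`, the "block"/"pass" return values, and the precedence rules (blacklist first, then whitelist) are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def matches_any(tool_name: str, patterns: Sequence[str]) -> bool:
    """True if the tool name matches at least one glob pattern."""
    return any(fnmatchcase(tool_name, pattern) for pattern in patterns)


def fast_check(
    tool_name: str,
    allowed_tools: Optional[Sequence[str]] = None,
    forbidden_tools: Sequence[str] = (),
) -> str:
    """Hypothetical helper: return "block", or "pass" to hand over to the next tier."""
    # Assumption: the blacklist wins over the whitelist.
    if matches_any(tool_name, forbidden_tools):
        return "block"
    # Assumption: once a whitelist is declared, anything outside it is blocked.
    if allowed_tools is not None and not matches_any(tool_name, allowed_tools):
        return "block"
    return "pass"


allowed = ["read_*", "list_*"]
forbidden = ["write_*", "execute_*", "delete_*"]
for name in ("read_file", "write_file", "send_email"):
    print(name, fast_check(name, allowed, forbidden))
```

Because the decision depends only on the tool name and the declared patterns, nothing in the agent's prompt or the tool arguments can influence it.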

The transition graph (allowed_transitions) is a state machine over tool calls — similar to LangGraph, but enforced externally on the agent rather than built into the agent's own execution graph. After tool A, only tools B and C are allowed. Everything else is blocked deterministically at 0ms.
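The state-machine idea can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: the class name `TransitionGuard` is hypothetical, the first call and tools without an entry in the graph are treated as unconstrained, and a blocked call does not advance the state. The tool name `http_post` is made up to stand in for an exfiltration attempt.

```python
from typing import Dict, List, Optional


class TransitionGuard:
    """Hypothetical sketch of a transition graph enforced outside the agent."""

    def __init__(self, allowed_transitions: Dict[str, List[str]]) -> None:
        self.allowed_transitions = allowed_transitions
        self.previous: Optional[str] = None  # last tool call that was let through

    def check(self, tool_name: str) -> bool:
        """Return True if tool_name may follow the previously allowed tool."""
        if self.previous is None or self.previous not in self.allowed_transitions:
            # Assumption: no constraint before the first call or for unlisted tools.
            allowed = True
        else:
            allowed = tool_name in self.allowed_transitions[self.previous]
        if allowed:
            self.previous = tool_name  # only allowed calls move the state machine
        return allowed


guard = TransitionGuard({
    "list_directory": ["read_file", "list_directory"],
    "read_file": ["read_file", "list_directory"],
})
for call in ("list_directory", "read_file", "http_post", "list_directory"):
    print(call, "allowed" if guard.check(call) else "blocked")
```

Here `http_post` is rejected because it is not a declared successor of `read_file`, while the following `list_directory` call is still accepted since the blocked call left the state unchanged.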

This makes MCP Guardian a reasoning guardrail, not just a tool filter. Allow/block lists alone are easy to build; the LLM intent evaluation layer goes further and supervises the agent's reasoning, catching an allowed tool called with suspicious arguments or a permitted call that doesn't fit the declared intent. In effect, a second LLM evaluates the first LLM's decisions.

The guardian LLM defaults to gpt-4o-mini (fast, cheap) but can point at any OpenAI-compatible endpoint — Ollama, vLLM, Azure OpenAI, or a fine-tuned model:

# Use a local Ollama model for the guardian
guardrail = GuardianToolGuardrail(
    policy=policy,
    guardian_model="llama3.2",
    guardian_base_url="http://localhost:11434/v1",
)

Or in guardian.yaml:

guardian_model: llama3.2
guardian_base_url: http://localhost:11434/v1

Every evaluation is logged with verdict, confidence, timing, and reasoning.
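As a rough illustration of working with that log, the sketch below aggregates entries into a small report. `AuditEntry` is a stand-in dataclass that only mirrors the attributes used in the Path 1 example (`tool_name`, `verdict`, `confidence`, `method`, `elapsed_ms`, `reason`); the real entry type, the `summarize` helper, and the sample values such as the method names are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class AuditEntry:
    """Stand-in for a log entry; field names follow the Path 1 example."""
    tool_name: str
    verdict: str
    confidence: float
    method: str
    elapsed_ms: float
    reason: str


def summarize(entries: List[AuditEntry]) -> Dict[str, Any]:
    """Hypothetical helper: count verdicts and collect reasons for non-allowed calls."""
    return {
        "counts": dict(Counter(entry.verdict for entry in entries)),
        "total_ms": sum(entry.elapsed_ms for entry in entries),
        "non_allowed": [
            (entry.tool_name, entry.reason)
            for entry in entries
            if entry.verdict != "allow"
        ],
    }


# Illustrative sample data, not real output from the library.
log = [
    AuditEntry("list_directory", "allow", 1.0, "fast_check", 0.0, "whitelisted"),
    AuditEntry("write_file", "block", 1.0, "fast_check", 0.0, "forbidden tool"),
]
print(summarize(log))
```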

Policy Fields

Field	Purpose
`allowed_tools`	Whitelist with glob patterns (`read_*`, `list_*`)
`forbidden_tools`	Blacklist, always blocked (`write_*`, `execute_*`)
`allowed_transitions`	State machine: tool A → [tool B, C]
`constraints`	Free-text rules for the LLM evaluator
`expected_workflow`	What the agent should be doing (LLM context)
`escalation_threshold`	Below this confidence → ask human
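The role of `escalation_threshold` can be shown with a small sketch. It assumes the LLM evaluator returns a verdict plus a confidence between 0 and 1, and that any decision below the threshold is routed to a human regardless of the verdict. The function name `apply_threshold` and the "escalate" label are hypothetical; the default of 0.7 is taken from the policy example above.

```python
def apply_threshold(
    verdict: str,
    confidence: float,
    escalation_threshold: float = 0.7,
) -> str:
    """Hypothetical helper: keep the verdict, or escalate when confidence is too low."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Assumption: the comparison is strict, so a confidence equal to the
    # threshold is still trusted.
    if confidence < escalation_threshold:
        return "escalate"
    return verdict


for verdict, confidence in (("allow", 0.95), ("allow", 0.55), ("block", 0.70)):
    print(verdict, confidence, apply_threshold(verdict, confidence))
```

Escalating low-confidence blocks as well as low-confidence allows keeps an unsure evaluator from silently rejecting legitimate work.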

Demo: Exfiltration Prevention

A working demo blocks a data exfiltration attack across two MCP servers. The agent reads a secret (allowed), then an adversarial prompt tries to send it to an attacker URL — blocked at Tier 1 by the transition graph (0ms) and independently at Tier 2 by the LLM constraints.

See demos/exfiltration/ for details.

Documentation

The full docs are built with MkDocs Material. Run them locally with Docker:

docker build -f Dockerfile.docs -t mcp-guardian-docs .
docker run -p 8000:8000 -v "$(pwd)/docs:/docs/docs" mcp-guardian-docs

Then open http://localhost:8000. The -v mount gives you live reload as you edit.

Or without Docker:

pip install mkdocs-material
mkdocs serve

Roadmap

The core engine (policies, fast check, transition graphs, LLM evaluation, audit log) is SDK-agnostic. Currently we ship an adapter for the OpenAI Agents SDK. Future adapters under consideration:

Runtime	Hook point	Status
OpenAI Agents SDK	`ToolInputGuardrail`	Shipped
Anthropic Claude	`PreToolUse` hook	Planned
Microsoft Agent Framework	`FunctionInvocationFilter`	Planned

Same YAML policies, same pip install, any runtime. Feedback welcome — open an issue if your framework isn't listed.

Built On

OpenAI Agents SDK — ToolInputGuardrail, AgentHooksBase
Model Context Protocol (MCP) — tool server standard

License

Apache 2.0

Release history

0.2.0 (May 14, 2026)
0.1.1 (Mar 13, 2026), this version
0.1.0 (Mar 12, 2026)

