Scan your agentic codebase for unguarded tool calls with real-world side effects
diplomat-agent
Your AI agent can call stripe.Refund.create() 200 times in a loop.
diplomat-agent finds these functions before they ship.
pip install diplomat-agent
diplomat-agent scan .
Scans your Python AI agent code. Reports every function that can change the real world — and shows which ones have no checks. Zero dependencies. 2 seconds on a 1,000-file repo.
What it looks like
diplomat-agent — governance scan
Scanned: ./my-agent
Tool calls with side effects: 12
⚠ process_refund(amount, customer_id)
Write protection: NONE
Rate limit: NONE
→ stripe.Refund.create() with no amount limit
Governance: ❌ UNGUARDED
⚠ delete_user_data(user_id)
Confirmation step: NONE
Batch protection: NONE
→ session.delete() with no confirmation
Governance: ❌ UNGUARDED
✓ update_order(order_id)
Governance: ✅ GUARDED
────────────────────────────────────────────
RESULT: 8 unguarded · 3 partial · 1 guarded (12 total)
Why this matters for AI agents
In a web app, a human clicks a button. The UI has validation, confirmation dialogs, rate limits per session.
In an agent, an LLM decides which functions to call, with what arguments, how many times. It doesn't know your business rules. It can loop, hallucinate arguments, or get prompt-injected.
Without guards in the code, there's nothing between the LLM's decision and the real-world consequence.
We scanned 16 open-source agent repos. 76% of tool calls had zero checks.
What it detects
40+ patterns across 8 categories:
| Category | Examples |
|---|---|
| Database writes | session.commit(), .save(), .create(), .update() |
| Database deletes | session.delete(), .remove(), DELETE FROM |
| HTTP writes | requests.post(), httpx.put(), client.patch() |
| Payments | stripe.Charge.create(), stripe.Refund.create() |
| Email / messaging | smtp.sendmail(), ses.send_email(), slack.chat_postMessage() |
| Agent invocations | graph.ainvoke(), agent.execute(), Runner.run_sync() |
| Destructive commands | subprocess.run(), exec(), eval() |
| Publish / upload | s3.put_object(), client.publish() |
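A minimal sketch of how pattern matching like this can work on the Python AST, using only the stdlib ast module. This is not diplomat-agent's implementation: the four-entry pattern table and the helper names (dotted_name, find_side_effects) are hypothetical, chosen to mirror a few rows of the table above.

```python
import ast

# Hypothetical pattern table for illustration; the real scanner ships 40+ patterns.
SIDE_EFFECT_CALLS = {
    "stripe.Refund.create": "Payments",
    "session.delete": "Database deletes",
    "requests.post": "HTTP writes",
    "subprocess.run": "Destructive commands",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def find_side_effects(source):
    """Return (function, call, category, line) for each matching call."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                name = dotted_name(node.func)
                if name in SIDE_EFFECT_CALLS:
                    findings.append(
                        (func.name, name, SIDE_EFFECT_CALLS[name], node.lineno)
                    )
    return findings


SAMPLE = '''
def process_refund(amount, customer_id):
    stripe.Refund.create(amount=amount, customer=customer_id)

def greet(name):
    return "hello " + name
'''

for finding in find_side_effects(SAMPLE):
    print(finding)
```

The sketch matches dotted names literally, so a call made through an aliased import (import requests as r) would be missed. A production scanner needs more than this.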
What counts as a guard: input validation, rate limiting, auth checks, confirmation steps, idempotency keys, retry bounds. Full list →
Integrate everywhere
CI — block unguarded PRs
- name: Diplomat governance scan
  run: |
    pip install diplomat-agent
    diplomat-agent scan . --fail-on-unchecked
IDE — review what the copilot wrote
Works in your IDE with no extension to install:
| IDE | How | Setup |
|---|---|---|
| Copilot Chat (VS Code, Cursor, Windsurf) | Select "Diplomat Reviewer" in agent dropdown | Copy .github/agents/diplomat-reviewer.agent.md |
| Claude Code | Ask "scan for unguarded tool calls" | AGENTS.md at repo root (included) |
| Cursor (native) | Auto-activates on Python files | Copy .cursor/rules/diplomat-reviewer.mdc |
Pre-commit hook
repos:
  - repo: https://github.com/Diplomat-ai/diplomat-agent
    rev: v0.4.0
    hooks:
      - id: diplomat-agent
SARIF — native VS Code Problems panel
diplomat-agent scan . --format sarif --output results.sarif
Open with SARIF Viewer. Or upload to GitHub Code Scanning.
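If you want to post-process the SARIF file yourself, a short script is enough. The sketch below relies only on the standard SARIF 2.1.0 layout (runs, results, locations). The sample log is hypothetical: the rule ID, file path, and line number are invented, and the actual rule IDs the scanner emits may differ.

```python
import json
from collections import Counter

# Hypothetical SARIF 2.1.0 log; rule ID, path, and line are made up.
SAMPLE_SARIF = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "diplomat-agent"}},
        "results": [{
            "ruleId": "unguarded-tool-call",
            "level": "warning",
            "message": {"text": "stripe.Refund.create() with no amount limit"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "agent/tools.py"},
                    "region": {"startLine": 42},
                }
            }],
        }],
    }],
})


def summarize(sarif_text):
    """Return (counts by level, list of 'file:line: message' strings)."""
    log = json.loads(sarif_text)
    levels = Counter()
    lines = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            # SARIF treats a missing level as "warning".
            levels[result.get("level", "warning")] += 1
            message = result.get("message", {}).get("text", "")
            for loc in result.get("locations", []):
                phys = loc.get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri", "?")
                line = phys.get("region", {}).get("startLine", 0)
                lines.append(f"{uri}:{line}: {message}")
    return levels, lines


levels, lines = summarize(SAMPLE_SARIF)
print(dict(levels))
print("\n".join(lines))
```

To run it on a real scan, pass the contents of results.sarif to summarize() instead of the sample.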
Scan only changed files
diplomat-agent scan . --diff-only
Generate your agent's SBOM
diplomat-agent scan . --format registry --output-registry toolcalls.yaml
Like requirements.txt — but for what your agent can do, not what it
depends on. Commit it. Diff it in PRs. When your agent gains a new
capability, the change shows up in review.
Benchmarks
| Repo | Files | Tool calls | Unguarded | Time |
|---|---|---|---|---|
| Skyvern | 595 | 452 | 345 (76%) | ~2s |
| Dify | 1,000+ | 1,009 | 759 (75%) | ~3s |
| PraisonAI | — | 1,028 | 911 (89%) | ~2s |
| CrewAI | — | 348 | 273 (78%) | ~1s |
Output formats
| Format | Flag | Use case |
|---|---|---|
| Terminal (default) | — | Human review |
| JSON | --format json | IDE agents, automation |
| SARIF 2.1.0 | --format sarif | VS Code, GitHub Code Scanning |
| CSAF 2.0 | --format csaf | Security teams, CERTs |
| Markdown | --format markdown | Documentation, reports |
| Registry | --format registry | toolcalls.yaml SBOM |
Acknowledge a tool call
If a function is intentionally unguarded or protected elsewhere:
def send_alert(message): # checked:ok — protected by API gateway
    requests.post(ALERT_URL, json={"msg": message})
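One way a tool can pick up an acknowledgement comment like this is to read comment tokens with the stdlib tokenize module, which avoids false matches on the same text inside a string literal. This is a sketch under that assumption, not the scanner's actual parsing code; acknowledged_lines is a hypothetical helper.

```python
import io
import tokenize

MARKER = "checked:ok"


def acknowledged_lines(source):
    """Return {line_number: reason} for every comment carrying the marker."""
    found = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and MARKER in tok.string:
            # Keep the text after the marker, minus separator characters
            # (space, hyphen, U+2014 dash, colon).
            reason = tok.string.split(MARKER, 1)[1].strip(" -\u2014:")
            found[tok.start[0]] = reason
    return found


SAMPLE = (
    "def send_alert(message):  # checked:ok - protected by API gateway\n"
    "    requests.post(ALERT_URL, json={'msg': message})\n"
)
print(acknowledged_lines(SAMPLE))
```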
From scanning to runtime
diplomat-agent finds the gaps. diplomat-gate protects them at runtime.
pip install diplomat-gate
from diplomat_gate import Gate
gate = Gate.from_yaml("gate.yaml")
verdict = gate.evaluate({"action": "charge_card", "amount": 15000})
# → STOP — Amount 15000 exceeds limit of 10000
15+ pre-built policies (payments, emails, shell commands). CONTINUE / REVIEW / STOP in < 1ms. Zero dependencies.
diplomat-gate → · diplomat.run → (hosted control plane with hash-chained audit trail)
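To make the three verdicts concrete, here is a toy amount policy that returns CONTINUE, REVIEW, or STOP. It does not use or reproduce diplomat-gate: the Verdict class, evaluate_amount, and both thresholds are hypothetical, and real policies would come from gate.yaml.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    decision: str  # "CONTINUE", "REVIEW", or "STOP"
    reason: str


# Hypothetical thresholds for illustration only.
REVIEW_ABOVE = 5_000
STOP_ABOVE = 10_000


def evaluate_amount(call):
    """Map a proposed tool call to a verdict based on its amount."""
    amount = call.get("amount", 0)
    if amount > STOP_ABOVE:
        return Verdict("STOP", f"Amount {amount} exceeds limit of {STOP_ABOVE}")
    if amount > REVIEW_ABOVE:
        return Verdict("REVIEW", f"Amount {amount} needs human approval")
    return Verdict("CONTINUE", "within limits")


for amount in (1_000, 7_500, 15_000):
    verdict = evaluate_amount({"action": "charge_card", "amount": amount})
    print(amount, verdict.decision, verdict.reason)
```

The agent runtime would call the tool only on CONTINUE, queue the call for a human on REVIEW, and refuse on STOP.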
Standards alignment
Known limitations
- Static analysis only — no runtime detection
- Python only — TypeScript on the roadmap
- Intra-procedural + same-package decorators only; use # checked:ok for guards in external packages
- Full limitations →
Roadmap
- Python AST scanner (40+ patterns)
- toolcalls.yaml behavioral SBOM
- CSAF 2.0 + SARIF 2.1.0 output
- CI integration (--fail-on-unchecked)
- IDE agents (Copilot Chat, Claude Code, Cursor)
- Pre-commit hook
- --diff-only and --file modes
- Inter-procedural decorator resolution
- TypeScript support
- MCP server scanning
- VS Code extension (inline diagnostics on save)
- PR comment integration
Requirements
- Python 3.9+
- Zero dependencies (stdlib ast only)
- Optional: rich (colored output), pyyaml (registry)
License
Apache 2.0 — Copyright 2026 Diplomat Services SAS
File details
Details for the file diplomat_agent-0.4.0.tar.gz.
File metadata
- Download URL: diplomat_agent-0.4.0.tar.gz
- Upload date:
- Size: 104.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43402ec06a79426a9df1be2c2e267b68ee54ecf7296e5d8645325942fdb3beff |
| MD5 | f37e1a6f25411f9d75da74f2d87d0ff3 |
| BLAKE2b-256 | c8cf7b38ac719b532c1c80509f2dbbdd1cce3a7f7e43f46b02b657e9429e9b62 |
File details
Details for the file diplomat_agent-0.4.0-py3-none-any.whl.
File metadata
- Download URL: diplomat_agent-0.4.0-py3-none-any.whl
- Upload date:
- Size: 53.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 29969d2daef6b052ef95fd3ed9a966c1abe337d3cee31ee4c02b1ac5f34fd940 |
| MD5 | 82c5dbb1b7fb9a35de195fafac6948e6 |
| BLAKE2b-256 | 5f84760f3dd0a3eda6372690075409f0b2344168d4476ddc8288dcb1baf17c7c |