
AvaKill

Open-source safety firewall for AI agents


One YAML policy. Three independent enforcement paths. Every agent protected.

pipx install avakill && avakill setup

Quickstart · How It Works · Integrations · Policy · CLI · Docs · Contributing


The Problem

AI agents are shipping to production with zero safety controls on their tool calls. The results are predictable:

  • Replit's agent dropped a production database and fabricated 4,000 fake user accounts to cover it up.
  • Google's Gemini CLI wiped a user's entire D: drive — 8,000+ files, gone.
  • Amazon Q terminated EC2 instances and deleted infrastructure during a debugging session.

These aren't edge cases. Research shows AI agents fail in 75% of real-world tasks, and when they fail, they fail catastrophically — because nothing sits between the agent and its tools.

AvaKill is that missing layer. A firewall that intercepts every tool call, evaluates it against your safety policies, and kills dangerous operations before they execute. No ML models, no API calls, no latency — just fast, deterministic policy checks in <1ms.

Quickstart

pipx install avakill
avakill setup

macOS note: on macOS 14+ the system Python environment is marked externally managed (PEP 668), so a bare pip install is refused. Use pipx or a virtualenv.

avakill setup walks you through an interactive flow that:

  1. Detects agents across three enforcement paths (hooks, MCP proxy, OS sandbox)
  2. Creates a policy from a catalog of 81 rules across 14 categories
  3. Installs hooks for detected agents (Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, OpenClaw)
  4. Wraps MCP servers for MCP-capable agents (Claude Desktop, Cline, Continue)
  5. Shows sandbox commands for agents that support OS-level containment
  6. Enables tracking (optional) for audit logs and diagnostics

After setup, test it:

echo '{"tool": "Bash", "args": {"command": "rm -rf /"}}' | avakill evaluate --policy avakill.yaml
# deny: Matched rule 'block-catastrophic-shell'

Safe calls pass through. Destructive calls are killed before they execute.

Optional framework extras

pip install "avakill[openai]"       # OpenAI function calling
pip install "avakill[anthropic]"    # Anthropic tool use
pip install "avakill[langchain]"    # LangChain / LangGraph
pip install "avakill[mcp]"          # MCP proxy
pip install "avakill[all]"          # Everything

How It Works

AvaKill enforces a single YAML policy across three independent enforcement paths. Each path works standalone — no daemon required, no single point of failure.

avakill.yaml (one policy file)
    |
    ├── Hooks (Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, OpenClaw)
    |     → work standalone, evaluate in-process
    |
    ├── MCP Proxy (wraps MCP servers)
|     → works standalone, evaluates in-process
    |
    ├── OS Sandbox (launch + profiles)
    |     → works standalone, OS-level enforcement
    |
    └── Daemon (optional)
          → shared evaluation, audit logging
          → hooks/proxy CAN talk to it if running
          → enables: logs, fix, tracking, approvals, metrics

One Policy File
avakill.yaml is the single source of truth. Deny-by-default, allow lists, rate limits, argument pattern matching, shell safety checks, path resolution, and content scanning.

Native Agent Hooks
Drop-in hooks for Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, and OpenClaw. One command to install. Works standalone — no daemon required.

MCP Proxy
Wraps any MCP server with policy enforcement. Scans tool responses for secrets, PII, and prompt injection. Works standalone, evaluates in-process.
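To make the scanning idea concrete, here is a minimal sketch of the kind of check a response scanner performs. The patterns and function name are illustrative assumptions, not AvaKill's actual detection rules:

```python
import re

# Illustrative secret patterns -- a real scanner covers far more.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "generic_api_key": re.compile(r"(?i)\bapi[_-]?key\s*[:=]\s*['\"]?[A-Za-z0-9]{20,}"),
}

def scan_for_secrets(text: str) -> list[str]:
    """Return the names of secret patterns found in a tool response."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]
```

A hit on any pattern would cause the proxy to block or redact the response before it reaches the agent.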

OS Sandbox
Launch agents in OS-level sandboxes. Landlock on Linux, sandbox-exec on macOS, AppContainer on Windows. Deny-default, kernel-level enforcement.

Sub-Millisecond
Pure rule evaluation, no ML models. Adds <1ms overhead to tool calls that already take 500ms-5s. Three enforcement paths, zero bottlenecks.

Optional Daemon
Shared evaluation, audit logging, and visibility tooling. Hooks and proxy can talk to it when running. Enables logs, tracking, approvals, and metrics.

Integrations

Native Agent Hooks

Protect AI agents with zero code changes — just install the hook:

# Install hooks (works standalone — no daemon required)
avakill hook install --agent claude-code  # or cursor, windsurf, gemini-cli, openai-codex, kiro, amp, openclaw, all
avakill hook list

Hooks work standalone by default — each hook evaluates policies in-process. Policies use canonical tool names (shell_execute, file_write, file_read) so one policy works across all agents.
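The normalization step can be pictured as a simple lookup. This sketch uses only tool names that appear on this page; the real mapping in AvaKill covers many more aliases:

```python
# Illustrative mapping from agent-specific tool names to canonical ones.
CANONICAL_TOOLS = {
    "Bash": "shell_execute",
    "run_shell_command": "shell_execute",
    "run_command": "shell_execute",
    "exec_command": "shell_execute",
    "Write": "file_write",
    "Edit": "file_edit",
    "Read": "file_read",
}

def normalize_tool(name: str) -> str:
    """Map an agent-specific tool name to its canonical policy name."""
    return CANONICAL_TOOLS.get(name, name)
```

Because every agent's tool call is normalized before evaluation, one rule on shell_execute covers Claude Code's Bash, Gemini CLI's run_shell_command, and so on.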

Agent Hook Status
Claude Code Battle-tested
Cursor Supported
Windsurf Supported
Gemini CLI Supported
OpenAI Codex Supported
Kiro Supported
Amp Supported
OpenClaw Native plugin (6-layer)

OpenClaw native plugin: OpenClaw uses a dedicated plugin (avakill-openclaw) with 6 enforcement layers — hard block, guard tool, output scanning, message gate, spawn control, and context injection. Install with openclaw plugins install avakill-openclaw. Sandbox is available as a fallback via avakill launch --agent openclaw.

MCP Proxy

Wrap MCP servers to route all tool calls through AvaKill:

avakill mcp-wrap --agent claude-desktop   # or cursor, windsurf, cline, continue, all
avakill mcp-unwrap --agent all            # Restore original configs

Supported agents: Claude Desktop, Cursor, Windsurf, Cline, Continue.dev.

OS Sandbox

Launch agents in OS-level sandboxes with pre-built profiles:

avakill profile list                    # See available profiles
avakill profile show aider              # See what a profile restricts
avakill launch --agent aider -- aider   # Launch with OS sandbox

Profiles ship for OpenClaw (fallback — prefer the native plugin), Cline, Continue, SWE-Agent, and Aider.

Python SDK

For programmatic integration, AvaKill's Guard is available as a Python API:

from avakill import Guard, protect

guard = Guard(policy="avakill.yaml")

@protect(guard=guard, on_deny="return_none")  # or "raise" (default), "callback"
def execute_sql(query: str) -> str:
    return db.execute(query)  # `db` is your application's own database handle

Framework wrappers:

# OpenAI
from avakill import GuardedOpenAIClient
client = GuardedOpenAIClient(OpenAI(), policy="avakill.yaml")

# Anthropic
from avakill import GuardedAnthropicClient
client = GuardedAnthropicClient(Anthropic(), policy="avakill.yaml")

# LangChain / LangGraph
from avakill import AvaKillCallbackHandler
handler = AvaKillCallbackHandler(policy="avakill.yaml")
agent.invoke({"input": "..."}, config={"callbacks": [handler]})

Policy Configuration

Policies are YAML files. Rules are evaluated top-to-bottom — first match wins.

version: "1.0"
default_action: deny

policies:
  # Allow safe shell with allowlist + metacharacter protection
  - name: "allow-safe-shell"
    tools: ["shell_execute", "Bash", "run_shell_command", "run_command",
            "shell", "local_shell", "exec_command"]
    action: allow
    conditions:
      shell_safe: true
      command_allowlist: [echo, ls, cat, pwd, git, python, pip, npm, node, make]

  # Block destructive SQL
  - name: "block-destructive-sql"
    tools: ["execute_sql", "database_*"]
    action: deny
    conditions:
      args_match:
        query: ["DROP", "DELETE", "TRUNCATE", "ALTER"]
    message: "Destructive SQL blocked. Use a manual migration."

  # Block writes to system directories
  - name: "block-system-writes"
    tools: ["file_write", "file_edit", "Write", "Edit"]
    action: deny
    conditions:
      path_match:
        file_path: ["/etc/", "/usr/", "/bin/", "/sbin/"]

  # Scan for secrets in tool arguments
  - name: "block-secret-leaks"
    tools: ["*"]
    action: deny
    conditions:
      content_scan: true

  # Rate limit API calls
  - name: "rate-limit-search"
    tools: ["web_search"]
    action: allow
    rate_limit:
      max_calls: 10
      window: "60s"

  # Require human approval for file writes
  - name: "approve-writes"
    tools: ["file_write"]
    action: require_approval

Policy features:

  • Glob patterns — *, delete_*, *_execute match tool names
  • Argument matching — args_match / args_not_match inspect arguments (case-insensitive substring)
  • Shell safety — shell_safe blocks metacharacters; command_allowlist restricts to known-good binaries
  • Path resolution — path_match / path_not_match with symlink resolution, ~ and $HOME expansion
  • Content scanning — content_scan detects secrets, PII, and prompt injection in arguments
  • Rate limiting — sliding window (10s, 5m, 1h)
  • Approval gates — require_approval pauses until a human grants or rejects
  • Enforcement levels — hard (default), soft, or advisory
  • First-match-wins — order matters, put specific rules before general ones

Full reference: docs/policy-reference.md

CLI

Setup & Policy

avakill setup                              # Interactive setup — detects agents, builds policy, installs hooks
avakill validate avakill.yaml              # Validate a policy file
avakill rules                              # Browse and toggle catalog rules
avakill rules list                         # Show all rules with sources
avakill rules create                       # Interactive custom rule creation
avakill reset                              # Factory-reset AvaKill

Hooks & MCP

avakill hook install --agent all           # Install hooks for detected agents
avakill hook list                          # Show hook status
avakill mcp-wrap --agent all               # Wrap MCP servers with policy enforcement
avakill mcp-unwrap --agent all             # Restore original MCP configs

Monitoring & Recovery

avakill fix                                # See why a call was blocked and how to fix it
avakill logs --denied-only --since 1h      # Query audit logs
avakill logs tail                          # Follow new events in real-time
avakill tracking on                        # Enable activity tracking

Evaluate & Approve

echo '{"tool": "Bash", "args": {"command": "rm -rf /"}}' | avakill evaluate --policy avakill.yaml
avakill review avakill.proposed.yaml       # Review proposed policy changes
avakill approve avakill.proposed.yaml      # Activate proposed policy (human-only)
avakill approvals list                     # List pending approval requests
avakill approvals grant REQUEST_ID         # Approve a pending request

Daemon

avakill daemon start --policy avakill.yaml # Start persistent daemon (optional)
avakill daemon status                      # Check daemon status
avakill daemon stop                        # Stop daemon

Security & Compliance

avakill keygen                             # Generate Ed25519 keypair
avakill sign avakill.yaml                  # Sign policy
avakill verify avakill.yaml                # Verify signature
avakill harden avakill.yaml                # Set OS-level immutable flags
avakill compliance report --framework soc2 # Compliance assessment
avakill compliance gaps                    # Show compliance gaps

Generate Policies with Any LLM

avakill schema --format=prompt             # Generate a prompt for any LLM
avakill schema --format=prompt --tools="execute_sql,shell_exec" --use-case="data pipeline"
avakill validate generated-policy.yaml     # Validate the LLM's output

Why AvaKill?

                                      No Protection   Prompt Guardrails    AvaKill
Stops destructive tool calls               ✗                  ✗               ✓
Works across all major agents                              Partial            ✓
Three independent enforcement paths                           ✗               ✓
Deterministic (no LLM needed)                                 ✗               ✓
<1ms overhead                                          ✗ (LLM round-trip)     ✓
YAML-based policies                                           ✗               ✓
Full audit trail                           ✗                  ✗               ✓
Human-in-the-loop approvals                ✗                  ✗               ✓
Self-protection (anti-tampering)           ✗                  ✗               ✓
Open source                                                 Some          ✓ AGPL-3.0

Roadmap

Stable

  • Core policy engine with glob patterns, argument matching, rate limiting
  • Interactive setup wizard with 81-rule catalog (avakill setup)
  • Native agent hooks (Claude Code, Cursor, Windsurf, Gemini CLI, Codex, Kiro, Amp, OpenClaw)
  • MCP proxy with avakill mcp-wrap and avakill-shim (Go binary)
  • OS-level sandboxing — Landlock, sandbox-exec, AppContainer
  • Standalone hook mode (no daemon required)
  • Persistent daemon with Unix socket (<5ms evaluation)
  • Shell safety (shell_safe + command_allowlist)
  • Path resolution with symlink detection, ~ and $HOME expansion
  • Content scanning (secrets, PII, prompt injection)
  • SQLite audit logging with async batched writes
  • Tool name normalization across agents
  • Multi-level policy cascade (system/global/project/local)
  • Human-in-the-loop approval workflows
  • Policy propose / review / approve workflow
  • Recovery UX (avakill fix)
  • Self-protection (hardcoded anti-tampering rules)
  • Policy signing (HMAC-SHA256 + Ed25519)
  • Compliance reports (SOC 2, NIST AI RMF, EU AI Act, ISO 42001)
  • @protect decorator for any Python function
  • Framework wrappers (OpenAI, Anthropic, LangChain/LangGraph)
  • avakill rules for post-setup rule management

Planned

  • Real-time monitoring dashboard
  • MCP HTTP transport proxy (Streamable HTTP)
  • Slack / webhook / PagerDuty notifications
  • CrewAI / AutoGen / custom framework interceptors

Contributing

We welcome contributions! AvaKill is early-stage and there's a lot to build.

git clone https://github.com/log-bell/avakill.git
cd avakill
make dev    # Install in dev mode with all dependencies
make test   # Run the test suite

See CONTRIBUTING.md for the full guide — architecture overview, code style, and PR process.

License

AGPL-3.0 — free to use, modify, and distribute. If you offer AvaKill as a network service, you must release your source code under the same license. See LICENSE for details.


She doesn't guard. She kills.

If AvaKill would have saved you from an AI agent disaster, give it a star.

Built because an AI agent tried to DROP TABLE users on a Friday afternoon.
