๐Ÿ›ก๏ธ AgentHalt

Production-grade guardrails for AI agent function calls.

Prevent overspending, unauthorized purchases, unsafe deletions, and more โ€” with Human-in-the-Loop (HIL) approval that lives outside the prompt.

Python 3.10+ License: Apache 2.0 CI Dashboard

Built by HZYAI โ€” the team behind RAGScore (12.5K+ downloads). NVIDIA Inception ยท AWS Startup.


Why AgentHalt?

AI agents are increasingly making real-world function calls: sending emails, making purchases, deleting documents, calling APIs. But what happens when an agent goes rogue?

Incidents AgentHalt is designed to prevent:

  • 🗑️ An agent auto-deleting emails without permission
  • 💸 Runaway API calls burning through thousands of dollars in minutes
  • 🛒 An agent making unauthorized purchases
  • 🔄 Infinite loops calling the same tool repeatedly
  • 🔑 API keys or PII leaking through function arguments

AgentHalt is different from prompt-based guardrails:

  • Policies are defined in code or YAML, not in the system prompt
  • Enforcement happens outside the prompt, so policies cannot be jailbroken or prompt-injected away
  • Works as middleware between the agent and tool execution
  • Provides a proper Human-in-the-Loop approval flow

Quick Start

Installation

pip install agenthalt

30-Second Example

import asyncio
from agenthalt import (
    PolicyEngine, CallContext,
    BudgetGuard, BudgetConfig,
    DeletionGuard, DeletionConfig,
    PurchaseGuard, PurchaseConfig,
)

async def main():
    # Create the engine and add guards
    engine = PolicyEngine()
    engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))
    engine.add_guard(DeletionGuard(DeletionConfig(
        allow_patterns=["temp_*", "draft_*"],
        protected_resources=["inbox", "sent"],
    )))
    engine.add_guard(PurchaseGuard(PurchaseConfig(
        max_single_purchase=100.0,
        require_approval_above=50.0,
    )))

    # Evaluate a function call before executing it
    result = await engine.evaluate(CallContext(
        function_name="delete_email",
        arguments={"email_id": "inbox"},
    ))

    if result.is_allowed:
        execute_function(...)
    elif result.needs_approval:
        # Route to human approval
        ...
    else:
        print(f"Blocked: {result.denial_reasons}")

asyncio.run(main())

Built-in Guards

💰 Budget Guard

Prevent overspending on API calls and external services.

from agenthalt import BudgetGuard, BudgetConfig

guard = BudgetGuard(BudgetConfig(
    max_call_cost=1.0,          # Max cost per individual call
    max_session_spend=5.0,      # Max spend per session
    max_daily_spend=50.0,       # Max spend per day
    max_monthly_spend=500.0,    # Max spend per month
    warn_threshold=0.8,         # Require approval at 80% of limit
    cost_estimator={            # Known costs per function
        "gpt4_call": 0.03,
        "image_generation": 0.04,
    },
))
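
The warn_threshold comment above (approval at 80% of a limit) can be pictured with a small stand-alone sketch. This is not AgentHalt's implementation: the BudgetSketch class and its string verdicts are hypothetical, and it assumes the threshold is compared against the projected spend (already spent plus the cost of the pending call).

```python
from dataclasses import dataclass


@dataclass
class BudgetSketch:
    """Hypothetical stand-in for a single spend limit; not AgentHalt's code."""

    limit: float            # e.g. the value given as max_daily_spend
    warn_threshold: float   # fraction of the limit that triggers approval
    spent: float = 0.0

    def check(self, cost: float) -> str:
        # Assumption: the decision is based on spend *after* the pending call.
        projected = self.spent + cost
        if projected > self.limit:
            return "deny"
        if projected >= self.limit * self.warn_threshold:
            return "require_approval"
        return "allow"

    def record(self, cost: float) -> None:
        # Call this only once the guarded function has actually run.
        self.spent += cost


budget = BudgetSketch(limit=50.0, warn_threshold=0.8)
budget.record(30.0)
print(budget.check(5.0))    # projected 35.0, below the 40.0 warning line
print(budget.check(12.0))   # projected 42.0, between 40.0 and 50.0
print(budget.check(25.0))   # projected 55.0, over the limit
```

Keeping check and record separate means a denied or unapproved call never counts against the budget.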

🛒 Purchase Guard

Prevent unauthorized or excessive purchases.

from agenthalt import PurchaseGuard, PurchaseConfig

guard = PurchaseGuard(PurchaseConfig(
    max_single_purchase=100.0,        # Max per transaction
    max_daily_purchases=500.0,        # Max daily total
    max_purchase_count_per_day=10,    # Max transactions per day
    require_approval_above=50.0,      # HIL above this amount
    blocked_categories=["luxury", "gambling"],
))

๐Ÿ—‘๏ธ Deletion Guard

Restrict document and resource deletion to preset guidelines.

from agenthalt import DeletionGuard, DeletionConfig

guard = DeletionGuard(DeletionConfig(
    allow_patterns=["temp_*", "draft_*", "cache_*"],
    deny_patterns=["*_production", "*_backup"],
    protected_resources=["inbox", "sent", "important"],
    require_approval_always=True,
    max_bulk_delete=5,
    max_deletions_per_day=20,
    soft_delete_only=True,
    cooldown_seconds=5.0,
))

โฑ๏ธ Rate Limit Guard

Prevent runaway agent loops and excessive function calls.

from agenthalt import RateLimitGuard, RateLimitConfig

guard = RateLimitGuard(RateLimitConfig(
    max_calls_per_minute=30,
    max_calls_per_minute_per_function=10,
    max_calls_per_session=200,
    max_identical_calls=3,       # Detect stuck loops
    burst_threshold=15,          # Calls in burst window
    burst_window_seconds=5.0,
    cooldown_seconds=30.0,       # Cooldown after burst
))
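
To make the per-minute limit and the stuck-loop check more concrete, here is a minimal stand-alone sketch of a sliding-window counter and a repeated-call detector. SlidingWindowLimiter and is_stuck_loop are hypothetical names, not part of AgentHalt, and the library's actual algorithm may differ. Timestamps are passed in explicitly so the behaviour is easy to follow; real code would typically pass time.monotonic().

```python
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sliding-window counter; not AgentHalt's implementation."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._timestamps: deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_calls:
            return False
        self._timestamps.append(now)
        return True


def is_stuck_loop(recent_calls: list[tuple[str, str]], max_identical: int) -> bool:
    """True when the trailing run of identical calls exceeds max_identical."""
    if not recent_calls:
        return False
    last = recent_calls[-1]
    run = 0
    for call in reversed(recent_calls):
        if call != last:
            break
        run += 1
    return run > max_identical


limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60.0)
# The fourth call inside the window is refused; at t=60 the first one has expired.
print([limiter.allow(t) for t in (0.0, 1.0, 2.0, 3.0, 60.0)])
print(is_stuck_loop([("search", "q=cats")] * 4, max_identical=3))
```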

๐Ÿ” Sensitive Data Guard

Block actions involving PII, credentials, or sensitive information.

from agenthalt import SensitiveDataGuard, SensitiveDataConfig

guard = SensitiveDataGuard(SensitiveDataConfig(
    blocked_patterns=["ssn", "credit_card", "api_key", "aws_key", "jwt"],
    sensitive_fields=["password", "secret", "token"],
    custom_patterns={"employee_id": r"EMP-\d{6}"},
    redact_on_modify=True,       # Redact instead of deny
))
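
The "redact instead of deny" idea can be sketched with plain regular expressions. Everything below is illustrative: redact_arguments, the PATTERNS table and the SSN regex are assumptions made for this example, not AgentHalt's detectors. Only the EMP-\d{6} pattern and the sensitive field names come from the configuration above.

```python
import re

# Illustrative patterns only; production-grade detectors need far more care.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "employee_id": re.compile(r"EMP-\d{6}"),
}
SENSITIVE_FIELDS = {"password", "secret", "token"}


def redact_arguments(arguments: dict[str, str]) -> dict[str, str]:
    """Return a copy of the arguments with sensitive values masked."""
    redacted = {}
    for key, value in arguments.items():
        # Fields with a sensitive name are masked entirely.
        if key.lower() in SENSITIVE_FIELDS:
            redacted[key] = "[REDACTED]"
            continue
        # Other fields keep their text, minus any matching substrings.
        for name, pattern in PATTERNS.items():
            value = pattern.sub(f"[REDACTED:{name}]", value)
        redacted[key] = value
    return redacted


print(redact_arguments({
    "note": "Ask EMP-123456 about 123-45-6789",
    "password": "hunter2",
}))
```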

🎯 Scope Guard

Restrict which tools/functions an agent is allowed to call.

from agenthalt import ScopeGuard, ScopeConfig

# Whitelist mode
guard = ScopeGuard(ScopeConfig(
    allow_functions=["get_*", "list_*", "search_*"],
))

# Blacklist mode with per-agent overrides
guard = ScopeGuard(ScopeConfig(
    deny_functions=["drop_*", "format_*", "shutdown_*"],
    require_approval_functions=["send_email", "post_*"],
    deny_by_agent={"untrusted_agent": ["send_*", "delete_*"]},
    read_only_mode=False,
))
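
One way to read the patterns above is as shell-style globs. The sketch below uses the standard library's fnmatch to show how global denies, per-agent denies and approval rules could combine. scope_decision and the precedence it applies are assumptions for illustration; AgentHalt's actual matching rules are not documented here.

```python
from fnmatch import fnmatchcase

DENY = ["drop_*", "format_*", "shutdown_*"]
NEEDS_APPROVAL = ["send_email", "post_*"]
DENY_BY_AGENT = {"untrusted_agent": ["send_*", "delete_*"]}


def matches(name: str, patterns: list[str]) -> bool:
    # fnmatchcase gives case-sensitive glob matching on every platform.
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def scope_decision(function_name: str, agent_id: str) -> str:
    """Assumed precedence: global deny, then per-agent deny, then approval."""
    if matches(function_name, DENY):
        return "deny"
    if matches(function_name, DENY_BY_AGENT.get(agent_id, [])):
        return "deny"
    if matches(function_name, NEEDS_APPROVAL):
        return "require_approval"
    return "allow"


print(scope_decision("send_email", "untrusted_agent"))
print(scope_decision("send_email", "support_agent"))
print(scope_decision("get_user", "support_agent"))
```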

YAML Configuration

Define all guards in a single YAML file:

# agenthalt.yaml
guards:
  budget:
    max_daily_spend: 10.0
    warn_threshold: 0.8
    cost_estimator:
      gpt4_call: 0.03
      web_search: 0.01

  deletion:
    allow_patterns: ["temp_*", "draft_*"]
    protected_resources: ["inbox", "sent"]
    require_approval_always: true

  purchase:
    max_single_purchase: 100.0
    require_approval_above: 50.0
    blocked_categories: ["luxury", "gambling"]

  rate_limit:
    max_calls_per_minute: 30
    max_identical_calls: 3

  scope:
    deny_functions: ["drop_*", "format_*"]

  sensitive_data:
    blocked_patterns: ["ssn", "credit_card", "api_key"]

Then load the file to build an engine:

from agenthalt.config import load_config

engine = load_config("agenthalt.yaml")

Human-in-the-Loop (HIL) Approval

AgentHalt provides pluggable approval handlers:

from agenthalt.hil.approval import (
    ConsoleApprovalHandler,   # Interactive CLI prompt
    CallbackApprovalHandler,  # Custom callback (Slack, webhooks, etc.)
    AutoDenyHandler,          # Auto-deny for CI/testing
)

# Console approval (development)
handler = ConsoleApprovalHandler(timeout=300.0)

# Custom approval flow (production)
# `slack` is a placeholder for your own Slack client, and ApprovalResponse
# must also be imported (its module path is not shown in this README).
async def slack_approval(request):
    # Send to Slack, wait for response
    channel_msg = await slack.post(f"Approve {request.call_context.function_name}?")
    reaction = await slack.wait_for_reaction(channel_msg, timeout=600)
    return ApprovalResponse(approved=reaction == "✅", approver="slack")

handler = CallbackApprovalHandler(slack_approval)
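
A custom callback usually needs a timeout so that an unanswered request does not block the agent forever. The following stand-alone sketch shows one way to do that with asyncio.wait_for, denying by default when nobody responds. ApprovalOutcome and approve_with_timeout are hypothetical names for this example and are not AgentHalt types.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ApprovalOutcome:
    """Hypothetical result type for this sketch (not AgentHalt's ApprovalResponse)."""

    approved: bool
    approver: str


async def approve_with_timeout(ask_human, timeout: float) -> ApprovalOutcome:
    """Await a human decision, denying by default when the timeout expires."""
    try:
        approved = await asyncio.wait_for(ask_human(), timeout=timeout)
    except asyncio.TimeoutError:
        return ApprovalOutcome(approved=False, approver="timeout")
    return ApprovalOutcome(approved=bool(approved), approver="human")


async def quick_yes() -> bool:
    # Stands in for a reviewer who answers immediately.
    return True


async def never_answers() -> bool:
    # Stands in for a request nobody looks at.
    await asyncio.sleep(3600)
    return True


async def demo() -> None:
    print(await approve_with_timeout(quick_yes, timeout=1.0))
    print(await approve_with_timeout(never_answers, timeout=0.05))


asyncio.run(demo())
```

Treating a timeout as a denial keeps the flow fail-safe: silence never turns into permission.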

Decorator API

Protect functions directly with the @guarded decorator:

from agenthalt import PolicyEngine, BudgetGuard, BudgetConfig, guarded
from agenthalt.decorators import GuardedCallBlocked

engine = PolicyEngine()
engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))

@guarded(engine, agent_id="my_agent")
def call_api(prompt: str, model: str = "gpt-4") -> str:
    return openai_client.chat(prompt, model=model)

@guarded(engine)
async def search_web(query: str) -> list[str]:
    return await web_search(query)

# Calls are automatically evaluated
try:
    result = call_api("Hello world")
except GuardedCallBlocked as e:
    print(f"Blocked: {e.result.denial_reasons}")

OpenAI Integration

import asyncio

from openai import OpenAI
from agenthalt import PolicyEngine, BudgetGuard, BudgetConfig
from agenthalt.integrations.openai_adapter import OpenAIGuardedClient

engine = PolicyEngine()
engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))

client = OpenAI()
guarded = OpenAIGuardedClient(engine=engine, agent_id="assistant")

async def main():
    # Standard OpenAI chat completion with tools
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Delete all my emails"}],
        tools=[...],
    )

    # Evaluate each tool call before executing.
    # evaluate_tool_call is awaited, so this loop must run inside a coroutine.
    for tool_call in response.choices[0].message.tool_calls or []:
        result = await guarded.evaluate_tool_call(tool_call)
        if result.is_allowed:
            output = execute_tool(tool_call)
        elif result.needs_approval:
            # Route to approval flow
            ...
        else:
            # Return denial to the model
            ...

asyncio.run(main())

Audit Logging

Every guard evaluation is logged for compliance and debugging:

from agenthalt import AuditLogger
from agenthalt.audit.logger import JsonFileSink, LoggingSink

audit = AuditLogger()
audit.add_sink(JsonFileSink("audit.jsonl"))     # JSON lines file
audit.add_sink(LoggingSink())                    # Python logging

# Attach to engine
engine.add_post_hook(audit.create_post_hook())

# Query audit history
denied = audit.query(decision="deny", limit=10)
for entry in denied:
    print(f"{entry.function_name}: {entry.final_decision}")
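
For readers unfamiliar with the JSON Lines format that a file sink like the one above would write, here is a minimal stand-alone sketch: one JSON object per line, appended on each evaluation, and filtered on read. append_entry, query and the field layout are assumptions for illustration; the actual schema written by JsonFileSink is not specified in this README.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def append_entry(path: str, function_name: str, decision: str) -> None:
    """Append one audit record as a single JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "function_name": function_name,
        "final_decision": decision,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def query(path: str, decision: str, limit: int) -> list[dict]:
    """Return up to `limit` of the most recent entries with the given decision."""
    matches = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            entry = json.loads(line)
            if entry["final_decision"] == decision:
                matches.append(entry)
    return matches[-limit:] if limit > 0 else []


log_path = os.path.join(tempfile.mkdtemp(), "audit.jsonl")
append_entry(log_path, "delete_email", "deny")
append_entry(log_path, "get_weather", "allow")
append_entry(log_path, "make_payment", "deny")
for entry in query(log_path, decision="deny", limit=10):
    print(entry["function_name"], entry["final_decision"])
```

Append-only lines are easy to tail, ship to a log pipeline, or grep during an incident.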

Custom Guards

Create your own guards by subclassing Guard:

import datetime

from agenthalt import Guard, CallContext
from agenthalt.core.decision import Decision

class BusinessHoursGuard(Guard):
    """Only allow certain actions during business hours."""

    def __init__(self):
        super().__init__(name="business_hours")

    def should_apply(self, ctx: CallContext) -> bool:
        return ctx.function_name in ("send_email", "make_payment")

    async def evaluate(self, ctx: CallContext) -> Decision:
        # Uses the server's local time: 09:00 to 16:59, Monday to Friday
        now = datetime.datetime.now()
        if 9 <= now.hour < 17 and now.weekday() < 5:
            return self.allow("Within business hours")
        return self.require_approval(
            f"Outside business hours ({now.strftime('%A %H:%M')})",
            risk_score=0.6,
        )

engine.add_guard(BusinessHoursGuard())

Real-Time Dashboard

AgentHalt includes a built-in monitoring dashboard for live demos and production monitoring:

pip install "agenthalt[dashboard]"   # quotes keep shells such as zsh from expanding the brackets
python examples/live_demo.py
# Open http://localhost:8550

Dashboard features:

  • Live event feed: watch guard evaluations stream in real time via WebSocket
  • Budget gauges: visual spend tracking with warn/danger thresholds
  • Stats overview: total evaluations, allow/deny/approval counts, average risk score
  • Guard status: active guards and their evaluation counts
  • Dark theme: professional UI built for live demos

Architecture

┌───────────────┐     ┌────────────────┐     ┌───────────────┐
│   AI Agent    │────▶│   AgentHalt    │────▶│  Tool/Action  │
│ (LLM + tools) │     │  PolicyEngine  │     │   Execution   │
└───────────────┘     └───────┬────────┘     └───────────────┘
                              │
         ┌────────────────────┴───────────┐
         │                                │
   ┌─────┴──────┐  ┌─────────────┐  ┌─────┴───────┐
   │   Guards   │  │  HIL Flow   │  │  Dashboard  │
   ├────────────┤  │  (Approval) │  │ (Real-Time) │
   │ Budget     │  └─────────────┘  └─────────────┘
   │ Purchase   │
   │ Deletion   │  ┌─────────────┐  ┌─────────────┐
   │ Rate Limit │  │ Audit Logger│  │   SQLite    │
   │ Scope      │  │ (Compliance)│  │    State    │
   │ PII/Secret │  └─────────────┘  └─────────────┘
   │ Custom     │
   └────────────┘

Key Design Principles:

  • Policy-as-code: rules are defined in Python or YAML, never in prompts
  • Zero-trust default: guard errors result in denial (fail-safe)
  • Composable: stack multiple guards; the most restrictive decision wins
  • Framework-agnostic: works with OpenAI, LangChain, CrewAI, or raw calls
  • Async-first: native async with sync wrappers
  • Concurrent evaluation: all guards run in parallel for minimal latency
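
The fail-safe and concurrent-evaluation principles fit together naturally in asyncio. The sketch below is a simplified illustration, not AgentHalt's engine: the guards are plain coroutines returning strings, and evaluate_all is a hypothetical helper. It runs every guard with asyncio.gather and treats any guard that raises as a denial.

```python
import asyncio


async def budget_guard(call: str) -> str:
    # A healthy guard that has no objection.
    return "allow"


async def broken_guard(call: str) -> str:
    # A guard that fails, for example because its state store is unreachable.
    raise RuntimeError("guard crashed")


async def evaluate_all(guards, call: str) -> list[str]:
    """Run every guard concurrently; a guard that raises counts as a denial."""
    results = await asyncio.gather(
        *(guard(call) for guard in guards),
        return_exceptions=True,  # collect errors instead of aborting the batch
    )
    return ["deny" if isinstance(r, BaseException) else r for r in results]


print(asyncio.run(evaluate_all([budget_guard, broken_guard], "send_email")))
# prints ['allow', 'deny']
```

Because gather preserves input order, each verdict can still be attributed to the guard that produced it.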

Decision Priority

When multiple guards evaluate a call, the most restrictive decision wins:

DENY > REQUIRE_APPROVAL > MODIFY > ALLOW

If any guard denies, the call is blocked, regardless of other guards allowing it.
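
The ordering above maps neatly onto an integer-valued enum, where combining verdicts is just taking the maximum. This is a sketch under stated assumptions: Verdict and combine are hypothetical names, not AgentHalt's Decision type, and returning ALLOW when no guard applies is a choice made for this example only.

```python
from enum import IntEnum


class Verdict(IntEnum):
    """Hypothetical verdicts, numbered so that a higher value is more restrictive."""

    ALLOW = 0
    MODIFY = 1
    REQUIRE_APPROVAL = 2
    DENY = 3


def combine(verdicts: list[Verdict]) -> Verdict:
    # Assumption: with no applicable guards, nothing objected, so allow.
    return max(verdicts, default=Verdict.ALLOW)


print(combine([Verdict.ALLOW, Verdict.MODIFY, Verdict.REQUIRE_APPROVAL]).name)
print(combine([Verdict.ALLOW, Verdict.DENY, Verdict.REQUIRE_APPROVAL]).name)
```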

Development

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=agenthalt

# Type checking
mypy src/agenthalt

# Linting
ruff check src/

License

Apache 2.0. See LICENSE for details.

Copyright 2025 HZYAI Pty Ltd
