Skip to main content

Production-grade guardrails for AI agent function calls. Prevent overspending, unauthorized purchases, unsafe deletions, and more โ€” with Human-in-the-Loop approval.

Project description

๐Ÿ›ก๏ธ AgentHalt

Production-grade guardrails for AI agent function calls.

Prevent overspending, unauthorized purchases, unsafe deletions, and more โ€” with Human-in-the-Loop (HIL) approval that lives outside the prompt.

Python 3.10+ License: Apache 2.0 CI Dashboard

Built by HZYAI โ€” the team behind RAGScore (12.5K+ downloads). NVIDIA Inception ยท AWS Startup.


Why AgentHalt?

AI agents are increasingly making real-world function calls โ€” sending emails, making purchases, deleting documents, calling APIs. But what happens when an agent goes rogue?

Real incidents that AgentHalt prevents:

  • ๐Ÿ—‘๏ธ Agent auto-deleting emails without permission
  • ๐Ÿ’ธ Runaway API calls burning through $1000s in minutes
  • ๐Ÿ›’ Agent making unauthorized purchases
  • ๐Ÿ”„ Infinite loops calling the same tool repeatedly
  • ๐Ÿ”‘ Leaking API keys or PII through function arguments

AgentHalt is different from prompt-based guardrails:

  • Policies are defined in code or YAML โ€” not in the system prompt
  • Cannot be jailbroken or prompt-injected away
  • Works as middleware between the agent and tool execution
  • Provides a proper Human-in-the-Loop approval flow

Quick Start

Installation

pip install agenthalt

30-Second Example

import asyncio
from agenthalt import (
    PolicyEngine, CallContext,
    BudgetGuard, BudgetConfig,
    DeletionGuard, DeletionConfig,
    PurchaseGuard, PurchaseConfig,
)

async def main():
    # Create the engine and add guards
    engine = PolicyEngine()
    engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))
    engine.add_guard(DeletionGuard(DeletionConfig(
        allow_patterns=["temp_*", "draft_*"],
        protected_resources=["inbox", "sent"],
    )))
    engine.add_guard(PurchaseGuard(PurchaseConfig(
        max_single_purchase=100.0,
        require_approval_above=50.0,
    )))

    # Evaluate a function call before executing it
    result = await engine.evaluate(CallContext(
        function_name="delete_email",
        arguments={"email_id": "inbox"},
    ))

    if result.is_allowed:
        execute_function(...)
    elif result.needs_approval:
        # Route to human approval
        ...
    else:
        print(f"Blocked: {result.denial_reasons}")

asyncio.run(main())

Built-in Guards

๐Ÿ’ฐ Budget Guard

Prevent overspending on API calls and external services.

from agenthalt import BudgetGuard, BudgetConfig

guard = BudgetGuard(BudgetConfig(
    max_call_cost=1.0,          # Max cost per individual call
    max_session_spend=5.0,      # Max spend per session
    max_daily_spend=50.0,       # Max spend per day
    max_monthly_spend=500.0,    # Max spend per month
    warn_threshold=0.8,         # Require approval at 80% of limit
    cost_estimator={            # Known costs per function
        "gpt4_call": 0.03,
        "image_generation": 0.04,
    },
))

๐Ÿ›’ Purchase Guard

Prevent unauthorized or excessive purchases.

from agenthalt import PurchaseGuard, PurchaseConfig

guard = PurchaseGuard(PurchaseConfig(
    max_single_purchase=100.0,        # Max per transaction
    max_daily_purchases=500.0,        # Max daily total
    max_purchase_count_per_day=10,    # Max transactions per day
    require_approval_above=50.0,      # HIL above this amount
    blocked_categories=["luxury", "gambling"],
))

๐Ÿ—‘๏ธ Deletion Guard

Restrict document and resource deletion to preset guidelines.

from agenthalt import DeletionGuard, DeletionConfig

guard = DeletionGuard(DeletionConfig(
    allow_patterns=["temp_*", "draft_*", "cache_*"],
    deny_patterns=["*_production", "*_backup"],
    protected_resources=["inbox", "sent", "important"],
    require_approval_always=True,
    max_bulk_delete=5,
    max_deletions_per_day=20,
    soft_delete_only=True,
    cooldown_seconds=5.0,
))

โฑ๏ธ Rate Limit Guard

Prevent runaway agent loops and excessive function calls.

from agenthalt import RateLimitGuard, RateLimitConfig

guard = RateLimitGuard(RateLimitConfig(
    max_calls_per_minute=30,
    max_calls_per_minute_per_function=10,
    max_calls_per_session=200,
    max_identical_calls=3,       # Detect stuck loops
    burst_threshold=15,          # Calls in burst window
    burst_window_seconds=5.0,
    cooldown_seconds=30.0,       # Cooldown after burst
))

๐Ÿ” Sensitive Data Guard

Block actions involving PII, credentials, or sensitive information.

from agenthalt import SensitiveDataGuard, SensitiveDataConfig

guard = SensitiveDataGuard(SensitiveDataConfig(
    blocked_patterns=["ssn", "credit_card", "api_key", "aws_key", "jwt"],
    sensitive_fields=["password", "secret", "token"],
    custom_patterns={"employee_id": r"EMP-\d{6}"},
    redact_on_modify=True,       # Redact instead of deny
))

๐ŸŽฏ Scope Guard

Restrict which tools/functions an agent is allowed to call.

from agenthalt import ScopeGuard, ScopeConfig

# Whitelist mode
guard = ScopeGuard(ScopeConfig(
    allow_functions=["get_*", "list_*", "search_*"],
))

# Blacklist mode with per-agent overrides
guard = ScopeGuard(ScopeConfig(
    deny_functions=["drop_*", "format_*", "shutdown_*"],
    require_approval_functions=["send_email", "post_*"],
    deny_by_agent={"untrusted_agent": ["send_*", "delete_*"]},
    read_only_mode=False,
))

YAML Configuration

Define all guards in a single YAML file:

# agenthalt.yaml
guards:
  budget:
    max_daily_spend: 10.0
    warn_threshold: 0.8
    cost_estimator:
      gpt4_call: 0.03
      web_search: 0.01

  deletion:
    allow_patterns: ["temp_*", "draft_*"]
    protected_resources: ["inbox", "sent"]
    require_approval_always: true

  purchase:
    max_single_purchase: 100.0
    require_approval_above: 50.0
    blocked_categories: ["luxury", "gambling"]

  rate_limit:
    max_calls_per_minute: 30
    max_identical_calls: 3

  scope:
    deny_functions: ["drop_*", "format_*"]

  sensitive_data:
    blocked_patterns: ["ssn", "credit_card", "api_key"]
from agenthalt.config import load_config

engine = load_config("agenthalt.yaml")

Human-in-the-Loop (HIL) Approval

AgentHalt provides pluggable approval handlers:

from agenthalt.hil.approval import (
    ConsoleApprovalHandler,   # Interactive CLI prompt
    CallbackApprovalHandler,  # Custom callback (Slack, webhooks, etc.)
    AutoDenyHandler,          # Auto-deny for CI/testing
)

# Console approval (development)
handler = ConsoleApprovalHandler(timeout=300.0)

# Custom approval flow (production)
async def slack_approval(request):
    # Send to Slack, wait for response
    channel_msg = await slack.post(f"Approve {request.call_context.function_name}?")
    reaction = await slack.wait_for_reaction(channel_msg, timeout=600)
    return ApprovalResponse(approved=reaction == "โœ…", approver="slack")

handler = CallbackApprovalHandler(slack_approval)

Decorator API

Protect functions directly with the @guarded decorator:

from agenthalt import PolicyEngine, BudgetGuard, BudgetConfig, guarded
from agenthalt.decorators import GuardedCallBlocked

engine = PolicyEngine()
engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))

@guarded(engine, agent_id="my_agent")
def call_api(prompt: str, model: str = "gpt-4") -> str:
    return openai_client.chat(prompt, model=model)

@guarded(engine)
async def search_web(query: str) -> list[str]:
    return await web_search(query)

# Calls are automatically evaluated
try:
    result = call_api("Hello world")
except GuardedCallBlocked as e:
    print(f"Blocked: {e.result.denial_reasons}")

OpenAI Integration

from openai import OpenAI
from agenthalt import PolicyEngine, BudgetGuard, BudgetConfig
from agenthalt.integrations.openai_adapter import OpenAIGuardedClient

engine = PolicyEngine()
engine.add_guard(BudgetGuard(BudgetConfig(max_daily_spend=10.0)))

client = OpenAI()
guarded = OpenAIGuardedClient(engine=engine, agent_id="assistant")

# Standard OpenAI chat completion with tools
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Delete all my emails"}],
    tools=[...],
)

# Evaluate each tool call before executing
for tool_call in response.choices[0].message.tool_calls or []:
    result = await guarded.evaluate_tool_call(tool_call)
    if result.is_allowed:
        output = execute_tool(tool_call)
    elif result.needs_approval:
        # Route to approval flow
        ...
    else:
        # Return denial to the model
        ...

Audit Logging

Every guard evaluation is logged for compliance and debugging:

from agenthalt import AuditLogger
from agenthalt.audit.logger import JsonFileSink, LoggingSink

audit = AuditLogger()
audit.add_sink(JsonFileSink("audit.jsonl"))     # JSON lines file
audit.add_sink(LoggingSink())                    # Python logging

# Attach to engine
engine.add_post_hook(audit.create_post_hook())

# Query audit history
denied = audit.query(decision="deny", limit=10)
for entry in denied:
    print(f"{entry.function_name}: {entry.final_decision}")

Custom Guards

Create your own guards by subclassing Guard:

from agenthalt import Guard, CallContext
from agenthalt.core.decision import Decision

class BusinessHoursGuard(Guard):
    """Only allow certain actions during business hours."""

    def __init__(self):
        super().__init__(name="business_hours")

    def should_apply(self, ctx: CallContext) -> bool:
        return ctx.function_name in ("send_email", "make_payment")

    async def evaluate(self, ctx: CallContext) -> Decision:
        import datetime
        now = datetime.datetime.now()
        if 9 <= now.hour < 17 and now.weekday() < 5:
            return self.allow("Within business hours")
        return self.require_approval(
            f"Outside business hours ({now.strftime('%A %H:%M')})",
            risk_score=0.6,
        )

engine.add_guard(BusinessHoursGuard())

Real-Time Dashboard

AgentHalt includes a built-in monitoring dashboard for live demos and production monitoring:

pip install agenthalt[dashboard]
python examples/live_demo.py
# Open http://localhost:8550

Dashboard features:

  • Live event feed โ€” Watch guard evaluations stream in real-time via WebSocket
  • Budget gauges โ€” Visual spend tracking with warn/danger thresholds
  • Stats overview โ€” Total evaluations, allow/deny/approval counts, avg risk score
  • Guard status โ€” Active guards and their evaluation counts
  • Dark theme โ€” Professional UI built for live demos

Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”     โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   AI Agent   โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚   AgentHalt    โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚  Tool/Action  โ”‚
โ”‚  (LLM + tools)โ”‚    โ”‚  PolicyEngine  โ”‚     โ”‚  Execution    โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜     โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                            โ”‚
         โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
         โ”‚                                  โ”‚
   โ”Œโ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”
   โ”‚   Guards   โ”‚  โ”‚  HIL Flow   โ”‚  โ”‚  Dashboard  โ”‚
   โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค  โ”‚  (Approval) โ”‚  โ”‚ (Real-Time) โ”‚
   โ”‚ Budget     โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   โ”‚ Purchase   โ”‚
   โ”‚ Deletion   โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
   โ”‚ Rate Limit โ”‚  โ”‚ Audit Loggerโ”‚  โ”‚   SQLite   โ”‚
   โ”‚ Scope      โ”‚  โ”‚ (Compliance)โ”‚  โ”‚   State    โ”‚
   โ”‚ PII/Secret โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   โ”‚ Custom     โ”‚
   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Key Design Principles:

  • Policy-as-code โ€” Rules defined in Python or YAML, never in prompts
  • Zero-trust default โ€” Guard errors result in denial (fail-safe)
  • Composable โ€” Stack multiple guards; most restrictive decision wins
  • Framework-agnostic โ€” Works with OpenAI, LangChain, CrewAI, or raw calls
  • Async-first โ€” Native async with sync wrappers
  • Concurrent evaluation โ€” All guards run in parallel for minimal latency

Decision Priority

When multiple guards evaluate a call, the most restrictive decision wins:

DENY > REQUIRE_APPROVAL > MODIFY > ALLOW

If any guard denies, the call is blocked โ€” regardless of other guards allowing it.

Development

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=agenthalt

# Type checking
mypy src/agenthalt

# Linting
ruff check src/

License

Apache 2.0 โ€” see LICENSE for details.

Copyright 2025 HZYAI Pty Ltd

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

agenthalt-0.1.2.tar.gz (49.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

agenthalt-0.1.2-py3-none-any.whl (52.8 kB view details)

Uploaded Python 3

File details

Details for the file agenthalt-0.1.2.tar.gz.

File metadata

  • Download URL: agenthalt-0.1.2.tar.gz
  • Upload date:
  • Size: 49.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agenthalt-0.1.2.tar.gz
Algorithm Hash digest
SHA256 46cfbb77f422c6a8ab73b6aaa3dd62cb48fb3537e8e47148ca9907b0c2f860e3
MD5 ba3772ba7de6fdb68cbfc1181c2e0180
BLAKE2b-256 e4ec8381e34fcefa0f3a37b48b5565e24367e6da475591fbefaa62aae8916851

See more details on using hashes here.

Provenance

The following attestation bundles were made for agenthalt-0.1.2.tar.gz:

Publisher: publish.yml on HZYAI/agenthalt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file agenthalt-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: agenthalt-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 52.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agenthalt-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 0f967e1e97b531beaa2e0d29e544cba110b0763ec029510055038b90b914c5c7
MD5 1d5b97aa4063fb958b9e32e61891d3e1
BLAKE2b-256 da48786c838f8bbb9615239e151ec172f206cc0e719e86dddad04574b8cf2015

See more details on using hashes here.

Provenance

The following attestation bundles were made for agenthalt-0.1.2-py3-none-any.whl:

Publisher: publish.yml on HZYAI/agenthalt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page