
At-most-once side-effect enforcement + circuit breaker for agent tool calls (loops, retries, duplicate side effects).

Project description

Aura Guard

Reliability middleware for tool-using AI agents. Prevents tool loops, duplicate side-effects, and retry storms.

from aura_guard import AgentGuard, PolicyAction

guard = AgentGuard(
    side_effect_tools={"refund", "cancel"},
    max_calls_per_tool=3,
    max_cost_per_run=1.00,
)

decision = guard.check_tool("search_kb", args={"query": "refund policy"})

if decision.action == PolicyAction.ALLOW:
    result = execute_tool(...)
    guard.record_result(ok=True, payload=result)
elif decision.action == PolicyAction.CACHE:
    result = decision.cached_result.payload
elif decision.action == PolicyAction.REWRITE:
    inject_into_prompt(decision.injected_system)
else:
    stop_agent(decision.reason)

Aura Guard sits between your agent and its tools. Before each tool call, it returns a deterministic decision: ALLOW, CACHE, BLOCK, REWRITE, ESCALATE, or FINALIZE. No LLM calls, no network requests, sub-millisecond overhead.

Python 3.10+ · Zero dependencies · Apache-2.0


Install

Option A (recommended):

pip install aura-guard

Try the built-in demo: aura-guard demo

Option B (from source / dev): install from a cloned repo

git clone https://github.com/auraguardhq/aura-guard.git
cd aura-guard
pip install -e .

Optional: LangChain adapter

pip install langchain-core

Benchmarks (synthetic)

aura-guard bench --all

These simulate common agent failure modes (tool loops, retry storms, duplicate side-effects). Costs are estimated — the important signal is the relative difference. See docs/EVALUATION_PLAN.md for real-model evaluation.

Real-model evaluation

Tested with Claude Sonnet 4 (claude-sonnet-4-20250514), 5 scenarios × 5 runs per variant, real LLM tool-use calls with rigged tool implementations.

Scenario                 No Guard    With Guard   Result
A: Jitter Loop           $0.2778     $0.1446      48% saved
B: Double Refund         $0.1397     $0.1456      Prevented duplicate refund at +$0.006 overhead
C: Error Retry Spiral    $0.1345     $0.0953      29% saved
D: Smart Reformulation   $0.8607     $0.1465      83% saved
E: Flagship              $0.3494     $0.1446      59% saved

All costs are p50 (median) across 5 runs. Scenario B costs slightly more because the guard adds an intervention turn but prevents the duplicate side-effect (the refund only executes once). In Scenario B guard runs, 2 of 5 completed in fewer turns ($0.10), while 3 of 5 required the extra intervention turn ($0.145).

64 guard interventions across 25 runs. 0 false positives (expected — see caveat below). Task completion maintained or improved in all scenarios.

Full results, per-run data, and screenshots: docs/LIVE_AB_EXAMPLE.md | JSON report


The problem

Agent run without guard:

  1. search_kb("refund policy") → 3 results
  2. search_kb("refund policy EU") → 2 results
  3. search_kb("refund policy EU Germany") → 2 results
  4. search_kb("refund policy EU Germany 2024") → 1 result
  5. search_kb("refund policy EU returns") → 2 results
  6. refund(order="ORD-123", amount=50) → success
  7. refund(order="ORD-123", amount=50) → success (DUPLICATE!)
  8. search_kb("refund confirmation") → 1 result
  ...

14 tool calls, $0.56, customer refunded twice

With Aura Guard

Agent run with guard:

  1. search_kb("refund policy") → ALLOW → 3 results
  2. search_kb("refund policy EU") → ALLOW → 2 results
  3. search_kb("refund policy EU Germany") → ALLOW → 2 results
  4. search_kb("refund policy EU Germany 2024") → REWRITE (jitter loop detected)
  5. refund(order="ORD-123", amount=50) → ALLOW → success
  6. refund(order="ORD-123", amount=50) → CACHE (idempotent replay)
  ...

4 tool calls executed, $0.16, one refund

max_steps doesn't distinguish productive calls from loops. Retry libraries don't prevent duplicate side-effects. Idempotency keys protect writes but don't stop search spirals or stalled outputs. Aura Guard handles all of these with a single middleware layer.
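The idempotent-replay behavior shown in step 6 can be sketched in a few lines: key each side-effect call by a signature of its tool name and canonicalized arguments, and replay the stored result on an identical repeat. This is an illustrative pattern only, not Aura Guard's actual implementation (names like `SideEffectCache` are invented for the sketch):

```python
import hashlib
import json

class SideEffectCache:
    """Replay cached results for repeated side-effect calls (illustrative)."""

    def __init__(self):
        self._results = {}

    def signature(self, tool, args):
        # Canonical JSON so argument order does not change the key.
        canonical = json.dumps({"tool": tool, "args": args}, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def execute_once(self, tool, args, fn):
        key = self.signature(tool, args)
        if key in self._results:
            return self._results[key], True   # replayed; fn NOT called again
        result = fn()
        self._results[key] = result
        return result, False

cache = SideEffectCache()
calls = []

def issue_refund():
    calls.append("refund")
    return "ok"

r1, replayed1 = cache.execute_once("refund", {"order": "ORD-123", "amount": 50}, issue_refund)
r2, replayed2 = cache.execute_once("refund", {"order": "ORD-123", "amount": 50}, issue_refund)
# The refund runs once; the second identical call replays the cached result.
```

Hashing a canonical serialization (rather than the raw dict) is what makes `{"a": 1, "b": 2}` and `{"b": 2, "a": 1}` dedupe to the same call.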


Integration

Aura Guard does not call your LLM and does not execute tools.
You keep your agent loop. You just add 3 hook calls:

  1. check_tool(...) before you execute a tool
  2. record_result(...) after the tool finishes (success or error)
  3. check_output(...) after the model produces text (optional but recommended)

Minimal example

from aura_guard import AgentGuard, PolicyAction

guard = AgentGuard(
    max_calls_per_tool=3,                 # stop "search forever"
    side_effect_tools={"refund", "cancel"},
    max_cost_per_run=1.00,                # optional budget (USD)
    tool_costs={"search_kb": 0.03},        # optional; improves cost reporting
)

def run_tool(tool_name: str, args: dict):
    decision = guard.check_tool(tool_name, args=args, ticket_id="ticket-123")

    if decision.action == PolicyAction.ALLOW:
        try:
            result = execute_tool(tool_name, args)  # <-- your tool function
            guard.record_result(ok=True, payload=result)
            return result
        except Exception as e:
            # classify errors however you want ("429", "timeout", "5xx", ...)
            guard.record_result(ok=False, error_code=type(e).__name__)
            raise

    if decision.action == PolicyAction.CACHE:
        # Aura Guard tells you "reuse the previous result"
        return decision.cached_result.payload if decision.cached_result else None

    if decision.action == PolicyAction.REWRITE:
        # You should inject decision.injected_system into your next prompt
        # and re-run the model.
        raise RuntimeError(f"Rewrite requested: {decision.reason}")

    # BLOCK / ESCALATE / FINALIZE
    raise RuntimeError(f"Stopped: {decision.action.value}: {decision.reason}")

Framework-specific adapters for OpenAI and LangChain are included. See examples/ for integration patterns.

Framework examples (Anthropic, OpenAI, LangChain)

Anthropic (Claude)

import anthropic
from aura_guard import AgentGuard, PolicyAction

client = anthropic.Anthropic()
guard = AgentGuard(max_cost_per_run=1.00, side_effect_tools={"refund", "send_email"})

# In your agent loop, after the model returns tool_use blocks:
for block in response.content:
    if block.type == "tool_use":
        decision = guard.check_tool(block.name, args=block.input)

        if decision.action == PolicyAction.ALLOW:
            result = execute_tool(block.name, block.input)
            guard.record_result(ok=True, payload=result)
        elif decision.action == PolicyAction.CACHE:
            result = decision.cached_result.payload  # reuse previous result
        else:
            # BLOCK / REWRITE / ESCALATE — handle accordingly
            break

# After each assistant text response:
guard.check_output(assistant_text)

# Track real token spend:
guard.record_tokens(
    input_tokens=response.usage.input_tokens,
    output_tokens=response.usage.output_tokens,
)

OpenAI

from aura_guard import AgentGuard, PolicyAction
from aura_guard.adapters.openai_adapter import (
    extract_tool_calls_from_chat_completion,
    inject_system_message,
)

guard = AgentGuard(max_cost_per_run=1.00)

# After each OpenAI response:
tool_calls = extract_tool_calls_from_chat_completion(response)
for call in tool_calls:
    decision = guard.check_tool(call.name, args=call.args)

    if decision.action == PolicyAction.ALLOW:
        result = execute_tool(call.name, call.args)
        guard.record_result(ok=True, payload=result)
    elif decision.action == PolicyAction.REWRITE:
        messages = inject_system_message(messages, decision.injected_system)
        # Re-call the model with updated messages

LangChain

from aura_guard.adapters.langchain_adapter import AuraCallbackHandler

handler = AuraCallbackHandler(
    max_cost_per_run=1.00,
    side_effect_tools={"refund", "send_email"},
)

# Pass as a callback — Aura Guard intercepts tool calls automatically:
agent = initialize_agent(tools=tools, llm=llm, callbacks=[handler])
agent.run("Process refund for order ORD-123")

# After the run:
print(handler.summary)
# {"cost_spent_usd": 0.12, "cost_saved_usd": 0.40, "blocks": 3, ...}

Recommended: record real token usage (more accurate costs)

After each LLM call, report usage:

guard.record_tokens(
    input_tokens=resp.usage.input_tokens,
    output_tokens=resp.usage.output_tokens,
)
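The arithmetic behind token-based cost estimates is just tokens times a per-token rate. A minimal sketch, assuming placeholder prices (the rates below are invented for illustration, not real model pricing):

```python
# Placeholder per-million-token rates; substitute your model's actual pricing.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate run cost in USD from reported token usage."""
    return (input_tokens * PRICE_PER_MTOK["input"]
            + output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000

cost = estimate_cost_usd(input_tokens=12_000, output_tokens=800)
# 12_000 * 3.00 + 800 * 15.00 = 48_000; / 1e6 = 0.048 USD
```

Reporting real usage this way is more trustworthy than static per-call estimates, which is why the benchmark tables above quote measured costs.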

Configuration (the knobs that matter)

Most teams start here:

  • Mark side-effect tools
    e.g. {"refund", "cancel", "send_email"}

  • Cap expensive tools
    e.g. max_calls_per_tool=3 for search/retrieval

  • Set a max budget per run
    e.g. max_cost_per_run=1.00

  • Tell Aura Guard your tool costs
    so reports are meaningful

For advanced options, see AuraGuardConfig in src/aura_guard/config.py.

Example: per-tool policies (deny / human approval)

from aura_guard import AgentGuard, AuraGuardConfig, ToolPolicy, ToolAccess

guard = AgentGuard(
    config=AuraGuardConfig(
        tool_policies={
            "delete_account": ToolPolicy(access=ToolAccess.DENY, deny_reason="Too risky"),
            "large_refund": ToolPolicy(access=ToolAccess.HUMAN_APPROVAL, risk="high"),
            "search_kb": ToolPolicy(max_calls=5),
        },
    ),
)

Telemetry & persistence (optional)

Telemetry

Aura Guard can emit structured events (counts + signatures, not raw args/payloads).
See src/aura_guard/telemetry.py.

Persist state (optional)

You can serialize guard state to JSON and store it in Redis / Postgres / etc.

from aura_guard.serialization import state_to_json, state_from_json

json_str = state_to_json(state)    # `state` is the guard's state object
state = state_from_json(json_str)  # restore it later, e.g. after a restart

Status & limitations

Aura Guard is v0.3 — the API is stabilizing but may change before v1.0.

Stable: The 3-method API (check_tool / record_result / check_output), the 6 PolicyAction values, and AuraGuardConfig.

May change: Default threshold values, serialization format (versioned — old state will error, not silently corrupt), telemetry event names.

Limitations:

  • In-memory state only. Not thread-safe. Create one guard per agent run.
  • Side-effect enforcement is at-most-once within a single process. Not exactly-once across restarts.
  • Argument jitter detection uses token overlap, not semantic similarity. English-biased.
  • Cost estimates are configurable approximations, not actual billing data.
  • Guard state stores HMAC signatures only — no raw args or payloads in state or telemetry.
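The token-overlap heuristic mentioned in the jitter caveat can be approximated with Jaccard similarity over whitespace tokens. The threshold and three-query window below are invented for illustration; they are not Aura Guard's actual values or implementation:

```python
def token_jaccard(a, b):
    """Jaccard similarity over lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def looks_like_jitter(queries, threshold=0.6):
    """Flag a run of near-duplicate queries (illustrative heuristic)."""
    if len(queries) < 3:
        return False
    recent = queries[-3:]
    pairs = [(recent[i], recent[j]) for i in range(3) for j in range(i + 1, 3)]
    return all(token_jaccard(a, b) >= threshold for a, b in pairs)

queries = [
    "refund policy",
    "refund policy EU",
    "refund policy EU Germany",
    "refund policy EU Germany 2024",
]
# The last three queries overlap heavily, so this run would be flagged.
```

This also makes the English bias concrete: whitespace tokenization assumes space-delimited words, so languages without word spacing would need a different tokenizer.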

For architecture details, see docs/ARCHITECTURE.md.


Docs

  • docs/ARCHITECTURE.md — how the engine is structured
  • docs/EVALUATION_PLAN.md — how to evaluate credibly
  • docs/LIVE_AB_EXAMPLE.md — live A/B walkthrough and sample output
  • docs/RESULTS.md — how to publish results (recommended format)

Contributing

See CONTRIBUTING.md.


License

Apache-2.0



Download files

Download the file for your platform.

Source Distribution

aura_guard-0.3.3.tar.gz (53.2 kB view details)

Uploaded Source

Built Distribution


aura_guard-0.3.3-py3-none-any.whl (51.9 kB view details)

Uploaded Python 3

File details

Details for the file aura_guard-0.3.3.tar.gz.

File metadata

  • Download URL: aura_guard-0.3.3.tar.gz
  • Upload date:
  • Size: 53.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aura_guard-0.3.3.tar.gz

  • SHA256: 2d8fe2b02ba01346584b33b98b2b07065ee2fb2727e0e0030fa7107c430f330d
  • MD5: fe27d04cbb4b35dbb3db2a4ed2e5136b
  • BLAKE2b-256: 7444915072a66a97980e842b06665f2c2958653334af1844c13d26300a8df16b


File details

Details for the file aura_guard-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: aura_guard-0.3.3-py3-none-any.whl
  • Upload date:
  • Size: 51.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aura_guard-0.3.3-py3-none-any.whl

  • SHA256: 810bba53be18f07f0513521444ae9d9de0d67fc7abea4a242d77555954e04636
  • MD5: 95b4c163054714a53079a8925e5e09d6
  • BLAKE2b-256: 697ceb62ce83e7994eff92390334761a99e50b726df57ac7c55ccc02dfc3e44e

