# agent-risk-engine

A layered protocol and reference implementation for codifying risk in autonomous agent actions.
See PROTOCOL.md for the language-agnostic protocol specification.
## Installation

With uv:

```shell
uv add agent-risk-engine
```

With pip:

```shell
pip install agent-risk-engine
```
## Quick Start

```python
from agent_risk_engine import RuleGate, RiskEvaluator, Action, GateResult

gate = RuleGate(threshold="cautious")
evaluator = RiskEvaluator(rule_gate=gate)

# Low-risk action passes the gate (run inside an async context).
result = await evaluator.evaluate(Action(kind="tool_call", name="read_file", risk=1))
assert result.decision == GateResult.ALLOWED

# High-risk action is escalated for human approval.
result = await evaluator.evaluate(
    Action(kind="tool_call", name="execute_shell", parameters={"command": "rm -rf /"}, risk=5)
)
assert result.decision == GateResult.NEEDS_APPROVAL
```
## Architecture

Actions pass through a three-layer pipeline:

```mermaid
flowchart LR
    A["Action arrives"] --> B
    B["**RuleGate** · L1\nFast static rules\nNo LLM · Microseconds"]
    B -->|denied| Z["DENIED"]
    B -->|passes| C
    C["**ActionAnalyzer** · L2\nArgument-aware scoring\nPassthrough stub by default"]
    C -->|scored| D
    D["**ActionGate** · L3\nRisk vs. utility tradeoff\nOnly escalates, never relaxes"]
    D --> E["ALLOWED / NEEDS_APPROVAL / DENIED"]
```
L1 (`RuleGate`) and the `RiskUtilityGate` implementation of L3 are fully implemented. L2 ships as a passthrough stub; plug in your own `ActionAnalyzer`.
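The layering can be sketched in plain Python. This is a simplified standalone model of the pipeline's control flow, not the library's actual classes; all names below are illustrative:

```python
from dataclasses import dataclass
from enum import IntEnum

class Decision(IntEnum):
    # Ordered from least to most restrictive.
    ALLOWED = 0
    NEEDS_APPROVAL = 1
    DENIED = 2

@dataclass
class SimpleAction:
    kind: str
    name: str
    risk: int  # declared risk, 1 (info) .. 5 (critical)

def rule_gate(action: SimpleAction, threshold: int = 2) -> Decision:
    # L1: fast static rules, no model call.
    return Decision.ALLOWED if action.risk <= threshold else Decision.NEEDS_APPROVAL

def analyze(action: SimpleAction) -> int:
    # L2: passthrough stub, trusts the declared risk.
    return action.risk

def action_gate(l1: Decision, risk: int, utility: int) -> Decision:
    # L3: escalate when risk exceeds utility; never relax L1's decision.
    l3 = Decision.NEEDS_APPROVAL if risk > utility else Decision.ALLOWED
    return max(l1, l3)

def evaluate(action: SimpleAction, utility: int) -> Decision:
    l1 = rule_gate(action)
    if l1 is Decision.DENIED:
        return l1  # hard denial short-circuits the rest of the pipeline
    return action_gate(l1, analyze(action), utility)
```

The key property mirrored here is that later layers can only tighten earlier decisions, never loosen them.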
## Risk Levels
| Level | Label | Meaning |
|---|---|---|
| 1 | Info | Read-only, no side effects |
| 2 | Low | Reads potentially sensitive data |
| 3 | Moderate | Reversible mutations |
| 4 | High | Hard-to-reverse mutations |
| 5 | Critical | Destructive or irreversible |
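As a concrete calibration of the scale, an integrator might assign default levels to common tool names like this (a hypothetical mapping; the library does not ship these defaults):

```python
# Illustrative defaults only; actual risk assignments are up to the integrator.
DEFAULT_RISK = {
    "read_file": 1,      # Info: read-only, no side effects
    "read_secrets": 2,   # Low: reads potentially sensitive data
    "write_file": 3,     # Moderate: reversible mutation
    "send_email": 4,     # High: hard to reverse
    "execute_shell": 5,  # Critical: potentially destructive
}

def risk_for(tool_name: str, fallback: int = 3) -> int:
    # Unknown tools default to Moderate rather than Info, erring on caution.
    return DEFAULT_RISK.get(tool_name, fallback)
```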
## RuleGate

Fast and deterministic; no LLM required. Supports per-kind threshold routing:

```python
gate = RuleGate(
    threshold="cautious",
    kind_thresholds={
        "tool_call": "standard",
        "file_write": 2,
        "code_execution": 1,
    },
    denied={"delete_database"},
    allowed={"read_logs"},
)
```

Evaluation order: denied → allowed → approve → threshold comparison.
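The evaluation order above can be sketched as a standalone function. This is illustrative only; the parameter names and the `approve` set are assumptions, not the library's internals:

```python
def rule_gate_decide(name: str, risk: int, *,
                     denied: set[str], allowed: set[str],
                     approve: set[str], threshold: int) -> str:
    # 1. Explicit denylist wins over everything else.
    if name in denied:
        return "DENIED"
    # 2. Explicit allowlist short-circuits the threshold check.
    if name in allowed:
        return "ALLOWED"
    # 3. Names routed to always require human approval.
    if name in approve:
        return "NEEDS_APPROVAL"
    # 4. Otherwise, compare declared risk against the threshold.
    return "ALLOWED" if risk <= threshold else "NEEDS_APPROVAL"
```

Note that the denylist is checked first, so a denied name is refused even if it also appears in the allowlist.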
### Threshold Aliases

| Alias | Level | Description |
|---|---|---|
| `read-only` | 1 | Only info-level actions |
| `cautious` | 2 | Info + low-risk actions |
| `standard` | 3 | Up to reversible mutations |
| `full-trust` | 5 | Everything auto-allowed |
## PatternAnalyzer

Scores actions by matching regex patterns against serialized parameters. Supports kind-scoped patterns:

```python
from agent_risk_engine import PatternAnalyzer, RiskPattern

analyzer = PatternAnalyzer(extra_patterns=[
    RiskPattern(r"\bDROP\b", 5, "SQL drop", kinds=frozenset({"database_query"})),
])
```

Pass it to `RiskEvaluator(rule_gate=gate, action_analyzer=analyzer)`.
## RiskUtilityGate

Weighs risk against caller-provided utility. Utility is an input, not computed internally: the library evaluates risk; your framework understands agent goals.

```python
from agent_risk_engine import Action, RiskEvaluator, RiskUtilityGate, RuleGate, UtilityScore

evaluator = RiskEvaluator(
    rule_gate=RuleGate(threshold="standard"),
    action_gate=RiskUtilityGate(),
)
result = await evaluator.evaluate(
    Action(kind="tool_call", name="write_file", risk=3),
    utility=UtilityScore(level=4, reasoning="User explicitly requested"),
)
```
The gate only escalates, never relaxes — it cannot make a decision less restrictive than L1.
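Escalation-only merging can be modeled by ordering decisions from least to most restrictive and taking the maximum (a sketch; the specific ordering is an assumption, not the library's code):

```python
from enum import IntEnum

class Decision(IntEnum):
    # Least to most restrictive.
    ALLOWED = 0
    NEEDS_APPROVAL = 1
    DENIED = 2

def merge(l1: Decision, l3: Decision) -> Decision:
    # L3 may tighten L1's decision but can never loosen it.
    return max(l1, l3)
```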
## Extending with Custom Analyzers

`ActionAnalyzer` is a `Protocol`. Implement `analyze(action: Action) -> RiskScore`:

```python
from agent_risk_engine import RiskEvaluator, RuleGate, RiskScore, Action

class LLMAnalyzer:
    """Use an LLM to evaluate the actual risk of action arguments."""

    async def analyze(self, action: Action) -> RiskScore:
        # Inspect action.parameters, reason about consequences
        assessed_level = await my_llm_judge(action.name, action.parameters)
        return RiskScore(level=assessed_level, reasoning="LLM analysis")

evaluator = RiskEvaluator(
    rule_gate=RuleGate(threshold="cautious"),
    action_analyzer=LLMAnalyzer(),
)
```
## CallTracker

Standalone loop and repetition detection. Not a pipeline layer; use it to build context before evaluating:

```python
from agent_risk_engine import CallTracker

tracker = CallTracker()
tracker.record(action.name)
context = tracker.check()

# Merge into action metadata before evaluating
action = Action(kind=action.kind, name=action.name, risk=action.risk, metadata=context)
```

`check()` returns `{"healthy": bool, "warnings": list[str]}`.
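A minimal repetition detector in the same spirit (a standalone sketch; the window size, threshold, and warning text are made up, not the library's):

```python
from collections import Counter, deque

class SimpleTracker:
    """Warn when one call name repeats too often in a sliding window."""

    def __init__(self, window: int = 10, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self.calls = deque(maxlen=window)  # oldest calls fall off automatically

    def record(self, name: str) -> None:
        self.calls.append(name)

    def check(self) -> dict:
        counts = Counter(self.calls)
        warnings = [
            f"'{name}' called {n} times in the last {len(self.calls)} calls"
            for name, n in counts.items() if n > self.max_repeats
        ]
        return {"healthy": not warnings, "warnings": warnings}
```

Feeding the resulting dict into `Action.metadata`, as shown above, lets a downstream analyzer treat suspicious repetition as added risk.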
## Framework Integration

Write a thin adapter that maps your framework's action primitives to `Action`:

```python
async def before_action_hook(action_name, args, action_risk):
    action = Action(kind="tool_call", name=action_name, parameters=args, risk=action_risk)
    result = await evaluator.evaluate(action)
    if result.decision == GateResult.DENIED:
        raise PermissionError(result.reasoning)
    if result.decision == GateResult.NEEDS_APPROVAL:
        approved = await prompt_user(f"Allow '{action_name}'? Risk: {result.risk_score.level}/5")
        if not approved:
            raise PermissionError("User denied")
```
## License
MIT