# confidence-escalation

Framework-agnostic confidence-gated escalation middleware for LLM agents. Multi-signal confidence scoring (logprob + verbalized + ASR + tool risk) with threshold-based escalation policies and pluggable handlers. Works with LangChain, LangGraph, CrewAI, AutoGen, Google ADK, and any Python agent framework.
Addresses OWASP Agentic AI Top 10 ASI-09: Human-Agent Trust Exploitation — prevents agents from taking high-stakes actions when confidence is insufficient.
## The Problem

LLM agents fail silently. When an agent is uncertain, it still returns a response, often confidently worded, with no mechanism to:

- Detect that confidence is low before executing a high-risk tool call
- Route uncertain responses to a human reviewer
- Escalate to a stronger model when needed
- Produce a compliance audit trail of every escalation event

`confidence-escalation` solves all four.
## Features
- Multi-signal scoring — combine logprobs, verbalized confidence, and tool-call risk into a single composite score
- Threshold policies — single-threshold, dual-threshold (normal + critical), composite multi-policy chains
- Pluggable handlers — human-in-loop, model upgrade, tool restriction, compliance logging
- Framework adapters — LangChain callbacks, CrewAI step_callback, AutoGen reply function wrapper, Google ADK event interceptor
- EU AI Act Article 12 audit logging — structured JSON compliance log on every escalation
- Zero required dependencies — core library runs with no dependencies; framework integrations are optional extras
## Quick Start

### Installation

```bash
pip install confidence-escalation

# With LangChain:
pip install "confidence-escalation[langchain]"

# With all frameworks:
pip install "confidence-escalation[all]"
```
### Basic Scoring

```python
from confidence_escalation import MultiSignalConfidenceScorer

scorer = MultiSignalConfidenceScorer(
    weights={"logprob": 0.5, "verbalized": 0.3, "tool_risk": -0.2}
)

score = scorer.score(
    logprobs=[-0.1, -0.3, -0.2],
    verbalized_response="I am 70% confident about this answer.",
    tool_call_risk=0.15,
)

print(f"Confidence: {score.value:.3f}")    # e.g. 0.712
print(f"Reliable: {score.is_reliable()}")  # True (above the 0.6 default)
```
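In effect, the composite value is a weighted combination of the individual signals. The library's exact normalization is internal; this standalone sketch (with an assumed exp-of-mean-logprob mapping and positive-weight normalization, both illustrative choices) shows the general shape of the computation:

```python
import math

def composite_confidence(logprobs, verbalized_conf, tool_risk, weights):
    """Combine per-token logprobs, a verbalized confidence, and tool-call
    risk into one score in [0, 1] via a weighted sum (illustrative only)."""
    # Mean token log-probability mapped back to a probability in (0, 1].
    logprob_signal = math.exp(sum(logprobs) / len(logprobs))
    raw = (
        weights["logprob"] * logprob_signal
        + weights["verbalized"] * verbalized_conf
        + weights["tool_risk"] * tool_risk  # negative weight penalizes risk
    )
    # Normalize by the positive weight mass and clamp to [0, 1].
    positive_mass = weights["logprob"] + weights["verbalized"]
    return max(0.0, min(1.0, raw / positive_mass))

score = composite_confidence(
    logprobs=[-0.1, -0.3, -0.2],
    verbalized_conf=0.70,  # parsed from "I am 70% confident..."
    tool_risk=0.15,
    weights={"logprob": 0.5, "verbalized": 0.3, "tool_risk": -0.2},
)
```

Note how the negative `tool_risk` weight pulls the score down as the pending tool call gets riskier, which is what lets a single threshold gate both uncertainty and action risk.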
### Threshold Policy + Human-in-Loop

```python
from confidence_escalation import (
    ThresholdPolicy,
    EscalationAction,
    HumanInLoopHandler,
    ComplianceLoggingHandler,
    ConfidenceEscalationMiddleware,
)

def notify_human(ctx, result):
    print(
        f"Routing to human review: session={ctx['session_id']}, "
        f"confidence={result.confidence_score:.3f}"
    )

policy = ThresholdPolicy(
    threshold=0.65,
    action=EscalationAction.HUMAN_IN_LOOP,
    critical_threshold=0.3,
    critical_action=EscalationAction.ABORT,
)

middleware = ConfidenceEscalationMiddleware(
    policy=policy,
    handlers=[
        HumanInLoopHandler(callback=notify_human),
        ComplianceLoggingHandler(),
    ],
)

result = middleware.call(
    agent_step=lambda: my_llm.invoke(messages),
    context={"session_id": "abc123", "model": "claude-sonnet-4-6"},
    logprobs=[-0.4, -0.5],
)

if result["escalation"]["triggered"]:
    print("Escalated — stopping agent execution.")
```
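The dual-threshold logic is easy to reason about in isolation: scores below the critical band abort outright, scores below the normal threshold go to a human, and everything else proceeds. A self-contained sketch, using a hypothetical `Decision` dataclass and `Action` enum rather than the library's own result types:

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):  # stand-in for EscalationAction
    NONE = "none"
    HUMAN_IN_LOOP = "human_in_loop"
    ABORT = "abort"

@dataclass
class Decision:
    triggered: bool
    action: Action

def evaluate(score: float, threshold: float = 0.65, critical: float = 0.3) -> Decision:
    """Dual-threshold check: below the critical band, abort outright;
    below the normal threshold, route to a human; otherwise proceed."""
    if score < critical:
        return Decision(True, Action.ABORT)
    if score < threshold:
        return Decision(True, Action.HUMAN_IN_LOOP)
    return Decision(False, Action.NONE)
```

With these defaults, a score of 0.2 aborts, 0.5 escalates to a human, and 0.8 passes through untouched.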
### Model Upgrade Handler

```python
from confidence_escalation import ModelUpgradeHandler, ThresholdPolicy, EscalationAction

handler = ModelUpgradeHandler(
    upgrade_map={
        "claude-haiku-4-5": "claude-sonnet-4-6",
        "claude-sonnet-4-6": "claude-opus-4-7",
    }
)

policy = ThresholdPolicy(threshold=0.7, action=EscalationAction.MODEL_UPGRADE)
result = policy.evaluate(score, context={"model": "claude-haiku-4-5"})

if result.triggered:
    upgrade_info = handler.handle(result, context={"model": "claude-haiku-4-5"})
    print(f"Retry with: {upgrade_info['upgraded_model']}")
```
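In practice the upgrade map is walked as a retry loop: re-run the step on the stronger model until the score clears the threshold or the ladder runs out. A hedged sketch of that loop, where `run_model` is a placeholder for your own call that returns an answer plus a confidence score:

```python
def retry_with_upgrades(run_model, upgrade_map, model, threshold=0.7):
    """Climb the upgrade ladder until confidence clears the threshold
    or no stronger model remains; return the last attempt either way."""
    while True:
        answer, confidence = run_model(model)
        if confidence >= threshold or model not in upgrade_map:
            return answer, confidence, model
        model = upgrade_map[model]  # escalate to the next stronger model
```

Because the map is finite and acyclic, the loop terminates at the top of the ladder even if no model ever reaches the threshold.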
### Tool Restriction

```python
from confidence_escalation import ToolRestrictionHandler, ThresholdPolicy, EscalationAction

handler = ToolRestrictionHandler(
    high_risk_tools=["delete_record", "send_email", "execute_sql"],
    allow_read_only=True,
)

policy = ThresholdPolicy(threshold=0.65, action=EscalationAction.TOOL_RESTRICTION)
result = policy.evaluate(score, context={"available_tools": ["get_customer", "delete_record"]})

if result.triggered:
    restriction = handler.handle(result, context={"available_tools": agent_tools})
    safe_tools = restriction["allowed_tools"]
    # Re-invoke the agent with only safe_tools
```
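Conceptually, restriction is a set subtraction over tool names, optionally followed by a read-only filter. A minimal sketch (the prefix heuristic for "read-only" is an assumption for illustration, not necessarily the library's actual rule):

```python
READ_ONLY_PREFIXES = ("get_", "list_", "search_", "read_")  # assumed heuristic

def restrict_tools(available_tools, high_risk_tools, allow_read_only=True):
    """Remove high-risk tools; optionally keep only read-only-looking names."""
    blocked = set(high_risk_tools)
    safe = [t for t in available_tools if t not in blocked]
    if allow_read_only:
        safe = [t for t in safe if t.startswith(READ_ONLY_PREFIXES)]
    return safe

safe = restrict_tools(
    ["get_customer", "delete_record", "summarize"],
    ["delete_record", "send_email", "execute_sql"],
)
```

Here the explicit blocklist removes `delete_record`, and the read-only filter then drops anything that can mutate state, leaving only `get_customer` for the re-invoked agent.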
### LangChain Integration

```python
from langchain.chains import LLMChain

from confidence_escalation.adapters.langchain import LangChainEscalationAdapter
from confidence_escalation.handlers import HumanInLoopHandler

adapter = LangChainEscalationAdapter(
    threshold=0.65,
    handlers=[HumanInLoopHandler(raise_on_trigger=True)],
)

# Attach as a LangChain callback
chain = LLMChain(llm=llm, callbacks=[adapter.as_callback()])

# Or call directly from a LangGraph node
def research_node(state):
    response = llm.invoke(state["messages"])
    try:
        adapter.on_llm_end(
            response.content,
            logprobs=response.response_metadata.get("logprobs"),
        )
    except HumanInLoopHandler.HumanReviewRequired:
        return {"status": "escalated"}
    return {"response": response.content}
```
### CrewAI Integration

```python
from crewai import Agent
from confidence_escalation.adapters.crewai import CrewAIEscalationAdapter

adapter = CrewAIEscalationAdapter(threshold=0.65)

agent = Agent(
    role="Research Specialist",
    goal="Analyze market trends",
    backstory="...",
    step_callback=adapter.step_callback,
)
```
### Google ADK Integration

```python
from google.adk.agents import BaseAgent
from confidence_escalation.adapters.google_adk import ADKEscalationAdapter

class GovernedAgent(BaseAgent):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._escalation = ADKEscalationAdapter(threshold=0.65)

    async def _run_async_impl(self, ctx):
        async for event in self._llm_agent._run_async_impl(ctx):
            if event.is_final_response():
                result = self._escalation.evaluate_event(event, ctx)
                if result["triggered"]:
                    yield self._escalation.build_escalation_event(result)
                    return
            yield event
```
### Composite Policy Chains

```python
from confidence_escalation import ThresholdPolicy, EscalationAction
from confidence_escalation.policy import CompositePolicy

policy = CompositePolicy(policies=[
    ThresholdPolicy(threshold=0.25, action=EscalationAction.ABORT),
    ThresholdPolicy(threshold=0.55, action=EscalationAction.HUMAN_IN_LOOP),
    ThresholdPolicy(threshold=0.75, action=EscalationAction.COMPLIANCE_LOG),
])

# First matching threshold wins
result = policy.evaluate(score, context={"session_id": "abc"})
```
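First-match semantics means the chain should be ordered from most to least severe, lowest threshold first. The evaluation loop can be pictured as follows (a sketch of the idea, not the library's internals):

```python
def evaluate_chain(score, policies):
    """Return the action of the first policy whose threshold the score
    falls below; policies must be ordered lowest-threshold first."""
    for threshold, action in policies:
        if score < threshold:
            return {"triggered": True, "action": action}
    return {"triggered": False, "action": None}

chain = [(0.25, "abort"), (0.55, "human_in_loop"), (0.75, "compliance_log")]
```

A score of 0.4 triggers `human_in_loop` (it clears the 0.25 abort band but not the 0.55 threshold), while 0.9 triggers nothing; reversing the order would make the loosest threshold shadow the stricter ones.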
## OWASP Agentic AI Coverage
| OWASP ASI ID | Risk | Coverage |
|---|---|---|
| ASI-09 | Human-Agent Trust Exploitation | Confidence gating before high-stakes actions |
| ASI-02 | Tool Misuse | Tool restriction handler removes high-risk tools at low confidence |
| ASI-03 | Identity/Privilege Abuse | ComplianceLoggingHandler creates immutable audit trail |
## Related Packages
- voice-ai-governance — HIPAA/FERPA/EU AI Act compliance for voice AI pipelines
- regulated-ai-governance — Runtime tool authorization and capability scoping
- enterprise-rag-patterns — FERPA/HIPAA/GDPR-compliant RAG patterns
## License
MIT License. See LICENSE.