
confidence-escalation

Framework-agnostic confidence-gated escalation middleware for LLM agents.

Python 3.9+ · License: MIT

Multi-signal confidence scoring (logprob + verbalized + ASR + tool risk) with threshold-based escalation policies and pluggable handlers. Works with LangChain, LangGraph, CrewAI, AutoGen, Google ADK, and any Python agent framework.

Addresses OWASP Agentic AI Top 10 ASI-09: Human-Agent Trust Exploitation — prevents agents from taking high-stakes actions when confidence is insufficient.


The Problem

LLM agents fail silently. When an agent is uncertain, it still returns a response, often confidently worded, with no mechanism to:

  • Detect that confidence is low before executing a high-risk tool call
  • Route uncertain responses to a human reviewer
  • Escalate to a stronger model when needed
  • Produce a compliance audit trail of every escalation event

confidence-escalation solves all four.


Features

  • Multi-signal scoring — combine logprobs, verbalized confidence, and tool-call risk into a single composite score
  • Threshold policies — single-threshold, dual-threshold (normal + critical), composite multi-policy chains
  • Pluggable handlers — human-in-loop, model upgrade, tool restriction, compliance logging
  • Framework adapters — LangChain callbacks, CrewAI step_callback, AutoGen reply function wrapper, Google ADK event interceptor
  • EU AI Act Article 12 audit logging — structured JSON compliance log on every escalation
  • Zero required dependencies — core library runs with no dependencies; framework integrations are optional extras

Quick Start

Installation

pip install confidence-escalation
# With LangChain:
pip install "confidence-escalation[langchain]"
# With all frameworks:
pip install "confidence-escalation[all]"

Basic Scoring

from confidence_escalation import MultiSignalConfidenceScorer

scorer = MultiSignalConfidenceScorer(
    weights={"logprob": 0.5, "verbalized": 0.3, "tool_risk": -0.2}
)

score = scorer.score(
    logprobs=[-0.1, -0.3, -0.2],
    verbalized_response="I am 70% confident about this answer.",
    tool_call_risk=0.15,
)

print(f"Confidence: {score.value:.3f}")   # e.g. 0.712
print(f"Reliable: {score.is_reliable()}")  # True (above 0.6 default)
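The README does not spell out how each signal is normalized before weighting. As a rough mental model (assuming the logprob signal is the exponential of the mean token logprob and the verbalized signal is a parsed percentage; these are illustrative choices, not the library's documented normalization), the weighted combination looks like:

```python
import math
import re

def composite_confidence(logprobs, verbalized_response, tool_call_risk, weights):
    """Hypothetical sketch of a weighted multi-signal confidence score.

    The real MultiSignalConfidenceScorer may normalize differently; this only
    illustrates how three signals can fold into one value in [0, 1].
    """
    # Logprob signal: exp(mean logprob) maps average token probability to (0, 1].
    logprob_conf = math.exp(sum(logprobs) / len(logprobs))
    # Verbalized signal: parse "70%" out of the model's own confidence statement.
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", verbalized_response)
    verbalized_conf = float(match.group(1)) / 100 if match else 0.5
    # Weighted sum; tool risk carries a negative weight, so risk lowers the score.
    raw = (weights["logprob"] * logprob_conf
           + weights["verbalized"] * verbalized_conf
           + weights["tool_risk"] * tool_call_risk)
    return max(0.0, min(1.0, raw))

score = composite_confidence(
    logprobs=[-0.1, -0.3, -0.2],
    verbalized_response="I am 70% confident about this answer.",
    tool_call_risk=0.15,
    weights={"logprob": 0.5, "verbalized": 0.3, "tool_risk": -0.2},
)
```

Note the negative tool_risk weight: riskier tool calls pull the composite score down rather than up.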

Threshold Policy + Human-in-Loop

from confidence_escalation import (
    ThresholdPolicy,
    EscalationAction,
    HumanInLoopHandler,
    ComplianceLoggingHandler,
    ConfidenceEscalationMiddleware,
)

def notify_human(ctx, result):
    print(f"Routing to human review: session={ctx['session_id']}, confidence={result.confidence_score:.3f}")

policy = ThresholdPolicy(
    threshold=0.65,
    action=EscalationAction.HUMAN_IN_LOOP,
    critical_threshold=0.3,
    critical_action=EscalationAction.ABORT,
)

middleware = ConfidenceEscalationMiddleware(
    policy=policy,
    handlers=[
        HumanInLoopHandler(callback=notify_human),
        ComplianceLoggingHandler(),
    ],
)

result = middleware.call(
    agent_step=lambda: my_llm.invoke(messages),
    context={"session_id": "abc123", "model": "claude-sonnet-4-6"},
    logprobs=[-0.4, -0.5],
)

if result["escalation"]["triggered"]:
    print("Escalated — stopping agent execution.")
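The dual-threshold policy above is effectively a three-way decision: proceed, escalate to a human, or abort. A minimal sketch of that decision order, assuming the critical band is checked first (illustrative, not the library's code):

```python
def decide(confidence, threshold=0.65, critical_threshold=0.3):
    """Sketch of dual-threshold gating with the critical band checked first."""
    if confidence < critical_threshold:
        return "ABORT"           # far below the critical band: stop outright
    if confidence < threshold:
        return "HUMAN_IN_LOOP"   # below the normal band: route to a reviewer
    return "PROCEED"             # confident enough: no escalation

decision = decide(0.42)  # falls in the middle band: human review
```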

Model Upgrade Handler

from confidence_escalation import ModelUpgradeHandler, ThresholdPolicy, EscalationAction

handler = ModelUpgradeHandler(
    upgrade_map={
        "claude-haiku-4-5": "claude-sonnet-4-6",
        "claude-sonnet-4-6": "claude-opus-4-7",
    }
)

policy = ThresholdPolicy(threshold=0.7, action=EscalationAction.MODEL_UPGRADE)
result = policy.evaluate(score, context={"model": "claude-haiku-4-5"})  # `score` from the scorer example above

if result.triggered:
    upgrade_info = handler.handle(result, context={"model": "claude-haiku-4-5"})
    print(f"Retry with: {upgrade_info['upgraded_model']}")

Tool Restriction

from confidence_escalation import ToolRestrictionHandler, ThresholdPolicy, EscalationAction

handler = ToolRestrictionHandler(
    high_risk_tools=["delete_record", "send_email", "execute_sql"],
    allow_read_only=True,
)

agent_tools = ["get_customer", "delete_record"]

policy = ThresholdPolicy(threshold=0.65, action=EscalationAction.TOOL_RESTRICTION)
result = policy.evaluate(score, context={"available_tools": agent_tools})  # `score` from the scorer example above

if result.triggered:
    restriction = handler.handle(result, context={"available_tools": agent_tools})
    safe_tools = restriction["allowed_tools"]
    # Re-invoke the agent with only safe_tools
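Conceptually, tool restriction at low confidence is set subtraction over tool names. A sketch (the real ToolRestrictionHandler also honors allow_read_only; here every surviving tool is simply assumed safe):

```python
def restrict_tools(available_tools, high_risk_tools):
    """Sketch: drop any tool on the high-risk list, keep the rest,
    preserving the original ordering of available_tools."""
    blocked = set(high_risk_tools)
    return [t for t in available_tools if t not in blocked]

safe = restrict_tools(
    ["get_customer", "delete_record"],
    ["delete_record", "send_email", "execute_sql"],
)
# safe == ["get_customer"]
```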

LangChain Integration

from confidence_escalation.adapters.langchain import LangChainEscalationAdapter
from confidence_escalation.handlers import HumanInLoopHandler

adapter = LangChainEscalationAdapter(
    threshold=0.65,
    handlers=[HumanInLoopHandler(raise_on_trigger=True)],
)

# Attach as a LangChain callback (LLMChain is the legacy chain API)
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, callbacks=[adapter.as_callback()])

# Or call directly from a LangGraph node
def research_node(state):
    response = llm.invoke(state["messages"])
    try:
        adapter.on_llm_end(response.content, logprobs=response.response_metadata.get("logprobs"))
    except HumanInLoopHandler.HumanReviewRequired:
        return {"status": "escalated"}
    return {"response": response.content}

CrewAI Integration

from crewai import Agent
from confidence_escalation.adapters.crewai import CrewAIEscalationAdapter

adapter = CrewAIEscalationAdapter(threshold=0.65)

agent = Agent(
    role="Research Specialist",
    goal="Analyze market trends",
    backstory="...",
    step_callback=adapter.step_callback,
)

Google ADK Integration

from google.adk.agents import BaseAgent
from confidence_escalation.adapters.google_adk import ADKEscalationAdapter

class GovernedAgent(BaseAgent):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._escalation = ADKEscalationAdapter(threshold=0.65)

    async def _run_async_impl(self, ctx):
        # Delegate to a wrapped inner LLM agent (self._llm_agent, assumed to be set elsewhere)
        async for event in self._llm_agent._run_async_impl(ctx):
            if event.is_final_response():
                result = self._escalation.evaluate_event(event, ctx)
                if result["triggered"]:
                    yield self._escalation.build_escalation_event(result)
                    return
            yield event

Composite Policy Chains

from confidence_escalation import ThresholdPolicy, EscalationAction
from confidence_escalation.policy import CompositePolicy

policy = CompositePolicy(policies=[
    ThresholdPolicy(threshold=0.25, action=EscalationAction.ABORT),
    ThresholdPolicy(threshold=0.55, action=EscalationAction.HUMAN_IN_LOOP),
    ThresholdPolicy(threshold=0.75, action=EscalationAction.COMPLIANCE_LOG),
])

result = policy.evaluate(score, context={"session_id": "abc"})
# First matching threshold wins
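"First matching threshold wins" means the chain is ordered from most to least severe: the score is tested against each threshold in turn, and the first one it fails to clear fires. A sketch of that evaluation order (illustrative, not the library's implementation):

```python
def evaluate_chain(score, chain):
    """chain: (threshold, action) pairs ordered by ascending threshold,
    i.e. most severe action first. Returns the first action whose
    threshold the score fails to clear, or None if all are cleared."""
    for threshold, action in chain:
        if score < threshold:
            return action
    return None

chain = [(0.25, "ABORT"), (0.55, "HUMAN_IN_LOOP"), (0.75, "COMPLIANCE_LOG")]
```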

OWASP Agentic AI Coverage

| OWASP ASI ID | Risk | Coverage |
| --- | --- | --- |
| ASI-09 | Human-Agent Trust Exploitation | Confidence gating before high-stakes actions |
| ASI-02 | Tool Misuse | Tool restriction handler removes high-risk tools at low confidence |
| ASI-03 | Identity/Privilege Abuse | ComplianceLoggingHandler creates an immutable audit trail |
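The page does not publish the ComplianceLoggingHandler's log schema; the record below only illustrates the kind of structured JSON entry an Article 12-style audit log typically carries (all field names here are hypothetical, not the library's actual format):

```python
import json
from datetime import datetime, timezone

# Hypothetical escalation record; field names are illustrative only.
record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "session_id": "abc123",
    "confidence_score": 0.42,
    "policy": "ThresholdPolicy(threshold=0.65)",
    "action": "HUMAN_IN_LOOP",
    "model": "claude-sonnet-4-6",
}
line = json.dumps(record, sort_keys=True)  # one JSON object per log line
```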

License

MIT License. See LICENSE.
