Confgate
Confidence-gated decisions for LLM agent outputs.
Agents abstain when they're not sure enough. Findings reach developers only when they should.
The problem
LLM agents are noisy. A security agent that flags "possible SQL injection, confidence 0.4" costs a developer more time than it saves. After enough false positives, the tool gets disabled entirely.
This is false positive fatigue, and it is a major reason automated LLM-based review tools fail in practice.
What confgate does
Wraps agent functions with a confidence threshold. Findings below the threshold come back with abstained=True. Findings at or above it pass through unchanged, with abstained=False. The caller decides what to surface.
from confgate import Decision, gate


@gate(threshold=0.75)
def security_agent(diff: str) -> Decision:
    # your LLM call here
    return Decision(
        category="security",
        confidence=0.5,  # below threshold
        reasoning="Possible SQL injection in query builder.",
        severity="high",
        line_ref="src/db.py:88",
    )


result = security_agent(diff)
print(result.abstained)   # True: suppressed before it reaches a developer
print(result.confidence)  # 0.5: still available for logging / monitoring
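Because abstained findings are returned rather than dropped, a caller can route them to a log while surfacing the rest. The following is a minimal sketch of that routing, not part of confgate: Finding is a hypothetical stand-in that carries only the fields the sketch reads, and the logger name "review" is an assumption.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

logger = logging.getLogger("review")  # hypothetical logger name


@dataclass
class Finding:
    """Hypothetical stand-in exposing only the fields this sketch reads."""

    category: str
    confidence: float
    abstained: bool


def split_findings(results: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Separate surfaced findings from abstained ones, logging the latter."""
    surfaced = [r for r in results if not r.abstained]
    abstained = [r for r in results if r.abstained]
    for r in abstained:
        # The confidence is still present on abstained findings, so it can be monitored.
        logger.info("abstained: category=%s confidence=%.2f", r.category, r.confidence)
    return surfaced, abstained


results = [Finding("security", 0.91, False), Finding("style", 0.40, True)]
surfaced, abstained = split_findings(results)
```

Tracking the logged confidences over time is one way to judge whether a threshold is set too high or too low.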
Install
pip install confgate
Zero runtime dependencies. Pure Python 3.10+.
Core concepts
Decision
The structured output every agent returns.
from dataclasses import dataclass


@dataclass
class Decision:
    category: str                # 'security' | 'style' | 'logic' | anything you define
    confidence: float            # 0.0 – 1.0, provided by your LLM
    reasoning: str               # one sentence, shown to the end user
    severity: str = "medium"     # 'low' | 'medium' | 'high' | 'critical'
    line_ref: str | None = None  # optional, e.g. 'src/auth.py:42'
    abstained: bool = False      # set by @gate, never by you
Decision validates itself on construction — confidence must be in [0.0, 1.0], severity must be one of the four valid values.
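One plausible way to get validation on construction is a dataclass __post_init__ hook. The sketch below illustrates the idea with a stand-in class, SketchDecision; it is not confgate's source, and the use of ValueError is an assumption, since this page does not say which exception Decision raises.

```python
from __future__ import annotations

from dataclasses import dataclass

VALID_SEVERITIES = ("low", "medium", "high", "critical")  # the four values listed above


@dataclass
class SketchDecision:
    """Illustrative stand-in for Decision; not the library's code."""

    category: str
    confidence: float
    reasoning: str
    severity: str = "medium"
    line_ref: str | None = None
    abstained: bool = False

    def __post_init__(self) -> None:
        # Assumption: ValueError is used here; confgate may raise a different type.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence must be in [0.0, 1.0], got {self.confidence}")
        if self.severity not in VALID_SEVERITIES:
            raise ValueError(f"severity must be one of {VALID_SEVERITIES}, got {self.severity!r}")


def try_build(**fields) -> str:
    """Return 'ok' or the validation message, so both outcomes are easy to inspect."""
    try:
        SketchDecision(**fields)
        return "ok"
    except ValueError as exc:
        return str(exc)
```

Calling try_build(category="security", confidence=1.5, reasoning="x") returns the confidence message instead of constructing an object, which is the behaviour the paragraph above describes.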
@gate
A decorator factory. Apply it to any agent function that returns a Decision.
@gate(threshold=0.8)  # threshold validated at decoration time
def my_agent(diff: str) -> Decision:
    ...
- Above threshold: returned as-is, abstained=False
- Below threshold: returned with abstained=True
- Equal to threshold: passes. A threshold is a minimum bar, not a ceiling.
- Wrong return type: raises InvalidDecisionError immediately
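The four rules above can be sketched as a small decorator factory. This is an illustration of the described behaviour, not confgate's implementation: every name prefixed with Sketch or sketch_ is hypothetical, and returning a copy via dataclasses.replace (rather than mutating the result) is an assumption.

```python
from __future__ import annotations

import functools
from dataclasses import dataclass, replace
from typing import Callable


class SketchGateError(Exception):
    """Stand-in for the GateError base class."""


class SketchInvalidDecisionError(SketchGateError):
    """Stand-in for InvalidDecisionError."""


@dataclass
class SketchDecision:
    category: str
    confidence: float
    reasoning: str
    abstained: bool = False


def sketch_gate(threshold: float = 0.8) -> Callable:
    # Validate once, when the decorator is applied, not on every call.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be in [0.0, 1.0], got {threshold}")

    def decorator(fn: Callable[..., SketchDecision]) -> Callable[..., SketchDecision]:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs) -> SketchDecision:
            result = fn(*args, **kwargs)
            if not isinstance(result, SketchDecision):
                raise SketchInvalidDecisionError(
                    f"{fn.__name__} returned {type(result).__name__}, expected a decision"
                )
            # Strict less-than: a confidence equal to the threshold passes.
            return replace(result, abstained=result.confidence < threshold)

        return wrapper

    return decorator


@sketch_gate(threshold=0.75)
def demo_agent(confidence: float) -> SketchDecision:
    return SketchDecision("security", confidence, "Illustrative finding.")
```

With this sketch, demo_agent(0.5) comes back abstained, while demo_agent(0.75) and demo_agent(0.9) do not.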
Real-world pattern: multi-agent PR reviewer
from confgate import Decision, gate

# call_llm, parse_json, SECURITY_PROMPT and STYLE_PROMPT are supplied by your own code.


@gate(threshold=0.75)
def security_agent(diff: str) -> Decision:
    raw = call_llm(SECURITY_PROMPT.format(diff=diff))
    data = parse_json(raw)
    return Decision(
        category="security",
        confidence=data["confidence"],
        reasoning=data["reasoning"],
        severity=data["severity"],
    )


@gate(threshold=0.75)
def style_agent(diff: str) -> Decision:
    raw = call_llm(STYLE_PROMPT.format(diff=diff))
    data = parse_json(raw)
    return Decision(
        category="style",
        confidence=data["confidence"],
        reasoning=data["reasoning"],
        severity=data["severity"],
    )


def orchestrate(diff: str) -> list[Decision]:
    agents = [security_agent, style_agent]
    results = [agent(diff) for agent in agents]
    # only surface what agents were confident about
    return [r for r in results if not r.abstained]
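The agents above rely on a parse_json helper that is left to the reader. The sketch below shows one defensive way to turn a model reply into Decision keyword arguments, assuming the JSON shape requested by the prompt in the next section (has_finding, confidence, reasoning, severity). The name parse_reply and the choice to treat malformed replies as "no finding" are assumptions, not confgate behaviour; raising an error instead is equally reasonable.

```python
from __future__ import annotations

import json

VALID_SEVERITIES = ("low", "medium", "high", "critical")


def parse_reply(raw: str) -> dict | None:
    """Return Decision keyword arguments, or None when there is nothing to report."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # assumption: an unparseable reply is treated as no finding
    if not isinstance(data, dict) or not data.get("has_finding"):
        return None
    try:
        confidence = float(data["confidence"])
    except (KeyError, TypeError, ValueError):
        return None
    if not 0.0 <= confidence <= 1.0:
        return None  # out-of-range scores would fail Decision validation anyway
    severity = data.get("severity")
    if severity not in VALID_SEVERITIES:
        severity = "medium"  # the default listed in the API reference
    return {
        "confidence": confidence,
        "reasoning": str(data.get("reasoning", "")),
        "severity": severity,
    }
```

An agent built on this helper has to decide what to return when it yields None, because a gated function must still return a Decision.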
Prompt pattern for LLM-generated confidence
Tell the model exactly what confidence scores mean — otherwise values are inconsistent across calls.
# Literal JSON braces are doubled ({{ and }}) because the agents fill this
# template with str.format(diff=diff); single braces would be parsed as fields.
SECURITY_PROMPT = """
You are a security code reviewer. Analyze the following git diff.
Respond ONLY with valid JSON. No markdown, no backticks.

{{
  "has_finding": true | false,
  "confidence": 0.0-1.0,
  "reasoning": "one sentence explanation",
  "severity": "low" | "medium" | "high" | "critical"
}}

Confidence guide:
- 0.9+     : obvious issue (hardcoded secret, SQL injection)
- 0.7-0.9  : likely issue, some context needed
- 0.5-0.7  : possible issue, uncertain
- below 0.5: not a real finding

DIFF:
{diff}
"""
Serialisation
to_dict() returns a plain dict for JSON serialisation — useful for posting findings to GitHub review comments, logging, or monitoring.
result.to_dict()
# {
#     "category": "security",
#     "confidence": 0.5,
#     "reasoning": "Possible SQL injection in query builder.",
#     "severity": "high",
#     "line_ref": "src/db.py:88",
#     "abstained": True
# }

str(result)
# [HIGH] [ABSTAINED] security @ src/db.py:88 (confidence=0.50): Possible SQL injection...
Error handling
from confgate import Decision, GateError, InvalidDecisionError, gate

# Bad threshold: caught at decoration time
@gate(threshold=1.5)  # raises ValueError immediately
def agent(): ...

# Wrong return type: caught at call time
@gate(threshold=0.75)
def bad_agent(diff: str) -> Decision:
    return {"confidence": 0.9}  # raises InvalidDecisionError on call

# Catch any confgate error
try:
    result = bad_agent(diff)
except GateError as e:
    print(f"confgate error: {e}")
API reference
gate(threshold: float = 0.8)
| Parameter | Type | Default | Description |
|---|---|---|---|
| threshold | float | 0.8 | Minimum confidence to surface a finding. Must be 0.0 – 1.0. |
Raises ValueError at decoration time if threshold is out of range.
Decision
| Field | Type | Default | Description |
|---|---|---|---|
| category | str | required | Free-form finding type. You define the taxonomy. |
| confidence | float | required | LLM-generated confidence score. Must be 0.0 – 1.0. |
| reasoning | str | required | One-sentence explanation shown to the end user. |
| severity | str | "medium" | One of low, medium, high, critical. |
| line_ref | str \| None | None | Optional code location, e.g. src/auth.py:42. |
| abstained | bool | False | Set by @gate. Never set this yourself. |
Exceptions
| Exception | When |
|---|---|
| GateError | Base class for all confgate errors. |
| InvalidDecisionError | Decorated function returned a non-Decision value. |
Design decisions
Zero dependencies — installs into any Python environment without version conflicts.
Abstained findings are returned, not dropped — the caller decides whether to log, discard, or escalate. confgate never makes that choice silently.
No category validation — category is a free-form string. confgate doesn't know your domain vocabulary and shouldn't.
Strict less-than for abstention: a finding abstains only when its confidence is strictly below the threshold, so a confidence equal to the threshold passes. A threshold is a minimum bar.
No async support in v0.1.0 — coming in v0.2.0. Handle concurrency at the orchestrator level with asyncio.gather for now.
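Until async support lands, one way to follow the asyncio.gather suggestion is to run each synchronous agent in a worker thread with asyncio.to_thread. The sketch below uses hypothetical stand-in agents that sleep instead of calling an LLM and return strings instead of Decision objects; only the orchestration pattern is the point.

```python
import asyncio
import time


def slow_security_agent(diff: str) -> str:
    """Stand-in for a synchronous, gated agent; sleeps instead of calling an LLM."""
    time.sleep(0.05)
    return f"security reviewed {len(diff)} chars"


def slow_style_agent(diff: str) -> str:
    """Second stand-in agent with the same blocking behaviour."""
    time.sleep(0.05)
    return f"style reviewed {len(diff)} chars"


async def orchestrate_concurrently(diff: str) -> list[str]:
    agents = [slow_security_agent, slow_style_agent]
    # asyncio.to_thread runs each blocking call in a worker thread;
    # gather awaits them together and preserves the input order.
    return await asyncio.gather(*(asyncio.to_thread(agent, diff) for agent in agents))


results = asyncio.run(orchestrate_concurrently("+ added line"))
```

Because the gate wraps a plain function, the threshold logic runs inside the worker thread unchanged, and the orchestrator only has to filter on abstained afterwards.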
Development
git clone https://github.com/harsha29292/confgate
cd confgate
pip install -e ".[dev]"
pytest -v
CI runs on Python 3.10, 3.11, and 3.12 on every push and PR.
Roadmap
- @async_gate: async support for coroutine agent functions
- GateConfig: shared config object across multiple agents
- Logprob-based confidence: derive confidence from token probabilities instead of prompting
- Verdict: aggregate multiple Decision objects into a single review summary
License
MIT — see LICENSE.
Built by Sri Harsha · PyPI · Issues
File details
Details for the file confgate-0.1.1.tar.gz.
File metadata
- Download URL: confgate-0.1.1.tar.gz
- Upload date:
- Size: 9.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c91d569d10b21e2ba1f1a2c1fd73bd43083f43c232d3b7aadbd468cb98ccb17a |
| MD5 | d5bdbef729ba79a151c67fecc826bf68 |
| BLAKE2b-256 | 697299fa09e633d479a5f27ff014000f6765d9de9f61626fe006c679801dfcdd |
File details
Details for the file confgate-0.1.1-py3-none-any.whl.
File metadata
- Download URL: confgate-0.1.1-py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aadc83d5b758a7808503bb88deb539c9d6b50b4495f55e052dd2a2bd689fca9d |
| MD5 | 1b3f38d34b161464e0c56c463e1af6a6 |
| BLAKE2b-256 | 93d28f24ecd75fcb3464eff4421b834e0e683a13a5111ba9b2c1554cf9fd810e |