Multi-Factor Generative-Deterministic Confidence (MFGC) scoring and safety gates for AI agents. Zero dependencies.

murphy-confidence

Should your AI agent act? murphy-confidence answers that question with math, not vibes.

Zero dependencies · Pure Python 3.10+ · pip install murphy-confidence

The problem

Most AI agent frameworks give you a way to call tools. Few give you a principled way to decide whether to call them.

You end up with one of:

A hardcoded threshold (`if confidence > 0.7: execute()`): no phase awareness, no hazard weighting, no audit trail
A vibe check: asking the LLM "are you sure?" and hoping it says no when it should
Nothing: letting the agent do whatever it decides and hoping for the best

When you're automating actions that touch real data, real money, or real people, none of those options are acceptable.

The solution

murphy-confidence implements the Multi-Factor Generative-Deterministic Confidence (MFGC) formula:

C(t) = w_g · G(x) + w_d · D(x) − κ · H(x)

Where:

Symbol	Meaning	Range
`G(x)`	Generative quality score — how good is the LLM output?	[0, 1]
`D(x)`	Domain-deterministic score — does this match the rules?	[0, 1]
`H(x)`	Hazard factor — how bad if this is wrong?	[0, 1]
`w_g, w_d, κ`	Phase-locked weights — shift toward determinism as execution approaches	—

The weights are phase-locked: as your pipeline moves from brainstorming to executing, the formula automatically shifts trust away from the LLM and toward your domain rules. At EXECUTE phase, the threshold is 0.85. At EXPAND phase, it's 0.50.
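
To make the formula concrete, here is a minimal sketch that evaluates it by hand. The function name `mfgc_score` and the weights used below (w_g = 0.4, w_d = 0.6, κ = 0.5) are hypothetical, chosen only for illustration. The library's actual phase-locked weights are not listed in this README, so the printed value will not match the library's output.

```python
def mfgc_score(g: float, d: float, h: float,
               w_g: float, w_d: float, kappa: float) -> float:
    """Evaluate C = w_g*G + w_d*D - kappa*H for inputs in [0, 1]."""
    for name, value in (("g", g), ("d", d), ("h", h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return w_g * g + w_d * d - kappa * h


# Hypothetical weights for illustration, NOT the library's schedule.
score = mfgc_score(g=0.82, d=0.75, h=0.10, w_g=0.4, w_d=0.6, kappa=0.5)
print(round(score, 4))  # 0.728
```

With the same hypothetical weights, raising the hazard from 0.10 to 0.80 subtracts a further 0.35 from the score, which is how a high-hazard action can fall below a threshold despite strong G and D values.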

5-second quickstart

pip install murphy-confidence

from murphy_confidence import compute_confidence
from murphy_confidence.types import Phase

result = compute_confidence(
    goodness=0.82,   # How good is the AI output?  [0-1]
    domain=0.75,     # How well does it match domain rules?  [0-1]
    hazard=0.10,     # How risky is this action?  [0-1]
    phase=Phase.EXECUTE,
)

print(result.score)    # 0.7585
print(result.action)   # GateAction.PROCEED_WITH_MONITORING
print(result.allowed)  # True
print(result.rationale)
# [ALLOWED] Phase=EXECUTE | C=0.7585 (threshold=0.85) | Action=PROCEED_WITH_MONITORING | ...

Complete feature walkthrough

The Confidence Engine

The engine is stateless. Call it anywhere, in any thread, with any inputs:

from murphy_confidence import ConfidenceEngine
from murphy_confidence.types import Phase

engine = ConfidenceEngine()

# Low hazard, high quality — proceeds automatically at EXECUTE
result = engine.compute(goodness=0.95, domain=0.90, hazard=0.02, phase=Phase.EXECUTE)
assert result.action.value == "PROCEED_AUTOMATICALLY"

# High hazard — blocked even with good quality
result = engine.compute(goodness=0.90, domain=0.85, hazard=0.80, phase=Phase.EXECUTE)
assert not result.allowed

The phase-locked weight schedule means the same inputs produce different outcomes at different phases — early phases are lenient, EXECUTE is strict:

Phase	Score (goodness=0.78, domain=0.72, hazard=0.15)	Allowed
EXPAND	0.6570	✓
TYPE	0.6410	✓
ENUMERATE	0.6250	✓
CONSTRAIN	0.6045	✓
COLLAPSE	0.5885	✓
BIND	0.5745	✗
EXECUTE	0.5555	✗

Safety Gates

Gates wrap a confidence result in a domain-specific policy check:

from murphy_confidence import SafetyGate, compute_confidence
from murphy_confidence.types import GateType, Phase

# A compliance gate at 0.90 — blocking by default
gate = SafetyGate("hipaa_compliance", GateType.COMPLIANCE)

result = compute_confidence(0.82, 0.78, 0.08, Phase.EXECUTE)
gr = gate.evaluate(result)

if not gr.passed and gr.blocking:
    raise RuntimeError(gr.message)
    # Gate 'hipaa_compliance' (COMPLIANCE) FAILED [BLOCKING] — confidence 0.7368 < threshold 0.9000

Six gate types, each with sensible defaults:

Gate Type	Default Threshold	Blocking
`EXECUTIVE`	0.85	✓
`OPERATIONS`	0.70	✗
`QA`	0.75	✗
`HITL`	0.80	✓
`COMPLIANCE`	0.90	✓
`BUDGET`	0.65	✗
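
The defaults above can be read as a small lookup table. The sketch below is a standalone illustration of blocking versus non-blocking semantics and does not use the library. `GateOutcome`, `check_gate`, and `is_blocked` are hypothetical names, and it assumes a gate passes when the score is greater than or equal to its threshold.

```python
from dataclasses import dataclass

# Default thresholds and blocking flags copied from the table above.
GATE_DEFAULTS = {
    "EXECUTIVE": (0.85, True),
    "OPERATIONS": (0.70, False),
    "QA": (0.75, False),
    "HITL": (0.80, True),
    "COMPLIANCE": (0.90, True),
    "BUDGET": (0.65, False),
}


@dataclass(frozen=True)
class GateOutcome:  # hypothetical stand-in, not the library's result type
    gate_type: str
    passed: bool
    blocking: bool


def check_gate(gate_type: str, score: float) -> GateOutcome:
    """Compare a score with the default threshold for one gate type."""
    threshold, blocking = GATE_DEFAULTS[gate_type]
    return GateOutcome(gate_type, score >= threshold, blocking)


def is_blocked(score: float, gate_types: list[str]) -> bool:
    """True if any blocking gate fails; non-blocking failures do not stop the action."""
    outcomes = (check_gate(g, score) for g in gate_types)
    return any(not o.passed and o.blocking for o in outcomes)


print(is_blocked(0.78, ["QA", "BUDGET"]))        # False
print(is_blocked(0.78, ["OPERATIONS", "HITL"]))  # True
```

In the second call the score clears OPERATIONS but falls short of the blocking HITL gate at 0.80, so the action is stopped.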

Gate Compiler

Don't know which gates you need? The compiler figures it out:

from murphy_confidence import GateCompiler, compute_confidence
from murphy_confidence.types import Phase

result = compute_confidence(0.72, 0.68, 0.18, Phase.EXECUTE)
compiler = GateCompiler()
gates = compiler.compile_gates(result, context={"compliance_required": True})

for gate in gates:
    gr = gate.evaluate(result)
    print(f"{gr.gate_id}: {'PASS' if gr.passed else 'FAIL'}")

The compiler uses a rule table that maps (phase, action) pairs to gate sets — so the right gates are automatically included for EXECUTE phase, for blocking actions, for compliance contexts, etc.
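
As a rough picture of what a rule table like this could look like, here is a standalone sketch. Everything in it is hypothetical: the `RULES` list, the `select_gates` function, and the specific conditions and gate choices are illustrative assumptions, not the library's actual rules.

```python
# Hypothetical rule table: each rule pairs a predicate with the gate types it adds.
# The conditions and gate choices are illustrative, not the library's rules.
RULES = [
    (lambda phase, action, ctx: phase == "EXECUTE", ["EXECUTIVE"]),
    (lambda phase, action, ctx: action == "REQUIRE_HUMAN_APPROVAL", ["HITL"]),
    (lambda phase, action, ctx: ctx.get("compliance_required", False), ["COMPLIANCE"]),
]


def select_gates(phase, action, context=None):
    """Collect gate types from every matching rule, preserving order, without duplicates."""
    ctx = context or {}
    selected = []
    for predicate, gate_types in RULES:
        if predicate(phase, action, ctx):
            for gate_type in gate_types:
                if gate_type not in selected:
                    selected.append(gate_type)
    return selected


print(select_gates("EXECUTE", "PROCEED_WITH_CAUTION", {"compliance_required": True}))
# ['EXECUTIVE', 'COMPLIANCE']
```

Keeping the rules as data rather than nested conditionals makes it easy to audit which condition added which gate.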

Domain Models

For vertical-specific scoring, the domain sub-package provides ready-made scorers for healthcare, financial, and manufacturing scenarios:

from murphy_confidence.domain.healthcare import HealthcareDomainEngine
from murphy_confidence import compute_confidence
from murphy_confidence.types import Phase

engine = HealthcareDomainEngine()
# patient_record and prescription are placeholders for your application's own data
g, d, h = engine.compute(patient_record, prescription)

result = compute_confidence(g, d, h, Phase.EXECUTE)

Integration examples

FastAPI middleware

Gate every AI agent action before it hits your handler:

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from murphy_confidence import GateCompiler, compute_confidence
from murphy_confidence.types import Phase

app = FastAPI()
compiler = GateCompiler()

@app.middleware("http")
async def confidence_gate(request: Request, call_next):
    if request.url.path == "/agent/action":
        body = await request.json()
        result = compute_confidence(
            body["goodness"], body["domain"], body["hazard"], Phase.EXECUTE
        )
        gates = compiler.compile_gates(result, context={"compliance_required": True})
        for gate in gates:
            gr = gate.evaluate(result)
            if not gr.passed and gr.blocking:
                return JSONResponse({"blocked": True, "reason": gr.message}, status_code=403)
    return await call_next(request)

See examples/fastapi_middleware.py for the full runnable example.

LangChain callback

Intercept every tool call and gate it:

from murphy_confidence import GateCompiler, compute_confidence
from murphy_confidence.types import Phase

class MurphyConfidenceCallback:
    def on_tool_start(self, serialized, input_str, **kwargs):
        result = compute_confidence(
            kwargs.get("goodness", 0.70),
            kwargs.get("domain", 0.65),
            kwargs.get("hazard", 0.15),
            Phase.EXECUTE,
        )
        gates = GateCompiler().compile_gates(result)
        for gate in gates:
            gr = gate.evaluate(result)
            if not gr.passed and gr.blocking:
                raise RuntimeError(f"Tool blocked: {gr.message}")

See examples/langchain_callback.py for the full runnable example (no LangChain install required for the demo).

Raw Python

from murphy_confidence import compute_confidence, SafetyGate
from murphy_confidence.types import GateType, Phase

# Score the action
result = compute_confidence(
    goodness=0.88,
    domain=0.82,
    hazard=0.05,
    phase=Phase.EXECUTE,
)

# Create a domain-specific gate
gate = SafetyGate("production_deploy", GateType.EXECUTIVE, blocking=True)
gr = gate.evaluate(result)

if gr.passed:
    deploy_to_production()
else:
    notify_human(gr.message)

Why not just use a threshold?

A simple `if confidence > 0.7: proceed` check has six failure modes that murphy-confidence addresses:

Problem	Simple threshold	murphy-confidence
Same threshold at brainstorm and execute	✗ both same	✓ 0.50 → 0.85 ramp
No hazard awareness	✗ ignored	✓ κ · H(x) penalty
No domain validation	✗ only LLM score	✓ w_d · D(x) component
No audit trail	✗ silent pass/fail	✓ rationale string on every result
No gate composition	✗ one boolean	✓ gate pipeline with blocking semantics
No serialisation	✗ raw float	✓ `as_dict()` on all results

Part of Murphy System

murphy-confidence was extracted from Murphy System, an autonomous AI orchestration platform. Inside Murphy, every agent decision — from executing a campaign to deploying code — passes through this confidence gate before it's allowed to act.

We extracted it because the gating problem is universal: if you're building any AI agent that takes real-world actions, you need this layer. A confidence gate stops your agent from acting when it shouldn't and lets it act when it can — with an auditable score behind every decision.

⚠️ Murphy System is currently beta software. We're being honest about that so you can set expectations accordingly.

If you find this library useful, check out the full system at github.com/IKNOWINOT/Murphy-System.

Pipeline phases

Phase	Description	Threshold
`EXPAND`	Brainstorming, ideation	0.50
`TYPE`	Classifying and labelling	0.55
`ENUMERATE`	Listing options	0.60
`CONSTRAIN`	Applying rules and limits	0.65
`COLLAPSE`	Selecting the best option	0.70
`BIND`	Binding to specific resources	0.78
`EXECUTE`	Taking real-world action	0.85
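
The threshold ramp can be explored without the library. The sketch below copies the thresholds from the table and prints how far a fixed score sits above or below each one. `PHASE_THRESHOLDS` and `margin` are hypothetical names, and how the library combines this threshold with other signals to decide `allowed` is not covered here.

```python
# Thresholds copied from the table above.
PHASE_THRESHOLDS = {
    "EXPAND": 0.50,
    "TYPE": 0.55,
    "ENUMERATE": 0.60,
    "CONSTRAIN": 0.65,
    "COLLAPSE": 0.70,
    "BIND": 0.78,
    "EXECUTE": 0.85,
}


def margin(phase: str, score: float) -> float:
    """Signed distance between a score and the phase threshold (positive = above)."""
    return round(score - PHASE_THRESHOLDS[phase], 4)


for phase in PHASE_THRESHOLDS:
    print(f"{phase:<10} {margin(phase, 0.72):+.2f}")
```

For a score of 0.72 the margin is positive through COLLAPSE and turns negative at BIND and EXECUTE, which shows the same score being treated more strictly as execution approaches.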

Action classification

Action	Score range	Meaning
`PROCEED_AUTOMATICALLY`	≥ 0.90	Full autonomy
`PROCEED_WITH_MONITORING`	≥ 0.80	Execute + log
`PROCEED_WITH_CAUTION`	≥ 0.70	Execute with extra checks
`REQUEST_HUMAN_REVIEW`	≥ 0.55	Flag for human, don't block
`REQUIRE_HUMAN_APPROVAL`	≥ 0.40	Block until approved
`BLOCK_EXECUTION`	< 0.40	Hard stop
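
Read literally, the table is a set of lower bounds checked from highest to lowest. The sketch below implements that reading in plain Python. `ACTION_BANDS` and `classify` are hypothetical names, and it assumes the bands apply directly to the raw score; the library may take other factors into account.

```python
# Lower bounds copied from the table above, ordered from highest to lowest.
ACTION_BANDS = [
    (0.90, "PROCEED_AUTOMATICALLY"),
    (0.80, "PROCEED_WITH_MONITORING"),
    (0.70, "PROCEED_WITH_CAUTION"),
    (0.55, "REQUEST_HUMAN_REVIEW"),
    (0.40, "REQUIRE_HUMAN_APPROVAL"),
]


def classify(score: float) -> str:
    """Return the first action whose lower bound the score meets."""
    for lower_bound, action in ACTION_BANDS:
        if score >= lower_bound:
            return action
    return "BLOCK_EXECUTION"


print(classify(0.93))  # PROCEED_AUTOMATICALLY
print(classify(0.62))  # REQUEST_HUMAN_REVIEW
print(classify(0.12))  # BLOCK_EXECUTION
```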

Community

💬 Discussions — questions, ideas, show-and-tell
🐛 Issues — bugs and feature requests
🤝 Contributing — how to contribute
❤️ Sponsor — support the project

License

Apache License 2.0 — see LICENSE.

Version 0.1.0, released Mar 29, 2026
