Verifiable Multi-Agent Decision Reasoning Engine

⚖️ Verdict

Formal decision science meets LLM multi-agent systems. The first open-source framework where AI decisions come with mathematical guarantees.

Verdict is a Python framework that brings formal decision theory — game theory, mechanism design, Bayesian inference, Monte Carlo tree search — into LLM-powered multi-agent systems. Instead of agents chatting their way to a decision, Verdict agents deliberate through mathematically grounded protocols, producing decisions that are verifiable, auditable, and explainable.

Why Verdict?

Framework          What it does                                     What's missing
AutoGen / CrewAI   Agents chat to collaborate                       No formal reasoning guarantees
OpenClaw           AI executes tasks                                No decision theory
MiroFish           Agents simulate social dynamics                  No formal verification
Verdict            Agents deliberate with mathematical protocols

Quick Start

pip install verdict-engine

import asyncio
from verdict import Decision, Objective, Option, Tribunal
from verdict.agents import BayesianReasoner, DevilsAdvocate
from verdict.protocols import StructuredDebate

decision = Decision(
    name="Cloud Provider Selection",
    options=[
        # Parameter keys match the objective names declared below.
        Option(name="aws", params={"cost": 0.7, "ml_capability": 0.8}),
        Option(name="gcp", params={"cost": 0.6, "ml_capability": 0.95}),
        Option(name="azure", params={"cost": 0.65, "ml_capability": 0.75}),
    ],
    objectives=[
        Objective(name="cost", direction="minimize"),
        Objective(name="ml_capability", direction="maximize"),
    ],
)

tribunal = Tribunal(
    agents=[
        BayesianReasoner(model="openai/gpt-4o"),
        DevilsAdvocate(model="openai/gpt-4o"),
    ],
    protocol=StructuredDebate(max_rounds=3),
)

result = asyncio.run(tribunal.deliberate(decision))
print(result.recommendation)       # e.g. "gcp" (LLM output can vary between runs)
print(f"{result.confidence:.0%}")   # e.g. "78%"
print(result.memo.to_markdown())    # Full executive summary

Key Features

🧠 Heterogeneous Reasoning Agents

Eight specialised agent types, each with a distinct reasoning strategy:

  • BayesianReasoner — prior/posterior updating, expected utility maximisation
  • GameTheorist — Nash equilibria, Shapley values, dominance analysis
  • DevilsAdvocate — systematic contrarian stress-testing
  • AdversarialVerifier — red-team attacks, bias detection, fallacy scanning
  • RiskAssessor — VaR/CVaR quantification, Monte Carlo stress tests
  • CausalAnalyst — do-calculus reasoning, confounder identification
  • DomainExpert — RAG-enhanced domain knowledge retrieval
  • MCTSExplorer — LLM-guided Monte Carlo tree search
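
To make the first strategy in the list concrete, here is a minimal standalone sketch of prior/posterior updating followed by expected utility maximisation. It does not use Verdict's own classes; the hypotheses, likelihoods, utilities, and the helper names `posterior` and `expected_utility` are all illustrative assumptions.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of hypotheses."""
    unnormalised = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def expected_utility(beliefs, utilities):
    """Probability-weighted average utility of one option."""
    return sum(beliefs[h] * utilities[h] for h in beliefs)


# Hypothetical question: will ML workload demand be "high" or "low"?
prior = {"high": 0.5, "low": 0.5}
# P(observed evidence | hypothesis), made-up numbers for illustration.
likelihood = {"high": 0.8, "low": 0.2}
beliefs = posterior(prior, likelihood)

# Made-up utility of each option under each hypothesis.
utilities = {
    "gcp": {"high": 0.9, "low": 0.4},
    "aws": {"high": 0.7, "low": 0.6},
}
scores = {name: expected_utility(beliefs, u) for name, u in utilities.items()}
best = max(scores, key=scores.get)
print(beliefs, scores, best)
```

With these numbers the evidence shifts belief towards "high" demand, which in turn favours the option whose utility is highest under that hypothesis.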

📜 Formal Deliberation Protocols

Six mathematically grounded interaction protocols:

  • StructuredDebate — proponent/opponent/judge triangle with evidence requirements
  • DelphiProtocol — anonymous multi-round voting with rationale revision
  • BayesianAggregation — logarithmic opinion pooling with expert calibration
  • NashBargaining — multi-party interest balancing via Nash product maximisation
  • VCGMechanism — incentive-compatible, strategy-proof aggregation
  • MajorityJudgment — median-grade voting resistant to strategic manipulation
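
As an illustration of logarithmic opinion pooling, the sketch below combines agents' probability distributions as a weighted geometric mean and then renormalises. It is a generic textbook formulation, not Verdict's BayesianAggregation implementation; the function name `log_opinion_pool`, the weights, and the example beliefs are assumptions. Weights are expected to sum to 1 and every probability must be strictly positive.

```python
import math


def log_opinion_pool(distributions, weights):
    """Pool distributions via p(x) proportional to prod_i p_i(x) ** w_i."""
    options = list(distributions[0])
    log_scores = {
        o: sum(w * math.log(d[o]) for d, w in zip(distributions, weights))
        for o in options
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores.values())
    unnormalised = {o: math.exp(s - top) for o, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {o: v / total for o, v in unnormalised.items()}


# Two hypothetical agents with equal calibration weights.
agent_a = {"aws": 0.2, "gcp": 0.6, "azure": 0.2}
agent_b = {"aws": 0.3, "gcp": 0.5, "azure": 0.2}
pooled = log_opinion_pool([agent_a, agent_b], weights=[0.5, 0.5])
print(pooled)
```

Giving a better-calibrated agent a larger weight pulls the pooled distribution towards that agent's beliefs.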

All protocols are composable via ProtocolChain.
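
Of the protocols listed above, median-grade voting is the easiest to show in a few lines. The sketch below is a simplified standalone version: it takes the lower median of each option's grades and picks the option with the best median, omitting the tie-breaking rule of the full method. The grade scale, ballots, and function name are hypothetical and are not taken from Verdict's MajorityJudgment class.

```python
GRADES = ["reject", "poor", "fair", "good", "excellent"]  # worst to best


def majority_grade(grades):
    """Lower median of the grades given to one option."""
    ranks = sorted(GRADES.index(g) for g in grades)
    return GRADES[ranks[(len(ranks) - 1) // 2]]


# Five hypothetical voters grade each option.
ballots = {
    "gcp": ["excellent", "good", "good", "fair", "poor"],
    "aws": ["good", "fair", "fair", "fair", "excellent"],
    "azure": ["reject", "fair", "poor", "good", "poor"],
}
medians = {name: majority_grade(g) for name, g in ballots.items()}
winner = max(medians, key=lambda name: GRADES.index(medians[name]))

# One voter exaggerating a low grade does not move the median.
exaggerated = ["excellent", "good", "good", "fair", "reject"]
print(medians, winner, majority_grade(exaggerated))
```

The last line hints at why the median is hard to manipulate: a voter who already grades below the median cannot lower it further by exaggerating.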

✅ Decision Verification

Automated quality checks with mathematical foundations:

  • Pareto optimality — is the recommendation on the efficient frontier?
  • Logical consistency — does the reasoning chain contradict itself?
  • Confidence calibration — does stated confidence match vote distribution?
  • Robustness — would small perturbations flip the recommendation?
  • Dissent acknowledgment — are minority views documented?
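
The Pareto optimality check can be sketched as a plain dominance test over the option scores. The code below is a standalone illustration using the Quick Start numbers; the helper names `dominates` and `pareto_front` are assumptions, and Verdict's own check may be implemented differently.

```python
def dominates(a, b, directions):
    """True if a is no worse than b on every objective and better on at least one."""
    strictly_better = False
    for name, direction in directions.items():
        x, y = a[name], b[name]
        if direction == "minimize":
            x, y = -x, -y  # flip sign so that larger is always better
        if x < y:
            return False
        if x > y:
            strictly_better = True
    return strictly_better


def pareto_front(options, directions):
    """Names of the options that no other option dominates."""
    return [
        name
        for name, scores in options.items()
        if not any(
            dominates(other, scores, directions)
            for other_name, other in options.items()
            if other_name != name
        )
    ]


directions = {"cost": "minimize", "ml_capability": "maximize"}
options = {
    "aws": {"cost": 0.7, "ml_capability": 0.8},
    "gcp": {"cost": 0.6, "ml_capability": 0.95},
    "azure": {"cost": 0.65, "ml_capability": 0.75},
}
front = pareto_front(options, directions)
print(front, "gcp" in front)
```

A recommendation passes this check when it appears in the returned list; with more objectives or closer scores the front usually contains several options.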

📋 Full Audit Trail

Every reasoning step, vote, and belief update is recorded:

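# `result` is the value returned by tribunal.deliberate() in the Quick Start.
# EventType must be imported first; its module path is not shown in this README.
trail = result.audit_trail
trail.query(event_type=EventType.AGENT_REASONING, agent_id="bayesian_1")
trail.persist("audit_session.json")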
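
For readers who want a mental model of what such a trail might hold, the following is a minimal sketch of an in-memory event log with filtering and JSON persistence. Everything in it is hypothetical, including the `SketchTrail` and `Event` classes, the `record` method, and the stand-in `EventType` enum; it mirrors the shape of the calls above and is not Verdict's implementation.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class EventType(str, Enum):  # stand-in enum; Verdict's own members may differ
    AGENT_REASONING = "agent_reasoning"
    VOTE = "vote"


@dataclass
class Event:
    event_type: EventType
    agent_id: str
    payload: dict


@dataclass
class SketchTrail:
    events: list = field(default_factory=list)

    def record(self, event_type, agent_id, **payload):
        """Append one event; earlier events are never modified."""
        self.events.append(Event(event_type, agent_id, payload))

    def query(self, event_type=None, agent_id=None):
        """Return the events that match every filter that was given."""
        return [
            e for e in self.events
            if (event_type is None or e.event_type == event_type)
            and (agent_id is None or e.agent_id == agent_id)
        ]

    def persist(self, path):
        """Write all events to a JSON file."""
        with open(path, "w", encoding="utf-8") as fh:
            json.dump([asdict(e) for e in self.events], fh, indent=2)


trail = SketchTrail()
trail.record(EventType.AGENT_REASONING, "bayesian_1", note="updated posterior")
trail.record(EventType.VOTE, "devils_advocate_1", choice="aws")
hits = trail.query(event_type=EventType.AGENT_REASONING, agent_id="bayesian_1")
print(len(hits), hits[0].payload)
```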

📝 Decision Distillation

Compress multi-round deliberation into actionable executive memos:

print(result.memo.to_markdown())
# Outputs: recommendation, key arguments, dissent, risk assessment, next steps

Architecture

┌──────────────────────────────────────────────────────────┐
│                    verdict.Tribunal                       │
├──────────────────────────────────────────────────────────┤
│  Decision    →  Agents    →  Protocol  →  Verification   │
│  (problem)      (reason)     (deliberate)  (verify)      │
│                                             ↓            │
│                              Audit Trail + Distillation  │
└──────────────────────────────────────────────────────────┘

CLI

# Run a decision from a JSON file
verdict run decision.json --model openai/gpt-4o --agents bayesian,devils_advocate

# List available agents and protocols
verdict agents
verdict protocols

# Validate a decision file
verdict validate decision.json

Extensibility

Verdict is designed for extension at every layer:

from verdict.agents.base import ReasoningAgent, agent_registry

@agent_registry.register("my_agent")
class MyCustomAgent(ReasoningAgent):
    agent_type = "my_agent"

    async def reason(self, decision, context):
        # Your custom reasoning logic
        ...

Verdict supports custom agents, protocols, verification checks, audit exporters, and LLM middleware, all via a unified registry pattern.
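
The registry pattern itself is small enough to sketch in isolation. The code below shows one common way a decorator-based registry can work: a name-to-class mapping filled in at class definition time. The `Registry` class, its `create` method, and `EchoAgent` are hypothetical and say nothing about how `agent_registry` is built internally.

```python
import asyncio


class Registry:
    """Maps string names to classes via a decorator."""

    def __init__(self):
        self._items = {}

    def register(self, name):
        def decorator(cls):
            if name in self._items:
                raise ValueError(f"{name!r} is already registered")
            self._items[name] = cls
            return cls  # return the class unchanged so it stays usable directly
        return decorator

    def create(self, name, **kwargs):
        """Instantiate the class registered under the given name."""
        return self._items[name](**kwargs)


sketch_registry = Registry()


@sketch_registry.register("echo")
class EchoAgent:
    agent_type = "echo"

    async def reason(self, decision, context):
        # Placeholder logic: report which decision was seen.
        return {"agent": self.agent_type, "decision": decision}


agent = sketch_registry.create("echo")
outcome = asyncio.run(agent.reason("Cloud Provider Selection", context={}))
print(outcome)
```

Because lookup is by name, a CLI flag or configuration file can select an implementation without importing it explicitly.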

Model Support

Via litellm, Verdict works with 100+ LLM providers, including:

  • OpenAI (GPT-4o, o1, o3-mini)
  • Anthropic (Claude Sonnet 4, Claude Opus 4)
  • Google (Gemini 2.0 Flash, Gemini 2.5 Pro)
  • DeepSeek (DeepSeek-Chat, DeepSeek-Reasoner)
  • Local models via Ollama (Llama, Qwen, Mistral)
  • Azure OpenAI, AWS Bedrock, Google Vertex AI

Documentation

Full documentation: bingdongni.github.io/verdict

Contributing

Contributions are welcome. See CONTRIBUTING.md for guidelines.

Citation

@software{verdict2026,
  author = {bingdongni},
  title = {Verdict: Verifiable Multi-Agent Decision Reasoning Engine},
  url = {https://github.com/bingdongni/verdict},
  year = {2026},
}

License

Apache-2.0 — see LICENSE for details.

Download files

Source Distribution

verdict_engine-0.1.0.tar.gz (58.9 kB)

Uploaded Source

Built Distribution

verdict_engine-0.1.0-py3-none-any.whl (85.2 kB)

Uploaded Python 3

File details

Details for the file verdict_engine-0.1.0.tar.gz.

File metadata

  • Download URL: verdict_engine-0.1.0.tar.gz
  • Size: 58.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for verdict_engine-0.1.0.tar.gz
Algorithm Hash digest
SHA256 82c2c229eaa4ddc72263d29decd716c1981607bf16bf2ca822bf284e2dbb252d
MD5 e322fdd0356a027966a5aa8bd2ea626e
BLAKE2b-256 6ecbd4f5508835f14c29617f85039adc77d653fef717bf7086022b06621139ab

File details

Details for the file verdict_engine-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: verdict_engine-0.1.0-py3-none-any.whl
  • Size: 85.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for verdict_engine-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7029546bdfc4403ad89d9e076e13e98dea69d0c2be75d05b003504cdf5319676
MD5 8846760465365e7e9ffacc2143f169fb
BLAKE2b-256 891053f3c8c609fbf4a07e7c5abee936e465f86d98e9b5a1af1830f66aac0606
