Modular Python library for multi-agent collective decision-making via LLM expert panels

rc-verdict

Multi-agent collective decision-making via LLM expert panels.

rc-verdict assembles a panel of LLM "experts" — each wearing a distinct expertise hat with its own behavioral disposition — and drives them through a structured vote → debate → converge protocol mediated by a single overseer agent. Instead of trusting one model's first answer, you get a verdict with calibrated confidence, recorded dissent, and a fully auditable deliberation trace. It is a modular Python library (not a framework): plug in OpenAI, Anthropic, HuggingFace, Ollama — or the built-in MockBackend — and call it from any project.

📖 How the protocol works → docs/PROTOCOL.md


Installation

pip install rc-verdict                  # base package (includes MockBackend)
pip install "rc-verdict[openai]"        # or [anthropic], [huggingface], [ollama]
pip install "rc-verdict[dev]"           # development: testing, linting, coverage

Requires Python ≥ 3.11. Until the first PyPI release lands, install from source: pip install git+https://github.com/spidey99/rc-verdict


Quickstart

Run a full deliberation locally — no API keys needed:

import asyncio, json
from rc_verdict import Panel, VerdictConfig
from rc_verdict.backends.mock import MockBackend

vote = json.dumps({"position": "approve", "conviction": "high",
                   "evidence_quality": "high", "reasoning": "Analysis complete.",
                   "dissent_points": []})
config = VerdictConfig(default_backend="mock", min_panel_size=3, max_panel_size=3,
                       max_token_budget=50000, pool_experts_ratio=1.0)
panel = Panel(config, backend_factory=lambda provider="mock", **kw: MockBackend(responses=[vote] * 3))

result = asyncio.run(panel.deliberate("A caching PR with 92% test coverage.",
                                      "Should we merge this PR?"))
print(f"Decision: {result.decision}  Conviction: {result.confidence.conviction.value}")
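
Because MockBackend replays whatever strings it is given, a malformed canned vote can be hard to spot. The following is a minimal sketch of a pre-flight check, not part of rc-verdict: the helper `missing_vote_fields` is hypothetical, the field names are copied from the vote above, and treating every one of them as required is an assumption.

```python
import json

# Field names copied from the quickstart vote. Treating all of them as
# required is an assumption, not a documented rule of the library.
EXPECTED_VOTE_FIELDS = ("position", "conviction", "evidence_quality",
                        "reasoning", "dissent_points")


def missing_vote_fields(payload: str) -> list[str]:
    """Return the expected fields that are absent from a JSON-encoded vote."""
    data = json.loads(payload)
    return [name for name in EXPECTED_VOTE_FIELDS if name not in data]


complete = json.dumps({"position": "approve", "conviction": "high",
                       "evidence_quality": "high", "reasoning": "Analysis complete.",
                       "dissent_points": []})
partial = json.dumps({"position": "approve"})

print(missing_vote_fields(complete))  # []
print(missing_vote_fields(partial))
```

Running a check like this before building the response list keeps a typo in a canned vote from surfacing later as a confusing parse error.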

With a real backend it is one call — experts are selected and generated for your question automatically:

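import asyncio
from rc_verdict import deliberate

async def main() -> None:
    result = await deliberate("Should we open-source our ETL tool?")  # uses OPENAI_API_KEY
    print(result.decision)
    print(result.trace.to_markdown(verdict=result.verdict))  # full deliberation trace

asyncio.run(main())  # a bare top-level await only works in a notebook or async REPL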

See examples/ for further runnable scripts, including custom experts and multi-backend routing.


Core Architecture

┌─────────────────────────────────────────┐
│              CALLER / HOST APP          │
│  verdict = await panel.deliberate(input)│
└────────────────┬────────────────────────┘
                 │
        ┌────────▼────────┐
        │    OVERSEER     │  Mediates all rounds
        │  (single agent) │  Controls escalation
        └────────┬────────┘
                 │
     ┌───────────┼───────────┐
     │           │           │
 ┌───▼───┐  ┌───▼───┐  ┌───▼───┐
 │Expert │  │Expert │  │Expert │   3-7 panelists
 │  #1   │  │  #2   │  │  #3   │   Diverse hats
 │ "hat" │  │ "hat" │  │ "hat" │   + dispositions
 └───────┘  └───────┘  └───────┘
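
The fan-out in the diagram can be illustrated without any LLM at all. The sketch below is a toy model, not rc-verdict code: the `Expert` class, its `vote` method, and `overseer_round` are hypothetical names, and each expert's answer is hard-wired to its disposition where a real panelist would call a model backend.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Expert:
    """Hypothetical panelist: a 'hat' (expertise) plus a disposition."""
    hat: str
    disposition: str

    async def vote(self, question: str) -> dict:
        # Stand-in for an LLM call; a real expert would query a backend here.
        await asyncio.sleep(0)
        position = "reject" if self.disposition == "skeptical" else "approve"
        return {"expert": self.hat, "position": position}


async def overseer_round(experts: list[Expert], question: str) -> list[dict]:
    """One mediated round: every expert votes independently and concurrently."""
    return await asyncio.gather(*(e.vote(question) for e in experts))


panel = [
    Expert("security", "skeptical"),
    Expert("performance", "pragmatic"),
    Expert("maintainability", "optimistic"),
]
votes = asyncio.run(overseer_round(panel, "Should we merge this PR?"))
for v in votes:
    print(v["expert"], "->", v["position"])
```

`asyncio.gather` returns results in the order the experts were passed in, so the overseer can match each vote to its panelist without extra bookkeeping.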

Protocol Flow

  1. Expert Selection — Overseer analyzes input context and selects relevant experts from:

    • A pool of pre-canned experts (YAML-defined domain specialists)
    • Dynamically generated experts — model-derived hats + dispositions tailored to the input
  2. Vote Round — Each expert independently casts a confidence-weighted vote

  3. Unanimity Check — All agree → verdict returned immediately (cheap path)

  4. Debate Round — If not unanimous:

    • Each expert's reasoning is shared with all others
    • Experts process peer reasoning and update positions
    • Overseer may inject clarifying questions
  5. Re-Vote — Experts vote again with updated reasoning

  6. Convergence — If still stuck after N rounds:

    • Elimination — Remove the expert with lowest confidence
    • Supermajority — Accept at ≥80% agreement threshold
    • Overseer Override — Synthesize a verdict from all reasoning
    • Deadlock — Return structured disagreement with all positions
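
The convergence steps above can be sketched as a small pure function. This is an illustration under stated assumptions rather than the library's implementation: the `Vote` record and both function names are hypothetical, confidence is modeled as a float in [0, 1], the panel is assumed non-empty, and the debate, re-vote, and overseer-override steps are left out because they need a model call.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """Hypothetical vote record; field names are illustrative only."""
    expert: str
    position: str
    confidence: float  # 0.0 (no confidence) to 1.0 (certain)


def leading_position(votes: list[Vote]) -> tuple[str, float]:
    """Return the most common position and the share of experts holding it."""
    position, count = Counter(v.position for v in votes).most_common(1)[0]
    return position, count / len(votes)


def converge(votes: list[Vote], threshold: float = 0.8) -> tuple[str, str]:
    """Walk the ladder: unanimity, supermajority, elimination, then deadlock."""
    remaining = list(votes)
    position, share = leading_position(remaining)
    if share == 1.0:
        return position, "unanimous"  # cheap path: no debate needed
    eliminated = False
    while True:
        if share >= threshold:
            label = "supermajority after elimination" if eliminated else "supermajority"
            return position, label
        if len(remaining) <= 2:
            return "no verdict", "deadlock"
        # Elimination: drop the least confident expert and re-check.
        remaining.remove(min(remaining, key=lambda v: v.confidence))
        eliminated = True
        position, share = leading_position(remaining)


split_panel = [Vote("a", "approve", 0.9), Vote("b", "approve", 0.8),
               Vote("c", "reject", 0.2)]
print(converge(split_panel))  # ('approve', 'supermajority after elimination')
```

With two of three experts agreeing, the 80% threshold is not met at first. Removing the least confident dissenter leaves a unanimous remainder, which is how the ladder stays cheap when disagreement is shallow.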

Key Differentiators

| Feature | rc-verdict | Typical MAD (multi-agent debate) | Why it matters |
|---|---|---|---|
| Expert pool (pre-canned + dynamic) | Yes | | Right experts for the problem |
| Escalation ladder (vote→debate→eliminate) | Yes | | Cheap when agreement is easy |
| Overseer-mediated rounds | Yes | Partial | Prevents drift, controls cost |
| Confidence-gated elimination | Yes | | Novel convergence mechanism |
| Modular library (not a framework) | Yes | | Call from any project |
| Heterogeneous model backends | Yes | Rare | Diversity > count (empirically supported) |

Research Foundation

Design informed by 30+ papers (2023–2026) on multi-agent deliberation:

  • Debate-or-Vote (NeurIPS 2025): Voting alone captures most MAD gains; without bias correction, debate behaves as a martingale, so it does not improve expected accuracy on its own
  • Diversity of Thought (2024): Diverse model families outperform N copies of the same model
  • MARS (2025): Meta-reviewer pattern achieves MAD accuracy at 50% token cost
  • Demystifying MAD (Jan 2026): Confidence-modulated debate + diversity-aware initialization are the two interventions that actually work
  • MachineSoM (EMNLP 2023): LLM agents exhibit human-like social dynamics; personality traits affect collaboration

More Resources

  • examples/ -- Runnable scripts for common use cases
  • docs/PROTOCOL.md -- Protocol state machine and convergence strategies
  • docs/ -- Signal capture protocol and design documentation
  • LICENSE -- MIT

Status

🚧 Alpha — Core protocol working. All backends (OpenAI, Anthropic, HuggingFace, Ollama, Mock) implemented. See KICKOFF_PROMPT.md for the full implementation spec and RELEASE_CHECKLIST.md for the path to PyPI.

License

MIT
