Multi-model deliberation CLI. 4 frontier LLMs debate with rotating challenger, then Claude judges.

Consilium

Multi-model deliberation CLI. 4 frontier LLMs debate a question, then Claude judges and synthesizes.

consilium (Latin): counsel, deliberation, plan

Inspired by Andrej Karpathy's LLM Council, with an added blind phase (anti-anchoring), explicit engagement requirements, a rotating challenger role, and a social calibration mode.

Models

Council (deliberators):

  • GPT (gpt-5.2-pro)
  • Gemini (gemini-3-pro-preview)
  • Grok (grok-4)
  • Kimi (kimi-k2.5)

Judge: Claude Opus 4.5 (synthesizes and adds its own perspective)

Installation

pip install consilium

Or with uv:

uv tool install consilium

Setup

Set your OpenRouter API key:

export OPENROUTER_API_KEY=sk-or-v1-...

Optional fallback keys (for flaky models):

export GOOGLE_API_KEY=AIza...      # Gemini fallback
export MOONSHOT_API_KEY=sk-...     # Kimi fallback

Usage

# Basic question
consilium "Should we use microservices or monolith?"

# With social calibration (for interview/networking questions)
consilium "What questions should I ask in the interview?" --social

# With persona context
consilium "Should I take the job?" --persona "builder who hates process work"

# Multiple rounds
consilium "Architecture decision" --rounds 3

# Save transcript
consilium "Career question" --output transcript.md

# Share via GitHub Gist
consilium "Important decision" --share

# List past sessions
consilium --sessions

All sessions are auto-saved to ~/.consilium/sessions/ for later review.
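
To browse saved sessions from a script rather than through `consilium --sessions`, something like the following might work. It is only a sketch: `recent_sessions` is a hypothetical helper, not part of consilium, and it assumes nothing about session file names or format beyond the directory given above.

```python
from pathlib import Path
from typing import List


def recent_sessions(directory: Path, limit: int = 5) -> List[Path]:
    """Return up to `limit` files from `directory`, newest first.

    Hypothetical helper: it only inspects modification times and makes
    no assumption about how consilium names or formats session files.
    """
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Directory taken from the README; prints nothing if it does not exist.
    for path in recent_sessions(Path.home() / ".consilium" / "sessions"):
        print(path.name)
```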

Options

Flag                Description
--rounds N          Number of deliberation rounds (default: 1, exits early on consensus)
--output FILE       Save transcript to file
--named             Let models see real names during deliberation (may increase bias)
--no-blind          Skip blind first-pass (faster, but first speaker anchors others)
--context TEXT      Context hint for judge (e.g., "architecture decision")
--share             Upload transcript to secret GitHub Gist
--social            Enable social calibration mode (auto-detected for interview/networking)
--persona TEXT      Context about the person asking
--challenger MODEL  Which model starts as challenger (gpt/gemini/grok/kimi). Rotates each round.
--domain DOMAIN     Regulatory domain context (banking, healthcare, eu, fintech, bio)
--followup          Enable interactive drill-down after judge synthesis
--practical         Actionable rules only, no philosophy
--quiet             Suppress progress output
--sessions          List recent saved sessions
--no-save           Don't auto-save transcript to ~/.consilium/sessions/

How It Works

Blind First-Pass (Anti-Anchoring):

  1. All models generate short "claim sketches" independently and in parallel
  2. This prevents the "first speaker lottery" where whoever speaks first anchors the debate
  3. Each model commits to an initial position before seeing any other responses
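
The blind phase can be pictured as a parallel fan-out in which no call receives another model's output. The sketch below is illustrative only and does not reflect consilium's internals: `blind_first_pass`, `fake_ask`, and the prompt wording are all hypothetical, and `fake_ask` stands in for a real LLM call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def blind_first_pass(
    question: str,
    models: List[str],
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Collect one claim sketch per model, with no model seeing another's answer.

    `ask(model, prompt)` is a hypothetical stand-in for a real LLM call.
    """
    prompt = f"In 2-3 sentences, state your initial position on: {question}"
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        # Every call gets the same prompt and nothing else, so the order
        # in which models finish cannot influence any other answer.
        futures = {model: pool.submit(ask, model, prompt) for model in models}
        return {model: future.result() for model, future in futures.items()}


def fake_ask(model: str, prompt: str) -> str:
    # Stub so the sketch runs without network access.
    return f"{model} sketch"


claims = blind_first_pass(
    "Should we use microservices or monolith?",
    ["gpt", "gemini", "grok", "kimi"],
    fake_ask,
)
```

Only after all sketches are collected would they be shared with the council, which is what removes the advantage of speaking first.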

Deliberation Protocol:

  1. All models see everyone's blind claims, then deliberate
  2. Each model MUST explicitly AGREE, DISAGREE, or BUILD ON previous speakers by name
  3. After each round, the system checks for consensus (3/4 non-challengers agreeing triggers early exit)
  4. Judge synthesizes the full deliberation
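
One way to picture the engagement requirement in step 2 is a check that scans a reply for an explicit stance aimed at a named speaker. This is a rough sketch, not how consilium enforces the rule: the regular expression, the `find_engagements` helper, and the assumption that speakers are referred to as "Speaker N" are all illustrative.

```python
import re
from typing import List, Tuple

# Matches e.g. "AGREE with Speaker 1", "disagree with Speaker 3", "BUILD ON Speaker 2".
ENGAGEMENT = re.compile(
    r"\b(AGREE|DISAGREE|BUILD ON)\b(?:\s+with)?\s+(Speaker \d+)",
    re.IGNORECASE,
)


def find_engagements(response: str) -> List[Tuple[str, str]]:
    """Return (stance, speaker) pairs found in one model's reply."""
    return [
        (stance.upper(), speaker.title())
        for stance, speaker in ENGAGEMENT.findall(response)
    ]


reply = (
    "I AGREE with Speaker 1 on cost, but DISAGREE with Speaker 3 "
    "about latency. I BUILD ON Speaker 2 by adding a migration path."
)
engagements = find_engagements(reply)
# A reply that engages with nobody yields an empty list.
unengaged = find_engagements("Monoliths are simpler to operate.")
```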

Rotating Challenger:

  • One model each round is assigned the "challenger" role
  • The challenger MUST argue the contrarian position and identify weaknesses in emerging consensus
  • Role rotates each round (GPT R1 → Gemini R2 → Grok R3 → Kimi R4...) to ensure sustained disagreement
  • Challenger is excluded from consensus detection (forced disagreement shouldn't block early exit)
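
The rotation and the challenger's exclusion from consensus detection could be sketched as follows. The function names, the representation of positions as plain strings, and the `needed` threshold parameter are assumptions made for illustration; the actual consensus check in consilium may work differently.

```python
from typing import Dict

COUNCIL_ORDER = ["gpt", "gemini", "grok", "kimi"]


def challenger_for_round(round_number: int, first: str = "gpt") -> str:
    """Return the challenger for a 1-based round, rotating through the council."""
    start = COUNCIL_ORDER.index(first)
    return COUNCIL_ORDER[(start + round_number - 1) % len(COUNCIL_ORDER)]


def has_consensus(positions: Dict[str, str], challenger: str, needed: int) -> bool:
    """True when at least `needed` non-challengers hold the same position."""
    counts: Dict[str, int] = {}
    for model, position in positions.items():
        if model == challenger:
            continue  # forced disagreement must not block early exit
        counts[position] = counts.get(position, 0) + 1
    return max(counts.values(), default=0) >= needed


rotation = [challenger_for_round(r) for r in range(1, 6)]

positions = {
    "gpt": "monolith",
    "gemini": "monolith",
    "grok": "monolith",
    "kimi": "microservices",  # kimi is the challenger in this example
}
agreed = has_consensus(positions, challenger="kimi", needed=3)
```

With `first="grok"`, mirroring `--challenger grok`, the same rotation would start at Grok and continue with Kimi, then wrap around to GPT.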

Anonymous Deliberation:

  • Models see each other as "Speaker 1", "Speaker 2", etc. during deliberation
  • Prevents models from playing favorites based on vendor reputation
  • Output transcript shows real model names for readability
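
A minimal sketch of the anonymization idea: assign each model a neutral alias for the deliberation, then map the aliases back to real names when the transcript is rendered. `assign_aliases` and `reveal` are hypothetical helpers, not consilium functions, and the alias order here simply follows list order.

```python
import re
from typing import Dict, List


def assign_aliases(models: List[str]) -> Dict[str, str]:
    """Map each model name to a neutral alias such as "Speaker 1"."""
    return {model: f"Speaker {i}" for i, model in enumerate(models, start=1)}


def reveal(text: str, aliases: Dict[str, str]) -> str:
    """Replace aliases with real model names for the final transcript."""
    by_alias = {alias: model for model, alias in aliases.items()}
    # Match the whole number so "Speaker 10" is never read as "Speaker 1".
    return re.sub(
        r"Speaker \d+",
        lambda m: by_alias.get(m.group(0), m.group(0)),
        text,
    )


aliases = assign_aliases(["GPT", "Gemini", "Grok", "Kimi"])
readable = reveal("Speaker 2 disagrees with Speaker 4.", aliases)
```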

When to Use

Use the council when:

  • Making an important decision that benefits from diverse perspectives
  • You want models to actually debate, not just answer in parallel
  • You need a synthesized recommendation, not raw comparison
  • Exploring trade-offs where different viewpoints matter

Skip the council when:

  • You're just thinking out loud (exploratory discussions)
  • The answer depends on personal preference more than objective trade-offs
  • Speed matters (council takes 60-90 seconds)

Python API

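import os

from consilium import run_council, COUNCIL

# Read the OpenRouter key configured in Setup; raises KeyError if it is unset.
api_key = os.environ["OPENROUTER_API_KEY"]

transcript, failed_models = run_council(
    question="Should we use microservices or monolith?",
    council_config=COUNCIL,
    api_key=api_key,
    rounds=2,
    verbose=True,
    social_mode=False,
)

print(transcript)

# Report any models that failed during the run.
if failed_models:
    print("Failed models:", failed_models)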

License

MIT
