Achieve consensus across multiple LLMs through an AI judge coordinator

LLM Ensemble Banner

A Python library for achieving consensus across multiple Agents.

Features

  • Consensus: A judge (moderator) iteratively coordinates multiple LLMs until they reach consensus
  • Works with any chat-model API supported by LangGraph, such as OpenAI, Anthropic, Gemini, and Grok
  • Supports web search for real-time data

Installation

uv add multi-llm-consensus

Or install from source:

git clone https://github.com/zzzrbx/llm-ensemble.git
cd llm-ensemble
uv sync

Environment Setup

Create a .env file with your API keys, for example:

OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_API_KEY=your_google_key
XAI_API_KEY=your_xai_key
TAVILY_API_KEY=your_tavily_key  # For web search

You must provide at least two API keys for the models you want to use in the ensemble.

How It Works

User Query → Judge (configurable, default: Claude Opus 4.5)
    ↓
Judge calls run_llms tool
    ├── Model A (parallel)
    ├── Model B (parallel)
    ├── Model C (parallel)
    └── Model D (parallel)
         ↓
Judge analyzes responses
    ├── Consensus? → Return answer
    └── No consensus? → Refine query and call run_llms again
         ↓
Repeat until consensus or limit reached
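The loop above can be sketched in plain Python. This is a hypothetical illustration of the control flow only; `judge` and `models` stand in for the real LLM calls, and the actual library drives the loop through the judge's `run_llms` tool:

```python
# Sketch of the judge's coordination loop (illustrative, not the library's code).
def run_consensus(judge, models, query, max_iterations=5):
    prompt = query
    for _ in range(max_iterations):
        # run_llms step: fan the current prompt out to every model in parallel
        responses = [model(prompt) for model in models]
        verdict = judge(prompt, responses)
        if verdict["consensus"]:
            return verdict["final_answer"]
        # No consensus: the judge folds agreements and disagreements
        # into a refined prompt for the next round
        prompt = verdict["refined_query"]
    return None  # iteration limit reached without consensus
```

With a toy judge that declares consensus once all responses match, the loop terminates on the first unanimous round and returns `None` if the models never converge.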

Key Features:

  • Dynamic queries - The judge crafts a different prompt each iteration:
    • Iteration 1: Sends the initial question with research instructions
    • Iteration 2+: Summarizes agreements, highlights disagreements, requests refinements
    • Final iteration: Presents a refined consensus statement for confirmation
  • Error handling - Returns default values when a timeout occurs or the tool-call limit is reached

Tools currently available for LLMs:

  • search_the_web - Tavily web search for current events and factual data
  • add, subtract, multiply, divide - Math operations
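The math tools are simple enough to sketch as plain functions (names taken from the list above; the library presumably wraps them as LangChain tools so the models can call them):

```python
# Plain-Python versions of the four math tools.
def add(a: float, b: float) -> float:
    return a + b

def subtract(a: float, b: float) -> float:
    return a - b

def multiply(a: float, b: float) -> float:
    return a * b

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("division by zero")
    return a / b
```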

Examples

Example 1: With structured output

from typing import TypedDict
from llm_ensemble import Consensus

class UserSchema(TypedDict):
    consensus: bool
    final_answer: str
    notes: str

consensus = Consensus(
    models=[
        "openai:gpt-5-mini",
        "google_genai:gemini-3-flash-preview",
        "anthropic:claude-3-5-haiku-20241022",
        "xai:grok-3-mini",
    ],
    response_schema=UserSchema
)

result = consensus.invoke(
    "If survival is arbitrary, is moral judgment arbitrary too?"
)

print(f"Consensus: {result['consensus']}")
print(f"Answer: {result['final_answer']}")
print(f"Notes: {result['notes']}")

Example 2: No structured output, with web search enabled (just mention the search in your prompt)

from llm_ensemble import Consensus

# No response_schema - returns full agent result
consensus = Consensus(
    models=[
        "openai:gpt-5-mini",
        "google_genai:gemini-3-flash-preview",
        "anthropic:claude-3-5-haiku-20241022",
        "xai:grok-3-mini",
    ]
)

result = consensus.invoke(
    "What are the latest developments in quantum computing?\n\n"
    "Use the web search to research current news and breakthroughs."
)

# Access full agent result
print(result['messages'][-1].content)

Debugging and Observability

The library integrates with LangSmith for trace observability. Set LANGSMITH_API_KEY and LANGSMITH_PROJECT in your .env file to enable tracing.
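For example, added to the same .env file (key names follow LangSmith's standard environment variables; the project name here is an arbitrary placeholder):

```
LANGSMITH_API_KEY=your_langsmith_key
LANGSMITH_PROJECT=llm-ensemble
```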

License

MIT License

Contributing

Contributions are welcome! Please open an issue or submit a pull request.
