Duelyst.ai Core
Open-source AI debate engine — multi-model adversarial debates with LangGraph orchestration.
Two AI models argue, challenge each other, and converge toward a synthesis. The result is higher-quality analysis than any single model produces alone.
Why
Large language models have two well-known failure modes:
- Sycophancy — a single model tends to agree with the user rather than push back
- Hallucination — fabricated facts go unchallenged when there's no adversary
When two models debate adversarially, they challenge claims, request evidence, and surface disagreements. The result is more rigorous, balanced analysis.
Install
pip install duelyst-ai-core
To enable Tavily-backed web search via --tools search, install the optional
search dependencies too:
pip install "duelyst-ai-core[search]"
That extra installs the modern langchain-tavily integration used by LangChain 1.x.
Set API keys for the models you want to use. For CLI usage, either export them
in your shell or put them in a local .env file. The CLI loads .env automatically
from the current working directory.
# Option 1: local .env file used by the CLI
cp .env.example .env
# Fill values as plain KEY=value lines. Quotes are optional for API keys.
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY=your-key-here
GOOGLE_API_KEY=your-key-here # optional
TAVILY_API_KEY=your-key-here # optional, for web search
# Option 2: export into the shell
export ANTHROPIC_API_KEY=your-key-here
export OPENAI_API_KEY=your-key-here
export GOOGLE_API_KEY=your-key-here # optional
export TAVILY_API_KEY=your-key-here # optional, for web search
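To make the KEY=value format above concrete, here is a minimal sketch of how such lines could be parsed. It is not the CLI's actual loader: parse_env_lines is a hypothetical helper, and the handling of quotes and trailing comments is an assumption about what a tolerant parser would accept.

```python
def parse_env_lines(text: str) -> dict[str, str]:
    """Parse KEY=value lines. Hypothetical helper, not the CLI's loader."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate shell-style "export KEY=value" lines as well.
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a KEY=value line
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            # Fully quoted value: keep the contents verbatim.
            value = value[1:-1]
        else:
            # Unquoted value: drop a trailing " # comment".
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


sample = """
# Option 1: local .env file used by the CLI
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY="your-key-here"
TAVILY_API_KEY=your-key-here # optional, for web search
"""
parsed = parse_env_lines(sample)
print(parsed)
```

Both the quoted and unquoted forms yield the same value here, which matches the note that quotes are optional for API keys.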
Quick Start
CLI
# Basic debate — low-cost defaults (Claude Haiku + GPT mini)
duelyst debate "Should startups use microservices or monoliths?"
# Choose models
duelyst debate "Rust vs Go for backend" --model-a claude-sonnet --model-b gpt-5
# Custom instructions
duelyst debate "Will AI replace software engineers by 2030?" \
  --model-a claude-haiku \
  --model-b gemini-flash \
  --instructions-a "Defend the position that AI will replace most jobs" \
  --instructions-b "Defend the position that AI will augment, not replace" \
  --rounds 5
# Enable web search for real-time evidence
duelyst debate "Bitcoin price prediction 2026" \
  --model-a claude-haiku --model-b gpt-mini --tools search
# From source, include the optional search dependencies in your environment first
uv sync --group dev --extra search
# Output formats
duelyst debate "..." --output markdown > debate.md
duelyst debate "..." --output json > debate.json
Python API
import asyncio
from duelyst_ai_core import DebateConfig, DebateOrchestrator, ModelConfig
config = DebateConfig(
    topic="Should startups use microservices or monoliths?",
    model_a=ModelConfig(provider="anthropic", model_id="claude-haiku-4-5"),
    model_b=ModelConfig(provider="openai", model_id="gpt-5.4-mini"),
    max_rounds=5,
)
orchestrator = DebateOrchestrator(config)
result = asyncio.run(orchestrator.graph.ainvoke({
    "config": config,
    "turns": [],
    "current_round": 0,
    "current_agent": "a",
    "convergence_history": [],
    "status": "running",
    "synthesis": None,
    "error": None,
}))
# result["synthesis"] contains the judge's balanced analysis
# result["turns"] contains the full debate transcript
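If you invoke the graph from several places, the initial state literal above is easy to get subtly wrong. The sketch below wraps it in a small function. build_initial_state is a hypothetical helper written for this README, not part of the package's public API; it only reproduces the keys and starting values from the example above.

```python
from typing import Any


def build_initial_state(config: Any) -> dict[str, Any]:
    """Return a fresh starting state with the keys used in the example above.

    Hypothetical convenience helper; the library may offer its own entry point.
    """
    return {
        "config": config,
        "turns": [],                # filled in as debaters take turns
        "current_round": 0,
        "current_agent": "a",       # side A opens the debate
        "convergence_history": [],
        "status": "running",
        "synthesis": None,          # set once the judge has run
        "error": None,
    }


# A placeholder stands in for a real DebateConfig so the sketch runs on its own.
state = build_initial_state(config={"topic": "placeholder"})
print(sorted(state))
```

Each call returns new list objects, so two debates never share a transcript by accident.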
Streaming Events
Stream real-time events as the debate progresses using arun_with_events():
import asyncio
from duelyst_ai_core import DebateConfig, DebateOrchestrator, ModelConfig
config = DebateConfig(
topic="Will AI replace software engineers by 2030?",
model_a=ModelConfig(provider="anthropic", model_id="claude-haiku-4-5"),
model_b=ModelConfig(provider="openai", model_id="gpt-5.4-mini"),
)
async def main():
    orchestrator = DebateOrchestrator(config)
    async for event in orchestrator.arun_with_events():
        print(f"{event.event}: {event}")

asyncio.run(main())
Events emitted: debate_started, round_started, turn_started, turn_completed,
convergence_update, synthesis_started, synthesis_completed, debate_completed.
For custom consumers (e.g., SSE relay), implement the DebateEventCallback protocol:
from duelyst_ai_core import DebateEventCallback, DebateOrchestrator
class MyCallback:
    async def on_event(self, event):
        # SSE, WebSocket, logging, etc.
        print(event.model_dump_json())

# config is a DebateConfig built as in the examples above
orchestrator = DebateOrchestrator(config, callback=MyCallback())
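As an illustration of the SSE relay use case, the following sketch shows a callback that turns each event into a Server-Sent Events frame and queues it for a web handler to drain. It runs without the package installed: StubEvent is a stand-in defined here (with an assumed round field), and SSEQueueCallback is a hypothetical name. It relies only on the event attribute and model_dump_json() method used in the snippets above.

```python
import asyncio
import json
from dataclasses import asdict, dataclass


@dataclass
class StubEvent:
    """Stand-in for a debate event; the real event types come from the library."""

    event: str
    round: int

    def model_dump_json(self) -> str:
        return json.dumps(asdict(self))


class SSEQueueCallback:
    """Formats each event as an SSE frame and queues it for a consumer."""

    def __init__(self) -> None:
        self.frames = asyncio.Queue()

    async def on_event(self, event) -> None:
        # An SSE frame is "event:" and "data:" lines terminated by a blank line.
        frame = f"event: {event.event}\ndata: {event.model_dump_json()}\n\n"
        await self.frames.put(frame)


async def demo() -> list[str]:
    callback = SSEQueueCallback()
    await callback.on_event(StubEvent(event="round_started", round=1))
    await callback.on_event(StubEvent(event="debate_completed", round=1))
    # Drain everything that was queued so far.
    return [callback.frames.get_nowait() for _ in range(callback.frames.qsize())]


frames = asyncio.run(demo())
print(frames[0], end="")
```

In a real server, the HTTP handler would await frames.get() in a loop and write each frame to the response stream, while an instance of the callback is passed to DebateOrchestrator as shown above.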
See examples/ for more complete examples.
Supported Models
| Alias | Provider | Model |
|---|---|---|
| claude-haiku | Anthropic | Claude Haiku 4.5 |
| claude-sonnet | Anthropic | Claude Sonnet 4.6 |
| claude-opus | Anthropic | Claude Opus 4.6 |
| gpt-mini | OpenAI | GPT-5.4 mini |
| gpt-5 | OpenAI | GPT-5.4 |
| gpt-nano | OpenAI | GPT-5.4 nano |
| gemini-flash | Google | Gemini 2.5 Flash |
| gemini-flash-lite | Google | Gemini 2.5 Flash-Lite |
| gemini-pro | Google | Gemini 2.5 Pro |
Use aliases with --model-a / --model-b, or pass full model IDs with provider prefix.
CLI defaults favor cheaper test runs: claude-haiku + gpt-mini, with gemini-flash
as the auto-selected Google judge default. Legacy OpenAI aliases such as gpt-4o
and gpt-4.1 remain available for compatibility.
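Conceptually, alias handling is a lookup from a short name to a provider and model. The sketch below mirrors the table above using display names only; it is not the package's registry, and lookup_alias and ALIASES are hypothetical names. The exact model IDs each alias expands to are not listed here, so the sketch does not guess them.

```python
# Alias -> (provider, model display name), copied from the table above.
ALIASES: dict[str, tuple[str, str]] = {
    "claude-haiku": ("Anthropic", "Claude Haiku 4.5"),
    "claude-sonnet": ("Anthropic", "Claude Sonnet 4.6"),
    "claude-opus": ("Anthropic", "Claude Opus 4.6"),
    "gpt-mini": ("OpenAI", "GPT-5.4 mini"),
    "gpt-5": ("OpenAI", "GPT-5.4"),
    "gpt-nano": ("OpenAI", "GPT-5.4 nano"),
    "gemini-flash": ("Google", "Gemini 2.5 Flash"),
    "gemini-flash-lite": ("Google", "Gemini 2.5 Flash-Lite"),
    "gemini-pro": ("Google", "Gemini 2.5 Pro"),
}


def lookup_alias(name: str) -> tuple[str, str]:
    """Return (provider, model) for an alias, or raise with the known aliases."""
    try:
        return ALIASES[name]
    except KeyError:
        known = ", ".join(sorted(ALIASES))
        raise ValueError(f"Unknown model alias {name!r}; known: {known}") from None


print(lookup_alias("claude-haiku"))
print(lookup_alias("gpt-mini"))
```

Failing loudly with the list of known aliases keeps typos such as gpt-min from silently selecting a different model.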
How It Works
Debate Flow
START
  |
  v
init_debate ──> run_debater_a ──> run_debater_b ──> check_convergence
                     ^                                      |
                     |______________ continue ______________|
                                                            |
                                                      converged/max ──> run_judge ──> END
- Two agents debate — each powered by a different LLM, arguing from assigned (or auto-assigned) sides
- Each turn, agents receive the full debate history, reflect on the opponent's arguments, optionally use tools (web search) for evidence, and produce a structured response with a convergence score (0-10)
- Convergence detection — when both agents score >= threshold for N consecutive rounds, the debate converges naturally
- Judge synthesis — a third model (automatically selected from a different provider) analyzes the full transcript and produces a balanced synthesis
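The loop and the convergence rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the orchestrator's code: has_converged and run_debate are hypothetical names, the debaters are replaced by functions that return only a convergence score, and the status strings "converged" and "max_rounds" are labels chosen for this example.

```python
from typing import Callable

Scores = tuple[int, int]  # (score from agent A, score from agent B) for one round


def has_converged(history: list[Scores], threshold: int = 7,
                  required_rounds: int = 2) -> bool:
    """True when both agents scored >= threshold in the last N rounds."""
    if required_rounds <= 0 or len(history) < required_rounds:
        return False
    recent = history[-required_rounds:]
    return all(a >= threshold and b >= threshold for a, b in recent)


def run_debate(score_a: Callable[[int], int], score_b: Callable[[int], int],
               max_rounds: int = 5, threshold: int = 7,
               required_rounds: int = 2) -> tuple[str, list[Scores]]:
    """Alternate A and B each round, then check convergence or the round cap."""
    history: list[Scores] = []
    for round_number in range(1, max_rounds + 1):
        # run_debater_a, then run_debater_b, then check_convergence.
        history.append((score_a(round_number), score_b(round_number)))
        if has_converged(history, threshold, required_rounds):
            return "converged", history
    return "max_rounds", history


# Stub debaters whose agreement grows over time.
status, history = run_debate(
    score_a=lambda r: min(10, 4 + 2 * r),  # 6, 8, 10, ...
    score_b=lambda r: min(10, 5 + r),      # 6, 7, 8, ...
)
print(status, history)
```

With the defaults from the CLI (threshold 7, two consecutive rounds), round 2 is the first round where both scores reach the threshold, so the stub debate converges after round 3. In either outcome the next step in the real flow is the judge.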
Architecture
- Agents: thin wrappers around LangChain's create_agent with structured output (AgentResponse, JudgeSynthesis)
- Orchestrator: LangGraph StateGraph managing turn alternation, convergence tracking, and judge invocation
- Model registry: factory functions mapping aliases to LangChain BaseChatModel instances
- Tools: optional LangChain tools (Tavily web search) that agents invoke autonomously during debates
Architecture Diagram
The diagram shows the outer debate orchestrator and the two create_agent()-based
sub-agents on the same image. The orchestrator portion matches the real LangGraph
topology exactly. The sub-agent boxes are intentionally conceptual: they summarize
the model/tool loop that LangChain builds internally so the parts owned by this
repository are easier to study.
For the full walkthrough of agents, state, events, callbacks, streaming, async execution, GitHub Actions, and PyPI publishing, see docs/ARCHITECTURE.md.
CLI Options
| Option | Short | Default | Description |
|---|---|---|---|
| --model-a | -a | claude-haiku | Model for side A |
| --model-b | -b | gpt-mini | Model for side B |
| --judge | -j | auto | Judge model (auto-selects different provider) |
| --instructions-a | | auto | Custom instructions for side A |
| --instructions-b | | auto | Custom instructions for side B |
| --rounds | -r | 5 | Maximum debate rounds |
| --threshold | | 7 | Convergence threshold (1-10) |
| --convergence-rounds | | 2 | Consecutive converged rounds needed |
| --tools | -t | none | Comma-separated tools: search |
| --output | -o | rich | Output format: rich, markdown, json |
| --verbose | -v | off | Show debug logging |
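The auto value for --judge picks a judge from a provider that neither debater uses. A minimal sketch of that idea follows. The function name pick_judge_provider and the preference order (Google first) are assumptions made for this example; the only behavior taken from this README is that the judge comes from a different provider and that the default Anthropic plus OpenAI pairing ends up with a Google judge.

```python
# Assumed preference order for the judge's provider; not taken from the source.
PROVIDERS = ("google", "anthropic", "openai")


def pick_judge_provider(provider_a: str, provider_b: str,
                        available: tuple[str, ...] = PROVIDERS) -> str:
    """Return the first available provider that neither debater uses."""
    for provider in available:
        if provider not in (provider_a, provider_b):
            return provider
    raise ValueError("No unused provider is available for the judge")


print(pick_judge_provider("anthropic", "openai"))
print(pick_judge_provider("google", "openai"))
```

The available parameter lets a caller restrict the choice to providers that actually have an API key configured.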
Configuration
All configuration is via environment variables. No API keys are ever hardcoded.
# Required — at least two providers for a debate
ANTHROPIC_API_KEY=your-key-here
OPENAI_API_KEY=your-key-here
# Optional
GOOGLE_API_KEY=your-key-here
TAVILY_API_KEY=your-key-here # enables --tools search
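Before starting a debate from your own script, it can help to check which providers are configured. The sketch below does that for the three provider variables listed above. configured_providers and check_debate_ready are hypothetical helpers, and treating a blank value as "not set" is an assumption.

```python
from collections.abc import Mapping

# Provider name -> environment variable, as listed above.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def configured_providers(env: Mapping[str, str]) -> list[str]:
    """Return the providers whose API key variable is set and non-blank."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var, "").strip()]


def check_debate_ready(env: Mapping[str, str]) -> list[str]:
    """Raise unless at least two providers are configured."""
    providers = configured_providers(env)
    if len(providers) < 2:
        raise RuntimeError(
            f"A debate needs at least two providers; found {providers or 'none'}"
        )
    return providers


# A sample mapping stands in for os.environ so the sketch runs anywhere.
sample_env = {"ANTHROPIC_API_KEY": "your-key-here", "OPENAI_API_KEY": "your-key-here"}
ready = check_debate_ready(sample_env)
print(ready)
```

In a real script, pass os.environ instead of the sample mapping.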
Development
# Clone and install
git clone https://github.com/duelyst-ai/duelyst-ai-core.git
cd duelyst-ai-core
uv sync --group dev
# Include optional Tavily web search support
uv sync --group dev --extra search
# Run tests
uv run pytest -v
# Lint and type check
uv run ruff check src tests
uv run ruff format --check src tests
uv run mypy src
# Run all checks
uv run ruff check src tests && uv run ruff format --check src tests && uv run mypy src && uv run pytest
CI and Release
GitHub Actions mirrors the same validation path used locally:
- .github/workflows/ci.yml runs Ruff, format checks, mypy, and pytest on pushes and pull requests.
- .github/workflows/publish.yml reruns the quality gate on version tags, verifies the tag matches pyproject.toml, builds the package, smoke-tests wheel install/import in a clean virtualenv, publishes to PyPI with trusted publishing, and creates a GitHub Release.
Release flow:
- Bump project.version in pyproject.toml.
- Run the local checks above.
- Push the commit to main.
- Push a tag like v0.1.0 to trigger the publish workflow.
Project Structure
src/duelyst_ai_core/
├── __init__.py # Public API exports
├── exceptions.py # DuelystError hierarchy
├── agents/
│ ├── debater.py # DebaterAgent (create_agent wrapper)
│ ├── judge.py # JudgeAgent (create_agent wrapper)
│ ├── prompts.py # System prompts and message builders
│ └── schemas.py # AgentResponse, JudgeSynthesis, DebateTurn, etc.
├── models/
│ └── registry.py # create_model(), resolve_alias(), get_judge_model()
├── orchestrator/
│ ├── engine.py # DebateOrchestrator (LangGraph StateGraph)
│ ├── state.py # OrchestratorState, DebateConfig, ModelConfig
│ ├── convergence.py # Convergence detection logic
│ ├── events.py # Streaming event types
│ └── callbacks.py # DebateEventCallback protocol + implementations
├── tools/
│ ├── __init__.py # create_search_tool(), is_search_available()
│ └── search.py # Tavily web search integration
├── formatters/
│ ├── __init__.py # Formatter exports
│ ├── base.py # Abstract BaseFormatter
│ ├── markdown.py # Markdown output
│ ├── json_fmt.py # JSON output
│ └── rich_terminal.py # Rich terminal output with panels and colors
└── cli/
├── main.py # Typer CLI entry point
├── display.py # Rich live display for debate progress
└── live_panel.py # Rich display callback for real-time updates
License