
Modern multi-agent CLI orchestrating multiple AI models via OpenRouter


🌐 konklave.info


🏛️ K O N K L A V E

A council of AI models that thinks, codes, reviews, and ships, together.


🚀 Releases June 15, 2026




Stop talking to one AI. Run a team.

Konklave orchestrates six specialized agents (Architect, Coder, Reviewer, Tester, Researcher, Auditor) across any mix of models from OpenRouter. They catch each other's mistakes so you don't have to.



🎬 Screenshots

Full pipeline run: Agents · Chat · Pipeline graph · Live cost tracking


Swarm mode: 6 Researchers running in parallel, all writing findings simultaneously


Architect planning a large feature: 6 Coders spawned in parallel


Web Dashboard: live runs, eval comparison, settings


Eval results in the terminal: solo vs council, head to head


✨ How It Works

Instead of one model doing everything, Konklave runs an actual pipeline of specialists:

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   📝 You describe the task                                      │
│          │                                                      │
│          ▼                                                      │
│   🏛️  Architect  ──spawns──►  🔍 Researcher × N (parallel)    │
│          │                                                      │
│          ▼  concrete plan                                       │
│   💻  Coder       (different model family; blind spots cancel)  │
│          │                                                      │
│          ▼  implementation                                      │
│   🔎  Reviewer  ──╮                                            │
│                   ├──► consensus vote                          │
│   🛡️  Auditor   ──╯                                            │
│          │                                                      │
│          ▼  approved                                            │
│   🧪  Tester     (runs real commands, reads real output)       │
│          │                                                      │
│          ▼                                                      │
│   ✅  Result                                                    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Each role runs on the model you choose. Mix providers freely: a DeepSeek coder with Anthropic reviewers means genuinely independent cross-checks, not just one model second-guessing itself.


✨ What's New in This Release

| Feature | Details |
| --- | --- |
| Model-diverse review panel | The review step now fields a third voter from a different model family (reviewer@google/gemini-2.5-pro by default). Independent models catch each other's blind spots instead of one model voting twice. |
| Deliberation round | On a split vote, each reviewer sees the others' full reasoning before casting a FINAL verdict. Unanimous panels pay nothing extra: the second round only fires on a real disagreement. |
| Image & video generation | New image_gen and video_gen tools let agents produce and save images/videos via any OpenAI-compatible generation API (DALL-E, Sora, …). Configure endpoint + model in settings. |
| Configurable from TUI/GUI | All three features above are tunable from the Settings screen; no YAML editing required. |
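The deliberation round described in the table can be pictured as a two-round vote. The sketch below is a guess at the shape of that logic, not Konklave's implementation: the `Voter` signature, the verdict strings, and the simple-majority rule for the final round are all assumptions.

```python
from collections import Counter
from typing import Callable

# A voter receives the artifact plus the other voters' reasoning (empty in
# round one) and returns (verdict, reasoning). Hypothetical signature.
Voter = Callable[[str, list[str]], tuple[str, str]]


def panel_vote(artifact: str, voters: list[Voter]) -> tuple[str, int]:
    """Return (winning verdict, number of rounds used)."""
    first = [v(artifact, []) for v in voters]
    verdicts = [verdict for verdict, _ in first]
    if len(set(verdicts)) == 1:
        return verdicts[0], 1  # unanimous: no second round is paid for

    # Split vote: each voter sees the others' reasoning, then votes again.
    final = []
    for i, voter in enumerate(voters):
        others = [reason for j, (_, reason) in enumerate(first) if j != i]
        final.append(voter(artifact, others)[0])
    winner, _ = Counter(final).most_common(1)[0]
    return winner, 2
```

With three voters and two possible verdicts the final round cannot tie, which is one reason an odd-sized panel is convenient.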

🧠 Six Specialized Roles

| Role | What it does |
| --- | --- |
| 🏛️ Architect | Reads your codebase, spawns parallel Researchers, writes a concrete step-by-step plan |
| 💻 Coder | Implements the plan; intentionally assigned a different model family than the reviewers |
| 🔎 Reviewer | Checks the implementation against the plan and your coding standards |
| 🧪 Tester | Runs actual commands and verifies the result with real evidence, not guesses |
| 🛡️ Auditor | Final pass: does the output actually solve the original request? |
| 🔍 Researcher | Lightweight parallel lookups, runs in swarms to gather context fast |

⚡ Work Modes

One flag to control how deep the pipeline goes:

konklave run "your task" --work very_fast    # ⚡ instant: straight to code, no research
konklave run "your task" --work fast         # 🚀 quick research + light review
konklave run "your task"                     # ⚖️  balanced default
konklave run "your task" --work deep         # 🔬 multi-round review and testing
konklave run "your task" --work autonomous   # 🤖 autonomous loop until PERFECT

| Mode | Speed | What runs |
| --- | --- | --- |
| very_fast | ⚡⚡⚡ | No research → code → done |
| fast | ⚡⚡ | One researcher, quick review |
| balanced | ⚖️ | Research → plan → code → review → test |
| deep | 🔬 | Multi-researcher, multiple review/test rounds |
| exhaustive | 🧬 | Exhaustive edge cases, trade-off exploration |
| autonomous | 🤖 | Loop until Auditor approves (with budget cap) |
| auto-endless | ♾️ | Loop until you stop it |

🌊 Swarm: Parallel Sub-Agents

Agents can spawn their own helpers for genuinely independent sub-tasks. Three Researchers run at once instead of sequentially. You control the aggressiveness:

:swarm off            →  each agent works alone
:swarm encouraged     →  agents use parallel helpers when it makes sense  (default)
:swarm aggressive     →  maximally fan out work to sub-agents
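To make the fan-out concrete, here is a minimal asyncio sketch of running several Researchers at once with a cap on parallelism. The `research` coroutine, the `max_parallel` parameter, and the topic strings are placeholders invented for the example; how Konklave actually schedules sub-agents is not shown in this README.

```python
import asyncio


async def research(topic: str) -> str:
    """Stand-in for one Researcher sub-agent (a real one would call a model)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"findings on {topic}"


async def swarm(topics: list[str], max_parallel: int = 3) -> list[str]:
    # The semaphore caps how many helpers run at once; gather keeps input order.
    gate = asyncio.Semaphore(max_parallel)

    async def limited(topic: str) -> str:
        async with gate:
            return await research(topic)

    return await asyncio.gather(*(limited(t) for t in topics))


findings = asyncio.run(swarm(["auth flow", "session store", "rate limits"]))
```

Raising `max_parallel` corresponds loosely to a more aggressive swarm setting: more helpers in flight, more tokens spent per unit of time.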

🎮 Human-in-the-Loop

Inject a message mid-run, steer the direction, or just watch. Colon commands work in the TUI at any time:

| Command | What it does |
| --- | --- |
| :say "focus on auth" | Inject a message into the active agent |
| :work autonomous | Switch work mode on the fly |
| :swarm aggressive | Turn up parallel agent spawning |
| :ask proactive | Make the Architect ask you more questions |
| :cost | Show live USD spend per agent |
| :models | Reassign any role to a different model |
| :exit | Clean stop at the next step boundary |

📦 Built-in Pipelines

Pipelines are plain YAML, no Python required. Write your own or use the built-ins:

🏗️  build_feature    →  research + plan + code + review + test + audit
⚡  quick            →  one-shot coder, no review overhead
♻️  refactor         →  architect-led refactor with before/after review
🐛  debug_issue      →  reproduction → root cause → fix → regression test
🔍  research         →  deep research swarm, no code changes
🤖  autonomous_loop  →  loop until done (budget-capped)

🏗️ Architecture

 You (TUI / CLI)
       │
       ▼
 ┌─────────────┐     ┌──────────────────────────────────────┐
 │  Conductor  │────►│  Pipeline YAML                       │
 │             │     │  • sequential steps                  │
 │             │     │  • parallel groups                   │
 │             │     │  • consensus votes                   │
 │             │     │  • loop steps                        │
 │             │     │  • human-approval gates              │
 └─────────────┘     └──────────────┬───────────────────────┘
                                    │
              ┌─────────────────────┼────────────────────┐
              ▼           ▼         ▼          ▼          ▼
        Architect      Coder    Reviewer    Tester    Auditor
        (Sonnet)    (DeepSeek)  (Sonnet)   (Haiku)   (Sonnet)
            │
            └──spawns──► Researcher × N
                         (parallel swarm)
                              │
                              ▼
                       OpenRouter API
                  (any model, any provider)

Key design decisions:

  • 🔒 Each agent has isolated conversation history: no cross-contamination
  • 🔄 Event Bus decouples UI, Conductor, agents, and tools asynchronously
  • 🛡️ Fallback models: if a model fails, the agent auto-retries on a backup
  • 💾 SQLite persistence: every step saved, resume from any checkpoint
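As an illustration of the fallback-model decision, the sketch below tries a list of models in order and returns the first successful reply. `ModelError`, `call_with_fallback`, and the injected `call` function are hypothetical names chosen for the example; Konklave's real retry policy (back-off, which errors count as failures) is not described here.

```python
from typing import Callable


class ModelError(RuntimeError):
    """Raised by the (hypothetical) model client when a call fails."""


def call_with_fallback(
    prompt: str,
    models: list[str],
    call: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    errors: list[str] = []
    for model in models:
        try:
            return model, call(model, prompt)
        except ModelError as exc:
            errors.append(f"{model}: {exc}")  # remember why this model was skipped
    raise ModelError("all models failed: " + "; ".join(errors))
```

Passing the client in as `call` keeps the retry logic testable without any network access.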

🤖 Default Model Panel

| Role | Model | Why |
| --- | --- | --- |
| 🏛️ Architect | anthropic/claude-sonnet-4.6 | Strong planning and tool use |
| 💻 Coder | deepseek/deepseek-chat-v3 | Fast, cheap, excellent at code |
| 🔎 Reviewer | anthropic/claude-sonnet-4.6 | Different family from coder → independent check |
| 🧪 Tester | anthropic/claude-haiku-4.5 | Fast and cheap for test gates |
| 🔍 Researcher | anthropic/claude-haiku-4.5 | High-volume parallel queries |
| 🛡️ Auditor | anthropic/claude-sonnet-4.6 | Strict final quality gate |

Swap any role to any model on OpenRouter: use :models in the TUI or konklave init.


📊 Benchmarks

We run Konklave against a suite of self-contained coding tasks (each with a hidden test file) and score by the fraction of tests that pass. To keep it fair, every configuration runs the exact same model, google/gemini-2.5-flash-lite, so the comparison measures the orchestration, not a model upgrade. Three configurations, head to head:

  • solo: minimal pipeline, straight to code (no review)
  • council_same: the full pipeline (architect → coder → reviewer → tester → auditor)
  • council_diverse: the full pipeline with the orchestration turned up (swarm high + autonomous work mode)

Latest run: 2026-06-07 · 5 tasks per config · all on gemini-2.5-flash-lite

| Configuration | Pass rate | Mean score | Mean cost | Mean time |
| --- | --- | --- | --- | --- |
| solo | 20% (1/5) | 54% | $0.0026 | 8s |
| council_same | 40% (2/5) | 83% | $0.018 | 5m 8s |
| council_diverse | 80% (4/5) | 98% | $0.072 | 57m |


Takeaway: with the same model in every seat, adding the multi-agent pipeline (review + test + audit) doubles the pass rate over a single call (solo 20% → council_same 40%), and turning the orchestration up (swarm high + autonomous looping) doubles it again to 80%, lifting the mean test score from 54% to 98% (n = 5 tasks per configuration). Because every configuration uses the same model, the gains here come from how the agents work together, not from a stronger model.
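The arithmetic behind these figures is easy to re-derive. The snippet below copies the pass counts and mean costs from the results table and computes the pass rate and the total spend per passing task; the dictionary layout and the `cost_per_pass` helper are just a convenient way to hold the numbers, not part of the eval harness.

```python
# Figures copied from the results table above (5 tasks per configuration).
results = {
    "solo":            {"passed": 1, "tasks": 5, "mean_cost": 0.0026},
    "council_same":    {"passed": 2, "tasks": 5, "mean_cost": 0.018},
    "council_diverse": {"passed": 4, "tasks": 5, "mean_cost": 0.072},
}


def pass_rate(row: dict) -> float:
    return row["passed"] / row["tasks"]


def cost_per_pass(row: dict) -> float:
    """Total spend across all tasks divided by the number of tasks that passed."""
    return row["mean_cost"] * row["tasks"] / row["passed"]


for name, row in results.items():
    print(f"{name:16s} pass rate {pass_rate(row):.0%}  "
          f"cost per passing task ${cost_per_pass(row):.3f}")
```

On these numbers a passing result costs about $0.013 with solo, $0.045 with council_same, and $0.090 with council_diverse, which is the trade-off the work modes let you dial.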

🤖 Model under test

All three configurations ran on a single, cheap model so no config has a model advantage:

| Configuration | Model | What's different |
| --- | --- | --- |
| solo | google/gemini-2.5-flash-lite | minimal pipeline, straight to code |
| council_same | google/gemini-2.5-flash-lite | full pipeline: review, test, audit |
| council_diverse | google/gemini-2.5-flash-lite | full pipeline + swarm high + autonomous work mode |

This isolates the value of the orchestration itself. In real use you can go further and assign a different model family per role (see Default Model Panel) so independent models catch each other's blind spots on top of these gains.

Cost and time scale with depth: deeper orchestration thinks longer and harder. Use the work modes to dial the trade-off for your task. Full per-task results: eval_results/.


💡 Why This Works: Buy Quality With Cheap Tokens

Here's the whole idea in one sentence: spend more cheap tokens instead of paying for an expensive model, and let it happen on its own.

A single call to a small, cheap model gets you a small, cheap answer (solo: 20% pass rate). But the same cheap model, when it's allowed to review its own work, run the tests, see the failures, fix them, and loop again autonomously, climbs to 80%. Same model. The only thing that changed is that it kept working.

  • 💸 Cheap tokens, premium results. A frontier model charges a premium per token for one shot. Konklave takes a model that costs a fraction of that and simply uses more of it: many small, cheap passes add up to an answer that can rival (or beat) the expensive one-shot, at a lower model tier.
  • 🤖 Fully autonomous, so you don't babysit it. You don't re-prompt, you don't paste error messages back in, you don't say "now write tests." The pipeline reviews, tests, audits, and re-tries by itself until the Auditor is satisfied (or the budget cap stops it). The extra token burn is automatic.
  • 🔁 More tokens are the feature, not a bug. Yes, council_diverse uses far more tokens than a single call. That's the point: those tokens are the quality. You're trading something cheap (tokens from a small model) for something valuable (a correct, tested result), and you set the ceiling with work modes and a budget cap.

Bottom line: don't pay for a smarter model; let a cheaper one think longer, automatically. More token burn, fully autonomous, better output.
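The "loop until the Auditor is satisfied or the budget cap stops it" behaviour can be sketched as a small control loop. Everything named here (`autonomous_loop`, the `attempt` callback, `max_rounds`) is an assumption for illustration; the real loop presumably tracks cost per agent and more state than this.

```python
from typing import Callable


def autonomous_loop(
    attempt: Callable[[int], tuple[bool, float]],
    budget_usd: float,
    max_rounds: int = 20,
) -> tuple[bool, int, float]:
    """Repeat attempt() until it is approved or the budget is used up.

    attempt(round_no) returns (approved, cost_of_this_round_in_usd).
    Returns (approved, rounds_run, total_spent_usd).
    """
    spent = 0.0
    rounds = 0
    # The cap is checked before each round, so the final round may overshoot it.
    while rounds < max_rounds and spent < budget_usd:
        rounds += 1
        approved, cost = attempt(rounds)
        spent += cost
        if approved:
            return True, rounds, spent
    return False, rounds, spent
```

A stricter variant would estimate the next round's cost up front and stop before starting a round it cannot afford.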


🚀 Installation

Requirements: Python 3.11+ · OpenRouter API key (free tier available)

# 1. Install
pip install konklave

# 2. First-time setup (language, API key, default models)
konklave init

# 3. Launch the interactive TUI
konklave

Windows: double-click start.bat, which activates the environment and launches Konklave automatically.

Desktop GUI (optional)

pip install "konklave[gui]"   # adds PySide6
konklave-gui                  # or: python -m konklave.gui

🏁 Quickstart

# Interactive TUI (recommended)
konklave

# One-shot headless run
konklave run "Add input validation to the login form"

# Pick a pipeline depth
konklave run "Refactor the payment module" --work deep

# Resume an interrupted session
konklave resume

# Test your OpenRouter connection
konklave ping

# Show current config
konklave config

⚙️ Configuration

konklave init      # re-run the setup wizard (API key, models, language)
konklave config    # print current settings and their config-file path

Config is stored per-platform, never inside the project folder:

| Platform | Path |
| --- | --- |
| Windows | %APPDATA%\konklave\config.toml |
| macOS | ~/Library/Application Support/konklave/config.toml |
| Linux | ~/.config/konklave/config.toml |

Your API key is stored in the OS keyring (Windows Credential Manager / macOS Keychain / Linux SecretService), never in a plain file.
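For scripts that need to locate the file, the per-platform paths listed above can be reproduced with the standard library. This is a sketch built only from that table: `config_path` and its parameters are made up for the example, and `konklave config --path` remains the authoritative answer on a given machine.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_path(platform: str, home: str, appdata: str = "") -> str:
    """Build the config.toml location for a sys.platform-style platform string.

    home is the user's home directory; appdata is the value of %APPDATA%
    and is only used on Windows.
    """
    if platform.startswith("win"):
        return str(PureWindowsPath(appdata) / "konklave" / "config.toml")
    if platform == "darwin":
        return str(
            PurePosixPath(home) / "Library" / "Application Support"
            / "konklave" / "config.toml"
        )
    # Everything else is treated as Linux-like.
    return str(PurePosixPath(home) / ".config" / "konklave" / "config.toml")
```

In a real script you would pass `sys.platform`, `os.path.expanduser("~")`, and `os.environ.get("APPDATA", "")`; taking them as arguments keeps the function easy to check on any OS.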

To start from custom settings, copy and edit the bundled template:

# find the correct path first
konklave config --path

# then copy config.example.toml from the repo there and edit it

config.example.toml (included in the repo) documents every available setting with inline comments, including model assignments, review-panel tuning, web search, semantic index, media generation, and dashboard options.


📊 Web Dashboard

A FastAPI + browser dashboard for everything that's hard to see in a terminal: live runs, full conversation history, eval comparisons, and settings.

# Launch the dashboard on http://localhost:8000
python -m uvicorn konklave.dashboard.app:app --host 127.0.0.1 --port 8000

Windows: double-click start_dashboard.bat; it sets up the environment, opens your browser, and starts the server.

  • 📜 Browse and export any past session (Markdown / JSON)
  • 📈 Compare eval runs side by side (the screenshot above)
  • ⚙️ Edit configuration and per-role models from the browser
  • 🔒 Binds to 127.0.0.1 only, with a same-origin CSRF guard, so it is local by default
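A same-origin guard of the kind mentioned in the last bullet usually compares the request's Origin (or Referer) header with the address the server is bound to. The function below is a generic sketch of that check, not the dashboard's code: `is_same_origin`, `SAFE_METHODS`, and the single `allowed_origin` default are assumptions.

```python
from urllib.parse import urlsplit

SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_same_origin(
    method: str,
    headers: dict[str, str],
    allowed_origin: str = "http://127.0.0.1:8000",
) -> bool:
    """Allow safe methods; for state-changing ones require a matching origin."""
    if method.upper() in SAFE_METHODS:
        return True
    origin = headers.get("Origin") or headers.get("Referer")
    if not origin:
        return False  # no origin information: reject rather than guess
    got, want = urlsplit(origin), urlsplit(allowed_origin)
    return (got.scheme, got.hostname, got.port) == (want.scheme, want.hostname, want.port)
```

A server reachable under both `localhost` and `127.0.0.1` would need to accept a small set of allowed origins rather than the single one used here.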

🧠 Persistent Memory

Konklave remembers across sessions. Two human-editable files plus full-text search over every past run:

| Store | What it holds |
| --- | --- |
| MEMORY.md | Workspace-scoped facts the agents learn while working (auto-compacted when it grows too large) |
| USER.md | Global facts about you and your preferences, applied to every workspace |
| Session DB | SQLite + FTS5 full-text index over all past messages, so agents can search what happened before |

Memory lives under ~/.konklave/ and is never committed (it's in .gitignore) because it can contain conversation history and personal context. Agents read it at the start of a run and can write new facts via the memory tool.
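To show what a SQLite + FTS5 index over past messages looks like in practice, here is a self-contained example using Python's built-in sqlite3 module. The table name, columns, and sample rows are invented for illustration and say nothing about Konklave's actual schema; it also assumes your SQLite build includes the FTS5 extension, which most current Python distributions do.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# An FTS5 virtual table indexes every column for full-text search.
db.execute("CREATE VIRTUAL TABLE messages USING fts5(session, role, content)")
db.executemany(
    "INSERT INTO messages (session, role, content) VALUES (?, ?, ?)",
    [
        ("s1", "coder", "Added input validation to the login form"),
        ("s1", "tester", "pytest passed after fixing the email regex"),
        ("s2", "architect", "Plan: refactor the payment module in three steps"),
    ],
)


def search(query: str) -> list[tuple[str, str]]:
    """Return (session, role) for messages matching the query, best match first."""
    rows = db.execute(
        "SELECT session, role FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()
```

`search("login")` finds the coder's message in session s1, and `search("payment")` finds the architect's plan in s2.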


💻 Platform Support

| Platform | Status |
| --- | --- |
| Windows 10 / 11 | ✅ |
| macOS 13+ | ✅ |
| Linux | ✅ |

🧪 Development & Tests

Konklave ships with a 628-test suite (unit, smoke, and integration) covering the pipeline engine, agents, tools, memory, the dashboard API, and the eval harness.

# Install with dev extras
pip install -e ".[dev]"

# Run the full suite
pytest

# Skip the TUI tests (need a real terminal) and the slow ones
pytest -m "not tui and not slow"

# Run the benchmarks yourself (real OpenRouter calls, which cost money)
konklave eval run --configs solo,council_same,council_diverse

Windows: test.bat runs the suite; eval_run.bat runs the benchmarks.


📄 License

Konklave is dual-licensed.

  • 🆓 Open source: GNU AGPL-3.0-or-later (see LICENSE). Free for personal, academic, and open-source use. Note: the AGPL is strong copyleft, so if you modify Konklave or offer it to others over a network, you must release your full source under the AGPL too.
  • 💼 Commercial license: for companies that want to use Konklave in a closed-source product or internal tool without the AGPL's source-disclosure obligations. See COMMERCIAL-LICENSE.md.

© 2026 firo2525. All rights reserved.


Built with ❤️ using OpenRouter · UI powered by Textual


⭐ Star this repo if you find it useful! ⭐
