An MCP server that delegates work to, and builds consensus across, a crew of agentic coding CLIs.
Rutherford MCP Server
Give your AI coding agent a crew. Rutherford is a Model Context Protocol server that lets one coding CLI delegate work to, debate with, and build consensus across a group of others — Claude Code, Codex, Cursor, Qwen Code, Antigravity, Kiro, OpenCode, and Goose. It runs them as headless subprocesses and brings their answers back in one normalized shape. It is CLI-only: it orchestrates terminal coding agents and never calls a model provider API directly.
.---------.
| \/\/\/ |
| O [==]|
| < |
| \___/ |
'---------'
-- Ensign Sam Rutherford --
USS Cerritos . Engineering
Named for the irrepressibly cheerful engineer aboard the USS Cerritos in Star Trek: Lower Decks, who has a gift for getting heterogeneous systems to cooperate. That is the job here: one agent hands work to a crew of others and brings the results back. Star Trek and Lower Decks are trademarks of their respective owners; this is an unaffiliated, fan-named open-source project.
Why you'd want this
You are deep in a session with one coding agent. Then you hit a moment where one opinion isn't enough:
- You're about to commit to a design and want a second and third opinion before you do.
- Two models disagree and you want to watch them actually argue it out, not just answer in parallel.
- A diff is risky and you want several reviewers on it, with the must-fix issues separated from nits.
- You want to hand off a long refactor to a different agent and keep working while it runs.
- You want a fresh, unbiased critique of the code you just wrote, from an instance with no memory of the conversation that produced it.
Rutherford does all of that from inside the agent you're already talking to. You describe what you want in plain language; your agent translates it into Rutherford's tools. You rarely name the tools yourself.
How it works
Rutherford is a stdio MCP server. Your MCP client (a coding CLI or a desktop app) calls it, and it spawns the target CLIs as fresh, isolated headless subprocesses — argv arrays, never shell strings, read-only by default, and depth-bounded so a CLI that calls itself can't recurse forever.
your MCP client (Claude Code, Cursor, Codex, Claude Desktop, ...)
| MCP over stdio
v
rutherford-mcp-server
| fresh subprocess per call (read_only by default)
+--> claude -p "..." --output-format json
+--> codex exec --json
+--> kiro-cli chat --no-interactive "..."
+--> opencode run --format json "..."
+--> goose run -t "..." --no-session
+--> cursor-agent -p --output-format json
+--> qwen -o json
+--> agy -p "..." (answer read from the transcript file)
Every answer comes back in the same envelope regardless of the CLI's native output format, so your agent compares apples to apples. A CLI that errors or isn't installed comes back as one failed voice without sinking the rest of a panel.
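As a rough illustration of the pattern described above (not Rutherford's actual code), the sketch below runs one command from an argv list and folds success, a non-zero exit, a timeout, and a missing executable into one result shape. The `Envelope` class, its field names, and `run_voice` are hypothetical names chosen for this example; the real envelope may carry different fields.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical normalized result; field names are illustrative only."""
    cli: str
    ok: bool
    answer: str
    error: str
    elapsed_s: float


def run_voice(cli: str, argv: list, timeout_s: float = 60.0) -> Envelope:
    """Run one CLI as a fresh subprocess built from an argv list, never a shell string."""
    start = time.monotonic()
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except OSError as exc:
        # Covers a CLI that is not installed (FileNotFoundError) or cannot be started.
        return Envelope(cli, False, "", f"could not start: {exc}", time.monotonic() - start)
    except subprocess.TimeoutExpired:
        return Envelope(cli, False, "", "timed out", time.monotonic() - start)
    elapsed = time.monotonic() - start
    if proc.returncode != 0:
        message = proc.stderr.strip() or f"exit {proc.returncode}"
        return Envelope(cli, False, "", message, elapsed)
    return Envelope(cli, True, proc.stdout.strip(), "", elapsed)


if __name__ == "__main__":
    # Stand-ins for real coding CLIs so the sketch runs on any machine.
    panel = {
        "echo_voice": [sys.executable, "-c", "print('hello from a voice')"],
        "missing_voice": ["definitely-not-an-installed-cli-xyz"],
    }
    for name, argv in panel.items():
        print(run_voice(name, argv))
```

Because every failure path returns an `Envelope` instead of raising, a caller can collect a whole panel and still report the one voice that failed.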
Supported CLIs
Invocation flags verified 2026-05-30. Pin your CLI versions and re-verify after upgrades; each adapter keeps all of its CLI-specific details in one file, so a change is a one-file edit.
| CLI | Adapter id | How Rutherford runs it | Auth |
|---|---|---|---|
| Claude Code | claude_code | claude -p "<prompt>" --output-format json | subscription/OAuth login or ANTHROPIC_API_KEY |
| Codex | codex | codex exec --json (prompt on stdin) | ChatGPT login or OPENAI_API_KEY |
| Cursor | cursor | cursor-agent -p --output-format json (prompt on stdin) | cursor-agent login or CURSOR_API_KEY |
| Qwen Code | qwen | qwen -o json (prompt on stdin) | qwen OAuth login or OPENAI_API_KEY |
| Kiro | kiro | kiro-cli chat --no-interactive "<prompt>" | KIRO_API_KEY or kiro-cli login |
| OpenCode | opencode | opencode run --format json -q "<prompt>" | provider key or opencode auth login |
| Goose | goose | goose run -q -t "<prompt>" --no-session | GOOSE_PROVIDER + provider key |
| Antigravity | antigravity | agy -p "<prompt>" (answer from the transcript file) | Google account login |
A ninth, well-behaved CLI can be added without code — see docs/adding-a-cli.md.
Setup
1. Install Rutherford
It's a Python 3.11+ package. Install it as a tool so the rutherford-mcp-server command is on your PATH:
uv tool install rutherford-mcp-server
# or: pipx install rutherford-mcp-server
# or: pip install rutherford-mcp-server
2. Install and sign in to the CLIs you want to orchestrate
Rutherford does not install or authenticate the target CLIs — it drives the ones you already have. Install whichever you want a crew of, and log in to each (subscription login or the relevant API key; see docs/integration-testing.md). You don't need all eight; two is enough for a consensus or a debate.
3. Register Rutherford with your MCP client
The command to run is rutherford-mcp-server (equivalently python -m rutherford). Same command on
Windows, macOS, and Linux.
Claude Code
claude mcp add rutherford -- rutherford-mcp-server
Codex
codex mcp add rutherford -- rutherford-mcp-server
Claude Desktop / Cursor / other JSON-config clients (Claude Desktop: claude_desktop_config.json;
Cursor: .cursor/mcp.json):
{
"mcpServers": {
"rutherford": {
"command": "rutherford-mcp-server"
}
}
}
If the command isn't on PATH, use an absolute path, or python -m rutherford with the interpreter from the
environment where Rutherford is installed. For WSL and more clients, see
docs/mcp-client-integration.md.
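If your JSON config already lists other servers, you will want to add the rutherford entry without overwriting them. The sketch below is one way to do that by hand; `with_rutherford` and `register` are hypothetical helper names for this example and are not part of Rutherford, and the config path is whatever your client uses.

```python
import json
from pathlib import Path


def with_rutherford(config: dict, command: str = "rutherford-mcp-server") -> dict:
    """Return a copy of a client config with a rutherford entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any servers already registered
    servers["rutherford"] = {"command": command}
    merged["mcpServers"] = servers
    return merged


def register(config_path: Path) -> None:
    """Read the config if it exists, add the entry, and write it back."""
    existing = {}
    if config_path.exists():
        existing = json.loads(config_path.read_text(encoding="utf-8"))
    config_path.parent.mkdir(parents=True, exist_ok=True)
    text = json.dumps(with_rutherford(existing), indent=2) + "\n"
    config_path.write_text(text, encoding="utf-8")


if __name__ == "__main__":
    # Prints the merged config for an example that already has one other server.
    example = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(with_rutherford(example), indent=2))
```

For a project-local Cursor setup you might call `register(Path(".cursor/mcp.json"))`; back up the file first if it matters to you.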
4. Scaffold your config, then confirm the crew is reachable
Run the setup wizard to detect which CLIs you're signed in to and write a starter config plus a panel of them:
rutherford-mcp-server init
It prints the plan and writes the files only after you confirm (it never overwrites an existing file).
Once Rutherford is registered, you can do the same conversationally — ask your agent to "set up
Rutherford" and the setup tool proposes the same files for you to approve. Then have your agent run
doctor to confirm each CLI is installed, authenticated, and reachable.
Tutorials
You drive Rutherford in plain language from your MCP client. Each tutorial below is a prompt you can paste, with a note on what Rutherford does under the hood. Everything defaults to read-only.
See who's on the crew
Which coding CLIs can Rutherford reach right now, and which am I signed in to? Then run doctor and tell me if anything's misconfigured.
capabilities is an instant, free snapshot of installed state, auth, and models. doctor goes further and
live-checks any CLI that has no status command (like Antigravity, whose auth only shows up once a real
round trip confirms it).
Hand one task to one agent
Use Rutherford to have Codex read src/auth/session.py and explain how token refresh works. Read-only.
A delegate to one CLI. You get back one normalized result: the answer, timing, token cost, and a session
id you can resume later. Add a model if you want a specific one ("ask Kiro with the cheap
claude-haiku-4.5 model to ...").
Get a second and third opinion
I think the deadlock is in queue.py. Ask Claude Code, Codex, and Qwen the same question (where is it, and how would you fix it?) and show me their answers side by side.
A consensus across three targets, one independent voice each, run in parallel. To poll everyone you're
signed in to, just don't name targets: "ask every coding agent I'm logged into whether a UUID or a ULID is
a better primary key." Rutherford builds the panel from every installed, authenticated CLI and tells you in
skipped which it left out and why.
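To make the panel-building idea concrete, here is a minimal sketch under stated assumptions: the installed and authenticated states arrive as plain dictionaries, and `ask` is a hypothetical callable that queries one CLI. `build_panel` and `consensus` are illustrative names, not Rutherford's internals.

```python
from concurrent.futures import ThreadPoolExecutor


def build_panel(installed: dict, authenticated: dict):
    """Split known CLIs into a panel and a skipped map with one reason per CLI."""
    panel, skipped = [], {}
    for cli in sorted(installed):
        if not installed[cli]:
            skipped[cli] = "not installed"
        elif not authenticated.get(cli, False):
            skipped[cli] = "not authenticated"
        else:
            panel.append(cli)
    return panel, skipped


def consensus(panel: list, ask) -> list:
    """Ask every voice in parallel; a failing voice becomes one failed entry."""
    def one(cli):
        try:
            return {"cli": cli, "ok": True, "answer": ask(cli)}
        except Exception as exc:  # one bad voice must not sink the panel
            return {"cli": cli, "ok": False, "error": str(exc)}

    with ThreadPoolExecutor(max_workers=max(1, len(panel))) as pool:
        return list(pool.map(one, panel))  # results keep the panel's order


if __name__ == "__main__":
    panel, skipped = build_panel(
        {"claude_code": True, "codex": True, "kiro": False},
        {"claude_code": True, "codex": True},
    )
    print(consensus(panel, lambda cli: f"{cli} prefers ULID"), skipped)
```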
Run a real debate
This is the one that isn't just parallel answers. In a debate, round one is each voice's independent take; in every later round, each voice sees the others' latest positions and is asked to rebut and revise.
Run a 3-round debate between Claude Code, Codex, and Kiro: "is UUIDv7 or ULID the better primary key for a high-write event table?" Show me how each position shifted, plus a closing summary.
A debate. The result carries the full per-round transcript, so you can retrace exactly who said what and
where someone changed their mind, followed by a closing synthesis of where the panel converged and where it
still splits. In a real run of that exact prompt, all three opened with "UUIDv7" for different reasons,
then in round two Claude Code and Codex corrected a factual error in Kiro's argument — and Kiro revised its
position in response. Optional stances ("have Cursor argue for it and Claude Code argue against") keep a
voice on an assigned side the whole way through.
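The round structure can be sketched in a few lines. This is an illustration of the idea, not Rutherford's implementation: each voice is modeled as a hypothetical callable taking the question and the other voices' latest positions, and the closing synthesis step is left out.

```python
def run_debate(question: str, voices: dict, rounds: int = 3) -> list:
    """Return a per-round transcript: one {voice_name: position} dict per round."""
    transcript = []
    latest = {}  # positions from the previous round
    for round_no in range(1, rounds + 1):
        current = {}
        for name, voice in voices.items():
            if round_no == 1:
                others = {}  # round one: independent take, no peer positions
            else:
                # Later rounds: everyone else's position from the previous round.
                others = {n: p for n, p in latest.items() if n != name}
            current[name] = voice(question, others)
        transcript.append(current)
        latest = current
    return transcript


if __name__ == "__main__":
    def steady(question, others):
        return "UUIDv7"

    def persuadable(question, others):
        # Opens with ULID, then adopts the first peer position it is shown.
        return next(iter(others.values()), "ULID")

    for number, positions in enumerate(
        run_debate("UUIDv7 or ULID?", {"a": steady, "b": persuadable}, rounds=2), start=1
    ):
        print(number, positions)
```

Snapshotting `latest` once per round means every voice in a round rebuts the same set of positions, regardless of the order in which the voices are called.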
Turn a panel into a decision
When you want an answer, not a transcript, give consensus a strategy. Each voice is asked for a verdict and Rutherford aggregates them.
Ask claude_code, codex, and qwen "is this migration safe to ship?" and take the majority verdict, with each ending in a one-word VERDICT line.
A consensus with strategy: majority. You get back the outcome, the winning decision, and every
voice's verdict alongside its full reasoning. The strategies:
| Strategy | What it does |
|---|---|
| all-voices | Every voice, no aggregation (the default). |
| unanimous | Agrees only if every voice matches, otherwise reports a split. |
| majority | One voice, one vote; the most-voted verdict wins. |
| weighted | The verdict with the greatest summed target weight wins. |
| parity-pair | Compares a proposer against parity counterweights; disagreement escalates. |
Verdicts are read from a final VERDICT: <token> line, or as JSON if you pass a verdict_schema.
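A minimal sketch of how such verdicts could be parsed and aggregated follows. It is an approximation under assumptions, not the project's code: tokens are lowercased, a tie in `majority` returns no winner, and a voice without a configured weight counts as 1.0. None of those choices is confirmed by the documentation, and the parity-pair strategy and `verdict_schema` path are not covered.

```python
import re
from collections import Counter
from typing import Optional

VERDICT_RE = re.compile(r"^\s*VERDICT:\s*(\S+)\s*$", re.IGNORECASE)


def parse_verdict(answer: str) -> Optional[str]:
    """Return the token from a final 'VERDICT: <token>' line, or None if absent."""
    lines = [line for line in answer.splitlines() if line.strip()]
    if not lines:
        return None
    match = VERDICT_RE.match(lines[-1])
    return match.group(1).lower() if match else None


def unanimous(verdicts: dict) -> Optional[str]:
    """Agree only if every voice gave the same verdict; None means a split."""
    distinct = set(verdicts.values())
    return distinct.pop() if len(distinct) == 1 else None


def majority(verdicts: dict) -> Optional[str]:
    """One voice, one vote. Assumption: a tie for first place yields no winner."""
    counts = Counter(verdicts.values()).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def weighted(verdicts: dict, weights: dict) -> Optional[str]:
    """The verdict with the greatest summed weight wins (default weight 1.0)."""
    totals = {}
    for voice, verdict in verdicts.items():
        totals[verdict] = totals.get(verdict, 0.0) + weights.get(voice, 1.0)
    return max(totals, key=totals.get) if totals else None


if __name__ == "__main__":
    answers = {
        "claude_code": "The backfill is idempotent.\nVERDICT: safe",
        "codex": "Locks are short.\nVERDICT: safe",
        "qwen": "The index rebuild blocks writes.\nVERDICT: unsafe",
    }
    verdicts = {voice: parse_verdict(text) for voice, text in answers.items()}
    print(majority(verdicts), unanimous(verdicts))
```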
Save a crew as a panel and reuse it
Once you have a group you keep reaching for, save it as a named panel instead of listing the targets every time.
# ~/.rutherford/panels.toon
panels:
  design-roundtable:
    description: Lineage-diverse design review
    strategy: parity-pair
    targets[3]:
      - cli: claude_code
        model: opus
        label: proposer
      - cli: codex
        label: implementer
      - cli: kiro
        model: deepseek-3.2
        label: dissenter
        parity: true
Run my design-roundtable panel on this: "should this API return a stream or a page?"
consensus, debate, and review all accept panel="design-roundtable". Panels live in
~/.rutherford/panels.toon (global) or <project>/.rutherford/panels.toon (project-specific, which
overrides a global panel of the same name). After editing the file, ask your agent to "reload Rutherford's
panels" and it picks up the change without a restart.
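The override rule can be pictured as a whole-panel replacement keyed by name. The sketch below assumes both files have already been parsed into dictionaries (TOON parsing is out of scope here) and that a project panel replaces a same-named global panel entirely, with no field-by-field merge; that second point is an assumption. `panel_paths` and `resolve_panels` are hypothetical names.

```python
from pathlib import Path


def panel_paths(project_root: Path, home: Path) -> list:
    """Candidate panel files, lowest precedence first."""
    return [
        home / ".rutherford" / "panels.toon",          # global
        project_root / ".rutherford" / "panels.toon",  # project-specific
    ]


def resolve_panels(global_panels: dict, project_panels: dict) -> dict:
    """Project panels override global panels of the same name; inputs are not mutated."""
    merged = dict(global_panels)
    merged.update(project_panels)
    return merged


if __name__ == "__main__":
    global_panels = {
        "design-roundtable": {"strategy": "majority"},
        "quick-check": {"strategy": "all-voices"},
    }
    project_panels = {"design-roundtable": {"strategy": "parity-pair"}}
    print(resolve_panels(global_panels, project_panels))
```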
Review a diff across several reviewers
Review my staged diff with Claude Code and Codex as reviewers. Findings by file and line, must-fix separated from nits, and call out anything only one of them flagged.
A review — read-only, using the codereviewer role — over a diff or a set of paths, across one or more
targets. Point it at paths instead ("review everything under src/payments/ for injection bugs") and the
reviewers read the files themselves.
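For a sense of what "must-fix separated from nits" and "only one of them flagged" can mean in practice, here is a hedged sketch of how a client might combine findings after the reviewers return. The `Finding` shape and the severity labels are assumptions made for this example; Rutherford's actual review output may be structured differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; not Rutherford's actual schema."""
    reviewer: str
    path: str
    line: int
    severity: str  # assumed labels: "must-fix" or "nit"
    note: str


def merge_findings(findings: list) -> dict:
    """Group findings by file and line, split by severity, and mark solo findings."""
    by_location = defaultdict(list)
    for finding in findings:
        by_location[(finding.path, finding.line)].append(finding)

    report = {"must-fix": [], "nit": []}
    for (path, line), group in sorted(by_location.items()):
        reviewers = sorted({f.reviewer for f in group})
        # A location is must-fix if any reviewer rated it that way.
        severity = "must-fix" if any(f.severity == "must-fix" for f in group) else "nit"
        report[severity].append({
            "path": path,
            "line": line,
            "reviewers": reviewers,
            "solo": len(reviewers) == 1,  # flagged by only one reviewer
            "notes": [f.note for f in group],
        })
    return report


if __name__ == "__main__":
    findings = [
        Finding("claude_code", "src/pay.py", 42, "must-fix", "unescaped SQL"),
        Finding("codex", "src/pay.py", 42, "nit", "long line"),
        Finding("codex", "src/pay.py", 7, "nit", "unused import"),
    ]
    print(merge_findings(findings))
```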
Get an implementation plan
Use Rutherford's planner on Claude Code to turn "add OAuth2 device-code login to the CLI" into an ordered, step-by-step plan, with the files each step touches and the risky parts flagged.
A plan — one target, the planner role, read-only. The bundled roles are planner, codereviewer,
security, and debugger; ask "what roles does Rutherford have?" to list them (each shows its source). Add
your own as markdown or TOON files under ~/.rutherford/roles/ (or a project's .rutherford/roles/); a
project role overrides a same-named global one. See
docs/configuration.md.
Let an agent actually make the change
Let Codex apply the fix in C:\work\myrepo in write mode; you have my permission to edit files there. Add the missing null check and a test that covers it.
A delegate in write mode. Write and yolo are never the default: they require both an explicit mode and a
trusted workspace (an allowlisted path or your per-call go-ahead), so an agent can't modify a directory by
accident. See the safety model below.
Kick off a long job and keep working
Start a big refactor on OpenCode in the background ("convert the data layer to the repository pattern" in C:\work\myrepo) and just give me the job id.
delegate (or consensus / debate) in async mode returns a job id immediately. "Is that Rutherford job
done yet? Show me the result if it finished" polls job_status / job_result.
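The submit-then-poll flow can be sketched with a small in-memory registry. This is an illustration only: the `JobRegistry` class and the status strings "running", "done", and "failed" are assumptions for this example, and only the names job_status and job_result come from the text above.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobRegistry:
    """Hypothetical in-memory job store: submit returns an id immediately."""

    def __init__(self, max_workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, fn, *args) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)  # runs in the background
        return job_id

    def job_status(self, job_id: str) -> str:
        future = self._jobs[job_id]
        if not future.done():
            return "running"
        return "failed" if future.exception() is not None else "done"

    def job_result(self, job_id: str, timeout=None):
        # Blocks until the job finishes (or the timeout passes), then returns its value.
        return self._jobs[job_id].result(timeout=timeout)


if __name__ == "__main__":
    registry = JobRegistry()
    job_id = registry.submit(lambda text: text.upper(), "refactor finished")
    print(job_id, registry.job_result(job_id, timeout=5), registry.job_status(job_id))
```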
Get a fresh, unbiased take on your own work
Spin up a separate Claude Code instance through Rutherford — one with no memory of this conversation — to critique the design we just wrote.
Rutherford can target the very CLI you're talking to. It spawns a fresh, isolated subprocess that is
distinct from your session and can't reach back into it. A depth guard (max_depth, default 3) keeps a
CLI-calls-itself chain bounded.
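One common way to bound such a chain is to pass a depth counter to each child process through its environment. The sketch below shows that technique; it is not confirmed to be how Rutherford implements max_depth. The variable name `RUTHERFORD_DEPTH` is hypothetical, and treating max_depth as the number of nested levels allowed is an assumption.

```python
import os

DEPTH_ENV = "RUTHERFORD_DEPTH"  # hypothetical variable name, for illustration only


def current_depth(env) -> int:
    """Read the nesting depth from an environment mapping; missing or bad values count as 0."""
    try:
        return max(0, int(env.get(DEPTH_ENV, "0")))
    except ValueError:
        return 0


def child_env(env, max_depth: int = 3) -> dict:
    """Return the environment for a child process, or refuse if the chain is too deep."""
    depth = current_depth(env)
    if depth >= max_depth:
        raise RuntimeError(f"delegation depth {depth} reached max_depth {max_depth}")
    child = dict(env)
    child[DEPTH_ENV] = str(depth + 1)
    return child


if __name__ == "__main__":
    env = dict(os.environ)
    env.pop(DEPTH_ENV, None)  # start from depth 0 for the demonstration
    for _ in range(3):
        env = child_env(env)
        print("spawned child at depth", env[DEPTH_ENV])
    try:
        child_env(env)
    except RuntimeError as exc:
        print("refused:", exc)
```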
Safety model
Every delegation runs in one of four safety modes, defaulting to the most restrictive:
| Mode | Meaning |
|---|---|
| read_only (default) | Inspect only. Review, consensus, debate, and plan are read-only by nature. |
| propose | The agent may propose changes (e.g. a diff) but not apply them. |
| write | The agent may modify the workspace, subject to the CLI's own approvals. |
| yolo | The agent may act without approval prompts (the CLI's bypass mode). |
write and yolo require an explicit argument and a trusted-workspace check: the target directory must be
on the configured trusted_workspaces allowlist, or the call must pass trust_workspace=true. No adapter
ever defaults to its permission-bypass flag, and invocations are always built as an argv list, never a
shell string. See docs/security.md.
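The gate described above can be approximated as follows. This is a sketch under assumptions, not the project's code: it treats any directory inside an allowlisted path as trusted, which the documentation does not spell out, and `is_trusted` and `check_mode` are hypothetical names. Only the mode names, trusted_workspaces, and trust_workspace come from the text.

```python
from pathlib import Path

MODES = ("read_only", "propose", "write", "yolo")


def is_trusted(workspace: Path, allowlist: list) -> bool:
    """True if the workspace is an allowlisted directory or sits inside one."""
    resolved = workspace.resolve()
    for root in allowlist:
        trusted_root = Path(root).resolve()
        # Compare whole path components, so /srv/repos-evil does not match /srv/repos.
        if resolved == trusted_root or trusted_root in resolved.parents:
            return True
    return False


def check_mode(mode: str, workspace: Path, allowlist: list, trust_workspace: bool = False) -> None:
    """Raise unless the requested mode is allowed for this workspace."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode in ("write", "yolo") and not (trust_workspace or is_trusted(workspace, allowlist)):
        raise PermissionError(f"{mode} requires a trusted workspace: {workspace}")


if __name__ == "__main__":
    trusted_workspaces = [Path("/srv/repos")]
    check_mode("read_only", Path("/tmp/anywhere"), trusted_workspaces)  # always allowed
    check_mode("write", Path("/srv/repos/app"), trusted_workspaces)     # allowlisted
    try:
        check_mode("write", Path("/srv/other"), trusted_workspaces)
    except PermissionError as exc:
        print("refused:", exc)
```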
Configuration
The main config is a small TOML file (config.toml in your platform config dir, or a project-local
rutherford.toml); panels and custom roles live in their own files under ~/.rutherford/ and a project's
.rutherford/. The full reference — every field, the discovery and precedence rules, the panel and role
file formats, and config-defined generic adapters — is in
docs/configuration.md.
Documentation
- docs/configuration.md — config file, panels, custom roles, strategies, generic adapters.
- docs/architecture.md — the layered design and the two core interfaces.
- docs/adding-a-cli.md — the contract and checklist for adding a CLI.
- docs/mcp-client-integration.md — registration for many clients.
- docs/integration-testing.md — installing and authenticating each CLI.
- docs/troubleshooting.md — common problems and fixes.
- docs/security.md — the security model in depth.
Experimental status
Rutherford drives independent third-party CLIs. Their headless flags, output formats, and auth mechanisms
change between releases, and a CLI update can change or remove something an adapter relies on. Every flag in
this repo was verified against the CLI's own --help and docs on the date noted above. Pin your CLI
versions, re-verify after upgrades, and treat the integration as evolving.
Contributing
See CONTRIBUTING.md. The whole
core is testable without a real CLI; run just check before pushing, then just test-integration for
whatever CLIs your machine has installed and authenticated.
License
MIT (c) John Chapman. See LICENSE.