
An MCP server that delegates work to, and builds consensus across, a crew of agentic coding CLIs.



Rutherford MCP Server

Give your AI coding agent a crew. Rutherford is a Model Context Protocol server that lets one coding CLI delegate work to, debate with, and build consensus across a group of others — Claude Code, Codex, Cursor, Qwen Code, Antigravity, Kiro, OpenCode, and Goose. It runs them as headless subprocesses and brings their answers back in one normalized shape. It is CLI-only: it orchestrates terminal coding agents and never calls a model provider API directly.

.---------.
|  \/\/\/ |
|  O  [==]|
|    <    |
|  \___/  |
'---------'
-- Ensign Sam Rutherford --
USS Cerritos . Engineering

Named for the irrepressibly cheerful engineer aboard the USS Cerritos in Star Trek: Lower Decks, who has a gift for getting heterogeneous systems to cooperate. That is the job here: one agent hands work to a crew of others and brings the results back. Star Trek and Lower Decks are trademarks of their respective owners; this is an unaffiliated, fan-named open-source project.

Why you'd want this

You are deep in a session with one coding agent. Then you hit a moment where one opinion isn't enough:

  • You're about to commit to a design and want a second and third opinion before you do.
  • Two models disagree and you want to watch them actually argue it out, not just answer in parallel.
  • A diff is risky and you want several reviewers on it, with the must-fix issues separated from nits.
  • You want to hand off a long refactor to a different agent and keep working while it runs.
  • You want a fresh, unbiased critique of the code you just wrote, from an instance with no memory of the conversation that produced it.

Rutherford does all of that from inside the agent you're already talking to. You describe what you want in plain language; your agent translates it into Rutherford's tools. You rarely name the tools yourself.

How it works

Rutherford is a stdio MCP server. Your MCP client (a coding CLI or a desktop app) calls it, and it spawns the target CLIs as fresh, isolated headless subprocesses — argv arrays, never shell strings, read-only by default, and depth-bounded so a CLI that calls itself can't recurse forever.

   your MCP client (Claude Code, Cursor, Codex, Claude Desktop, ...)
        |  MCP over stdio
        v
   rutherford-mcp-server
        |  fresh subprocess per call (read_only by default)
        +--> claude -p "..." --output-format json
        +--> codex exec --json
        +--> kiro-cli chat --no-interactive "..."
        +--> opencode run --format json "..."
        +--> goose run -t "..." --no-session
        +--> cursor-agent -p --output-format json
        +--> qwen -o json
        +--> agy -p "..."   (answer read from the transcript file)
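The "argv arrays, never shell strings" rule can be illustrated with Python's standard subprocess module. This is a minimal sketch, not Rutherford's adapter code: run_headless is a hypothetical helper, and the Python interpreter stands in for a real coding CLI so the example runs on any machine.

```python
import subprocess
import sys


def run_headless(argv: list[str], timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run one command from an argv list; no shell ever parses the arguments.

    Hypothetical helper for illustration only.
    """
    return subprocess.run(
        argv,
        shell=False,               # the default, spelled out for emphasis
        stdin=subprocess.DEVNULL,  # headless: nothing to read interactively
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# A prompt full of shell metacharacters travels as one literal argument.
hostile_prompt = 'explain this; echo "pwned" && $(whoami)'

# Stand-in for a real CLI: a child Python process that echoes its first argument.
result = run_headless(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_prompt]
)
print(result.stdout.strip())
```

Because the prompt is a single list element, the child receives it byte for byte; nothing in it is interpreted as a command separator or substitution.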

Every answer comes back in the same envelope regardless of the CLI's native output format, so your agent compares apples to apples. A CLI that errors or isn't installed comes back as one failed voice without sinking the rest of a panel.
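One way to picture the shared envelope and the "one failed voice" behavior is the sketch below. The VoiceResult fields and the call_voice helper are assumptions made for illustration; this README does not list the real field names.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceResult:
    """Hypothetical envelope; the real field names are not given in this README."""
    cli: str
    ok: bool
    answer: Optional[str] = None
    error: Optional[str] = None
    duration_s: float = 0.0


def call_voice(cli: str, runner: Callable[[str], str], prompt: str) -> VoiceResult:
    """Call one voice and always return an envelope, even when the voice fails."""
    started = time.monotonic()
    try:
        answer = runner(prompt)
    except Exception as exc:  # a broken voice becomes data, not a crash
        return VoiceResult(
            cli,
            ok=False,
            error=f"{type(exc).__name__}: {exc}",
            duration_s=time.monotonic() - started,
        )
    return VoiceResult(cli, ok=True, answer=answer, duration_s=time.monotonic() - started)


# Stand-in runners: one that answers, one that behaves like a missing CLI.
def fake_codex(prompt: str) -> str:
    return f"codex says: {prompt}"


def missing_cli(prompt: str) -> str:
    raise FileNotFoundError("kiro-cli not found on PATH")


panel = {"codex": fake_codex, "kiro": missing_cli}
results = [call_voice(cli, runner, "where is the deadlock?") for cli, runner in panel.items()]
for r in results:
    print(r.cli, "ok" if r.ok else f"failed ({r.error})")
```

The caller iterates over results of a single shape and decides what to do with failures, rather than wrapping every call in its own error handling.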

Supported CLIs

Invocation flags verified 2026-05-30. Pin your CLI versions and re-verify after upgrades; each adapter keeps all of its CLI-specific details in one file, so a change is a one-file edit.

| CLI | Adapter id | How Rutherford runs it | Auth |
|---|---|---|---|
| Claude Code | claude_code | claude -p "<prompt>" --output-format json | subscription/OAuth login or ANTHROPIC_API_KEY |
| Codex | codex | codex exec --json (prompt on stdin) | ChatGPT login or OPENAI_API_KEY |
| Cursor | cursor | cursor-agent -p --output-format json (prompt on stdin) | cursor-agent login or CURSOR_API_KEY |
| Qwen Code | qwen | qwen -o json (prompt on stdin) | qwen OAuth login or OPENAI_API_KEY |
| Kiro | kiro | kiro-cli chat --no-interactive "<prompt>" | KIRO_API_KEY or kiro-cli login |
| OpenCode | opencode | opencode run --format json -q "<prompt>" | provider key or opencode auth login |
| Goose | goose | goose run -q -t "<prompt>" --no-session | GOOSE_PROVIDER + provider key |
| Antigravity | antigravity | agy -p "<prompt>" (answer from the transcript file) | Google account login |

A ninth, well-behaved CLI can be added through configuration alone, without writing code; see docs/adding-a-cli.md.


Setup

1. Install Rutherford

It's a Python 3.11+ package. Install it as a tool so the rutherford-mcp-server command is on your PATH:

uv tool install rutherford-mcp-server
# or: pipx install rutherford-mcp-server
# or: pip install rutherford-mcp-server

2. Install and sign in to the CLIs you want to orchestrate

Rutherford does not install or authenticate the target CLIs — it drives the ones you already have. Install whichever you want a crew of, and log in to each (subscription login or the relevant API key; see docs/integration-testing.md). You don't need all eight; two is enough for a consensus or a debate.

3. Register Rutherford with your MCP client

The command to run is rutherford-mcp-server (equivalently python -m rutherford). Same command on Windows, macOS, and Linux.

Claude Code

claude mcp add rutherford -- rutherford-mcp-server

Codex

codex mcp add rutherford -- rutherford-mcp-server

Claude Desktop / Cursor / other JSON-config clients (Claude Desktop: claude_desktop_config.json; Cursor: .cursor/mcp.json):

{
  "mcpServers": {
    "rutherford": {
      "command": "rutherford-mcp-server"
    }
  }
}

If the command isn't on PATH, use an absolute path, or python -m rutherford with the interpreter from the environment where Rutherford is installed. For WSL and more clients, see docs/mcp-client-integration.md.

4. Scaffold your config, then confirm the crew is reachable

Run the setup wizard to detect which CLIs you're signed in to and write a starter config plus a panel of them:

rutherford-mcp-server init

It prints the plan and writes the files only after you confirm (it never overwrites an existing file). Once Rutherford is registered, you can do the same conversationally — ask your agent to "set up Rutherford" and the setup tool proposes the same files for you to approve. Then have your agent run doctor to confirm each CLI is installed, authenticated, and reachable.


Tutorials

You drive Rutherford in plain language from your MCP client. Each tutorial below is a prompt you can paste, with a note on what Rutherford does under the hood. Everything defaults to read-only.

See who's on the crew

Which coding CLIs can Rutherford reach right now, and which am I signed in to? Then run doctor and tell me if anything's misconfigured.

capabilities is an instant, free snapshot of installed state, auth, and models. doctor goes further and live-checks any CLI that has no status command (like Antigravity, whose auth only shows up once a real round trip confirms it).

Hand one task to one agent

Use Rutherford to have Codex read src/auth/session.py and explain how token refresh works. Read-only.

A delegate to one CLI. You get back one normalized result: the answer, timing, token cost, and a session id you can resume later. Add a model if you want a specific one ("ask Kiro with the cheap claude-haiku-4.5 model to ...").

Get a second and third opinion

I think the deadlock is in queue.py. Ask Claude Code, Codex, and Qwen the same question — where is it and how would you fix it? — and show me their answers side by side.

A consensus across three targets, one independent voice each, run in parallel. To poll everyone you're signed in to, just don't name targets: "ask every coding agent I'm logged into whether a UUID or a ULID is a better primary key." Rutherford builds the panel from every installed, authenticated CLI and tells you in skipped which it left out and why.
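Building a panel from whatever is installed and signed in, while recording what was skipped and why, might look like the sketch below. The executable names come from the table of supported CLIs; build_panel, its two predicates, and the reason strings are hypothetical, and a real authentication check is CLI-specific, so it is injected rather than implemented here.

```python
import shutil
from typing import Callable

# Adapter id -> executable name, as listed in the supported-CLIs table.
EXECUTABLES = {
    "claude_code": "claude",
    "codex": "codex",
    "cursor": "cursor-agent",
    "qwen": "qwen",
    "kiro": "kiro-cli",
    "opencode": "opencode",
    "goose": "goose",
    "antigravity": "agy",
}


def on_path(executable: str) -> bool:
    """Default install check: is the executable discoverable on PATH?"""
    return shutil.which(executable) is not None


def build_panel(
    is_authenticated: Callable[[str], bool],
    is_installed: Callable[[str], bool] = on_path,
) -> tuple[list[str], dict[str, str]]:
    """Return (panel, skipped); skipped maps an adapter id to the reason it was left out."""
    panel: list[str] = []
    skipped: dict[str, str] = {}
    for adapter_id, executable in EXECUTABLES.items():
        if not is_installed(executable):
            skipped[adapter_id] = "not installed"
        elif not is_authenticated(adapter_id):
            skipped[adapter_id] = "not authenticated"
        else:
            panel.append(adapter_id)
    return panel, skipped


# Simulated machine: three CLIs installed, one of them not signed in.
panel, skipped = build_panel(
    is_authenticated=lambda adapter_id: adapter_id != "goose",
    is_installed=lambda exe: exe in {"claude", "codex", "goose"},
)
print(panel)
print(skipped["goose"], "/", skipped["kiro"])
```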

Run a real debate

This is the one that isn't just parallel answers. In a debate, round one is each voice's independent take; in every later round, each voice sees the others' latest positions and is asked to rebut and revise.

Run a 3-round debate between Claude Code, Codex, and Kiro: "is UUIDv7 or ULID the better primary key for a high-write event table?" Show me how each position shifted, plus a closing summary.

A debate. The result carries the full per-round transcript, so you can retrace exactly who said what and where someone changed their mind, followed by a closing synthesis of where the panel converged and where it still splits. In a real run of that exact prompt, all three opened with "UUIDv7" for different reasons, then in round two Claude Code and Codex corrected a factual error in Kiro's argument — and Kiro revised its position in response. Optional stances ("have Cursor argue for it and Claude Code argue against") keep a voice on an assigned side the whole way through.
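The round structure described above (independent openings, then each voice seeing the others' latest positions) can be sketched with stub voices. Everything here is hypothetical scaffolding: real voices are CLI subprocesses, not Python callables, and the actual rebuttal prompts are not shown in this README.

```python
from typing import Callable

# A voice maps (question, the other voices' latest positions) to its own position.
Voice = Callable[[str, dict[str, str]], str]


def run_debate(question: str, voices: dict[str, Voice], rounds: int) -> list[dict[str, str]]:
    """Return the per-round transcript: one {voice: position} dict per round."""
    transcript: list[dict[str, str]] = []
    latest: dict[str, str] = {}
    for _ in range(rounds):
        current: dict[str, str] = {}
        for name, voice in voices.items():
            # Round one: `latest` is empty, so every opening is independent.
            # Later rounds: a voice sees everyone's previous position but its own.
            others = {n: p for n, p in latest.items() if n != name}
            current[name] = voice(question, others)
        transcript.append(current)
        latest = current
    return transcript


def stubborn(position: str) -> Voice:
    """Stub voice that never changes its mind."""
    return lambda question, others: position


def swayable(opening: str) -> Voice:
    """Stub voice that adopts the others' view once they all agree."""
    def voice(question: str, others: dict[str, str]) -> str:
        views = set(others.values())
        return views.pop() if len(views) == 1 else opening
    return voice


transcript = run_debate(
    "UUIDv7 or ULID?",
    {"a": stubborn("UUIDv7"), "b": stubborn("UUIDv7"), "c": swayable("ULID")},
    rounds=3,
)
for number, positions in enumerate(transcript, start=1):
    print(number, positions)
```

Keeping the whole transcript, rather than only the final round, is what makes it possible to point at the round where a voice changed its mind.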

Turn a panel into a decision

When you want an answer, not a transcript, give consensus a strategy. Each voice is asked for a verdict and Rutherford aggregates them.

Ask claude_code, codex, and qwen "is this migration safe to ship?" and take the majority verdict, with each answer ending in a one-word VERDICT line.

A consensus with strategy: majority. You get back the outcome, the winning decision, and every voice's verdict alongside its full reasoning. The strategies:

| Strategy | What it does |
|---|---|
| all-voices | Every voice, no aggregation (the default). |
| unanimous | Agrees only if every voice matches, otherwise reports a split. |
| majority | One voice, one vote; the most-voted verdict wins. |
| weighted | The verdict with the greatest summed target weight wins. |
| parity-pair | Compares a proposer against parity counterweights; disagreement escalates. |
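Three of these strategies reduce to a few lines of counting. The sketch below is an interpretation of the table, not Rutherford's implementation: the function names are hypothetical, and treating a tie as "no winner" (returning None) and defaulting a missing weight to 1.0 are assumptions.

```python
from collections import Counter
from typing import Mapping, Optional


def _single_winner(totals: Mapping[str, float]) -> Optional[str]:
    """Return the verdict with the strictly highest total, or None on a tie."""
    if not totals:
        return None
    best = max(totals.values())
    winners = [verdict for verdict, total in totals.items() if total == best]
    return winners[0] if len(winners) == 1 else None


def unanimous(verdicts: Mapping[str, str]) -> Optional[str]:
    """Agree only if every voice gave the same verdict."""
    distinct = set(verdicts.values())
    return distinct.pop() if len(distinct) == 1 else None


def majority(verdicts: Mapping[str, str]) -> Optional[str]:
    """One voice, one vote."""
    return _single_winner(Counter(verdicts.values()))


def weighted(verdicts: Mapping[str, str], weights: Mapping[str, float]) -> Optional[str]:
    """Sum each voice's weight behind its verdict (assumed default weight: 1.0)."""
    totals: dict[str, float] = {}
    for voice, verdict in verdicts.items():
        totals[verdict] = totals.get(verdict, 0.0) + weights.get(voice, 1.0)
    return _single_winner(totals)


verdicts = {"claude_code": "SAFE", "codex": "SAFE", "qwen": "UNSAFE"}
print(majority(verdicts))                  # SAFE: two votes to one
print(unanimous(verdicts))                 # None: the panel is split
print(weighted(verdicts, {"qwen": 3.0}))   # UNSAFE: 3.0 outweighs 1.0 + 1.0
```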

Verdicts are read from a final VERDICT: <token> line, or as JSON if you pass a verdict_schema.
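Reading a verdict from a final VERDICT: <token> line could be sketched as follows. The exact matching rules are not specified here, so this version makes assumptions: it looks only at the last non-empty line, accepts any capitalization of the keyword, and upper-cases the token. The verdict_schema (JSON) path is not covered.

```python
import re
from typing import Optional

# Assumed shape of the line: "VERDICT:" followed by a single whitespace-free token.
VERDICT_LINE = re.compile(r"^\s*VERDICT:\s*(\S+)\s*$", re.IGNORECASE)


def parse_verdict(answer: str) -> Optional[str]:
    """Return the token from the final non-empty line, or None if it is not a verdict line."""
    lines = [line for line in answer.splitlines() if line.strip()]
    if not lines:
        return None
    match = VERDICT_LINE.match(lines[-1])
    return match.group(1).upper() if match else None


print(parse_verdict("The migration only adds a nullable column.\nVERDICT: safe\n"))
print(parse_verdict("VERDICT: safe\n...although the backfill worries me."))
```

Anchoring on the last line means a verdict mentioned mid-answer and then walked back is not counted.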

Save a crew as a panel and reuse it

Once you have a group you keep reaching for, save it as a named panel instead of listing the targets every time.

# ~/.rutherford/panels.toon
panels:
  design-roundtable:
    description: Lineage-diverse design review
    strategy: parity-pair
    targets[3]:
      - cli: claude_code
        model: opus
        label: proposer
      - cli: codex
        label: implementer
      - cli: kiro
        model: deepseek-3.2
        label: dissenter
        parity: true

Run my design-roundtable panel on this: "should this API return a stream or a page?"

consensus, debate, and review all accept panel="design-roundtable". Panels live in ~/.rutherford/panels.toon (global) or <project>/.rutherford/panels.toon (project-specific, which overrides a global panel of the same name). After editing the file, ask your agent to "reload Rutherford's panels" and it picks up the change without a restart.
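The override rule (a project panel beats a global panel of the same name) amounts to a layered lookup. The sketch below works on already-parsed dictionaries and does not read .toon files; resolve_panels is a hypothetical name, and replacing the whole panel rather than merging individual fields is an assumption.

```python
from typing import Any

Panel = dict[str, Any]


def resolve_panels(
    global_panels: dict[str, Panel],
    project_panels: dict[str, Panel],
) -> dict[str, tuple[str, Panel]]:
    """Map each panel name to (source, panel); a project panel replaces a global one outright."""
    resolved = {name: ("global", panel) for name, panel in global_panels.items()}
    for name, panel in project_panels.items():
        resolved[name] = ("project", panel)
    return resolved


# Parsed contents of ~/.rutherford/panels.toon and <project>/.rutherford/panels.toon.
global_panels = {
    "design-roundtable": {"strategy": "parity-pair"},
    "quick-poll": {"strategy": "majority"},
}
project_panels = {"design-roundtable": {"strategy": "unanimous"}}

resolved = resolve_panels(global_panels, project_panels)
print(resolved["design-roundtable"])  # the project definition wins
print(resolved["quick-poll"])         # only defined globally
```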

Review a diff across several reviewers

Review my staged diff with Claude Code and Codex as reviewers. Findings by file and line, must-fix separated from nits, and call out anything only one of them flagged.

A review — read-only, using the codereviewer role — over a diff or a set of paths, across one or more targets. Point it at paths instead ("review everything under src/payments/ for injection bugs") and the reviewers read the files themselves.

Get an implementation plan

Use Rutherford's planner on Claude Code to turn "add OAuth2 device-code login to the CLI" into an ordered, step-by-step plan, with the files each step touches and the risky parts flagged.

A plan — one target, the planner role, read-only. The bundled roles are planner, codereviewer, security, and debugger; ask "what roles does Rutherford have?" to list them (each shows its source). Add your own as markdown or TOON files under ~/.rutherford/roles/ (or a project's .rutherford/roles/); a project role overrides a same-named global one. See docs/configuration.md.

Let an agent actually make the change

Let Codex apply the fix in C:\work\myrepo — write mode, you have my permission to edit files there. Add the missing null check and a test that covers it.

A delegate in write mode. Write and yolo are never the default: they require both an explicit mode and a trusted workspace (an allowlisted path or your per-call go-ahead), so an agent can't modify a directory by accident. See the safety model below.

Kick off a long job and keep working

Start a big refactor on OpenCode in the background — "convert the data layer to the repository pattern" in C:\work\myrepo — and just give me the job id.

delegate (or consensus / debate) in async mode returns a job id immediately. A follow-up such as "Is that Rutherford job done yet? Show me the result if it finished" polls job_status / job_result.
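The submit-then-poll pattern behind a job id can be sketched with a thread pool. JobStore and its method names are hypothetical and only mirror the idea of job_status / job_result; how Rutherford actually tracks background jobs is not described in this README.

```python
import threading
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Optional


class JobStore:
    """Hypothetical in-memory job registry: submit returns an id at once, callers poll later."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}

    def submit(self, fn: Callable[..., Any], *args: Any) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args)
        return job_id

    def status(self, job_id: str) -> str:
        future = self._jobs[job_id]
        if not future.done():
            return "pending"  # queued or still running
        return "failed" if future.exception() is not None else "done"

    def result(self, job_id: str, timeout: Optional[float] = None) -> Any:
        return self._jobs[job_id].result(timeout=timeout)


# A gate keeps the stand-in job unfinished until we choose to release it.
gate = threading.Event()


def slow_refactor(task: str) -> str:
    gate.wait(timeout=5)
    return f"finished: {task}"


store = JobStore()
job_id = store.submit(slow_refactor, "convert the data layer")
print(store.status(job_id))             # pending: the job is blocked on the gate
gate.set()
print(store.result(job_id, timeout=5))  # finished: convert the data layer
print(store.status(job_id))             # done
```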

Get a fresh, unbiased take on your own work

Spin up a separate Claude Code instance through Rutherford — one with no memory of this conversation — to critique the design we just wrote.

Rutherford can target the very CLI you're talking to. It spawns a fresh, isolated subprocess that is distinct from your session and can't reach back into it. A depth guard (max_depth, default 3) keeps a CLI-calls-itself chain bounded.
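A depth guard can be as small as a counter handed from parent to child. The sketch below passes it through an environment variable; the variable name, the DepthExceeded exception, and the reading of max_depth as "at most three nested spawns" are all assumptions, since this README only states that the guard exists and defaults to 3.

```python
from typing import Mapping

DEPTH_VAR = "RUTHERFORD_SKETCH_DEPTH"  # hypothetical variable name, not the real one
MAX_DEPTH = 3                          # the documented default for max_depth


class DepthExceeded(RuntimeError):
    """Raised instead of spawning another nested subprocess."""


def child_env(parent_env: Mapping[str, str], max_depth: int = MAX_DEPTH) -> dict[str, str]:
    """Return the environment for a child process, with the depth counter incremented."""
    depth = int(parent_env.get(DEPTH_VAR, "0"))
    if depth >= max_depth:
        raise DepthExceeded(f"refusing to spawn: already at depth {depth} (max {max_depth})")
    env = dict(parent_env)
    env[DEPTH_VAR] = str(depth + 1)
    return env


# Simulate a CLI that keeps calling itself until the guard stops it.
env: dict[str, str] = {}
spawned = 0
try:
    while True:
        env = child_env(env)
        spawned += 1
except DepthExceeded as exc:
    print(f"stopped after {spawned} nested spawns: {exc}")
```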


Safety model

Every delegation runs in one of four safety modes, defaulting to the most restrictive:

| Mode | Meaning |
|---|---|
| read_only (default) | Inspect only. Review, consensus, debate, and plan are read-only by nature. |
| propose | The agent may propose changes (e.g. a diff) but not apply them. |
| write | The agent may modify the workspace, subject to the CLI's own approvals. |
| yolo | The agent may act without approval prompts (the CLI's bypass mode). |

write and yolo require an explicit argument and a trusted-workspace check: the target directory must be on the configured trusted_workspaces allowlist, or the call must pass trust_workspace=true. No adapter ever defaults to its permission-bypass flag, and invocations are always built as an argv list, never a shell string. See docs/security.md.
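The gate on write and yolo can be read as a small predicate. This is a sketch of the rule as stated above, not the code in docs/security.md: is_allowed is a hypothetical name, and treating subdirectories of an allowlisted path as trusted is an assumption.

```python
from pathlib import Path

GATED_MODES = {"write", "yolo"}  # read_only and propose are not gated in this sketch


def is_allowed(
    mode: str,
    workspace: str,
    trusted_workspaces: list[str],
    trust_workspace: bool = False,
) -> bool:
    """Allow gated modes only in a trusted workspace or with a per-call go-ahead."""
    if mode not in GATED_MODES:
        return True
    if trust_workspace:
        return True
    target = Path(workspace).resolve()
    for trusted in trusted_workspaces:
        root = Path(trusted).resolve()
        # Compare whole path components, so /srv/repos does not match /srv/repos-evil.
        if target == root or root in target.parents:
            return True
    return False


allowlist = ["/srv/repos"]
print(is_allowed("read_only", "/anywhere", allowlist))                    # True
print(is_allowed("write", "/srv/repos/app", allowlist))                   # True
print(is_allowed("write", "/srv/repos-evil", allowlist))                  # False
print(is_allowed("yolo", "/srv/other", allowlist, trust_workspace=True))  # True
```

Resolving both paths before comparing keeps `..` segments and symlinks from slipping a directory past the allowlist.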

Configuration

The main config is a small TOML file (config.toml in your platform config dir, or a project-local rutherford.toml); panels and custom roles live in their own files under ~/.rutherford/ and a project's .rutherford/. The full reference — every field, the discovery and precedence rules, the panel and role file formats, and config-defined generic adapters — is in docs/configuration.md.

Documentation

Experimental status

Rutherford drives independent third-party CLIs. Their headless flags, output formats, and auth mechanisms change between releases, and a CLI update can change or remove something an adapter relies on. Every flag in this repo was verified against the CLI's own --help and docs on the date noted above. Pin your CLI versions, re-verify after upgrades, and treat the integration as evolving.

Contributing

See CONTRIBUTING.md. The whole core is testable without a real CLI; run just check before pushing, then just test-integration for whatever CLIs your machine has installed and authenticated.

License

MIT (c) John Chapman. See LICENSE.
