duet

Two CLI agents in conversation. One Python file. Stdlib only.

duet.py runs two command-line coding agents, usually Claude and Codex, in alternating turns. You can also pair two agents from the same backend, such as Codex planner + Codex coder or Claude coder + Claude reviewer. One agent can plan or review while the other implements. The loop stops when they agree, the turn limit is reached, a timeout happens, or you stop them.

Use duet when you want:

  • A planner/reviewer agent to keep pressure on an implementation agent.
  • A second agent to inspect test failures, issue text, or review output.
  • A transcript and run directory you can inspect after the agents finish.
  • Isolation through an optional git worktree while the partner agent edits.

Quick Start

Pair-programming pattern: plan with codex in its own session first, then hand the session id to duet — codex implements with the plan in context while claude reviews each turn.

cd ~/code/myrepo

# Find the codex session you just planned in:
#   ls -lt ~/.codex/sessions/ | head
# or look for `session id: <uuid>` on `codex exec`'s stderr.

./duet.py \
    --resume-codex <codex-session-id> \
    --worktree \
    --reasoning max \
    --task "Implement the plan from your codex planning session."

Four flags carry their weight; everything else is a default. Codex (resumed, with the plan in context) speaks first as the coder. Claude reviews each turn as the planner. The worktree keeps the host checkout clean until you merge. Sentinel + rationale convergence rules are baked into both role prompts — you do not need to restate them in --task.

Resume flags attach to the matching backend even when you override --lead/--partner. A resumed Claude agent is normalized to lead so duet can extract its latest message as the seed; a resumed Codex agent is normalized to partner/coder so it speaks first with its existing plan in context.

The symmetric --resume-claude <session-id> does the inverse — plan in claude, hand off to codex — and is duet's founding workflow, documented in docs/USAGE.md.

Have a task in words but no prior planning session? Let codex plan inside the loop while claude implements:

./duet.py \
    --recap \
    --task "Add Codex fast mode for duet-managed Codex runs, don't miss any doc files" \
    --lead claude:coder \
    --partner codex:planner \
    --worktree --worktree-for lead \
    --turns 4

--recap keeps the live output compact and the worktree keeps the host checkout clean until you merge.

# Run a fresh task in a target project.
./duet.py --task "Implement fizzbuzz in Go with tests" \
    --lead claude:coder --partner codex:planner \
    --cwd ~/code/scratch

# Seed duet from Claude Code's real /review output.
./duet.py --recap --task-from-cmd 'claude -p /review' \
    --lead claude:reviewer --partner codex:coder \
    --worktree \
    --cwd .

In the review recipe, Claude's /review runs once to produce the kickoff critique. Duet then hands that critique to Codex, preserves both agent sessions, and manages the back-and-forth until convergence or the turn limit.

Install the duet command from a clone with make install. The other targets run the project's tests, checks, and builds:

make install      # symlinks duet.py to ~/.local/bin/duet
make ci           # everything the CI gate runs: unit + reasoning + smoke + complexity
make test         # unit tests (tests/test_duet.py) + scripts/smoke.sh dry-run checks
make unit-test    # only the stdlib unittest suite under tests/
make smoke-test   # only scripts/smoke.sh dry-run regression checks
make complexity   # cyclomatic-complexity/length gate (single-file sprawl guard)
make build        # sdist + wheel into dist/ (needs: python3 -m pip install build)
make loop-test    # slow real Claude/Codex loop checks; writes runs/test-loop/

Or skip the clone — the PyPI package is duet-cli (bare duet is taken) and the command it installs is duet:

uvx --from duet-cli duet --task "..."   # one-shot run, isolated, no install
pipx install duet-cli                   # puts the duet command on PATH
pipx install 'duet-cli[yaml]'           # include PyYAML for --config foo.yaml

Plain pip install duet-cli works too, but the installed top-level module is named duet, which collides with Google's PyPI duet package in a shared environment — pipx/uvx isolation avoids that.

Claude Code plugin: adds the /duet command. It shells out to the duet CLI, so also install the duet command with one of the methods above:

/plugin marketplace add volkan/duet
/plugin install duet@volkan-duet

CI (.github/workflows/ci.yml) runs make ci's checks on every PR across Python 3.9/3.11/3.13. To make them block merges, mark them required in branch protection — see .github/BRANCH_PROTECTION.md (admins can still force-merge).

How It Works

Each agent keeps its own conversation memory:

  • Claude resumes with claude -p --resume <session_id>.
  • Codex resumes with codex exec resume <session_id> when duet captured one from Codex's stderr, or codex exec resume --last in the working directory as a fallback for older builds that don't print a session id.
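
The resume rules in the list above can be sketched as a small argv builder. This is an illustration only, not duet's actual code: resume_argv is a hypothetical helper name, and a real invocation would also pass the prompt for the turn, which is omitted here.

```python
from typing import List, Optional


def resume_argv(backend: str, session_id: Optional[str]) -> List[str]:
    """Build the resume command for one agent turn (hypothetical helper)."""
    if backend == "claude":
        if not session_id:
            raise ValueError("claude resume needs a session id")
        return ["claude", "-p", "--resume", session_id]
    if backend == "codex":
        if session_id:
            # Pinned to the session id captured from Codex's stderr.
            return ["codex", "exec", "resume", session_id]
        # Fallback for older builds: most recent session in the working directory.
        return ["codex", "exec", "resume", "--last"]
    raise ValueError(f"unknown backend: {backend}")
```

The --last branch is the one that depends on the working directory, which is why sharing a cwd between two Codex sessions matters in that mode.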

On each turn, duet sends the latest reply from one agent to the other. It continues until both agents accept convergence in back-to-back turns, --turns is reached, a timeout happens, or you press Ctrl-C. A convergence proposal must include a rationale introduced by "LGTM rationale:" that explains why the work is done, followed by the sentinel <<<LGTM>>> on its own line; a bare sentinel is ignored.
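
A minimal sketch of the convergence rule, under stated assumptions: the rationale text sits on the same line as the "LGTM rationale:" prefix, fenced blocks are delimited by lines starting with three backticks, and sentinels inside fences are skipped. The function names are hypothetical, and duet's real parser may differ in these details.

```python
from typing import List

SENTINEL = "<<<LGTM>>>"
RATIONALE_PREFIX = "LGTM rationale:"


def is_convergence_proposal(reply: str) -> bool:
    """True when a rationale precedes the sentinel, both outside code fences."""
    in_fence = False
    has_rationale = False
    for line in reply.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            in_fence = not in_fence  # toggle on every fence line
            continue
        if in_fence:
            continue
        if stripped.startswith(RATIONALE_PREFIX):
            # Assumption: the explanation follows the prefix on the same line.
            if stripped[len(RATIONALE_PREFIX):].strip():
                has_rationale = True
        elif stripped == SENTINEL and has_rationale:
            return True
    return False


def converged(replies: List[str]) -> bool:
    """Both agents accepted: the last two turns are each valid proposals."""
    return len(replies) >= 2 and all(
        is_convergence_proposal(r) for r in replies[-2:]
    )
```

A reply that contains only the sentinel, or that quotes the sentinel inside a fenced block, does not count in this sketch.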

Finished runs record the reason in state.json: user interruption is recorded as force_stop, per-turn agent timeouts as timeout, and non-timeout agent command failures or malformed required output as agent_error.

If you pass --verify-cmd, duet runs that shell command before counting a valid convergence proposal. Exit code 0 allows the proposal to count; any non-zero exit, timeout, or execution error feeds a capped failure block to the next agent turn. --dry-run records and prints the configured command but does not execute it.
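
The verify gate described above might look roughly like this. It is a sketch, not duet's implementation: run_verify, the 600-second default timeout, the 4000-character cap, and the wording of the failure block are all assumptions made for illustration.

```python
import subprocess
from typing import Optional

MAX_FAILURE_CHARS = 4000  # hypothetical cap on output fed to the next turn


def run_verify(cmd: str, cwd: str = ".", timeout: float = 600.0) -> Optional[str]:
    """Return None when the command passes, else a capped failure block."""
    try:
        proc = subprocess.run(
            cmd, shell=True, cwd=cwd,
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"verify command timed out after {timeout:g}s: {cmd}"
    except OSError as exc:
        return f"verify command could not run: {exc}"
    if proc.returncode == 0:
        return None  # the convergence proposal is allowed to count
    # Keep only the tail of the combined output, where failures usually show.
    output = (proc.stdout + proc.stderr)[-MAX_FAILURE_CHARS:]
    return f"verify command failed (exit {proc.returncode}): {cmd}\n{output}"
```

A caller would count the proposal only when run_verify returns None, and otherwise prepend the returned block to the next agent's input.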

After normal loop endings, duet opens a force> prompt. Press Enter to finish, or type feedback to force another round; duet sends the next agent the previous reply plus your feedback, including any appended worktree handoff block and diff.

Common Recipes

Call Claude Code's real /review skill through duet:

./duet.py --recap --task-from-cmd 'claude -p /review' \
    --lead claude:reviewer --partner codex:coder \
    --worktree \
    --cwd ~/workspace/project \
    --turns 6

The /review skill supplies the initial findings; duet handles the subsequent Codex fix turn, Claude verification turn, worktree diff handoff, and any extra rounds.

With the /duet Claude Code command installed (the plugin from the install section above, or the manual skill copy in docs/USAGE.md), plain /duet runs that same /review kickoff recipe.

Let duet run the upstream command inside the target project:

./duet.py --task-from-cmd 'npm test 2>&1' \
    --lead claude:coder --partner codex:planner \
    --cwd ~/workspace/project \
    --worktree --worktree-for lead

Use a repeatable config:

./duet.py --config duet.example.yaml

Useful packaged configs:

  • examples/pr-review.yaml - deep review of the latest commit.
  • examples/codex-test-fix.yaml - Codex planner diagnoses failing checks; Codex coder fixes them in a worktree.

Same-backend peering:

# Codex planner + Codex coder. The worktree gives one Codex peer a separate cwd.
./duet.py --task "Fix the issue" \
    --lead codex:planner --partner codex:coder \
    --worktree --turns 6

# Claude coder + Claude reviewer.
./duet.py --task "Review and fix the current change" \
    --lead claude:coder --partner claude:reviewer \
    --turns 6

For codex/codex runs in one cwd, duet requires both peers to produce Codex session UUIDs on their first turns. If either peer would fall back to codex exec resume --last, duet aborts rather than risk resuming the other peer's session. Use --worktree when in doubt.
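
The same-cwd guard can be pictured as follows. This is a hedged sketch: the helper names are hypothetical, and treating any string accepted by Python's uuid.UUID as a valid session id is an assumption about what "session UUID" means here.

```python
import uuid
from typing import Optional


def is_uuid(value: Optional[str]) -> bool:
    """True when value parses as a UUID."""
    if not value:
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def guard_same_cwd_codex(
    lead_sid: Optional[str], partner_sid: Optional[str], shared_cwd: bool
) -> None:
    """Abort a codex/codex run that would need the --last fallback in one cwd."""
    if not shared_cwd:
        return  # separate cwds (for example with --worktree) cannot collide
    missing = [
        name
        for name, sid in (("lead", lead_sid), ("partner", partner_sid))
        if not is_uuid(sid)
    ]
    if missing:
        raise RuntimeError(
            "no Codex session UUID for: " + ", ".join(missing)
            + "; refusing to fall back to resume --last in a shared cwd"
        )
```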

Require a mechanical check before convergence:

./duet.py \
  --task "Fix the issue" \
  --lead claude:coder \
  --partner codex:reviewer \
  --worktree --worktree-for lead \
  --verify-cmd 'make test'

Check an in-progress run from another terminal:

./duet.py --status .duet/runs/<id>/

Gate convergence on P0/P1 review findings:

./duet.py --task "Fix the issue" \
    --lead claude:coder --partner codex:triage-reviewer \
    --cwd ~/workspace/project

Review a recent implementation - Codex reviews at max effort, Claude applies only requested fixes:

./duet.py --recap \
    --task "Review the current main branch changes. Codex should act as reviewer: identify any blocking issues in the latest commit. Claude should act as coder: implement only the fixes Codex explicitly requests. Preserve project constraints and run make test before convergence." \
    --lead claude:coder \
    --partner codex:reviewer \
    --reasoning max \
    --worktree --worktree-for lead \
    --turns 6

The partner speaks first, so Codex (reviewer) opens turn 1 with its critique and Claude (coder) responds in turn 2 with the fixes. --worktree-for lead keeps the editable checkout under the coder. Keep --codex-fast off in this recipe: Codex is the reviewer, so max effort is the point.

That same recipe is also packaged as a YAML config you can drop into any repo — examples/pr-review.yaml reviews HEAD's diff with the same agent/effort/worktree pairing, with comments calling out which keys to swap for variants (review uncommitted changes, review a specific PR by number, faster iteration once review is mostly done).

Review the latest commit plus an untracked notes file by seeding both into the task:

./duet.py --recap \
    --task-from-cmd 'git show --stat --patch --no-ext-diff HEAD && printf "\n\n--- TODO.md ---\n" && cat TODO.md' \
    --lead claude:coder \
    --partner codex:reviewer \
    --reasoning max \
    --worktree --worktree-for lead \
    --turns 6

Fresh worktrees start from committed HEAD; commit the notes first if the coder must edit them as a normal tracked file.

Deep planner, fast coder — Claude plans at high effort, Codex coder turns drop to low for latency (uses the default claude:planner + codex:coder pairing):

./duet.py --reasoning high --codex-fast \
    --task "Fix the issue" \
    --cwd ~/workspace/project

Compact live debug view — see only what each turn produced, in real time:

./duet.py --recap --task "Fix the issue" \
    --lead claude:coder --partner codex:planner \
    --cwd ~/workspace/project

Output

Every run writes a directory containing:

  • transcript.md - the full conversation.
  • recap.md - compact per-turn debug view when --recap is enabled; --status shows this path when present.
  • state.json - run state, agent roles, session ids, finish reason, worktree metadata, and recap_path for recap runs.
  • turn-*.stderr.log - live stderr from each agent invocation.
  • turn-*-verify.log - verify command metadata, stdout, and stderr when --verify-cmd runs.
  • turn-*.pid - present only while an agent or verify command is running.
  • wt/ - the git worktree, when --worktree is enabled.
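
Because the turn-*.pid files exist only while a command is running, a run directory can be inspected from a script with a few globs. The helpers below are an illustrative sketch with hypothetical names. They rely only on the file patterns listed above and are not a replacement for --status.

```python
from pathlib import Path
from typing import List


def run_is_active(run_dir: str) -> bool:
    """True while any agent or verify command still has a pid file."""
    return any(Path(run_dir).glob("turn-*.pid"))


def stderr_logs(run_dir: str) -> List[str]:
    """Names of the per-turn stderr logs, sorted lexically."""
    return sorted(p.name for p in Path(run_dir).glob("turn-*.stderr.log"))
```

Lexical sorting matches turn order only if turn numbers are zero-padded, which this sketch does not assume or check.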

When a worktree agent replies, duet appends a handoff block to that reply before the diff. The block names the exact worktree path and branch, warns that the receiving agent's cwd may be a clean checkout, and includes git -C <wt> review commands so verification happens against the edited tree.
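
To make the handoff idea concrete, here is a sketch of how such a block could be assembled. The wording, the layout, and the choice of git subcommands are assumptions for illustration; duet's actual block may read differently. Only git -C <path> itself is taken from the description above.

```python
import shlex


def handoff_block(worktree: str, branch: str) -> str:
    """Build an illustrative handoff block for the receiving agent."""
    wt = shlex.quote(worktree)  # keep paths with spaces usable in a shell
    lines = [
        "--- worktree handoff ---",
        f"worktree: {worktree}",
        f"branch: {branch}",
        "Your cwd may be a clean checkout; review the edited tree with:",
        f"  git -C {wt} status --short",
        f"  git -C {wt} diff",
    ]
    return "\n".join(lines)
```

Running the review commands with -C means the receiving agent inspects the worktree's files even when its own cwd is the untouched host checkout.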

When --cwd points outside the invocation directory and --runs-dir is not set, artifacts go under the target project at .duet/runs/<run_id>/.

Documentation

Read docs/USAGE.md for the full reference: flags, sandbox and network rules, worktree mode, output layout, --status / --continue, force prompt behavior, session memory, the post-run "apply / iterate / discard" checklist, and the optional /duet Claude Code command (plugin or manual skill).

For contributor guidance, read CLAUDE.md. Codex-specific entry notes live in AGENTS.md.

Limits

  • duet --continue <run> starts a fresh run from a prior state.json, restores saved session ids, and reuses the previous worktree when available. It does not append to the old transcript.
  • Parallel Codex sessions in the same cwd are safe when duet captured a session id from Codex's stderr — that turn pins to the UUID, not to recency. When the UUID was not captured (old Codex builds, or continuing a pre-UUID run), duet falls back to codex exec resume --last, which is cwd-based and unsafe to share. --worktree isolates duet's Codex cwd from the host repo; in --last fallback mode, do not start another Codex session inside that same worktree while the run is active.
  • Transcripts capture full agent text. Convergence detection only counts rationale-backed sentinels outside fenced markdown code blocks.
