Two CLI agents in conversation. One Python file. Stdlib only.
Project description
duet
Two CLI agents in conversation. One Python file. Stdlib only.
duet.py runs two command-line coding agents, usually Claude and Codex, in
alternating turns. You can also pair two agents from the same backend, such as
Codex planner + Codex coder or Claude coder + Claude reviewer. One agent can
plan or review while the other implements. The loop stops when they agree, the
turn limit is reached, a timeout happens, or you stop them.
Use duet when you want:
- A planner/reviewer agent to keep pressure on an implementation agent.
- A second agent to inspect test failures, issue text, or review output.
- A transcript and run directory you can inspect after the agents finish.
- Isolation through an optional git worktree while the partner agent edits.
Quick Start
Pair-programming pattern: plan with codex in its own session first, then hand the session id to duet — codex implements with the plan in context while claude reviews each turn.
cd ~/code/myrepo
# Find the codex session you just planned in:
# ls -lt ~/.codex/sessions/ | head
# or look for `session id: <uuid>` on `codex exec`'s stderr.
./duet.py \
--resume-codex <codex-session-id> \
--worktree \
--reasoning max \
--task "Implement the plan from your codex planning session."
Four flags carry their weight; everything else is a default. Codex (resumed,
with the plan in context) speaks first as the coder. Claude reviews each
turn as the planner. The worktree keeps the host checkout clean until you
merge. Sentinel + rationale convergence rules are baked into both role
prompts — you do not need to restate them in --task.
Resume flags attach to the matching backend even when you override
--lead/--partner. A resumed Claude agent is normalized to lead so duet can
extract its latest message as the seed; a resumed Codex agent is normalized to
partner/coder so it speaks first with its existing plan in context.
The symmetric --resume-claude <session-id> does the inverse — plan in
claude, hand off to codex — and is duet's founding workflow, documented in
docs/USAGE.md.
Have a task in words but no prior planning session? Let codex plan inside the loop while claude implements:
./duet.py \
--recap \
--task "Add Codex fast mode for duet-managed Codex runs, don't miss any doc files" \
--lead claude:coder \
--partner codex:planner \
--worktree --worktree-for lead \
--turns 4
--recap keeps the live output compact and the worktree keeps the host
checkout clean until you merge.
# Run a fresh task in a target project.
./duet.py --task "Implement fizzbuzz in Go with tests" \
--lead claude:coder --partner codex:planner \
--cwd ~/code/scratch
# Seed duet from Claude Code's real /review output.
./duet.py --recap --task-from-cmd 'claude -p /review' \
--lead claude:reviewer --partner codex:coder \
--worktree \
--cwd .
In the review recipe, Claude's /review runs once to produce the kickoff
critique. Duet then hands that critique to Codex, preserves both agent
sessions, and manages the back-and-forth until convergence or the turn limit.
Install the duet command:
make install # symlinks duet.py to ~/.local/bin/duet
make ci # everything the CI gate runs: unit + reasoning + smoke + complexity
make test # unit tests (tests/test_duet.py) + scripts/smoke.sh dry-run checks
make unit-test # only the stdlib unittest suite under tests/
make smoke-test # only scripts/smoke.sh dry-run regression checks
make complexity # cyclomatic-complexity/length gate (single-file sprawl guard)
make build # sdist + wheel into dist/ (needs: python3 -m pip install build)
make loop-test # slow real Claude/Codex loop checks; writes runs/test-loop/
Or skip the clone — the PyPI package is duet-cli (bare duet is taken) and
the command it installs is duet:
uvx --from duet-cli duet --task "..." # one-shot run, isolated, no install
pipx install duet-cli # puts the duet command on PATH
pipx install 'duet-cli[yaml]' # include PyYAML for --config foo.yaml
Plain pip install duet-cli works too, but the installed top-level module is
named duet, which collides with Google's PyPI duet package in a shared
environment — pipx/uvx isolation avoids that.
Claude Code plugin — adds the /duet command (it shells out to the duet
CLI, so install the binary with one of the methods above as well):
/plugin marketplace add volkan/duet
/plugin install duet@volkan-duet
CI (.github/workflows/ci.yml) runs make ci's checks on every PR across
Python 3.9/3.11/3.13. To make them block merges, mark them required in branch
protection — see .github/BRANCH_PROTECTION.md
(admins can still force-merge).
How It Works
Each agent keeps its own conversation memory:
- Claude resumes with
claude -p --resume <session_id>. - Codex resumes with
codex exec resume <session_id>when duet captured one from Codex's stderr, orcodex exec resume --lastin the working directory as a fallback for older builds that don't print a session id.
On each turn, duet sends the latest reply from one agent to the other. It
continues until both agents accept convergence in back-to-back turns, --turns
is reached, a timeout happens, or you press Ctrl-C. A convergence proposal must
include an LGTM rationale: explaining why the work is done, followed by the
sentinel <<<LGTM>>> on its own line; a bare sentinel is ignored.
Finished runs record the reason in state.json: user interruption stays
force_stop, per-turn agent timeouts are timeout, and non-timeout agent
command failures or malformed required output are agent_error.
If you pass --verify-cmd, duet runs that shell command before counting a
valid convergence proposal. Exit code 0 allows the proposal to count; any
non-zero exit, timeout, or execution error feeds a capped failure block to the
next agent turn. --dry-run records and prints the configured command but
does not execute it.
After normal loop endings, duet opens a force> prompt. Press Enter to
finish, or type feedback to force another round; duet sends the next agent the
previous reply plus your feedback, including any appended worktree handoff
block and diff.
Common Recipes
Call Claude Code's real /review skill through duet:
./duet.py --recap --task-from-cmd 'claude -p /review' \
--lead claude:reviewer --partner codex:coder \
--worktree \
--cwd ~/workspace/project \
--turns 6
The /review skill supplies the initial findings; duet handles the subsequent
Codex fix turn, Claude verification turn, worktree diff handoff, and any extra
rounds.
With the /duet Claude Code command installed (the plugin from the install
section above, or the manual skill copy in docs/USAGE.md),
plain /duet runs that same /review kickoff recipe.
Let duet run the upstream command inside the target project:
./duet.py --task-from-cmd 'npm test 2>&1' \
--lead claude:coder --partner codex:planner \
--cwd ~/workspace/project \
--worktree --worktree-for lead
Use a repeatable config:
./duet.py --config duet.example.yaml
Useful packaged configs:
examples/pr-review.yaml- deep review of the latest commit.examples/codex-test-fix.yaml- Codex planner diagnoses failing checks; Codex coder fixes them in a worktree.
Same-backend peering:
# Codex planner + Codex coder. The worktree gives one Codex peer a separate cwd.
./duet.py --task "Fix the issue" \
--lead codex:planner --partner codex:coder \
--worktree --turns 6
# Claude coder + Claude reviewer.
./duet.py --task "Review and fix the current change" \
--lead claude:coder --partner claude:reviewer \
--turns 6
For codex/codex runs in one cwd, duet requires both peers to produce Codex
session UUIDs on their first turns. If either peer would fall back to
codex exec resume --last, duet aborts rather than risk resuming the other
peer's session. Use --worktree when in doubt.
Require a mechanical check before convergence:
./duet.py \
--task "Fix the issue" \
--lead claude:coder \
--partner codex:reviewer \
--worktree --worktree-for lead \
--verify-cmd 'make test'
Check an in-progress run from another terminal:
./duet.py --status .duet/runs/<id>/
Gate convergence on P0/P1 review findings:
./duet.py --task "Fix the issue" \
--lead claude:coder --partner codex:triage-reviewer \
--cwd ~/workspace/project
Review a recent implementation - Codex reviews at max effort, Claude applies only requested fixes:
./duet.py --recap \
--task "Review the current main branch changes. Codex should act as reviewer: identify any blocking issues in the latest commit. Claude should act as coder: implement only the fixes Codex explicitly requests. Preserve project constraints and run make test before convergence." \
--lead claude:coder \
--partner codex:reviewer \
--reasoning max \
--worktree --worktree-for lead \
--turns 6
The partner speaks first, so Codex (reviewer) opens turn 1 with its critique
and Claude (coder) responds in turn 2 with the fixes. --worktree-for lead
keeps the editable checkout under the coder. Keep --codex-fast off in this
recipe: Codex is the reviewer, so max effort is the point.
That same recipe is also packaged as a YAML config you can drop into any
repo — examples/pr-review.yaml reviews HEAD's diff with the same
agent/effort/worktree pairing, with comments calling out which keys to swap
for variants (review uncommitted changes, review a specific PR by number,
faster iteration once review is mostly done).
Review the latest commit plus an untracked notes file by seeding both into the task:
./duet.py --recap \
--task-from-cmd 'git show --stat --patch --no-ext-diff HEAD && printf "\n\n--- TODO.md ---\n" && cat TODO.md' \
--lead claude:coder \
--partner codex:reviewer \
--reasoning max \
--worktree --worktree-for lead \
--turns 6
Fresh worktrees start from committed HEAD; commit the notes first if the coder
must edit them as a normal tracked file.
Deep planner, fast coder — Claude plans at high effort, Codex coder turns drop to low for latency (uses the default claude:planner + codex:coder pairing):
./duet.py --reasoning high --codex-fast \
--task "Fix the issue" \
--cwd ~/workspace/project
Compact live debug view — see only what each turn produced, in real time:
./duet.py --recap --task "Fix the issue" \
--lead claude:coder --partner codex:planner \
--cwd ~/workspace/project
Output
Every run writes a directory containing:
transcript.md- the full conversation.recap.md- compact per-turn debug view when--recapis enabled;--statusshows this path when present.state.json- run state, agent roles, session ids, finish reason, worktree metadata, andrecap_pathfor recap runs.turn-*.stderr.log- live stderr from each agent invocation.turn-*-verify.log- verify command metadata, stdout, and stderr when--verify-cmdruns.turn-*.pid- present only while an agent or verify command is running.wt/- the git worktree, when--worktreeis enabled.
When a worktree agent replies, duet appends a handoff block to that reply before
the diff. The block names the exact worktree path and branch, warns that the
receiving agent's cwd may be a clean checkout, and includes git -C <wt> review
commands so verification happens against the edited tree.
When --cwd points outside the invocation directory and --runs-dir is not
set, artifacts go under the target project at .duet/runs/<run_id>/.
Documentation
Read docs/USAGE.md for the full reference: flags, sandbox and
network rules, worktree mode, output layout, --status / --continue, force
prompt behavior, session memory, the post-run "apply / iterate / discard"
checklist, and the optional /duet Claude Code command (plugin or manual
skill).
For contributor guidance, read CLAUDE.md. Codex-specific entry notes live in AGENTS.md.
Limits
duet --continue <run>starts a fresh run from a priorstate.json, restores saved session ids, and reuses the previous worktree when available. It does not append to the old transcript.- Parallel Codex sessions in the same cwd are safe when duet captured a
session id from Codex's stderr — that turn pins to the UUID, not to recency.
When the UUID was not captured (old Codex builds, or continuing a pre-UUID
run), duet falls back to
codex exec resume --last, which is cwd-based and unsafe to share.--worktreeisolates duet's Codex cwd from the host repo; in--lastfallback mode, do not start another Codex session inside that same worktree while the run is active. - Transcripts capture full agent text. Convergence detection only counts rationale-backed sentinels outside fenced markdown code blocks.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file duet_cli-0.1.0.tar.gz.
File metadata
- Download URL: duet_cli-0.1.0.tar.gz
- Upload date:
- Size: 58.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.13 {"installer":{"name":"uv","version":"0.11.13","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0cb777fafc0c1e6ca478a42d456889c81adaefd2a3be01d31aba81130161f06b
|
|
| MD5 |
52b3ab8072d5cabefabf7edb09c3c4e0
|
|
| BLAKE2b-256 |
cffae9cb31fdd85c9a77e9d91d046c7a05dbfd5b39f616afbd82f091f2ece2aa
|
File details
Details for the file duet_cli-0.1.0-py3-none-any.whl.
File metadata
- Download URL: duet_cli-0.1.0-py3-none-any.whl
- Upload date:
- Size: 45.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.13 {"installer":{"name":"uv","version":"0.11.13","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1c3e520e78c717f1fb0604aa473941f7628de20d65039d4d5333ac2fd3b2af7b
|
|
| MD5 |
56336dadcd14531e7998ac815c6f029f
|
|
| BLAKE2b-256 |
8924b961aa51ae9a3dc5c524de7244a8b2d17de93cfd603f4680c766b68de54f
|