Coordinate multi-agent implementation workflows from markdown plans.

diamond-dev

diamond-dev orchestrates a configurable multi-agent implementation workflow from a single markdown plan. By default it asks two coding agents (Codex and Claude) to implement the same plan on separate branches, asks a third agent (Gemini) to compare the results, waits for you to accept one branch in the repository wiki, refines the selected branch, runs a CodeRabbit review, applies the accepted fixes, and opens a GitHub PR.

Workflow

flowchart TD
    plan["📄 markdown plan"] --> impl
    subgraph impl["1 · Implement in parallel"]
        codex["codex/<slug>"]
        claude["claude/<slug>"]
    end
    impl --> bundle["2 · Build deterministic comparison bundle"]
    bundle --> judge["3 · Gemini compares branches → comparison.md (wiki)"]
    judge --> accept{"4 · You accept one branch<br/>in the wiki"}
    accept --> refine["5 · Refine the accepted branch"]
    refine --> review["6 · CodeRabbit review → judge findings → apply accepted fixes"]
    review --> finalpr["7 · Final review → open GitHub PR"]

  1. Implement: each configured implementer (default codex, claude) implements the plan on its own branch and commits without pushing.
  2. Bundle: diamond-dev builds a deterministic comparison bundle (branch metadata, diffs, optional test results) for the judge to read.
  3. Compare: the comparison judge (default gemini) writes comparison.md, which is pushed to the GitHub wiki with an acceptance checkbox.
  4. Accept: you check exactly one box in the wiki to choose a branch. The workflow polls the wiki and resumes when it sees your choice.
  5. Refine: the accepted branch is refined per the comparison follow-up.
  6. Review: CodeRabbit reviews the branch, the review judge classifies each finding, and the review fixer applies accepted fixes.
  7. PR: a final reviewer runs and diamond-dev opens a GitHub PR.

⚠️ Security: diamond-dev runs coding agents with their sandbox and approval prompts disabled, and executes package-install and test commands from the target repository. Run it only against repositories and plans you trust. See Security.

Prerequisites

  • Python 3.14+
  • External CLIs, installed and authenticated where needed. The default workflow needs:
    • git
    • gh (authenticated — diamond-dev verifies gh auth status at startup)
    • codex
    • claude
    • gemini
    • coderabbit
  • Optional CLIs, required only when the cloned target repository has matching root lockfiles:
    • uv — for uv.lock (uv sync --locked)
    • pnpm — for pnpm-lock.yaml (pnpm install --frozen-lockfile)

Before cloning or launching agents, diamond-dev runs a fast preflight that checks the configured commands are available on PATH and verifies gh auth status. When you use custom agents, only the CLIs for the adapters you configure are required.
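The preflight described above can be pictured roughly as follows. This is a minimal sketch, not diamond-dev's own code: the names `DEFAULT_COMMANDS`, `find_missing_commands`, and `gh_is_authenticated` are hypothetical, and only `shutil.which` and the `gh auth status` command are taken as given.

```python
import shutil
import subprocess

# Default CLIs named in the prerequisites list above.
DEFAULT_COMMANDS = ["git", "gh", "codex", "claude", "gemini", "coderabbit"]


def find_missing_commands(commands: list[str]) -> list[str]:
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


def gh_is_authenticated() -> bool:
    """Return True when `gh auth status` exits with status 0."""
    try:
        result = subprocess.run(
            ["gh", "auth", "status"], capture_output=True, check=False
        )
    except FileNotFoundError:
        # gh itself is not installed.
        return False
    return result.returncode == 0
```

Running a check like `find_missing_commands(DEFAULT_COMMANDS)` before a long run shows which CLIs still need to be installed.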

Installation

diamond-dev targets Python 3.14+. The recommended installer is uv.

Install the CLI directly from the repository:

uv tool install git+https://github.com/hbmartin/diamond-dev.git

Or clone and install from source for development:

git clone https://github.com/hbmartin/diamond-dev.git
cd diamond-dev
uv sync --all-groups
uv run diamond-dev --version

Quickstart

# 1. Generate a starter config in your working directory.
diamond-dev init

# 2. Write a plan describing the change you want.
$EDITOR my-plan.md

# 3. Run the workflow.
diamond-dev my-plan.md

init asks for the target repository URL, an optional wiki repository URL, and optional notification URLs, then writes .diamond-dev.toml. Everything else uses the defaults described under Configuration.

When the run reaches step 4 of the workflow (Accept), it pauses for your input. Open the comparison page in the repository's GitHub wiki (<slug>-comparison.md), read Gemini's comparison of the two branches, and check exactly one box:

- [x] Accept: codex

diamond-dev polls the wiki, sees your choice, and resumes automatically — refining the accepted branch, running the review, and opening the PR. You do not need to restart the command; if it has exited, rerunning diamond-dev my-plan.md auto-resumes from where it left off.

Usage

diamond-dev path/to/my-plan.md

Run the command from a directory containing .diamond-dev.toml, or pass --config to point at a config file elsewhere. The command takes a path to a .md plan file.

To create a starter config interactively:

diamond-dev init

The initializer asks for the target repository URL, an optional wiki repository URL, and optional notification URLs. It writes only .diamond-dev.toml; workflow, agent, prompt, and comparison settings use the defaults documented below unless you edit the generated file.

Useful flags:

  • --config PATH: Load configuration from a specific TOML file instead of .diamond-dev.toml in the current directory. Relative paths resolve from the invocation directory. With init, this selects the config file to write.
  • --force: With init, overwrite an existing config file without asking.
  • --version: Show the installed diamond-dev version.

Configuration

.diamond-dev.toml requires a target repository URL:

repository_url = "git@github.com:owner/repo.git"

repository_url must be a Git remote URL in a supported URL form such as https://github.com/owner/repo, ssh://git@github.com/owner/repo.git, git://host/owner/repo.git, file:///path/to/repo.git, or an SCP-like form such as git@github.com:owner/repo.git.

Optional top-level keys:

  • wiki_repository_url: GitHub Gollum wiki repository URL. If omitted, GitHub remotes are derived as <repo>.wiki.git. The local wiki clone directory is named from the effective wiki repository URL.
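As an illustration of the `<repo>.wiki.git` derivation, here is a minimal sketch. The function names are hypothetical, the sketch does not check that the remote is actually hosted on GitHub, and the real derivation may handle more URL forms.

```python
def derive_wiki_url(repository_url: str) -> str:
    """Derive a wiki remote by swapping a trailing `.git` for `.wiki.git`."""
    base = repository_url.rstrip("/")
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return f"{base}.wiki.git"


def wiki_clone_dir(wiki_url: str) -> str:
    """Name the local clone directory after the last path segment of the wiki URL."""
    # Treat the ":" of SCP-like URLs as a path separator.
    name = wiki_url.rstrip("/").replace(":", "/").rsplit("/", 1)[-1]
    return name[: -len(".git")] if name.endswith(".git") else name
```

Under these assumptions, `git@github.com:owner/repo.git` maps to `git@github.com:owner/repo.wiki.git`, which would be cloned into `repo.wiki`.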

Optional tables:

[notifications]
initial_implementation_url = "https://example.test/initial"
comparison_url = "https://example.test/comparison"
comparison_implementation_url = "https://example.test/followup"
review_input_needed_url = "https://example.test/review"
open_pr_url = "https://example.test/open-pr"

[prompts]
initial_implementation_file = "prompts/initial.md"
comparison_judgment_file = "prompts/compare.md"
comparison_implementation_file = "prompts/followup.md"
review_judgment_file = "prompts/review-judgment.md"
review_fix_file = "prompts/review-fixes.md"

[workflow]
implementers = ["codex", "claude"]
comparison_judge = "gemini"
# comparison_fixer is optional; omitted means the first non-selected implementer.
review_provider = "coderabbit"
review_judge = "codex"
review_fixer = "codex"
final_reviewer = "claude"

[comparison]
test_commands = []
max_total_diff_bytes = 200000
max_file_diff_bytes = 40000
max_test_output_bytes = 20000

[agents.codex]
model = "gpt-5"

[agents.claude]
model = "opus"

[agents.gemini]
model = "gemini-3"

[agents.claude-fixer]
adapter = "claude"
model = "opus"

Prompt file paths resolve from the config file directory. Prompt overrides replace the built-in task instructions while keeping Diamond Dev's required workflow context, such as artifact filenames and commit/no-push requirements.

Agent table names are workflow-local agent names. Built-in names such as codex, claude, gemini, and coderabbit implicitly use matching adapters. Additional agent names must set adapter to one of those built-ins, which lets a workflow use the same CLI in multiple roles with different models.

The [comparison] table controls the deterministic comparison bundle generated before the comparison judge runs. test_commands defaults to empty, which records tests: not_run for each implementation branch. When set, commands run with sh -lc in each implementation clone; nonzero exits are recorded in the bundle and the workflow still continues to comparison judgment. Test commands are trusted project-specific commands. If they leave uncommitted files, diamond-dev records those dirty files but does not clean them.
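To make the test-command behavior concrete, the sketch below runs one command with `sh -lc`, records a nonzero exit instead of raising, and caps the captured output. The names `CommandResult` and `run_test_command` are hypothetical, and keeping the head of the output (rather than the tail) is an assumption.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CommandResult:
    command: str
    exit_code: int
    output: str


def run_test_command(command: str, cwd: str, max_output_bytes: int = 20000) -> CommandResult:
    """Run one command with `sh -lc` and keep at most max_output_bytes of output."""
    completed = subprocess.run(
        ["sh", "-lc", command],
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        check=False,  # a nonzero exit is recorded, not raised
    )
    # Assumption: keep the head of the output; the real bundle may trim differently.
    capped = completed.stdout[:max_output_bytes].decode("utf-8", errors="replace")
    return CommandResult(command, completed.returncode, capped)
```

Because the exit code is only recorded, a failing test command in one clone does not prevent the other clone's results from being collected.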

Notification URLs are best-effort GET requests. Failures are logged but do not stop the workflow.

Legacy and removed keys (migration)

[prompts].gemini_comparison_file and the legacy top-level gemini_comparison_prompt_file key are still accepted as aliases for [prompts].comparison_judgment_file.

Legacy top-level notification keys are still accepted: notify_initial_implementation_url, notify_comparison_url, notify_comparison_implementation_url, notify_review_input_needed_url, and notify_open_pr_url. A config fails if a legacy key and its table replacement are both present.
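The conflict rule for legacy notification keys can be sketched as follows. The key names come from the text above; the function name and the exact error are hypothetical.

```python
LEGACY_NOTIFICATION_KEYS = {
    "notify_initial_implementation_url": "initial_implementation_url",
    "notify_comparison_url": "comparison_url",
    "notify_comparison_implementation_url": "comparison_implementation_url",
    "notify_review_input_needed_url": "review_input_needed_url",
    "notify_open_pr_url": "open_pr_url",
}


def normalize_notifications(config: dict) -> dict[str, str]:
    """Merge legacy top-level keys into the [notifications] table."""
    merged = dict(config.get("notifications", {}))
    for legacy, current in LEGACY_NOTIFICATION_KEYS.items():
        if legacy not in config:
            continue
        if current in merged:
            # Both spellings present: refuse to guess which one wins.
            raise ValueError(f"{legacy} conflicts with [notifications].{current}")
        merged[current] = config[legacy]
    return merged
```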

The previous notes_repository_url key has been removed. Use wiki_repository_url; configs that still contain the old key fail at startup.

Prompts

Each built-in prompt has a fallback that can be replaced by a configured prompt file (see the [prompts] table); overrides keep the required context wrapper. The prompt builders live in diamond_dev/commands.py:

  • initial_implementation_prompt: asks each configured implementer to implement the plan and commit without pushing.
  • comparison_implementation_prompt: asks the configured comparison fixer to apply requested follow-up changes from the comparison.
  • review_judgment_prompt: asks the configured review judge to classify review findings and write <slug>-review-judgments.json.
  • review_fix_prompt: asks the configured review fixer to implement accepted review fixes, preferring the JSON sidecar when valid and falling back to legacy markdown judgments when it is absent or malformed.
  • gemini_comparison_prompt: adds required branch, repository, and output-file context to the comparison judge prompt.
  • _fallback_prompt: the built-in comparison judgment prompt used when [prompts].comparison_judgment_file is unset or empty.

Generated Repositories

For a plan named My Plan.md, the command uses the slug my-plan. With the default implementers it creates:

  • codex-my-plan on branch codex/my-plan
  • claude-my-plan on branch claude/my-plan
  • <repo-name>.wiki for the GitHub Gollum wiki

For custom implementers, generated implementation clones and branches use the same pattern: <agent-name>-my-plan on branch <agent-name>/my-plan.
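The naming pattern can be sketched in a few lines. The slug rule shown here (lowercase the file stem and join alphanumeric runs with hyphens) is an assumption that reproduces the `My Plan.md` example; the actual slug function may differ for other inputs.

```python
import re
from pathlib import Path


def plan_slug(plan_path: str) -> str:
    """Lowercase the file stem and join alphanumeric runs with hyphens."""
    stem = Path(plan_path).stem
    return re.sub(r"[^a-z0-9]+", "-", stem.lower()).strip("-")


def clone_and_branch(agent: str, slug: str) -> tuple[str, str]:
    """Return (clone directory, branch name) for one implementer."""
    return f"{agent}-{slug}", f"{agent}/{slug}"
```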

The wiki clone is reused if present and synchronized with fast-forward-only pulls. On a fresh run, diamond-dev clones the implementation repository once, makes a preserving local copy for the second agent, then checks out each workflow branch. Implementation clone directories are required on an auto-resume run.

After each implementation clone is prepared on its workflow branch, diamond-dev checks that clone root for package lockfiles. If uv.lock exists, it runs uv sync --locked; if pnpm-lock.yaml exists, it runs pnpm install --frozen-lockfile. Repositories with both lockfiles run both commands in that order in each clone. Repositories with neither lockfile skip package install. These install commands can execute dependency lifecycle scripts from the target repository, so run diamond-dev only against repositories you trust (see Security).
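The lockfile check amounts to a small lookup, sketched below. `INSTALL_COMMANDS` and `install_commands_for` are hypothetical names; the lockfile names, commands, and ordering come from the paragraph above.

```python
from pathlib import Path

# Lockfile name -> install command, in the order the text describes.
INSTALL_COMMANDS = [
    ("uv.lock", ["uv", "sync", "--locked"]),
    ("pnpm-lock.yaml", ["pnpm", "install", "--frozen-lockfile"]),
]


def install_commands_for(clone_root: Path) -> list[list[str]]:
    """Return the install commands implied by lockfiles in the clone root."""
    return [cmd for name, cmd in INSTALL_COMMANDS if (clone_root / name).is_file()]
```

Only the clone root is inspected in this sketch, matching the "root lockfiles" wording in Prerequisites.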

Auto-Resume

diamond-dev does not write checkpoint files. Rerunning the same plan automatically resumes from existing local implementation clones, workflow branch state, wiki artifacts, and PR state.

The source plan file is immutable for resume. Editing the plan after a run starts causes a plan drift failure, because the copy stored in the wiki or an implementation clone no longer matches the source. To start a different plan, use a new plan filename/slug, or reset the generated repositories and wiki artifacts.
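One way to picture the plan drift check is a digest comparison between the source plan and each stored copy. This is a sketch under the assumption that the comparison is byte-exact; the function names are hypothetical and the real check may normalize content first.

```python
import hashlib


def plan_digest(plan_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the plan contents."""
    return hashlib.sha256(plan_bytes).hexdigest()


def check_plan_drift(source: bytes, stored_copies: dict[str, bytes]) -> None:
    """Raise if any stored copy of the plan differs from the source plan."""
    expected = plan_digest(source)
    drifted = [name for name, data in stored_copies.items() if plan_digest(data) != expected]
    if drifted:
        raise RuntimeError(f"plan drift: source differs from {', '.join(sorted(drifted))}")
```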

Auto-resume requires every configured implementer clone to exist as a Git repository with the configured repository_url as origin. If only some clones are missing, or workflow branches exist on origin while local clones are missing, the run fails clearly.

Branch resume rules:

  • Remote workflow branches must match the local branch exactly; divergence fails.
  • A zero-commit branch counts as complete only when the matching remote branch exists and matches local.
  • Local commits with no remote branch are pushed instead of rerunning that agent.
  • If only one initial agent branch is incomplete, only that agent is rerun.
  • The default branch may have advanced; diamond-dev does not rebase or merge.

Artifact resume rules:

  • If the wiki comparison page exists, it overwrites local comparison.md.
  • If only local comparison.md exists, it is promoted to the wiki with the acceptance checkbox added when missing.
  • The comparison bundle is reused or promoted alongside the comparison page when present.
  • If only a local review file exists, it is promoted to the wiki.
  • If local and wiki review files both exist and differ, the run fails.
  • A valid local review judgment sidecar and valid wiki sidecar must match after canonical JSON parsing. Missing or malformed sidecars are logged and ignored.
  • Existing review files do not skip the configured review fixer; fixes rerun when resume reaches the review phase.
  • If a PR for the selected branch already exists (open, closed, or merged), the run fails before PR creation.
  • Notifications are sent only for phases completed by the current process.

Acceptance & Review Judgments

Before comparison judgment, diamond-dev writes <slug>-comparison-bundle.md in the invocation directory and wiki. The bundle includes branch metadata, changed-file stats, capped file lists and diffs, configured comparison test results, command log paths, and explicit omitted-file lists. The configured comparison judge must read that bundle and write comparison.md in the invocation directory. The command then appends this default line and pushes the file to the GitHub Gollum wiki as <slug>-comparison.md:

- [ ] Accept: (codex/claude)

The workflow accepts only one of these edited values:

- [x] Accept: codex
- [x] Accept: claude

With custom implementers, the checkbox and accepted values use the configured implementer names, for example - [ ] Accept: (codex/claude/aider).

Malformed acceptance markers fail immediately. The command checks once immediately, then waits 2 minutes, then retries with waits of 3 through 12 minutes.
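The polling schedule can be written as a short generator. The sketch assumes the waits grow by one minute per retry (2, 3, 4, and so on up to 12) and that the sequence ends after the 12-minute wait; the text does not say what happens after the last retry. The function name is hypothetical.

```python
from collections.abc import Iterator


def acceptance_wait_minutes() -> Iterator[int]:
    """Yield the wait before each check: 0 (immediate), then 2, 3, ..., 12 minutes."""
    yield 0
    yield from range(2, 13)
```

Under these assumptions the command makes 12 checks over 77 minutes of waiting before it stops polling.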

Review judgment creates a machine-readable sidecar named <slug>-review-judgments.json with schema_version, review_file, review_provider, review_judge, and per-finding id, decision, confidence, and rationale. Valid sidecars are rendered into a deterministic Structured review judgments section in <slug>-review.md; the PR body only includes compact decision counts and any needs_input IDs.
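The compact PR summary can be sketched as a count over the sidecar's findings. The top-level `findings` key and every decision value other than `needs_input` are assumptions made for illustration; only the per-finding `id` and `decision` fields and the `needs_input` value are named in the text above.

```python
from collections import Counter


def summarize_judgments(sidecar: dict) -> tuple[dict[str, int], list[str]]:
    """Return (decision counts, IDs of findings marked needs_input)."""
    findings = sidecar.get("findings", [])  # "findings" key is an assumption
    counts = Counter(item["decision"] for item in findings)
    needs_input = [item["id"] for item in findings if item["decision"] == "needs_input"]
    return dict(counts), needs_input
```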

Security

diamond-dev executes code on your machine from two untrusted-by-default sources — the coding agents and the target repository — so run it only against repositories and plans you trust:

  • Agents run with sandbox and approval prompts disabled. Implementers are launched non-interactively with full edit permissions: codex exec --dangerously-bypass-approvals-and-sandbox, claude -p --permission-mode bypassPermissions --dangerously-skip-permissions, and gemini -p … --skip-trust -y. The agents can read and write files and run commands in their clones without prompting.
  • Package install runs repository lifecycle scripts. When a clone contains uv.lock or pnpm-lock.yaml, diamond-dev runs the matching install command, which can execute dependency lifecycle scripts defined by the target repository.
  • Comparison test commands are trusted and run via sh -lc. Any [comparison].test_commands you configure run in each implementation clone.
  • Logs may contain secrets. Loguru exception logs include local variable values by default. Set DIAMOND_DEV_LOG_DIAGNOSE=0 (or false/no/off) to disable this if your logs may capture sensitive values.

Logging

diamond-dev uses Loguru for console, readable text file, and JSONL file logging. Logs are written to stderr, logs/diamond-dev.log, and logs/diamond-dev.jsonl by default.

Agent subprocess logs are written under logs/ and streamed through Loguru. Agents commit their changes; diamond-dev pushes committed work. If uncommitted files remain, they are logged and included in the final PR body.

Each run also writes logs/run-report.json, a structured summary containing the run status, chosen agent, branches, PR URL, dirty-file records, per-phase timings, non-fatal phase warnings, preflight details, and per-step command log paths. The report includes comparison bundle and review judgment sidecar paths, plus the sidecar parse status. Runs that finish after skipped or failed best-effort phases report succeeded_with_warnings and include those warnings in the PR body.

Configure logging with environment variables:

  • DIAMOND_DEV_LOG_LEVEL: Log level for console, text file, and JSONL output. Defaults to INFO.
  • DIAMOND_DEV_LOG_FILE: File path for readable persistent logs. Defaults to logs/diamond-dev.log.
  • DIAMOND_DEV_JSON_LOG_FILE: File path for serialized JSONL logs. Defaults to logs/diamond-dev.jsonl.
  • DIAMOND_DEV_LOG_DIAGNOSE: Whether Loguru should include local variable values in exception tracebacks. Defaults to enabled. Disable with 0, false, no, or off if logs may contain secrets.
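The environment variables and their defaults can be summarized in a small resolver. This is a sketch: the function name and the returned keys are hypothetical, and treating the "off" values as case-insensitive is an assumption.

```python
from collections.abc import Mapping

FALSE_VALUES = {"0", "false", "no", "off"}


def logging_settings(env: Mapping[str, str]) -> dict[str, object]:
    """Resolve the documented logging variables against their defaults."""
    diagnose_raw = env.get("DIAMOND_DEV_LOG_DIAGNOSE", "")
    return {
        "level": env.get("DIAMOND_DEV_LOG_LEVEL", "INFO"),
        "log_file": env.get("DIAMOND_DEV_LOG_FILE", "logs/diamond-dev.log"),
        "json_log_file": env.get("DIAMOND_DEV_JSON_LOG_FILE", "logs/diamond-dev.jsonl"),
        # Enabled unless explicitly set to one of the documented "off" values.
        "diagnose": diagnose_raw.strip().lower() not in FALSE_VALUES,
    }
```

Passing `os.environ` as `env` would resolve the settings for the current shell.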

File logs rotate at 10 MB, retain rotated files for 30 days, compress rotated logs as zip files, use UTF-8 with fallback escaping, and are created with owner read/write permissions. Exception logs include extended tracebacks. When OpenTelemetry is installed, log records include the active trace ID, span ID, sampled flag, and service name; otherwise those fields are present with default zero or empty values.

Troubleshooting & FAQ

Preflight fails with a missing command. The named CLI is not on PATH. Install it (see Prerequisites) or, if you don't use it, remove the corresponding agent from [workflow]. Only the CLIs for configured adapters are checked.

Preflight fails on gh auth status. Authenticate the GitHub CLI with gh auth login (or set GH_TOKEN) before running.

"Plan drift" failure. The source plan was edited after a run started, so it no longer matches the copy stored in the wiki or an implementation clone. The plan is immutable for resume — start over with a new plan filename/slug, or reset the generated repositories and wiki artifacts. See Auto-Resume.

The run exited while waiting for acceptance. That's fine. Edit the acceptance checkbox in the wiki, then rerun diamond-dev my-plan.md; it auto-resumes and picks up your choice.

A run finished with succeeded_with_warnings. One or more best-effort phases (such as a notification or an optional test command) were skipped or failed but did not block the workflow. The specific warnings are listed in logs/run-report.json and the PR body.

A PR already exists for the selected branch. Auto-resume fails before PR creation if any PR (open, closed, or merged) already exists for the accepted branch. Resolve or rename the branch, or start a new plan slug.

Where are the artifacts? Comparison bundle, comparison page, review file, and review-judgment sidecar are written in the invocation directory and pushed to the GitHub wiki. Per-run logs and run-report.json are under logs/.

License

diamond-dev is (C) 2026 Harold Martin and licensed under the Apache License, Version 2.0. See LICENSE for details.
