Multi-agent adversarial review & deliberation for plans/specs on subscription CLIs (reduce rework before execution)
challenge-plans
Chinese documentation: README-zh.md
Adversarially review a plan before you execute it — across the AI coding CLIs you already have logged in. No API keys.
challenge-plans orchestrates the subscription CLIs on your machine (Claude Code, Codex, …) to cross-examine a plan and surface the flaws that cause rework — then aggregates only the objections that survive scrutiny into a single verdict. It runs as a CLI, an agent skill, or a CI gate, and slots into the superpowers plan lifecycle: writing-plans → challenge-plans → executing-plans.
Why challenge-plans
- 🔑 No API keys, no per-token charges — it drives the subscription CLIs you're already logged into (Claude Code, Codex). Bring at least one.
- 🧪 Evidence beats headcount — a minority objection with a reproduction can override a majority; correctness isn't decided by voting.
- 🤝 Cross-family verification: an objection only earns hard-gate authority (✓) when an independent model family reproduces it with line-anchored evidence. Single-model claims stay advisory.
- 🛡️ Guards 7 multi-agent failure modes: vote loss, option anchoring, premature hand-off, majority-over-minority, single-round complacency, false consensus, false convergence (see docs/how-it-works.md).
- 🌍 Speaks your language: `--lang zh` (or `ja`, `de`, …) localizes every finding; one flag, no separate build.
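The cross-family rule above can be pictured with a minimal sketch. This is not the project's implementation: the `Objection` class, its field names, and the helper functions are hypothetical, and the family labels are only borrowed from the sample output further down.

```python
from dataclasses import dataclass, field


@dataclass
class Objection:
    """Hypothetical record for one objection (field names are assumptions)."""
    summary: str
    severity: str                      # e.g. "med" or "high", as in the sample output
    raised_by: str                     # model family that raised it, e.g. "claude"
    reproduced_by: set[str] = field(default_factory=set)  # families that reproduced it


def is_cross_family_verified(obj: Objection) -> bool:
    # A reproduction only counts when it comes from a family other than the
    # one that raised the objection.
    return bool(obj.reproduced_by - {obj.raised_by})


def tag(obj: Objection) -> str:
    # Mirrors the "[high✓]" / "[med ]" style seen in the sample output.
    mark = "✓" if is_cross_family_verified(obj) else " "
    return f"[{obj.severity}{mark}]"
```

Under these assumptions, an objection that only its own family reproduces stays advisory, however many reviewers from that family agree with it.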
What it reviews
You don't have to write specs to use this. Three things it does:
- 📋 Any plan: a trip, a launch, a hire, a move. `--type plan` checks it for the ways plans go wrong (more below).
- 📝 A drafted spec or design doc before you build: `--type spec`.
- 🔧 A code change as a lightweight review: `--type diff`.
And when you're torn between options, `weigh` votes across them with weighted, dissent-exposing deliberation.
Quickstart
Easiest — hand it to your agent. Tell it:
Install challenge-plans from https://github.com/hiadrianchen/challenge-plans and run `challenge-plans doctor`.
It sets up the package and reports which backends are ready.
Or install it yourself (Python ≥ 3.10):
pip install challenge-plans # or: pipx install challenge-plans · uvx challenge-plans doctor
challenge-plans doctor # which CLIs are logged in
Bring at least one logged-in CLI: Claude Code (`claude`) or OpenAI Codex (`codex`); two different vendors unlock cross-family verification. To use it as an agent skill, drop SKILL.md where your agent discovers skills. (Developing? `git clone … && pip install -e .`)
See it work
Here's the "any plan" scenario — one of the three above — on a rough Kyoto-trip plan (examples/plan-sample.md):
$ challenge-plans run examples/plan-sample.md --type plan --sink markdown
# challenge-plans · challenge · verdict: request_changes
- [high✓] Non-refundable flights locked before validating the trip @L10 (irreversibility_or_high_cost, by claude:feasibility)
- [med ] Day 2 packs six sights across the city — likely undoable @L4 (ignored_constraint, by gpt:risk)
- [med ] "A good trip" is never defined, so nothing can be judged @L1 (missing_success_criteria, by claude:goal-alignment)
Each line is a tagged objection — anchored to a line, raised by one reviewer with a single job. The three findings above came from three reviewers:
- Feasibility (can it actually be done?) → caught the non-refundable flight locked in before the plan is even validated.
- Risk (what's most likely to go wrong / is irreversible?) → caught Day 2 cramming six sights across the city with no time budget.
- Goal-alignment (do the steps serve the goal?) → caught that "a good trip" is never defined, so nothing can be judged.
Every objection is tagged from a fixed menu of ways a plan breaks — which keeps findings concrete and de-duplicable. Here's what each one catches, in this trip:
- `irreversibility_or_high_cost`: booking a non-refundable flight before validating the plan
- `ignored_constraint`: six sights in one day, no time or energy budget
- `missing_success_criteria`: never saying what "a good trip" means
- `dependency_or_sequencing_gap`: a 10:00 train right after "last-minute shopping"
- `unaddressed_risk`: going in mid-July with no rain/heat backup
- `unstated_assumption`: assuming the famous kaiseki place has a table
- `no_fallback`: no plan B if that restaurant is booked out
- `goal_misalignment`: a "relaxing" trip scheduled dawn to midnight
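A closed menu of tags is what makes de-duplication mechanical. The sketch below illustrates the idea only: the eight tag values are taken from the list above, but the `FailureTag` enum, the `dedupe` helper, and the choice of (tag, line anchor) as the duplicate key are assumptions, not the tool's documented behavior.

```python
from enum import Enum


class FailureTag(str, Enum):
    """The eight tags listed above, as a closed set (hypothetical class)."""
    IRREVERSIBILITY_OR_HIGH_COST = "irreversibility_or_high_cost"
    IGNORED_CONSTRAINT = "ignored_constraint"
    MISSING_SUCCESS_CRITERIA = "missing_success_criteria"
    DEPENDENCY_OR_SEQUENCING_GAP = "dependency_or_sequencing_gap"
    UNADDRESSED_RISK = "unaddressed_risk"
    UNSTATED_ASSUMPTION = "unstated_assumption"
    NO_FALLBACK = "no_fallback"
    GOAL_MISALIGNMENT = "goal_misalignment"


def dedupe(findings: list[dict]) -> list[dict]:
    """Keep the first finding per (tag, line) pair; reject unknown tags."""
    kept: dict[tuple[FailureTag, int], dict] = {}
    for item in findings:
        # FailureTag(...) raises ValueError for anything outside the menu.
        key = (FailureTag(item["category"]), item["line"])
        kept.setdefault(key, item)
    return list(kept.values())
```

Because the tag is an enum value rather than free text, two reviewers who describe the same flaw in different words can still collapse into one finding.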
--profile fast runs one reviewer, standard all three, deep several rounds until no new objection survives.
Usage
As a skill, you don't memorize flags — just ask your agent. Say "review this plan with challenge-plans", "is this spec ready to build?", or even "how do I use challenge-plans?" — it picks the mode and command for you and brings back the surviving objections.
Running it directly? Here's the map:
| I want to… | Run |
|---|---|
| See which backends are ready | challenge-plans doctor |
| Review any plan | challenge-plans run trip.md --type plan --sink markdown |
| Review a spec before building | challenge-plans run spec.md --type spec --sink markdown |
| Review a code change | git diff > c.diff && challenge-plans run c.diff --type diff |
| Choose among options | challenge-plans weigh options.yaml --sink markdown |
| Get findings in Chinese | add --lang zh |
| Use it as a CI gate | add --enforce (non-approve exits non-zero) |
--profile fast|standard|deep trades speed for depth. [sev✓] = cross-family verified (may hard-gate); [sev?] = unverified, advisory. Ready-to-run samples live in examples/. Not pip-installed? Prefix with PYTHONPATH=src python3 -m challenge_plans.cli ….
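For pipelines that are scripted in Python rather than shell, the CI gate row can be wrapped as in the sketch below. It relies only on what the table states (`--enforce` makes a non-approve verdict exit non-zero); the helper names and the injectable `runner` parameter are hypothetical, added so the wrapper can be tried without the CLI installed.

```python
import subprocess
from typing import Callable


def build_gate_command(path: str, artifact_type: str = "plan") -> list[str]:
    # Flags are the ones shown in the table above, with --enforce added.
    return [
        "challenge-plans", "run", path,
        "--type", artifact_type,
        "--sink", "markdown",
        "--enforce",
    ]


def passes_gate(
    path: str,
    artifact_type: str = "plan",
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True only when the command exits with status 0."""
    result = runner(build_gate_command(path, artifact_type))
    return result.returncode == 0
```

A CI step could then call `passes_gate("c.diff", "diff")` and fail the job when it returns `False`.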
Output in your language
The codebase is English, but reviewers can answer in any language — add --lang:
challenge-plans run plan.md --type plan --lang zh # findings in Chinese
challenge-plans weigh options.yaml --lang ja # deliberation in Japanese
It switches only the human-readable prose; JSON keys, enum values, and line anchors stay verbatim, so parsing and CI gates are unaffected (equivalent to setting CHALLENGE_PLANS_LANG). Your agent can pass --lang <user-language> to localize the whole review.
How it works
Multiple persona/CLI challengers steelman then attack the artifact; a cross-family Verifier must reproduce a high/critical objection with line-anchored evidence before it can hard-gate; findings are de-duplicated and resolved into one 6-state verdict — with an incomplete panel never passing as approve. The full mechanism, the two modes, the three-phase deliberation flow, and the 7 failure modes are in docs/how-it-works.md. It also composes with superpowers and grill-me — see there.
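The rule that an incomplete panel never passes as approve can be sketched as a guard in front of the verdict. Only `approve` and `request_changes` appear in this README; the `incomplete` label below is a hypothetical stand-in for whichever of the six states the tool actually uses, and `resolve_verdict` is not the project's function.

```python
def resolve_verdict(expected: set[str], reported: set[str], hard_gating: int) -> str:
    """Toy resolution over three outcomes (the real verdict has six states).

    expected    -- reviewers that were supposed to run
    reported    -- reviewers that actually returned a result
    hard_gating -- count of cross-family verified objections that may hard-gate
    """
    if hard_gating > 0:
        return "request_changes"
    if expected - reported:
        # A missing reviewer means silence, not agreement.
        return "incomplete"   # hypothetical state name
    return "approve"
```

The point of the ordering is that `approve` is reachable only on the last line, after both the objection check and the completeness check have passed.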
Backends
Drives whatever subscription coding CLI you already have logged in — Claude Code (claude) or OpenAI Codex (codex); not tied to any one. Two different vendors cross-verify findings; with one, results stay advisory. No API keys, no per-token charges from this tool. It needs at least one logged-in CLI — challenge-plans doctor names each backend's state and the exact fix (install, or log in).
Status
v1 — usable. Both modes work end-to-end, validated against real plans/specs, pinned by a pytest suite, and hardened across multiple cross-agent adversarial-review rounds (the README itself included). Known boundaries are listed in docs/how-it-works.md.
Contributing
Issues and PRs welcome — see CONTRIBUTING.md. The project is dogfooded: review your own change with challenge-plans run <change>.diff --type diff before opening a PR.
License