Evidence-first rails for agentic software delivery: Markdown roadmaps, paired evidence, and a machine-verified commit gate.
Delivery Workbench
Delivery Workbench is a planning and commit gate system for Git repositories where AI agents do much of the work. It addresses two problems: agents claim work is done when it is not, and months later nobody can tell what a commit shipped or what tested it.
Plans are Markdown files in the repo, organized as phases and stories. A story cannot be marked done until a command run is recorded in its evidence file. A commit cannot land until a pre-commit hook checks a contract whose facts (branch, HEAD, staged tree) are stamped and re-verified. Each commit carries trailers naming the story it shipped and the contract that certified it. State is Markdown files and git data; there is no database or server.
Humans and agents use the same commands. Agents can also use the included MCP server.
Install
pipx install delivery-workbench
# or
brew install karolswdev/tap/delivery-workbench
Then set up any Git repository:
dw install /path/to/repo --skip-bootstrap
This copies the hooks, the CLI, and the MCP server into the repo's
.githooks/ directory and points core.hooksPath at it. Commits
are gated by the copy inside the repo, not by the global install.
dw update /path/to/repo refreshes the copy;
dw update /path/to/repo --check reports whether it is stale.
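To confirm that a clone is wired to the vendored copy rather than to Git's default hooks, one could read core.hooksPath back from Git. The following is a minimal sketch, not part of Delivery Workbench: the function names are hypothetical, and because the text does not say whether dw install writes a relative or an absolute path, the check accepts both.

```python
import subprocess
from pathlib import Path
from typing import Optional


def configured_hooks_path(repo: Path) -> Optional[str]:
    """Return the repo's core.hooksPath, or None if it is unset."""
    result = subprocess.run(
        ["git", "-C", str(repo), "config", "--get", "core.hooksPath"],
        capture_output=True,
        text=True,
    )
    # `git config --get` exits non-zero when the key is not set.
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def points_at_vendored_hooks(hooks_path: Optional[str], repo: Path) -> bool:
    """True when hooks_path resolves to <repo>/.githooks (hypothetical helper)."""
    if hooks_path is None:
        return False
    candidate = Path(hooks_path)
    if not candidate.is_absolute():
        # Git resolves a relative hooks path against the repository root.
        candidate = repo / candidate
    return candidate.resolve() == (repo / ".githooks").resolve()
```

Calling points_at_vendored_hooks(configured_hooks_path(repo), repo) gives a quick yes or no. dw doctor is the supported way to check the wiring.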
For a project with existing history, there is an adoption flow that inspects the repo and drafts a roadmap for you. See the framework README.
The daily loop
.githooks/dw next # what should I work on?
.githooks/dw story status myapp 2 3 in-progress
# ... do the work ...
.githooks/dw evidence capture myapp 2 3 -- npm test
.githooks/dw story status myapp 2 3 done # refuses if no evidence exists
git add -A
.githooks/dw contract new # stamps verified facts into .tmp/CONTRACT.md
# read the contract, verify each rule actually holds, check its boxes
git commit # the hook re-verifies everything
Checking the contract's boxes is deliberately manual: it is the attestation that each rule was verified. No command or tool does it.
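A reader-side helper can still list which boxes remain open before attempting a commit, without ticking any of them. The sketch below assumes the contract uses ordinary Markdown task-list syntax ("- [ ]" and "- [x]"); the actual layout of .tmp/CONTRACT.md is not shown in this document, so treat the pattern and the function name as illustrative.

```python
import re
from typing import List

# Matches a Markdown task-list item such as "- [ ] rule" or "* [x] rule".
# Assumption: the contract uses this syntax for its boxes.
TASK_ITEM = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_rules(contract_text: str) -> List[str]:
    """Return the text of every task-list item that is still unchecked."""
    pending = []
    for line in contract_text.splitlines():
        match = TASK_ITEM.match(line)
        if match and match.group(1) == " ":
            pending.append(match.group(2).strip())
    return pending
```

The helper only reports; verifying each rule and checking its box stays a manual act.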
sequenceDiagram
participant Dev as Human or agent
participant DW as dw CLI
participant Git as git commit
participant Gate as pre-commit gate
Dev->>DW: dw story status ... in-progress
Dev->>Dev: do the work
Dev->>DW: dw evidence capture ... -- <verify command>
Dev->>DW: dw story status ... done (refuses without evidence)
Dev->>DW: dw contract new (stamps verified facts)
Dev->>Dev: verify each rule, check its boxes
Dev->>Git: git commit
Git->>Gate: re-derive every stamped fact
Gate-->>Git: pass, or block naming the failed rule
Git->>Git: stamp PMO trailers, archive the contract
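The "stamp, then re-derive" step in the sequence above can be pictured as two snapshots of the same Git facts compared at commit time. This is a sketch under assumptions, not the gate's code: the fact names, the dictionary shape, and the helper functions are hypothetical, and only the three facts the description names (branch, HEAD, staged tree) are covered.

```python
import subprocess
from typing import Callable, Dict, List


def git_output(*args: str) -> str:
    """Run a git command in the current directory and return trimmed stdout."""
    return subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    ).stdout.strip()


def derive_facts(run: Callable[..., str] = git_output) -> Dict[str, str]:
    """Derive branch, HEAD, and staged tree from the repository."""
    return {
        "branch": run("rev-parse", "--abbrev-ref", "HEAD"),
        "head": run("rev-parse", "HEAD"),
        # `git write-tree` prints the tree object id for the current index.
        # It also writes that tree object into the object database.
        "index_tree": run("write-tree"),
    }


def stale_facts(stamped: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Names of stamped facts that no longer match the freshly derived ones."""
    return sorted(k for k, v in stamped.items() if current.get(k) != v)
```

If stale_facts returns a non-empty list, something changed between stamping and committing (for example, a file was staged afterward), which is the kind of drift the gate is described as blocking.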
Tracing a commit
The artifact chain:
flowchart LR
C[commit + PMO trailers] --> S[story file]
S --> E[evidence file with captured runs]
C --> A[archived contract in .git]
P[current-phase-status] --> S
E -.proves.-> S
This repository uses its own gate, so the chain can be inspected here. One commit:
$ git log -1 --format='%h %s%n%(trailers:key=PMO-Story)%(trailers:key=PMO-Contract-Digest)' ec1fb4a
ec1fb4a Complete WLA-10-03: guarded mutation tools on the MCP surface
PMO-Story: WLA-10-03
PMO-Contract-Digest: sha256:2700dd6a9c8e8ee8ec6053e7a741ace4123ba6750b8946bf2331af9ecadc3777
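The trailer lines printed above are plain "Key: value" pairs, so they are easy to read mechanically. Here is a simplified sketch of pulling them out of a full commit message; it is not how dw parses them, and it is less thorough than git interpret-trailers --parse, which handles folded and unusual trailers.

```python
from typing import Dict


def parse_trailers(message: str) -> Dict[str, str]:
    """Collect `Key: value` trailers from the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if not paragraphs:
        return {}
    trailers = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(": ")
        # A trailer key is a single token with no spaces, which also keeps
        # a subject line like "Complete X: ..." from being misread.
        if sep and key and " " not in key:
            trailers[key] = value.strip()
    return trailers
```

Given the commit shown above, parse_trailers would map PMO-Story to WLA-10-03 and PMO-Contract-Digest to the sha256 value.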
The trailer names the story. The story file states the acceptance criteria. Its paired evidence file contains the recorded run that justified marking it done, including the exact command, exit code, and staged-tree hash at capture time:
### Captured run — 2026-07-03T19:59:44Z
- **Command:** `bash -c ... bash pmo-roadmap/tests/mcp-server.sh; python3 pmo-roadmap/tests/dw-core-tests.py ...`
- **Exit code:** 0
- **Index-tree:** b1c5aaa6e7845d8143d9f3cf24c039d491e7e1fd
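Because a captured run is recorded as labeled Markdown bullets, a script can recover its fields. The sketch below assumes every field follows the "- **Label:** value" shape seen in the excerpt; the function name is hypothetical, and real evidence files may carry fields not shown here.

```python
import re
from typing import Dict

# Matches bullet lines of the form "- **Label:** value".
FIELD = re.compile(r"^- \*\*(?P<label>[^:*]+):\*\*\s*(?P<value>.*)$")


def parse_captured_run(block: str) -> Dict[str, str]:
    """Map each bolded label in a captured-run block to its raw value."""
    fields = {}
    for line in block.splitlines():
        match = FIELD.match(line.strip())
        if match:
            # Strip the backticks that wrap the recorded command.
            fields[match.group("label")] = match.group("value").strip("`")
    return fields
```

A check such as parse_captured_run(text)["Exit code"] == "0" then becomes a one-line assertion in an audit script.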
The certified contract is archived under
.git/pmo-contract-archive/<sha>. Because hooks only run where they
are installed, dw verify re-checks the structural rules from
pushed history, and CI catches commits that bypassed a local gate:
$ .githooks/dw verify --all
dw verify: ok (45 commits verified, 17 pre-epoch skipped)
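The PMO-Contract-Digest trailer has the familiar "sha256:<hex>" shape. As an illustration only, the sketch below compares some bytes against such a value. It assumes the digest is a plain SHA-256 over the archived contract's bytes, which this document does not state; the contract rules and dw verify are the authority on what is actually hashed.

```python
import hashlib


def matches_digest(data: bytes, trailer_value: str) -> bool:
    """Compare data against a trailer value shaped like 'sha256:<hex>'."""
    algorithm, sep, expected = trailer_value.partition(":")
    # Only the sha256 prefix seen in the example trailer is handled here.
    if not sep or algorithm != "sha256":
        return False
    return hashlib.sha256(data).hexdigest() == expected.strip().lower()
```

Under that assumption, reading the file under .git/pmo-contract-archive/<sha> and passing its bytes along with the trailer value would show whether the archived contract is the one the commit names.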
The CLI
| Command | What it does |
|---|---|
| dw next | The next actionable story. Exit 0 found, 2 nothing to do. |
| dw context --compact | JSON snapshot of the roadmap: issues, warnings, next story, trace paths. |
| dw check | Lints roadmap structure and evidence content. Greppable errors, exit 1 on issues. |
| dw story status <p> <ph> <st> <status> | Updates a story's status transactionally. Refuses done without evidence. |
| dw evidence capture <p> <ph> <st> -- <cmd> | Runs the command and records it into the story's evidence file. |
| dw contract new | Writes .tmp/CONTRACT.md with stamped, machine-verified facts. |
| dw gate | Dry-runs the commit gate against the current stage. |
| dw verify [--all] | Re-checks the gate's structural rules over pushed history. |
| dw phase create, dw story create | Scaffolding for new roadmap work. |
| dw doctor | Checks the wiring in this clone. |
All commands have stable exit codes. The orientation commands
support --json or --porcelain output.
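Stable exit codes are what let a script or agent branch on a command's result without parsing its text. A small sketch for dw next, using only the two codes documented in the table (0 and 2); the function names and the "unexpected" bucket are this example's own, and the .githooks/dw path follows the daily loop shown earlier.

```python
import subprocess


def interpret_next_exit(code: int) -> str:
    """Translate the exit codes the table documents for `dw next`."""
    if code == 0:
        return "found"
    if code == 2:
        return "nothing-to-do"
    # Any other code is not described in the table, so it is not guessed at.
    return "unexpected"


def run_next(dw: str = ".githooks/dw") -> str:
    """Invoke `dw next` and classify the result by its exit code."""
    completed = subprocess.run([dw, "next"], capture_output=True, text=True)
    return interpret_next_exit(completed.returncode)
```

An automation loop could stop cleanly on "nothing-to-do" and escalate on "unexpected".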
The MCP server
dw install also vendors .githooks/dw-mcp and writes an entry into
the repo's .mcp.json, which Claude Code and other MCP clients pick
up automatically. The server exposes nine tools backed by the same
code as the CLI: dw_context, dw_next, dw_check, dw_doctor,
dw_verify, dw_gate, dw_story_status, dw_evidence_capture, and
dw_contract_new.
An agent can take a story from backlog to done through tool calls alone, with the same refusals the CLI gives. Two operations are deliberately absent: certifying a contract and creating a commit. Schemas and design are in docs/mcp.md.
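For readers unfamiliar with MCP, a tool call is a JSON-RPC 2.0 message with the method tools/call. The sketch below builds such a request for one of the tools listed above. The empty arguments object is an assumption; the real input schemas are in docs/mcp.md. A live session would also need the protocol's initialize handshake first, which an MCP client normally handles.

```python
import json
from typing import Any, Dict, Optional


def tools_call_request(request_id: int, tool: str,
                       arguments: Optional[Dict[str, Any]] = None) -> str:
    """Serialize an MCP `tools/call` request as one JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # Assumption: the tool accepts an empty arguments object by default.
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(payload)
```

For example, tools_call_request(1, "dw_next") yields the message a client would send to ask for the next actionable story.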
The web view
dw-workbench --root /path/to/repo serves a localhost-only page for
browsing the roadmap: phase tables, story and evidence pairs, a
health console, and the trace from a story to the commits that
shipped it. It can edit roadmap files through a guarded
preview-then-apply flow. It never stages or commits.
More screenshots and two terminal recordings are in demos/.
Other components
- Local work logs: consent-gated daily notes of what each commit delivered.
- A Claude Code plugin with slash commands and a skill covering the operating loop.
- A managed CLAUDE.md block installed into adopted repos.
- A copyable verify-history CI job that re-checks pushed history on every pull request.
This repo runs on it
Every phase and story of the framework was shipped through its own
gate: ten phases, each story with evidence, every commit with
trailers and an archived contract, the full history passing
dw verify --all. The trail is in
pmo-roadmap/pm/roadmap/work-log-automation/.
Documentation
- Architecture, with the test that proves each claim
- Framework README: install, update, adopt, operate
- The contract rules
- Remote verification design
- Contribution rails: what survives a pull request
- MCP surface design
- Riders: the symbiosis contract: one brief, every agent surface (Claude Code, Codex, pi, HoldSpeak)
- The Phase 12 journal: the worked example — a phase delivered on its own rails, written in the moment, refusals and dead ends included
- Distribution design
- Contributing and changelog
Tests
The suites live in pmo-roadmap/tests/ and run standalone. CI runs
all of them on Ubuntu and macOS, the unit suite on Python 3.9 (the
floor), and history verification on every push.
License
MIT. Current version: 1.7.0.