AI development governance engine — prevent hallucinated completion and vanishing intent in agent-assisted software teams.

Project description

RunProof

Make AI prove it works.

RunProof is a verification engine for AI-driven development.

It prevents agents from claiming work is complete without real execution evidence.

If a command didn’t run — or failed — the system blocks progress.

Why not existing spec-driven development (SDD) frameworks?

Feature	RunProof	Spec-Kit	OpenSpec	BMAD	Agent Teams
Spec-driven workflow	✅	✅	✅	✅	⚠️ Partial
Structured artifacts	✅	✅	✅	✅	⚠️ Partial
Persistent state (on disk)	✅	❌	⚠️ Partial	⚠️ Partial	❌
Execution verification	✅	❌	❌	❌	❌
Blocks fake completion	✅	❌	❌	❌	❌
Command execution evidence	✅	❌	❌	❌	❌
Checksum validation	✅	❌	❌	❌	❌
Enforced workflow transitions	✅	⚠️ Soft	⚠️ Soft	⚠️ Soft	❌
CI / Git hook enforcement	✅	❌	❌	❌	❌
Anti-hallucination guarantees	✅	❌	❌	❌	❌

⚠️ Partial = supported conceptually but not enforced or persisted

🚫 The problem

AI says:

"Done"

Reality:

npm test
# FAIL

✅ With RunProof

runproof verify --require-execution-evidence

Result:

BLOCKED: command failed (exit_code=1)

The system forces real execution before accepting completion.
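The gating idea can be sketched in a few lines of Python. This is a minimal illustration, not RunProof's implementation: the function name `gate_completion` and the message format are assumptions, modeled on the `BLOCKED` output shown above.

```python
import subprocess
import sys


def gate_completion(command: list[str]) -> tuple[bool, str]:
    """Run the command and accept completion only if it exits with code 0."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # The command could not be started at all, so there is no evidence.
        return False, f"BLOCKED: command did not run ({exc})"
    if proc.returncode != 0:
        return False, f"BLOCKED: command failed (exit_code={proc.returncode})"
    return True, "OK: command executed (exit_code=0)"


# Stand-ins for a failing and a passing test command.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
passing = [sys.executable, "-c", "print('all-tests-pass')"]

print(gate_completion(failing)[1])  # BLOCKED: command failed (exit_code=1)
print(gate_completion(passing)[1])  # OK: command executed (exit_code=0)
```

The point of the sketch is that the decision depends on an observed exit code, never on what the agent reports.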

Why It Exists

AI coding agents are excellent at producing code. They are much weaker at:

remembering why a decision was made after the chat scrolls away
proving that tests ran, rather than saying they did
keeping specs in sync when behavior changes

RunProof turns "the agent said it passed" into structured, checksummed evidence stored in your repository. No lock-in, including to any operating system. The workflow state is explicit, versioned, and repository-native in .runproof/state.json.
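To make "structured, checksummed evidence" concrete, here is a rough sketch of what one such record could look like. The field names, the file naming, and the helpers `record_evidence` and `evidence_is_intact` are all hypothetical; the actual RunProof evidence format is not described here and may differ.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def record_evidence(command: list[str], evidence_dir: Path) -> dict:
    """Execute a command and persist its output with a SHA-256 checksum."""
    proc = subprocess.run(command, capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    record = {
        "command": command,
        "exit_code": proc.returncode,
        "output": output,
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    evidence_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: first 12 hex characters of the checksum.
    path = evidence_dir / f"{record['output_sha256'][:12]}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return record


def evidence_is_intact(record: dict) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    digest = hashlib.sha256(record["output"].encode("utf-8")).hexdigest()
    return digest == record["output_sha256"]


with tempfile.TemporaryDirectory() as tmp:
    rec = record_evidence(
        [sys.executable, "-c", "print('all-tests-pass')"], Path(tmp) / "evidence"
    )
    print(rec["exit_code"], evidence_is_intact(rec))  # 0 True
    rec["output"] = "edited by hand"
    print(evidence_is_intact(rec))  # False
```

Because the checksum is derived from the captured output, editing the log after the fact is detectable.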

Try It Now (30 seconds, no setup)

Runs the full Golden Path in a temp directory and cleans up after itself:

npx -y runproof-cli@latest demo

What you will see:

── Step 1/7: runproof init
   ✓ Initialized .runproof/ (adapters, agents, profiles, schemas, skills, specs)
── Step 2/7: runproof new demo-harden-login --profile quick
   ✓ Created .runproof/changes/demo-harden-login/ (proposal.md, tasks.md, verification.md)
   ✓ Phase automatically recorded → propose
── Step 3/7: Agent fills proposal.md → status: ready
── Step 4/7: Agent closes tasks.md → status: ready
── Step 5/7: runproof transition demo-harden-login task
   ✓ Phase recorded in .runproof/state.json → task
── Step 6/7: runproof verify --command 'echo all-tests-pass'
   ✓ Command executed; output checksummed → .runproof/evidence/
   ✓ verification.md updated automatically → status: verified
── Step 7/7: runproof transition archive  &&  runproof archive
   ✓ Change closed → .runproof/archive/2026-05-05-demo-harden-login/
── runproof validate
   ✓ Repository governance passed — zero errors

Install

uv tool install runproof-cli

Or with pipx:

pipx install runproof-cli

Or via the Node wrapper:

npm install -g runproof-cli

Or one-shot:

npx -y runproof-cli@latest version

Or from source (requires Python 3.11+):

uv tool install runproof-cli --from git+https://github.com/sebamar88/RunProof.git

User Guide

Detailed bilingual user guides live in Notion:

RunProof User Guide

The guide includes:

Quick Start for Engineers
Team Workflow Guide
Production Rollout Guide

Each guide is split into EN and ES subpages.

Golden Path: Guard A Real Change

1) Initialize your repository

runproof init --root my-app
runproof validate --root my-app

2) Open a governed change

runproof run harden-login-rate-limit --profile standard --title "Harden login rate limits" --root my-app

Result: .runproof/changes/harden-login-rate-limit/ with six artifacts — proposal.md, delta-spec.md, design.md, tasks.md, verification.md, archive.md.

run creates the change if needed, reads the artifact state, names the current phase, and tells the agent the next allowed action.
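One way to picture how a current phase can be derived from artifact state is the sketch below. It is an assumption-heavy simplification: the mapping from artifacts to phases, the status values treated as complete, and the function `current_phase` are illustrative only and are not taken from RunProof's source.

```python
# Hypothetical mapping from artifact file to workflow phase, in order.
ARTIFACT_PHASES = [
    ("proposal.md", "propose"),
    ("delta-spec.md", "specify"),
    ("design.md", "design"),
    ("tasks.md", "task"),
    ("verification.md", "verify"),
    ("archive.md", "archive-record"),
]

# Assumed status values that count as "this artifact is complete".
DONE_STATUSES = {"ready", "verified"}


def current_phase(statuses: dict[str, str]) -> tuple[str, str]:
    """Return (phase, next_action) for the first incomplete artifact."""
    for artifact, phase in ARTIFACT_PHASES:
        if statuses.get(artifact, "missing") not in DONE_STATUSES:
            return phase, f"Complete {artifact} and mark it done."
    return "archive", "All artifacts are complete; the change can be archived."


statuses = {"proposal.md": "ready", "delta-spec.md": "ready", "design.md": "draft"}
print(current_phase(statuses))
# ('design', 'Complete design.md and mark it done.')
```

The phase is computed from what is on disk rather than remembered from the conversation, which is what lets an agent resume after a handoff.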

3) Give the agent a repository contract

Instead of "fix login security", point the agent at the change folder:

Run `runproof run harden-login-rate-limit --root .` before each handoff.
Follow the phase it reports.
After each completed artifact phase, record it with `runproof transition`.
Do not archive until `runproof run` reports `sync-specs` or `archive`.

The agent now has a repository contract, not just a chat instruction.

4) Block fake completion

runproof check harden-login-rate-limit --root my-app

Open tasks or missing evidence will surface here. The change cannot close cleanly.
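As an illustration of the "open tasks" half of that check, the sketch below scans a tasks file for unchecked Markdown checkboxes. It assumes tasks.md uses `- [ ]` and `- [x]` items; the helper `open_tasks` is hypothetical and says nothing about how `runproof check` is actually implemented.

```python
import re

# Matches an unchecked Markdown task item such as "- [ ] Add lockout test".
OPEN_TASK = re.compile(r"^\s*[-*]\s+\[ \]\s+(.*)$")


def open_tasks(tasks_markdown: str) -> list[str]:
    """Return the text of every unchecked checkbox in the document."""
    found = []
    for line in tasks_markdown.splitlines():
        match = OPEN_TASK.match(line)
        if match:
            found.append(match.group(1).strip())
    return found


sample = """\
# Tasks
- [x] Add rate limiter middleware
- [ ] Add lockout test
- [ ] Update delta spec
"""

remaining = open_tasks(sample)
print(remaining)  # ['Add lockout test', 'Update delta spec']
```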

5) Record state, verify, sync, and archive

runproof transition harden-login-rate-limit specify --root my-app
runproof transition harden-login-rate-limit design --root my-app
runproof transition harden-login-rate-limit task --root my-app
runproof verify harden-login-rate-limit --command "pytest -q" --root my-app
runproof transition harden-login-rate-limit archive-record --root my-app
runproof transition harden-login-rate-limit sync-specs --root my-app
runproof sync-specs harden-login-rate-limit --root my-app
runproof archive harden-login-rate-limit --root my-app
runproof validate --root my-app

Outcome: the code change, specs, executed verification evidence, state transitions, checksums, and archive record all stay in the repo.

Orchestrator API (Python)

The CLI is not the only binding layer. The Python core exposes a strict workflow object for tools, adapters, and IDE integrations:

from runproof import SDDWorkflow, WorkflowPhase

workflow = SDDWorkflow("my-app")

result = workflow.run(
    "harden-login-rate-limit",
    profile="standard",
    title="Harden login rate limits",
)

if result.state.phase == WorkflowPhase.PROPOSE:
    print(result.state.next_action)

transitioned = workflow.transition("harden-login-rate-limit", WorkflowPhase.SPECIFY)
verified = workflow.verify(
    "harden-login-rate-limit",
    commands=["pytest -q"],
    require_command=True,
)

transition(), verify(), sync_specs(), and archive() refuse invalid phase order. verify executes commands and stores reproducible logs under .runproof/evidence/ with SHA-256 checksums. That is the difference between SDD helpers and SDD enforcement.
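The "refuse invalid phase order" behavior can be pictured as a small state machine. The sketch below is a stand-in under stated assumptions: `Phase`, `ChangeState`, and `InvalidTransition` are hypothetical names, and the phase list is a simplified version of the phases mentioned in this README, not RunProof's `WorkflowPhase`.

```python
from enum import Enum


class Phase(Enum):
    # Simplified, assumed phase order for illustration only.
    PROPOSE = "propose"
    SPECIFY = "specify"
    DESIGN = "design"
    TASK = "task"
    VERIFY = "verify"
    ARCHIVE = "archive"


class InvalidTransition(Exception):
    """Raised when a transition would skip or repeat a phase."""


class ChangeState:
    def __init__(self) -> None:
        self.phase = Phase.PROPOSE

    def transition(self, target: Phase) -> Phase:
        """Advance exactly one phase; anything else is refused."""
        order = list(Phase)
        index = order.index(self.phase)
        expected = order[index + 1] if index + 1 < len(order) else None
        if target is not expected:
            wanted = expected.value if expected else "no further phase"
            raise InvalidTransition(
                f"cannot move {self.phase.value} -> {target.value}; expected {wanted}"
            )
        self.phase = target
        return self.phase


state = ChangeState()
state.transition(Phase.SPECIFY)
try:
    state.transition(Phase.ARCHIVE)  # skips design, task, and verify
except InvalidTransition as exc:
    print(f"BLOCKED: {exc}")
# BLOCKED: cannot move specify -> archive; expected design
```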

WorkflowEngine — Agent Execution Loop

WorkflowEngine.next_step() gives agent integrations everything they need in a single call — no N+1 lookups:

from runproof import WorkflowEngine

engine = WorkflowEngine("my-repo")
step = engine.next_step("harden-login-rate-limit")

# EngineStep(
#   phase=WorkflowPhase.TASK,
#   next_action="Complete tasks.md, close all task checkboxes, and set status to ready.",
#   suggested_command="runproof transition harden-login-rate-limit task",
#   allowed_commands=[],
#   blocking_findings=[],
# )

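# Agent-driven loop. agent_do_work and run_command are not defined here;
# they are placeholders for functions your integration supplies.
while not step.is_complete and not step.is_blocked:
    agent_do_work(step.next_action)
    run_command(step.suggested_command)
    step = engine.next_step("harden-login-rate-limit")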

engine.guard(change_id, "archive") checks the phase gate only. engine.execute(change_id, "archive") checks the gate and executes. engine.allowed_commands(change_id) returns the gated commands that would pass right now.
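The split between checking a gate and executing behind it can be sketched with a toy class. Everything here is hypothetical: `ToyEngine`, its gate table, and the use of `PermissionError` are assumptions chosen for illustration and do not describe the real `WorkflowEngine`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToyEngine:
    phases: dict[str, str]                    # change_id -> current phase
    gates: dict[str, str]                     # command -> phase it requires
    actions: dict[str, Callable[[str], str]]  # command -> side effect

    def guard(self, change_id: str, command: str) -> list[str]:
        """Check the phase gate only; return findings, run nothing."""
        required = self.gates[command]
        current = self.phases[change_id]
        if current != required:
            return [f"{command} requires phase '{required}', current phase is '{current}'"]
        return []

    def execute(self, change_id: str, command: str) -> str:
        """Check the gate, then run the action only if the gate passes."""
        findings = self.guard(change_id, command)
        if findings:
            raise PermissionError("; ".join(findings))
        return self.actions[command](change_id)

    def allowed_commands(self, change_id: str) -> list[str]:
        """List the gated commands whose gate would pass right now."""
        return [name for name in self.gates if not self.guard(change_id, name)]


engine = ToyEngine(
    phases={"harden-login-rate-limit": "task"},
    gates={"verify": "task", "archive": "archive"},
    actions={
        "verify": lambda change_id: f"verified {change_id}",
        "archive": lambda change_id: f"archived {change_id}",
    },
)
print(engine.guard("harden-login-rate-limit", "archive"))
print(engine.allowed_commands("harden-login-rate-limit"))  # ['verify']
print(engine.execute("harden-login-rate-limit", "verify"))
```

Keeping `guard` free of side effects means an integration can ask "would this pass?" before committing to an action.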

Hard Enforcement

RunProof can enforce governance at git/CI boundaries:

runproof guard --root my-app --require-active-change --strict-state
runproof guard --root my-app --require-execution-evidence
runproof install-hooks --root my-app

guard fails when:

the repository foundation is invalid
a workflow is blocked
an archived delta was not synced into living specs
the policy requires an active .runproof/changes/* record and none exists
strict state finds stale artifact checksums
execution evidence is required but missing
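The "stale artifact checksums" condition in the list above can be illustrated as follows. This is a sketch under assumptions: the recorded-checksum mapping and the helper `stale_artifacts` are hypothetical and do not reflect the actual layout of .runproof/state.json.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def stale_artifacts(change_dir: Path, recorded: dict[str, str]) -> list[str]:
    """Return artifacts that are missing or no longer match their recorded checksum."""
    stale = []
    for name, digest in sorted(recorded.items()):
        path = change_dir / name
        if not path.is_file() or sha256_of(path) != digest:
            stale.append(name)
    return stale


with tempfile.TemporaryDirectory() as tmp:
    change_dir = Path(tmp)
    (change_dir / "proposal.md").write_text("status: ready\n", encoding="utf-8")
    (change_dir / "tasks.md").write_text("- [x] add limiter\n", encoding="utf-8")
    # Record checksums at "transition time".
    recorded = {name: sha256_of(change_dir / name) for name in ("proposal.md", "tasks.md")}
    # Someone edits tasks.md afterwards without recording a new transition.
    (change_dir / "tasks.md").write_text("- [ ] add limiter\n", encoding="utf-8")
    stale = stale_artifacts(change_dir, recorded)
    print(stale)  # ['tasks.md']
```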

install-hooks writes a pre-commit hook that runs:

runproof guard --require-active-change --strict-state

That makes ungoverned commits fail locally. CI can run the same guard command server-side.
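For readers unfamiliar with git hooks, the sketch below shows roughly what installing such a hook involves: writing an executable script to .git/hooks/pre-commit that runs the guard command quoted above. The helper `write_pre_commit_hook` and the exact hook body are assumptions; the file that `runproof install-hooks` actually writes may differ.

```python
import stat
import tempfile
from pathlib import Path

# Assumed hook body: a POSIX shell script that delegates to the guard command.
HOOK_BODY = (
    "#!/bin/sh\n"
    "# Block the commit unless the governance guard passes.\n"
    "exec runproof guard --require-active-change --strict-state\n"
)


def write_pre_commit_hook(repo_root: Path) -> Path:
    """Write an executable pre-commit hook under repo_root/.git/hooks."""
    hooks_dir = repo_root / ".git" / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)
    hook = hooks_dir / "pre-commit"
    hook.write_text(HOOK_BODY, encoding="utf-8")
    # Git only runs hooks that are marked executable.
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook


with tempfile.TemporaryDirectory() as tmp:
    hook_path = write_pre_commit_hook(Path(tmp))
    print(hook_path.relative_to(tmp).as_posix())  # .git/hooks/pre-commit
```

Git aborts the commit when the pre-commit hook exits with a non-zero status, so a failing guard stops the commit locally.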

Reference

Command Guide

runproof version
runproof demo
runproof init --root <path>
runproof validate --root <path>
runproof status --root <path>
runproof new <change-id> --profile <profile> --title "Human intent" --root <path>
runproof run <change-id> --profile <profile> --title "Human intent" --root <path>
runproof transition <change-id> <phase> --root <path>
runproof verify <change-id> --command "pytest -q" --root <path>
runproof guard --require-active-change --strict-state --root <path>
runproof install-hooks --root <path>
runproof check <change-id> --root <path>
runproof sync-specs <change-id> --root <path>
runproof archive <change-id> --root <path>
runproof phase <change-id> --root <path>
runproof log <change-id> --root <path>

Add --trace to any command for component-level diagnostic output:

runproof --trace transition my-change specify --root .
# [TRACE] REGISTRY     → transition my-change → specify
# [TRACE] REGISTRY     → require_phase my-change expected=propose
# [TRACE] INFERENCE    → workflow_state my-change

Current Status

Current release: v0.28.1

Production-ready:

contract-tested modular architecture
execution-verified workflows with checksummed evidence
strict guard enforcement for CI and git hooks
--trace mode for component-level debugging
Golden Path demo and CLI tooling

Influences And Attribution

RunProof is original work, informed by MIT-licensed workflow ideas from other projects. Attribution and compatibility notes are in NOTICE.md.

License

RunProof is released under the MIT License.

Project details

This version

0.28.1

May 8, 2026

Download files

Source Distribution

runproof_cli-0.28.1.tar.gz (115.6 kB)

Uploaded May 8, 2026 Source

Built Distribution

runproof_cli-0.28.1-py3-none-any.whl (114.8 kB)

Uploaded May 8, 2026 Python 3

File details

Details for the file runproof_cli-0.28.1.tar.gz.

File metadata

Download URL: runproof_cli-0.28.1.tar.gz
Upload date: May 8, 2026
Size: 115.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for runproof_cli-0.28.1.tar.gz
Algorithm	Hash digest
SHA256	`ad1307d804a774c83c37472678efc28237a296f3bde4fb0c0a83bff111ec8094`
MD5	`2f2694bd46fc175821816ee2f57dea85`
BLAKE2b-256	`6faf63a7ef35a2c227ba3889dc2e78a4e0b2092a1c01c489f3d44c471f520488`

Provenance

The following attestation bundles were made for runproof_cli-0.28.1.tar.gz:

Publisher: pypi-publish.yml on sebamar88/RunProof

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: runproof_cli-0.28.1.tar.gz
- Subject digest: ad1307d804a774c83c37472678efc28237a296f3bde4fb0c0a83bff111ec8094
- Sigstore transparency entry: 1477548287
- Sigstore integration time: May 8, 2026
Source repository:
- Permalink: sebamar88/RunProof@1f57d3d91844fc89bbe586d76c68ee3e43124fbe
- Branch / Tag: refs/tags/v0.28.1
- Owner: https://github.com/sebamar88
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@1f57d3d91844fc89bbe586d76c68ee3e43124fbe
- Trigger Event: release

File details

Details for the file runproof_cli-0.28.1-py3-none-any.whl.

File metadata

Download URL: runproof_cli-0.28.1-py3-none-any.whl
Upload date: May 8, 2026
Size: 114.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for runproof_cli-0.28.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e8e45ae81796f8d6211bc428b8261a850796eef0870828826a544ca8e517fd16`
MD5	`ca65e583e5a57696c6e4f87c5d6c2415`
BLAKE2b-256	`700dd950f0c319b8d4f260195a99517ad6077d1cd879ea12bc764b92d2bd4b14`

Provenance

The following attestation bundles were made for runproof_cli-0.28.1-py3-none-any.whl:

Publisher: pypi-publish.yml on sebamar88/RunProof

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: runproof_cli-0.28.1-py3-none-any.whl
- Subject digest: e8e45ae81796f8d6211bc428b8261a850796eef0870828826a544ca8e517fd16
- Sigstore transparency entry: 1477548457
- Sigstore integration time: May 8, 2026
Source repository:
- Permalink: sebamar88/RunProof@1f57d3d91844fc89bbe586d76c68ee3e43124fbe
- Branch / Tag: refs/tags/v0.28.1
- Owner: https://github.com/sebamar88
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@1f57d3d91844fc89bbe586d76c68ee3e43124fbe
- Trigger Event: release
