promise

Track the promises a codebase makes and back each one with runnable evidence.

The core idea: a promise ("a failed update doesn't corrupt the user's list", "checkout is fast", "no insecure dependencies") is only as real as the mechanical check behind it. So if an agent or a human deletes the test backing a promise, the promise visibly goes red — and verify tells you exactly which higher-level promises that breaks.

It's language-agnostic (verifiers are just shell commands) and designed to be driven by coding agents that want to know what they might break before and after a change.

Install

uv sync
uv run promise --help

Model

Everything lives in a single, human-readable promises.yaml, but the CLI is the safe write path — it owns id generation and reference integrity.

A node is one promise:

| field | meaning |
| --- | --- |
| id | stable UUID (CLI-generated); reference it by id-prefix or title substring |
| title | the promise, in prose |
| parent | id of the parent promise, or null for a root |
| affects | free-form tokens: file paths or concepts (e.g. "SOC2 compliance") |
| verifiers | zero or more shell checks backing the promise |

A node with no verifiers is an internal node; its verdict is purely the rollup of its children. Rollup is a pure AND: a node passes only if all its own verifiers pass and all its children pass.
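The rollup rule above can be sketched in a few lines of Python. This is an illustration only: the `Node` class and the `verifier_results` field are hypothetical names, not promise's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory node; not promise's real data model."""
    title: str
    # One bool per verifier: True means the command exited 0.
    verifier_results: list[bool] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def rollup(node: Node) -> bool:
    """Pure AND: every own verifier and every child must pass."""
    # all([]) is True, so a verifier-less node depends only on its children.
    own_ok = all(node.verifier_results)
    children_ok = all(rollup(child) for child in node.children)
    return own_ok and children_ok


leaf_ok = Node("checkout is fast", verifier_results=[True])
leaf_bad = Node("no insecure dependencies", verifier_results=[True, False])
root = Node("users can trust the app", children=[leaf_ok, leaf_bad])

print(rollup(leaf_ok))  # True
print(rollup(root))     # False: one child fails, so the internal root fails
```

Note that in this sketch a node with neither verifiers nor children passes vacuously; `promise check` lints unbacked nodes, which is where that situation would surface.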

A verifier:

| field | meaning |
| --- | --- |
| run | shell command; exit 0 = pass, anything else = fail |
| scope | repo · artifact · deploy · runtime · external (see below) |
| blocking | whether a failure affects the exit code (defaults per scope) |
| cwd | working dir, relative to the repo root (defaults to the repo root) |
| timeout | seconds before the command is killed and marked failed (default 300) |
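A minimal sketch of how a single verifier could be executed under these rules, using only the standard library. The function name `run_verifier` is hypothetical, and merging stdout and stderr into one detail string is an assumption; how promise itself captures detail is not specified here.

```python
import subprocess


def run_verifier(run: str, cwd: str = ".", timeout: float = 300.0) -> tuple[bool, str]:
    """Return (passed, detail). Exit 0 passes; any other exit or a timeout fails."""
    try:
        proc = subprocess.run(
            run,
            shell=True,           # verifiers are shell commands
            cwd=cwd,              # assumed to be resolved against the repo root already
            timeout=timeout,      # the child is killed when this expires
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout} seconds"
    # Assumption: the detail is simply stdout followed by stderr.
    detail = (proc.stdout + proc.stderr).strip()
    return proc.returncode == 0, detail


passed, _ = run_verifier("exit 0")
print(passed)  # True
passed, _ = run_verifier("exit 3")
print(passed)  # False
```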

scope is metadata: it never changes a verdict, it only seeds the default blocking bit. repo/artifact default to blocking; deploy/runtime default to non-blocking; external must set blocking explicitly.
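The defaulting rule can be written down as a small lookup. This is a sketch of the rule as described, not the tool's code; `resolve_blocking` and `DEFAULT_BLOCKING` are hypothetical names.

```python
from typing import Optional

# Defaults as described above; "external" is deliberately absent.
DEFAULT_BLOCKING = {
    "repo": True,
    "artifact": True,
    "deploy": False,
    "runtime": False,
}


def resolve_blocking(scope: str, blocking: Optional[bool] = None) -> bool:
    """Return the effective blocking bit for a verifier."""
    if blocking is not None:
        return blocking  # an explicit setting always wins
    if scope == "external":
        raise ValueError("external verifiers must set blocking explicitly")
    if scope not in DEFAULT_BLOCKING:
        raise ValueError(f"unknown scope: {scope!r}")
    return DEFAULT_BLOCKING[scope]


print(resolve_blocking("repo"))             # True
print(resolve_blocking("runtime"))          # False
print(resolve_blocking("external", False))  # False
```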

Verdicts

Binary: pass / fail. The why of a failure (a deleted test, a missing binary, an assertion) lives in the captured detail, not in extra verdict states. Good signal out requires good verifiers in — point each verifier narrowly at the evidence it asserts.

Commands

promise add "<title>" [--parent REF] [--verify "<cmd>" --scope SCOPE [--blocking/--no-blocking]] [--affects TOKEN ...]
promise rm REF [--cascade]
promise mv REF --parent REF|root          # guarded against cycles
promise edit REF [--title ...] [--affects TOKEN ...]
promise check                             # lint: dangling parents, cycles, unbacked nodes
promise verify [REF] [--json] [--full] [--junit PATH] [--fail-on blocking|any] [--workers N]
promise impact <file-or-concept> [--json]

REF is an id, a unique id-prefix, or a unique title substring.
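One way REF resolution could work is sketched below. The lookup order (exact id, then id-prefix, then title substring), the case-sensitive matching, and the short ids in the example are all assumptions made for illustration; `resolve_ref` is a hypothetical name.

```python
def resolve_ref(ref: str, titles_by_id: dict[str, str]) -> str:
    """Resolve ref to a node id, or raise ValueError if missing or ambiguous."""
    if ref in titles_by_id:
        return ref  # exact id
    lookups = (
        ("id-prefix", [i for i in titles_by_id if i.startswith(ref)]),
        ("title substring", [i for i, t in titles_by_id.items() if ref in t]),
    )
    for label, candidates in lookups:
        if len(candidates) == 1:
            return candidates[0]
        if len(candidates) > 1:
            raise ValueError(f"ambiguous {label} {ref!r}: {sorted(candidates)}")
    raise ValueError(f"no promise matches {ref!r}")


# Made-up ids for illustration; real ids are CLI-generated UUIDs.
nodes = {
    "aaaa1111": "checkout is fast",
    "aaaa2222": "no insecure dependencies",
    "bbbb3333": "a failed update doesn't corrupt the user's list",
}

print(resolve_ref("bbbb", nodes))      # bbbb3333
print(resolve_ref("checkout", nodes))  # aaaa1111
```

Here `resolve_ref("aaaa", nodes)` raises, because the prefix matches two ids and so is not unique.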

Verifiers run concurrently (they are subprocess-bound); --workers N caps the pool and --workers 1 forces serial. Rollup is deterministic regardless of completion order.
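The combination of a capped pool and order-independent results can be illustrated with a thread pool, which suits work that mostly waits on subprocesses. This is a sketch under that assumption, not a description of promise's internals; `run_one` and `run_all` are hypothetical names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd: str) -> bool:
    """Run one shell command; True means it exited 0."""
    return subprocess.run(cmd, shell=True, capture_output=True).returncode == 0


def run_all(cmds: list[str], workers: int = 4) -> list[bool]:
    """Run commands concurrently; workers=1 is effectively serial."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order,
        # so whatever consumes them sees the same sequence on every run.
        return list(pool.map(run_one, cmds))


commands = ["exit 0", "exit 1", "exit 0"]
print(run_all(commands, workers=2))  # [True, False, True]
print(run_all(commands, workers=1))  # [True, False, True]
```

Because results are keyed to input position, an AND rollup over them cannot depend on which command happened to finish first.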

verify output formats

  • plain text (human): a one-line status plus the root-cause failures.
  • --json (default for agents): just verdict, exit, and the root-cause failed list — each with the breaks it causes (rendered as titles) and a detail.
  • --json --full: the complete promise.verify/v1 tree, every node with its verifiers, failing_leaves (down) and breaks (up).
  • --junit PATH: JUnit XML so promises show up in CI test dashboards. Each promise with verifiers is a <testcase>; a red one is a <failure> carrying the detail and the breaks chain. Internal (verifier-less) nodes are omitted.
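To make the JUnit mapping concrete, here is a rough sketch that turns a list of results into JUnit-style XML with the standard library. The input dictionary shape, the attribute choices, and the way the breaks chain is rendered are assumptions; the file promise actually writes may differ.

```python
import xml.etree.ElementTree as ET


def to_junit(results: list[dict]) -> str:
    """Build a <testsuite> with one <testcase> per promise that has verifiers."""
    backed = [r for r in results if r["has_verifiers"]]  # internal nodes are omitted
    suite = ET.Element(
        "testsuite",
        name="promises",
        tests=str(len(backed)),
        failures=str(sum(1 for r in backed if not r["passed"])),
    )
    for r in backed:
        case = ET.SubElement(suite, "testcase", name=r["title"])
        if not r["passed"]:
            failure = ET.SubElement(case, "failure", message=r["detail"])
            failure.text = "breaks: " + "; ".join(r["breaks"])
    return ET.tostring(suite, encoding="unicode")


sample = [
    {"title": "checkout is fast", "has_verifiers": True, "passed": True,
     "detail": "", "breaks": []},
    {"title": "A failed update doesn't corrupt the list", "has_verifiers": True,
     "passed": False, "detail": "file or directory not found",
     "breaks": ["A user's data is never silently lost or corrupted"]},
    {"title": "Users can manage their todo list and trust it",
     "has_verifiers": False, "passed": False, "detail": "", "breaks": []},
]

print(to_junit(sample))
```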

Compact example, after a backing test was deleted:

{
  "verdict": "fail",
  "exit": 1,
  "failed": [
    {
      "id": "4c9f1a77-...",
      "title": "A failed update doesn't corrupt the list",
      "breaks": ["A user's data is never silently lost or corrupted",
                 "Users can manage their todo list and trust it"],
      "detail": "ERROR: file or directory not found: tests/test_storage_rollback.py"
    }
  ]
}

breaks is the headline for agents: it answers "what higher-level promise did I just make unfulfillable?" without walking the tree.
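Conceptually, breaks is the chain of ancestors above a failed node. A minimal sketch, assuming an acyclic tree stored as parent pointers and a nearest-ancestor-first ordering (which matches the example above but is otherwise an assumption); the ids here are made up.

```python
from typing import Optional


def breaks(node_id: str, parent: dict[str, Optional[str]], title: dict[str, str]) -> list[str]:
    """Return the titles of every ancestor of node_id, nearest first."""
    chain = []
    current = parent[node_id]
    while current is not None:  # assumes no cycles in the parent pointers
        chain.append(title[current])
        current = parent[current]
    return chain


# Hypothetical ids; titles taken from the example above.
parent = {"leaf": "mid", "mid": "root", "root": None}
title = {
    "leaf": "A failed update doesn't corrupt the list",
    "mid": "A user's data is never silently lost or corrupted",
    "root": "Users can manage their todo list and trust it",
}

for line in breaks("leaf", parent, title):
    print(line)
```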

Exit codes

verify reports a fact: something failed or nothing did. Whether that fact gates is policy: by default only blocking failures produce exit 1; --fail-on any widens the gate to every failure.
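The gating policy can be expressed as a small function. This is a sketch of the rule as described, with a hypothetical signature: the input is the blocking bit of each failed verifier.

```python
def exit_code(failed_blocking_bits: list[bool], fail_on: str = "blocking") -> int:
    """Map failures to an exit code under the chosen --fail-on policy."""
    if fail_on == "any":
        return 1 if failed_blocking_bits else 0       # every failure gates
    if fail_on == "blocking":
        return 1 if any(failed_blocking_bits) else 0  # only blocking failures gate
    raise ValueError(f"unknown policy: {fail_on!r}")


print(exit_code([]))                      # 0
print(exit_code([False]))                 # 0: a non-blocking failure does not gate
print(exit_code([False], fail_on="any"))  # 1
print(exit_code([False, True]))           # 1
```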

impact

impact is a fast, static pre-flight: it matches a file or concept against every node's affects (and verifier strings). Path-like tokens are normalised to repo-relative POSIX form, so impact storage.py finds a promise whose affects lists todo-app/storage.py.
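A sketch of how such matching might behave for affects tokens. The trailing-path-component rule below is an assumption chosen so that `storage.py` matches `todo-app/storage.py`; the tool's actual matching rule is not spelled out here, and matching against verifier strings is left out. All function names are hypothetical.

```python
from pathlib import PurePosixPath


def parts(token: str) -> tuple[str, ...]:
    """Normalise a token to POSIX form and split it into path components."""
    return PurePosixPath(token.replace("\\", "/")).parts


def matches(query: str, token: str) -> bool:
    """True if the query's components are a trailing run of the token's."""
    q, t = parts(query), parts(token)
    return bool(q) and len(q) <= len(t) and t[len(t) - len(q):] == q


def impact(query: str, affects_by_title: dict[str, list[str]]) -> list[str]:
    """Return the titles of promises whose affects tokens match the query."""
    return [
        title
        for title, tokens in affects_by_title.items()
        if any(matches(query, token) for token in tokens)
    ]


promises = {
    "A failed update doesn't corrupt the list": ["todo-app/storage.py"],
    "no insecure dependencies": ["SOC2 compliance"],
}

print(impact("storage.py", promises))       # ["A failed update doesn't corrupt the list"]
print(impact("SOC2 compliance", promises))  # ['no insecure dependencies']
```

Matching on whole components rather than raw substrings keeps a query like `age.py` from matching `storage.py`.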

Limitation: impact only matches the literal tokens authors record — it does not trace the implementation files a test depends on (there is no coverage analysis). For that, rely on post-hoc verify, or run coverage yourself.

Claude Code skill

skills/promise/SKILL.md is an installable Claude Code skill that teaches an agent to use promise effectively — chiefly to run impact before touching a file and verify before deleting a test. Copy the skills/promise/ directory into a skills directory Claude Code reads (e.g. ~/.claude/skills/ or a project's .claude/skills/).

Self-hosted promises

This repo keeps its own promises.yaml: every promise is backed by promise-cli's own test suite and lint. Run promise verify here to check the tool still keeps its own promises — and watch a promise go red if you delete the test behind it.

Development

uv run pytest          # tests
uv run ruff check .    # lint
uv run ruff format .   # format

This project was built as an experiment under agent-experiments.
