Executable design intent for AI-coded repos: the design document becomes the charter.

charter — the design document becomes the charter

Your design doc says:

Auth tokens are HMAC, never JWT.

$ charter annotate SPEC.md      # LLM turns prose into enforceable decisions
$ cat CHARTER.md
[D-001] Auth tokens are HMAC, never JWT -> assert: ! grep -rqE "jwt|jsonwebtoken" src
$ charter approve --why "initial review"

Weeks later, an AI agent adds JWT code.

$ charter check
  FAIL D-001 "Auth tokens are HMAC, never JWT"  assert FAILED

The agent fixes it. Nobody was interrupted, nothing was forgotten, and charter trace D-001 shows every file that implements the decision.
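As a rough illustration of what "assert FAILED" means, an assert enforcer can be thought of as a shell command whose exit status decides the verdict. The sketch below is not Charter's code; `run_assert` and `check` are hypothetical helper names, and it assumes a POSIX shell is available.

```python
import subprocess

def run_assert(command: str, cwd: str = ".") -> bool:
    """Run one assert enforcer; exit status 0 means the decision holds."""
    result = subprocess.run(
        command,
        shell=True,                  # asserts are shell snippets, so a shell interprets them
        cwd=cwd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def check(decisions: dict, cwd: str = ".") -> list:
    """Return the symbols whose assert command failed (hypothetical helper)."""
    return [symbol for symbol, command in decisions.items()
            if not run_assert(command, cwd)]
```

With `{"D-001": '! grep -rqE "jwt|jsonwebtoken" src'}`, the leading `!` inverts grep's status, so a match in `src` makes the assert fail. Note that the same inversion also turns a grep error (for example a mistyped path) into a pass, which is the failure mode the `!!` tripwire described later is meant to catch.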

That's the whole idea: executable design intent for AI-coded repos — ADRs + linter + traceability, in one file with almost no state.

Status: a proof of concept. Charter is one file exploring a single idea — that an architectural decision should be executable, not just documented. It's deliberately small: no daemon, no service, no config sprawl. If the idea resonates, the interesting work is hardening annotation quality and the enforcer ladder — issues and PRs welcome.

The doctrine in one sentence: a decision with no jurisdiction is not governed. Every decision names an enforcer; a supervise-only decision with neither code citations nor an @ watch scope fails check. CHARTER.md is the constitution, check is the court, audit is the judge for gray areas, and [D-xxx] citations are the map.

Try it in 30 seconds: sh demo/run_demo.sh (on Windows, run it from Git Bash) — an agent adds Supabase to a local-first app and check catches it. No API key needed.

Quick start

See it work first — the demo is offline and needs no API key (on Windows, run it from Git Bash):

git clone https://github.com/cspergel/Charter
cd Charter
sh demo/run_demo.sh        # an agent adds Supabase to a local-first app; check catches it

Then use it on your own repo. There are two ways to run it:

# Option A — zero install. It's one file, zero dependencies (Python 3.10+).
python /path/to/charter.py check

# Option B — install the `charter` command.
pip install charter-intent   # PyPI package name (plain `charter` was taken); the installed command is still `charter`
# — or from a local checkout: pip install .
charter check

annotate (turning a prose doc into decisions) needs an LLM backend; point it at Claude Code with no API key required:

export CHARTER_LLM_CMD="claude -p"   # or set ANTHROPIC_API_KEY
charter annotate SPEC.md
charter approve --why "initial review"
charter check
charter doctor                       # checks your setup is sound

State on disk: CHARTER.md (yours), .charter/ledger.jsonl (append-only journal), and .charter/charter.sha (approval hash — commit it). Permission to execute asserts is recorded in a per-user trust store outside the repo (~/.charter/trust, keyed by repo path), so nothing a repo ships can grant itself execution.
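The trust rule can be sketched as follows. The on-disk format of `~/.charter/trust` is not described here, so everything below is an assumption for illustration: the record layout, the helper names (`approve`, `may_execute_asserts`), and the choice of SHA-256 are hypothetical. What it tries to show is only the shape of the idea: the record lives outside the repo, is keyed by the repo's path, and stops matching when CHARTER.md changes.

```python
import hashlib
import os
from pathlib import Path

def charter_digest(repo: Path) -> str:
    """Hash the current CHARTER.md (assumed SHA-256 for this sketch)."""
    return hashlib.sha256((repo / "CHARTER.md").read_bytes()).hexdigest()

def trust_record(repo: Path, trust_dir: Path) -> Path:
    """One record per repo, keyed by the repo's absolute path (assumed layout)."""
    key = hashlib.sha256(str(repo.resolve()).encode("utf-8")).hexdigest()
    return trust_dir / key

def approve(repo: Path, trust_dir: Path) -> None:
    """Record that the user reviewed this exact CHARTER.md."""
    trust_dir.mkdir(parents=True, exist_ok=True)
    trust_record(repo, trust_dir).write_text(charter_digest(repo), encoding="utf-8")

def may_execute_asserts(repo: Path, trust_dir: Path, env=None) -> bool:
    """Asserts run only after local approval, or with the explicit CI opt-in."""
    env = os.environ if env is None else env
    if env.get("CHARTER_TRUST_ASSERTS") == "1":
        return True
    record = trust_record(repo, trust_dir)
    return record.is_file() and record.read_text(encoding="utf-8").strip() == charter_digest(repo)
```

Because `trust_dir` sits outside the working tree, nothing committed to the repo can create or alter the record, which is the property the paragraph above relies on.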

Security, in one paragraph

charter check executes shell commands defined in CHARTER.md (the assert enforcers). A freshly cloned repo will not execute its asserts — even if it ships an approval hash, and even if it ships a forged trust marker — until you review CHARTER.md and run charter approve yourself, because the trust record lives in a per-user store outside the repo. CI opts in explicitly with CHARTER_TRUST_ASSERTS=1. annotate/audit send doc and file contents to your configured LLM backend; nothing else makes network calls. Details and threat model: SECURITY.md.

The lifecycle

charter annotate SPEC.md     # LLM reads your prose doc, extracts binding
                             #   decisions, assigns [D-xxx] symbols, proposes
                             #   the lowest viable enforcer per decision,
                             #   writes CHARTER.md + SPEC.annotated.md
                             #   (charter init creates an empty CHARTER.md
                             #   if you'd rather write it by hand)
<review CHARTER.md once>     # the only mandatory human moment: adjust
                             #   enforcers, strengthen supervise items
charter approve --why "..."  # the human gate: journaled, hash-stamped
charter check                # deterministic, free — pre-commit + CI
charter verify               # prove each enforcer is actually live, not theater
charter verify --adversarial # an LLM saboteur tries to bypass each enforcer
charter audit                # judged pass over supervise-tier (PR-time)
charter log [D-001] [--verify]  # the accountability record (tamper-evident)
charter digest [--mark]      # batch-review everything the system did
charter trace D-001          # everything that traces to a decision
charter graph [--json]       # the derived graph (Mermaid / machine-readable)
charter explain D-001        # the full story of one decision
charter doctor               # setup-health checks
charter install-hook         # pre-commit hook + Claude Code settings block

The line syntax

[D-001] title -> assert: <must-pass> !! <proof-must-succeed> @ glob, glob
[D-002] title -> supervise @ src/db/**
  • The enforcer kind is one of the ladder below.
  • !! introduces an assert's tripwire: a probe that must succeed, proving the detector can detect a known violation sample (a typo'd grep path can't pass forever). The canonical pattern is echo <violation-sample> | <the-real-detector>.
  • @ declares watch scope: a human-set jurisdiction floor. Audit reads cited files ∪ watched files, so an uncited violating file is still seen.
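To make the line syntax concrete, here is a minimal parsing sketch. It is not Charter's parser: the `Decision` dataclass and `parse_decision` are hypothetical names, and the sketch assumes the watch scope is whatever follows the last ` @ ` on the line and that `!!` appears at most once.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Decision:
    symbol: str                     # e.g. "D-001"
    title: str
    kind: str                       # enforcer kind: assert, supervise, test, ...
    command: str = ""               # the must-pass part of an assert
    proof: str = ""                 # the !! tripwire probe, if any
    watch: List[str] = field(default_factory=list)  # @ watch globs

HEAD = re.compile(r"^\[(D-\d+)\]\s+(.+?)\s+->\s+(.+)$")

def parse_decision(line: str) -> Optional[Decision]:
    match = HEAD.match(line.strip())
    if match is None:
        return None                 # not a decision line
    symbol, title, rest = match.groups()
    # Watch scope: everything after the last " @ ", split on commas.
    head, sep, scope = rest.rpartition(" @ ")
    if not sep:
        head, scope = rest, ""
    kind, _, body = head.partition(":")
    command, _, proof = body.partition("!!")
    watch = [glob.strip() for glob in scope.split(",") if glob.strip()]
    return Decision(symbol, title, kind.strip(), command.strip(), proof.strip(), watch)
```

For example, `parse_decision("[D-002] title -> supervise @ src/db/**")` yields kind `supervise`, an empty command, and a single watch glob `src/db/**`.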

The five layers

  1. Annotate — the bootstrap agent. Extraction is conservative (contracts, not preferences — "keep the code simple" is correctly ignored), capped at 15 by default (--cap N), dedupes against decisions already indexed, and annotation of your original doc is non-destructive (writes a .annotated copy with symbols inlined at the source sentences).
  2. Enforce — the ladder, strongest first: structure > type > test > lint > assert > supervise. During review, push supervise items toward the stronger deterministic rungs. check fails on: aspirational decisions (no enforcer), missing enforcer targets, enforcer rot (the #Symbol vanished in a refactor), failing asserts, and blind decisions (supervise-only with no citations and no watch scope). A proposed enforcer that doesn't exist yet is a build obligation — check stays red until the builder creates the type/test, which is governance generating the skeleton of the system. check --budget N warns when judgment-only decisions outgrow the budget (default 5).
  3. Trace — builders leave [D-xxx] citations in comments and commits. The graph is derived from grep on every run, never stored — so it can never go stale.
  4. Supervise: audit judges only the supervise tier, and citations define the scope. The auditor reads exactly the files that claim to implement the decision, plus watched files. Verdicts: COMPLIES (ok line, exit 0), VIOLATES (exit 1; fix the code), AMBIGUOUS (flagged for digest review). All verdicts land in the ledger. No backend configured → everything is AMBIGUOUS; it never crashes.
  5. Steer — one optional SessionStart hook injects the whole index (~15 one-liners, a few hundred tokens, once per session) plus the citation protocol. The PreToolUse hook (hook --file) goes further: it blocks an edit before it lands if the proposed content would trip an assert, returning the decision as the reason so the agent self-corrects mid-task — governance inside the loop, not just post-hoc in CI.
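The Trace layer's "derived, never stored" graph, and the blind-decision rule from the Enforce layer, can be sketched in a few lines. This is an illustration under assumptions, not Charter's implementation: `derive_graph` and `blind_decisions` are hypothetical names, and the sketch scans files in Python rather than shelling out to grep.

```python
import re
from collections import defaultdict
from pathlib import Path

CITATION = re.compile(r"\[(D-\d+)\]")

def derive_graph(root: Path, skip=("CHARTER.md",)) -> dict:
    """Map each cited symbol to the files citing it, recomputed on every call."""
    graph = defaultdict(set)
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or path.name in skip or ".git" in rel.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue                # skip binaries and unreadable files
        for symbol in CITATION.findall(text):
            graph[symbol].add(rel.as_posix())
    return {symbol: sorted(files) for symbol, files in sorted(graph.items())}

def blind_decisions(supervise: dict, graph: dict) -> list:
    """Supervise-only decisions with no citations and no watch globs."""
    return [symbol for symbol, watch in supervise.items()
            if not watch and not graph.get(symbol)]
```

Here `supervise` maps each supervise-only symbol to its list of watch globs. A symbol that appears in neither the derived graph nor with a watch scope has no jurisdiction, which is the condition described above as failing check.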

LLM backends

Resolution order for annotate and audit:

  1. CHARTER_LLM_CMD — any command that reads the prompt on stdin and prints the reply on stdout. Point it at Claude Code headless: export CHARTER_LLM_CMD="claude -p" — rides your existing plan.
  2. ANTHROPIC_API_KEY — direct API (Sonnet for annotation quality, Haiku for cheap audit verdicts; override with CHARTER_ANNOTATE_MODEL / CHARTER_AUDIT_MODEL).
  3. Neither — annotate explains itself; audit degrades to AMBIGUOUS.
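The resolution order above can be written down as a small function. This is a sketch only: `resolve_backend` and `ask` are hypothetical names, and the return values are invented labels, not anything Charter prints. The one contract taken from the text is that a `CHARTER_LLM_CMD` command reads the prompt on stdin and writes the reply to stdout.

```python
import os
import subprocess
from typing import Optional, Tuple

def resolve_backend(env=None) -> Tuple[str, Optional[str]]:
    """Pick a backend in the documented order: command, then API key, then none."""
    env = os.environ if env is None else env
    if env.get("CHARTER_LLM_CMD"):
        return ("command", env["CHARTER_LLM_CMD"])
    if env.get("ANTHROPIC_API_KEY"):
        return ("api", None)        # the key itself is deliberately not returned
    return ("none", None)

def ask(command: str, prompt: str) -> str:
    """Send the prompt on stdin and return whatever the command prints on stdout."""
    result = subprocess.run(command, shell=True, input=prompt,
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Any program that honors the stdin/stdout contract can stand in for the model, which also makes it easy to test the pipeline with a stub command.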

Integration

.git/hooks/pre-commit (or just run charter install-hook):

python charter.py check || exit 1

CI on PRs (the judged layer) — note the explicit opt-in, and treat PRs that modify CHARTER.md like PRs that modify your CI workflows:

CHARTER_TRUST_ASSERTS=1 python charter.py check && python charter.py audit

.claude/settings.json (optional steering):

{
  "hooks": {
    "SessionStart": [
      { "hooks": [ { "type": "command", "command": "python charter.py hook" } ] }
    ]
  }
}

For non-hook agents (Cursor, aider), put this in AGENTS.md/.cursorrules:

This repo's binding decisions live in CHARTER.md. When your work implements or touches a decision, leave its [D-xxx] symbol in a nearby comment and your commit message. Run python charter.py check before finishing; a failure means an enforcer caught a violation — fix the code, never the enforcer. Conflicts between a request and a decision must be surfaced, not silently resolved.

Liberties taken, and why

  • Citations replace scope globs. The derived graph defines what each decision governs. This deleted the lockfile, ack protocol, session baselines, and per-edit hooks of earlier designs — the single largest overhead reduction — at the cost of relying on builders to cite. The steering hook + agent instructions make citing the path of least resistance, and check fails supervise decisions that end up blind.
  • Stateless judgment. audit judges current state, not drift-since — so there is nothing to pin, ack, or reconcile. Run it whenever; the ledger is the only memory, and it's append-only and reviewable.
  • The graph navigates; enforcers govern. graph --json exists for agents to ask "what connects to what," but no verdict ever comes from graph topology — authority lives in things that can't be argued with.
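One common way to make an append-only journal tamper-evident is a hash chain, where each entry commits to the one before it. Whether `.charter/ledger.jsonl` works this way is not stated here, so treat the following purely as an assumed illustration of the technique; the field names `prev`, `event`, and `hash` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64

def _digest(prev: str, event: dict) -> str:
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_entry(ledger: list, event: dict) -> None:
    """Append one JSON line that commits to the previous entry's hash."""
    prev = json.loads(ledger[-1])["hash"] if ledger else GENESIS
    entry = {"prev": prev, "event": event, "hash": _digest(prev, event)}
    ledger.append(json.dumps(entry, sort_keys=True))

def verify_chain(ledger: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered line breaks the chain."""
    prev = GENESIS
    for line in ledger:
        entry = json.loads(line)
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

In this sketch the ledger is a list of JSON strings, one per line of a `.jsonl` file; rewriting an old verdict changes its hash and invalidates every later entry.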

Proof-carrying governance (verify)

A check that's never been exercised is a check you can't trust — a typo'd grep path passes forever and you never notice. charter verify proves each deterministic decision is actually enforceable against your code right now.

charter verify --adversarial goes further: an LLM red-team agent tries to slip a real violation past each enforcer (hiding it in a path the grep doesn't scan, a synonym the pattern misses, or a different file type) on a sandboxed copy that's always restored. Anything it gets through is reported as a bypass, with the exact evasion. Run against this project's own sibling tool, it bypassed 3 of 5 enforcers (e.g. a "no network calls" rule that only grepped requests|httpx|urllib, so http.client walked right past). It's governance that attacks itself, so "is this rule real or just theater?" has an answer.
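The two ideas in this section, a tripwire that proves a detector still fires and a bypass that shows where it does not, can be mirrored in plain Python. Charter's real asserts are shell commands; this sketch swaps in `re.search` as the detector so it stays self-contained, and `tripwire_holds` and `find_bypasses` are hypothetical names.

```python
import re

def violates(pattern: str, text: str) -> bool:
    """The detector: True when the pattern finds a violation in the text."""
    return re.search(pattern, text) is not None

def tripwire_holds(pattern: str, violation_sample: str) -> bool:
    """A tripwire passes only if the detector catches a known violation sample."""
    return violates(pattern, violation_sample)

def find_bypasses(pattern: str, candidates: list) -> list:
    """Return the candidate violations that the detector fails to flag."""
    return [text for text in candidates if not violates(pattern, text)]

NO_NETWORK = r"requests|httpx|urllib"

# The detector is live: it catches the known sample.
assert tripwire_holds(NO_NETWORK, "import requests")
# A mistyped pattern would otherwise pass forever; the tripwire exposes it.
assert not tripwire_holds(r"reqeusts", "import requests")
# The bypass from the text: http.client is not covered by the pattern.
assert find_bypasses(NO_NETWORK, ["import httpx", "import http.client"]) == ["import http.client"]
```

The tripwire only shows that the detector catches the sample its author thought of; the adversarial pass is what looks for violations the author did not think of.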

Where it fits

The neighbors solve adjacent problems; none does Charter's loop:

| | Governs | Who writes the rules | Enforced every commit? |
|---|---|---|---|
| Spec-driven dev (Spec Kit, OpenSpec, Kiro) | code generation, up front | you write the spec | no; out of the loop once code exists |
| Arch fitness functions (ArchUnit, dependency-cruiser) | the codebase | you, by hand, in code | yes (per language) |
| ADR tools (adr-tools, Log4brains) | a written record | you write the ADR | no; nothing checks the code |
| Charter | the living repo | an LLM drafts, you approve | yes, deterministically |

The gap Charter fills: it turns a prose decision into an enforced check, keeps the decision→code map current via citations, and does it without you hand-writing the rule, across any language, since the deterministic layer is just shell. Spec tools are upstream of the code; fitness functions need hand-written rules per stack; ADR tools document but never enforce. Use Charter with a spec tool if you like: scaffold with one, keep the repo true with the other.

FAQ

Isn't this just a pre-commit hook that runs grep? At the deterministic layer, yes — and that's the point. A grep that can't be argued with beats an LLM that can be talked out of a verdict. Charter's value isn't a cleverer check; it's turning prose decisions into checks at all, keeping them in sync with the code via citations, and proving the checks aren't vacuous (tripwires). The grep is a feature, not an embarrassment.

Why not Spec Kit / OpenSpec / Kiro? Those govern code generation — write a spec, then generate from it. They're out of the loop the moment a later change quietly contradicts the original design, which is where drift actually accrues. Charter governs the repo from then on, on every commit. They compose: scaffold with whatever you like, keep it true with Charter.

An LLM wrote my enforcement rules — why would I trust that? You don't trust it — you review it. annotate only proposes; nothing takes effect until you read CHARTER.md and approve it, exactly like reading code you're about to run. At enforcement time there's no LLM in the loop: check is deterministic shell. The model proposes, you ratify, grep decides.

What stops an agent from editing CHARTER.md, or weakening an enforcer, to make check pass? Any change to CHARTER.md fails check until a human runs approve (a hashed, journaled gate) — so an agent can't quietly rewrite the constitution. The agent instructions say fix the code, never the enforcer, and a weakened assert trips its tripwire (the proof that it can still catch a known violation). Tampering is visible, not silent.

Won't grep-based asserts be brittle and false-positive? Some will — which is why the ladder exists. Push fragile checks up to type, test, or lint, where the language and your test runner do the work; reserve assert for things that genuinely are a grep. Tripwires flag asserts that have quietly stopped detecting anything, so a brittle check fails loudly rather than passing forever.

Does it lock me into Claude? No. Any backend that reads a prompt on stdin works (CHARTER_LLM_CMD), the Anthropic API works, and you can skip the LLM entirely and write CHARTER.md by hand — check never calls a model.

Known limits

  • Sweet spot: a repo with a real design/architecture doc. Baseline-tested across Flask, httpx, click, prettier, rust-analyzer, the GitHub CLI, okhttp, and Deno. It does best when the doc states binding "never/always/default" decisions; it degrades on (a) repos with no design doc — pointed at a how-to guide it tends to extract file-existence trivia, (b) very large monorepos, and (c) deep language-specific symbol/dependency layouts (Go package symbols, Kotlin multi-root/Gradle version catalogs). Open issues track these. The deterministic assert rung is the most language-agnostic; lean on it.
  • check executes shell from CHARTER.md. The trust gate means a cloned repo can't run code on your machine before you review it, but after you approve, the asserts are exactly as trustworthy as your review of them. Read them like code you are about to run — because they are.
  • A builder that never cites makes citation-only supervise decisions blind — check fails them rather than letting them silently un-govern. Decisions with @ watch globs are still audited via watched files. Deterministic rungs are immune — prefer them.
  • Annotation quality is bounded by the doc: vague prose yields supervise proposals. The review-once step is where you strengthen enforcers.
  • assert commands are POSIX-shell; on Windows they run under Git Bash (auto-detected, CHARTER_SHELL overrides).
  • A vacuous assert's !! proof is authored by the same source as the assert, so the tripwire raises the bar but can't fully self-police — check flags trivially-true probes (!! true, bare echo), but a cleverly matched fake proof still needs the human review-once gate.
  • audit judges at most the first 60 in-scope files per decision (chunked, worst-verdict-wins) to bound LLM cost; broad @ src/** scopes over huge trees are reported as truncated. Narrow watch globs audit completely.
  • audit sends in-scope file contents to the model, and a determined prompt-injection in governed code can still influence a verdict — deterministic rungs carry the real authority; supervise+audit is the soft, advisory tier.
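The file cap and the "chunked, worst-verdict-wins" rule in the audit limit above can be sketched as follows. The chunk size, the severity ordering (VIOLATES worse than AMBIGUOUS worse than COMPLIES), and the AMBIGUOUS result for an empty scope are assumptions made for this sketch; `audit_decision` and `judge` are hypothetical names, with `judge` standing in for the LLM call.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed ordering: a higher number is a worse verdict.
SEVERITY = {"COMPLIES": 0, "AMBIGUOUS": 1, "VIOLATES": 2}

def audit_decision(
    files: Iterable[str],
    judge: Callable[[List[str]], str],
    max_files: int = 60,      # the documented per-decision cap
    chunk_size: int = 20,     # hypothetical; the real chunking is not specified here
) -> Tuple[str, bool]:
    """Judge files in chunks and keep the worst verdict; report truncation."""
    in_scope = list(files)
    truncated = len(in_scope) > max_files
    in_scope = in_scope[:max_files]
    verdict = "COMPLIES" if in_scope else "AMBIGUOUS"
    for start in range(0, len(in_scope), chunk_size):
        chunk_verdict = judge(in_scope[start:start + chunk_size])
        if SEVERITY[chunk_verdict] > SEVERITY[verdict]:
            verdict = chunk_verdict
    return verdict, truncated
```

The truncation flag matters: a violating file beyond the cap is never shown to the judge, so a truncated COMPLIES is weaker evidence than a complete one, which is why narrow watch globs are the safer choice.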

History

See CHANGELOG.md. Charter was previously named governor; v0.4.0 renamed it and added the local trust gate.
