Spec-driven development framework for AI coding agents

scafld

scafld builds long-running AI coding work under adversarial review, so your agent stays coherent across the whole job.

Canonical repo: https://github.com/nilstate/scafld. Default branch: main.

Most AI coding tools optimize for speed inside one turn. scafld optimizes for correctness across the whole run:

the work starts from a reviewed spec
execution stays phase-bounded and measurable
completion is gated by an independent challenger at review

The differentiator is simple: the agent does not get to grade its own homework.

Identity

scafld is a scaffold in the literal sense: temporary structure that shapes the build while the work is in progress.

The lifecycle stays familiar:

draft -> harden -> approve -> build -> review -> complete
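
As a rough illustration, the lifecycle above can be modeled as an ordered set of stages. This is a minimal sketch, not scafld's implementation: the `Stage` enum and `next_stage` helper are hypothetical names, and treating the lifecycle as strictly linear is an assumption (for example, plan can reopen harden on an existing draft).

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Hypothetical model of the lifecycle stages, in the documented order."""
    DRAFT = "draft"
    HARDEN = "harden"
    APPROVE = "approve"
    BUILD = "build"
    REVIEW = "review"
    COMPLETE = "complete"


# Enum members iterate in definition order, which matches the lifecycle line.
ORDER = list(Stage)


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None once the work is complete."""
    index = ORDER.index(stage)
    return ORDER[index + 1] if index + 1 < len(ORDER) else None
```

Calling `next_stage(Stage.BUILD)` yields `Stage.REVIEW`, and the final stage has no successor.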

What changed is the core model underneath it.

Primitives

Two nouns carry the system:

spec: what must be true. The reviewed contract and lifecycle source of truth.
session: what happened. The durable run ledger under .ai/runs/{task-id}/session.json.

handoff is transport, not a primitive. It is the structured brief for the next voice.

Every generated handoff is:

immutable
sibling *.md + *.json
tagged by role × gate

Current runtime pairings:

executor × phase
executor × recovery
challenger × review
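
The handoff properties above (immutable, sibling files, role × gate tagging) can be sketched as a small value type. Everything here is illustrative: the `Handoff` class, its `selector` field, and the name-joining rule are assumptions inferred from the file names shown under Runtime Layout, not scafld's actual code.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, Tuple

# The three role x gate pairings listed as current runtime pairings.
ALLOWED_PAIRINGS = {
    ("executor", "phase"),
    ("executor", "recovery"),
    ("challenger", "review"),
}


@dataclass(frozen=True)  # frozen models "immutable"; a modeling assumption only
class Handoff:
    role: str
    gate: str
    selector: Optional[str] = None  # hypothetical field, e.g. "phase1"

    def __post_init__(self) -> None:
        if (self.role, self.gate) not in ALLOWED_PAIRINGS:
            raise ValueError(f"unsupported pairing: {self.role} x {self.gate}")

    def stem(self) -> str:
        """Join role, gate, and optional selector with hyphens (assumed rule)."""
        parts = [self.role, self.gate]
        if self.selector:
            parts.append(self.selector)
        return "-".join(parts)

    def sibling_paths(self, run_dir: Path) -> Tuple[Path, Path]:
        """Return the sibling .md and .json paths under run_dir/handoffs."""
        base = run_dir / "handoffs"
        return base / f"{self.stem()}.md", base / f"{self.stem()}.json"
```

Under these assumptions, `Handoff("executor", "phase", "phase1")` maps to `executor-phase-phase1.md` and `executor-phase-phase1.json`, and an unlisted pairing such as challenger × phase is rejected at construction time.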

Review Gate

Challenge fires at one gate only in v1: review.

That keeps the system sharp without turning every phase into ceremony.

review emits a challenger handoff
the challenger writes a verdict into .ai/reviews/{task-id}.md
complete closes only if the review gate passes, or a human applies the audited override path after a completed challenger round

The override path is explicit:

scafld complete <task-id> --human-reviewed --reason "manual audit"

That override is available only after a completed challenger review round exists.
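
The completion rule described in this section can be written down as a small predicate. This is a sketch of the stated rule only: `ReviewState` and `may_complete` are hypothetical names, and requiring a non-empty reason is an assumption based on the `--reason` flag in the example command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewState:
    """Hypothetical summary of the review gate for one task."""
    challenger_round_completed: bool
    verdict_passed: bool


def may_complete(review: ReviewState, human_reviewed: bool = False, reason: str = "") -> bool:
    """Allow completion if the review gate passed, or via an audited human override."""
    # Neither path is open until a challenger round has actually finished.
    if not review.challenger_round_completed:
        return False
    if review.verdict_passed:
        return True
    # Override path: assumed to need both the flag and a non-empty reason.
    return human_reviewed and bool(reason.strip())
```

The ordering of the checks encodes the key constraint: a human override cannot substitute for a challenger round that never ran.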

Runtime Layout

.ai/
  specs/{drafts,approved,active,archive}/
  runs/
    {task-id}/
      handoffs/
        executor-phase-phase1.md
        executor-phase-phase1.json
        executor-recovery-ac1_1-1.md
        executor-recovery-ac1_1-1.json
        challenger-review.md
        challenger-review.json
      diagnostics/
        ac1_1-attempt1.txt
      session.json
    archive/{YYYY-MM}/{task-id}/

Hard rules:

spec never carries runtime state
handoff is never read back to compute state
session is the only durable run-state source
recovery is a handoff gate plus counters in session, not a subsystem
telemetry is a view of session, not a separate artifact
v1 makes zero spec schema changes

Agent Surface

Default help teaches the slim workflow surface, including repo seeding:

scafld init
scafld plan <task-id>
scafld approve <task-id>
scafld build <task-id>
scafld review <task-id>
scafld complete <task-id>
scafld status <task-id>
scafld list
scafld report
scafld handoff <task-id>
scafld update

Use scafld --help --advanced to show the remaining operator tools such as harden, validate, branch, sync, audit, diff, summary, checks, and pr-body.

Wrapper intent:

plan: create a draft spec or reopen harden on an existing draft
build: start approved work and immediately drive validation to the next handoff or block
review: run the adversarial review gate and emit the challenger handoff
status: expose the canonical next_action and current_handoff

When the workspace includes them, the wrapper scripts are optional handoff adapters for Codex and Claude Code. The default challenger path is now scafld review itself.

scripts/scafld-codex-build.sh <task-id>
scripts/scafld-codex-review.sh <task-id>
scripts/scafld-claude-build.sh <task-id>
scripts/scafld-claude-review.sh <task-id>

Success Metrics

scafld claims quality lift only where it can measure it.

The canonical metrics are:

first_attempt_pass_rate
recovery_convergence_rate
challenge_override_rate

report surfaces all three per task and in aggregate.

Use scafld report --runtime-only to focus on tasks with runtime session data.

report also surfaces review-signal counts such as completed challenger rounds, grounded findings, and clean reviews that still record what was attacked.

There is also one honest boundary: scafld can emit a better handoff, but an external harness may still ignore it. That is why the metrics are framed as session outcomes, not proof of prompt consumption.
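
To make the three metric names concrete, here is one plausible way such rates could be aggregated from per-task outcomes. The formulas are assumptions for illustration: this page does not define the metrics or the session.json schema, and `TaskOutcome`, `rate`, and `aggregate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TaskOutcome:
    """Hypothetical per-task summary derived from a session ledger."""
    criteria_total: int
    criteria_passed_first_attempt: int
    criteria_recovered: int  # failed on the first attempt, passed after recovery
    completed: bool
    human_override: bool


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a ratio, or None when there is nothing to measure."""
    return numerator / denominator if denominator else None


def aggregate(outcomes: List[TaskOutcome]) -> Dict[str, Optional[float]]:
    """Compute the three rates across tasks under the assumed definitions."""
    total = sum(o.criteria_total for o in outcomes)
    first = sum(o.criteria_passed_first_attempt for o in outcomes)
    recovered = sum(o.criteria_recovered for o in outcomes)
    completed = [o for o in outcomes if o.completed]
    overrides = sum(1 for o in completed if o.human_override)
    return {
        "first_attempt_pass_rate": rate(first, total),
        # Of the criteria that failed first, how many converged after recovery.
        "recovery_convergence_rate": rate(recovered, total - first),
        # Of the completed tasks, how many needed the human override path.
        "challenge_override_rate": rate(overrides, len(completed)),
    }
```

Returning None instead of 0.0 for an empty denominator keeps "no data" distinct from "measured and zero", which matters when a report claims lift only where it can measure it.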

Install

pip install scafld
npm install -g scafld
git clone https://github.com/nilstate/scafld.git ~/.scafld && ~/.scafld/install.sh
curl -fsSL https://raw.githubusercontent.com/nilstate/scafld/main/install.sh | sh

pip install scafld installs the console entry point plus the managed runtime bundle used by scafld init and scafld update.

npm install -g scafld installs the same CLI package for environments that ship tooling through npm. The CLI still requires python3 at runtime. Commands that edit YAML specs, such as scafld harden, also require PyYAML in that Python runtime:

python3 -m pip install PyYAML
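
Before running YAML-editing commands, it can help to confirm that PyYAML is importable in the same python3 runtime. The check below is a generic standard-library sketch, not part of scafld; `has_module` is a hypothetical helper name. PyYAML is imported under the module name `yaml`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("yaml"):
        print("PyYAML is available")
    else:
        print("PyYAML is missing; run: python3 -m pip install PyYAML")
```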

Setup

cd your-project
scafld init

This creates the managed runtime bundle, prompts, schemas, run directories, and project-owned overlays:

your-project/
  .ai/
    scafld/                # Managed reset copy refreshed by `scafld update`
    config.yaml            # Project config overlay
    config.local.yaml      # Local machine overrides
    prompts/               # Active project-owned template sources
    runs/                  # Generated handoffs, diagnostics, session state
    reviews/               # Adversarial review artifacts
    specs/                 # Specs by lifecycle state
  AGENTS.md
  CLAUDE.md
  CONVENTIONS.md

Start by customizing:

AGENTS.md
CLAUDE.md
CONVENTIONS.md
.ai/config.local.yaml

When the workspace includes them, the handoff-first wrappers are:

scripts/scafld-codex-build.sh <task-id>
scripts/scafld-codex-review.sh <task-id>
scripts/scafld-claude-build.sh <task-id>
scripts/scafld-claude-review.sh <task-id>

Repo-aware planning also works for mixed Python+Node repos. When both stacks are present, scafld merges the signals and prefers concrete detected commands over placeholder defaults.

Prompt ownership is deliberate:

.ai/prompts/* is the active template layer the runtime reads first
.ai/scafld/prompts/* is the managed reset copy refreshed by scafld update
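
The two prompt layers suggest a simple lookup order. The sketch below assumes the runtime falls back to the managed copy when the active layer has no matching file; that fallback, the `resolve_prompt` name, and any template file name passed to it are assumptions rather than documented behavior.

```python
from pathlib import Path
from typing import Optional


def resolve_prompt(project_root: Path, name: str) -> Optional[Path]:
    """Return the first existing template: active layer first, then managed copy."""
    candidates = [
        project_root / ".ai" / "prompts" / name,             # project-owned, read first
        project_root / ".ai" / "scafld" / "prompts" / name,  # managed reset copy
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Keeping the project-owned layer first means local edits survive a `scafld update`, which only refreshes the managed copy.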

Minimal Runtime Config

llm:
  model_profile: "default"
  context:
    budget_tokens: 12000
  recovery:
    max_attempts: 1

Anything beyond that waits until it earns its place through measured wins.
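
Since Setup lists both .ai/config.yaml and .ai/config.local.yaml, one way to picture the overlay is a recursive merge in which local values win. This is a sketch under that assumption; the actual merge semantics are not described here, `deep_merge` is a hypothetical helper, and the local override value is made up for illustration.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where overlay values replace or extend base values."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested mappings instead of replacing them wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values mirror the minimal runtime config above.
project_config = {
    "llm": {
        "model_profile": "default",
        "context": {"budget_tokens": 12000},
        "recovery": {"max_attempts": 1},
    }
}
# Hypothetical local override for one machine.
local_config = {"llm": {"recovery": {"max_attempts": 2}}}

effective = deep_merge(project_config, local_config)
```

Here `effective` keeps `budget_tokens` at 12000 while `max_attempts` becomes 2, and `project_config` itself is left unchanged.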

Next Docs

docs/execution.md
docs/integrations.md
docs/review.md
docs/run-artifacts.md
docs/cli-reference.md
AGENTS.md

Release history

2.4.7	Jun 8, 2026
2.4.6	Jun 1, 2026
2.4.5	May 25, 2026
2.4.4	May 18, 2026
2.4.3	May 17, 2026
2.4.2	May 14, 2026
2.4.1	May 13, 2026
2.4.0	May 12, 2026
2.3.12	May 12, 2026
2.3.11	May 12, 2026
2.3.10	May 12, 2026
2.3.9	May 12, 2026
2.3.8	May 10, 2026
2.3.7	May 9, 2026
2.3.6	May 8, 2026
2.3.5	May 8, 2026
2.3.4	May 8, 2026
2.3.3	May 8, 2026
2.3.2	May 7, 2026
2.3.1	May 7, 2026
2.3.0	May 6, 2026
2.2.4	May 6, 2026
2.2.3	May 5, 2026
2.2.2	May 5, 2026
2.2.1	May 5, 2026
2.2.0	May 4, 2026
2.1.1	May 4, 2026
2.1.0	May 4, 2026
2.0.0	Apr 30, 2026
1.7.2	Apr 29, 2026 (this version)
1.7.0	Apr 28, 2026
1.6.6	Apr 27, 2026
1.6.5	Apr 27, 2026
1.6.0	Apr 24, 2026
1.5.1	Apr 22, 2026
1.5.0	Apr 22, 2026
1.4.6	Apr 21, 2026
1.4.5	Apr 21, 2026
1.4.4	Apr 21, 2026
1.4.3	Apr 21, 2026
1.4.2	Apr 21, 2026
0.0.1	Apr 9, 2026

Download files

Download the file for your platform.

Source Distribution

scafld-1.7.2.tar.gz (200.1 kB)

Uploaded Apr 29, 2026 Source

Built Distribution

scafld-1.7.2-py3-none-any.whl (194.6 kB)

Uploaded Apr 29, 2026 Python 3

File details

Details for the file scafld-1.7.2.tar.gz.

File metadata

Download URL: scafld-1.7.2.tar.gz
Upload date: Apr 29, 2026
Size: 200.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scafld-1.7.2.tar.gz
Algorithm	Hash digest
SHA256	`91119845b6238dd71e0318398d3ff4e8cb899540a83c3ac4346187936f19ae3e`
MD5	`606ce310e44b9d9ed97210fb2567e442`
BLAKE2b-256	`8ac897eef650a1de5c94581142607addef1c46dddaa8a2f52274214fc152f51c`

File details

Details for the file scafld-1.7.2-py3-none-any.whl.

File metadata

Download URL: scafld-1.7.2-py3-none-any.whl
Upload date: Apr 29, 2026
Size: 194.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scafld-1.7.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dfb55d9bc54f9ab20bcb72bbc434a1789e2153a0c90f52089357f41f743356a2`
MD5	`ada1c799f9cd70dc4005a40a25075fa6`
BLAKE2b-256	`e6a5d198f0ab3093fd243f3369f71e40f64bb9a740db2b44d3525e65ee460442`
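
The SHA256 digests listed above can be checked against a downloaded file with the Python standard library. This is a generic sketch; `sha256_of` is a hypothetical helper name, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, comparing `sha256_of(Path("scafld-1.7.2.tar.gz"))` with the SHA256 value listed for that file confirms the download matches what was published.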
