SpecFlow
A platform-agnostic, spec-driven workflow system for autonomous AI-assisted software development.
SpecFlow separates the work that requires human judgment (specs, acceptance criteria, interface contracts) from the work an agent can do autonomously (implementation, validation, documentation). The result: agents that stay focused, don't gold-plate, and stop cleanly when they hit something the spec doesn't cover.
Core Idea
AI agent failures are workflow failures, not model failures. Agents jump to implementation, infer unstated requirements, scope-creep, and lose context between sessions because nothing stops them.
SpecFlow fixes this by encoding workflow discipline in files:
- Human input is concentrated upfront — specs, acceptance criteria, interface contracts
- Agents operate in isolated Build Mode sessions where the spec is frozen
- The test suite decides when a session is done, not the agent
- When the spec is insufficient, agents stop and log a Spec Gap instead of guessing
Two Modes
Design Mode — human-intensive, project-scoped
Phases: Plan → Specify → Scaffold
The goal is a fully specified repo. Every unit needs four things before Build Mode can start:
- Acceptance Criteria — binary pass/fail (Given/When/Then or input→output)
- Interface Contracts — typed public interface, frozen during Build Mode
- Test Infrastructure — framework, test location, and how to run
- Explicit Out-of-Scope — what the unit does NOT do
At each phase boundary, the agent stops and self-assesses before you approve.
Build Mode — agent-autonomous, unit-scoped
Phases: Implement → Validate → Document
The spec is frozen. The agent reads the unit spec, runs the tests, and implements until all acceptance criteria pass. No spec changes, no test changes, no scope creep. Session ends when tests pass.
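The rule that the test suite, not the agent, decides when a session is done can be sketched as a check on the test runner's exit code. This is a minimal illustration, not SpecFlow's own implementation: `session_done` is a hypothetical helper, and the two stand-in commands take the place of whatever test command the unit's Test Infrastructure names.

```python
import subprocess
import sys


def session_done(test_command: list[str]) -> bool:
    """Return True only when the test command exits with status 0."""
    result = subprocess.run(test_command, capture_output=True, text=True)
    return result.returncode == 0


# Stand-ins for a real test runner invocation (assumption for illustration).
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(session_done(passing))
print(session_done(failing))
```

The agent's own judgment never enters the check: a non-zero exit code means the session continues or a Spec Gap is logged.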
Spec Gap — when a Build Mode agent finds the spec insufficient:
- Stop — don't guess
- Log a [SPEC GAP] entry in the unit log
- Set the unit status to gap, mode to design
- Return to Design Mode to resolve it
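The Spec Gap steps can be pictured as two small operations: format a log entry, then flip the status fields. The sketch below works on in-memory dictionaries only. `spec_gap_entry` and `mark_gap` are hypothetical names, the starting values `build` and `spec-complete` are assumptions, and the heading shape is copied from the unit log examples in this README.

```python
from datetime import date

DASH = "\u2014"  # separator used in the unit log headings


def spec_gap_entry(day: date, title: str, detail: str) -> str:
    """Format a [SPEC GAP] entry in the style of the unit log."""
    return f"## {day.isoformat()} {DASH} [SPEC GAP] {title}\n{detail}\n"


def mark_gap(unit: dict, control: dict) -> None:
    """Hypothetical in-memory update: unit status -> gap, mode -> design."""
    unit["status"] = "gap"
    control["mode"] = "design"


# Assumed starting state for illustration only.
control = {"mode": "build"}
unit = {"name": "auth-service", "status": "spec-complete"}
log: list[str] = []

log.append(
    spec_gap_entry(
        date(2026, 4, 10),
        "Refresh token TTL undefined",
        "Spec doesn't specify TTL for refresh tokens.",
    )
)
mark_gap(unit, control)
print(log[0], unit, control)
```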
Quick Start
1. Install
pip install specflow-agent
2. Initialize a project
# From your project directory
specflow init "My Project"
This creates:
.specflow/
├── specflow.md        # Control plane (mode: design, unit registry)
├── todo.md            # Root task list
├── interfaces/        # Cross-unit interface contracts
└── units/
    └── docs/
        ├── spec.md    # Project-level spec
        └── todo.md
.claude/
└── commands/
    ├── sf-start.md    # /sf-start slash command
    └── sf-end.md      # /sf-end slash command
CLAUDE.md              # Auto-generated mode-specific agent instructions
3. Start a session
/sf-start
This reads the control plane, identifies the current mode and active unit, and orients the agent. Run it at the beginning of every Claude Code session.
4. End a session
/sf-end
This updates the registry, appends to the unit log, commits the changes, and closes the session cleanly.
Key Files in a SpecFlow Project
Control plane — .specflow/specflow.md
---
version: 2
project: my-project
mode: design
active_design_unit: auth-service
---
## Unit Registry
units:
  - name: docs
    status: spec-complete
  - name: auth-service
    status: pending
    depends_on: [docs]
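To show how a tool might read the current mode from the control plane, here is a minimal sketch that extracts the front matter between the two `---` lines using only the standard library. `read_front_matter` is a hypothetical helper, not part of the SpecFlow CLI. It handles flat `key: value` pairs only and returns every value as a string; a real parser would more likely use a YAML library.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


CONTROL_PLANE = """\
---
version: 2
project: my-project
mode: design
active_design_unit: auth-service
---
## Unit Registry
"""

meta = read_front_matter(CONTROL_PLANE)
print(meta["mode"], meta["active_design_unit"])
```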
Unit spec — .specflow/units/<name>/spec.md
A unit spec must contain three mandatory sections before the unit can enter Build Mode:
## Acceptance Criteria
- Given valid credentials, login() returns AuthTokens with access and refresh tokens
- Given wrong password, login() returns 401 (not a 500 or panic)
## Interface Contracts (Public Interface)
- login(email: string, password: string): AuthTokens | AuthError
- validateToken(token: string): TokenPayload | AuthError
## Explicit Out-of-Scope
- User registration (belongs to user-service)
- OAuth/social login (future work)
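A gate on the three mandatory sections could be as simple as a heading check. The sketch below assumes the headings appear exactly as in the example above and matches on their prefix, so "Interface Contracts (Public Interface)" still counts. `missing_sections` is a hypothetical helper and does not judge whether the content under each heading is adequate.

```python
REQUIRED_SECTIONS = (
    "## Acceptance Criteria",
    "## Interface Contracts",
    "## Explicit Out-of-Scope",
)


def missing_sections(spec_text: str) -> list[str]:
    """List required headings that no line of the spec starts with."""
    lines = [line.strip() for line in spec_text.splitlines()]
    return [
        heading
        for heading in REQUIRED_SECTIONS
        if not any(line.startswith(heading) for line in lines)
    ]


# An incomplete spec: the out-of-scope section has not been written yet.
spec = """\
## Acceptance Criteria
- Given valid credentials, login() returns AuthTokens with access and refresh tokens
## Interface Contracts (Public Interface)
- login(email: string, password: string): AuthTokens | AuthError
"""

print(missing_sections(spec))
```

An empty result would mean all three headings are present; anything else names what is still missing before Build Mode.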
Unit log — .specflow/units/<name>/log.md
Append-only session log. Four entry types:
## 2026-04-12 — [MILESTONE] Implement phase complete
All acceptance criteria pass.
## 2026-04-11 — [DEAD-END] Async bcrypt abandoned
Jest timer interference. Switched to synchronous bcrypt.
## 2026-04-10 — [SPEC GAP] Refresh token TTL undefined
Spec doesn't specify TTL for refresh tokens. Cannot implement without this.
## 2026-04-10 — [DESIGN NOTE] Spec template missing test framework field
Had to infer test runner from context. The unit spec template should require
a Test Infrastructure section so build agents don't have to guess.
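Because every entry heading follows the same shape (date, separator, bracketed type, title), the log is easy to scan mechanically. The sketch below is one possible reader, assuming the heading format shown in the examples above; the `ENTRY` pattern and the `entries` helper are illustrative names, not part of SpecFlow.

```python
import re

# Heading shape: "## <YYYY-MM-DD> <separator> [<TYPE>] <title>".
# \S+ accepts whatever dash character separates the date from the type.
ENTRY = re.compile(r"^## (\d{4}-\d{2}-\d{2}) \S+ \[([A-Z -]+)\] (.+)$")


def entries(log_text: str) -> list[tuple[str, str, str]]:
    """Return (date, type, title) for every entry heading in the log."""
    return [m.groups() for line in log_text.splitlines() if (m := ENTRY.match(line))]


LOG = (
    "## 2026-04-12 \u2014 [MILESTONE] Implement phase complete\n"
    "All acceptance criteria pass.\n"
    "## 2026-04-11 \u2014 [DEAD-END] Async bcrypt abandoned\n"
    "Jest timer interference. Switched to synchronous bcrypt.\n"
    "## 2026-04-10 \u2014 [SPEC GAP] Refresh token TTL undefined\n"
    "Spec doesn't specify TTL for refresh tokens.\n"
)

open_gaps = [title for _day, kind, title in entries(LOG) if kind == "SPEC GAP"]
print(open_gaps)
```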
CLI
specflow init [PROJECT_NAME] # Bootstrap a new project
specflow compile [--output PATH] # Regenerate CLAUDE.md for current mode
specflow status # Show current mode, active unit, NEXT task
specflow --version # Show version and supported control plane
All commands search upward from the current directory for .specflow/specflow.md.
init always operates on the current directory.
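The upward search for .specflow/specflow.md can be sketched with `pathlib`. This is an assumption about how such a lookup might work, not the CLI's actual code; `find_control_plane` is a hypothetical name.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_control_plane(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root looking for the control plane."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".specflow" / "specflow.md"
        if candidate.is_file():
            return candidate
    return None


# Build a throwaway project and search from a nested directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".specflow").mkdir()
    (root / ".specflow" / "specflow.md").write_text("---\nmode: design\n---\n")
    nested = root / "src" / "auth"
    nested.mkdir(parents=True)

    found = find_control_plane(nested)
    expected = (root / ".specflow" / "specflow.md").resolve()
    print(found == expected)
```

The nearest enclosing project wins, so a command run from any subdirectory resolves to the same control plane.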
The original bash scripts are preserved in reference/scripts/ for reference.
Rules
Rules live in rules/ as individual Markdown files with YAML frontmatter.
specflow compile generates a mode-specific CLAUDE.md that points to the rules
directories rather than inlining them. Agents read rules on demand.
rules/
├── core/ # Always-active rules
├── phase/ # Phase-specific rules (plan, specify, scaffold, validate, ...)
└── optional/ # Opt-in rules (patch-protocol, bug-protocol)
How It Works with Claude Code
Claude Code reads CLAUDE.md automatically at the start of every session. The
mode-specific compiled output tells the agent:
- Design Mode (~60 lines): spec quality requirements, phase gate protocol, control plane authority, pointers to rule directories
- Build Mode (~30 lines): active unit and spec path, five hard rules (including spec frozen, tests frozen, and interfaces frozen), Spec Gap procedure
No plugins or integrations required. It's just files.
Platform-Agnostic
SpecFlow works with any AI coding agent that reads files:
- Claude Code: reads CLAUDE.md automatically; use /sf-start and /sf-end
- GitHub Copilot / Cursor: point it at .specflow/ files as context
- Aider: pass spec files as context with --read
- Any LLM: paste the relevant files into the conversation
The discipline is in the files, not the tool.
Documentation
- docs/VISION.md: strategic vision and research background
- docs/DECISIONS.md: architectural decisions with reasoning (10 confirmed)
- docs/file-format-spec.md: complete file format reference (v2)
- docs/claude-code-tips.md: Claude Code configuration tips
- docs/cli-wrapper-plan.md: CLI design decisions (D1–D8)