Specwright

The architect of agentic workflows.

Specwright defines, validates, and executes Agentic Implementation Plans (AIPs): human-in-the-loop governance for AI-assisted software development.

Specwright defines. Dogfold builds. Gorch orchestrates. LifeOS lives.


What is Specwright?

Specwright is a meta-engineering orchestration layer that ensures AI-driven development is:

  • Traceable: Every decision logged, every gate validated
  • Tiered: Governance scales with risk (Tier A/B/C)
  • Human-friendly: Write specs in Markdown, execute validated YAML
  • Compliant: Aligned with ISO 42001 and NIST AI RMF

You write the plan. Specwright ensures it's rigorous.


🎯 Quick Start

# Install
pip install specwright

# Define a new plan (human-friendly Markdown)
spec new --tier B --title "Add OAuth login" --owner alice --goal "Implement secure authentication"

# Edit the generated Markdown spec
# specs/add-oauth-login.md

# Compile to validated YAML
spec compile specs/add-oauth-login.md

# Validate against schema
spec validate specs/add-oauth-login.compiled.yaml

# Execute with governance
spec run specs/add-oauth-login.compiled.yaml

# Or preview without execution
spec run specs/add-oauth-login.compiled.yaml --plan

🌟 The Ecosystem

Specwright is part of a larger experimental toolchain:

| Tool | Purpose | Status |
|------|---------|--------|
| Specwright | Defines AIPs, enforces governance | Alpha (v0.3.0) |
| Dogfold | Recursive Python scaffolding | Experimental |
| Gorch | Google Cloud orchestration | Future |
| LifeOS | Personal operating system | Future |

Note: All tools in this ecosystem are early-stage and actively evolving. Specwright is functional but should be considered alpha software.


📚 Core Concepts

Why Specwright?

The problem: AI tools generate code fast, but lack governance, traceability, and risk management.

The solution: Specwright introduces tiered governance:

Three Risk Tiers

All work follows the same workflow, but governance rigor scales with risk:

| Tier | Risk Level | Gates | SLA | Coverage | Use Cases |
|------|------------|-------|-----|----------|-----------|
| A | High | 5 formal gates | 24-72h | 90%+ | Security/compliance/architecture changes |
| B | Moderate | 5 standard gates | 8-48h | 85%+ | Feature development, refactoring |
| C | Low | 5 fast-lane gates (4 auto-approved) | 1-24h | 70%+ | Documentation, utilities, minor fixes |

Key principle: The tiers modulate rigor, not sequence.

Same workflow, different rigor. All tiers follow the canonical 5-gate model:

  1. Planning [G0: Plan Approval]
  2. Prompt Engineering [G1: Code Readiness]
  3. Implementation [G3: Deployment Approval]
  4. Testing [G2: Pre-Release]
  5. Governance [G4: Post-Implementation]
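
The tier table and the five-step sequence above can be pictured as plain data. The sketch below is illustrative only: `TierPolicy`, `TIERS`, `STEPS`, and `meets_coverage` are hypothetical names, not part of Specwright's API, and the values are simply copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    """One row of the risk-tier table (hypothetical structure)."""
    risk: str
    sla_hours: tuple[int, int]  # (shortest, longest) turnaround from the SLA column
    min_coverage: int           # minimum test coverage, in percent


# Values copied from the tier table above.
TIERS = {
    "A": TierPolicy("High", (24, 72), 90),
    "B": TierPolicy("Moderate", (8, 48), 85),
    "C": TierPolicy("Low", (1, 24), 70),
}

# The same five steps, in the same order, for every tier.
STEPS = [
    ("Planning", "G0"),
    ("Prompt Engineering", "G1"),
    ("Implementation", "G3"),
    ("Testing", "G2"),
    ("Governance", "G4"),
]


def meets_coverage(tier: str, coverage_percent: float) -> bool:
    """Return True if measured coverage satisfies the tier's target."""
    return coverage_percent >= TIERS[tier].min_coverage


print(meets_coverage("B", 86.5))  # True
print(meets_coverage("A", 86.5))  # False
```

Note that only `TIERS` varies by tier; `STEPS` is shared, which is what "the tiers modulate rigor, not sequence" means in practice.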

Human-Friendly Workflow

Write specs in Markdown, execute validated YAML:

specs/                          # Human-authored specifications
├── my-feature.md              # Write here! (Markdown)
└── my-feature.compiled.yaml   # Generated (don't edit)

aips/                          # Validated AIPs ready for execution
└── AIP-2025-10-31-001.yaml    # Promoted from compiled spec

This separation ensures:

  • Humans collaborate in Markdown (easy to read/write/review)
  • Machines execute YAML (validated, deterministic)
  • Git tracks both (spec shows intent, AIP shows execution)

๐Ÿ› ๏ธ CLI Commands

spec new

Create a new Markdown specification from a tier template.

spec new --tier <A|B|C> --title "Task title" --owner "Your Name" --goal "What we're building"

# Interactive prompts
spec new

# Specify output path
spec new --tier B --title "Add feature" --owner alice --goal "Implement X" --output custom/path.md

Output: Human-editable Markdown spec with:

  • YAML frontmatter (tier, title, owner, goal)
  • Structured sections (Objective, Context, Plan, etc.)
  • Step templates with gates, prompts, commands, outputs
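
To make the "YAML frontmatter plus Markdown body" layout concrete, here is a minimal standard-library sketch that splits the two apart. `split_frontmatter` is a hypothetical helper, not Specwright's parser, and it only understands flat `key: value` pairs; a real implementation would hand the frontmatter block to a YAML parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a spec into (frontmatter, body).

    Assumes the frontmatter is delimited by two '---' lines and
    contains only flat `key: value` pairs.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter at all

    end = None
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            end = index
            break
    if end is None:
        raise ValueError("frontmatter is missing its closing '---'")

    meta = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:])
    return meta, body


spec_text = """---
tier: B
title: Add OAuth login
owner: alice
goal: Implement Google OAuth
---

# Add OAuth login
"""

meta, body = split_frontmatter(spec_text)
print(meta["tier"], "|", meta["title"])  # B | Add OAuth login
```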

spec compile

Compile a Markdown spec to a validated YAML AIP.

spec compile specs/my-feature.md

# Specify output path
spec compile specs/my-feature.md --output custom/output.yaml

# Force overwrite if compiled file exists
spec compile specs/my-feature.md --overwrite

What it does:

  • Parses Markdown using markdown-it-py (robust token-based parsing)
  • Validates frontmatter, sections, plan steps
  • Checks output paths are within repo bounds
  • Generates canonical YAML with source hash
  • Round-trip guard: fails if existing compiled file differs (unless --overwrite)

Output includes:

meta:
  source_md_path: specs/my-feature.md
  source_md_rel: specs/my-feature.md
  source_md_sha256: "abc123..."
  compiler_version: spec-compiler/0.1.0
  compiled_at: null  # intentionally null for determinism
  tier: B
  title: "My Feature"
  # ...

spec validate

Validate an AIP against the JSON Schema, with tier defaults merged in.

spec validate specs/my-feature.compiled.yaml
spec validate aips/AIP-2025-10-31-001.yaml

What it checks:

  • Schema compliance (required fields, types, constraints)
  • Tier-specific requirements (coverage targets, gate structure)
  • Path safety (no escaping repo root)
  • Gate references (G0-G4 only)
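
Two of these checks are easy to picture in isolation. The following is a rough sketch, not Specwright's validator: `is_within_repo` and `is_valid_gate` are hypothetical helpers, and the gate pattern assumes IDs either equal `G0`..`G4` or start with that prefix (as in `G0-plan-approval`).

```python
import re
from pathlib import Path

# Assumption: a gate ID is "G0".."G4", optionally followed by a suffix
# such as "-plan-approval".
GATE_ID = re.compile(r"^G[0-4]\b")


def is_within_repo(repo_root: Path, candidate: str) -> bool:
    """Return True if `candidate` resolves to a location inside `repo_root`."""
    root = repo_root.resolve()
    target = (root / candidate).resolve()  # collapses any ".." segments
    return target == root or root in target.parents


def is_valid_gate(gate_id: str) -> bool:
    """Return True if `gate_id` refers to one of the five canonical gates."""
    return GATE_ID.match(gate_id) is not None


repo = Path("/tmp/specwright-demo")  # hypothetical repository root
print(is_within_repo(repo, "artifacts/plan/wbs.md"))  # True
print(is_within_repo(repo, "../outside.md"))          # False
print(is_valid_gate("G0-plan-approval"))              # True
print(is_valid_gate("G5"))                            # False
```

Resolving both paths before comparing them is the important design choice: a plain string-prefix check would accept `artifacts/../../outside.md`.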

spec run

Execute an AIP in guided checklist mode (v0.1).

# Interactive execution
spec run specs/my-feature.compiled.yaml

# Preview mode (no execution)
spec run specs/my-feature.compiled.yaml --plan

What it does (v0.1):

  • Displays each step with role, prompts, commands, outputs
  • Prompts for manual completion confirmation
  • Shows gate checkpoints
  • (Future: actual agent execution, state persistence, automated gates)
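
As a mental model of guided checklist mode, the loop below walks the steps in order and asks for confirmation after each one. It is a sketch under assumptions: `run_checklist`, the step dictionary keys, and the `confirm` callback are all hypothetical, and the real command's prompts and output will differ.

```python
from typing import Callable, Iterable


def run_checklist(
    steps: Iterable[dict],
    confirm: Callable[[str], bool],
    plan_only: bool = False,
) -> list[str]:
    """Display each step; return the names of the steps confirmed complete.

    With plan_only=True nothing is confirmed (a preview, like --plan).
    Execution stops at the first step the operator does not confirm.
    """
    completed = []
    for number, step in enumerate(steps, start=1):
        print(f"Step {number}: {step['name']} [gate {step['gate']}]")
        for command in step.get("commands", []):
            print(f"  run: {command}")
        if plan_only:
            continue
        if not confirm(f"Mark step {number} as complete?"):
            print("Stopped: step not confirmed.")
            break
        completed.append(step["name"])
    return completed


demo_steps = [
    {"name": "Planning", "gate": "G0"},
    {"name": "Testing", "gate": "G2", "commands": ["pytest --cov=src --cov-report=xml"]},
]

# A stand-in for a human operator who confirms every step.
print(run_checklist(demo_steps, confirm=lambda prompt: True))
```

In an interactive session, `confirm` could wrap `input()`; passing it in as a callback keeps the loop testable.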

spec diff

Show a semantic diff between a Markdown spec and its compiled YAML.

spec diff specs/my-feature.md

# Detailed output
spec diff specs/my-feature.md --verbose

Useful for:

  • Catching compilation drift
  • Reviewing changes before commit
  • Validating round-trip integrity
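
The cheapest drift signal is already present in the compiled file: the `source_md_sha256` field. The sketch below checks only that hash. It is not the semantic diff that `spec diff` performs, and `sha256_of` and `has_drifted` are hypothetical helper names.

```python
import hashlib


def sha256_of(text: str) -> str:
    """Hex SHA-256 digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_drifted(markdown_text: str, compiled_meta: dict) -> bool:
    """Return True if the Markdown no longer matches the recorded source hash."""
    return compiled_meta.get("source_md_sha256") != sha256_of(markdown_text)


original = "# My Feature\n\n## Objective\n\nShip it.\n"
compiled_meta = {"source_md_sha256": sha256_of(original)}

print(has_drifted(original, compiled_meta))              # False
print(has_drifted(original + "edited\n", compiled_meta)) # True
```

A hash mismatch says that the source changed since compilation, not what changed; that is where a field-by-field comparison is needed.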

๐Ÿ“ Design Principles

1. Markdown-First Authoring

Humans write in Markdown. Machines execute YAML.

Why Markdown?

  • Human-readable and writable
  • Great for collaboration (Git diffs, PR reviews)
  • Natural section structure (H2/H3 headings)
  • Easy to template with Jinja2

Why not edit YAML directly?

  • YAML is verbose and error-prone for humans
  • Hard to review in PRs
  • Machine format should be generated, not authored

2. Deterministic Compilation

Every compilation is reproducible and verifiable. The compiler achieves this through:

  • Canonical YAML ordering: sorted keys, no anchors/aliases
  • Source hash tracking: source_md_sha256 for integrity
  • Null timestamps: compiled_at: null for bit-identical output
  • Round-trip guard: fails if recompiling produces different output

This enables:

  • Git-friendly diffs (no spurious changes)
  • Pre-commit hooks (enforce MD/YAML sync)
  • Audit trails (hash verification)

Compiled YAML includes:

meta:
  source_md_path: specs/user-auth.md
  source_md_sha256: "abc123..."
  compiler_version: "spec-compiler/0.1.0"
  compiled_at: null  # intentionally null for determinism
  tier: "B"
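
The determinism properties above can be sketched with the standard library alone. In this illustration JSON stands in for YAML so the example has no third-party dependencies, and `compile_spec` and `check_round_trip` are hypothetical names rather than the compiler's real functions.

```python
import hashlib
import json
from typing import Optional


def compile_spec(source_md: str, fields: dict) -> str:
    """Produce a canonical, byte-stable document for the given spec fields."""
    meta = dict(fields)
    meta["source_md_sha256"] = hashlib.sha256(source_md.encode("utf-8")).hexdigest()
    meta["compiled_at"] = None  # no timestamp, so repeated runs are identical
    # sort_keys gives a canonical key order regardless of insertion order.
    return json.dumps({"meta": meta}, sort_keys=True, indent=2) + "\n"


def check_round_trip(existing: Optional[str], fresh: str, overwrite: bool = False) -> None:
    """Raise if a previously compiled file differs from the fresh output."""
    if existing is not None and existing != fresh and not overwrite:
        raise RuntimeError("compiled output differs from the existing file")


source = "# My Feature\n"
first = compile_spec(source, {"tier": "B", "title": "My Feature"})
second = compile_spec(source, {"title": "My Feature", "tier": "B"})

print(first == second)           # True: key order does not affect the output
check_round_trip(first, second)  # passes silently
```

Because the output is byte-identical across runs, the round-trip guard can be a plain string comparison.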

3. Governance as Code

AIPs aren't just checklists; they're executable governance contracts:

  • Schema-validated (JSON Schema)
  • Tier-aware defaults
  • Gate approvals enforced
  • Metrics tracked (coverage, defects, budget)

4. Token-Based Markdown Parsing

Uses markdown-it-py instead of regex:

  • Handles nested code blocks correctly
  • Robust against edge cases (backticks in headings, etc.)
  • Proper token tree for precise extraction
  • Extensible for future enhancements
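
The failure mode that motivates this choice is easy to reproduce. The sketch below does not use markdown-it-py; it is a standard-library illustration (with hypothetical function names) of how a regex-only heading scan picks up lines inside a fenced code block, while a scan that tracks fence state does not.

```python
import re

FENCE = "`" * 3  # built programmatically to keep this example readable


def headings_naive(text: str) -> list[str]:
    """Regex-only scan: matches every line that looks like an H2/H3."""
    return re.findall(r"^#{2,3} (.+)$", text, flags=re.MULTILINE)


def headings_fence_aware(text: str) -> list[str]:
    """Scan that ignores anything inside a fenced code block."""
    found = []
    open_fence = None  # the marker that opened the current fence, if any
    for line in text.splitlines():
        stripped = line.strip()
        fence = re.match(r"(`{3,}|~{3,})", stripped)
        if fence:
            marker = fence.group(1)
            if open_fence is None:
                open_fence = marker
            elif (
                marker[0] == open_fence[0]
                and len(marker) >= len(open_fence)
                and stripped == marker  # a closing fence has no info string
            ):
                open_fence = None
            continue
        if open_fence is None:
            match = re.match(r"#{2,3} (.+)$", line)
            if match:
                found.append(match.group(1))
    return found


sample = "\n".join([
    "## Plan",
    FENCE + "bash",
    "## this line is shell text, not a heading",
    FENCE,
    "### Step 1: Planning",
])

print(headings_naive(sample))        # includes the line inside the fence
print(headings_fence_aware(sample))  # ['Plan', 'Step 1: Planning']
```

A real tokenizer handles many more cases (indented code, nested lists, inline backticks), which is why a parser library is preferable to extending a hand-written scan like this one.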

5. Tiered Governance, Not Tiered Workflows

Same workflow for all tiers, different governance:

  • Tier A: All gates require human approval (24-72h SLAs)
  • Tier B: Standard approval process (8-48h SLAs)
  • Tier C: Most gates auto-approved (1-24h SLAs, only G2 requires human)

This ensures:

  • Process integrity (no skipped steps)
  • Flexibility (adjust rigor to risk)
  • Auditability (all tiers traceable)
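
One way to picture this principle: the gate list is fixed, and only the set of auto-approved gates varies by tier. The sketch below is illustrative, with hypothetical names. The Tier A and Tier C entries follow the bullets above; the Tier B entry is an assumption, since "standard approval process" does not say which gates, if any, are auto-approved.

```python
ALL_GATES = {"G0", "G1", "G2", "G3", "G4"}

# Gates that are auto-approved for each tier.
AUTO_APPROVED = {
    "A": set(),               # all gates require human approval
    "B": set(),               # ASSUMPTION: no auto-approval in the standard process
    "C": ALL_GATES - {"G2"},  # only G2 requires a human
}


def requires_human(tier: str, gate: str) -> bool:
    """Return True if `gate` needs a human sign-off for the given tier."""
    if gate not in ALL_GATES:
        raise ValueError(f"unknown gate: {gate}")
    return gate not in AUTO_APPROVED[tier]


for gate in sorted(ALL_GATES):
    print(gate, "human" if requires_human("C", gate) else "auto")
```

No tier removes a gate from `ALL_GATES`, so every tier still passes through all five checkpoints and leaves the same audit trail.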

6. Schema Validation with Defaults Merging

AIPs can be sparse (only specify what differs from tier defaults):

# In your compiled AIP (minimal)
meta:
  tier: B
  title: "My Feature"
# ...

# At validation time, merged with tier-B defaults:
gates:
  - gate_id: G0-plan-approval
    approver_role: "Tech Lead + Peer"
    # ... all default gate config

This keeps specs concise while ensuring complete validation.
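
Defaults merging of this kind is commonly implemented as a recursive dictionary merge. The sketch below shows one plausible version; `merge_defaults` is a hypothetical name, and the rule that lists are replaced wholesale rather than merged element by element is an assumption, not documented Specwright behavior.

```python
from copy import deepcopy


def merge_defaults(defaults: dict, overrides: dict) -> dict:
    """Return a new dict: `defaults` with `overrides` layered on top.

    Nested dicts are merged recursively; any other value (including a
    list) in `overrides` replaces the default outright.
    """
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


tier_b_defaults = {
    "meta": {"tier": "B"},
    "gates": [{"gate_id": "G0-plan-approval", "approver_role": "Tech Lead + Peer"}],
}
sparse_aip = {"meta": {"tier": "B", "title": "My Feature"}}

merged = merge_defaults(tier_b_defaults, sparse_aip)
print(merged["meta"]["title"])              # My Feature
print(merged["gates"][0]["approver_role"])  # Tech Lead + Peer
```

The function copies rather than mutates its inputs, so one set of tier defaults can be reused safely across many AIPs.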


๐Ÿ—๏ธ Project Structure

specwright/
├── src/spec/                    # Core implementation
│   ├── cli/spec.py             # CLI commands
│   ├── compiler/               # Markdown→YAML compiler
│   │   ├── parser.py           # Token-based MD parser
│   │   └── compiler.py         # Deterministic YAML generator
│   └── core/                   # Shared utilities
│       └── loader.py           # YAML loading + defaults merging
│
├── config/                      # Configuration
│   ├── templates/
│   │   ├── specs/              # Markdown templates (tier-a/b/c)
│   │   └── aips/               # YAML templates (legacy)
│   ├── defaults/               # Tier defaults (tier-A/B/C.yaml)
│   ├── schemas/                # JSON Schema for validation
│   └── policies/               # Reusable policy packs
│
├── specs/                       # Human-authored Markdown specs
├── aips/                        # Validated AIPs (YAML)
├── docs/                        # Documentation
├── tests/                       # Test suite
│   ├── compiler/
│   │   └── golden/             # Golden test snapshots
│   └── integration/
│
├── pyproject.toml              # Project configuration
└── README.md                   # This file

🧪 Testing

# Run linter
ruff check src/ tests/

# Type checking
mypy src/

# Unit tests
pytest tests/

# Golden tests (snapshot-based)
pytest tests/compiler/golden/ -v

# Integration tests
pytest tests/integration/ -v

Pre-commit Hook

Enforce MD/YAML sync:

#!/bin/bash
# .git/hooks/pre-commit
for md in specs/*.md; do
    yaml="${md%.md}.compiled.yaml"
    if [ -f "$yaml" ]; then
        spec diff "$md" || exit 1
    fi
done

🔄 Workflow Example

1. Create a Tier B feature spec

spec new --tier B --title "Add OAuth login" --owner alice --goal "Implement Google OAuth"

Generated: specs/add-oauth-login.md
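
The generated filename appears to be a slug of the title. A plausible derivation, offered as an assumption rather than Specwright's actual rule, is shown below; `spec_path_for` is a hypothetical helper.

```python
import re


def spec_path_for(title: str, directory: str = "specs") -> str:
    """Derive a spec path by lower-casing the title and hyphenating it."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{directory}/{slug}.md"


print(spec_path_for("Add OAuth login"))  # specs/add-oauth-login.md
```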

2. Edit the spec

---
tier: B
title: Add OAuth login
owner: alice
goal: Implement Google OAuth
---

# Add OAuth login

## Objective

Add Google OAuth 2.0 authentication flow to allow users to sign in with their Google accounts.

## Acceptance Criteria

- [ ] Users can click "Sign in with Google"
- [ ] OAuth callback handles authorization code
- [ ] User profile synced to local database
- [ ] Session management with JWT
- [ ] 85% test coverage achieved

## Context

### Background

Current email/password auth is limiting adoption. Users expect social login.

### Constraints

- Must use Google's official OAuth 2.0 library
- Store only necessary user data (email, name, profile picture)
- GDPR compliant (user can revoke access)

## Plan

### Step 1: Planning [G0: Plan Approval]

**Prompt:**

Create detailed WBS for OAuth integration:
- Frontend: Google Sign-In button + callback page
- Backend: OAuth flow, token exchange, user provisioning
- Database: user table updates for OAuth identifiers
- Security: CSRF protection, state validation

**Outputs:**

- `artifacts/plan/wbs.md`
- `artifacts/plan/security-checklist.md`

### Step 2: Prompt Engineering [G1: Code Readiness]

**Prompt:**

Generate implementation prompts for:
- Frontend: React component with Google OAuth SDK
- Backend: FastAPI endpoints for /auth/google/callback
- Database migrations for oauth_provider, oauth_id fields

**Outputs:**

- `artifacts/prompts/frontend-prompts.md`
- `artifacts/prompts/backend-prompts.md`

### Step 3: Implementation [G3: Deployment Approval]

**Commands:**

```bash
ruff check .
mypy .
pytest -q
```

**Outputs:**

- `artifacts/code/release-notes.md`
- `artifacts/code/runbook.md`

### Step 4: Testing [G2: Pre-Release]

**Commands:**

```bash
pytest --cov=src --cov-report=xml
```

**Outputs:**

- `artifacts/test/coverage.xml`

### Step 5: Governance [G4: Post-Implementation]

**Outputs:**

- `artifacts/governance/decision-log.md`
- `artifacts/governance/privacy-checklist.md`

## Models & Tools

**Tools:** bash, pytest, ruff, mypy

## Repository

**Branch:** feat/add-oauth-login

**Merge Strategy:** squash

3. Compile and validate

spec compile specs/add-oauth-login.md
spec validate specs/add-oauth-login.compiled.yaml

4. Execute

# Interactive guided execution
spec run specs/add-oauth-login.compiled.yaml

# Or preview first
spec run specs/add-oauth-login.compiled.yaml --plan

5. Promote to AIP (optional)

spec promote specs/add-oauth-login.md --to aips/

Output: aips/AIP-2025-10-31-001.yaml (immutable release artifact)


🎓 Learning Resources

For New Users

  1. Read Agentsway Implementation Guide
  2. Try creating a Tier C spec: spec new --tier C
  3. Review the generated Markdown template
  4. Compile and run through the workflow

For Contributors

  1. Read Spec Compilation Guide
  2. Review compiler implementation
  3. Run golden tests: pytest tests/compiler/golden/ -v
  4. Check open issues

🎨 The Story

Specwright was built to solve a real problem: How do you govern AI-driven development without crushing velocity?

The answer: Tiered governance. Not every change needs a 72-hour review cycle. Documentation updates can fast-lane with auto-approved gates. Security changes get formal sign-offs.

Specwright ensures the right rigor for the right risk.

It's part of a larger ecosystem:

  • Specwright defines the governance framework
  • Dogfold learns from builds and scaffolds recursively
  • Gorch orchestrates on Google Cloud
  • LifeOS presents it all to humans

This is meta-engineering: tools that build the builders, then build the world.


🚀 Roadmap

v0.3.0 (Current)

  • ✅ Markdown-first authoring with Jinja2 templates
  • ✅ Deterministic compilation with source hash tracking
  • ✅ Token-based Markdown parsing
  • ✅ Round-trip validation and diff detection
  • ✅ Tier-specific governance with 5-gate model
  • ✅ Schema validation with defaults merging

v0.4.0 (Next Quarter)

  • Rename to specwright package
  • Actual agent execution (replace checklist mode)
  • State persistence (.aip_artifacts/state.json)
  • Automated gate approvals (Slack/email integration)
  • Metrics tracking (budget, coverage, time-to-green)
  • Integration with Dogfold scaffolding

v1.0.0 (Future)

  • Multi-agent orchestration
  • Policy enforcement engine
  • Compliance reporting (ISO 42001, NIST AI RMF)
  • Web UI for spec management
  • Full Gorch integration (Google Cloud orchestration)

๐Ÿค Contributing

Contributions welcome! Please:

  1. Read CONTRIBUTING.md for contribution guidelines
  2. See DEVELOPMENT.md for local development workflow (dogfooding while building)
  3. Check open issues
  4. Submit PRs against main branch
  5. Ensure tests pass: pytest tests/
  6. Run linter: ruff check src/

📄 License

MIT License - see LICENSE file for details.


๐Ÿ™ Acknowledgments

  • Agentsway Implementation Guide - Governance framework foundation
  • ISO 42001:2023 - AI management system standards
  • NIST AI RMF 1.0 - Risk management framework
  • Dogfold - Recursive scaffolding partner
  • Gorch - Google Cloud orchestration layer

Built with ❤️ for rigorous, traceable, human-in-the-loop AI-assisted development.

Specwright defines. Dogfold builds. Gorch orchestrates. LifeOS lives.
