Plan-Build Spec (pb-spec): A CLI tool for managing AI coding assistant skills

pb-spec — Plan-Build Spec

DeepWiki Context7 Python 3.12+ License: Apache-2.0 PyPI

pb-spec

pb-spec is a set of AI coding assistant workflow skill packages that comply with the Agent Skills specification. It provides a structured plan-build process that turns natural-language requirements into well-architected, BDD-driven, tested code.

Design Philosophy

pb-spec implements the Plan-Build pattern: a Planner Agent generates design specs, and a Builder Agent executes code with verification. The core innovation is making .feature files the absolute source of truth — all design and tasks derive FROM scenarios, not the other way around.

Core Principles

| Principle | Description |
|---|---|
| BDD-First | Feature scenarios are the source of truth. Design and tasks derive FROM scenarios. |
| RFC 2119 Constraints | Architectural constraints use MUST/SHOULD/MAY keywords and are binding for the Builder. |
| DAG Execution | Tasks include DependsOn metadata for parallel execution of independent tasks. |
| Adaptive Steering | Tasks with Complexity=High route to reasoning models; Low to fast models. |
| Escalation Protocol | Repeated failures auto-escalate to stronger models for root-cause analysis. |
| Generator/Evaluator Isolation | Generator builds; Evaluator judges with fresh context and never inherits Generator state. |

Design Standards

design.md artifacts conform to industry-standard specification formats:

| Standard | Purpose | pb-spec Application |
|---|---|---|
| EARS Notation | Eliminate ambiguous requirements with 5 sentence patterns | Every requirement uses EARS syntax with [REQ-XX] IDs |
| C4 Model + Mermaid | Architecture topology in parseable text | Architecture sections use fenced mermaid blocks |
| DBML / Prisma Schema | Structured data models with strict types | Data model sections use DBML or Prisma Schema DSL |
| MADR (ADR Records) | Architecture decision records | Every AD has [Context], [Decision], [Consequences] |
| RFC 2119 Constraints | Binding behavioral constraints for agents | §Architectural Constraints with MUST/SHOULD/MAY |
| Behavior Traceability Matrix | Every component maps to a Feature scenario | No scenario = remove from design |

Best-Practice Alignment

| Source | Core Idea | How pb-spec Applies It |
|---|---|---|
| RPI Strategy | Separate research, planning, and implementation | /pb-init + /pb-plan precede /pb-build |
| Plan-and-Solve Prompting | Plan first to reduce missing-step errors | design.md + tasks.md are mandatory artifacts |
| ReAct | Interleave reasoning and actions with environment feedback | /pb-build executes task-by-task with test/tool feedback loops |
| Reflexion | Learn from failure signals via iterative retries | Escalation protocol + DCR flow in pb-build |
| Harness Engineering (OpenAI, 2026-02-11) | Treat runtime signals and checklists as first-class harness inputs | pb-plan requires runtime verification hooks; pb-build validates logs/health evidence |
| openai/symphony | Long-running agents need explicit observability and deterministic escalation | pb-build enforces bounded retries and emits standardized DCR packets |
| Effective Harnesses for Long-Running Agents | Grounding, context hygiene, recovery, observability | State checks, minimal context handoff, task-local rollback |
| Building Effective Agents | Prefer simple composable workflows over framework complexity | Small adapter-based CLI + explicit workflow prompts |
| Harness Design for Long-Running Application Development | Generator/Evaluator separation; adversarial evaluation | pb-build dual-persona with adaptive evaluation by task complexity |
| shadcn/improve | Audit codebase, write self-contained plans for cheaper executors | /pb-improve surveys codebase, produces prioritized specs |
| Agent-SOP | RFC 2119 constraints, DAG tasks, adaptive model routing | RFC 2119 in design.md, DAG metadata in tasks.md, Escalation protocol |
| Superpowers | Composable skills, evidence-based claims, systematic debugging | using-pb-spec bootstrap; supporting skills ecosystem |

Installation

No manual file configuration needed. As long as your AI assistant supports the standard Agent Skills specification (Claude Code, Cursor, GitHub Copilot, OpenCode, etc.), you can install with one command.

Run in your project root:

# Install all pb-spec workflow skills at once
npx skills add longcipher/pb-spec

# Or install only the ones you need
npx skills add longcipher/pb-spec --skill pb-init
npx skills add longcipher/pb-spec --skill pb-plan
npx skills add longcipher/pb-spec --skill pb-build
npx skills add longcipher/pb-spec --skill pb-improve

(After installation, skills will be placed in .agents/skills/ or the compatible local directory for your environment, and automatically indexed by your AI.)

Quick Start

Invoke these skills directly in your IDE / Agent terminal:

  1. /pb-init: Audit the current project and non-destructively create or update the AGENTS.md architecture snapshot.
  2. /pb-plan "requirement description": Generates design.md, tasks.md, and .feature test specs with Architecture Decisions and RFC 2119 constraints. Example: /pb-plan Add WebSocket auth.
  3. /pb-build "feature-name": Reads tasks.md and runs the outer BDD and inner TDD loops with subagents, verifying and closing tasks one at a time.
  4. /pb-refine "feature-name": (Optional) When a build is blocked (Build Blocked) or the design has architectural flaws, iteratively revises the specs based on feedback.
  5. /pb-improve: Audit the codebase and generate prioritized implementation specs for other agents to execute.

Skills Overview

Workflow Skills

| Skill | Trigger | Output | Description |
|---|---|---|---|
| pb-init | /pb-init | AGENTS.md | Audit repo and safely update/append a managed snapshot block without rewriting user-authored constraints |
| pb-plan | /pb-plan <requirement> | specs/<spec-dir>/design.md + tasks.md + features/*.feature | Design proposal + Gherkin scenarios + ordered task breakdown with RFC 2119 constraints and DAG metadata |
| pb-refine | /pb-refine <feature> | Revised spec files | Apply feedback or Design Change Requests |
| pb-build | /pb-build <feature-name> | Code + tests | BDD+TDD via Generator (builds) + Evaluator (adversarial review) with Escalation protocol |
| pb-improve | /pb-improve | specs/<spec-dir>/ + specs/context.md | Codebase audit → prioritized findings → pb-plan-compatible specs |

Supporting Skills

| Skill | Trigger | Description |
|---|---|---|
| using-pb-spec | Session start | Bootstrap: establishes skill invocation discipline |
| pb-brainstorming | Before creative work | Explores intent, requirements, and design before implementation |
| pb-test-driven-development | Before writing code | Red → Green → Refactor cycle, non-negotiable TDD |
| pb-systematic-debugging | Any bug or failure | Find root cause before attempting fixes |
| pb-verification-before-completion | Before claiming done | Evidence before assertions, always |
| pb-requesting-code-review | Before merge | Dispatch independent reviewer subagent |
| pb-receiving-code-review | Processing feedback | Technical evaluation, not performative agreement |
| pb-dispatching-parallel-agents | Multiple independent tasks | One agent per problem domain, concurrent execution |
| pb-subagent-driven-development | Executing plans | Fresh context per task, two-stage review |
| pb-finishing-a-development-branch | Work complete | Verify tests → Present options → Execute choice |
| pb-writing-skills | Creating/editing skills | Skills are code, not prose; test them |

Supported AI Tools

This skill library uses the standard SKILL.md specification and is fully compatible with:

  • Cursor
  • Claude Code
  • GitHub Copilot / GitHub Spark
  • OpenCode
  • Gemini CLI & Codex
  • Any tool that supports agentskills.io specification or reads .agents/skills/.

Workflow

Five agent skills that chain together:

/pb-init → /pb-plan → [/pb-refine] → /pb-build
                    ↘
                    /pb-improve → specs/ → /pb-build

Supporting skills activate automatically during the workflow:

  • pb-brainstorming — before /pb-plan when requirements are unclear
  • pb-test-driven-development — during /pb-build for every task
  • pb-systematic-debugging — when tasks fail repeatedly
  • pb-verification-before-completion — before marking any task DONE
  • pb-requesting-code-review / pb-receiving-code-review — review cycles
  • pb-dispatching-parallel-agents — parallel audit in /pb-improve
  • pb-finishing-a-development-branch — finalization after /pb-build

1. /pb-init — AGENTS.md Snapshot & Safe Merge

Audits your project and writes a pb-init snapshot into AGENTS.md using managed markers:

  • <!-- BEGIN PB-INIT MANAGED BLOCK -->
  • <!-- END PB-INIT MANAGED BLOCK -->

Merge behavior is non-destructive:

  • If markers exist, only that managed block is replaced.
  • If markers do not exist, the managed block is appended.
  • All existing content outside the managed block is preserved verbatim.

The managed snapshot includes an Architecture Decision Snapshot so later agents inherit repo-level conventions instead of re-deciding them every run.
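The merge rules above can be sketched in a few lines of Python. This is an illustration only, not pb-init's actual implementation: the function name `merge_managed_block` and the choice to separate an appended block with a blank line are assumptions; only the two marker strings come from the text above.

```python
BEGIN = "<!-- BEGIN PB-INIT MANAGED BLOCK -->"
END = "<!-- END PB-INIT MANAGED BLOCK -->"


def merge_managed_block(existing: str, snapshot: str) -> str:
    """Return AGENTS.md content with the managed block replaced or appended.

    Hypothetical helper: everything outside the markers is copied verbatim.
    """
    block = f"{BEGIN}\n{snapshot.strip()}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start != -1 and end != -1:
        # Markers exist: swap only the managed block.
        return existing[:start] + block + existing[end + len(END):]
    # Markers missing: append the block after the untouched existing content.
    separator = "" if not existing else ("\n" if existing.endswith("\n") else "\n\n")
    return existing + separator + block + "\n"


if __name__ == "__main__":
    original = "# Team rules\nUse tabs.\n"
    print(merge_managed_block(original, "stack: python"))
```

Running the function twice with different snapshots leaves a single managed block in the file, which is the property that makes repeated /pb-init runs safe.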

2. /pb-plan <requirement> — Design & Task Planning

Takes source material in arbitrary format and produces a complete feature spec:

specs/<YYYY-MM-DD-NO-feature-name>/
├── design.md    # Architecture, API contracts, data models, RFC 2119 constraints
├── tasks.md     # Ordered implementation tasks with DAG metadata
└── features/    # Gherkin acceptance artifacts (Source of Truth)

Key capabilities:

  • BDD-First: Feature scenarios are written FIRST; design and tasks derive FROM scenarios
  • EARS Requirements: All acceptance criteria use 5 sentence patterns for machine-checkable verification
  • RFC 2119 Constraints: §Architectural Constraints section with MUST/SHOULD/MAY keywords — binding for Builder
  • Behavior Traceability Matrix: Every design component maps to a Feature scenario (no scenario = remove from design)
  • C4 + Mermaid Architecture: Architecture diagrams in parseable Mermaid syntax
  • MADR Decisions: Architecture decisions with Context/Decision/Consequences
  • DBML Data Models: Structured data models in DSL (natural language forbidden)
  • API-First Contracts: Type signatures before implementation (OpenAPI, Protocol, Trait)
  • DAG-Enabled Tasks: TaskID, DependsOn, Complexity, Required Skills, EvalRule metadata
  • Risk-based testing: Property tests by default for broad input domains; fuzzing and benchmarks conditional
  • Template identity alignment: Renames generic scaffold names to project-matching identifiers
  • Source requirement normalization: Converts arbitrary-format input into a structured requirement ledger
  • Self-reconciliation: Verifies all requirements are covered across design, tasks, and scenarios before finalizing

3. /pb-refine <feature-name> — Design Iteration (Optional)

Reads user feedback or Design Change Requests and updates design.md and tasks.md. Maintains a revision history and cascades changes without overwriting completed work.

Validates 🛑 Build Blocked and 🔄 Design Change Request packets for required sections before modifying spec artifacts.

4. /pb-build <feature-name> — Subagent-Driven Implementation

Implements tasks sequentially using a Generator/Evaluator dual-persona workflow with Escalation protocol:

Generator (subagent) → READY_FOR_EVAL → Evaluator (independent context) → PASS / FAIL
                                                ├── Diff Audit (git diff + scope + architecture)
                                                ├── BDD Evidence Verification (independent re-run)
                                                ├── MCP Live Verification (Playwright / HTTP / CLI)
                                                └── Edge Case Probing (boundaries, errors, security)

On PASS  → Orchestrator marks task DONE in tasks.md
On FAIL  → Evaluator feedback → fresh Generator subagent → retry loop
On 2nd FAIL → Escalation: auto-upgrade to stronger model for root-cause analysis
On 3rd FAIL → DCR packet to /pb-refine

Key principles:

  • BDD-First: Feature scenarios are the source of truth; all business code must satisfy scenarios
  • RFC 2119 Constraints: All constraints from design.md §Architectural Constraints are BINDING
  • TDD is non-negotiable: Every task starts with a failing test (Red → Green → Refactor)
  • Fresh context per subagent: No inherited assumptions; Evaluator never inherits Generator context
  • Architecture decisions are binding: Executes the approved design; does not invent a different architecture
  • Escalation over thrashing: 2nd failure auto-escalates to stronger model; 3rd failure → DCR packet
  • Mode Behavior: Interactive mode (default) or Auto mode (--auto flag)
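The PASS/FAIL flow above can be read as a small bounded retry loop. The sketch below is one possible interpretation, not pb-build's real orchestrator: `run_task`, `Verdict`, and the `generate`/`evaluate` callables are hypothetical stand-ins for the Generator and Evaluator subagents, and the "stronger model" is reduced to a boolean flag.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    passed: bool
    feedback: str = ""


def run_task(
    task_id: str,
    generate: Callable[[str, str, bool], str],
    evaluate: Callable[[str, str], Verdict],
    max_failures: int = 3,
) -> str:
    """Return "DONE" on PASS, or "DCR" once the failure budget is spent."""
    feedback = ""
    for failures in range(max_failures):
        # After the 2nd FAIL the next attempt runs on the stronger model.
        escalated = failures >= 2
        # Each attempt is a fresh Generator call; only the Evaluator's
        # feedback is carried over, never the previous Generator's state.
        candidate = generate(task_id, feedback, escalated)
        verdict = evaluate(task_id, candidate)
        if verdict.passed:
            return "DONE"
        feedback = verdict.feedback
    return "DCR"  # 3rd FAIL: hand a Design Change Request to /pb-refine
```

Keeping the Evaluator behind its own callable mirrors the isolation rule: it receives the candidate output and the task ID, nothing else.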

5. /pb-improve — Codebase Audit & Plan Generation

Audits any codebase and writes pb-plan-compatible specs for other agents to execute. The skill never modifies source code — only produces specs under specs/.

/pb-improve                        full audit → prioritized findings → specs
/pb-improve quick                  cheap pass: hotspots, top findings only
/pb-improve deep                   exhaustive: every package, every category
/pb-improve security               focused audit (also: perf, tests, bugs, ...)
/pb-improve branch                 audit only what the current branch changes
/pb-improve next                   feature suggestions — where to take the project
/pb-improve plan <description>     skip the audit, spec one thing
/pb-improve review-spec <feature>  critique and tighten an existing spec
/pb-improve reconcile              refresh the backlog: verify, unblock, retire
/pb-improve ... --issues           also publish specs as GitHub issues

How it works:

  1. Recon — Maps the repo: stack, conventions, build/test/lint commands (verification gates). Generates specs/context.md with project context.
  2. Audit — Fans out parallel subagents across 9 categories: correctness, security, performance, test coverage, tech debt, dependencies, DX, docs, direction.
  3. Vet — Re-reads every cited location to drop false positives and correct mis-attributions.
  4. Prioritize — Findings ordered by leverage (impact ÷ effort, weighted by confidence).
  5. Spec — One spec per selected finding in specs/<spec-dir>/ with design.md (RFC 2119 constraints), tasks.md (DAG metadata), and features/*.feature (Source of Truth).
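The Prioritize step can be illustrated with a small scoring function. The exact formula is not given in this document, so this sketch assumes "weighted by confidence" means multiplying impact ÷ effort by a confidence between 0 and 1. The `Finding` class, its field scales, and the sample findings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    impact: float      # assumed scale, e.g. 1 (minor) to 5 (critical)
    effort: float      # assumed scale, must be greater than 0
    confidence: float  # assumed 0.0 to 1.0

    @property
    def leverage(self) -> float:
        # Assumption: confidence acts as a multiplicative weight.
        return self.impact / self.effort * self.confidence


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings from highest to lowest leverage."""
    return sorted(findings, key=lambda f: f.leverage, reverse=True)


if __name__ == "__main__":
    backlog = [
        Finding("Add retry to webhook handler", impact=4, effort=2, confidence=0.5),
        Finding("Parameterize raw SQL query", impact=5, effort=1, confidence=0.9),
        Finding("Rewrite onboarding docs", impact=2, effort=4, confidence=1.0),
    ]
    for finding in prioritize(backlog):
        print(f"{finding.leverage:.2f}  {finding.title}")
```

A low-confidence finding drops in the ranking even when its impact is high, which matches the role of the Vet step that precedes prioritization.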

BDD-First Integration

The core innovation: .feature files are the absolute source of truth. All design and tasks derive FROM scenarios.

Feature-Driven Workflow

features/*.feature (Source of Truth)
        ↓
design.md (derives FROM features)
        ↓
tasks.md (driven BY scenarios)
        ↓
/pb-build (executes with RED→GREEN→REFACTOR)

Behavior Traceability Matrix

Every design component MUST map to a Feature scenario:

| Domain Module | Core Component | Driven by Feature | BDD Tags |
|---|---|---|---|
| Auth | JwtAuthGuard | features/auth/login.feature | @auth, @security |
| Payment | StripeWebhookHandler | features/billing/checkout.feature | @billing, @webhook |

Rule: If a design component cannot be mapped to a scenario, remove it from the design.

RFC 2119 Constraints

Design constraints use RFC 2119 keywords — binding for the Builder:

## Architectural Constraints (RFC 2119)

- **[C-01]** Database interactions **MUST** use the existing ORM layer; raw SQL **MUST NOT** be introduced.
- **[C-02]** API responses **SHOULD** maintain <200ms p99 latency.
- **[C-03]** If an unhandled edge case is encountered, the Builder **MUST** halt and file a DCR.
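Because the constraints follow a regular shape, a Builder-side tool could extract them before starting work. The sketch below is an assumption about how that might look, not part of pb-spec: `parse_constraints` and its regular expressions are hypothetical and only handle the bullet format shown above. The `absolute` flag follows RFC 2119's own reading, where MUST and MUST NOT are absolute requirements and SHOULD is a strong recommendation.

```python
import re

# Matches lines such as "- **[C-01]** Database interactions **MUST** ..."
CONSTRAINT_RE = re.compile(r"^- \*\*\[(C-\d+)\]\*\*\s+(.*)$", re.M)
# Bolded RFC 2119 keywords (a subset of the full keyword list).
KEYWORD_RE = re.compile(r"\*\*(MUST NOT|SHOULD NOT|MUST|SHOULD|MAY)\*\*")

SAMPLE = """\
- **[C-01]** Database interactions **MUST** use the existing ORM layer; raw SQL **MUST NOT** be introduced.
- **[C-02]** API responses **SHOULD** maintain <200ms p99 latency.
- **[C-03]** If an unhandled edge case is encountered, the Builder **MUST** halt and file a DCR.
"""


def parse_constraints(markdown: str) -> list[dict]:
    """Extract constraint IDs and their RFC 2119 keywords from a constraints section."""
    constraints = []
    for constraint_id, text in CONSTRAINT_RE.findall(markdown):
        keywords = KEYWORD_RE.findall(text)
        constraints.append({
            "id": constraint_id,
            "keywords": keywords,
            # MUST / MUST NOT are absolute requirements under RFC 2119.
            "absolute": any(k.startswith("MUST") for k in keywords),
        })
    return constraints


if __name__ == "__main__":
    for constraint in parse_constraints(SAMPLE):
        print(constraint)
```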

DAG-Enabled Tasks

Tasks include metadata for parallel execution and adaptive model routing:

### Task 2.1: "Successful login" — User authenticates

- **TaskID:** `T1`
- **DependsOn:** `None`
- **Complexity:** `High`
- **Required Skills:** Python, JWT, SQLAlchemy
- **EvalRule:** `behave --tags=@login_success` must pass

| Field | Purpose |
|---|---|
| TaskID | Unique identifier for DAG resolution |
| DependsOn | Lists prerequisite TaskIDs; None = can run in parallel |
| Complexity | Low → fast model, High → reasoning model |
| Required Skills | Skills the Builder Agent needs |
| EvalRule | Explicit pass/fail criteria |
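One way to turn DependsOn and Complexity metadata into an execution plan is a topological sort that yields batches of independent tasks. The sketch below uses Python's standard `graphlib` module. It is illustrative only: the tasks `T2` and `T3`, the dictionary layout, and the model names `fast-model` and `reasoning-model` are assumptions, not pb-spec's actual scheduler or configuration.

```python
from graphlib import TopologicalSorter

# Hypothetical parsed tasks.md metadata. Only T1 mirrors the example above.
TASKS = {
    "T1": {"depends_on": [], "complexity": "High"},
    "T2": {"depends_on": [], "complexity": "Low"},
    "T3": {"depends_on": ["T1", "T2"], "complexity": "Low"},
}

# Placeholder model names for the two routing targets.
MODEL_BY_COMPLEXITY = {"Low": "fast-model", "High": "reasoning-model"}


def plan_batches(tasks: dict[str, dict]) -> list[list[tuple[str, str]]]:
    """Group tasks into batches; tasks within one batch have no mutual dependencies."""
    sorter = TopologicalSorter({tid: meta["depends_on"] for tid, meta in tasks.items()})
    sorter.prepare()  # raises graphlib.CycleError if DependsOn forms a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose prerequisites are done
        batches.append(
            [(tid, MODEL_BY_COMPLEXITY[tasks[tid]["complexity"]]) for tid in ready]
        )
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(plan_batches(TASKS), start=1):
        print(f"batch {number}: {batch}")
```

Each batch could be dispatched to parallel subagents, with the next batch starting only after every task in the current one has passed its EvalRule.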

Escalation Protocol

| Failure Count | Action | Model Strategy |
|---|---|---|
| 1 | Retry with same model | Same model |
| 2 | Escalate: auto-upgrade to stronger model for root-cause analysis | +1 tier (Haiku→Sonnet, Sonnet→Opus) |
| 3 | File DCR, stop build | N/A |

Design Philosophy: Agent Harness

pb-spec's design is inspired by Anthropic's research on Effective Harnesses for Long-Running Agents and the Superpowers methodology. Core idea: place AI agents inside a strict, observable, recoverable execution environment.

| Principle | How pb-spec Implements It |
|---|---|
| State Grounding | Subagents verify workspace state before writing code |
| Architecture Continuity | pb-init records Architecture Decisions; pb-build verifies conformance |
| Error Quoting | Subagents quote specific error messages before attempting fixes |
| Context Hygiene | Only minimal, relevant context passed to each subagent |
| Recovery Loop | Pre-task snapshots + file-scoped recovery |
| Verification Harness | Design docs define explicit verification commands |
| Observability as Context | Task verification includes runtime signals (logs/health) |
| Escalation Loop | 2nd failure auto-escalates; 3rd failure → DCR handoff to pb-refine |
| Generator/Evaluator Isolation | Generator builds; Evaluator judges with fresh context |
| Evidence Before Claims | pb-verification-before-completion: no success claims without fresh verification |
| RFC 2119 Constraints | Binding behavioral constraints prevent hallucination and scope creep |
| DAG Execution | Parallel task execution for independent tasks |
| Adaptive Steering | Complexity-based model routing for cost/speed optimization |
| Systematic Debugging | pb-systematic-debugging: root cause before fixes, scientific method |
| Skill Auto-Triggering | using-pb-spec bootstrap ensures skills activate at the right moments |

License

Apache-2.0
