Skip to main content

Task automation from markdown specs via Claude CLI

Project description

spec-runner

Task automation from markdown specs via Claude CLI. Execute tasks from a structured tasks.md file with automatic retries, code review, and Git integration.

Installation

uv add spec-runner

Or for development:

uv sync

Requirements:

  • Python 3.10+
  • Claude CLI (claude command available)
  • Git (for branch management)

Quick Start

# Install Claude Code skills (creates .claude/skills in current project)
spec-runner-init

# Execute next ready task
spec-runner run

# Execute specific task
spec-runner run --task=TASK-001

# Execute all ready tasks
spec-runner run --all

# Create tasks interactively
spec-runner plan "add user authentication"

Usage as Library

from spec_runner import Task, ExecutorConfig, parse_tasks, get_next_tasks
from pathlib import Path

tasks = parse_tasks(Path("spec/tasks.md"))
ready = get_next_tasks(tasks)

for task in ready:
    print(f"{task.id}: {task.name} ({task.priority})")

Features

  • Task-based execution — reads tasks from spec/tasks.md with priorities, checklists, and dependencies
  • Specification traceability — links tasks to requirements (REQ-XXX) and design (DESIGN-XXX)
  • Automatic retries — configurable retry policy with error context passed to next attempt
  • Code review — multi-agent review after task completion
  • Git integration — automatic branch creation, commits, and merges
  • Progress logging — timestamped progress file for monitoring
  • Interactive planning — create tasks through dialogue with Claude

Task File Format

Tasks are defined in spec/tasks.md:

## Milestone 1: MVP

### TASK-001: Implement user login
🔴 P0 | ⬜ TODO | Est: 2d

**Checklist:**
- [ ] Create login endpoint
- [ ] Add JWT token generation
- [ ] Write unit tests

**Depends on:****Blocks:** [TASK-002], [TASK-003]

CLI Commands

spec-runner

spec-runner run                     # Execute next ready task
spec-runner run --task=TASK-001     # Execute specific task
spec-runner run --all               # Execute all ready tasks
spec-runner status                  # Show execution status
spec-runner retry TASK-001          # Retry failed task
spec-runner logs TASK-001           # View task logs
spec-runner reset                   # Reset state
spec-runner plan "feature"          # Interactive task creation

spec-runner-init

spec-runner-init                    # Install skills to ./.claude/skills
spec-runner-init --force            # Overwrite existing skills
spec-runner-init /path/to/project   # Install to specific project

spec-task

spec-task list                      # List all tasks
spec-task list --status=todo        # Filter by status
spec-task show TASK-001             # Task details
spec-task start TASK-001            # Mark as in_progress
spec-task done TASK-001             # Mark as done
spec-task stats                     # Statistics
spec-task next                      # Show next ready tasks
spec-task graph                     # Dependency graph

Multi-phase / Multi-project Options

Both spec-runner and spec-task support --spec-prefix for phase-based workflows:

spec-runner run --spec-prefix=phase5-          # Uses spec/phase5-tasks.md
spec-runner run --project-root=/path/to/proj   # Run against another project
spec-task list --spec-prefix=phase5-           # List phase 5 tasks

Configuration

Configuration file: executor.config.yaml

executor:
  max_retries: 3
  task_timeout_minutes: 30
  claude_command: "claude"
  claude_model: "sonnet"
  spec_prefix: ""              # e.g. "phase5-" for phase5-tasks.md

  # Custom CLI template (optional). Placeholders: {cmd}, {model}, {prompt}
  # command_template: "{cmd} -p {prompt} --model {model}"

  # Review can use different CLI
  review_command: "codex"
  review_model: "gpt-4"
  # review_command_template: "{cmd} -p {prompt}"

  # Git settings
  main_branch: ""  # Auto-detect (main/master) or set explicitly: "master"

  hooks:
    pre_start:
      create_git_branch: true
    post_done:
      run_tests: true
      run_lint: true
      auto_commit: true
      run_review: true

  commands:
    test: "pytest tests/ -v"
    lint: "ruff check ."

  paths:
    root: "."                        # Project root directory
    logs: "spec/.executor-logs"
    state: "spec/.executor-state.json"

Git Branch Workflow

The executor manages git branches automatically:

  1. Branch detection: Auto-detects main or master, or use main_branch config
  2. Task branches: Creates task/task-001-name branches for each task
  3. Auto-merge: Merges task branch to main after completion

Fresh repository (after git init):

  • TASK-000 (scaffolding) runs on the initial branch without creating a separate task branch
  • First commit is made on main
  • Subsequent tasks create their own branches

Existing repository:

  • Each task creates a new branch from main
  • After task completion, branch is merged back to main
  • Task branch is deleted after successful merge

Interrupted tasks:

  • Tasks marked as in_progress are resumed first on next run
  • Use --restart flag to ignore in-progress tasks and start fresh

Supported CLIs

CLI Auto-detected Example template
Claude {cmd} -p {prompt} --model {model}
Codex {cmd} -p {prompt} --model {model}
Ollama {cmd} run {model} {prompt}
llama-cli {cmd} -m {model} -p {prompt} --no-display-prompt
llama-server via curl to localhost:8080
Custom Use template {cmd} --prompt {prompt}

Custom Prompts

You can customize prompts for different LLMs by creating files in spec/prompts/:

spec/prompts/
├── review.md           # Default review prompt
├── review.codex.md     # Codex-specific review prompt
├── review.claude.md    # Claude-specific review prompt
├── review.ollama.md    # Ollama-specific review prompt
├── review.llama.md     # llama.cpp-specific review prompt
└── task.md             # Task execution prompt

The executor automatically selects the prompt based on the CLI being used:

  • review_command: "codex" → uses review.codex.md if exists, otherwise review.md
  • review_command: "ollama" → uses review.ollama.md if exists, otherwise review.md

Prompt Variables

Use ${VARIABLE} or {{VARIABLE}} syntax in templates:

Variable Description
${TASK_ID} Task ID (e.g., TASK-001)
${TASK_NAME} Task name
${CHANGED_FILES} List of changed files
${GIT_DIFF} Git diff summary

Response Format

All review prompts should instruct the LLM to end responses with one of:

  • REVIEW_PASSED — code is acceptable
  • REVIEW_FIXED — issues found and fixed
  • REVIEW_FAILED — issues remain

Tip for smaller models (Ollama, llama): Use shorter, simpler prompts and emphasize the response format requirement.

Project Structure

project/
├── pyproject.toml
├── executor.config.yaml
├── src/
│   └── spec_runner/
│       ├── __init__.py
│       ├── executor.py
│       ├── task.py
│       ├── init_cmd.py
│       └── skills/
│           └── spec-generator-skill/
│               ├── SKILL.md
│               └── templates/
└── spec/
    ├── tasks.md
    ├── requirements.md
    ├── design.md
    └── prompts/

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

spec_runner-0.3.0.tar.gz (118.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

spec_runner-0.3.0-py3-none-any.whl (100.3 kB view details)

Uploaded Python 3

File details

Details for the file spec_runner-0.3.0.tar.gz.

File metadata

  • Download URL: spec_runner-0.3.0.tar.gz
  • Upload date:
  • Size: 118.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for spec_runner-0.3.0.tar.gz
Algorithm Hash digest
SHA256 f59410df27f5a252672a5fa2505528cd57efc832e468531ab11b5cccad4a668e
MD5 9e3546295eae054ffa79ac1ab8e5701c
BLAKE2b-256 dd9fe3ad4c5efcc138d3a4eace4be82e52eb2f621ff0d4dea10fb2edfb030864

See more details on using hashes here.

File details

Details for the file spec_runner-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: spec_runner-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 100.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for spec_runner-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5f4917a32abaf0e1248561ee186837fd609e6fbfecd4e02a8f582bbe9a5c58ea
MD5 383f37ced4bd4636477096290603fabe
BLAKE2b-256 353b767873e1b1b59778193363c2b718f2f67d83a034c7ad377f4a8fbc528609

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page