Task automation from markdown specs via Claude CLI

spec-runner

Task automation from markdown specs via Claude CLI. Execute tasks from a structured tasks.md file with automatic retries, code review, and Git integration.

Installation

uv add spec-runner

Or for development:

uv sync

Requirements:

  • Python 3.10+
  • Claude CLI (the claude command must be available on your PATH)
  • Git (for branch management)

Quick Start

# Install Claude Code skills (creates .claude/skills in current project)
spec-runner-init

# Execute next ready task
spec-runner run

# Execute specific task
spec-runner run --task=TASK-001

# Execute all ready tasks
spec-runner run --all

# Create tasks interactively
spec-runner plan "add user authentication"

Usage as Library

from spec_runner import Task, ExecutorConfig, parse_tasks, get_next_tasks
from pathlib import Path

tasks = parse_tasks(Path("spec/tasks.md"))
ready = get_next_tasks(tasks)

for task in ready:
    print(f"{task.id}: {task.name} ({task.priority})")

Features

  • Task-based execution — reads tasks from spec/tasks.md with priorities, checklists, and dependencies
  • Specification traceability — links tasks to requirements (REQ-XXX) and design (DESIGN-XXX)
  • Automatic retries — configurable retry policy with error context passed to next attempt
  • Code review — multi-agent review after task completion
  • Git integration — automatic branch creation, commits, and merges
  • Progress logging — timestamped progress file for monitoring
  • Interactive planning — create tasks through dialogue with Claude
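
The retry feature above passes the error from a failed attempt to the next one. A minimal sketch of that idea follows; it is not spec-runner's actual implementation, and `run_with_retries`, `attempt`, and `max_attempts` are hypothetical names chosen for illustration.

```python
from typing import Callable, Optional


def run_with_retries(
    attempt: Callable[[Optional[str]], str],
    max_attempts: int = 3,
) -> str:
    """Call `attempt` up to `max_attempts` times, feeding each failure forward.

    `attempt` receives the previous error message (None on the first try),
    so it can include that context in the next prompt.
    """
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        try:
            return attempt(last_error)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = f"{type(exc).__name__}: {exc}"
    raise RuntimeError(f"all {max_attempts} attempts failed; last error: {last_error}")
```

Whether the real max_retries setting counts retries or total attempts is not stated here, so the sketch uses a separate name for its limit.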

Task File Format

Tasks are defined in spec/tasks.md:

## Milestone 1: MVP

### TASK-001: Implement user login
🔴 P0 | ⬜ TODO | Est: 2d

**Checklist:**
- [ ] Create login endpoint
- [ ] Add JWT token generation
- [ ] Write unit tests

**Depends on:**
**Blocks:** [TASK-002], [TASK-003]
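
To make the format concrete, here is a rough sketch of reading the task heading, the status line, and the checklist with regular expressions. It is an illustration based only on the example above, not the parser that `parse_tasks` uses; `ParsedTask` and `parse_task_blocks` are hypothetical names, and the Depends on / Blocks lines are not handled.

```python
import re
from dataclasses import dataclass, field

HEADER_RE = re.compile(r"^###\s+(TASK-\d+):\s*(.+)$")
# Shape taken from the example line: "<emoji> P0 | <emoji> TODO | Est: 2d"
STATUS_RE = re.compile(r"(P\d)\s*\|[^\w|]*(\w+)\s*\|\s*Est:\s*(\S+)")
CHECK_RE = re.compile(r"^- \[( |x|X)\]\s+(.+)$")


@dataclass
class ParsedTask:
    task_id: str
    name: str
    priority: str = ""
    status: str = ""
    estimate: str = ""
    checklist: list[tuple[bool, str]] = field(default_factory=list)


def parse_task_blocks(text: str) -> list[ParsedTask]:
    """Collect tasks from markdown text, attaching lines to the latest heading."""
    tasks: list[ParsedTask] = []
    for raw in text.splitlines():
        line = raw.strip()
        if m := HEADER_RE.match(line):
            tasks.append(ParsedTask(m.group(1), m.group(2).strip()))
        elif not tasks:
            continue  # skip milestone headings and text before the first task
        elif m := STATUS_RE.search(line):
            tasks[-1].priority, tasks[-1].status, tasks[-1].estimate = m.groups()
        elif m := CHECK_RE.match(line):
            # (checked?, item text)
            tasks[-1].checklist.append((m.group(1).lower() == "x", m.group(2)))
    return tasks
```

For real work, prefer the library's own `parse_tasks`, shown under Usage as Library.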

CLI Commands

spec-runner

spec-runner run                     # Execute next ready task
spec-runner run --task=TASK-001     # Execute specific task
spec-runner run --all               # Execute all ready tasks
spec-runner status                  # Show execution status
spec-runner retry TASK-001          # Retry failed task
spec-runner logs TASK-001           # View task logs
spec-runner reset                   # Reset state
spec-runner plan "feature"          # Interactive task creation

spec-runner-init

spec-runner-init                    # Install skills to ./.claude/skills
spec-runner-init --force            # Overwrite existing skills
spec-runner-init /path/to/project   # Install to specific project

spec-task

spec-task list                      # List all tasks
spec-task list --status=todo        # Filter by status
spec-task show TASK-001             # Task details
spec-task start TASK-001            # Mark as in_progress
spec-task done TASK-001             # Mark as done
spec-task stats                     # Statistics
spec-task next                      # Show next ready tasks
spec-task graph                     # Dependency graph

Multi-phase / Multi-project Options

Both spec-runner and spec-task support --spec-prefix for phase-based workflows. The second example also shows --project-root, which runs spec-runner against another project:

spec-runner run --spec-prefix=phase5-          # Uses spec/phase5-tasks.md
spec-runner run --project-root=/path/to/proj   # Run against another project
spec-task list --spec-prefix=phase5-           # List phase 5 tasks
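
The comment on the first command indicates that the prefix is prepended to the tasks file name. A small sketch of that mapping follows; `tasks_file` is a hypothetical helper, and combining the prefix with a project root in this way is an assumption rather than documented behavior.

```python
from pathlib import Path


def tasks_file(project_root: str = ".", spec_prefix: str = "") -> Path:
    """Return the tasks file path for a given project root and spec prefix."""
    # Assumption: the spec directory always lives directly under the project root.
    return Path(project_root) / "spec" / f"{spec_prefix}tasks.md"


print(tasks_file(spec_prefix="phase5-").as_posix())
print(tasks_file("/path/to/proj").as_posix())
```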

Configuration

Configuration file: executor.config.yaml

executor:
  max_retries: 3
  task_timeout_minutes: 30
  claude_command: "claude"
  claude_model: "sonnet"
  spec_prefix: ""              # e.g. "phase5-" for phase5-tasks.md

  # Custom CLI template (optional). Placeholders: {cmd}, {model}, {prompt}
  # command_template: "{cmd} -p {prompt} --model {model}"

  # Review can use a different CLI
  review_command: "codex"
  review_model: "gpt-4"
  # review_command_template: "{cmd} -p {prompt}"

  # Git settings
  main_branch: ""  # Auto-detect (main/master) or set explicitly: "master"

  hooks:
    pre_start:
      create_git_branch: true
    post_done:
      run_tests: true
      run_lint: true
      auto_commit: true
      run_review: true

  commands:
    test: "pytest tests/ -v"
    lint: "ruff check ."

  paths:
    root: "."                        # Project root directory
    logs: "spec/.executor-logs"
    state: "spec/.executor-state.json"
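
The command_template placeholders can be thought of as a simple substitution into an argument list. The following sketch shows one way to expand such a template safely; it is not how spec-runner itself builds the command, and `build_command` is a hypothetical name.

```python
import re
import shlex

PLACEHOLDER_RE = re.compile(r"\{(cmd|model|prompt)\}")


def build_command(template: str, cmd: str, model: str, prompt: str) -> list[str]:
    """Expand {cmd}, {model}, and {prompt} into an argv list."""
    values = {"cmd": cmd, "model": model, "prompt": prompt}
    argv = []
    # Split first, substitute second: a prompt with spaces or quotes then stays
    # a single argument instead of being re-split.
    for token in shlex.split(template):
        argv.append(PLACEHOLDER_RE.sub(lambda m: values[m.group(1)], token))
    return argv


argv = build_command(
    "{cmd} -p {prompt} --model {model}", "claude", "sonnet", "add user authentication"
)
print(argv)
```

The resulting list can be passed to `subprocess.run` without `shell=True`, which avoids shell quoting problems in long prompts.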

Git Branch Workflow

The executor manages git branches automatically:

  1. Branch detection: Auto-detects main or master, or uses the main_branch config value if set
  2. Task branches: Creates a task/task-001-name branch for each task
  3. Auto-merge: Merges the task branch into main after completion
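
The first two steps above can be sketched as two small functions. This is an illustration only: the exact slug rules behind task/task-001-name and the order in which main and master are tried are assumptions, and `pick_main_branch` and `task_branch_name` are hypothetical names.

```python
import re


def pick_main_branch(existing: list[str], configured: str = "") -> str:
    """Prefer the configured branch; otherwise look for main, then master."""
    if configured:
        return configured
    for candidate in ("main", "master"):
        if candidate in existing:
            return candidate
    raise ValueError("no main or master branch found; set main_branch explicitly")


def task_branch_name(task_id: str, task_name: str, max_words: int = 4) -> str:
    """Build a branch name like task/task-001-implement-user-login."""
    # Assumption: lowercase alphanumeric words joined by hyphens, truncated.
    words = re.findall(r"[a-z0-9]+", task_name.lower())
    slug = "-".join(words[:max_words])
    base = f"task/{task_id.lower()}"
    return f"{base}-{slug}" if slug else base


print(pick_main_branch(["master", "dev"]))
print(task_branch_name("TASK-001", "Implement user login"))
```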

Fresh repository (after git init):

  • TASK-000 (scaffolding) runs on the initial branch without creating a separate task branch
  • First commit is made on main
  • Subsequent tasks create their own branches

Existing repository:

  • Each task creates a new branch from main
  • After task completion, branch is merged back to main
  • Task branch is deleted after successful merge

Interrupted tasks:

  • Tasks marked as in_progress are resumed first on next run
  • Use the --restart flag to ignore in-progress tasks and start fresh
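
The resume rule can be pictured as an ordering over tasks: in-progress tasks first, then ready tasks. The sketch below is one reading of that behavior, not the executor's code. `SketchTask` and `pick_order` are hypothetical, "ready" is assumed to mean a todo task whose dependencies are all done, and --restart is modeled as simply skipping in-progress tasks.

```python
from dataclasses import dataclass, field


@dataclass
class SketchTask:
    id: str
    status: str  # "todo", "in_progress", or "done"
    priority: int = 0  # 0 stands for P0, the most urgent
    depends_on: list[str] = field(default_factory=list)


def pick_order(tasks: list[SketchTask], restart: bool = False) -> list[SketchTask]:
    """Return tasks in the order they would be picked up on the next run."""
    done = {t.id for t in tasks if t.status == "done"}
    ready = [
        t for t in tasks
        if t.status == "todo" and all(dep in done for dep in t.depends_on)
    ]
    ready.sort(key=lambda t: t.priority)  # stable: file order breaks ties
    if restart:
        return ready
    resumed = [t for t in tasks if t.status == "in_progress"]
    return resumed + ready
```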

Supported CLIs

CLI            Auto-detected   Example template
Claude                         {cmd} -p {prompt} --model {model}
Codex                          {cmd} -p {prompt} --model {model}
Ollama                         {cmd} run {model} {prompt}
llama-cli                      {cmd} -m {model} -p {prompt} --no-display-prompt
llama-server                   via curl to localhost:8080
Custom         Use template    {cmd} --prompt {prompt}

Custom Prompts

You can customize prompts for different LLMs by creating files in spec/prompts/:

spec/prompts/
├── review.md           # Default review prompt
├── review.codex.md     # Codex-specific review prompt
├── review.claude.md    # Claude-specific review prompt
├── review.ollama.md    # Ollama-specific review prompt
├── review.llama.md     # llama.cpp-specific review prompt
└── task.md             # Task execution prompt

The executor automatically selects the prompt based on the CLI being used:

  • review_command: "codex" → uses review.codex.md if it exists, otherwise review.md
  • review_command: "ollama" → uses review.ollama.md if it exists, otherwise review.md
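
The fallback rule above amounts to a file lookup with a default. A minimal sketch, assuming the CLI-specific suffix is simply the command's base name (so a mapping such as llama-cli to review.llama.md is not covered); `select_prompt` is a hypothetical helper, not part of the package's API.

```python
from pathlib import Path


def select_prompt(prompts_dir: Path, kind: str, command: str) -> Path:
    """Pick <kind>.<cli>.md when it exists, else fall back to <kind>.md."""
    cli_name = Path(command).name  # "/usr/bin/codex" -> "codex"
    specific = prompts_dir / f"{kind}.{cli_name}.md"
    return specific if specific.is_file() else prompts_dir / f"{kind}.md"


print(select_prompt(Path("spec/prompts"), "review", "codex").name)
```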

Prompt Variables

Use ${VARIABLE} or {{VARIABLE}} syntax in templates:

Variable           Description
${TASK_ID}         Task ID (e.g., TASK-001)
${TASK_NAME}       Task name
${CHANGED_FILES}   List of changed files
${GIT_DIFF}        Git diff summary
Response Format

All review prompts should instruct the LLM to end responses with one of:

  • REVIEW_PASSED — code is acceptable
  • REVIEW_FIXED — issues found and fixed
  • REVIEW_FAILED — issues remain

Tip for smaller models (Ollama, llama): Use shorter, simpler prompts and emphasize the response format requirement.

Project Structure

project/
├── pyproject.toml
├── executor.config.yaml
├── src/
│   └── spec_runner/
│       ├── __init__.py
│       ├── executor.py
│       ├── task.py
│       ├── init_cmd.py
│       └── skills/
│           └── spec-generator-skill/
│               ├── SKILL.md
│               └── templates/
└── spec/
    ├── tasks.md
    ├── requirements.md
    ├── design.md
    └── prompts/

License

MIT
