spec-runner

Task automation from markdown specs via Claude CLI. Execute tasks from a structured tasks.md file with automatic retries, code review, and Git integration.

Installation

uv add spec-runner

Or for development:

uv sync

Requirements:

  • Python 3.10+
  • Claude CLI (claude command available)
  • Git (for branch management)

Quick Start

# Install Claude Code skills (creates .claude/skills in current project)
spec-runner-init

# Execute next ready task
spec-runner run

# Execute specific task
spec-runner run --task=TASK-001

# Execute all ready tasks
spec-runner run --all

# Create tasks interactively
spec-runner plan "add user authentication"

Usage as Library

from spec_runner import Task, ExecutorConfig, parse_tasks, get_next_tasks
from pathlib import Path

tasks = parse_tasks(Path("spec/tasks.md"))
ready = get_next_tasks(tasks)

for task in ready:
    print(f"{task.id}: {task.name} ({task.priority})")

Features

  • Task-based execution — reads tasks from spec/tasks.md with priorities, checklists, and dependencies
  • Specification traceability — links tasks to requirements (REQ-XXX) and design (DESIGN-XXX)
  • Automatic retries — configurable retry policy with error context passed to next attempt
  • Code review — multi-agent review after task completion
  • Git integration — automatic branch creation, commits, and merges
  • Progress logging — timestamped progress file for monitoring
  • Interactive planning — create tasks through dialogue with Claude
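The retry behaviour can be pictured with a short sketch. It is an illustration only: `run_with_retries` and the `attempt` callable are hypothetical names, not part of spec-runner's API, and the sketch assumes `max_retries` counts total attempts. The one detail taken from the feature list is that the error from a failed attempt is handed to the next one.

```python
from typing import Callable, Optional


def run_with_retries(
    attempt: Callable[[Optional[str]], str],
    max_retries: int = 3,
) -> str:
    """Call `attempt` until it succeeds or `max_retries` attempts have failed.

    `attempt` receives the error message of the previous failure (None on the
    first try) so it can include that context in its next prompt.
    """
    last_error: Optional[str] = None
    for _ in range(max_retries):
        try:
            return attempt(last_error)
        except Exception as exc:  # sketch only: a real runner would be more selective
            last_error = f"{type(exc).__name__}: {exc}"
    raise RuntimeError(
        f"gave up after {max_retries} attempts; last error: {last_error}"
    )
```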

Task File Format

Tasks are defined in spec/tasks.md:

## Milestone 1: MVP

### TASK-001: Implement user login
🔴 P0 | ⬜ TODO | Est: 2d

**Checklist:**
- [ ] Create login endpoint
- [ ] Add JWT token generation
- [ ] Write unit tests

**Depends on:**
**Blocks:** [TASK-002], [TASK-003]
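To make the layout concrete, here is a minimal sketch of how a task header, its status line, and its checklist could be read. It is not spec-runner's own parser (use `parse_tasks` for real work); the regular expressions, the `ParsedTask` dataclass, and `parse_task_block` are hypothetical and assume exactly the layout shown above.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical patterns that assume the layout in the example above.
HEADER_RE = re.compile(r"^### (TASK-\d+): (.+)$")
STATUS_RE = re.compile(r"(P\d)\s*\|\s*\S+\s+(\w+)\s*\|\s*Est:\s*(\S+)")
CHECK_RE = re.compile(r"^- \[( |x)\] (.+)$")


@dataclass
class ParsedTask:
    id: str
    name: str
    priority: str = ""
    status: str = ""
    estimate: str = ""
    # Each checklist entry is (is_checked, text).
    checklist: list = field(default_factory=list)


def parse_task_block(text: str) -> Optional[ParsedTask]:
    """Parse one task block; returns None if no '### TASK-NNN:' header is found."""
    task = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            task = ParsedTask(id=header.group(1), name=header.group(2))
            continue
        if task is None:
            continue
        status = STATUS_RE.search(line)
        if status:
            task.priority, task.status, task.estimate = status.groups()
            continue
        item = CHECK_RE.match(line)
        if item:
            task.checklist.append((item.group(1) == "x", item.group(2)))
    return task
```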

CLI Commands

spec-runner

spec-runner run                     # Execute next ready task
spec-runner run --task=TASK-001     # Execute specific task
spec-runner run --all               # Execute all ready tasks
spec-runner status                  # Show execution status
spec-runner retry TASK-001          # Retry failed task
spec-runner logs TASK-001           # View task logs
spec-runner reset                   # Reset state
spec-runner plan "feature"          # Interactive task creation

spec-runner-init

spec-runner-init                    # Install skills to ./.claude/skills
spec-runner-init --force            # Overwrite existing skills
spec-runner-init /path/to/project   # Install to specific project

spec-task

spec-task list                      # List all tasks
spec-task list --status=todo        # Filter by status
spec-task show TASK-001             # Task details
spec-task start TASK-001            # Mark as in_progress
spec-task done TASK-001             # Mark as done
spec-task stats                     # Statistics
spec-task next                      # Show next ready tasks
spec-task graph                     # Dependency graph
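
As a rough mental model for what `spec-task next` reports, the sketch below treats a task as ready when it is still `todo` and every task it depends on is `done`, then orders the result by priority. This definition is an assumption made for illustration; `SketchTask` and `ready_tasks` are hypothetical names and not the library's `Task` or `get_next_tasks`.

```python
from dataclasses import dataclass, field


@dataclass
class SketchTask:
    id: str
    status: str = "todo"    # "todo", "in_progress", or "done"
    priority: str = "P2"    # "P0" sorts before "P1", and so on
    depends_on: list = field(default_factory=list)


def ready_tasks(tasks: list) -> list:
    """Return todo tasks whose dependencies are all done, highest priority first."""
    done = {t.id for t in tasks if t.status == "done"}
    ready = [
        t for t in tasks
        if t.status == "todo" and all(dep in done for dep in t.depends_on)
    ]
    return sorted(ready, key=lambda t: (t.priority, t.id))
```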

Multi-phase / Multi-project Options

Both spec-runner and spec-task support --spec-prefix for phase-based workflows. spec-runner also accepts --project-root to run against a project in another directory:

spec-runner run --spec-prefix=phase5-          # Uses spec/phase5-tasks.md
spec-runner run --project-root=/path/to/proj   # Run against another project
spec-task list --spec-prefix=phase5-           # List phase 5 tasks
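
The prefix is simply prepended to the tasks file name. A small sketch of that mapping follows; `tasks_file` is a hypothetical helper, and combining a project root with a prefix in one call is an assumption, since the examples above only show each option on its own.

```python
from pathlib import Path


def tasks_file(project_root: str = ".", spec_prefix: str = "") -> Path:
    """Return <project_root>/spec/<spec_prefix>tasks.md."""
    return Path(project_root) / "spec" / f"{spec_prefix}tasks.md"
```

With the defaults this yields spec/tasks.md; with spec_prefix="phase5-" it yields spec/phase5-tasks.md.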

Configuration

Configuration file: executor.config.yaml

executor:
  max_retries: 3
  task_timeout_minutes: 30
  claude_command: "claude"
  claude_model: "sonnet"
  spec_prefix: ""              # e.g. "phase5-" for phase5-tasks.md

  # Custom CLI template (optional). Placeholders: {cmd}, {model}, {prompt}
  # command_template: "{cmd} -p {prompt} --model {model}"

  # Review can use a different CLI
  review_command: "codex"
  review_model: "gpt-4"
  # review_command_template: "{cmd} -p {prompt}"

  # Git settings
  main_branch: ""  # Auto-detect (main/master) or set explicitly: "master"

  hooks:
    pre_start:
      create_git_branch: true
    post_done:
      run_tests: true
      run_lint: true
      auto_commit: true
      run_review: true

  commands:
    test: "pytest tests/ -v"
    lint: "ruff check ."

  paths:
    root: "."                        # Project root directory
    logs: "spec/.executor-logs"
    state: "spec/.executor-state.json"
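
How spec-runner expands command_template internally is not described here, so the following is only one plausible way to turn such a template into an argument list. `build_command` is a hypothetical helper. It splits the template into tokens first and fills the placeholders afterwards, so that a prompt containing spaces or quotes stays a single argument.

```python
import shlex


def build_command(template: str, cmd: str, model: str, prompt: str) -> list:
    """Expand {cmd}, {model}, and {prompt} in a template into an argv list."""
    values = {"cmd": cmd, "model": model, "prompt": prompt}
    # Tokenize before substituting: the prompt text is never re-split.
    return [token.format(**values) for token in shlex.split(template)]


argv = build_command(
    "{cmd} -p {prompt} --model {model}",
    cmd="claude",
    model="sonnet",
    prompt="Implement the login endpoint",
)
```

The resulting list can be passed to subprocess.run(argv, timeout=task_timeout_minutes * 60) without shell=True, which avoids shell quoting problems in the prompt.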

Git Branch Workflow

The executor manages git branches automatically:

  1. Branch detection: Auto-detects main or master; set main_branch in the config to override
  2. Task branches: Creates task/task-001-name branches for each task
  3. Auto-merge: Merges task branch to main after completion
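
A branch name of the form task/task-001-name can be derived from the task ID and name roughly as sketched below. The exact slug rules spec-runner applies are not documented here, so `task_branch_name` and its normalization (lowercase, non-alphanumeric runs collapsed to a hyphen) are assumptions.

```python
import re


def task_branch_name(task_id: str, task_name: str) -> str:
    """Build a branch name such as task/task-001-implement-user-login."""
    # Lowercase the name and collapse anything that is not a-z or 0-9 into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", task_name.lower()).strip("-")
    prefix = f"task/{task_id.lower()}"
    return f"{prefix}-{slug}" if slug else prefix
```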

Fresh repository (after git init):

  • TASK-000 (scaffolding) runs on the initial branch without creating a separate task branch
  • First commit is made on main
  • Subsequent tasks create their own branches

Existing repository:

  • Each task creates a new branch from main
  • After task completion, branch is merged back to main
  • Task branch is deleted after successful merge

Interrupted tasks:

  • Tasks marked as in_progress are resumed first on next run
  • Use --restart flag to ignore in-progress tasks and start fresh

Supported CLIs

CLI            Auto-detected   Example template
Claude                         {cmd} -p {prompt} --model {model}
Codex                          {cmd} -p {prompt} --model {model}
Ollama                         {cmd} run {model} {prompt}
llama-cli                      {cmd} -m {model} -p {prompt} --no-display-prompt
llama-server                   via curl to localhost:8080
Custom                         Use template {cmd} --prompt {prompt}

Custom Prompts

You can customize prompts for different LLMs by creating files in spec/prompts/:

spec/prompts/
├── review.md           # Default review prompt
├── review.codex.md     # Codex-specific review prompt
├── review.claude.md    # Claude-specific review prompt
├── review.ollama.md    # Ollama-specific review prompt
├── review.llama.md     # llama.cpp-specific review prompt
└── task.md             # Task execution prompt

The executor automatically selects the prompt based on the CLI being used:

  • review_command: "codex" → uses review.codex.md if it exists, otherwise review.md
  • review_command: "ollama" → uses review.ollama.md if it exists, otherwise review.md
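
The fallback rule above amounts to a single file-existence check. The sketch below shows it with a hypothetical `select_prompt` helper; how a command name such as llama-cli is shortened to the llama suffix is not covered and would need its own mapping.

```python
from pathlib import Path


def select_prompt(prompts_dir: Path, kind: str, cli_name: str) -> Path:
    """Prefer <kind>.<cli_name>.md; fall back to <kind>.md if it is missing."""
    specific = prompts_dir / f"{kind}.{cli_name}.md"
    return specific if specific.exists() else prompts_dir / f"{kind}.md"
```

For example, select_prompt(Path("spec/prompts"), "review", "codex") returns the path to review.codex.md when that file exists and to review.md otherwise.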

Prompt Variables

Use ${VARIABLE} or {{VARIABLE}} syntax in templates:

Variable            Description
${TASK_ID}          Task ID (e.g., TASK-001)
${TASK_NAME}        Task name
${CHANGED_FILES}    List of changed files
${GIT_DIFF}         Git diff summary
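
Both placeholder styles can be handled with one regular expression. The sketch below is an illustration, not spec-runner's implementation: `render_prompt` is a hypothetical name, and leaving unknown variables untouched is a design choice made here so that typos stay visible in the rendered prompt.

```python
import re

# Matches ${NAME} (group 1) or {{NAME}} (group 2).
_VAR_RE = re.compile(r"\$\{(\w+)\}|\{\{(\w+)\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace ${NAME} and {{NAME}} with values from `variables`."""
    def replace(match):
        name = match.group(1) or match.group(2)
        # Unknown names are kept verbatim instead of becoming empty strings.
        return variables.get(name, match.group(0))

    return _VAR_RE.sub(replace, template)
```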

Response Format

All review prompts should instruct the LLM to end responses with one of:

  • REVIEW_PASSED — code is acceptable
  • REVIEW_FIXED — issues found and fixed
  • REVIEW_FAILED — issues remain

Tip for smaller models (Ollama, llama): Use shorter, simpler prompts and emphasize the response format requirement.
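
Because the marker is expected at the end of the response, a reader of the output only needs to inspect the last non-empty line. The sketch below does that; `parse_verdict` is a hypothetical helper, and treating a missing marker as None (rather than as a failure) is an assumption.

```python
from typing import Optional

VERDICTS = ("REVIEW_PASSED", "REVIEW_FIXED", "REVIEW_FAILED")


def parse_verdict(response: str) -> Optional[str]:
    """Return the verdict found on the last non-empty line, or None."""
    for line in reversed(response.splitlines()):
        line = line.strip()
        if not line:
            continue
        for verdict in VERDICTS:
            # Substring match tolerates wrappers such as **REVIEW_PASSED**.
            if verdict in line:
                return verdict
        return None  # last non-empty line carries no marker
    return None
```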

Project Structure

project/
├── pyproject.toml
├── executor.config.yaml
├── src/
│   └── spec_runner/
│       ├── __init__.py
│       ├── executor.py
│       ├── task.py
│       ├── init_cmd.py
│       └── skills/
│           └── spec-generator-skill/
│               ├── SKILL.md
│               └── templates/
└── spec/
    ├── tasks.md
    ├── requirements.md
    ├── design.md
    └── prompts/

License

MIT
