Task automation from markdown specs via Claude CLI

spec-runner

Task automation from markdown specs via Claude CLI. Execute tasks from a structured tasks.md file with automatic retries, code review, Git integration, parallel execution, and a live TUI dashboard.

Installation

uv add spec-runner

Or for development:

uv sync

Requirements:

  • Python 3.10+
  • Claude CLI (claude command available)
  • Git (for branch management)
  • gh CLI (optional, for GitHub Issues sync)

Quick Start

# Install Claude Code skills (creates .claude/skills in current project)
spec-runner-init

# Execute next ready task
spec-runner run

# Execute specific task
spec-runner run --task=TASK-001

# Execute all ready tasks
spec-runner run --all

# Execute in parallel with live TUI
spec-runner run --all --parallel --tui

# Create tasks interactively
spec-runner plan "add user authentication"

# Watch mode — continuously execute ready tasks
spec-runner watch

Features

  • Task-based execution — reads tasks from spec/tasks.md with priorities, checklists, and dependencies
  • Specification traceability — links tasks to requirements (REQ-XXX) and design (DESIGN-XXX)
  • Automatic retries — configurable retry policy with exponential backoff and error context forwarding
  • Code review — multi-agent review after task completion with enriched diff context
  • Git integration — automatic branch creation, commits, and merges
  • Parallel execution — run multiple independent tasks concurrently with semaphore-based limiting
  • TUI dashboard — live Textual-based terminal UI with progress bars and log panel
  • Cost tracking — per-task token usage and cost breakdown
  • Watch mode — continuously poll and execute ready tasks
  • Plugin system — extend with custom hooks via spec/plugins/*/plugin.yaml
  • MCP server — read-only Model Context Protocol server for Claude Code integration
  • GitHub Issues sync — bidirectional sync between tasks.md and GitHub Issues
  • Interactive planning — generate specs (requirements + design + tasks) through dialogue with Claude
  • Structured logging — JSON/console output via structlog
  • SQLite state — persistent execution state with WAL mode, auto-migration from legacy JSON
  • HITL review — optional human-in-the-loop approval gate after code review
  • Parallel review — 5 specialized review agents (quality, implementation, testing, simplification, docs) running concurrently
  • Agent personas — role-specific prompt templates and model selection (architect, implementer, reviewer)
  • Constitution guardrails — inviolable project rules from spec/constitution.md injected into every prompt
  • Telegram notifications — alerts on task failure and run completion via Telegram Bot API
  • Pause/resume — pause mid-run with Ctrl+, edit tasks, resume; TUI keybinding p
  • Streaming events — live stdout streaming from Claude CLI to TUI via EventBus
  • Session/idle timeouts — automatic stop after configurable session or idle duration
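To make the "automatic retries" feature concrete, here is a minimal sketch of a retry loop with exponential backoff that forwards the previous error text to the next attempt. It is not spec-runner's implementation: `backoff_delays`, `run_with_retries`, and the `base`, `factor`, and `cap` parameters are hypothetical names chosen for illustration.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 1.0,
                   factor: float = 2.0, cap: float = 60.0) -> List[float]:
    """Delay before each retry: base, base*factor, ... never above `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def run_with_retries(task: Callable[[Optional[str]], T],
                     max_retries: int = 3,
                     sleep: Callable[[float], None] = time.sleep,
                     base: float = 1.0) -> T:
    """Call task(previous_error); on failure wait, then retry with the error text."""
    last_error: Optional[str] = None
    delays = backoff_delays(max_retries, base=base)
    for attempt in range(max_retries + 1):
        try:
            # The first call receives None; later calls receive the last error text.
            return task(last_error)
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt == max_retries:
                raise  # retries exhausted, surface the final error
            sleep(delays[attempt])
    raise RuntimeError("unreachable")
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in practice the delay schedule would come from whatever retry policy is configured.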

Task File Format

Tasks are defined in spec/tasks.md:

## Milestone 1: MVP

### TASK-001: Implement user login
🔴 P0 | ⬜ TODO | Est: 2d

**Checklist:**
- [ ] Create login endpoint
- [ ] Add JWT token generation
- [ ] Write unit tests

**Traces to:** [REQ-001], [DESIGN-001]
**Depends on:**
**Blocks:** [TASK-002], [TASK-003]
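The format above can be read with a few regular expressions. The following is an illustrative sketch only, assuming the layout shown in the example; `ParsedTask` and `parse_task_block` are hypothetical names, and the package's own `parse_tasks` may behave differently.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

HEADER_RE = re.compile(r"^### (TASK-\d+): (.+)$")
# Matches e.g. "P0 | <status icon> TODO | Est: 2d"
STATUS_RE = re.compile(r"(P\d)\s*\|\s*\S+\s+(\w+)\s*\|\s*Est:\s*(\S+)")
CHECK_RE = re.compile(r"^- \[( |x)\] (.+)$")
REF_RE = re.compile(r"\[([A-Z]+-\d+)\]")


@dataclass
class ParsedTask:
    id: str
    name: str
    priority: str = ""
    status: str = ""
    estimate: str = ""
    checklist: List[str] = field(default_factory=list)
    traces_to: List[str] = field(default_factory=list)
    depends_on: List[str] = field(default_factory=list)
    blocks: List[str] = field(default_factory=list)


def parse_task_block(text: str) -> List[ParsedTask]:
    """Parse task headers and the lines that follow each one."""
    tasks: List[ParsedTask] = []
    current: Optional[ParsedTask] = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = ParsedTask(id=header.group(1), name=header.group(2))
            tasks.append(current)
            continue
        if current is None:
            continue  # milestone headings and text before the first task
        status = STATUS_RE.search(line)
        check = CHECK_RE.match(line)
        if status:
            current.priority, current.status, current.estimate = status.groups()
        elif check:
            current.checklist.append(check.group(2))
        elif line.startswith("**Traces to:**"):
            current.traces_to = REF_RE.findall(line)
        elif line.startswith("**Depends on:**"):
            current.depends_on = REF_RE.findall(line)
        elif line.startswith("**Blocks:**"):
            current.blocks = REF_RE.findall(line)
    return tasks


SAMPLE = """\
## Milestone 1: MVP

### TASK-001: Implement user login
🔴 P0 | ⬜ TODO | Est: 2d

**Checklist:**
- [ ] Create login endpoint
- [ ] Add JWT token generation
- [ ] Write unit tests

**Traces to:** [REQ-001], [DESIGN-001]
**Depends on:**
**Blocks:** [TASK-002], [TASK-003]
"""

if __name__ == "__main__":
    for task in parse_task_block(SAMPLE):
        print(task)
```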

CLI Commands

spec-runner

# Execution
spec-runner run                            # Execute next ready task
spec-runner run --task=TASK-001            # Execute specific task
spec-runner run --all                      # Execute all ready tasks
spec-runner run --all --parallel           # Execute ready tasks in parallel
spec-runner run --all --parallel --max-concurrent=5  # With concurrency limit
spec-runner run --all --hitl-review        # Interactive HITL approval gate
spec-runner run --force                    # Skip lock check (stale lock)
spec-runner run --tui                      # Execute with live TUI dashboard
spec-runner run --log-level=DEBUG          # Set log verbosity
spec-runner run --log-json                 # Output logs as JSON

# Monitoring
spec-runner status                         # Show execution status
spec-runner costs                          # Cost breakdown per task
spec-runner costs --json                   # JSON output for automation
spec-runner costs --sort=cost              # Sort by cost descending
spec-runner logs TASK-001                  # View task logs

# Operations
spec-runner retry TASK-001                 # Retry failed task
spec-runner reset                          # Reset state
spec-runner watch                          # Continuously execute ready tasks
spec-runner watch --tui                    # Watch with live TUI dashboard
spec-runner tui                            # Launch TUI status dashboard
spec-runner validate                       # Validate config and tasks

# Planning
spec-runner plan "description"             # Interactive task planning
spec-runner plan --full "description"      # Generate full spec (requirements + design + tasks)

# Integration
spec-runner mcp                            # Launch read-only MCP server (stdio)
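The `--parallel` and `--max-concurrent` options describe semaphore-limited concurrency. A minimal sketch of that pattern with `asyncio.Semaphore` is shown here; `run_limited` and the fake worker are hypothetical and stand in for whatever spec-runner does internally.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


async def run_limited(task_ids: List[str],
                      worker: Callable[[str], Awaitable[str]],
                      max_concurrent: int = 3) -> List[str]:
    """Run worker(task_id) for every ID, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(task_id: str) -> str:
        async with semaphore:  # waits here while the limit is reached
            return await worker(task_id)

    # gather returns results in the same order as task_ids
    return await asyncio.gather(*(guarded(t) for t in task_ids))


async def demo() -> Tuple[List[str], int]:
    """Run six fake tasks with a limit of two and record peak concurrency."""
    running = 0
    peak = 0

    async def fake_worker(task_id: str) -> str:
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)  # stand-in for real work
        running -= 1
        return f"{task_id}: done"

    ids = [f"TASK-{i:03d}" for i in range(1, 7)]
    results = await run_limited(ids, fake_worker, max_concurrent=2)
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```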

spec-task

# Task management
spec-task list                             # List all tasks
spec-task list --status=todo               # Filter by status
spec-task list --priority=p0               # Filter by priority
spec-task list --milestone=mvp             # Filter by milestone
spec-task show TASK-001                    # Task details
spec-task start TASK-001                   # Mark as in_progress
spec-task done TASK-001                    # Mark as done
spec-task block TASK-001                   # Mark as blocked
spec-task check TASK-001 2                 # Mark checklist item
spec-task stats                            # Statistics
spec-task next                             # Show next ready tasks
spec-task graph                            # ASCII dependency graph

# GitHub Issues
spec-task export-gh                        # Export to GitHub Issues format
spec-task sync-to-gh                       # Sync tasks -> GitHub Issues
spec-task sync-to-gh --dry-run             # Preview without making changes
spec-task sync-from-gh                     # Sync GitHub Issues -> tasks.md

spec-runner-init

spec-runner-init                           # Install skills to ./.claude/skills
spec-runner-init --force                   # Overwrite existing skills
spec-runner-init /path/to/project          # Install to specific project

Multi-phase Options

Both spec-runner and spec-task support --spec-prefix for phase-based workflows:

spec-runner run --spec-prefix=phase5-          # Uses spec/phase5-tasks.md
spec-task list --spec-prefix=phase5-           # List phase 5 tasks
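The mapping from prefix to file can be sketched as a simple path join. `spec_file` is a hypothetical helper; the source confirms the mapping only for tasks.md, so applying the same prefix to other spec files is an assumption.

```python
from pathlib import Path


def spec_file(prefix: str, name: str = "tasks.md",
              spec_dir: Path = Path("spec")) -> Path:
    """Return the spec file path for a phase prefix, e.g. 'phase5-'."""
    return spec_dir / f"{prefix}{name}"


if __name__ == "__main__":
    print(spec_file("phase5-"))  # path for --spec-prefix=phase5-
    print(spec_file(""))         # default path with no prefix
```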

Usage as Library

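from pathlib import Path

# Task and ExecutorConfig are exported too; they are imported here to show what is available
from spec_runner import ExecutorConfig, Task, get_next_tasks, parse_tasks

# Parse every task defined in the spec file
tasks = parse_tasks(Path("spec/tasks.md"))
# Select the tasks that are ready to run next
ready = get_next_tasks(tasks)

for task in ready:
    print(f"{task.id}: {task.name} ({task.priority})")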
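For intuition about what "ready" might mean, here is a standalone sketch that treats a task as ready when it is still todo and every dependency is done, then orders by priority. This is an assumption for illustration, not the documented behavior of `get_next_tasks`; `SketchTask`, `ready_tasks`, and `PRIORITY_ORDER` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Lower number sorts first; unknown priorities sort last.
PRIORITY_ORDER = {"p0": 0, "p1": 1, "p2": 2, "p3": 3}


@dataclass
class SketchTask:
    id: str
    priority: str
    status: str  # "todo", "in_progress", "done", or "blocked"
    depends_on: List[str] = field(default_factory=list)


def ready_tasks(tasks: List[SketchTask]) -> List[SketchTask]:
    """Return todo tasks whose dependencies are all done, highest priority first."""
    done = {t.id for t in tasks if t.status == "done"}
    ready = [
        t for t in tasks
        if t.status == "todo" and all(dep in done for dep in t.depends_on)
    ]
    return sorted(ready, key=lambda t: (PRIORITY_ORDER.get(t.priority, 99), t.id))
```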

MCP Server (Claude Code Integration)

spec-runner includes a read-only MCP server for querying project status from Claude Code.

Add to .mcp.json:

{
  "mcpServers": {
    "spec-runner": {
      "command": "spec-runner",
      "args": ["mcp"]
    }
  }
}

Available tools: spec_runner_status, spec_runner_tasks, spec_runner_costs, spec_runner_logs.
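If a project already has a .mcp.json with other servers, the entry can be merged in rather than overwritten. The sketch below does this with the standard library; `add_mcp_server` is a hypothetical helper, not something spec-runner ships.

```python
import json
from pathlib import Path
from typing import Sequence


def add_mcp_server(config_path: Path, name: str = "spec-runner",
                   command: str = "spec-runner",
                   args: Sequence[str] = ("mcp",)) -> dict:
    """Add or update one server entry in .mcp.json, keeping existing entries."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    print(add_mcp_server(Path(".mcp.json")))
```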

Configuration

Configuration file: executor.config.yaml

executor:
  max_retries: 3
  task_timeout_minutes: 30
  claude_command: "claude"
  claude_model: "sonnet"
  spec_prefix: ""              # e.g. "phase5-" for phase5-tasks.md
  max_concurrent: 3            # Parallel task limit
  budget_usd: 50.0             # Total budget cap
  task_budget_usd: 10.0        # Per-task budget cap
  session_timeout_minutes: 0   # Global session timeout (0 = disabled)
  idle_timeout_minutes: 0      # Idle timeout between tasks (0 = disabled)

  # Telegram notifications (optional)
  telegram_bot_token: ""       # Bot token from @BotFather
  telegram_chat_id: ""         # Chat ID to send notifications to
  notify_on: [run_complete, task_failed]

  # Agent personas (optional)
  personas:
    implementer:
      system_prompt: "You are a focused Python developer"
      model: "sonnet"
    reviewer:
      system_prompt: "You are a senior code reviewer"
      model: "haiku"

  hooks:
    pre_start:
      create_git_branch: true
    post_done:
      run_tests: true
      run_lint: true
      auto_commit: true
      run_review: true
      review_parallel: false   # Run 5 review agents in parallel
      review_roles: [quality, implementation, testing]

  commands:
    test: "uv run pytest tests/ -v"
    lint: "uv run ruff check ."

  paths:
    root: "."
    logs: "spec/.executor-logs"
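The two budget settings (`budget_usd` and `task_budget_usd`) suggest a per-task cap and a total cap. A rough sketch of how such caps could be checked follows; `BudgetTracker` and its methods are hypothetical, and whether spec-runner treats the limit as strict or inclusive is not stated in the source.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BudgetTracker:
    budget_usd: float = 50.0       # total cap, mirrors executor.budget_usd
    task_budget_usd: float = 10.0  # per-task cap, mirrors executor.task_budget_usd
    spent: Dict[str, float] = field(default_factory=dict)

    def record(self, task_id: str, cost_usd: float) -> None:
        """Add the cost of one attempt to the task's running total."""
        self.spent[task_id] = self.spent.get(task_id, 0.0) + cost_usd

    def total(self) -> float:
        return sum(self.spent.values())

    def task_over_budget(self, task_id: str) -> bool:
        # Assumption: exceeding (not merely reaching) the cap counts as over budget.
        return self.spent.get(task_id, 0.0) > self.task_budget_usd

    def run_over_budget(self) -> bool:
        return self.total() > self.budget_usd
```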

Git Branch Workflow

  1. Branch detection: Auto-detects main or master, or uses the branch set in the main_branch config option
  2. Task branches: Creates task/TASK-001-short-name branches for each task
  3. Auto-merge: Merges task branch to main after completion
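The naming and detection steps above can be sketched as two pure functions. Both are hypothetical: how spec-runner shortens the task title, and whether a configured main_branch takes precedence over auto-detection, are assumptions made for this example.

```python
import re
from typing import Iterable, Optional


def task_branch_name(task_id: str, title: str, max_words: int = 4) -> str:
    """Build a task/TASK-001-short-name style branch name from a task title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    slug = "-".join(words[:max_words])
    return f"task/{task_id}-{slug}" if slug else f"task/{task_id}"


def detect_main_branch(branches: Iterable[str],
                       configured: Optional[str] = None) -> str:
    """Pick the configured branch if given, else the first of main or master."""
    if configured:  # assumption: explicit config wins over auto-detection
        return configured
    existing = set(branches)
    for candidate in ("main", "master"):
        if candidate in existing:
            return candidate
    raise ValueError("no main or master branch found; set main_branch explicitly")


if __name__ == "__main__":
    print(task_branch_name("TASK-001", "Implement user login"))
    print(detect_main_branch(["develop", "master"]))
```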

Supported CLIs

| CLI | Auto-detected | Example template |
|---|---|---|
| Claude | Yes | {cmd} -p {prompt} --model {model} |
| Codex | Yes | {cmd} -p {prompt} --model {model} |
| Ollama | Yes | {cmd} run {model} {prompt} |
| llama-cli | Yes | {cmd} -m {model} -p {prompt} --no-display-prompt |
| Custom | Use template | {cmd} --prompt {prompt} |
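One way to turn such a template into a subprocess argument list is to split it on whitespace first and substitute placeholders per token, so a prompt containing spaces stays a single argument. This is a sketch under that assumption; `TEMPLATES` and `build_argv` are hypothetical and may not match how spec-runner expands templates.

```python
import shlex
from typing import List

# Templates copied from the table above.
TEMPLATES = {
    "claude": "{cmd} -p {prompt} --model {model}",
    "ollama": "{cmd} run {model} {prompt}",
    "llama-cli": "{cmd} -m {model} -p {prompt} --no-display-prompt",
    "custom": "{cmd} --prompt {prompt}",
}


def build_argv(template: str, cmd: str, prompt: str, model: str = "") -> List[str]:
    """Expand a template into argv, one list element per template token."""
    values = {"cmd": cmd, "prompt": prompt, "model": model}
    # Splitting before substitution keeps a multi-word prompt as one argument,
    # and braces inside the prompt are never re-interpreted as placeholders.
    return [token.format(**values) for token in template.split()]


if __name__ == "__main__":
    argv = build_argv(TEMPLATES["claude"], "claude", "fix the login bug", "sonnet")
    print(shlex.join(argv))  # shell-quoted form, for display only
```

The resulting list can be passed directly to `subprocess.run` without `shell=True`, which avoids quoting problems in prompts.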

Project Structure

project/
├── pyproject.toml
├── executor.config.yaml
├── src/
│   └── spec_runner/
│       ├── __init__.py
│       ├── executor.py          # Re-exports (backward compat)
│       ├── cli.py               # CLI commands + argparse
│       ├── execution.py         # Task execution + retry logic
│       ├── parallel.py          # Async parallel execution
│       ├── config.py            # ExecutorConfig + YAML loading
│       ├── state.py             # SQLite state persistence
│       ├── prompt.py            # Prompt building + templates
│       ├── hooks.py             # Git ops, code review (parallel 5-role), plugins
│       ├── runner.py            # Subprocess execution + event streaming
│       ├── task.py              # Task parsing + management
│       ├── validate.py          # Config + task validation
│       ├── plugins.py           # Plugin discovery + hooks
│       ├── logging.py           # Structured logging (structlog)
│       ├── events.py            # EventBus for streaming to TUI
│       ├── notifications.py     # Telegram notifications
│       ├── tui.py               # Textual TUI dashboard + live streaming
│       ├── mcp_server.py        # MCP server (FastMCP, stdio)
│       ├── init_cmd.py          # Skill installer
│       └── skills/
│           └── spec-generator-skill/
└── spec/
    ├── tasks.md
    ├── requirements.md
    ├── design.md
    └── plugins/

License

MIT
