FormalTask

Structured task management for AI-assisted development workflows. Integrates with Claude Code to provide epic-based planning, parallel task execution, and automated review workflows.

How It Works

Plan → Critique → Specs → Tasks → Workers → Complete → Merge

You describe what you want. /plan explores the codebase and writes a plan. /critique pokes holes. /decompose splits it into YAML specs. ft epic-decompose commits tasks to SQLite with dependency tracking. ft work spawn launches parallel Claude workers in isolated git worktrees.

The database is the coordination backbone — not just storage.
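To make the coordination role concrete, here is a minimal sketch of how dependency tracking in SQLite could answer "which tasks are ready to spawn?". The table names, columns, and status values are assumptions made for illustration; FormalTask's actual schema may differ.

```python
import sqlite3

# Hypothetical schema for illustration only; the real tables may differ.
SCHEMA = """
CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
CREATE TABLE deps (task_id INTEGER, depends_on INTEGER);
"""

# A task is ready when it is pending and none of its dependencies is unfinished.
READY_SQL = """
SELECT t.id FROM tasks AS t
WHERE t.status = 'pending'
  AND NOT EXISTS (
    SELECT 1 FROM deps AS d
    JOIN tasks AS up ON up.id = d.depends_on
    WHERE d.task_id = t.id AND up.status != 'complete'
  )
ORDER BY t.id
"""


def ready_tasks(conn: sqlite3.Connection) -> list[int]:
    """Return the ids of tasks whose dependencies are all complete."""
    return [row[0] for row in conn.execute(READY_SQL)]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        (1, "Define schema", "complete"),
        (2, "Implement API client", "pending"),
        (3, "Wire up UI", "pending"),
    ],
)
conn.executemany("INSERT INTO deps VALUES (?, ?)", [(2, 1), (3, 2)])
print(ready_tasks(conn))  # [2]
```

Because readiness is a query rather than in-memory state, any process that can open the database (a worker, a watch daemon, a dashboard) sees the same answer.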

Plans Carry Their Revision History

Critiques don't live in separate files — they're embedded in the plan:

goals:
  - id: "g-1"
    current: "Users can log in with email/password"
    history:
      - version: "r1"
        text: "Users can log in"
        critique:
          verdict: "FIX_AND_SHIP"
          findings:
            - priority: "P1"
              finding: "Missing rate limiting"
              action: "Add rate limiter"
              resolution: "fixed"  # Set by /revise

Each /critique round appends to history. When /revise addresses findings, it sets resolution: fixed|rejected|deferred. The plan carries its full revision history.

Specs Are Contracts, Guards Enforce Them

Specs declare what the completion system will check:

title: "Task 2: Implement API client"
depends_on: [1]
required_reviews: ["code-quality", "security"]
inputs:
  schema: "$task[1].outputs.schema"       # Auto-wired from Task 1
outputs:
  client: ".artifacts/api_client.py"      # Task 3 can reference this
acceptance_criteria:
  - id: "c-1"
    current: "GET /users returns parsed User objects"
    command: "pytest tests/test_api.py"   # Runnable verification

When ft task complete runs, the completion check evaluates two questions: did every review listed in required_reviews pass, and does every acceptance criterion with a command: field pass when that command is run? The spec is the contract. Guards enforce it.
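As a rough sketch of that check, the function below collects blocking reasons from a spec. It assumes that a criterion passes when its command exits with status 0 and that review outcomes arrive as a name-to-bool mapping; both are illustrative assumptions, and check_completion is a hypothetical name rather than FormalTask's implementation.

```python
import shlex
import subprocess
import sys


def check_completion(spec: dict, review_results: dict[str, bool]) -> list[str]:
    """Return blocking reasons; an empty list means the task may complete."""
    blockers = []
    for review in spec.get("required_reviews", []):
        if not review_results.get(review, False):
            blockers.append(f"review not passed: {review}")
    for criterion in spec.get("acceptance_criteria", []):
        command = criterion.get("command")
        if command is None:
            continue  # criteria without a command are not machine-checked here
        result = subprocess.run(shlex.split(command), capture_output=True)
        if result.returncode != 0:
            blockers.append(f"criterion failed: {criterion['id']}")
    return blockers


py = shlex.quote(sys.executable)
spec = {
    "required_reviews": ["code-quality", "security"],
    "acceptance_criteria": [
        {"id": "c-1", "command": f"{py} -c pass"},
        {"id": "c-2", "command": f'{py} -c "raise SystemExit(1)"'},
        {"id": "c-3"},  # no command, so it is skipped
    ],
}
blockers = check_completion(spec, {"code-quality": True})
print(blockers)  # ['review not passed: security', 'criterion failed: c-2']
```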

The Rules Kernel

Completion checks, orchestration, and prompt generation use the same type:

from dataclasses import dataclass


@dataclass
class Rule:
    when: str      # condition DSL
    then: str      # output (phase name or Jinja2 template)
    target: str    # what it applies to ("task.phase", "notify", "tool.block")
    priority: int  # 0 = informational, 1 = blocks, 999 = catchall
    name: str      # reason (literal or state key for dynamic lookup)

The kernel is ~60 LOC: evaluate(condition, context) → bool, render(template, context) → str, apply_rules(rules, context) where first match wins. The same evaluator answers: "Is this task done?" "Should we spawn a CI fixer?" "What prompt should this worker get?"

The condition DSL supports AND, OR, NOT, comparisons (==, !=, >, <, >=), dotted path resolution (task.metadata.retries), and bare truthy checks. Parentheses are not supported, so flatten complex conditions into multiple rules.

22 builtin rules handle the standard completion lifecycle. Three rule sets ship by default:

| Rule set | Purpose |
|----------|---------|
| BUILTIN_RULES | 22 completion rules (review gates, PR checks, docs, acceptance criteria) |
| ORCHESTRATION_RULES | Watch daemon triggers (e.g., alert after 1 hour) |
| TOOL_REDIRECT_RULES | Block/redirect tool usage (e.g., WebSearch → exa) |

Custom Rules Per Task

Tasks can define their own rules in metadata.completion_rules. These are prepended before BUILTIN_RULES, so they get first-match-wins priority:

# In a spec or task metadata:
"completion_rules": [
    {
        "when": "blocking_findings AND review_rounds.self-critique >= 2",
        "then": "needs_escalation",
        "target": "task.phase",
        "priority": 1,
        "name": "Round cap hit. Escalate to human."
    }
]

This lets individual tasks define their own completion policies without modifying global rules.

User Templates

Worker prompt templates use the same kernel. Drop a Jinja2 file in ~/.claude/templates/ to override any bundled template. User templates take priority, and if a user template fails to parse, the bundled version is used instead.
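The lookup order can be sketched as follows. This is an illustration under assumptions: the template name worker.j2 is made up, and strict_parse is a toy stand-in for real Jinja2 parsing (where the error would be jinja2.TemplateSyntaxError rather than ValueError).

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_template(name: str, user_dir: Path, bundled_dir: Path, parse: Callable[[str], str]) -> str:
    """Prefer the user's template; fall back to the bundled one if it fails to parse."""
    user_file = user_dir / name
    if user_file.is_file():
        try:
            return parse(user_file.read_text(encoding="utf-8"))
        except ValueError:
            pass  # broken user template: fall through to the bundled copy
    return parse((bundled_dir / name).read_text(encoding="utf-8"))


def strict_parse(source: str) -> str:
    """Toy parser: rejects templates with unbalanced {{ }} delimiters."""
    if source.count("{{") != source.count("}}"):
        raise ValueError("unbalanced template delimiters")
    return source


with tempfile.TemporaryDirectory() as tmp:
    user_dir, bundled_dir = Path(tmp, "user"), Path(tmp, "bundled")
    user_dir.mkdir()
    bundled_dir.mkdir()
    (bundled_dir / "worker.j2").write_text("Bundled prompt for {{ task }}", encoding="utf-8")

    # A broken user template triggers the fallback.
    (user_dir / "worker.j2").write_text("Broken {{ task", encoding="utf-8")
    fallback = load_template("worker.j2", user_dir, bundled_dir, strict_parse)

    # A valid user template takes priority.
    (user_dir / "worker.j2").write_text("Custom prompt for {{ task }}", encoding="utf-8")
    override = load_template("worker.j2", user_dir, bundled_dir, strict_parse)

print(fallback)  # Bundled prompt for {{ task }}
print(override)  # Custom prompt for {{ task }}
```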

Workers Create Their Own Tasks

A worker that finds a problem during review can create a new task on the spot:

ft task create-from-finding src/auth.py 42 --title "Fix session expiry edge case"

This creates a critique-gated task — a task with self-critique baked in:

  1. The task starts in a critique phase (c1). The worker must self-review before moving to execution.
  2. A custom completion rule caps critique rounds: if P0/P1 findings persist after 2 self-critique rounds, the task escalates to a human via ft work blocked.
  3. Only after receiving a verdict_go does the task transition to the exec phase where normal completion rules apply.

The task inherits its epic from the spawning worker, carries provenance (source_task_id, finding_ref), and can be auto-spawned by the watch daemon.
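The three steps above amount to a small state machine. The sketch below is one possible reading of them; the function name and the needs_escalation phase label (borrowed from the custom rule example) are assumptions, and the real transitions are driven by rules rather than a hand-written function.

```python
def next_phase(
    phase: str,
    verdict_go: bool,
    blocking_findings: bool,
    critique_rounds: int,
    max_rounds: int = 2,
) -> str:
    """Decide where a critique-gated task goes after a self-critique round."""
    if phase != "c1":
        return phase  # exec and later phases follow the normal completion rules
    if verdict_go:
        return "exec"
    if blocking_findings and critique_rounds >= max_rounds:
        return "needs_escalation"  # P0/P1 findings persisted past the round cap
    return "c1"  # keep critiquing


print(next_phase("c1", verdict_go=False, blocking_findings=True, critique_rounds=1))  # c1
print(next_phase("c1", verdict_go=False, blocking_findings=True, critique_rounds=2))  # needs_escalation
print(next_phase("c1", verdict_go=True, blocking_findings=False, critique_rounds=1))  # exec
```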

What Falls Out

Because everything routes through rules and the database:

  • Auto-spawn fixer tasks when CI fails
  • Nudge stuck workers after 30 minutes
  • Inject thorough-approach prompts for complex tasks
  • Wire outputs to inputs across task dependencies
  • Block completion until required reviews pass
  • Workers spawn new tasks mid-flight, with their own completion policies
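As one example from the list above, output-to-input wiring can be sketched as resolving $task[N].outputs.name references against upstream specs. The resolver below is hypothetical: the requirement that a referenced task appear in depends_on and the .artifacts/schema.json path are assumptions for illustration.

```python
import re

REF_PATTERN = re.compile(r"^\$task\[(\d+)\]\.outputs\.(\w+)$")


def resolve_inputs(spec: dict, specs_by_number: dict[int, dict]) -> dict:
    """Replace $task[N].outputs.name references with the upstream task's output value."""
    resolved = {}
    for name, value in spec.get("inputs", {}).items():
        match = REF_PATTERN.match(value)
        if match is None:
            resolved[name] = value  # literal values pass through unchanged
            continue
        task_number, output_name = int(match.group(1)), match.group(2)
        if task_number not in spec.get("depends_on", []):
            raise ValueError(
                f"input {name!r} references task {task_number}, which is not in depends_on"
            )
        resolved[name] = specs_by_number[task_number]["outputs"][output_name]
    return resolved


specs = {
    1: {"title": "Task 1: Define schema", "outputs": {"schema": ".artifacts/schema.json"}},
    2: {
        "title": "Task 2: Implement API client",
        "depends_on": [1],
        "inputs": {"schema": "$task[1].outputs.schema"},
        "outputs": {"client": ".artifacts/api_client.py"},
    },
}
print(resolve_inputs(specs[2], specs))  # {'schema': '.artifacts/schema.json'}
```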

Quick Start

pip install formaltask

After installation:

  1. Set the required environment variable:

    export OPENROUTER_API_KEY="<your-key-here>"
    
  2. Run the setup wizard:

    ft setup        # Interactive mode
    ft setup --yes  # Non-interactive (CI/scripts)
    

    The setup wizard initializes the database, registers Claude Code hooks, and verifies your configuration.

Prerequisites

  • Python 3.11+ (required)
  • Git (for hooks and version control)
  • tmux 3.2+ (optional, enables parallel worker features)

Optional Feature Groups

Install additional features using pip extras:

| Extra | Purpose |
|-------|---------|
| llm | LLM client libraries (openai, instructor) |
| tui | Terminal user interface dashboard |
| test | Testing dependencies (pytest, hypothesis) |
| dev | Development tools (ruff, basedpyright) |
| agents | Agent-related utilities |
| dayflow | HTTP client utilities |
| mcp | MCP server integration |
| all | All optional dependencies |

Alternative Installation (Development)

For development or contributing to FormalTask:

git clone https://github.com/davidabeyer/formaltask.git
cd formaltask
python3 -m venv venv && source venv/bin/activate
./install.sh

Manual pip Installation

Install in development mode:

pip install -e .

With optional dependencies:

pip install -e ".[all]"

Or install specific extras:

pip install -e ".[tui,test]"

Git Hooks

The ./install.sh script automatically configures git to use the project's tracked hooks. This enables:

  • Pre-commit validation (linting, TDD guard)
  • Pre-push task status enforcement
  • Pre-merge-commit task validation

For manual installations, run: git config core.hooksPath .githooks

Configuration

Settings File

Claude Code settings are stored in ~/.claude/settings.json. This file configures hooks, permissions, and other Claude Code behaviors.

Environment Variables

| Variable | Required | Purpose |
|----------|----------|---------|
| OPENROUTER_API_KEY | Yes | LLM operations via OpenRouter |
| PROJECT_ROOT | For tests and CLI | Database path resolution |

Database

Task data is stored in .claude/formaltask.db (SQLite).
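A minimal sketch of how PROJECT_ROOT might feed into locating that file. Falling back to the current working directory when the variable is unset is an assumption, and database_path is a hypothetical helper, not a FormalTask function.

```python
import os
from collections.abc import Mapping
from pathlib import Path


def database_path(environ: Mapping[str, str] = os.environ) -> Path:
    """Locate the task database relative to PROJECT_ROOT (assumed fallback: cwd)."""
    root = Path(environ.get("PROJECT_ROOT") or Path.cwd())
    return root / ".claude" / "formaltask.db"


print(database_path({"PROJECT_ROOT": "/tmp/demo"}))
```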

Usage

Command Line

ft --help                      # Show available commands
ft work spawn <id>             # Spawn worker for a task
ft work list                   # List spawnable tasks
ft work watch                  # Monitor workers
ft work watch --spawn          # Monitor + auto-spawn ready tasks
ft work dashboard              # TUI dashboard
ft work inbox                  # Show blocked workers awaiting input
ft task list <epic>            # List tasks in an epic
ft task show <id>              # Show task details
ft task complete <id>          # Mark task as complete
ft task cancel <id>            # Cancel a task
ft epic list                   # List all epics
ft epic health <epic>          # Check epic health
ft setup                       # Run setup wizard
ft doctor                      # Verify configuration

Or run as a Python module:

python3 -m formaltask.cli --help

Project Structure

formaltask/
├── cli/                # CLI commands (ft <noun> <verb>)
├── core/               # Completion checking, config
├── data/               # Static data files
├── db/                 # Database connection, migrations
├── epics/              # Epic CRUD, YAML parsing
├── git/                # Worktree management, PR queries
├── hooks/              # Hook utilities (shared with top-level hooks/)
├── llm/                # LLM integration (OpenRouter)
├── review/             # Review context, prompt building
├── skills/             # Skill metadata, span tracking
├── state/              # Findings, session tracking
├── tasks/              # Task lifecycle, dependencies, guards
├── validators/         # PreToolUse validators (TDD, doc-guard)
├── vault/              # Knowledge storage
├── workers/            # Worker spawning, monitoring
├── apps/               # TUI applications (dashboard)
└── utils/              # Shared utilities
agents/                 # Subagent definitions
hooks/                  # Hook entry points for Claude Code events
tests/                  # Test suite
.githooks/              # Tracked git hooks
.claude/
└── formaltask.db       # Task database (auto-created by ft setup)

See the CLI Reference for full command documentation, Planning Workflow for the plan→critique→revise→decompose lifecycle, and Architecture Overview for how the pieces fit together.

Dashboard

The interactive TUI dashboard (ft work dashboard) provides real-time monitoring and control of parallel workers.

Layout: Status bar (top) showing task counts and auto-spawn state, task list (middle) with color-coded health indicators, terminal pane (bottom) showing the selected worker's output.

Worker states: Each task shows a health indicator — LIVE (running), EXIT (process ended), HELP (needs human input), FIX (has review findings), or queued (ready to spawn).

Keybindings:

| Key | Action |
|-----|--------|
| j / k | Navigate task list |
| Enter | Attach to selected worker (F12 to detach back) |
| S | Spawn next queued task |
| A | Toggle auto-spawn (automatically fills worker slots) |
| + / - | Adjust max worker limit (1-10) |
| X | Kill selected worker (double-tap to confirm) |
| R | Restart selected worker (double-tap to confirm) |
| i | Open inbox (blocked workers awaiting input) |
| q | Quit |

Auto-spawn fills available worker slots from the task queue. The status bar shows the current limit (e.g. auto (5)). Adjust with +/- to scale up or down without leaving the dashboard. This is the interactive equivalent of ft work watch --spawn.
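The slot-filling behaviour can be sketched in a few lines. This is an illustrative model only: tasks_to_spawn is a hypothetical name, and it assumes tasks are taken from the front of the queue in order.

```python
def tasks_to_spawn(queued: list[int], live_workers: int, max_workers: int) -> list[int]:
    """Pick queued task ids to fill the free worker slots."""
    limit = min(max(max_workers, 1), 10)  # the dashboard limit ranges from 1 to 10
    free_slots = max(limit - live_workers, 0)
    return queued[:free_slots]


print(tasks_to_spawn([7, 8, 9], live_workers=3, max_workers=5))  # [7, 8]
print(tasks_to_spawn([7, 8, 9], live_workers=5, max_workers=5))  # []
```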

Development

Running Tests

pytest tests/ --cov=formaltask

Linting

ruff check formaltask/ --fix
ruff format formaltask/

Type Checking

basedpyright formaltask/

License

MIT License. See LICENSE for details.
