FormalTask
Structured task management for AI-assisted development workflows. Integrates with Claude Code to provide epic-based planning, parallel task execution, and automated review workflows.
How It Works
Plan → Critique → Specs → Tasks → Workers → Complete → Merge
You describe what you want. `/plan` explores the codebase and writes a plan. `/critique` pokes holes. `/decompose` splits it into YAML specs. `ft epic-decompose` commits tasks to SQLite with dependency tracking. `ft work spawn` launches parallel Claude workers in isolated git worktrees.
The database is the coordination backbone — not just storage.
Plans Carry Their Revision History
Critiques don't live in separate files; they're embedded in the plan:

```yaml
goals:
  - id: "g-1"
    current: "Users can log in with email/password"
    history:
      - version: "r1"
        text: "Users can log in"
        critique:
          verdict: "FIX_AND_SHIP"
          findings:
            - priority: "P1"
              finding: "Missing rate limiting"
              action: "Add rate limiter"
              resolution: "fixed"  # Set by /revise
```

Each `/critique` round appends to `history`. When `/revise` addresses findings, it sets `resolution: fixed|rejected|deferred`. The plan carries its full revision history.
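As a sketch of these mechanics (the function names here are illustrative, not FormalTask's actual API), appending a critique round and resolving its findings amounts to:

```python
# Illustrative sketch of the plan-revision mechanics described above.
# append_critique and resolve_round are hypothetical names, not FormalTask's API.

def append_critique(goal: dict, version: str, verdict: str, findings: list[dict]) -> None:
    """Each /critique round appends an entry to the goal's history."""
    goal.setdefault("history", []).append({
        "version": version,
        "text": goal["current"],  # snapshot the text this critique applied to
        "critique": {"verdict": verdict, "findings": findings},
    })

def resolve_round(goal: dict, version: str, resolution: str) -> None:
    """/revise marks each finding in a round as fixed, rejected, or deferred."""
    for entry in goal.get("history", []):
        if entry["version"] == version:
            for finding in entry["critique"]["findings"]:
                finding["resolution"] = resolution
```

Because the snapshot is taken before the goal text is revised, the history preserves what each critique actually reviewed.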
Specs Are Contracts, Guards Enforce Them
Specs declare what the completion system will check:

```yaml
title: "Task 2: Implement API client"
depends_on: [1]
required_reviews: ["code-quality", "security"]
inputs:
  schema: "$task[1].outputs.schema"    # Auto-wired from Task 1
outputs:
  client: ".artifacts/api_client.py"   # Task 3 can reference this
acceptance_criteria:
  - id: "c-1"
    current: "GET /users returns parsed User objects"
    command: "pytest tests/test_api.py"  # Runnable verification
```

When `ft task complete` runs, the completion check evaluates: did `required_reviews` all pass? Are acceptance criteria with `command:` fields passing? The spec is the contract. Guards enforce it.
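The check can be pictured as a pure gate over the spec. A minimal sketch, assuming nothing about FormalTask's internals (`check_completion` and its result shape are made up for illustration):

```python
import subprocess

def check_completion(spec: dict, review_results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Evaluate the spec's contract: every required review passed and every
    runnable acceptance criterion exits 0. Returns (done, failures)."""
    failures: list[str] = []
    for review in spec.get("required_reviews", []):
        if not review_results.get(review, False):
            failures.append(f"review not passed: {review}")
    for criterion in spec.get("acceptance_criteria", []):
        cmd = criterion.get("command")
        if cmd is None:
            continue  # criteria without a command are not machine-checkable
        if subprocess.run(cmd, shell=True).returncode != 0:
            failures.append(f"criterion failed: {criterion['id']}")
    return (not failures, failures)
```

The point is that completion is a deterministic function of the spec plus observed results, not a judgment call made by the worker.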
The Rules Kernel
Completion checks, orchestration, and prompt generation use the same type:

```python
@dataclass
class Rule:
    when: str      # condition DSL
    then: str      # output (phase name or Jinja2 template)
    target: str    # what it applies to ("task.phase", "notify", "tool.block")
    priority: int  # 0 = informational, 1 = blocks, 999 = catchall
    name: str      # reason (literal or state key for dynamic lookup)
```
The kernel is ~60 LOC: `evaluate(condition, context) -> bool`, `render(template, context) -> str`, and `apply_rules(rules, context)`, where the first matching rule wins. The same evaluator answers: "Is this task done?" "Should we spawn a CI fixer?" "What prompt should this worker get?"
The condition DSL supports `AND`, `OR`, `NOT`, comparisons (`==`, `!=`, `>`, `<`), dotted path resolution (`task.metadata.retries`), and bare truthy checks. No parentheses; flatten complex conditions into multiple rules.
22 builtin rules handle the standard completion lifecycle. Three rule sets ship by default:

| Rule set | Purpose |
|---|---|
| `BUILTIN_RULES` | 22 completion rules (review gates, PR checks, docs, acceptance criteria) |
| `ORCHESTRATION_RULES` | Watch daemon triggers (e.g., alert after 1 hour) |
| `TOOL_REDIRECT_RULES` | Block/redirect tool usage (e.g., WebSearch → exa) |
Custom Rules Per Task
Tasks can define their own rules in `metadata.completion_rules`. These are prepended before `BUILTIN_RULES`, so they get first-match-wins priority:

```json
{
  "completion_rules": [
    {
      "when": "blocking_findings AND review_rounds.self-critique >= 2",
      "then": "needs_escalation",
      "target": "task.phase",
      "priority": 1,
      "name": "Round cap hit. Escalate to human."
    }
  ]
}
```
This lets individual tasks define their own completion policies without modifying global rules.
User Templates
Worker prompt templates use the same kernel. Drop a Jinja2 file in `~/.claude/templates/` to override any bundled template; user templates take priority, with automatic fallback to the bundled version on parse errors.
Workers Create Their Own Tasks
A worker that finds a problem during review can create a new task on the spot:
```shell
ft task create-from-finding src/auth.py 42 --title "Fix session expiry edge case"
```
This creates a critique-gated task, one with self-critique baked in:

- The task starts in a critique phase (`c1`). The worker must self-review before moving to execution.
- A custom completion rule caps critique rounds: if P0/P1 findings persist after 2 self-critique rounds, the task escalates to a human via `ft work blocked`.
- Only after receiving a `verdict_go` does the task transition to the exec phase, where normal completion rules apply.

The task inherits its epic from the spawning worker, carries provenance (`source_task_id`, `finding_ref`), and can be auto-spawned by the watch daemon.
What Falls Out
Because everything routes through rules and the database:
- Auto-spawn fixer tasks when CI fails
- Nudge stuck workers after 30 minutes
- Inject thorough-approach prompts for complex tasks
- Wire outputs to inputs across task dependencies
- Block completion until required reviews pass
- Workers spawn new tasks mid-flight, with their own completion policies
Quick Start
```shell
pip install formaltask
```

After installation:

1. Set the required environment variable:

   ```shell
   export OPENROUTER_API_KEY="<your-key-here>"
   ```

2. Run the setup wizard:

   ```shell
   ft setup        # Interactive mode
   ft setup --yes  # Non-interactive (CI/scripts)
   ```
The setup wizard initializes the database, registers Claude Code hooks, and verifies your configuration.
Prerequisites
- Python 3.11+ (required)
- Git (for hooks and version control)
- tmux 3.2+ (optional, enables parallel worker features)
Optional Feature Groups
Install additional features using pip extras:

| Extra | Purpose |
|---|---|
| `llm` | LLM client libraries (openai, instructor) |
| `tui` | Terminal user interface dashboard |
| `test` | Testing dependencies (pytest, hypothesis) |
| `dev` | Development tools (ruff, basedpyright) |
| `agents` | Agent-related utilities |
| `dayflow` | HTTP client utilities |
| `mcp` | MCP server integration |
| `all` | All optional dependencies |
Alternative Installation (Development)
For development or contributing to FormalTask:
```shell
git clone https://github.com/davidabeyer/formaltask.git
cd formaltask
python3 -m venv venv && source venv/bin/activate
./install.sh
```
Manual pip Installation
Install in development mode:

```shell
pip install -e .
```

With optional dependencies:

```shell
pip install -e ".[all]"
```

Or install specific extras:

```shell
pip install -e ".[tui,test]"
```
Git Hooks
The `./install.sh` script automatically configures git to use the project's tracked hooks. This enables:
- Pre-commit validation (linting, TDD guard)
- Pre-push task status enforcement
- Pre-merge-commit task validation

For manual installations, run:

```shell
git config core.hooksPath .githooks
```
Configuration
Settings File
Claude Code settings are stored in `~/.claude/settings.json`. This file configures hooks, permissions, and other Claude Code behaviors.
Environment Variables
| Variable | Required | Purpose |
|---|---|---|
| `OPENROUTER_API_KEY` | Yes | LLM operations via OpenRouter |
| `PROJECT_ROOT` | For tests and CLI | Database path resolution |
Database
Task data is stored in `.claude/formaltask.db` (SQLite).
Usage
Command Line
```shell
ft --help              # Show available commands

ft work spawn <id>     # Spawn worker for a task
ft work list           # List spawnable tasks
ft work watch          # Monitor workers
ft work watch --spawn  # Monitor + auto-spawn ready tasks
ft work dashboard      # TUI dashboard
ft work inbox          # Show blocked workers awaiting input

ft task list <epic>    # List tasks in an epic
ft task show <id>      # Show task details
ft task complete <id>  # Mark task as complete
ft task cancel <id>    # Cancel a task

ft epic list           # List all epics
ft epic health <epic>  # Check epic health

ft setup               # Run setup wizard
ft doctor              # Verify configuration
```

Or run as a Python module:

```shell
python3 -m formaltask.cli --help
```
Project Structure
```
formaltask/
├── cli/           # CLI commands (ft <noun> <verb>)
├── core/          # Completion checking, config
├── data/          # Static data files
├── db/            # Database connection, migrations
├── epics/         # Epic CRUD, YAML parsing
├── git/           # Worktree management, PR queries
├── hooks/         # Hook utilities (shared with hooks/)
├── llm/           # LLM integration (OpenRouter)
├── review/        # Review context, prompt building
├── skills/        # Skill metadata, span tracking
├── state/         # Findings, session tracking
├── tasks/         # Task lifecycle, dependencies, guards
├── validators/    # PreToolUse validators (TDD, doc-guard)
├── vault/         # Knowledge storage
├── workers/       # Worker spawning, monitoring
├── apps/          # TUI applications (dashboard)
└── utils/         # Shared utilities
agents/            # Subagent definitions
hooks/             # Hook entry points for Claude Code events
tests/             # Test suite
.githooks/         # Tracked git hooks
.claude/
└── formaltask.db  # Task database (auto-created by ft setup)
```
See the CLI Reference for full command documentation, Planning Workflow for the plan→critique→revise→decompose lifecycle, and Architecture Overview for how the pieces fit together.
Dashboard
The interactive TUI dashboard (ft work dashboard) provides real-time monitoring and control of parallel workers.
Layout: Status bar (top) showing task counts and auto-spawn state, task list (middle) with color-coded health indicators, terminal pane (bottom) showing the selected worker's output.
Worker states: Each task shows a health indicator — LIVE (running), EXIT (process ended), HELP (needs human input), FIX (has review findings), or queued (ready to spawn).
Keybindings:
| Key | Action |
|---|---|
| `j` / `k` | Navigate task list |
| `Enter` | Attach to selected worker (`F12` to detach back) |
| `S` | Spawn next queued task |
| `A` | Toggle auto-spawn (automatically fills worker slots) |
| `+` / `-` | Adjust max worker limit (1-10) |
| `X` | Kill selected worker (double-tap to confirm) |
| `R` | Restart selected worker (double-tap to confirm) |
| `i` | Open inbox (blocked workers awaiting input) |
| `q` | Quit |
Auto-spawn fills available worker slots from the task queue. The status bar shows the current limit (e.g. auto (5)). Adjust with +/- to scale up or down without leaving the dashboard. This is the interactive equivalent of ft work watch --spawn.
Development
Running Tests
```shell
pytest tests/ --cov=formaltask
```

Linting

```shell
ruff check formaltask/ --fix
ruff format formaltask/
```

Type Checking

```shell
basedpyright formaltask/
```
License
MIT License. See LICENSE for details.