
Whilly Wiggum loop — Ralph's smarter brother technique for continuous AI agent loops: TRIZ analyzer, Decision Gate, PRD wizard, Rich TUI dashboard, tmux/git-worktree parallelism, self-healing system


Whilly Orchestrator

Python 3.10+ · License: MIT · Workshop kit

Python implementation of the Whilly Wiggum loop — Ralph Wiggum's smarter brother. Same family, same "I'm helping!" spirit, but with TRIZ contradiction analysis, a Decision Gate that refuses nonsense upfront, a PRD wizard, and a Rich TUI dashboard on top of the classic continuous agent loop.

🇷🇺 Short description in Russian · 🎓 Workshop kit (HackSprint1)

"I'm helping — and I've read TRIZ." — Whilly Wiggum

What it does

Whilly runs a loop: pick a pending task → hand it to an LLM agent → verify result → commit → next. It keeps running until the task board is empty, a budget is exhausted, or you stop it. Parallel mode dispatches multiple agents in tmux panes or git worktrees.
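The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Whilly's actual orchestrator: the `Task` and `Result` classes, the `run_loop` function, and the `run_agent` / `verify` / `commit` callables are all hypothetical names introduced here, and retries and parallel dispatch are left out.

```python
from dataclasses import dataclass


@dataclass
class Task:
    id: str
    status: str = "pending"


@dataclass
class Result:
    output: str
    cost_usd: float = 0.0


def run_loop(tasks, run_agent, verify, commit, budget_usd=0.0):
    """Pick a pending task, run the agent, verify, commit, repeat.

    Stops when no pending task is left or the budget is spent.
    A budget of 0 is treated as unlimited (assumption for this sketch).
    """
    spent = 0.0
    while True:
        if budget_usd > 0 and spent >= budget_usd:
            break  # budget exhausted
        task = next((t for t in tasks if t.status == "pending"), None)
        if task is None:
            break  # task board is empty
        task.status = "in_progress"
        result = run_agent(task)
        spent += result.cost_usd
        if verify(task, result):
            commit(task)
            task.status = "done"
        else:
            task.status = "failed"
    return spent
```

Passing the agent, verifier, and commit step in as callables keeps the loop itself independent of any particular CLI or VCS, which also makes it easy to exercise with stubs.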

The base technique was first described in Ghuntley's post on the Ralph Wiggum loop and widely adopted across the Claude Code community. Whilly is the brainier sibling: all of Ralph's "pick task → try → repeat" stamina, plus a TRIZ analyzer for surfacing contradictions, a Decision Gate for refusing garbage tasks, and a PRD wizard for understanding the problem before swinging at it.

Features

  • Continuous agent loop — pull tasks from a JSON plan, run Claude CLI on each, retry on transient errors
  • Rich TUI dashboard — live progress, token usage, cost totals, per-task status; hotkeys for pause/reset/skip
  • Parallel execution — tmux panes or git worktrees, up to N concurrent agents with budget/deadlock guards
  • Self-healing system 🛡️ — auto-detect crashes, fix common code errors (NameError, ImportError), restart pipeline
  • Task decomposer — LLM-based breakdown of oversized tasks into subtasks
  • PRD wizard — interactive Product Requirements Document generation, then auto-derive tasks from the PRD
  • TRIZ analyzer — surface contradictions and inventive principles for ambiguous tasks
  • State store — persistent task state across restarts, per-task per-iteration logs
  • Notifications — budget warnings, deadlock detection, auth/API error alerts

Install

pip install whilly-orchestrator

Or from source:

git clone https://github.com/mshegolev/whilly-orchestrator
cd whilly-orchestrator
pip install -e .

Requires Claude CLI on PATH (or set CLAUDE_BIN).

Quick start

  1. Create tasks.json describing the work:

    {
      "project": "health-endpoint",
      "tasks": [
        {
          "id": "TASK-001",
          "phase": "Phase 1",
          "category": "functional",
          "priority": "high",
          "description": "Add a /health endpoint returning {\"status\":\"ok\"}",
          "status": "pending",
          "dependencies": [],
          "key_files": ["app/server.py"],
          "acceptance_criteria": ["GET /health returns 200 with {\"status\":\"ok\"}"],
          "test_steps": ["curl -s localhost:8000/health"]
        },
        {
          "id": "TASK-002",
          "phase": "Phase 1",
          "category": "test",
          "priority": "high",
          "description": "Write a pytest covering the new endpoint",
          "status": "pending",
          "dependencies": ["TASK-001"],
          "key_files": ["tests/test_health.py"],
          "acceptance_criteria": ["pytest tests/test_health.py passes"],
          "test_steps": ["pytest -q tests/test_health.py"]
        }
      ]
    }
    
  2. Run Whilly (2 concurrent agents, $5 budget cap):

    WHILLY_MAX_PARALLEL=2 WHILLY_BUDGET_USD=5 whilly tasks.json
    # straight from a checkout, no install:
    ./whilly.py tasks.json
    # or as a module:
    python -m whilly tasks.json
    # or just `whilly` with no args for the interactive plan-picker
    
  3. Watch the dashboard. Press q to quit, d for task detail, l for the live log of a running agent, t for the task overview.
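The `dependencies` field in the plan from step 1 implies an execution order. As a rough sketch of how such a plan could be grouped into batches that are safe to run in parallel, the standard-library `graphlib` module is enough. The `plan_batches` function is a hypothetical helper, not part of Whilly, and Whilly's own batch planning may work differently.

```python
import json
from graphlib import TopologicalSorter


def plan_batches(plan: dict) -> list[list[str]]:
    """Group task ids into batches; every task in a batch has all
    of its dependencies satisfied by earlier batches."""
    tasks = {t["id"]: t for t in plan["tasks"]}
    graph = {}
    for task_id, task in tasks.items():
        deps = task.get("dependencies", [])
        unknown = [d for d in deps if d not in tasks]
        if unknown:
            raise ValueError(f"{task_id} depends on unknown task(s): {unknown}")
        graph[task_id] = set(deps)

    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Trimmed-down version of the tasks.json shown in step 1.
PLAN = json.loads("""
{
  "project": "health-endpoint",
  "tasks": [
    {"id": "TASK-001", "status": "pending", "dependencies": []},
    {"id": "TASK-002", "status": "pending", "dependencies": ["TASK-001"]}
  ]
}
""")

print(plan_batches(PLAN))  # [['TASK-001'], ['TASK-002']]
```

With this plan the second agent slot stays idle until TASK-001 finishes, because TASK-002 depends on it.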

🛡️ Self-Healing System

Whilly includes a built-in self-healing system that automatically detects, analyzes, and fixes code errors to ensure pipeline resilience:

# Standard whilly (no crash protection)
whilly tasks.json

# Self-healing whilly (auto-fix + restart on crashes)
python scripts/whilly_with_healing.py tasks.json

Supported error types:

  • NameError — missing variables/parameters (auto-fix)
  • ImportError — missing modules (auto pip install)
  • TypeError — function parameter mismatches (diagnosis)
  • ⚠️ AttributeError — missing object attributes (suggestions)
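To make the list above concrete, here is a minimal sketch of classifying a crash by the last line of a Python traceback. The `ACTIONS` table and the `classify` function are hypothetical and only mirror the bullet list; the real logic lives in self_healing.py and is not reproduced here. `ModuleNotFoundError` is included because it is a built-in subclass of `ImportError`.

```python
import re

# Hypothetical mapping from exception type to the handling listed above.
ACTIONS = {
    "NameError": "auto-fix",
    "ImportError": "auto pip install",
    "ModuleNotFoundError": "auto pip install",  # subclass of ImportError
    "TypeError": "diagnosis",
    "AttributeError": "suggestions",
}

# The final traceback line looks like "NameError: name 'x' is not defined".
_LAST_LINE = re.compile(r"^(?P<type>[A-Za-z_][\w.]*)(?::\s*(?P<msg>.*))?$")


def classify(traceback_text: str):
    """Return (exception_type, action, message), or None if unparseable."""
    lines = [ln for ln in traceback_text.strip().splitlines() if ln.strip()]
    if not lines:
        return None
    match = _LAST_LINE.match(lines[-1].strip())
    if not match:
        return None
    # Drop any module prefix, e.g. "json.decoder.JSONDecodeError".
    exc_type = match.group("type").rsplit(".", 1)[-1]
    return exc_type, ACTIONS.get(exc_type, "unhandled"), match.group("msg") or ""


SAMPLE = """Traceback (most recent call last):
  File "app.py", line 3, in <module>
    print(total)
NameError: name 'total' is not defined
"""

print(classify(SAMPLE))
```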

Features:

  • 🔍 Smart error detection via traceback pattern analysis
  • 🔧 Automated fixes for common coding errors
  • 🔄 Auto-restart with exponential backoff (max 3 retries)
  • 📊 Learning from patterns in historical error logs
  • 💡 Recovery suggestions for complex issues
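The auto-restart behaviour can be pictured as a small wrapper around the child process. The sketch below assumes a doubling delay starting at one second and treats any non-zero exit code as a crash; `run_with_restarts` and its parameters are hypothetical names, and the actual script may choose its delays and crash detection differently.

```python
import subprocess
import time


def run_with_restarts(cmd, max_retries=3, base_delay=1.0,
                      sleep=time.sleep, run=subprocess.run):
    """Run cmd; on a non-zero exit, wait and restart, up to max_retries times.

    Delays grow exponentially: base_delay, 2x, 4x, ...
    Returns (final_exit_code, list_of_delays_used).
    """
    delays = []
    attempt = 0
    while True:
        proc = run(cmd)
        if proc.returncode == 0:
            return 0, delays
        if attempt >= max_retries:
            return proc.returncode, delays  # give up after the last retry
        delay = base_delay * (2 ** attempt)
        delays.append(delay)
        sleep(delay)
        attempt += 1


# Example call (not executed here):
#   import sys
#   run_with_restarts([sys.executable, "-m", "whilly", "tasks.json"])
```

The `sleep` and `run` parameters exist only so the function can be tested without real processes or real waiting.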

See Self-Healing Guide for complete documentation.

Modules

| Module | Purpose |
| --- | --- |
| orchestrator.py | Main loop, batch planning, interface agreement between agents |
| agent_runner.py | Claude CLI wrapper, JSON output parsing, usage accounting |
| tmux_runner.py | Parallel agents in tmux panes |
| worktree_runner.py | Parallel agents in isolated git worktrees |
| dashboard.py | Rich TUI dashboard with hotkeys |
| task_manager.py | Task lifecycle (pending → in_progress → done/failed) |
| state_store.py | Persistent state across restarts |
| decomposer.py | LLM-based task breakdown |
| prd_generator.py, prd_wizard.py, prd_launcher.py | PRD generation and task derivation |
| triz_analyzer.py | TRIZ contradiction analysis |
| self_healing.py | 🛡️ Error detection, analysis, and automated fixing |
| recovery.py | Task status synchronization and consistency validation |
| reporter.py | Per-iteration reports, cost totals, summary markdown |
| verifier.py, notifications.py, history.py, config.py | Infrastructure |

Configuration

Configuration is done via environment variables (prefix WHILLY_). A few CLI flags exist for one-shot overrides — see whilly --help.

| Variable | Default | Purpose |
| --- | --- | --- |
| WHILLY_MODEL | claude-opus-4-6[1m] | Claude model id |
| WHILLY_MAX_PARALLEL | 3 | Concurrent agents (1 = sequential) |
| WHILLY_MAX_ITERATIONS | 0 | Max work cycles per plan (0 = unlimited) |
| WHILLY_BUDGET_USD | 0 | Hard cost cap; 80% triggers warning, 100% stops the run |
| WHILLY_TIMEOUT | 0 | Wall-clock cap in seconds (0 = unlimited) |
| WHILLY_USE_TMUX | 1 | Use tmux panes for parallel agents |
| WHILLY_WORKTREE | 0 | Per-task git worktree isolation (needs WHILLY_MAX_PARALLEL > 1) |
| WHILLY_LOG_DIR | whilly_logs | Per-task log directory |
| WHILLY_STATE_FILE | .whilly_state.json | Crash-recovery state file (--resume reads it) |
| WHILLY_HEADLESS | auto | CI mode: JSON on stdout, exit codes |
| CLAUDE_BIN | claude | Path to Claude CLI binary |
| WHILLY_AGENT_BACKEND | claude | Active agent backend (claude or opencode) |
| WHILLY_OPENCODE_BIN | opencode | Path to the OpenCode CLI binary |
| WHILLY_OPENCODE_SAFE | 0 | 1 → drop --dangerously-skip-permissions for OpenCode |
| WHILLY_OPENCODE_SERVER_URL | (unset) | Optional remote OpenCode server URL |
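As an illustration of how these variables and the budget thresholds fit together, the sketch below reads a subset of them into a dataclass and classifies spend against the cap. `Settings`, `load_settings`, and `budget_status` are hypothetical names, not Whilly's config.py API; the parsing rules (for example, treating only "1" as true) are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    model: str
    max_parallel: int
    max_iterations: int
    budget_usd: float
    timeout: int
    use_tmux: bool
    worktree: bool


def load_settings(env=None) -> Settings:
    """Read WHILLY_* variables, falling back to the defaults in the table."""
    env = os.environ if env is None else env
    return Settings(
        model=env.get("WHILLY_MODEL", "claude-opus-4-6[1m]"),
        max_parallel=int(env.get("WHILLY_MAX_PARALLEL", "3")),
        max_iterations=int(env.get("WHILLY_MAX_ITERATIONS", "0")),
        budget_usd=float(env.get("WHILLY_BUDGET_USD", "0")),
        timeout=int(env.get("WHILLY_TIMEOUT", "0")),
        use_tmux=env.get("WHILLY_USE_TMUX", "1") == "1",  # assumption: "1" means on
        worktree=env.get("WHILLY_WORKTREE", "0") == "1",
    )


def budget_status(spent_usd: float, budget_usd: float) -> str:
    """Return "ok", "warn" (at 80% of the cap) or "stop" (at 100%)."""
    if budget_usd <= 0:
        return "ok"  # 0 means unlimited
    if spent_usd >= budget_usd:
        return "stop"
    if spent_usd >= 0.8 * budget_usd:
        return "warn"
    return "ok"
```

For example, `load_settings({"WHILLY_MAX_PARALLEL": "2", "WHILLY_BUDGET_USD": "5"})` corresponds to the quick-start invocation, and `budget_status(4.5, 5.0)` would report a warning.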

Key CLI flags: --all, --headless, --timeout N, --resume, --reset PLAN.json, --init "desc" [--plan] [--go], --plan PRD.md, --prd-wizard, --no-worktree, --agent {claude,opencode}.

Exit codes in headless mode: 0 success, 1 some tasks failed, 2 budget exceeded, 3 timeout.
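In a CI script, the headless exit codes above can be given names instead of being compared as bare integers. The `ExitCode` enum and `describe` helper are hypothetical conveniences for the calling side, not something Whilly exports.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Headless-mode exit codes as listed above."""
    SUCCESS = 0
    TASKS_FAILED = 1
    BUDGET_EXCEEDED = 2
    TIMEOUT = 3


def describe(returncode: int) -> str:
    """Map a process exit code to a readable label."""
    try:
        return ExitCode(returncode).name.lower()
    except ValueError:
        return f"unknown({returncode})"


# Example call (not executed here):
#   import subprocess
#   proc = subprocess.run(["whilly", "--headless", "tasks.json"])
#   print(describe(proc.returncode))
```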

See docs/Whilly-Usage.md for the full CLI reference.

Documentation

Workshop kit

Whilly ships with a HackSprint1 workshop kit — a 90-minute hands-on tutorial that takes you from pip install to a running self-hosting bootstrap demo. Two tracks:

  • Track A (tasks.json) — works without GitHub auth, 30 min.
  • Track B (GitHub Issues) — full e2e with PR creation, 60 min.

Includes BRD, PRD, 12 ADRs, sample plans, and a roadmap. See docs/workshop/INDEX.md for the full guide. RU/EN bilingual.

Backends

Whilly ships with pluggable agent backends behind a single AgentBackend Protocol (see whilly/agents/).

| Backend | Select | CLI wrapped | Notes |
| --- | --- | --- | --- |
| Claude (default) | --agent claude / WHILLY_AGENT_BACKEND=claude | claude --output-format json -p "…" | Requires Claude CLI. Set CLAUDE_BIN to override path. |
| OpenCode | --agent opencode / WHILLY_AGENT_BACKEND=opencode | opencode run --format json --model <provider/id> "…" | Requires sst/opencode on PATH (or WHILLY_OPENCODE_BIN). Set WHILLY_OPENCODE_SAFE=1 to respect its per-tool permission policy. |

Model ids pass through normalization per backend — e.g. claude-opus-4-6 automatically becomes anthropic/claude-opus-4-6 for OpenCode. Completion is signalled identically (<promise>COMPLETE</promise>) so the main loop is backend-agnostic. Decision Gate, tmux runner, and the subprocess fallback all route through the active backend.
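A rough picture of what a backend-agnostic contract like this might look like is given below. Only the class name `AgentBackend`, the two wrapped command lines, the `anthropic/` prefix example, and the completion marker come from this README; the method names (`normalize_model`, `build_command`), the concrete classes, and the exact normalization rule are assumptions made for illustration.

```python
from typing import Protocol

COMPLETE_MARKER = "<promise>COMPLETE</promise>"


class AgentBackend(Protocol):
    """Hypothetical shape of the protocol; the real one lives in whilly/agents/."""
    name: str

    def normalize_model(self, model: str) -> str: ...
    def build_command(self, prompt: str, model: str) -> list[str]: ...


class ClaudeBackend:
    name = "claude"

    def normalize_model(self, model: str) -> str:
        return model  # Claude CLI takes the id as-is

    def build_command(self, prompt: str, model: str) -> list[str]:
        return ["claude", "--output-format", "json", "-p", prompt]


class OpenCodeBackend:
    name = "opencode"

    def normalize_model(self, model: str) -> str:
        # Assumed rule: bare Claude ids get the provider prefix.
        if "/" not in model and model.startswith("claude"):
            return f"anthropic/{model}"
        return model

    def build_command(self, prompt: str, model: str) -> list[str]:
        return ["opencode", "run", "--format", "json",
                "--model", self.normalize_model(model), prompt]


def is_complete(output: str) -> bool:
    """Both backends signal completion with the same marker string."""
    return COMPLETE_MARKER in output


print(OpenCodeBackend().normalize_model("claude-opus-4-6"))
# anthropic/claude-opus-4-6
```

Because the main loop only needs `build_command` and `is_complete` in this sketch, swapping backends does not change the loop itself.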

Workflow boards

Whilly can sync a GitHub Projects v2 board as issues move through the pipeline (ready → picked_up → in_review → done / refused / failed). Board integration is Protocol-driven (whilly/workflow/BoardSink) — today one adapter ships (GitHubProjectBoard via gh api graphql); Jira/Linear/GitLab drop in as sibling implementations.

Before first use, run the analyzer to map whilly's six lifecycle events to your board columns:

whilly --workflow-analyze https://github.com/users/<you>/projects/<N>

The analyzer prints matched / missing / ambiguous columns and walks you through [A]dd / [M]ap / [S]kip decisions. Output goes to .whilly/workflow.json — a committable artefact so teams share one contract. Extra flags: --apply (auto-add all missing columns, CI-friendly) and --report (dry-run, no writes).
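The matched / missing / ambiguous classification can be illustrated with a small sketch. The six lifecycle names come from the list above; the `analyze` function and its substring-matching rule are hypothetical, and neither the real analyzer's heuristics nor the layout of .whilly/workflow.json is shown here.

```python
LIFECYCLE = ["ready", "picked_up", "in_review", "done", "refused", "failed"]


def _norm(text: str) -> str:
    """Lower-case and keep only letters and digits: "Picked up" -> "pickedup"."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def analyze(columns: list[str]) -> dict:
    """Classify each lifecycle status against the board's column names."""
    report = {"matched": {}, "missing": [], "ambiguous": {}}
    for status in LIFECYCLE:
        key = _norm(status)
        hits = [col for col in columns if key in _norm(col)]
        if len(hits) == 1:
            report["matched"][status] = hits[0]
        elif not hits:
            report["missing"].append(status)
        else:
            report["ambiguous"][status] = hits
    return report


board = ["Ready", "Picked up", "In review", "Done", "Done (archived)"]
report = analyze(board)
print(report["missing"])    # ['refused', 'failed']
print(report["ambiguous"])  # {'done': ['Done', 'Done (archived)']}
```

In this example the missing columns are the ones the [A]dd prompt would offer to create, and the ambiguous entry is the one that needs an explicit [M]ap decision.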

See ADR-014 for the design rationale and extension guide.

Self-hosting pipelines

Two e2e scripts ship for "whilly processes its own GitHub issues end-to-end", differing in how much thinking they do before coding. Pick by issue complexity:

| Script | Stages | Use when |
| --- | --- | --- |
| scripts/whilly_e2e_demo.py | fetch → Decision Gate → execute → PR → review-fix loop | Issue is crisp, single-file, "just do it" scoped. Ralph-loop reference. |
| scripts/whilly_e2e_triz_prd.py | fetch → Gate → TRIZ challenge → PRD → tasks decomp → execute → quality gate → PR | Issue deserves decomposition. "Whilly Wiggum" smarter-brother variant. |

Both honour the workflow board integration (WHILLY_PROJECT_URL=... → cards move at every stage) and share the hard WHILLY_BUDGET_USD cap.

Typical invocation for the TRIZ+PRD pipeline:

unset GITHUB_TOKEN
WHILLY_REPO=mshegolev/whilly-orchestrator \
WHILLY_LABEL=whilly:ready \
WHILLY_BUDGET_USD=30 \
WHILLY_PROJECT_URL=https://github.com/users/mshegolev/projects/4 \
python scripts/whilly_e2e_triz_prd.py --limit 1

--limit N caps issues per run; --dry-run skips all LLM / PR / merge work for plan-only inspection; --allow-auto-merge is OFF by default — a pipeline that modifies whilly's own code always leaves PRs for human review. Details + design rationale in ADR-015.

Troubleshooting / FAQ

| Issue | Fix |
| --- | --- |
| gh auth status returns 401 ("token invalid") | unset GITHUB_TOKEN (env-based token overrides keyring auth), then gh auth login if needed. |
| claude: command not found | Install Claude CLI from docs.claude.com or set CLAUDE_BIN to its path. |
| Dashboard rendering broken on narrow terminal (<100 cols) | Run WHILLY_HEADLESS=1 whilly tasks.json, which disables the TUI and streams JSON events on stdout. |
| Budget hits 0 unexpectedly | Set or raise WHILLY_BUDGET_USD (the default, 0, means unlimited). |
| tmux ls shows no sessions after dispatch | Either tmux isn't installed, or WHILLY_USE_TMUX=0; in both cases whilly silently falls back to subprocess mode. |
| Agent loops forever without marking done | Ensure the prompt ends with the <promise>COMPLETE</promise> marker contract; agent_runner.is_complete checks for that string. |

Workshop kit

Running HackSprint1 or a self-paced walkthrough? The full workshop kit (BRD, PRD, ADRs, tutorial, roadmap) lives under docs/workshop/INDEX.md.

Development

pip install -e ".[dev]"
pytest
ruff check whilly/ tests/
ruff format whilly/ tests/

Credits

  • Technique lineage: Ghuntley's original Ralph Wiggum loop post — the pattern whilly descends from.
  • Spirit of the family — Ralph's "I'm helping!" captures the essence of an agent that just keeps going, no matter what. Whilly is his smarter brother: same stamina, plus TRIZ, Decision Gate, PRD wizard.

Related work

  • Earlier Ralph-loop implementations exist across the Claude Code community. Whilly sets itself apart with a Rich TUI dashboard, TRIZ analyzer, Decision Gate pre-flight, PRD wizard, and tmux/git-worktree parallel execution — the "smarter brother" kit on top of the base loop.

License

MIT — see LICENSE.
