
Multi-agent session handoff framework for Claude Code and Codex CLI

superharness

Multi-agent task coordination for Claude Code and Codex CLI

superharness lets AI coding assistants work on the same project without stepping on each other. It provides a shared contract, queue-based delegation, and handoff/ledger state so tasks survive across sessions.

AI agent installing this? Read docs/INSTALL-AGENT.md — it tells you exactly what to detect, what to ask the user (just two questions), and how to set everything up without human terminal interaction.


Using superharness

Via Claude Code or Codex CLI (recommended)

Step 1 — Install superharness once (terminal):

pipx install superharness
Alternative: install from source
curl -fsSL https://raw.githubusercontent.com/celstnblacc/superharness/main/scripts/install-remote.sh | bash
# export PATH="$HOME/.local/bin:$PATH"  # add to ~/.zshrc or ~/.bashrc if needed

Or clone manually:

git clone https://github.com/celstnblacc/superharness.git ~/.local/share/superharness
bash ~/.local/share/superharness/scripts/install-wrapper.sh

Step 2 — Go to your project and open Claude Code or Codex CLI.

Step 3 — Type these phrases directly to the agent:

shux init              # bootstrap .superharness/ for this project
shux doctor            # check prerequisites and protocol health
shux contract          # show all tasks with status and next-task suggestion
shux continue          # resume active contract automatically
shux delegate <id>     # create task + enqueue in one step (task must be plan_approved or later)
shux test-type <id>    # set mandatory test types for a task
shux verify <id>       # record verification result (pass/fail)
shux close <id>        # mark done (task must be report_ready or review_passed); use --force to bypass
shux task create       # create a task with --blocked-by, --tdd-red/green/refactor, --criteria flags
shux task status       # update task lifecycle status (todo → plan_proposed → plan_approved → in_progress → report_ready → done)
shux status            # dashboard: tasks, watcher, profile
shux recall <keywords> # search past handoffs and ledger
shux uninstall         # remove watcher and system artifacts for this project
shux hygiene           # validate protocol compliance (contract, handoffs, ledger)
shux hygiene --repair  # auto-fix missing handoffs, ledger entries, and stuck statuses
shux dashboard         # open browser dashboard
shux watch             # start continuous watcher in foreground
shux update            # pull latest superharness + refresh templates, hooks, and watcher
shux discuss           # start or manage a cross-agent discussion (topic, owners, optional ID)
shux install-hooks     # merge adapter hooks into ~/.claude/settings.json (portable, run once per machine)
shux init --skip-hooks # init without modifying ~/.claude/settings.json (for CI or conservative setups)
shux benchmark         # show dispatch cost/duration leaderboard (--top N, --agents)
shux diff <id>         # preview agent changes for a task before closing (--stat, --base)
shux daemon start      # start background watcher daemon (portable, no launchd/systemd needed)
shux daemon stop       # stop the daemon
shux daemon status     # show daemon running state and PID
shux pack export       # bundle .superharness/ into a portable .tar.gz for handoff
shux pack import       # restore a pack into a new project
shux help              # show all shux shortcuts in the terminal

That's it. Steps 1 and 2 are one-time. From then on, start every session with shux contract.
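The lifecycle gates mentioned in the command list can be pictured as a small state check. This is a minimal sketch, not superharness source: the function names are hypothetical, and it assumes status moves one step forward at a time. The position of review_passed in the order is not stated above, so the sketch only uses it in the close gate.

```python
# Hypothetical sketch of the documented lifecycle gates; not superharness code.
LIFECYCLE = ["todo", "plan_proposed", "plan_approved",
             "in_progress", "report_ready", "done"]

# Statuses from which a task may be closed, per the `shux close` description.
CLOSABLE = {"report_ready", "review_passed"}


def can_advance(current: str, target: str) -> bool:
    """Assumption: only a move to the next documented state is allowed."""
    return LIFECYCLE.index(target) == LIFECYCLE.index(current) + 1


def can_delegate(status: str) -> bool:
    """Delegation needs plan_approved or later in the documented order."""
    if status not in LIFECYCLE:
        return False
    return LIFECYCLE.index(status) >= LIFECYCLE.index("plan_approved")


def can_close(status: str, force: bool = False) -> bool:
    """Closing needs report_ready or review_passed unless forced."""
    return force or status in CLOSABLE
```

With these rules, a task still in plan_proposed cannot be delegated, and a task in in_progress can only be closed with the force flag.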


Intelligence layer (v1.7.0)

Dispatch is now smarter. These features activate automatically — no extra setup needed.

Feature | What it does
Pre-flight analysis | Validates task spec, TDD block, dependencies, and git state before dispatch. Blocks on unresolved deps, warns on missing criteria.
Complexity estimator | Scores acceptance criteria + TDD scope and suggests single/fanout/swarm mode.
Failure pattern matching | 15 built-in classifiers (ImportError, timeout, git conflict, etc.) analyze errors and inject fix hints into the next dispatch.
Skill extraction | When a task completes, extracts category, techniques, and diff stats into skills.yaml. Future dispatches for similar tasks get technique hints.
Benchmark leaderboard | Tracks cost, duration, and outcome per dispatch in benchmark.jsonl. View with shux benchmark.
Parallel fan-out | Run N agents concurrently on isolated git worktrees. Use fanout_dispatch() from the SDK.
Swarm mode | N workers solve the same task, then an Opus reviewer picks the best solution. Optional auto-merge.
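To make failure pattern matching concrete, here is a minimal sketch of a classifier table that maps error text to a fix hint. The labels, regular expressions, and hint wording are all hypothetical; the 15 built-in classifiers are not listed in this document, so this only shows the general technique for three of the named categories.

```python
import re

# Hypothetical classifier table: (label, pattern, hint).
# These are illustrative only, not the patterns superharness ships.
CLASSIFIERS = [
    ("import_error",
     re.compile(r"ImportError|ModuleNotFoundError"),
     "Check that the dependency is installed and the import path is correct."),
    ("timeout",
     re.compile(r"timed? ?out", re.IGNORECASE),
     "Reduce the scope of the step or raise the time limit."),
    ("git_conflict",
     re.compile(r"CONFLICT \(|merge conflict", re.IGNORECASE),
     "Resolve the conflicting files before retrying."),
]


def classify(error_text: str):
    """Return (label, hint) for the first matching classifier, else None."""
    for label, pattern, hint in CLASSIFIERS:
        if pattern.search(error_text):
            return label, hint
    return None
```

A dispatcher could append the returned hint to the next prompt for the same task; how superharness actually injects hints is not described here.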

Via Terminal (alternative)

For scripting, CI, or users who prefer direct shell access.

Requires: bash, python3. See Prerequisites.

# Try first — no install needed
PYTHONPATH=src python3 -m superharness demo

# Install CLI
pipx install superharness && superharness --version

# Initialize project
cd /path/to/project
superharness init --interactive   # or: superharness init "Name" "Stack" "active"

# Verify
superharness doctor --project .

# Contract snapshot
superharness contract today --project .

# Delegate to agent
superharness delegate --to codex-cli --project .

# Queue management
superharness enqueue --project . --to codex-cli --task my-task --priority 1
superharness dispatch --project . --to codex-cli

# Protocol hygiene + browser dashboard
superharness hygiene --project .
superharness dashboard-ui --project .
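The enqueue and dispatch commands above suggest a per-agent priority queue. The sketch below models that idea in memory. It is not how superharness stores its inbox (that lives in .superharness/inbox.yaml), the Inbox class is hypothetical, and it assumes that a lower priority number is dispatched first and that ties go to the earliest enqueued task.

```python
import heapq
import itertools


class Inbox:
    """Hypothetical in-memory model of a dispatch queue."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: enqueue order

    def enqueue(self, to: str, task: str, priority: int = 1) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), to, task))

    def dispatch(self, to: str):
        """Pop the best-priority task addressed to `to`, or return None."""
        skipped = []
        found = None
        while self._heap:
            item = heapq.heappop(self._heap)
            if item[2] == to:
                found = item[3]
                break
            skipped.append(item)
        # Put back tasks that belong to other agents.
        for item in skipped:
            heapq.heappush(self._heap, item)
        return found
```

Under these assumptions, two tasks enqueued for codex-cli with priorities 2 and 1 come out in the order 1, then 2, and tasks addressed to another agent stay queued.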

Run tests:

uv sync --dev
pytest tests/ -q

Full terminal reference: docs/GUIDE.md


Quick Links

📘 User Guide — Commands, background watcher, troubleshooting
🏗️ Architecture — Why it exists, how it works, design decisions
🔒 Security — Threat model and operational safety notes


What You Get

  • shux shortcuts — Control superharness from inside Claude Code or Codex CLI
  • superharness init — Bootstrap protocol files (.superharness/); auto-installs Claude Code hooks and background watcher (macOS)
  • superharness task — Create and update tasks: --blocked-by <id> dependency tracking, --tdd-red/green/refactor TDD block, --criteria acceptance criteria; task status enforces the full lifecycle (todo → plan_proposed → plan_approved → in_progress → report_ready → done)
  • superharness delegate — Launch agent with contract context (requires task status ≥ plan_approved; auto model routing)
  • superharness verify — Record verification result before closing a task
  • superharness close — Close a verified task (requires report_ready or review_passed; use --force to bypass lifecycle gate)
  • superharness enqueue|dispatch|watch — Queue-based task routing
  • superharness hygiene — Protocol compliance checks
  • superharness watch --foreground — Cross-platform continuous watcher
  • superharness dashboard-ui — Browser dashboard: inbox, tasks, watcher state, enqueue with TDD instructions
  • superharness doctor — Prerequisite and setup health check
  • superharness uninstall — Clean removal of system artifacts
  • Background watcher — Unattended execution via macOS launchd or Linux systemd (opt-in)
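The --blocked-by dependency tracking listed above comes down to one question: which open tasks have all their blockers finished? A minimal sketch, assuming a hypothetical in-memory task shape (a dict with status and blocked_by keys) rather than the real contract.yaml schema, which is not shown in this document:

```python
def ready_tasks(tasks: dict) -> list:
    """Return IDs of open tasks whose blockers are all done."""
    done = {task_id for task_id, task in tasks.items()
            if task["status"] == "done"}
    return sorted(
        task_id
        for task_id, task in tasks.items()
        if task["status"] != "done"
        and all(dep in done for dep in task.get("blocked_by", []))
    )


# Hypothetical example data: ui waits on api, api waits on schema.
tasks = {
    "schema": {"status": "done", "blocked_by": []},
    "api": {"status": "todo", "blocked_by": ["schema"]},
    "ui": {"status": "todo", "blocked_by": ["api"]},
}
print(ready_tasks(tasks))  # ['api']
```

Once api is marked done, ui becomes the only ready task.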

Is this for me?

superharness is for you if any of these are true:

  • You use Claude Code or Codex CLI and find yourself re-explaining project context at the start of every session
  • You want to hand off a task to one agent while you work with another
  • You need an append-only audit trail of what each agent did and decided
  • You run agents unattended in the background (e.g. via launchd/systemd)

You probably don't need superharness if you only ever run a single agent interactively and don't switch between sessions.

What you need to use it

Feature | Requirements
Core protocol (contracts, handoffs, ledger) | bash, python3
Agent shortcuts (shux) | Core + claude or codex CLI
Background auto-dispatch | Core + launchd (macOS) or systemd (Linux)
Browser dashboard | Core + python3 -m http.server (built-in)

You can start with just the core and add agent CLIs and background services later. --print-only mode lets you preview every dispatch without launching anything.
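To see which tier of the table your machine already satisfies, you can probe PATH for the listed executables. This is a rough sketch and not a replacement for superharness doctor; the function name and result keys are hypothetical, and it checks only that the executables exist, not their versions.

```python
import shutil


def check_prerequisites(which=shutil.which) -> dict:
    """Report which requirement tiers are met, based on PATH lookups.

    `which` is injectable so the check can be exercised without touching
    the real environment.
    """
    def has(name: str) -> bool:
        return which(name) is not None

    core = has("bash") and has("python3")
    return {
        "core": core,
        # Agent shortcuts need the core plus at least one agent CLI.
        "agent_shortcuts": core and (has("claude") or has("codex")),
    }


if __name__ == "__main__":
    print(check_prerequisites())
```

On systems where the interpreter is installed under a different name than python3, this probe will report the core as missing even though Python is present.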


Platform Support

Cross-platform: macOS, Linux, Windows. All user-facing commands are Python and work everywhere python3 is available. CI runs on all three platforms.

  • Background watcher has automated service installers for macOS (launchd), Linux (systemd), and Windows (Task Scheduler via schtasks.exe). superharness watch --foreground works everywhere as an alternative.
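The mapping from platform to watcher backend described above can be sketched with the standard library. The function name and the "foreground" fallback label are hypothetical; the installers themselves are the scripts shipped with superharness, not this snippet.

```python
import platform

# platform.system() values mapped to the service managers named above.
SERVICE_MANAGERS = {
    "Darwin": "launchd",
    "Linux": "systemd",
    "Windows": "Task Scheduler (schtasks.exe)",
}


def watcher_backend(system=None) -> str:
    """Pick a background service manager, else fall back to foreground mode."""
    system = system or platform.system()
    return SERVICE_MANAGERS.get(system, "foreground")
```

On a platform without a listed service manager, the sketch falls back to the foreground watcher, matching the note that superharness watch --foreground works everywhere.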

Prerequisites

  • python3 3.11+ with pyyaml: uv sync --dev (or pip install pyyaml click ruamel.yaml)
  • bash — only needed for macOS/Linux watcher service install scripts; not required on Windows or for any core commands
  • claude CLI (for Claude delegation commands): npm install -g @anthropic-ai/claude-code
  • codex CLI (for Codex delegation commands): npm install -g @openai/codex
  • macOS launchd, Linux systemd, or Windows Task Scheduler for the background watcher (see Platform Support); --foreground mode works everywhere

Project Runtime State

Per-project state lives in .superharness/:

.superharness/
├── contract.yaml          # tasks, decisions, failures
├── handoffs/              # session handoff state
├── ledger.md              # append-only event log
├── decisions.yaml         # cross-agent ADRs
├── failures.yaml          # failure memory
└── inbox.yaml             # dispatch queue

Architecture details: docs/ARCHITECTURE.md
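An append-only event log like ledger.md can be written by opening the file in append mode, so earlier entries are never rewritten. The sketch below illustrates that idea only. The entry format (timestamp, agent, event) and the function name are hypothetical; the actual ledger format is defined by the protocol spec, not by this snippet.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_ledger(project_root, agent: str, event: str) -> str:
    """Append one line to .superharness/ledger.md and return it."""
    ledger = Path(project_root) / ".superharness" / "ledger.md"
    ledger.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    line = f"- {stamp} [{agent}] {event}\n"
    # Mode "a" only ever adds to the end of the file.
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(line)
    return line


if __name__ == "__main__":
    # Demo in a throwaway directory so no real project state is touched.
    with tempfile.TemporaryDirectory() as root:
        append_ledger(root, "codex-cli", "task my-task moved to in_progress")
        print((Path(root) / ".superharness" / "ledger.md").read_text(encoding="utf-8"))
```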


Repository Layout

superharness/
├── superharness            # thin Bash shim → delegates to Python
├── src/superharness/       # Python CLI + engine + command modules
├── protocol/              # protocol spec + templates
├── adapters/              # Claude/Codex adapter assets
├── scripts/               # launchd installer + CI guard scripts
├── docs/                  # architecture and user guide
├── tests/                 # unit/integration/e2e tests
└── CHANGELOG.md

Security Note

The background watcher enables unattended execution (agents run without human supervision). This is powerful but requires explicit confirmation:

macOS (launchd):

bash scripts/install-launchd-inbox-watcher.sh \
  --project /path/to/project \
  --interval 30 \
  --confirm-non-interactive yes \
  --confirm-skip-permissions yes

Linux (systemd):

CONFIRM_NON_INTERACTIVE=yes bash scripts/install-systemd-inbox-watcher.sh \
  --project /path/to/project \
  --interval 30

Read the full threat model: SECURITY.md
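The explicit-confirmation requirement can be thought of as a gate that refuses to proceed unless both confirmations are set to yes. The sketch below mirrors the flag names from the launchd example above, but it is an illustration only: the function name, the defaults, and the error type are assumptions, and the real installer scripts may behave differently.

```python
import argparse


def parse_watcher_args(argv):
    """Parse installer-style flags and refuse unconfirmed unattended runs."""
    parser = argparse.ArgumentParser(prog="install-watcher-sketch")
    parser.add_argument("--project", required=True)
    parser.add_argument("--interval", type=int, default=30)
    parser.add_argument("--confirm-non-interactive", default="no")
    parser.add_argument("--confirm-skip-permissions", default="no")
    args = parser.parse_args(argv)

    # Both confirmations must be given explicitly; anything else is refused.
    if (args.confirm_non_interactive != "yes"
            or args.confirm_skip_permissions != "yes"):
        raise PermissionError(
            "unattended execution requires both confirmations set to 'yes'"
        )
    return args
```

Defaulting both confirmations to no means that forgetting a flag fails closed rather than silently enabling unattended execution.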


Current Version

Current version: v1.8.0

See CHANGELOG.md for the full iteration log.


Download files

Source Distribution

superharness-1.8.0.tar.gz (222.5 kB view details)

Uploaded Source

Built Distribution

superharness-1.8.0-py3-none-any.whl (280.1 kB view details)

Uploaded Python 3

File details

Details for the file superharness-1.8.0.tar.gz.

File metadata

  • Download URL: superharness-1.8.0.tar.gz
  • Upload date:
  • Size: 222.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for superharness-1.8.0.tar.gz
Algorithm Hash digest
SHA256 b1c20782c284908a5f7c841ea128d6ad1c41a8943a0c8780b759eaf4e9a66e7c
MD5 bc4813e111514d0415e14ed5ac62f668
BLAKE2b-256 06ea47afa25dd5844b1ff537550fd9fe3c5393757bc1da045f826092421c6f00
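A downloaded file can be checked against the SHA256 digest listed above with Python's hashlib. This is a minimal sketch; the helper names are hypothetical, and the path you pass is wherever you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex: str) -> bool:
    """Compare case-insensitively against a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches("superharness-1.8.0.tar.gz", "<SHA256 value from the table>") should return True for an intact download of that file.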

Provenance

The following attestation bundles were made for superharness-1.8.0.tar.gz:

Publisher: publish.yml on celstnblacc/superharness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file superharness-1.8.0-py3-none-any.whl.

File metadata

  • Download URL: superharness-1.8.0-py3-none-any.whl
  • Upload date:
  • Size: 280.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for superharness-1.8.0-py3-none-any.whl
Algorithm Hash digest
SHA256 c1574cb9995867481d5dac3d715c94e04f8bdd1ff740f09545c64333e59b7428
MD5 a283e38f77cd4feca7988bca207cab14
BLAKE2b-256 5ca4da832a0aa047580adc5195413c7ba288f82e6ac4c250b776a7afcda55d36

Provenance

The following attestation bundles were made for superharness-1.8.0-py3-none-any.whl:

Publisher: publish.yml on celstnblacc/superharness

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
