Autonomous CLI supervisor for staged AI workflows
cybervisor
cybervisor is an autonomous CLI supervisor for development runs. It executes a customizable multi-stage pipeline with Gemini CLI, Claude Code, or a mock agent, installs runtime hooks for non-interactive execution, enforces structured stage-result contracts, and keeps audit logs in JSONL.
cybervisor works best when it sits on top of a speckit repository. speckit gives the project durable product and planning memory under .specify/, and cybervisor turns that context into an autonomous execution loop with review, correction, and verification stages.
What it does
- Runs a multi-stage pipeline defined in cybervisor.yaml
- Defaults to a robust 5-to-10 stage pipeline depending on the scaffold used
- Supports structured stage-result contracts and artifact-driven routing
- Fails fast when the selected agent CLI or hook verifier credentials are missing
- Writes non-secret hook runtime metadata under .cybervisor/hooks/ for non-mock runs
- Keeps verifier credentials in ~/.cybervisor/config.yaml
- Snapshots .gemini/settings.json or .claude/settings.json and restores them on exit
- Streams live agent output to stderr and persists per-stage logs under .cybervisor/logs/stages/
- Enforces single-instance execution: when the daemon is reachable, run checks for active daemon tasks before proceeding; when the daemon is unreachable, it falls back to .cybervisor/instance.lock; it exits with 1 if another instance is already running in the same directory (see Runtime Behavior)
- Exits with 130 on SIGINT or SIGTERM after cleanup
- Daemon mode (cybervisor serve): Long-running WebSocket server for headless pipeline execution and remote monitoring; supports task cancel, dynamic stop-stage updates, client reconnect with event replay, and background daemonization (see WebSocket Protocol)
- Daemon client commands: Six subcommands (status, submit, attach, cancel, logs, stop-stage) interact with a running daemon over WebSocket; status reports running task IDs and stages from the daemon's active registry; all support --host and --port overrides and exit with meaningful codes
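The single-instance fallback in the list above can be pictured as an exclusive-create lock file. This is a minimal sketch under stated assumptions, not cybervisor's actual implementation: the helper names try_acquire_lock and release_lock are hypothetical, and the real lock may use a different mechanism or handle stale locks, which this sketch does not.

```python
import os
from pathlib import Path


def try_acquire_lock(lock_path: Path) -> bool:
    """Create the lock file exclusively; return False if it already exists."""
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes creation fail when another process already holds the lock.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        # Recording the PID is an illustrative choice, not a documented format.
        handle.write(str(os.getpid()))
    return True


def release_lock(lock_path: Path) -> None:
    """Remove the lock file on exit; a missing file is not an error."""
    lock_path.unlink(missing_ok=True)
```

A caller following the README's behavior would exit with 1 when try_acquire_lock returns False and call release_lock during cleanup.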
Requirements
- Python 3.11+
- uv
- One of:
  - gemini on PATH
  - claude on PATH
  - mock mode for local deterministic runs (no API key needed)
- ~/.cybervisor/config.yaml with verifier settings for non-mock runs
Installation
Install the CLI onto your PATH:
uv tool install cybervisor
After installation, verify:
cybervisor --version
To update an existing installation later:
uv tool upgrade cybervisor
cybervisor --version
For the full update guide, run:
cybervisor docs updating
Quick Start
Initialize the cybervisor scaffold in your project:
cybervisor init
cybervisor init detects your environment:
- If .specify/ exists, it installs the speckit scaffold (integrated with speckit workflows).
- If .specify/ is missing, it installs the simple scaffold (standalone artifacts in .cybervisor/artifacts/).
Both scaffolds create a cybervisor.yaml file containing the full pipeline configuration, including prompt templates and stage contracts.
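The detection rule above amounts to a single directory check. The sketch below illustrates it; choose_scaffold is a hypothetical name used only for this example, and the real init command may inspect more than the presence of .specify/.

```python
from pathlib import Path


def choose_scaffold(project_root: Path) -> str:
    """Return "speckit" when .specify/ exists under the project, else "simple"."""
    return "speckit" if (project_root / ".specify").is_dir() else "simple"
```

For example, choose_scaffold(Path.cwd()) would report which scaffold a fresh cybervisor init is expected to pick for the current directory, assuming the rule is exactly as described.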
Set your global default agent:
cybervisor use claude
For fast, API-key-free pipeline runs (CI, testing), set agent_tool: mock in your config instead:
agent_tool: mock
When agent_tool: mock is set, the pipeline uses the built-in MockAdapter, which completes each stage deterministically and writes contract artifacts driven by the loaded PipelineConfig. The hook verifier is still called via an LLM endpoint, so the llm section is still required (point it at any reachable URL with any key).
Configure your verifier settings in ~/.cybervisor/config.yaml:
agent_tool: claude
llm:
  api_key: your-api-key
  # Optional overrides
  # base_url: https://api.openai.com/v1
  # model: gpt-4o
# Per-stage agent tool model overrides (top-level, not under llm)
# stage_models:
#   Spec: claude-sonnet-4-6
#   "Review Code": claude-opus-4-6
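To make the shape of this file concrete, here is a small sketch that checks an already-parsed config mapping against the rules this README states. It is an illustration only: check_config is a hypothetical helper, the rules are inferred from the text above, and the checks cybervisor itself performs may be stricter or different.

```python
ALLOWED_AGENT_TOOLS = {"gemini", "claude", "mock"}


def check_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    tool = config.get("agent_tool")
    if tool not in ALLOWED_AGENT_TOOLS:
        problems.append(
            f"agent_tool must be one of {sorted(ALLOWED_AGENT_TOOLS)}, got {tool!r}"
        )
    # The README says the llm section is needed even when agent_tool is mock.
    llm = config.get("llm")
    if not isinstance(llm, dict) or not llm.get("api_key"):
        problems.append("llm.api_key is required for the hook verifier")
    # stage_models sits at the top level, not under llm.
    stage_models = config.get("stage_models", {})
    if not isinstance(stage_models, dict):
        problems.append("stage_models must map stage names to model names")
    return problems
```

Passing the dictionary produced by any YAML loader for ~/.cybervisor/config.yaml would work here; the sketch deliberately avoids a YAML dependency.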
Run the supervisor:
cybervisor "Create a 360 feedback system"
printf "Create a 360 feedback system" | cybervisor run
Usage
# Run with a prompt
cybervisor "Your task description"
cybervisor run "Your task description"
printf "Your task description" | cybervisor run
# Specify a custom config
cybervisor run "Your task" --config custom.yaml
# Control execution flow
cybervisor run "Your task" --start-stage "Implement"
cybervisor run "Your task" --end-stage "Review Code"
cybervisor run "Your task" --end-before "Verify"
# Set default agent
cybervisor use gemini
# Validate your configuration
cybervisor validate
cybervisor validate --show-guidance
For advanced stage configuration including cleanup paths, max iterations, per-stage model overrides, and contract authoring, see the Pipeline Authoring Guide and Configuration Reference.
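The stage-bound flags shown above can be read as slicing an ordered stage list. The sketch below assumes that --end-stage is inclusive, that --end-before is exclusive, and that the two are mutually exclusive; those semantics are inferred from the flag names rather than confirmed here, and select_stages is a hypothetical helper.

```python
def select_stages(stages, start_stage=None, end_stage=None, end_before=None):
    """Return the sub-list of stages that would run under the given bounds."""
    if end_stage is not None and end_before is not None:
        # Assumption: only one end bound may be given at a time.
        raise ValueError("use either end_stage or end_before, not both")
    start = stages.index(start_stage) if start_stage is not None else 0
    if end_stage is not None:
        stop = stages.index(end_stage) + 1  # inclusive bound
    elif end_before is not None:
        stop = stages.index(end_before)  # exclusive bound
    else:
        stop = len(stages)
    return stages[start:stop]
```

With a hypothetical stage order of Spec, Implement, Review Code, Verify, both end_stage="Review Code" and end_before="Verify" select the first three stages.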
Daemon Mode
cybervisor serve starts a long-running WebSocket daemon. Once running, use the client subcommands to submit tasks, monitor progress, and manage the pipeline remotely.
# Start the daemon server (WebSocket on ws://127.0.0.1:8765)
cybervisor serve
cybervisor serve --host 0.0.0.0 --port 9000
cybervisor serve --background # Run in background via double-fork
# Check daemon connectivity and active tasks (exits 0 when reachable, 1 when not)
cybervisor status
cybervisor status --host 127.0.0.1 --port 8765
# Example output when a task is running:
# Running task: abc123def456 (stage: Spec, cwd: /workspace/project, bounds: end_stage=Verify)
# Daemon reachable at ws://127.0.0.1:8765
# Example output when no task is running:
# No active tasks.
# Daemon reachable at ws://127.0.0.1:8765
# Example output when daemon is down:
# Daemon not reachable at ws://127.0.0.1:8765
# Check status of a specific task by ID (matches across all directories)
cybervisor status abc123def456
# Submit a task and stream events until completion
cybervisor submit "Your task description" --config cybervisor.yaml --start-stage Implement
printf "Your task description" | cybervisor submit # read prompt from stdin
cat task_prompt.txt | cybervisor submit # multi-line prompts preserved
cybervisor submit "Your task" --task-id my-task-123 # explicit task ID
# On submit, the task ID is printed to stderr (e.g. "Task created: abc123def456")
# Use this ID with attach, cancel, logs, or stop-stage
# Note: submitting a new task in a directory that already has a running task is rejected
# with the error "A task is already running in this directory".
# Reconnect to a running or completed task (auto-detects task in current directory)
cybervisor attach
# Reconnect to a specific task by ID to replay buffered events
cybervisor attach my-task-123
# Note: event history exceeding 64 KB is automatically split into chunks on the server
# and reassembled by the client — this is transparent to the user
# Cancel an active task (auto-detects task in current directory; errors if zero tasks)
cybervisor cancel
# Cancel a specific task by ID (works from any directory)
cybervisor cancel my-task-123
# Dump all buffered events as JSON Lines (non-blocking)
cybervisor logs my-task-123
# Update the stop stage of a running task
cybervisor stop-stage --stage Verify # auto-detect task in current directory
cybervisor stop-stage my-task-123 --stage Verify # explicit task ID
# Override daemon address for any client command
cybervisor submit "task" --host 0.0.0.0 --port 9000
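The attach note above mentions that event history larger than 64 KB is split into chunks on the server and reassembled by the client. The sketch below shows one way such a split and rejoin could work. The function names and the plain byte-slice framing are assumptions for illustration; the actual wire format is defined by the WebSocket protocol, not by this example.

```python
import json

CHUNK_BYTES = 64 * 1024  # threshold taken from the note above


def split_history(events: list[dict], chunk_bytes: int = CHUNK_BYTES) -> list[bytes]:
    """Serialize events to JSON and cut the bytes into fixed-size chunks."""
    payload = json.dumps(events).encode("utf-8")
    return [payload[i:i + chunk_bytes] for i in range(0, len(payload), chunk_bytes)]


def join_history(chunks: list[bytes]) -> list[dict]:
    """Concatenate chunks in order, then decode; decoding after joining keeps
    multi-byte characters intact even if a chunk boundary fell inside one."""
    return json.loads(b"".join(chunks).decode("utf-8"))
```

Because the client only decodes after all chunks have arrived, the split stays invisible to the user, which matches the behavior the note describes.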
Exit codes for client commands:
- 0: success
- 1: failure (daemon unreachable, task not found, invalid state, etc.)
- 130: interrupted (SIGINT during submit or attach)
Treat cybervisor validate as the local readiness gate before merge or execution. A passing result means the config is not only parseable but also satisfies the stricter contract-authoring checks for route safety, complete routed examples, and authored prompt/guidance synchronization.
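When scripting against the daemon, the client exit codes listed above can be checked from Python. This is a sketch under stated assumptions: describe_exit and daemon_reachable are hypothetical helpers, and only the status command with its --host and --port flags, as documented in this README, is invoked.

```python
import subprocess

# Meanings as documented for the client commands.
EXIT_MEANINGS = {0: "success", 1: "failure", 130: "interrupted"}


def describe_exit(code: int) -> str:
    """Translate a client exit code into the meaning given in this README."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def daemon_reachable(host: str = "127.0.0.1", port: int = 8765) -> bool:
    """Run `cybervisor status`; the README says it exits 0 when reachable."""
    try:
        result = subprocess.run(
            ["cybervisor", "status", "--host", host, "--port", str(port)],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # cybervisor is not installed or not on PATH.
        return False
    return result.returncode == 0
```

A CI step could call daemon_reachable() before submitting work and log describe_exit(code) for any client command that returns a nonzero code.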
User-facing workflow or specification changes should be documented in tracked files under docs/ and, when relevant, this README. Do not leave those changes only in local working directories such as specs/ or .cybervisor/artifacts/, because they are not part of the committed project history.
Recommended with speckit
The strongest setup is pairing cybervisor with speckit. speckit manages the long-lived product memory (specs, plans, tasks) in .specify/, while cybervisor provides the autonomous execution engine to drive those workflows.
Development
If you are contributing to cybervisor:
uv sync
uv run mypy --strict src
uv run pytest
For self-hosted E2E or verify-stage smoke tests, do not run from the repository root when the goal is to simulate a generated project. Create an isolated demo workspace first, typically with:
./scripts/e2e-demo-simple-project.sh
For a fast smoke test that exercises the full pipeline through Verify using a minimal feature prompt and a mock LLM API:
./scripts/e2e-verify-smoke.sh
./scripts/e2e-verify-smoke.sh --agent claude # use Claude Code adapter instead of mock
Both modes route all LLM calls (hook verifier and stage-agent) through the bundled mock API server, so no real API keys are needed.
Release helper:
./scripts/publish.sh patch # or minor, major
The script requires a clean git working tree, bumps the package version, refreshes uv.lock, builds and publishes the package, then creates a release commit and annotated git tag like v0.7.1.
Repository Layout
src/cybervisor/            Core CLI package (split into focused subpackages)
  cli/                     CLI entry point (commands, parser, instance, docs)
  client/                  Daemon WebSocket client (commands, connection, rendering)
  pipeline/                Pipeline execution (runner, artifacts, contract)
  server/                  Daemon WebSocket server (daemon, handlers, tasks, cleanup)
  core_hooks/              Hook runtime (contracts, streaming, verifier, runner)
  adapters/                Agent adapter registry and tool-specific adapters
  config/                  YAML config loading
  cli.py, client.py,       Thin backward-compatible re-exports
  pipeline.py, server.py
  hooks.py                 Hook installer and runtime config
  agent_hook.py            Packaged cybervisor-agent-hook entry point
  preflight.py             Dependency pre-check
  signals.py               Signal handler
  logging.py               Structured logging
assets/hooks/ Hook prompt assets and fixtures
scripts/ Demo and utility scripts
templates/demo/ Demo project scaffold
tests/ Unit and integration coverage
.specify/ Constitution and repo-specific scripts
AGENTS.md Symlink to constitution
GEMINI.md Symlink to AGENTS.md
CLAUDE.md Symlink to AGENTS.md
.cybervisor/ Runtime state (instance.lock, daemon.lock, hooks/, logs/)