cybervisor

Autonomous CLI supervisor for staged AI workflows

cybervisor is an autonomous CLI supervisor for development runs. It executes a customizable multi-stage pipeline with Gemini CLI, Claude Code, Codex, or a mock agent, installs runtime hooks for non-interactive execution, enforces structured stage-result contracts, and keeps audit logs in JSONL.

cybervisor works best when it sits on top of a speckit repository. speckit gives the project durable product and planning memory under .specify/, and cybervisor turns that context into an autonomous execution loop with review, correction, and verification stages.

What it does

  • Runs a multi-stage pipeline defined in cybervisor.yaml
  • Defaults to a robust 5- to 10-stage pipeline, depending on the scaffold used
  • Supports structured stage-result contracts and artifact-driven routing
  • Fails fast when the selected agent CLI or hook verifier credentials are missing
  • Writes non-secret hook runtime metadata under .cybervisor/hooks/ for non-mock runs
  • Keeps verifier credentials in ~/.cybervisor/config.yaml
  • Snapshots .gemini/settings.json or .claude/settings.json and restores them on exit
  • Streams live agent output to stderr and persists per-stage logs under .cybervisor/logs/stages/
  • Disables project-local agent skills that conflict with autonomous execution via disabled_skills in cybervisor.yaml; skills are moved to .cybervisor/backups/skills/ before the pipeline and restored after (see Configuration Reference)
  • Automatically checks for cybervisor upgrades in the background during run, serve, and serve-sandbox; logs an info-level notice if a newer version is found — no pip subprocess is spawned (see Updating)
  • Enforces single-instance execution — when the daemon is reachable, run checks for active daemon tasks before proceeding; when the daemon is unreachable, falls back to .cybervisor/instance.lock; exits with 1 if another instance is already running in the same directory (see Runtime Behavior)
  • Exits with 130 on SIGINT or SIGTERM after cleanup
  • Daemon mode (cybervisor serve): Long-running WebSocket server for headless pipeline execution and remote monitoring; supports task cancel, dynamic stop-stage updates, client reconnect with event replay, and background daemonization (see WebSocket Protocol)
  • Daemon client commands: Six subcommands (status, submit, attach, cancel, logs, stop-stage) interact with a running daemon over WebSocket; status reports running task IDs and stages from the daemon's active registry; all support --host and --port overrides and exit with meaningful codes
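The skill-disabling behavior above is driven by a disabled_skills list in cybervisor.yaml. A minimal fragment might look like the sketch below; the skill name is hypothetical, and the Configuration Reference documents the actual schema:

```yaml
# cybervisor.yaml (fragment) — skill name is a made-up example
disabled_skills:
  - interactive-code-review   # project-local skill parked during autonomous runs
```

Listed skills are moved to .cybervisor/backups/skills/ before the pipeline starts and restored afterward.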

Requirements

  • Python 3.11+
  • uv
  • One of:
    • gemini on PATH
    • claude on PATH
    • codex on PATH
    • mock mode for local deterministic runs (no API key needed)
  • ~/.cybervisor/config.yaml with verifier settings for non-mock runs

Installation

Install the CLI onto your PATH:

uv tool install cybervisor

After installation, verify:

cybervisor --version

To update an existing installation later:

uv tool upgrade cybervisor
cybervisor --version

For the full update guide, run:

cybervisor docs updating

Quick Start

Initialize the cybervisor scaffold in your project:

cybervisor init

cybervisor init detects your environment:

  • If .specify/ exists, it installs the speckit scaffold (integrated with speckit workflows).
  • If .specify/ is missing, it installs the simple scaffold (standalone artifacts in .cybervisor/artifacts/).

Both scaffolds create a cybervisor.yaml file containing the full pipeline configuration, including prompt templates and stage contracts.

Set your global default agent:

cybervisor use claude

For fast, API-key-free pipeline runs (CI, testing), set agent_tool: mock in your config instead:

agent_tool: mock

When agent_tool: mock is set, the pipeline uses the built-in MockAdapter which completes each stage deterministically and writes contract artifacts driven by the loaded PipelineConfig. The hook verifier is still called via an LLM endpoint, so the llm section is still required (point it at any reachable URL with any key).
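Putting that together, a minimal mock-mode ~/.cybervisor/config.yaml could look like this; the base_url value is only an example of "any reachable URL with any key":

```yaml
agent_tool: mock
llm:
  api_key: dummy-key                   # any value; the llm section must still exist
  base_url: http://127.0.0.1:8080/v1   # any reachable URL works for mock runs
```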

Configure your verifier settings in ~/.cybervisor/config.yaml (created with 0o600 permissions so API keys are owner-readable only):

agent_tool: claude
llm:
  api_key: your-api-key
  # Optional overrides
  # base_url: https://api.openai.com/v1
  # model: gpt-4o

# Per-stage agent tool model overrides (top-level, not under llm)
# stage_models:
#   Spec: claude-sonnet-4-6
#   "Review Code": claude-opus-4-6

Run the supervisor:

cybervisor "Create a 360 feedback system"
printf "Create a 360 feedback system" | cybervisor run

Usage

# Run with a prompt
cybervisor "Your task description"
cybervisor run "Your task description"
printf "Your task description" | cybervisor run

# Specify a custom config
cybervisor run "Your task" --config custom.yaml

# Control execution flow
cybervisor run "Your task" --start-stage "Implement"
cybervisor run "Your task" --end-after "Review Code"    # Run up to and including this stage, then stop (updatable via stop-stage)
cybervisor run "Your task" --end-before "Verify"        # Stop before executing this stage (exclusive boundary)

# Set default agent
cybervisor use gemini

# Validate your configuration
cybervisor validate
cybervisor validate --show-guidance

Treat cybervisor validate as the local readiness gate before merge or execution. A passing result means the config is not only parseable, but also satisfies the stricter contract-authoring checks for route safety, complete routed examples, and authored prompt/guidance synchronization.
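One way to enforce that gate in CI is a pipeline step that runs validation and fails the job on a non-zero exit. This is a hypothetical step, shown in GitHub Actions syntax purely for illustration; any CI system that checks exit codes works the same way:

```yaml
# Hypothetical CI step — not part of the cybervisor scaffold
- name: Validate cybervisor config
  run: |
    uv tool install cybervisor
    cybervisor validate --show-guidance
```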

For advanced stage configuration including cleanup paths, max iterations, per-stage model overrides, and contract authoring, see the Pipeline Authoring Guide and Configuration Reference.

Daemon Mode

cybervisor serve starts a long-running WebSocket daemon. Once running, use the client subcommands to submit tasks, monitor progress, and manage the pipeline remotely.

# Start the daemon server (WebSocket on ws://127.0.0.1:8765)
cybervisor serve
cybervisor serve --host 0.0.0.0 --port 9000
cybervisor serve --background   # Run in background via double-fork

# Check daemon connectivity and active tasks (exits 0 when reachable, 1 when not)
cybervisor status
cybervisor status --host 127.0.0.1 --port 8765
# Example output when a task is running:
#   Running task: abc123def456 (stage: Spec, cwd: /workspace/project, bounds: end_stage=Verify)
#   Daemon reachable at ws://127.0.0.1:8765
# Example output when no task is running:
#   No active tasks.
#   Daemon reachable at ws://127.0.0.1:8765
# Example output when daemon is down:
#   Daemon not reachable at ws://127.0.0.1:8765

# Check status of a specific task by ID (matches across all directories)
cybervisor status abc123def456

# Submit a task and stream events until completion
cybervisor submit "Your task description" --config cybervisor.yaml --start-stage Implement
cybervisor submit "Your task" --end-after "Review Code"
cybervisor submit "Your task" --end-before Verify
printf "Your task description" | cybervisor submit          # read prompt from stdin
cat task_prompt.txt | cybervisor submit                     # multi-line prompts preserved
cybervisor submit "Your task" --task-id my-task-123   # explicit task ID
# On submit, the task ID is printed to stderr (e.g. "Task created: abc123def456")
# Use this ID with attach, cancel, logs, or stop-stage
# Note: submitting a new task in a directory that already has a running task is rejected
# with the error "A task is already running in this directory".
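Because the task ID goes to stderr, a wrapper can capture it for later attach, cancel, logs, or stop-stage calls. A sketch, using a stand-in function in place of a real cybervisor submit so the redirection is the only moving part:

```shell
# Stand-in for `cybervisor submit`, which prints "Task created: <id>" to stderr
fake_submit() { echo "Task created: abc123def456" >&2; }

# Swap stderr onto the pipe, discard stdout, and strip the prefix
task_id=$(fake_submit 2>&1 >/dev/null | sed -n 's/^Task created: //p')
echo "$task_id"   # abc123def456
```

With the real CLI, replace fake_submit with cybervisor submit "Your task" and pass the captured ID to the other client commands.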

# Reconnect to a running or completed task (auto-detects task in current directory)
cybervisor attach

# Reconnect to a specific task by ID to replay buffered events
cybervisor attach my-task-123
# Note: event history exceeding 64 KB is automatically split into chunks on the server
# and reassembled by the client — this is transparent to the user

# Cancel an active task (auto-detects task in current directory; errors if zero tasks)
cybervisor cancel

# Cancel a specific task by ID (works from any directory)
cybervisor cancel my-task-123

# Dump all buffered events as JSON Lines (non-blocking)
cybervisor logs my-task-123

# Update the stop stage of a running task
cybervisor stop-stage --stage Verify                   # auto-detect task in current directory
cybervisor stop-stage my-task-123 --stage Verify       # explicit task ID

# Override daemon address for any client command
cybervisor submit "task" --host 0.0.0.0 --port 9000

Exit codes for client commands:

  • 0 — success
  • 1 — failure (daemon unreachable, task not found, invalid state, etc.)
  • 130 — interrupted (SIGINT during submit or attach)
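A wrapper script can branch on those codes. The sketch below uses a hypothetical helper, with `true` and `false` standing in for client calls so the dispatch itself is what's demonstrated:

```shell
# Dispatch on the documented client exit codes
handle_exit() {
  "$@"
  case $? in
    0)   status=ok ;;
    1)   status=failure ;;       # daemon unreachable, task not found, invalid state
    130) status=interrupted ;;   # SIGINT during submit or attach
    *)   status=unknown ;;
  esac
  echo "$status"
}

handle_exit true    # prints: ok
handle_exit false   # prints: failure
```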

Recommended with speckit

The strongest setup is pairing cybervisor with speckit. speckit manages the long-lived product memory (specs, plans, tasks) in .specify/, while cybervisor provides the autonomous execution engine to drive those workflows.

Development

If you are contributing to cybervisor:

User-facing workflow or specification changes should be documented in tracked files under docs/ and, when relevant, this README. Do not leave those changes only in local working directories such as specs/ or .cybervisor/artifacts/, because they are not part of the committed project history.

uv sync
uv run mypy --strict src
uv run pytest

For self-hosted E2E or verify-stage smoke tests, do not run from the repository root when the goal is to simulate a generated project. Create an isolated demo workspace first, typically with:

./scripts/e2e-demo-simple-project.sh

For a fast smoke test that exercises the full pipeline through Verify using a minimal feature prompt and mock LLM API:

./scripts/e2e-verify-smoke.sh
./scripts/e2e-verify-smoke.sh --agent claude   # use Claude Code adapter instead of mock

Both modes route all LLM calls (hook verifier and stage-agent) through the bundled mock API server, so no real API keys are needed.

Release helper:

./scripts/publish.sh patch  # or minor, major

The script requires a clean git working tree, bumps the package version, refreshes uv.lock, builds and publishes the package, then creates a release commit and annotated git tag like v0.7.1.

Repository Layout

src/cybervisor/        Core CLI package (split into focused subpackages)
  cli/                 CLI entry point (commands, parser, instance, docs)
  client/              Daemon WebSocket client (commands, connection, rendering)
  pipeline/            Pipeline execution (runner, artifacts, contract)
  server/              Daemon WebSocket server (daemon, handlers, tasks, cleanup)
  core_hooks/          Hook runtime (contracts, streaming, verifier, runner)
  adapters/            Agent adapter registry and tool-specific adapters
  config.py            YAML config loading (PipelineConfig, stage contracts)
  cli.py, client.py,   Thin backward-compatible re-exports
  pipeline.py, server.py
  hooks.py             Hook installer and runtime config
  agent_hook.py        Packaged cybervisor-agent-hook entry point
  preflight.py         Dependency pre-check
  signals.py           Signal handler
  logging.py           Structured logging
  init.py              Scaffold installer (`cybervisor init`)
  doctor.py            Verifier readiness check (`cybervisor doctor`)
  global_config.py     ~/.cybervisor/config.yaml loader
  skills.py            Project-local skill disable/restore
  upgrade.py           Background version-check
assets/hooks/          Hook prompt assets and fixtures
scripts/               Demo and utility scripts
templates/demo/        Demo project scaffold
tests/                 Unit and integration coverage
.specify/              Constitution and repo-specific scripts
AGENTS.md              Symlink to constitution
GEMINI.md              Symlink to AGENTS.md
CLAUDE.md              Symlink to AGENTS.md
.cybervisor/           Runtime state (instance.lock, daemon.lock, hooks/, logs/)

