zwarm

Multi-agent CLI orchestration research platform. Coordinate multiple coding agents (Codex, Claude Code) with delegation, conversation, and trajectory alignment.

Installation

# From the workspace (recommended during development)
cd /path/to/labs
uv sync

# Or install directly
uv pip install -e ./zwarm

Requirements:

  • Python 3.13+
  • codex CLI installed (for Codex adapter)
  • claude CLI installed (for Claude Code adapter)

Quick Start

# 1. Test an executor directly
zwarm exec --task "What is 2+2?"

# 2. Run the orchestrator with a task
zwarm orchestrate --task "Create a hello world Python function"

# 3. Check state after running
zwarm status

# 4. View event history
zwarm history

Task Input Options

# Direct task
zwarm orchestrate --task "Build a REST API"

# From file
zwarm orchestrate --task-file task.md

# From stdin
echo "Fix the bug in auth.py" | zwarm orchestrate
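The three input routes above could be resolved in a single helper. The following is a minimal sketch, not zwarm's actual implementation: the function name `resolve_task` and the precedence it applies (flag first, then file, then stdin) are assumptions made for illustration.

```python
import io
from pathlib import Path
from typing import Optional, TextIO


def resolve_task(
    task: Optional[str],
    task_file: Optional[Path],
    stdin: TextIO,
) -> str:
    """Return the task text from --task, --task-file, or stdin (assumed order)."""
    if task:
        return task.strip()
    if task_file is not None:
        return task_file.read_text(encoding="utf-8").strip()
    # Fall back to piped input, e.g. `echo "..." | zwarm orchestrate`.
    piped = stdin.read().strip()
    if not piped:
        raise ValueError("no task given: pass --task, --task-file, or pipe text on stdin")
    return piped


# Simulate piped input with an in-memory stream.
piped_task = resolve_task(None, None, io.StringIO("Fix the bug in auth.py\n"))
```

Passing the stream in as a parameter, rather than reading `sys.stdin` directly, keeps the helper easy to exercise without a real pipe.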

Configuration

zwarm looks for configuration in this order:

  1. --config flag (YAML file)
  2. config.toml in working directory
  3. Default settings
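The lookup order above can be sketched as a small function. This is an illustration under stated assumptions, not the code in `core/config.py`: the name `find_config` is hypothetical, and returning `None` is just one way to signal "use the default settings".

```python
from pathlib import Path
from typing import Optional


def find_config(cli_config: Optional[str], cwd: Path) -> Optional[Path]:
    """Pick the config source in the documented order; None means defaults."""
    if cli_config:                       # 1. --config flag (YAML file)
        return Path(cli_config)
    toml_path = cwd / "config.toml"      # 2. config.toml in working directory
    if toml_path.is_file():
        return toml_path
    return None                          # 3. default settings
```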

Minimal config.toml

[weave]
enabled = true
project = "your-wandb-entity/zwarm"

[executor]
adapter = "codex_mcp"  # or "claude_code"

Environment Variables

# Enable Weave tracing (alternative to config.toml)
export WEAVE_PROJECT="your-entity/zwarm"

# Required for adapters
export OPENAI_API_KEY="..."      # for Codex
export ANTHROPIC_API_KEY="..."   # for Claude Code

Full Configuration Reference

# config.yaml
orchestrator:
  max_steps: 100              # Maximum orchestrator steps

executor:
  adapter: codex_mcp          # Default adapter: codex_mcp | claude_code
  model: null                 # Model override (adapter-specific)
  sandbox: workspace-write    # Codex sandbox mode

weave:
  enabled: true
  project: your-entity/zwarm

state_dir: .zwarm             # State directory for sessions/events

watchers:
  enabled: []                 # List of enabled watchers
  config:
    progress:
      stuck_threshold: 5
    budget:
      max_steps: 50
      max_sessions: 10
    scope:
      keywords: []
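One way to think about this reference is as a set of defaults that a user config is layered over. The sketch below assumes the values shown above are the defaults (the README only says so explicitly for the adapter) and uses a hypothetical `deep_merge` helper; zwarm's real loader may work differently.

```python
from copy import deepcopy
from typing import Any

# Values copied from the reference above; treating them as defaults is an assumption.
DEFAULTS: dict[str, Any] = {
    "orchestrator": {"max_steps": 100},
    "executor": {"adapter": "codex_mcp", "model": None, "sandbox": "workspace-write"},
    "weave": {"enabled": True, "project": "your-entity/zwarm"},
    "state_dir": ".zwarm",
    "watchers": {
        "enabled": [],
        "config": {
            "progress": {"stuck_threshold": 5},
            "budget": {"max_steps": 50, "max_sessions": 10},
            "scope": {"keywords": []},
        },
    },
}


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of base with override applied, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# A user config that only changes the adapter and enables one watcher.
user = {"executor": {"adapter": "claude_code"}, "watchers": {"enabled": ["budget"]}}
config = deep_merge(DEFAULTS, user)
```

Keys the user does not mention, such as `executor.sandbox`, keep their values from `DEFAULTS`, and `DEFAULTS` itself is left unmodified.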

Adapters

zwarm supports multiple CLI coding agents through adapters.

Codex MCP (default)

Uses Codex through its MCP server, which allows multi-turn conversational sessions.

# Sync mode (conversational)
zwarm exec --adapter codex_mcp --task "Add a login function"

# The orchestrator can have back-and-forth conversations
# using delegate() and converse() tools

Requires: codex CLI installed, OPENAI_API_KEY set

Claude Code

Uses Claude Code CLI for execution.

zwarm exec --adapter claude_code --task "Fix the type errors"

Requires: claude CLI installed, authenticated
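The adapter requirements listed above can be checked before a run. The following preflight sketch is not part of zwarm: `REQUIREMENTS` and `missing_requirements` are hypothetical names, and Claude Code authentication is deliberately not checked because this README does not say how it is verified.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Binary and environment variable per adapter, taken from the "Requires" notes.
# The env entry is None for claude_code: its authentication check is not specified.
REQUIREMENTS: dict[str, dict[str, Optional[str]]] = {
    "codex_mcp": {"binary": "codex", "env": "OPENAI_API_KEY"},
    "claude_code": {"binary": "claude", "env": None},
}


def missing_requirements(
    adapter: str,
    which: Callable[[str], Optional[str]] = shutil.which,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return human-readable problems; an empty list means the adapter looks usable."""
    req = REQUIREMENTS[adapter]
    problems: list[str] = []
    if which(req["binary"]) is None:
        problems.append(f"{req['binary']} CLI not found on PATH")
    if req["env"] and not env.get(req["env"]):
        problems.append(f"{req['env']} is not set")
    return problems
```

Calling `missing_requirements("codex_mcp")` with the default arguments inspects the real `PATH` and environment; the `which` and `env` parameters exist so the check can be exercised in isolation.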

Watchers (Trajectory Alignment)

Watchers are composable guardrails that monitor agent behavior and can intervene when an agent gets stuck, exceeds its budget, or drifts from the original task.

Available Watchers

Watcher    Description
progress   Detects stuck/spinning agents
budget     Monitors step/session limits
scope      Detects scope creep from original task
pattern    Custom regex pattern matching
quality    Code quality checks

Enabling Watchers

# config.yaml
watchers:
  enabled:
    - progress
    - budget
    - scope
  config:
    progress:
      stuck_threshold: 5      # Flag after 5 similar steps
    budget:
      max_steps: 50
      max_sessions: 10
    scope:
      keywords:
        - "refactor"
        - "rewrite"

Watcher Actions

Watchers can return different actions:

  • continue - Keep going
  • warn - Log warning but continue
  • pause - Pause for human review
  • stop - Stop the orchestrator
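When several watchers run on the same step, their actions have to be reconciled somehow. This README does not say how zwarm does it; the sketch below assumes the most severe action wins, using a hypothetical `Action` enum ordered from `continue` to `stop`.

```python
from enum import IntEnum


class Action(IntEnum):
    """The four documented actions, ordered by assumed severity."""

    CONTINUE = 0
    WARN = 1
    PAUSE = 2
    STOP = 3


def combine(actions: list[Action]) -> Action:
    """Most severe action wins; with no watchers enabled, keep going."""
    return max(actions, default=Action.CONTINUE)


decision = combine([Action.WARN, Action.PAUSE, Action.CONTINUE])
# decision is Action.PAUSE
```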

Weave Integration

zwarm integrates with Weave for tracing and observability.

Enabling Weave

# Via environment variable
export WEAVE_PROJECT="your-entity/zwarm"

# Or via config.toml
[weave]
enabled = true
project = "your-entity/zwarm"

What Gets Traced

  • Orchestrator step() calls with tool inputs/outputs
  • Individual adapter calls (_call_codex, _call_claude)
  • Delegation tools (delegate, converse, end_session)
  • All tool executions

View traces at: https://wandb.ai/your-entity/zwarm/weave

CLI Reference

orchestrate

Start an orchestrator session to delegate tasks.

zwarm orchestrate [OPTIONS]

Options:
  -t, --task TEXT           Task description
  -f, --task-file PATH      Read task from file
  -c, --config PATH         Config file (YAML)
  --adapter TEXT            Executor adapter override
  --resume                  Resume from previous state
  --set KEY=VALUE           Override config values
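The exact syntax accepted by `--set` is not documented here. As a hedged illustration only, the sketch below assumes dotted keys address nested config tables (for example `executor.adapter=claude_code`) and that values are parsed as JSON when possible; `apply_override` is a hypothetical helper, not zwarm's parser.

```python
import json
from typing import Any


def apply_override(config: dict[str, Any], assignment: str) -> None:
    """Apply one KEY=VALUE override in place; dotted keys are an assumed convention."""
    key, sep, raw = assignment.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=VALUE, got {assignment!r}")
    try:
        value: Any = json.loads(raw)     # numbers, booleans, null, quoted strings
    except json.JSONDecodeError:
        value = raw                      # bare words stay plain strings
    *parents, leaf = key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})  # create intermediate tables as needed
    node[leaf] = value


cfg: dict[str, Any] = {"executor": {"adapter": "codex_mcp"}}
apply_override(cfg, "executor.adapter=claude_code")
apply_override(cfg, "orchestrator.max_steps=20")
```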

exec

Run a single executor directly (for testing).

zwarm exec [OPTIONS]

Options:
  -t, --task TEXT           Task to execute
  -f, --task-file PATH      Read task from file
  --adapter TEXT            Adapter to use [default: codex_mcp]
  --model TEXT              Model override
  --mode [sync|async]       Execution mode [default: sync]

status

Show current orchestrator state.

zwarm status [OPTIONS]

Options:
  --sessions                Show session details
  --tasks                   Show task details
  --json                    Output as JSON

history

Show event history.

zwarm history [OPTIONS]

Options:
  -n, --limit INTEGER       Number of events [default: 20]
  --session TEXT            Filter by session ID
  --json                    Output as JSON

configs

Manage configuration files.

zwarm configs list          # List available configs
zwarm configs show NAME     # Show config contents

Architecture

┌─────────────────────────────────────────────────────────┐
│                     Orchestrator                         │
│  (Plans, delegates, supervises - does NOT write code)   │
├─────────────────────────────────────────────────────────┤
│                    Delegation Tools                      │
│   delegate() | converse() | check_session() | bash()    │
└───────────────┬─────────────────────┬───────────────────┘
                │                     │
        ┌───────▼───────┐     ┌───────▼───────┐
        │  Codex MCP    │     │  Claude Code  │
        │   Adapter     │     │    Adapter    │
        └───────┬───────┘     └───────┬───────┘
                │                     │
        ┌───────▼───────┐     ┌───────▼───────┐
        │    codex      │     │    claude     │
        │  mcp-server   │     │     CLI       │
        └───────────────┘     └───────────────┘

Key Concepts

  • Orchestrator: Plans and delegates but never writes code directly
  • Executors: CLI agents (Codex, Claude) that do the actual coding
  • Sessions: Conversations with executors (sync or async)
  • Watchers: Trajectory aligners that monitor and intervene
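The relationship between these concepts can be sketched with a toy executor. Everything below is illustrative: `Session`, `Executor`, and `EchoExecutor` are hypothetical, and the signatures given to `delegate` and `converse` are assumptions rather than the real tools in `tools/delegation.py`. A real adapter would call the codex or claude CLI instead of echoing.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Session:
    """A conversation with one executor, stored as (role, text) pairs."""

    session_id: str
    messages: list[tuple[str, str]] = field(default_factory=list)


class Executor(Protocol):
    """What the orchestrator needs from any adapter (assumed shape)."""

    def start(self, task: str) -> Session: ...
    def send(self, session: Session, message: str) -> str: ...


class EchoExecutor:
    """Stand-in executor that acknowledges messages instead of writing code."""

    def __init__(self) -> None:
        self._count = 0

    def start(self, task: str) -> Session:
        self._count += 1
        session = Session(session_id=f"s{self._count}")
        session.messages += [("user", task), ("assistant", f"done: {task}")]
        return session

    def send(self, session: Session, message: str) -> str:
        reply = f"ack: {message}"
        session.messages += [("user", message), ("assistant", reply)]
        return reply


def delegate(executor: Executor, task: str) -> Session:
    """The orchestrator hands a task to an executor and gets a session back."""
    return executor.start(task)


def converse(executor: Executor, session: Session, message: str) -> str:
    """The orchestrator follows up inside an existing session."""
    return executor.send(session, message)


executor = EchoExecutor()
session = delegate(executor, "Add a login function")
reply = converse(executor, session, "add tests")
```

The orchestrator side only ever touches the `Executor` interface, which mirrors the rule that it plans and delegates but does not write code itself.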

State Management

All state is stored in flat files under .zwarm/:

.zwarm/
├── state.json              # Current state
├── events.jsonl            # Append-only event log
├── sessions/
│   └── <session-id>/
│       ├── messages.json   # Conversation history
│       └── metadata.json   # Session info
└── orchestrator/
    └── messages.json       # Orchestrator history (for resume)
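An append-only JSON Lines log such as `events.jsonl` is simple to write and to tail. The sketch below shows the general technique with hypothetical helpers; the event fields (`kind`, `index`) are invented for the example and do not reflect zwarm's actual event schema.

```python
import json
from collections import deque
from pathlib import Path
from typing import Any


def append_event(log_path: Path, event: dict[str, Any]) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def tail_events(log_path: Path, limit: int = 20) -> list[dict[str, Any]]:
    """Return the last `limit` events in the order they were written."""
    with log_path.open(encoding="utf-8") as fh:
        # A bounded deque keeps only the most recent lines while streaming the file.
        last_lines = deque((line for line in fh if line.strip()), maxlen=limit)
    return [json.loads(line) for line in last_lines]
```

Because each event is one line, a reader can recover the most recent history without parsing the whole file into memory, which is the kind of access a command like `zwarm history -n 20` needs.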

Development

Running Tests

# From workspace root
uv run pytest wbal/tests/ -v

# zwarm doesn't have its own tests yet

Project Structure

zwarm/
├── src/zwarm/
│   ├── adapters/           # Executor adapters
│   │   ├── base.py         # ExecutorAdapter protocol
│   │   ├── codex_mcp.py    # Codex MCP adapter
│   │   └── claude_code.py  # Claude Code adapter
│   ├── cli/
│   │   └── main.py         # Typer CLI
│   ├── core/
│   │   ├── config.py       # Configuration loading
│   │   ├── models.py       # ConversationSession, Message, etc.
│   │   └── state.py        # Flat-file state management
│   ├── tools/
│   │   └── delegation.py   # delegate, converse, etc.
│   ├── watchers/
│   │   ├── base.py         # Watcher protocol
│   │   ├── builtin.py      # Built-in watchers
│   │   └── manager.py      # WatcherManager
│   ├── prompts/
│   │   └── orchestrator.py # Orchestrator system prompt
│   └── orchestrator.py     # Main Orchestrator class
├── configs/                # Example configurations
├── README.md
└── pyproject.toml

Research Context

zwarm is a research platform exploring:

  1. Agent reliability - Can orchestrators reliably delegate and verify work?
  2. Agent meta-capability - Can agents effectively use other agents?
  3. Long-running agents - Can agents run for days, not hours?

See ZWARM_PLAN.md for detailed design documentation.

License

Research project - see repository license.
