
Spindle

MCP server for multi-harness AI agent delegation. Spawn background agents (Claude Code, Codex, Gemini, Kimi) that run asynchronously, with optional git worktree isolation for safe parallel work.

Features

  • Async agent spawning - Fire-and-forget pattern with spool IDs
  • Optional blocking with gather/yield - Spins are non-blocking by default, so the parent agent can keep working; when you do want results, gather waits for all of them at once, while yield streams each result as agents complete
  • Permission profiles - Control what tools child agents can use (readonly, careful, full)
  • Shard isolation - Run agents in sandboxed git worktrees to prevent conflicts
  • Model selection - Route tasks to different models per-agent
  • Session continuity - Resume conversations with child agents (auto-recovers expired sessions)
  • Rich querying - Search, filter, peek at running output, export results

Requirements

  • Python 3.10+
  • Claude CLI installed and authenticated
  • Git (for shard/worktree functionality)

Install

pip install spindle-mcp

Add to Claude Code's MCP config (~/.claude.json):

{
  "mcpServers": {
    "spindle": {
      "command": "spindle"
    }
  }
}

Usage

Basic: Spawn and collect

# Spawn an agent
spool_id = spin("Research the Python GIL")

# Do other work...

# Check result
result = unspool(spool_id)

Permission profiles

Control what tools the spawned agent can use:

# Read-only: Can only search and read
spin("Analyze the codebase", permission="readonly")

# Careful (default): Can read/write but limited bash
spin("Fix this bug", permission="careful")

# Full access: No restrictions
spin("Implement the feature", permission="full")

# Shard: Full access + auto-isolated worktree (common for risky work)
spin("Refactor the auth system", permission="shard")

# Careful + shard: Limited tools but isolated
spin("Update configs", permission="careful+shard")

Profiles:

  • readonly: Read, Grep, Glob, safe bash (ls, cat, git status/log/diff)
  • careful: Read, Write, Edit, Grep, Glob, bash for git/make/pytest/python/npm
  • full: No restrictions
  • shard: Full access + auto-creates isolated worktree
  • careful+shard: Careful permissions + auto-creates isolated worktree

You can also pass explicit allowed_tools to override the profile.
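The override precedence can be pictured with a small sketch (illustrative only, not Spindle's actual code; the profile contents are taken from the list above):

```python
# Illustrative sketch of permission-profile resolution: an explicit
# allowed_tools list always wins over the named profile.
PROFILES = {
    "readonly": ["Read", "Grep", "Glob", "Bash(ls,cat,git status,git log,git diff)"],
    "careful": ["Read", "Write", "Edit", "Grep", "Glob", "Bash(git,make,pytest,python,npm)"],
    "full": None,  # None means no restrictions
}

def resolve_tools(permission="careful", allowed_tools=None):
    """Return the effective tool list; explicit allowed_tools overrides the profile."""
    if allowed_tools is not None:
        return allowed_tools
    # "+shard" only adds worktree isolation; it doesn't change the tool set
    return PROFILES.get(permission.replace("+shard", ""), None)
```

So `resolve_tools("careful+shard")` yields the careful tool set, while `resolve_tools(allowed_tools=["Read"])` ignores the profile entirely.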

Isolated workspaces with shards

Run agents in isolated git worktrees to prevent conflicts:

# Agent works in its own worktree
spool_id = spin("Refactor auth module", shard=True)

# Check shard status
shard_status(spool_id)

# Merge changes back when done
shard_merge(spool_id)

# Or discard if not needed
shard_abandon(spool_id)

Shards create a git worktree plus a branch. If SKEIN is available, Spindle uses skein shard spawn for richer tracking; otherwise it falls back to a plain git worktree.

Wait for completion

# Spawn multiple agents
id1 = spin("Find all TODO comments")
id2 = spin("List unused imports")
id3 = spin("Check for type errors")

# Gather: block until all complete, get all results
results = spin_wait(f"{id1},{id2},{id3}", mode="gather")

# Yield: return as each completes
# Great when results are independent - process each as it lands
result = spin_wait(f"{id1},{id2},{id3}", mode="yield")  # Returns first to finish

# With timeout
results = spin_wait(f"{id1},{id2}", mode="gather", timeout=300)

Yield mode keeps you responsive instead of blocking on the slowest agent.
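The gather/yield distinction mirrors Python's own concurrent.futures: gather is like collecting every future's result in order, yield is like as_completed. A standalone analogy (not Spindle's implementation):

```python
# Standalone analogy for gather vs. yield semantics (not Spindle code).
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def agent(task, delay):
    time.sleep(delay)
    return f"{task}: done"

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(agent, t, d) for t, d in
               [("todos", 0.03), ("imports", 0.01), ("types", 0.02)]]

    # mode="yield": handle each result as it lands, fastest first
    for fut in as_completed(futures):
        print(fut.result())

    # mode="gather": one ordered list, available only once the slowest finishes
    gathered = [f.result() for f in futures]
```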

Time-based waiting

Simple timed waiting with spin_sleep:

spin_sleep("90m")       # Sleep for 90 minutes
spin_sleep("2h")        # Sleep for 2 hours
spin_sleep("30s")       # Sleep for 30 seconds
spin_sleep("06:00")     # Wait until 6 AM

Or use spin_wait with the time parameter:

spin_wait(time="90m")
spin_wait(time="06:00")  # Handles next-day wraparound

Useful for periodic check-in loops (e.g., QM/dancing partner patterns).
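The duration strings above could be handled with logic like this (an illustrative sketch, not Spindle's parser; the HH:MM branch computes the seconds until the next occurrence of that wall-clock time, including the next-day wraparound):

```python
# Illustrative parser for spin_sleep-style wait specs (not Spindle's code).
import re
from datetime import datetime, timedelta

def parse_wait(spec, now=None):
    """Return seconds to wait for '30s', '90m', '2h', or 'HH:MM' (next occurrence)."""
    m = re.fullmatch(r"(\d+)([smh])", spec)
    if m:
        n, unit = int(m.group(1)), m.group(2)
        return n * {"s": 1, "m": 60, "h": 3600}[unit]
    hh, mm = map(int, spec.split(":"))
    now = now or datetime.now()
    target = now.replace(hour=hh, minute=mm, second=0, microsecond=0)
    if target <= now:           # already past today: wrap to tomorrow
        target += timedelta(days=1)
    return (target - now).total_seconds()
```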

Model selection and timeouts

# Route quick tasks to haiku (fast, cheap)
spin("Summarize this file", model="haiku")

# Complex work to opus
spin("Design the new architecture", model="opus")

# Auto-kill if it takes too long
spin("Should be quick", timeout=60)

Continue a session

# Get session ID from completed spool
result = unspool(spool_id)  # includes session_id

# Continue that conversation
new_id = respin(session_id, "Follow up question")

If the session has expired on Claude's end, respin automatically falls back to transcript injection to recreate context.

Cancel running work

spin_drop(spool_id)

List all spools

spools()

Search and filter

# Search prompts and results
spool_search("authentication")

# Filter by status and time
spool_results(status="error", since="1h")

# Regex search results
spool_grep("error|failed|exception")

# Get statistics
spool_stats()

# Export to file
spool_export("all", format="md")

Multi-Harness Support

Spindle supports multiple AI agent harnesses, allowing you to choose the best tool for each task.

Available Harnesses

Claude Code (default) - Anthropic's Claude models via claude CLI

  • Superior code understanding and reasoning
  • Best for complex refactoring, architecture decisions
  • Slower startup (~3-4 minutes to first response)
  • Use harness="claude-code" or omit harness parameter

Codex CLI - OpenAI's GPT-5 Codex models via codex CLI

  • Extremely fast startup (~10 seconds to first response)
  • Good for quick edits, simple tasks, prototyping
  • Requires ChatGPT Plus/Pro/Enterprise
  • Use harness="codex"

Gemini CLI - Google's Gemini models via gemini CLI

  • Fast startup (~5-10 seconds to first response)
  • Full agent with tool use, file access, multi-step reasoning
  • Generous free tier (1000 req/day with Google account)
  • Models: "flash", "pro", or any full model name
  • Use harness="gemini"

Kimi CLI - Moonshot AI's Kimi models via kimi-cli

  • Fast startup (~5-10 seconds to first response)
  • Thinking mode for complex reasoning
  • Models: "thinking", "thinking-turbo", "turbo", "latest", or any full model name
  • Use harness="kimi"

Basic Usage

# Claude Code (default) - best for complex work
spool_id = spin("Refactor the auth module to use dependency injection")

# Codex CLI - fast for simple tasks
spool_id = spin(
    prompt="Add error handling to this function",
    harness="codex",
    working_dir="/path/to/project"
)

# Gemini CLI - fast with free tier
spool_id = spin(
    prompt="Summarize this codebase",
    harness="gemini",
    working_dir="/path/to/project"
)

# Kimi CLI - fast reasoning with thinking mode
spool_id = spin(
    prompt="Analyze this bug",
    harness="kimi",
    working_dir="/path/to/project"
)

# All harnesses use the same API
result = unspool(spool_id)  # Auto-detects harness

Choosing a Harness

Use Claude Code when:

  • Task requires deep reasoning or architecture decisions
  • Working on complex refactoring across multiple files
  • Need thorough code review or analysis

Use Codex when:

  • Need quick edits or simple implementations
  • Prototyping or exploring ideas rapidly

Use Gemini when:

  • Want fast results without API key management (Google account login)
  • Running many parallel tasks on a budget (free tier)
  • Need a quick general-purpose agent

Use Kimi when:

  • Need thinking mode for complex reasoning at speed
  • Want fast startup with strong reasoning capabilities

Requirements

Claude Code:

  • Claude CLI installed and authenticated

Codex CLI:

  • Codex CLI installed (npm i -g @openai/codex)
  • ChatGPT Plus/Pro/Enterprise subscription

Gemini CLI:

  • Gemini CLI installed (npm i -g @google/gemini-cli)
  • Google account login (gemini → "Login with Google") or GEMINI_API_KEY env var

Kimi CLI:

  • Kimi CLI installed (pip install kimi-cli)
  • Auth via kimi-cli login or API key in ~/.kimi/config.toml

See docs/MULTI_HARNESS_GUIDE.md and docs/CODEX_SETUP.md for detailed documentation.

API

Unified API (works with all harnesses)

| Tool | Purpose |
|------|---------|
| `spin(prompt, permission?, shard?, system_prompt?, working_dir?, allowed_tools?, tags?, model?, timeout?, harness?)` | Spawn agent, return spool_id |
| `unspool(spool_id)` | Get result (auto-detects harness, non-blocking) |
| `respin(session_id, prompt)` | Continue session (auto-detects harness) |

spin() parameters:

  • prompt (required): The task for the agent
  • harness (optional): "claude-code" (default), "codex", "gemini", or "kimi"
  • working_dir (optional for Claude, required for Codex/Gemini/Kimi): Project directory
  • permission (optional): "readonly", "careful" (default), "full", "shard", "careful+shard"
  • model (optional): Model to use ("sonnet", "opus", "haiku" for Claude; "flash", "pro" for Gemini; "thinking", "turbo" for Kimi)
  • timeout (optional): Auto-kill after N seconds
  • tags (optional): Comma-separated tags for organization
  • shard (optional): Create isolated git worktree (can also use permission="shard")
  • system_prompt (optional): Custom system prompt for Claude Code
  • allowed_tools (optional): Override permission profile with explicit tool list

Spool Management (works with all harnesses)

| Tool | Purpose |
|------|---------|
| `spools()` | List all spools |
| `spin_wait(spool_ids?, mode?, timeout?, time?)` | Block until spools complete, or wait for a duration |
| `spin_sleep(duration)` | Sleep for a duration (`90m`, `2h`, `30s`, `HH:MM`) |
| `spin_drop(spool_id)` | Cancel by killing the process |
| `spool_search(query, field?)` | Search prompts/results |
| `spool_results(status?, since?, limit?)` | Bulk fetch with filters |
| `spool_grep(pattern)` | Regex search over results |
| `spool_retry(spool_id)` | Re-run with the same params |
| `spool_peek(spool_id, lines?)` | See partial output while running |
| `spool_dashboard()` | Overview of running/complete/needs-attention |
| `spool_stats()` | Get summary statistics |
| `spin_harnesses()` | List available harnesses, models, and defaults |
| `spool_export(spool_ids, format?, output_path?)` | Export to file |
| `shard_status(spool_id)` | Check shard worktree status |
| `shard_merge(spool_id, keep_branch?)` | Merge shard to master |
| `shard_abandon(spool_id, keep_branch?)` | Discard shard |

Storage

Spools persist to ~/.spindle/spools/{spool_id}.json:

{
  "id": "abc12345",
  "status": "complete",
  "prompt": "...",
  "result": "...",
  "session_id": "...",
  "permission": "careful",
  "allowed_tools": "...",
  "tags": ["batch-1"],
  "shard": {
    "worktree_path": "/path/to/worktrees/abc12345-...",
    "branch_name": "shard-abc12345-...",
    "shard_id": "..."
  },
  "pid": 12345,
  "created_at": "2025-11-26T...",
  "completed_at": "2025-11-26T..."
}

CLI Commands

spindle install-service  # Install background service (Linux/macOS)
spindle start            # Start via systemd (or background if no service)
spindle reload           # Restart via systemd to pick up code changes
spindle status           # Check if running (hits /health endpoint)
spindle serve --http     # Run MCP server directly

Background Service

For persistent background operation:

# Install and enable the service (Linux or macOS)
spindle install-service

# Start it
spindle start

Linux: Writes a systemd user service to ~/.config/systemd/user/spindle.service

macOS: Writes a launchd plist to ~/Library/LaunchAgents/com.spindle.server.plist and loads it immediately

Use --force to overwrite an existing service file. Then spindle reload restarts the service to pick up code changes.

Windows

On Windows, run spindle manually:

spindle serve --http

Or use NSSM to create a Windows service.

WSL

In WSL2 with systemd enabled, spindle install-service works like native Linux. If systemd isn't enabled, you'll get instructions to enable it or run manually.

Hot Reload (MCP tool)

From within Claude Code, call spindle_reload() to restart the server and pick up code changes.

Configuration

Environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `SPINDLE_MAX_CONCURRENT` | 15 | Maximum concurrent spools |

Storage location: ~/.spindle/spools/

How It Works

  1. spin() spawns a detached CLI process (claude, codex, gemini, or kimi-cli) with the given prompt
  2. The process runs in background, writing output to temporary files
  3. A monitor thread polls for completion
  4. unspool() returns the result once complete (non-blocking check)
  5. Spool metadata persists to JSON files, surviving server restarts
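Steps 1-4 boil down to the classic spawn-and-poll pattern, which can be sketched in a few lines (illustrative only, not Spindle's code):

```python
# Minimal spawn-and-poll sketch: launch a detached process, send its
# output to a file, and check for completion without blocking.
import os
import subprocess
import sys
import tempfile
import time

out = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
proc = subprocess.Popen(
    [sys.executable, "-c", "print('result: done')"],
    stdout=out, stderr=subprocess.STDOUT,
    start_new_session=True,   # detach from the parent's process group
)

# Non-blocking completion check, like unspool(): poll() is None while running.
while proc.poll() is None:
    time.sleep(0.05)

out.close()
result = open(out.name).read().strip()
os.unlink(out.name)
```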

For shards:

  1. A git worktree is created with a new branch
  2. The agent runs inside that worktree
  3. After completion, merge back with shard_merge() or discard with shard_abandon()

Limits

  • Max 15 concurrent spools (configurable via SPINDLE_MAX_CONCURRENT)
  • 24h auto-cleanup of old spools
  • Orphaned spools (dead process) marked as error on restart

Contributing

See CONTRIBUTING.md for development setup and guidelines.

License

MIT - see LICENSE.
