subagent-fleet
Run Claude Code-style subagents across your local model fleet.
subagent-fleet is a config-first Python CLI for mapping coding subagents to the best Ollama model and machine you own, then generating LiteLLM and Claude Code-style agent configuration.
Quickstart • Configuration • Generated Files • Security • Roadmap
Overview
Local model users often have more than one useful machine: a laptop, a Mac mini, a workstation, a home server, or a spare GPU box. Most coding harnesses still point at one model endpoint.
subagent-fleet turns that setup into a private local subagent fleet:
planner -> small fast model on a lightweight node
implementer -> larger coding model on a bigger node
reviewer -> larger coding model on a bigger node
summarizer -> small local model on the controller
It does not replace Ollama, LiteLLM, or Claude Code. It generates the glue between them:
Claude Code / coding harness
        |
        v
LiteLLM gateway generated by subagent-fleet
        |
        +-- Ollama node: laptop
        +-- Ollama node: Mac mini 64GB
        +-- Ollama node: workstation
Features
- Validate a declarative fleet.yaml.
- Discover models from configured Ollama nodes via /api/tags.
- Generate litellm_config.yaml with ollama_chat/ routes.
- Generate Claude Code-style .claude/agents/*.md files.
- Generate .env.subagent-fleet for Claude Code/LiteLLM environment variables.
- Warm configured Ollama models with keep_alive.
- Show node health and agent routing tables.
- Keep unreachable nodes isolated so one offline machine does not crash the whole workflow.
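Two of the features above, warming models with keep_alive and isolating unreachable nodes, can be sketched together. This is a minimal illustration and not necessarily how subagent-fleet implements warmup: the helper names (build_warmup_request, post_json, warm_fleet) are hypothetical, and it assumes Ollama's /api/generate endpoint, which loads a model when it receives a request without a prompt.

```python
from __future__ import annotations

import json
import urllib.request


def build_warmup_request(model: str, keep_alive: int | str = -1) -> dict:
    """Body for Ollama's /api/generate; with no prompt, Ollama only loads the model."""
    return {"model": model, "keep_alive": keep_alive}


def post_json(url: str, body: dict, timeout: float = 30.0) -> None:
    """POST a JSON body using only the standard library."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()


def warm_fleet(targets: list[tuple[str, str]], post=post_json) -> dict:
    """Warm each (endpoint, model) pair; a failing node is recorded, not raised."""
    results = {}
    for endpoint, model in targets:
        try:
            post(f"{endpoint}/api/generate", build_warmup_request(model))
            results[(endpoint, model)] = "ok"
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            results[(endpoint, model)] = f"unreachable: {exc}"
    return results
```

Catching the error per target is what keeps one offline machine from aborting the rest of the loop; the post parameter is injectable only so the sketch can be exercised without a live node.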
Status
MVP CLI implemented.
Available commands:
subagent-fleet init
subagent-fleet validate
subagent-fleet discover
subagent-fleet generate
subagent-fleet warmup
subagent-fleet status
subagent-fleet doctor
subagent-fleet clean
subagent-fleet skills list
subagent-fleet skills install
subagent-fleet plugins install
Install
Choose one of the install paths below.
CLI from PyPI
Install the CLI directly from PyPI:
python -m pip install subagent-fleet
Or install it as an isolated command with pipx:
pipx install subagent-fleet
Verify:
subagent-fleet --help
Development Checkout
Use this when contributing to the project:
git clone https://github.com/adityak74/subagent-fleet.git
cd subagent-fleet
python -m pip install -e ".[dev]"
Run tests:
python -m pytest
Claude Code Plugin First
Install the plugin first from Claude Code, then let the bundled bootstrap skill install the CLI:
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
After install, ask Claude Code:
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
The bootstrap skill will run or recommend:
python -m pip install subagent-fleet
subagent-fleet skills install
Codex Plugin First
Install this repository as a local Codex marketplace:
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
Then ask Codex:
Use the subagent-fleet bootstrap skill to install the CLI and set up this repo.
Quickstart
Create a starter config:
subagent-fleet init
Edit fleet.yaml with your Ollama node endpoints and model names, then validate it:
subagent-fleet validate
Check which nodes are reachable:
subagent-fleet discover
Generate LiteLLM, Claude agent, and environment files:
subagent-fleet generate
Start LiteLLM:
export LITELLM_MASTER_KEY="sk-local-dev"
litellm \
--config ./litellm_config.yaml \
--host 127.0.0.1 \
--port 4000
Point Claude Code at the local gateway:
source .env.subagent-fleet
claude
Configuration
subagent-fleet is driven by fleet.yaml.
project:
  name: local-dev
gateway:
  provider: litellm
  host: 127.0.0.1
  port: 4000
  master_key_env: LITELLM_MASTER_KEY
nodes:
  m5-local:
    endpoint: http://localhost:11434
    tags: [controller, local, fast]
  m4-mini-64gb:
    endpoint: http://192.168.1.50:11434
    tags: [heavy, coder, reviewer]
  m4-mini-16gb:
    endpoint: http://192.168.1.51:11434
    tags: [small, planner, summarizer]
models:
  heavy-coder:
    node: m4-mini-64gb
    ollama_model: qwen2.5-coder:32b
    litellm_alias: claude-sonnet-local
    context: 32768
    timeout: 600
    max_parallel: 1
  small-coder:
    node: m4-mini-16gb
    ollama_model: qwen2.5-coder:7b
    litellm_alias: claude-haiku-local
    context: 8192
    timeout: 300
    max_parallel: 1
agents:
  planner:
    model: small-coder
    description: Use for planning, file discovery, task decomposition, and summarization.
    tools: [Read, Grep, Glob]
    prompt: |
      You are a fast local planning agent.
      Do not edit files.
      Return a concise response with:
      - plan
      - relevant files
      - risks
      - next recommended agent
  implementer:
    model: heavy-coder
    description: Use for implementation, bug fixes, refactors, and patch creation.
    tools: [Read, Grep, Glob, Edit, MultiEdit, Bash]
  reviewer:
    model: heavy-coder
    description: Use after implementation to review diffs, tests, regressions, and maintainability.
    tools: [Read, Grep, Glob, Bash]
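The configuration above chains three kinds of reference: each agent names a model, and each model names a node. A rough sketch of the cross-reference check that a validator might perform follows. The function name find_reference_errors is hypothetical, the config is assumed to be already parsed into plain dictionaries, and the rule that aliases must be unique is an assumption rather than documented behavior.

```python
from __future__ import annotations


def find_reference_errors(config: dict) -> list[str]:
    """Return human-readable errors for dangling node and model references."""
    nodes = config.get("nodes", {})
    models = config.get("models", {})
    agents = config.get("agents", {})
    errors = []

    # Every model must point at a node declared under `nodes`.
    for name, model in models.items():
        node = model.get("node")
        if node not in nodes:
            errors.append(f"model '{name}' references unknown node '{node}'")

    # Every agent must point at a model declared under `models`.
    for name, agent in agents.items():
        target = agent.get("model")
        if target not in models:
            errors.append(f"agent '{name}' references unknown model '{target}'")

    # Assumption: two models sharing one LiteLLM alias would be ambiguous.
    seen_aliases: dict[str, str] = {}
    for name, model in models.items():
        alias = model.get("litellm_alias")
        if alias in seen_aliases:
            errors.append(f"models '{seen_aliases[alias]}' and '{name}' share alias '{alias}'")
        else:
            seen_aliases[alias] = name

    return errors
```

Returning a list of messages instead of raising on the first problem lets a single run report every broken reference at once.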
Generated Files
Running:
subagent-fleet generate
creates:
litellm_config.yaml
.claude/agents/planner.md
.claude/agents/implementer.md
.claude/agents/reviewer.md
.env.subagent-fleet
Example LiteLLM route:
model_list:
  - model_name: claude-sonnet-local
    litellm_params:
      model: ollama_chat/qwen2.5-coder:32b
      api_base: http://192.168.1.50:11434
      api_key: ollama
      timeout: 600
    model_info:
      max_input_tokens: 32768
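Comparing this route with the heavy-coder entry in fleet.yaml suggests a direct field mapping: litellm_alias becomes model_name, ollama_model gets the ollama_chat/ prefix, the node's endpoint becomes api_base, and context becomes max_input_tokens. The sketch below encodes that inferred mapping; litellm_route is a hypothetical helper, not the project's generator.

```python
def litellm_route(model: dict, nodes: dict) -> dict:
    """Build one LiteLLM model_list entry from a fleet model and its node."""
    return {
        "model_name": model["litellm_alias"],
        "litellm_params": {
            "model": f"ollama_chat/{model['ollama_model']}",
            "api_base": nodes[model["node"]]["endpoint"],
            "api_key": "ollama",  # placeholder value, as in the example route
            "timeout": model["timeout"],
        },
        "model_info": {"max_input_tokens": model["context"]},
    }


nodes = {"m4-mini-64gb": {"endpoint": "http://192.168.1.50:11434"}}
heavy_coder = {
    "node": "m4-mini-64gb",
    "ollama_model": "qwen2.5-coder:32b",
    "litellm_alias": "claude-sonnet-local",
    "context": 32768,
    "timeout": 600,
}
route = litellm_route(heavy_coder, nodes)
```

Serializing a list of such dictionaries under a model_list key would give a file shaped like the example route.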
Example Claude agent:
---
name: planner
description: Use for planning, file discovery, task decomposition, and summarization.
model: claude-haiku-local
tools: Read, Grep, Glob
---
You are a fast local planning agent.
Do not edit files.
Return a concise response with:
- plan
- relevant files
- risks
- next recommended agent
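The agent file above appears to combine the agent's fleet.yaml entry with the alias of the model it points at. A minimal rendering sketch under that assumption is shown here; render_agent is a hypothetical name, and the real generator may format the file differently.

```python
def render_agent(name: str, agent: dict, models: dict) -> str:
    """Render a Claude Code-style agent file: front matter, then the prompt."""
    front_matter = [
        "---",
        f"name: {name}",
        f"description: {agent['description']}",
        # The agent refers to a fleet model; the file carries its LiteLLM alias.
        f"model: {models[agent['model']]['litellm_alias']}",
        f"tools: {', '.join(agent['tools'])}",
        "---",
    ]
    # Agents without a prompt (implementer, reviewer) get an empty body.
    body = agent.get("prompt", "").rstrip()
    return "\n".join(front_matter) + "\n" + body + "\n"


models = {"small-coder": {"litellm_alias": "claude-haiku-local"}}
planner = {
    "model": "small-coder",
    "description": "Use for planning, file discovery, task decomposition, and summarization.",
    "tools": ["Read", "Grep", "Glob"],
    "prompt": "You are a fast local planning agent.\nDo not edit files.\n",
}
text = render_agent("planner", planner, models)
```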
Commands
| Command | Purpose |
|---|---|
| subagent-fleet init | Create a starter fleet.yaml. |
| subagent-fleet validate | Validate schema, references, URLs, aliases, and agent names. |
| subagent-fleet discover | Query configured Ollama nodes for available models. |
| subagent-fleet generate | Generate LiteLLM config, Claude agents, and env file. |
| subagent-fleet warmup | Preload configured Ollama models with keep_alive. |
| subagent-fleet status | Show node health and agent routing. |
| subagent-fleet doctor | Show validation and local-network safety guidance. |
| subagent-fleet clean | List or remove generated files. |
| subagent-fleet skills list | List bundled assistant skills and supported targets. |
| subagent-fleet skills install | Install assistant-facing setup and operations skills. |
| subagent-fleet plugins install | Install Claude Code and Codex plugin marketplace bundles. |
JSON output is available for discovery and status:
subagent-fleet discover --json
subagent-fleet status --json
Assistant Skills
subagent-fleet ships assistant-facing skills that teach Claude Code, Codex, OpenCode, and similar tools how to set up and operate the fleet from inside a repository.
List bundled skills and supported targets:
subagent-fleet skills list
Install all bundled skills for all supported targets:
subagent-fleet skills install
This writes:
.claude/skills/subagent-fleet-setup/SKILL.md
.claude/skills/subagent-fleet-operations/SKILL.md
.codex/skills/subagent-fleet-setup/SKILL.md
.codex/skills/subagent-fleet-operations/SKILL.md
.opencode/skills/subagent-fleet-setup/SKILL.md
.opencode/skills/subagent-fleet-operations/SKILL.md
Install for a specific assistant:
subagent-fleet skills install --target codex
subagent-fleet skills install --target claude-code
subagent-fleet skills install --target opencode
Install one bundled skill:
subagent-fleet skills install --skill subagent-fleet-setup
Existing skill files are not overwritten unless you pass --force.
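The no-overwrite rule can be pictured as a guarded write. This is only an illustrative sketch: write_skill is a hypothetical helper, and its force parameter stands in for the --force flag.

```python
from pathlib import Path


def write_skill(path: Path, content: str, force: bool = False) -> bool:
    """Write a skill file; return False if it exists and force is not set."""
    if path.exists() and not force:
        return False  # leave the user's existing file untouched
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True
```

Returning a boolean lets a caller report which files were written and which were skipped.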
Plugin Marketplaces
This repository also ships plugin marketplace metadata so users can install the assistant skill first, then let that skill install and verify the Python CLI.
Included plugin artifacts:
.claude-plugin/marketplace.json
.agents/plugins/marketplace.json
plugins/subagent-fleet/.claude-plugin/plugin.json
plugins/subagent-fleet/.codex-plugin/plugin.json
plugins/subagent-fleet/skills/subagent-fleet-bootstrap/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-setup/SKILL.md
plugins/subagent-fleet/skills/subagent-fleet-operations/SKILL.md
The bootstrap skill teaches Claude Code or Codex how to install the CLI:
python -m pip install subagent-fleet
and then install repo-local assistant skills:
subagent-fleet skills install
Claude Code plugin install flow:
/plugin marketplace add https://github.com/adityak74/subagent-fleet
/plugin install subagent-fleet
Codex local marketplace flow:
codex plugin marketplace add .
codex plugin add subagent-fleet@subagent-fleet
To generate the same marketplace/plugin bundle into another directory:
subagent-fleet plugins install --out /path/to/marketplace-root
Install only one target:
subagent-fleet plugins install --target claude-code
subagent-fleet plugins install --target codex
Existing plugin marketplace files are not overwritten unless you pass --force.
Ollama Worker Setup
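On each worker machine, make Ollama reachable from your controller over the private network. The launchctl commands below are for macOS. Note that 0.0.0.0 binds Ollama to every interface, so restrict access with a firewall or a private subnet: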
launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_KEEP_ALIVE "-1"
launchctl setenv OLLAMA_NUM_PARALLEL "1"
launchctl setenv OLLAMA_MAX_LOADED_MODELS "1"
killall Ollama
open -a Ollama
From the controller:
curl http://NODE_IP:11434/api/tags
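The same check can be scripted. The sketch below assumes the /api/tags response is a JSON object with a models array whose entries carry a name field; the helper names (fetch_tags, parse_tags, missing_models) are hypothetical and are not part of the subagent-fleet API.

```python
from __future__ import annotations

import json
import urllib.request


def fetch_tags(endpoint: str, timeout: float = 5.0) -> dict:
    """GET <endpoint>/api/tags and decode the JSON body."""
    with urllib.request.urlopen(f"{endpoint}/api/tags", timeout=timeout) as response:
        return json.load(response)


def parse_tags(payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return sorted(entry["name"] for entry in payload.get("models", []))


def missing_models(expected: list[str], payload: dict) -> list[str]:
    """Models expected on a node that the node does not report."""
    available = set(parse_tags(payload))
    return [name for name in expected if name not in available]
```

For example, missing_models(["qwen2.5-coder:32b"], fetch_tags("http://NODE_IP:11434")) would list the configured models that still need to be pulled on that node.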
Security
subagent-fleet assumes private local networking.
Do:
- Use LAN, firewall rules, Tailscale, WireGuard, or a private subnet.
- Keep LITELLM_MASTER_KEY set for LiteLLM access.
- Treat generated .env.subagent-fleet files as local developer configuration.
Do not:
- Expose Ollama directly to the public internet.
- Expose LiteLLM without authentication.
- Commit real API keys, LAN secrets, or machine-specific private .env files.
Run:
subagent-fleet doctor
for local setup and safety reminders.
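As an illustration of the kind of check such guidance implies, the sketch below flags node endpoints whose host is not a loopback or private IP literal. It is an assumption about what a safety check could look like, not a description of what doctor does, and is_private_endpoint is a hypothetical name.

```python
import ipaddress
from urllib.parse import urlparse


def is_private_endpoint(url: str) -> bool:
    """True when the URL's host is localhost or a private/loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Other hostnames would need DNS resolution; treated as unknown here.
        return False
    return address.is_private or address.is_loopback
```

One caveat: Python's is_private is False for the 100.64.0.0/10 shared address space that Tailscale uses, so a real check would need to allow that range explicitly.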
Development
Install dev dependencies:
python -m pip install -e ".[dev]"
Run tests:
python -m pytest
Run a focused test:
python -m pytest tests/test_config.py
Check CLI wiring:
python -m subagent_fleet.cli --help
Project Layout
src/subagent_fleet/
  cli.py
  config.py
  discovery.py
  plugins.py
  warmup.py
  status.py
  skills.py
  generators/
  skill_templates/
  templates/
examples/
plugins/
tests/
Roadmap
MVP:
- fleet.yaml schema
- Ollama node health checks
- Ollama model discovery via /api/tags
- LiteLLM config generation
- Claude Code agent generation
- Environment file generation
- Model warmup with keep_alive
- Status and routing tables
Next:
- Latency benchmarking
- Recommended agent-to-node assignment
- Role-based routing templates
- Tailscale-aware node discovery
- OpenAI-compatible harness examples
- Release packaging
Later:
- Dynamic routing by task type
- Fallback model generation
- Queue-aware scheduling
- Agent execution trace viewer
- Support for vLLM, LM Studio, llama.cpp, OpenRouter, and cloud APIs
Contributing
Issues and pull requests are welcome.
Good first areas:
- More generator tests
- Additional example fleets
- Better status formatting
- More robust Ollama error reporting
- Documentation for real multi-machine setups
Before opening a PR:
python -m pytest
What This Is Not
subagent-fleet is not:
- an inference engine
- a replacement for Ollama
- a replacement for LiteLLM
- a model sharding framework
- Kubernetes for local LLMs
- a public model hosting platform
It is a small workflow layer for private local subagent orchestration.
License
MIT. See LICENSE.