Free-tier subagent routing for AI CLIs. The routing policy for endy.
multiplexor
The routing policy for free-tier AI CLIs. Pairs with endy when you also need a runtime, tmux orchestration, and cross-agent handoff.
You have a primary agent doing the heavy work. You also have Gemini CLI, OpenCode, and maybe a local Ollama model sitting idle. multiplexor connects them. It lets your main agent delegate tasks to any installed AI CLI -- automatically picking the best one, falling back when quotas run dry, and keeping everything local.
No proxies. No daemon. No MCP layer. Just a small Python command that knows which CLIs you have installed, scores them, picks one, runs it, and moves on.
The problem this solves
Free-tier AI CLIs are powerful but limited. Gemini CLI gives you access to Gemini models at no cost. OpenCode includes its own quota. But once one provider gets exhausted, you are stuck waiting or switching manually.
multiplexor fixes the switching part. It treats your installed CLIs as a pool of subagents and routes work through them in order of priority and availability. When one runs dry, you mark it exhausted with multiplexor next and it immediately tries the next best option. When all free providers are gone, Ollama runs locally as a last resort.
This is not about bypassing limits. It is about making sure you never sit idle when another provider is available and ready.
How it works
- You install multiplexor and run multiplexor init to create your config.
- The config declares which CLIs you have, their tier (free, included, local, paid), and a priority score.
- When your primary agent calls multiplexor delegate "task", the router computes score = priority + tier_bonus, filters out exhausted or missing providers, and picks the highest-ranked one.
- The task runs headless. Output is captured. If it fails or times out, the next provider tries automatically.
- When a provider hits its limit, multiplexor next marks it temporarily exhausted (24h cooldown by default) and launches the next eligible provider.
primary agent
|
v
multiplexor delegate "review this PR"
|
v
router scores providers:
gemini score 130 (priority 100 + free bonus 30) <-- picked
opencode score 115 (priority 90 + included bonus 25)
ollama score 15 (priority 10 + local bonus 5) (fallback only)
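The ranking step in the diagram can be sketched in a few lines of Python. This is an illustration only, not multiplexor's actual source: the Candidate class and the rank function are hypothetical names, and the sketch ignores fallback_only handling. The tier bonuses and priorities are the ones this README states.

```python
import shutil
from dataclasses import dataclass

# Tier bonuses as stated in this README.
TIER_BONUS = {"free": 30, "included": 25, "local": 5, "paid": 0}


@dataclass
class Candidate:
    """Hypothetical stand-in for one provider entry."""
    name: str
    tier: str
    priority: int
    command: str
    enabled: bool = True

    @property
    def score(self) -> int:
        return self.priority + TIER_BONUS[self.tier]


def rank(candidates, exhausted=frozenset(), is_installed=None):
    """Return eligible candidates, highest score first."""
    if is_installed is None:
        # Default detection: is the executable on PATH?
        is_installed = lambda cmd: shutil.which(cmd) is not None
    eligible = [
        c for c in candidates
        if c.enabled and c.name not in exhausted and is_installed(c.command)
    ]
    return sorted(eligible, key=lambda c: c.score, reverse=True)


pool = [
    Candidate("gemini", "free", 100, "gemini"),
    Candidate("opencode", "included", 90, "opencode"),
    Candidate("ollama", "local", 10, "ollama"),
]

# Pretend every CLI is installed so the example does not depend on PATH.
everything_installed = lambda cmd: True

ranking = rank(pool, is_installed=everything_installed)
print([(c.name, c.score) for c in ranking])
# [('gemini', 130), ('opencode', 115), ('ollama', 15)]

# Once gemini is marked exhausted, opencode becomes the top pick.
after_next = rank(pool, exhausted={"gemini"}, is_installed=everything_installed)
print(after_next[0].name)  # opencode
```

Passing the PATH check in as a function keeps the sketch testable without any real CLI installed.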
Pairing with endy
multiplexor decides which CLI should pick up next. endy is the
runtime that runs each CLI in a detached tmux window, captures its output to
.logs/, and provides endy handoff — a one-command transfer of an
in-flight coding task from one agent to another with full context.
Wire them together by setting endy's resolver hook to the dedicated binary multiplexor installs for this purpose:
export ENDY_HANDOFF_RESOLVER=multiplexor-next-provider
endy handoff <task-id> # --to becomes optional
When you call endy handoff without --to, endy invokes
multiplexor-next-provider <prev-agent> <task-id> <cwd> and uses the
agent name it prints to stdout. Under the hood that wrapper just calls
multiplexor next-provider, which:
- Marks <prev-agent> as exhausted (so it does not get re-selected).
- Computes the highest-scored eligible provider with the regular priority + tier_bonus rules (same logic as multiplexor status).
- Prints the chosen name to stdout. Exits non-zero (silently, stderr only) when nothing is eligible, so endy falls back to requiring an explicit --to.
A standalone test, no endy required:
$ multiplexor-next-provider gemini task-123 /tmp/proj
opencode
$ multiplexor status | head -5
1. opencode
tier: included
...
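The resolver contract described above (three positional arguments in, one agent name on stdout, non-zero exit when nothing is eligible) could be consumed roughly like this. It is a sketch of the calling side under those stated rules, not endy's real code: resolve_next_agent is a hypothetical name, and the two stand-in resolvers are small Python one-liners used so the example runs without multiplexor installed.

```python
import subprocess
import sys


def resolve_next_agent(resolver_argv, prev_agent, task_id, cwd):
    """Run a resolver command; return the agent name, or None if it declines."""
    result = subprocess.run(
        [*resolver_argv, prev_agent, task_id, cwd],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is an expected outcome, not an error
    )
    name = result.stdout.strip()
    if result.returncode != 0 or not name:
        return None  # caller falls back to requiring an explicit --to
    return name


# Stand-in resolvers (assumptions for the demo, not the real binary).
fake_ok = [sys.executable, "-c", "print('opencode')"]
fake_none = [sys.executable, "-c", "import sys; sys.exit(1)"]

print(resolve_next_agent(fake_ok, "gemini", "task-123", "/tmp/proj"))    # opencode
print(resolve_next_agent(fake_none, "gemini", "task-123", "/tmp/proj"))  # None
```

With the real shim on PATH, the first argument would be ["multiplexor-next-provider"] instead of a stand-in.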
endy state (the per-spawn environment snapshot) reads multiplexor's
view of the world via multiplexor status --json [agent], which returns
a machine-parseable shape including exhausted_seconds_remaining for
each provider — see docs/usage.md.
Useful flags on multiplexor next-provider (or its multiplexor-next-provider shim) for tuning the integration:
| Flag | What it does |
|---|---|
| `--no-mark` | Pure query, do not touch state (good for dry-runs / health checks). |
| `--mode interactive\|ask` | |
| `--verbose` | Emit score + tier + alternatives to stderr (stdout stays a single name, safe for the resolver). |
| `--for endy` | Skip providers endy cannot drive headlessly (e.g. ollama). |
You can also use either tool alone:
- multiplexor without endy: multiplexor delegate "task" runs a single task on the best-scored CLI. No tmux, no logs directory, no handoff chain.
- endy without multiplexor: endy handoff <id> --to <agent> is a one-shot manual handoff. You pick the next agent yourself.
Together you get continuity: when one tier runs out, the routing policy decides who is next and the runtime carries the task across without you re-typing the prompt.
Install
pip install -e .
Requires Python 3.11+ and at least one supported CLI in your PATH.
Quickstart
multiplexor init # create user config
multiplexor doctor # verify everything is detected
multiplexor status # see current ranking
multiplexor delegate "review this repository and list concrete risks"
When the current provider gets exhausted:
multiplexor next # mark last provider exhausted, launch next
Other commands you will use:
multiplexor # launch best interactive provider
multiplexor reset # clear all exhaustion marks
multiplexor next # skip to next provider
multiplexor delegate "task" # headless subagent run
multiplexor ask "prompt" # alias for delegate
multiplexor --dry-run # show what would run
multiplexor --provider gemini # force a specific provider
Piping works too:
git diff | multiplexor delegate "review these changes"
Providers
The default config ships with these providers:
| Provider | Tier | Priority | Default State | Notes |
|---|---|---|---|---|
| Gemini CLI | free | 100 | enabled | Main subagent, headless capable |
| OpenCode | included | 90 | enabled | Good secondary option |
| Ollama | local | 10 | enabled | Fallback only, runs locally |
| Qwen | paid | 95 | disabled | Optional |
| Hermes | paid | 80 | disabled | Optional |
| cmd | paid | 60 | disabled | CommandCode / Kimi K2.6 — cheapest paid option |
| Codex | paid | 40 | disabled | Optional |
| Claude | paid | 30 | disabled | Optional |
Paid providers are disabled by default because the v1 focus is free-tier delegation. Enable them in your config if you want them in the routing pool.
Scoring formula:
score = priority + tier_bonus
Tier bonuses: free=30, included=25, local=5, paid=0. Gemini CLI, with a base priority of 100 and the free bonus of 30, scores 130. Under the default config that is the highest score, so it is picked first unless it is exhausted or not installed.
Configuration
Run multiplexor init to create your config at:
- Linux/macOS: ~/.config/multiplexor/config.yaml
- Windows: %USERPROFILE%\.multiplexor\config.yaml
A provider entry needs only a few fields:
providers:
  gemini:
    enabled: true
    tier: free
    priority: 100
    command: "gemini"
    interactive_command: ["gemini", "--skip-trust", "--approval-mode=yolo"]
    ask_command: ["gemini", "--skip-trust", "--approval-mode=yolo", "-p", ""]
    ask_stdin: true
The Provider class handles everything else: detection from PATH, command construction, scoring, and prompt substitution. Adding a new provider means adding a config block. No code changes required.
Key fields:
- enabled: include or skip this provider
- tier: determines the bonus added to priority
- priority: base ranking score
- command: executable name for PATH detection
- interactive_command / ask_command: command templates for each mode
- ask_stdin: send task through stdin instead of argv
- fallback_only: only use when no normal provider is eligible
- default_model: required for Ollama
State and exhaustion
State is stored locally as state.json next to your config. It tracks only two things:
- The last provider that ran
- Temporary exhaustion marks (with an expiration timestamp)
It does not store credentials, API keys, prompts, or anything sensitive. The exhaustion cooldown defaults to 24 hours and is configurable.
multiplexor next # marks last_provider as exhausted
multiplexor reset # clears all exhaustion marks
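The exhaustion marks can be pictured as a mapping from provider name to an expiry timestamp. The sketch below assumes that layout; the key names last_provider and exhausted, and both helper functions, are illustrative guesses rather than the real state.json schema.

```python
import json
import time

COOLDOWN_SECONDS = 24 * 60 * 60  # the 24-hour default mentioned above


def mark_exhausted(state, provider, now=None, cooldown=COOLDOWN_SECONDS):
    """Record that a provider is unavailable until now + cooldown."""
    now = time.time() if now is None else now
    state.setdefault("exhausted", {})[provider] = now + cooldown
    return state


def active_exhausted(state, now=None):
    """Return the providers whose marks have not expired yet."""
    now = time.time() if now is None else now
    return {
        name
        for name, expires_at in state.get("exhausted", {}).items()
        if expires_at > now
    }


# Fixed timestamps keep the demo deterministic.
state = {"last_provider": "gemini", "exhausted": {}}
mark_exhausted(state, state["last_provider"], now=1_000)
print(json.dumps(state))

print(active_exhausted(state, now=1_000 + 3600))                  # {'gemini'}
print(active_exhausted(state, now=1_000 + COOLDOWN_SECONDS + 1))  # set()
```

Storing an expiry time instead of a flag means marks lapse on their own; nothing has to run in the background to clear them.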
Security model
multiplexor runs commands with shell=False, so no shell ever parses the command line or the prompt. Prompts go through stdin by default, so they never appear in process arguments visible to ps.
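A small experiment shows why this combination matters. The child process here is a Python one-liner standing in for a real CLI (an assumption for the demo); it reports how many arguments it received and echoes whatever arrived on stdin.

```python
import subprocess
import sys

# Stand-in for a CLI that reads its prompt from stdin.
child = [sys.executable, "-c", "import sys; print(len(sys.argv), sys.stdin.read())"]

# Shell metacharacters in the prompt are just text: nothing expands them.
prompt = "review this; echo $HOME | cat"

result = subprocess.run(
    child,             # argv list, passed to the OS without a shell
    input=prompt,      # the prompt travels over stdin, not argv
    capture_output=True,
    text=True,
    shell=False,       # the default, spelled out for emphasis
    timeout=30,
)

print(result.stdout.strip())  # 1 review this; echo $HOME | cat
```

The child sees a single argv entry and the prompt arrives unmodified, so the prompt text is neither executed nor exposed in the process list.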
The default delegate commands use each CLI's official allow-all permission mode:
- Gemini: --skip-trust --approval-mode=yolo
- OpenCode: --dangerously-skip-permissions
This is intentional. A delegated CLI can edit files and run commands. Only use this in repositories where you are comfortable with that behavior. First-time authentication and setup still belong to each CLI. multiplexor does not store or inject any credentials.
It does not:
- bypass rate limits or quotas
- scrape provider credit balances
- modify any provider's internal configuration
- run a proxy, daemon, web server, or MCP server
Limitations
v1 operates on what it can detect: installed commands and explicit exhaustion state. It cannot read your exact Gemini quota or predict when a provider will fail. If a CLI hangs waiting for interactive setup, the configured timeout (default 120s) kills it and tries the next provider.
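The timeout-then-next-provider behavior can be approximated with subprocess.run and its timeout argument. This is a sketch under assumptions, not the project's implementation: delegate is a hypothetical function, and the three stand-in commands are Python one-liners that hang, fail, and succeed. The demo uses a 3-second timeout instead of the 120-second default so it finishes quickly.

```python
import subprocess
import sys


def delegate(chain, prompt, timeout=120):
    """Try each (name, argv) in order; return (name, stdout) of the first success."""
    for name, argv in chain:
        try:
            result = subprocess.run(
                argv, input=prompt, capture_output=True, text=True, timeout=timeout
            )
        except (subprocess.TimeoutExpired, FileNotFoundError):
            # Hung (the child is killed on timeout) or not installed: move on.
            continue
        if result.returncode == 0:
            return name, result.stdout
    return None, ""


# Stand-in CLIs (assumptions for the demo).
hangs = [sys.executable, "-c", "import time; time.sleep(60)"]
fails = [sys.executable, "-c", "import sys; sys.exit(3)"]
works = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]

chain = [("hangs-cli", hangs), ("fails-cli", fails), ("works-cli", works)]
name, output = delegate(chain, "hello", timeout=3)
print(name, output.strip())  # works-cli HELLO
```

A single global timeout is the simplest policy; a slow but healthy provider is treated the same as a hung one.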
Provider-specific setup hints and per-provider timeout overrides are planned.
Testing
python3 -m unittest discover -s tests
Tests use mocked commands. No real CLIs or credentials needed.
Roadmap
- Clearer per-provider setup hints when a CLI fails to run
- Optional per-provider timeout overrides in config
- Examples for adding custom local providers
Docs
- Usage - command examples and patterns
- Configuration - provider config, scoring, and state
- Security - threat model and operational notes
Related
- endy - the runtime: tmux, .logs/, and the cross-agent handoff command. Use multiplexor as its routing policy.