
Free-tier subagent routing for AI CLIs. The routing policy for endy.


multiplexor

The routing policy for free-tier AI CLIs. Pairs with endy when you also need a runtime, tmux orchestration, and cross-agent handoff.

You have a primary agent doing the heavy work. You also have Gemini CLI, OpenCode, and maybe a local Ollama model sitting idle. multiplexor connects them. It lets your main agent delegate tasks to any installed AI CLI -- automatically picking the best one, falling back when quotas run dry, and keeping everything local.

No proxies. No daemon. No MCP layer. Just a small Python command that knows which CLIs you have installed, scores them, picks one, runs it, and moves on.


The problem this solves

Free-tier AI CLIs are powerful but limited. Gemini CLI gives you access to Gemini models at no cost. OpenCode includes its own quota. But once one provider gets exhausted, you are stuck waiting or switching manually.

multiplexor fixes the switching part. It treats your installed CLIs as a pool of subagents and routes work through them in order of priority and availability. When one runs dry, you mark it exhausted with multiplexor next and it immediately tries the next best option. When all free providers are gone, Ollama runs locally as a last resort.

This is not about bypassing limits. It is about making sure you never sit idle when another provider is available and ready.


How it works

  1. You install multiplexor and run multiplexor init to create your config.
  2. The config declares which CLIs you have, their tier (free, included, local, paid), and a priority score.
  3. When your primary agent calls multiplexor delegate "task", the router computes score = priority + tier_bonus, filters out exhausted or missing providers, and picks the highest-ranked one.
  4. The task runs headless. Output is captured. If it fails or times out, the next provider tries automatically.
  5. When a provider hits its limit, multiplexor next marks it temporarily exhausted (24h cooldown by default) and launches the next eligible provider.
primary agent
     |
     v
multiplexor delegate "review this PR"
     |
     v
  router scores providers:
     gemini    score 130  (priority 100 + free bonus 30)  <-- picked
     opencode  score 115  (priority 90  + included bonus 25)
     ollama    score  15  (priority 10  + local bonus 5)   (fallback only)
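The selection step in the diagram can be sketched in a few lines of Python. This is an illustration only, not multiplexor's actual code: the Candidate class and the pick function are hypothetical names, and only the tier bonuses and priorities come from this README.

```python
from dataclasses import dataclass

# Tier bonuses as stated in this README.
TIER_BONUS = {"free": 30, "included": 25, "local": 5, "paid": 0}


@dataclass(frozen=True)
class Candidate:  # hypothetical name, not multiplexor's Provider class
    name: str
    tier: str
    priority: int
    installed: bool = True
    exhausted: bool = False

    @property
    def score(self) -> int:
        # score = priority + tier_bonus
        return self.priority + TIER_BONUS[self.tier]


def pick(candidates):
    """Return the highest-scored candidate that is installed and not exhausted."""
    eligible = [c for c in candidates if c.installed and not c.exhausted]
    return max(eligible, key=lambda c: c.score, default=None)


pool = [
    Candidate("gemini", "free", 100),
    Candidate("opencode", "included", 90),
    Candidate("ollama", "local", 10),
]
print(pick(pool).name)
```

Marking gemini as exhausted in this sketch makes pick return opencode, which mirrors the fallback described in step 5.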

Pairing with endy

multiplexor decides which CLI should pick up next. endy is the runtime that runs each CLI in a detached tmux window, captures its output to .logs/, and provides endy handoff — a one-command transfer of an in-flight coding task from one agent to another with full context.

Wire them together by setting endy's resolver hook to the dedicated binary multiplexor installs for this purpose:

export ENDY_HANDOFF_RESOLVER=multiplexor-next-provider
endy handoff <task-id>                  # --to becomes optional

When you call endy handoff without --to, endy invokes multiplexor-next-provider <prev-agent> <task-id> <cwd> and uses the agent name it prints to stdout. Under the hood that wrapper just calls multiplexor next-provider, which:

  1. Marks <prev-agent> as exhausted (so it does not get re-selected).
  2. Computes the highest-scored eligible provider with the regular priority + tier_bonus rules (same logic as multiplexor status).
  3. Prints the chosen name to stdout. When nothing is eligible, it exits non-zero and prints nothing to stdout (any diagnostics go to stderr only), so endy falls back to requiring an explicit --to.
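The three steps above amount to a small contract: one name on stdout and exit code 0, or nothing on stdout and a non-zero exit code. A minimal sketch of that contract follows. The function names, the providers mapping, and the in-memory exhausted set are assumptions made for illustration, not the real implementation.

```python
import sys

# Tier bonuses as stated in this README.
TIER_BONUS = {"free": 30, "included": 25, "local": 5, "paid": 0}


def next_provider(prev_agent, providers, exhausted):
    """Mark prev_agent exhausted, then return the best remaining name or None.

    providers maps name -> (tier, priority); exhausted is a set of names.
    Both shapes are hypothetical stand-ins for multiplexor's config and state.
    """
    exhausted.add(prev_agent)
    scores = {
        name: priority + TIER_BONUS[tier]
        for name, (tier, priority) in providers.items()
        if name not in exhausted
    }
    return max(scores, key=scores.get) if scores else None


def main(argv, providers, exhausted):
    """Resolver-style entry point: argv is [prev_agent, task_id, cwd]."""
    choice = next_provider(argv[0], providers, exhausted)
    if choice is None:
        # Keep stdout empty so the caller never mistakes a message for a name.
        print("no eligible provider", file=sys.stderr)
        return 1
    print(choice)
    return 0
```

Keeping stdout reserved for the single agent name is what lets a caller such as endy consume the output without any parsing.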

A standalone test, no endy required:

$ multiplexor-next-provider gemini task-123 /tmp/proj
opencode
$ multiplexor status | head -5
1. opencode
   tier: included
   ...

endy state (the per-spawn environment snapshot) reads multiplexor's view of the world via multiplexor status --json [agent], which returns a machine-parseable shape including exhausted_seconds_remaining for each provider — see docs/usage.md.
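As a rough illustration of consuming that JSON, the sketch below filters for providers that are usable right now. The payload shape is an assumption: only the exhausted_seconds_remaining field is named in this README, so the "providers" and "name" keys are hypothetical and docs/usage.md remains the authority on the real format.

```python
import json

# Hypothetical payload. Only exhausted_seconds_remaining is documented above;
# the surrounding structure is invented for this example.
SAMPLE = """
{"providers": [
  {"name": "gemini", "exhausted_seconds_remaining": 3600},
  {"name": "opencode", "exhausted_seconds_remaining": 0}
]}
"""


def available_now(payload: str) -> list[str]:
    """Return names whose exhaustion cooldown has fully elapsed."""
    data = json.loads(payload)
    return [
        p["name"]
        for p in data["providers"]
        if not p.get("exhausted_seconds_remaining")
    ]


print(available_now(SAMPLE))
```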

Useful flags on multiplexor next-provider (or the multiplexor-next-provider shim) for tuning the integration:

Flag                    What it does
--no-mark               Pure query, do not touch state (good for dry-runs / health checks).
--mode interactive|ask
--verbose               Emit score + tier + alternatives to stderr (stdout stays a single name, safe for the resolver).
--for endy              Skip providers endy cannot drive headlessly (e.g. ollama).

You can also use either tool alone:

  • multiplexor without endy: multiplexor delegate "task" runs a single task on the best-scored CLI. No tmux, no logs directory, no handoff chain.
  • endy without multiplexor: endy handoff <id> --to <agent> is a one-shot manual handoff. You pick the next agent yourself.

Together you get continuity: when one tier runs out, the routing policy decides who is next and the runtime carries the task across without you re-typing the prompt.


Install

pip install -e .

Requires Python 3.11+ and at least one supported CLI in your PATH.


Quickstart

multiplexor init          # create user config
multiplexor doctor        # verify everything is detected
multiplexor status        # see current ranking
multiplexor delegate "review this repository and list concrete risks"

When the current provider gets exhausted:

multiplexor next          # mark last provider exhausted, launch next

Other commands you will use:

multiplexor                    # launch best interactive provider
multiplexor reset              # clear all exhaustion marks
multiplexor next               # skip to next provider
multiplexor delegate "task"    # headless subagent run
multiplexor ask "prompt"       # alias for delegate
multiplexor --dry-run          # show what would run
multiplexor --provider gemini  # force a specific provider

Piping works too:

git diff | multiplexor delegate "review these changes"

Providers

The default config ships with these providers:

Provider    Tier      Priority  Default state  Notes
Gemini CLI  free      100       enabled        Main subagent, headless capable
OpenCode    included  90        enabled        Good secondary option
Ollama      local     10        enabled        Fallback only, runs locally
Qwen        paid      95        disabled       Optional
Hermes      paid      80        disabled       Optional
cmd         paid      60        disabled       CommandCode / Kimi K2.6, cheapest paid option
Codex       paid      40        disabled       Optional
Claude      paid      30        disabled       Optional

Paid providers are disabled by default because the v1 focus is free-tier delegation. Enable them in your config if you want them in the routing pool.

Scoring formula:

score = priority + tier_bonus

Tier bonuses: free=30, included=25, local=5, paid=0. Gemini CLI, with its base priority of 100 and free bonus of 30, scores 130, so under the default config it is picked whenever it is installed and not exhausted.
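To see how the formula plays out across the whole default table, here is a short worked example. The lowercase names are shorthand chosen for this sketch, and the ranking deliberately ignores the enabled flag, so it shows where each paid provider would land if you switched it on.

```python
# Tier bonuses as stated in this README.
TIER_BONUS = {"free": 30, "included": 25, "local": 5, "paid": 0}

# (name, tier, priority) copied from the default provider table.
DEFAULTS = [
    ("gemini", "free", 100),
    ("opencode", "included", 90),
    ("ollama", "local", 10),
    ("qwen", "paid", 95),
    ("hermes", "paid", 80),
    ("cmd", "paid", 60),
    ("codex", "paid", 40),
    ("claude", "paid", 30),
]


def ranked(rows):
    """Return (score, name) pairs, best first, using score = priority + tier_bonus."""
    scored = [(priority + TIER_BONUS[tier], name) for name, tier, priority in rows]
    return sorted(scored, reverse=True)


for score, name in ranked(DEFAULTS):
    print(f"{name:<9}{score:>4}")
```

One consequence worth noticing: even with a base priority of 95, an enabled Qwen scores 95 and still ranks below OpenCode at 115, because the paid tier carries no bonus.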


Configuration

Run multiplexor init to create your config at:

  • Linux/macOS: ~/.config/multiplexor/config.yaml
  • Windows: %USERPROFILE%\.multiplexor\config.yaml
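If you need to locate the config from a script, the two locations above can be derived as follows. This is a sketch under the assumption that the paths are exactly the ones listed; config_path is a hypothetical helper, not part of multiplexor.

```python
from pathlib import PurePosixPath, PureWindowsPath


def config_path(platform: str, home: str):
    """Build the documented config location for a platform and home directory.

    platform follows sys.platform conventions ("linux", "darwin", "win32").
    """
    if platform.startswith("win"):
        return PureWindowsPath(home) / ".multiplexor" / "config.yaml"
    return PurePosixPath(home) / ".config" / "multiplexor" / "config.yaml"


print(config_path("linux", "/home/me"))
print(config_path("win32", r"C:\Users\me"))
```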

A provider entry needs only a few fields:

providers:
  gemini:
    enabled: true
    tier: free
    priority: 100
    command: "gemini"
    interactive_command: ["gemini", "--skip-trust", "--approval-mode=yolo"]
    ask_command: ["gemini", "--skip-trust", "--approval-mode=yolo", "-p", ""]
    ask_stdin: true

The Provider class handles everything else: detection from PATH, command construction, scoring, and prompt substitution. Adding a new provider means adding a config block. No code changes required.
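The sketch below shows one plausible way detection and command construction could work. It is not the Provider class itself. In particular, the rule that an empty-string element in ask_command is replaced by the prompt when ask_stdin is false is an assumption made for this example, and both function names are hypothetical.

```python
import shutil


def is_installed(command: str) -> bool:
    """PATH detection: True when the executable can be found."""
    return shutil.which(command) is not None


def build_invocation(ask_command, prompt, ask_stdin):
    """Return (argv, stdin_text) for a headless run.

    Assumption: with ask_stdin the template is used unchanged and the prompt
    travels on stdin; otherwise an empty-string slot is filled with the prompt.
    """
    if ask_stdin:
        return list(ask_command), prompt
    argv = [prompt if part == "" else part for part in ask_command]
    return argv, None


template = ["gemini", "--skip-trust", "--approval-mode=yolo", "-p", ""]
print(build_invocation(template, "list concrete risks", ask_stdin=True))
```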

Key fields:

  • enabled: include or skip this provider
  • tier: determines the bonus added to priority
  • priority: base ranking score
  • command: executable name for PATH detection
  • interactive_command / ask_command: command templates for each mode
  • ask_stdin: send task through stdin instead of argv
  • fallback_only: only use when no normal provider is eligible
  • default_model: required for Ollama
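The fallback_only field changes selection from a single ranking into a two-stage one. A minimal sketch of that idea, using plain dictionaries with hypothetical keys (name, score, eligible, fallback_only) rather than multiplexor's own types:

```python
def pick_with_fallback(providers):
    """Prefer normal providers; use fallback_only ones only when none remain."""
    eligible = [p for p in providers if p["eligible"]]
    normal = [p for p in eligible if not p.get("fallback_only")]
    pool = normal or eligible  # fall through to fallback-only providers
    if not pool:
        return None
    return max(pool, key=lambda p: p["score"])["name"]


providers = [
    {"name": "gemini", "score": 130, "eligible": False},  # exhausted
    {"name": "opencode", "score": 115, "eligible": False},  # exhausted
    {"name": "ollama", "score": 15, "eligible": True, "fallback_only": True},
]
print(pick_with_fallback(providers))
```

With this two-stage rule a fallback-only provider can never outrank a normal one, whatever priority it is given.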

State and exhaustion

State is stored locally as state.json next to your config. It tracks only two things:

  • The last provider that ran
  • Temporary exhaustion marks (with an expiration timestamp)

It does not store credentials, API keys, prompts, or anything sensitive. The exhaustion cooldown defaults to 24 hours and is configurable.

multiplexor next    # marks last_provider as exhausted
multiplexor reset   # clears all exhaustion marks
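Exhaustion marks with an expiration timestamp can be modelled as a mapping from provider name to expiry time. The sketch below assumes a state layout with "last_provider" and "exhausted" keys. The second key name and all three helper functions are hypothetical, and the real state.json may be organised differently.

```python
COOLDOWN_SECONDS = 24 * 60 * 60  # the documented 24h default


def mark_exhausted(state, name, now, cooldown=COOLDOWN_SECONDS):
    """Record that a provider is unusable until now + cooldown."""
    state.setdefault("exhausted", {})[name] = now + cooldown


def seconds_remaining(state, name, now):
    """Seconds until the mark expires; 0 means the provider is usable again."""
    expires = state.get("exhausted", {}).get(name, 0)
    return max(0, int(expires - now))


def reset(state):
    """Clear every exhaustion mark, leaving last_provider untouched."""
    state["exhausted"] = {}


state = {"last_provider": "gemini"}
mark_exhausted(state, state["last_provider"], now=1_000)
print(seconds_remaining(state, "gemini", now=1_000 + 3_600))
```

Storing an expiry time rather than a boolean means a mark clears itself once the cooldown passes, with no background process needed.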

Security model

multiplexor runs commands with shell=False. No shell injection surface. Prompts go through stdin by default so they never appear in process arguments visible to ps.
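To make the two properties concrete, here is a small self-contained demonstration using Python's standard subprocess module. The child process is a stand-in Python one-liner rather than a real AI CLI, and run_headless is a hypothetical helper name.

```python
import subprocess
import sys


def run_headless(argv, prompt, timeout=120):
    """Run argv as a list (no shell) and deliver the prompt on stdin."""
    return subprocess.run(
        argv,
        input=prompt,
        capture_output=True,
        text=True,
        timeout=timeout,
        shell=False,  # argv is executed directly, never parsed by a shell
    )


# Stand-in child: echoes whatever arrives on stdin in upper case.
echo_cli = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]

# Shell metacharacters in the prompt are plain data here, not commands.
result = run_headless(echo_cli, "review this; echo pwned")
print(result.stdout.strip())
```

Because the prompt is passed through input= rather than appended to argv, it does not show up in the child's command line.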

The default delegate commands use each CLI's official allow-all permission mode:

  • Gemini: --skip-trust --approval-mode=yolo
  • OpenCode: --dangerously-skip-permissions

This is intentional. A delegated CLI can edit files and run commands. Only use this in repositories where you are comfortable with that behavior. First-time authentication and setup still belong to each CLI. multiplexor does not store or inject any credentials.

It does not:

  • bypass rate limits or quotas
  • scrape provider credit balances
  • modify any provider's internal configuration
  • run a proxy, daemon, web server, or MCP server

Limitations

v1 operates on what it can detect: installed commands and explicit exhaustion state. It cannot read your exact Gemini quota or predict when a provider will fail. If a CLI hangs waiting for interactive setup, the configured timeout (default 120s) kills it and tries the next provider.
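The timeout-then-next-provider behaviour can be approximated with subprocess.run and its timeout argument. In this sketch the two "providers" are stand-in Python one-liners, delegate is a hypothetical function name, and the timeout is shortened to two seconds so the example finishes quickly.

```python
import subprocess
import sys


def delegate(commands, prompt, timeout):
    """Try each (name, argv) in order; return (name, stdout) of the first success."""
    for name, argv in commands:
        try:
            done = subprocess.run(
                argv, input=prompt, capture_output=True, text=True, timeout=timeout
            )
        except (subprocess.TimeoutExpired, FileNotFoundError):
            continue  # hung or missing: move on to the next provider
        if done.returncode == 0:
            return name, done.stdout
    return None, ""


# Stand-ins: one child that hangs, one that answers immediately.
hang = [sys.executable, "-c", "import time; time.sleep(30)"]
ok = [sys.executable, "-c", "import sys; print('ok:', sys.stdin.read())"]

winner = delegate([("slow", hang), ("fast", ok)], "hello", timeout=2)
print(winner[0])
```

subprocess.run kills the child when the timeout expires, so a hung CLI does not linger after the router has moved on.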

Provider-specific setup hints and per-provider timeout overrides are planned.


Testing

python3 -m unittest discover -s tests

Tests use mocked commands. No real CLIs or credentials needed.
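For a sense of what a mocked-command test can look like, the sketch below patches PATH lookup so no real CLI has to be installed. It is an illustrative example in the same unittest style, not a file from the project's tests directory, and detect is a hypothetical helper.

```python
import shutil
import unittest
from unittest import mock


def detect(commands):
    """Return the subset of commands that resolve on PATH."""
    return [c for c in commands if shutil.which(c) is not None]


class DetectTests(unittest.TestCase):
    def test_only_installed_commands_are_detected(self):
        # Pretend only gemini exists; every other lookup returns None.
        fake_path = {"gemini": "/usr/bin/gemini"}
        with mock.patch("shutil.which", side_effect=fake_path.get):
            self.assertEqual(detect(["gemini", "opencode"]), ["gemini"])
```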


Roadmap

  • Clearer per-provider setup hints when a CLI fails to run
  • Optional per-provider timeout overrides in config
  • Examples for adding custom local providers

Docs

  • Usage - command examples and patterns
  • Configuration - provider config, scoring, and state
  • Security - threat model and operational notes

Related

  • endy - the runtime: tmux, .logs/, and the cross-agent handoff command. Use multiplexor as its routing policy.
