Efficient Agent Router — routes tasks to the best LLM under quality, latency, cost, and safety constraints

Efficient Agent Router (EAR)

Efficient Agent Router (EAR) is a Python-first orchestration service that selects and executes the best LLM for a request based on quality, cost, latency, context window, and safety constraints.

Goals

Route each request to the most suitable model for the task.
Reduce token burn through cost-aware model ranking.
Protect sensitive input with prompt-injection and PII safeguards.
Provide a clean CLI first, then expose the same logic through MCP.

Current Delivery Status (v0.10.16)

Epic	Description	Status
E1	Foundation and Project Setup	✅ Complete
E2	Model Registry and Metadata Management	✅ Complete
E3	Predictive Routing Engine	✅ Complete
E4	CLI Experience and Operator Workflow	✅ Complete
E5	Reliability and Cascade Fallback	✅ Complete
E6	Safety and Guardrails	✅ Complete
E7	Observability and Cost/Latency Metrics	✅ Complete
E8	MCP Server and Tool Exposure	✅ Complete
E9	CI/CD and Security Automation	✅ Complete
E10	Execution Plane and Adaptive Routing Intelligence	✅ Complete
E11	Leadership Demo Frontend and GTM Showcase	✅ Complete
E17	Ollama Private Provider Integration	✅ Complete
E18	Live React Web Console	✅ Complete
E19	CLI Aliases and UX Polish	✅ Complete
E12–E16	Post-launch hardening (PyPI verify, canary, benchmarks, ADRs)	⏳ Pending

Current Delivery Strategy

Build and validate core routing engine through CLI. ✅
Harden reliability, guardrails, and observability. ✅
Expose stable capabilities through MCP server. ✅
Add real execution runtime and adaptive intent/injection intelligence. ✅
Ship interactive leadership demo with value storytelling. ✅
Add Ollama private provider for on-premise safety routing. ✅
Ship live React web console for developer-facing routing visualization. ✅
Post-launch: verify PyPI release, run live canary, publish benchmarks, backfill ADRs.

Tech Stack

Python 3.12+
asyncio
Typer CLI
Pydantic v2
httpx for OpenRouter model metadata
pytest, pytest-asyncio, pytest-cov
bandit and pip-audit for security controls

Repository Layout

src/
  ear/
    __init__.py          # Package root, version
    config.py            # Pydantic-settings configuration (EARConfig)
    models.py            # Domain models: ModelSpec, RoutingRequest, RoutingDecision
    registry.py          # OpenRouterRegistry, OllamaRegistry, RegistryFactory
    router_engine.py     # IntentClassifier, SuitabilityScorer, RouterEngine
    guardrails.py        # Prompt-injection detector, PII policy, semantic risk scorer
    fallback.py          # FailureClassifier, FallbackPipeline
    metrics.py           # MetricsCollector, SessionSummary
    executor.py          # LLMExecutor, OllamaExecutor, CompositeExecutor
    orchestrator.py      # Unified execution orchestration pipeline
    intent.py            # Advanced intent classifier (embedding + heuristic fallback)
    evaluation.py        # Evaluation harness and benchmark suite
    cli.py               # Typer CLI: route, inspect-models, stats (+ aliases)
    mcp_server.py        # MCP stdio transport and tool/resource handlers
    demo_backend.py      # Demo routing replay scenarios and value storytelling
    demo_server.py       # uvicorn-backed local demo HTTP server
tests/
  conftest.py
  test_config.py
  test_models.py
  test_registry.py
  test_router_engine.py
  test_guardrails.py
  test_fallback.py
  test_metrics.py
  test_executor.py
  test_orchestrator.py
  test_intent.py
  test_evaluation.py
  test_cli.py
  test_mcp_server.py
  test_demo_backend.py
  test_demo_server.py
webapp/
  package.json           # React + Vite dependencies
  vite.config.js
  src/                   # React routing console components
docs/
  system_prompt.md
  execution_plan.md
  wbs.md
  release-playbook.md
  llm_explorer.html      # Standalone browser-based LLM explorer and demo UI
  usage-guide.md
  project-history.md     # Full commit history and delivery log
  adr/
  releases/

Core Workflow

Accept user task input and options (task hint, budget priority, context profile).
Run safety prechecks (injection and PII policy).
Load model metadata from OpenRouter registry cache.
Compute suitability score and candidate ranking.
Return model recommendation, rationale, and fallback chain; with --execute, also run the routed model call (execution runtime delivered in E10).
Emit session metrics snapshot for observability.
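
The fallback chain returned by the workflow above could be consumed roughly as follows. This is a minimal asyncio sketch, not EAR's FallbackPipeline: the ModelCallError class, the run_with_fallback helper, and the per-model reading of the retry limit are all assumptions made for illustration.

```python
import asyncio


class ModelCallError(Exception):
    """Hypothetical error raised when a single model call fails."""


async def run_with_fallback(chain, call, max_retries=3):
    """Try each model in order; retry a failing model up to max_retries times.

    Assumption: the retry limit applies per model, not to the chain as a whole.
    """
    errors = []
    for model in chain:
        for attempt in range(1, max_retries + 1):
            try:
                return model, await call(model)
            except ModelCallError as exc:
                # Record the failure and keep going down the chain.
                errors.append((model, attempt, str(exc)))
    raise ModelCallError(f"all models failed: {errors}")


async def fake_call(model):
    # Stand-in for a real provider call: only the last model succeeds.
    if model != "model-c":
        raise ModelCallError(f"{model} unavailable")
    return f"response from {model}"


chosen, text = asyncio.run(
    run_with_fallback(["model-a", "model-b", "model-c"], fake_call, max_retries=2)
)
```

A real pipeline would likely classify failures first (the layout lists a FailureClassifier) so that non-retryable errors skip straight to the next model.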

Routing Model

The router evaluates candidate models using a weighted suitability function:

S = Quality / (Cost * Latency)

Where score inputs are normalized and constrained by policy:

Context window threshold
Budget priority
Safety allowlist and PII policy
Task-specific boosts (coding, planning, research)
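
A minimal sketch of how such a score could be computed and used for ranking. The Candidate fields, the multiplicative task boost, and the small floor on the denominator are assumptions for illustration; the real SuitabilityScorer may normalize and weight its inputs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate record; inputs are assumed normalized to (0, 1]."""
    name: str
    quality: float
    cost: float
    latency: float
    context_window: int


def suitability(c, task_boost=1.0, eps=1e-9):
    """S = Quality / (Cost * Latency), scaled by an optional task boost.

    The eps floor is an assumption: it avoids division by zero for
    zero-priced models.
    """
    return task_boost * c.quality / max(c.cost * c.latency, eps)


def rank(candidates, min_context, boosts=None):
    """Drop models below the context threshold, then sort by score, best first."""
    boosts = boosts or {}
    eligible = [c for c in candidates if c.context_window >= min_context]
    return sorted(
        eligible,
        key=lambda c: suitability(c, boosts.get(c.name, 1.0)),
        reverse=True,
    )


candidates = [
    Candidate("model-a", quality=0.9, cost=0.8, latency=0.5, context_window=128_000),
    Candidate("model-b", quality=0.7, cost=0.2, latency=0.4, context_window=32_000),
    Candidate("model-c", quality=0.8, cost=0.1, latency=0.9, context_window=8_000),
]
ranked = rank(candidates, min_context=16_000)
```

Here model-c is filtered out by the context threshold, and model-b outranks model-a because its lower cost and latency outweigh its lower quality.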

CLI Commands

Full command names and short aliases are both supported:

# Route a prompt (full and alias)
ear route "explain quicksort" --task coding --budget medium
ear r "explain quicksort" --task coding --budget medium

# JSON output for scripting
ear route "explain quicksort" --json

# Execute the routed model call
ear route "explain quicksort" --execute

# Inspect cached models
ear inspect-models
ear im

# Session metrics
ear stats
ear s

# Bare invocation: routes with sensible defaults
ear "explain quicksort"
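
For scripting from Python, the commands above could be wrapped as in this sketch. It uses only the flags shown above and assumes they can be combined; build_route_command and route_json are hypothetical helper names, and the shape of the JSON output is not documented here, so it is returned as parsed.

```python
import json
import subprocess


def build_route_command(prompt, task=None, budget=None, json_output=False, execute=False):
    """Build the argv list for `ear route` from the documented flags."""
    cmd = ["ear", "route", prompt]
    if task:
        cmd += ["--task", task]
    if budget:
        cmd += ["--budget", budget]
    if json_output:
        cmd.append("--json")
    if execute:
        cmd.append("--execute")
    return cmd


def route_json(prompt, **options):
    """Run `ear route ... --json` and parse stdout (requires `ear` on PATH)."""
    cmd = build_route_command(prompt, json_output=True, **options)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)


cmd = build_route_command(
    "explain quicksort", task="coding", budget="medium", json_output=True
)
```

Passing the command as a list rather than a shell string avoids quoting problems when the prompt contains spaces or shell metacharacters.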

MCP Design

Tool: route_and_execute
Resources: model performance metrics, cost per session
Transport: stdio
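
As a rough illustration of what a client sends over the stdio transport, the sketch below builds an MCP tools/call request for the route_and_execute tool. The argument name prompt is hypothetical, since the tool's input schema is not listed here, and a real client would complete the MCP initialize handshake before calling any tool.

```python
import json


def tools_call_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "prompt" is an assumed argument name, used only for illustration.
request = tools_call_request(1, "route_and_execute", {"prompt": "explain quicksort"})

# Over stdio, each message is written as a single line of JSON.
line = json.dumps(request)
```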

Ollama Private Provider

EAR can route PII-containing and injection-risk prompts to a local Ollama instance so that sensitive data stays on-premise; the rules below spell out when vetted cloud providers are still allowed.

Configuration:

export OLLAMA_BASE_URL=http://localhost:11434
export OLLAMA_ENABLED=true

Behavior:

ollama/<model> models appear in the registry with trusted=True and zero pricing.
Guardrail-blocked prompts route to Ollama when available instead of hard-blocking.
PII prompts are restricted to Ollama and vetted cloud providers only.
If Ollama is unavailable and a prompt is blocked, GuardrailsBlockedError is raised (fail-closed).
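
The behavior rules above can be sketched as a small selection function. This is an illustration under stated assumptions, not EAR's implementation: the dict-based model shape, the select_pool helper, and the use of the trusted flag to mark vetted cloud providers are hypothetical, and GuardrailsBlockedError is redefined locally as a stand-in.

```python
class GuardrailsBlockedError(Exception):
    """Local stand-in for the error named in the text."""


def is_local(model):
    # Assumption: local models are identified by the "ollama/" name prefix.
    return model["name"].startswith("ollama/")


def select_pool(models, *, blocked, has_pii, ollama_available):
    """Return the models a prompt may be routed to under the rules above."""
    available = [m for m in models if ollama_available or not is_local(m)]
    if blocked:
        local = [m for m in available if is_local(m)]
        if not local:
            # Fail closed: never fall through to a cloud provider.
            raise GuardrailsBlockedError("prompt blocked and no local provider available")
        return local
    if has_pii:
        # Assumption: trusted=True marks Ollama and vetted cloud providers.
        return [m for m in available if m["trusted"]]
    return available


MODELS = [
    {"name": "ollama/llama3", "trusted": True},
    {"name": "vendor-a/model-x", "trusted": True},
    {"name": "vendor-b/model-y", "trusted": False},
]
```

Keeping the blocked check ahead of the PII check makes the fail-closed path the first decision, so a blocked prompt can never reach a cloud model through a later branch.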

Interactive LLM Explorer and Demo UI

File: docs/llm_explorer.html
Purpose: interactive OpenRouter model table, routing demo, and value storytelling for leadership and investor demos.

What it includes:

Live model fetch from OpenRouter (/api/v1/models) with auto-refresh and last-updated indicator.
Search, provider pills, min-context, max-cost, and priced/unpriced filters.
Excel-style sortable table with per-column filters.
Side-by-side comparison cards for selected models (up to 4).
Value Story section with 10 routing scenarios: cost savings, latency gains, and safety enforcement.
Routing-mode toggle (Standard / Ollama Private): shows attack scenarios routing to ollama/llama3 for on-premise data-residency demonstration.
Processing progress log: step-by-step routing decisions with timestamps.

How to run:

Open docs/llm_explorer.html directly in a browser, or
Start the local demo server: python -m ear.demo_server (default port 7861)

Live React Web Console

Directory: webapp/
Purpose: developer-facing real-time routing visualization built with React and Vite.

How to run:

# Windows
run_live_webapp.bat

# Linux / macOS
bash run_live_webapp.sh

The launcher waits for the Vite dev server to be ready before opening the browser.

Demo Walkthrough

# Windows
run_demo_walkthrough.bat

# Linux / macOS
bash run_demo_walkthrough.sh

Runs all 10 demo routing scenarios end-to-end and opens the value storytelling view.

Configuration

Environment variables (minimum required):

OPENROUTER_API_KEY=<your key>
EAR_REGISTRY_TTL_SECONDS=300
EAR_DEFAULT_BUDGET=medium
EAR_MAX_RETRIES=3
EAR_OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
EAR_REQUEST_TIMEOUT_SECONDS=30

Optional Ollama private provider:

OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_ENABLED=true
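
A stdlib-only sketch of loading these variables. The project's own EARConfig is built on pydantic-settings and may differ; the Settings class, the load_settings helper, the choice of defaults (taken from the example values above), and treating the API key as mandatory are assumptions for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings record; defaults mirror the example values above."""
    openrouter_api_key: str
    registry_ttl_seconds: int = 300
    default_budget: str = "medium"
    max_retries: int = 3
    openrouter_base_url: str = "https://openrouter.ai/api/v1"
    request_timeout_seconds: float = 30.0
    ollama_base_url: str = "http://localhost:11434"
    ollama_enabled: bool = False


def load_settings(env=None):
    """Read settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    key = env.get("OPENROUTER_API_KEY", "")
    if not key:
        raise ValueError("OPENROUTER_API_KEY is required")
    return Settings(
        openrouter_api_key=key,
        registry_ttl_seconds=int(env.get("EAR_REGISTRY_TTL_SECONDS", 300)),
        default_budget=env.get("EAR_DEFAULT_BUDGET", "medium"),
        max_retries=int(env.get("EAR_MAX_RETRIES", 3)),
        openrouter_base_url=env.get("EAR_OPENROUTER_BASE_URL", "https://openrouter.ai/api/v1"),
        request_timeout_seconds=float(env.get("EAR_REQUEST_TIMEOUT_SECONDS", 30)),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        # Accept common truthy spellings for the boolean flag.
        ollama_enabled=env.get("OLLAMA_ENABLED", "false").strip().lower() in {"1", "true", "yes"},
    )


settings = load_settings({"OPENROUTER_API_KEY": "sk-example", "OLLAMA_ENABLED": "true"})
```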

Recommended local setup:

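Create and activate a virtual environment: python -m venv .venv, then .venv\Scripts\activate (Windows) or source .venv/bin/activate (Linux / macOS)
Install: pip install -e ".[dev]" (the quotes keep shells such as zsh from expanding the brackets)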
Copy .env.example to .env and populate values.
Run tests: run_tests.bat (Windows) or bash run_tests.sh
Run security audits: run_security_audits.bat or bash run_security_audits.sh

Quality and Security Requirements

100% statement and branch coverage for routing core.
Deterministic tests with mocked external dependencies.
Security linting with bandit.
Dependency auditing with pip-audit.
No plaintext secret logging.

Security Report HTML Generation

Security workflows generate JSON first, then render HTML using sec-report-kit.
pip-audit workflow outputs: security_reports/pip_audit_latest.html.
Trivy workflow outputs: security_reports/trivy_latest.html.
Both HTML files are uploaded in the workflow artifacts alongside JSON and SARIF outputs.
Local scripts also generate HTML from JSON:
- run_pip_audit.bat / run_pip_audit.sh
- run_trivy.bat / run_trivy.sh
- one-command wrapper: run_security_audits.bat / run_security_audits.sh

MCP Server: sec-report-kit

Install sec-report-kit locally:

pip install sec-report-kit

Configured MCP server command:

srk mcp serve --transport stdio

Workspace configuration is stored in .vscode/mcp.json.

Milestones

M1: Registry and schema baseline ✅
M2: Router core and CLI ✅
M3: Guardrails and metrics ✅
M4: MCP server and CI/CD gates ✅
M5: Execution runtime and adaptive routing intelligence ✅
M6: Leadership/investor demo frontend ✅
M8: Ollama private provider integration ✅
M9: React console and CLI UX hardening ✅
M7: Post-launch hardening (PyPI verify, canary, benchmarks, ADRs) ⏳ Pending

Tests

291 tests across 16 test modules
Enforced 100% statement and branch coverage for all routing, guardrail, and execution logic
All tests run with mocked external dependencies

run_tests.bat        # Windows
bash run_tests.sh    # Linux / macOS

Reports are written to coverage_reports/ (HTML, XML, JSON).

The short aliases ear r, ear im, and ear s are interchangeable with ear route, ear inspect-models, and ear stats.

Project details

Release history

Version	Date
0.11.0 (this version)	May 3, 2026
0.10.21	May 3, 2026
0.10.19	May 3, 2026
0.10.18	May 3, 2026
0.10.17	May 3, 2026
0.10.16	May 3, 2026
0.10.15	May 3, 2026
0.10.14	May 3, 2026
0.10.13	May 3, 2026
0.10.12	May 3, 2026
0.10.10	May 2, 2026
0.10.9	May 2, 2026
0.10.8	May 2, 2026
0.10.7	May 2, 2026
0.10.6	May 2, 2026
0.10.5	May 2, 2026
0.10.4	May 2, 2026
0.10.3	May 1, 2026
0.10.2	May 1, 2026
0.10.1	May 1, 2026
0.10.0	May 1, 2026

Download files

Source Distribution

efficient_agent_router_ear-0.11.0.tar.gz (79.8 kB)

Uploaded May 3, 2026 Source

Built Distribution

efficient_agent_router_ear-0.11.0-py3-none-any.whl (50.5 kB)

Uploaded May 3, 2026 Python 3

File details

Details for the file efficient_agent_router_ear-0.11.0.tar.gz.

File metadata

Download URL: efficient_agent_router_ear-0.11.0.tar.gz
Upload date: May 3, 2026
Size: 79.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for efficient_agent_router_ear-0.11.0.tar.gz
Algorithm	Hash digest
SHA256	`37d5d12b8d9f8525d4591ce4f69bd51b06acb4742e189e4d899c8310d3b77f85`
MD5	`ef9710dac0c6703de267372f99d3f68c`
BLAKE2b-256	`caf55ecd9546a516f1817366861e1357352b9e8ac5b520e651fae3c1e8e96b76`

Provenance

The following attestation bundles were made for efficient_agent_router_ear-0.11.0.tar.gz:

Publisher: publish-pypi.yml on ShanKonduru/efficient-agent-router-ear

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: efficient_agent_router_ear-0.11.0.tar.gz
- Subject digest: 37d5d12b8d9f8525d4591ce4f69bd51b06acb4742e189e4d899c8310d3b77f85
- Sigstore transparency entry: 1435939198
- Sigstore integration time: May 3, 2026
Source repository:
- Permalink: ShanKonduru/efficient-agent-router-ear@47f624b85ca12ae1dd4c010f5127271a2dad13cf
- Branch / Tag: refs/tags/v0.11.0
- Owner: https://github.com/ShanKonduru
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@47f624b85ca12ae1dd4c010f5127271a2dad13cf
- Trigger Event: push

File details

Details for the file efficient_agent_router_ear-0.11.0-py3-none-any.whl.

File metadata

Download URL: efficient_agent_router_ear-0.11.0-py3-none-any.whl
Upload date: May 3, 2026
Size: 50.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for efficient_agent_router_ear-0.11.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3f504501fc0d00136438c39f6dfcc049c35b68d6d3f8e82635e198805d223b16`
MD5	`1db08f56b835b96546b09c4c58514194`
BLAKE2b-256	`4047663146906886d00ee9c8413d9304eab55fdf834a2edf8edfc279502a2693`

Provenance

The following attestation bundles were made for efficient_agent_router_ear-0.11.0-py3-none-any.whl:

Publisher: publish-pypi.yml on ShanKonduru/efficient-agent-router-ear

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: efficient_agent_router_ear-0.11.0-py3-none-any.whl
- Subject digest: 3f504501fc0d00136438c39f6dfcc049c35b68d6d3f8e82635e198805d223b16
- Sigstore transparency entry: 1435939204
- Sigstore integration time: May 3, 2026
Source repository:
- Permalink: ShanKonduru/efficient-agent-router-ear@47f624b85ca12ae1dd4c010f5127271a2dad13cf
- Branch / Tag: refs/tags/v0.11.0
- Owner: https://github.com/ShanKonduru
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@47f624b85ca12ae1dd4c010f5127271a2dad13cf
- Trigger Event: push
