Skip to main content

Multi-model MCP routing fabric for local and cloud LLMs

Project description

AuraRouter: The AuraXLM-Lite Compute Fabric

Current Status: Production Prototype v5 (Feb 2026) Maintainer: Steven Siebert / AuraCore Dynamics

Overview

AuraRouter implements a role-based configurable xLM (SLM/TLM/LLM) prompt routing fabric. It acts as intelligent middleware that routes tasks across local and cloud models with automatic fallback. AuraRouter is content-agnostic -- it handles code generation, summarization, analysis, RAG-enabled Q&A, and any other prompt-based work. It can run as an MCP server, a desktop GUI application, or a managed service on AuraGrid.

It implements an Intent -> Plan -> Execute loop:

  1. Classifier: A fast local model classifies the task (Direct vs. Multi-Step).
  2. Planner: If multi-step, a reasoning model generates a sequential execution plan.
  3. Worker: An execution model carries out the plan step-by-step.

Architecture

graph TD
    User[MCP Client / GUI] -->|Task| Classifier{Intent Analysis}
    Classifier -->|Direct| Worker[Worker Node]
    Classifier -->|Multi-Step| Planner[Planner Node]
    Planner -->|Plan JSON| Worker

    subgraph Compute Fabric [auraconfig.yaml]
        Worker -->|Try| Node1[Local Model]
        Node1 -->|Fail| Node2[Cloud Fallback]
    end

Installation

1. Core Install (Required)

pip install aurarouter

2. Local Backend (Plugin-based Architecture)

AuraRouter uses a Modular Backend Architecture to keep the core package lean and platform-independent. To use local inference (via llama.cpp), you must install at least one "flavor" package.

AuraRouter dynamically discovers all installed backend plugins at runtime and automatically selects the best one based on your local hardware (preferring GPU over CPU). See BACKEND_PLUGINS.md for technical details.

Platform / Hardware Recommendation Command
NVIDIA RTX 50/40/30/20 aurarouter-cuda13 pip install aurarouter-cuda13
Older NVIDIA GPUs aurarouter-cuda12 pip install aurarouter-cuda12
Windows (CPU Only) aurarouter-win-x64 pip install aurarouter-win-x64
MacOS (M1/M2/M3/Intel) aurarouter-macos-x64 pip install aurarouter-macos-x64
Linux / Generic GPU aurarouter-vulkan pip install aurarouter-vulkan

You can install multiple backends; AuraRouter will score each one using its internal diagnostics and use the most performant one for your current machine.

Optional Dependencies

# With HuggingFace model downloading
pip install aurarouter[local]

# Everything (Core + Dev tools)
pip install aurarouter[all]

Source Install

git clone https://github.com/auracoredynamics/aurarouter.git
cd aurarouter
pip install -r requirements.txt        # Core dependencies
pip install -r requirements-local.txt   # Optional: local inference deps
pip install -e .                        # Editable install

Conda

conda env create -f environment.yaml
conda activate aurarouter

Standalone Executable (Windows)

If you prefer to run AuraRouter without a Python environment, you can build a standalone executable. You can choose to bundle specific backends or all of them:

# Build with all installed backends
python build.py

# Build a universal executable (bundles all potential backends)
python build.py --backends all

# Build for specific hardware only
python build.py --backends cuda13,win-x64

The executable will be generated at aurarouter/dist/aurarouter.exe.

See DEPLOYMENT.md for detailed deployment and configuration guide.

Quick Start

1. Configuration

Run the interactive installer to create a config template:

aurarouter --install

Or manually create ~/.auracore/aurarouter/auraconfig.yaml:

models:
  local_qwen:
    provider: ollama
    endpoint: http://localhost:11434/api/generate
    model_name: qwen2.5-coder:7b

  cloud_gemini:
    provider: google
    model_name: gemini-2.0-flash
    api_key: "AIzaSy..."

roles:
  router:   [local_qwen, cloud_gemini]
  reasoning: [cloud_gemini]
  coding:   [local_qwen, cloud_gemini]

# Optional: semantic verb synonyms for intent classification
semantic_verbs:
  coding:
    synonyms: [programming, code generation, developer]
  reasoning:
    synonyms: [planner, architect, planning]

2. Run

# Standalone executable (Windows)
.\aurarouter\dist\aurarouter.exe        # Run MCP server
.\aurarouter\dist\aurarouter.exe gui    # Launch desktop GUI

# Python module (Installed)
aurarouter                             # Run MCP server
aurarouter gui                         # Launch desktop GUI

Providers

Provider Type Config Key Dependencies
Ollama Local HTTP ollama None (uses httpx)
llama.cpp Server Local HTTP llamacpp-server None (uses httpx)
llama.cpp (Managed) Local Native llamacpp Bundled binary (no extra install)
OpenAPI-Compatible Local/Cloud HTTP openapi None (uses httpx)
Google Gemini Cloud google Included
Anthropic Claude Cloud claude Included

The OpenAPI provider works with any OpenAI-compatible API endpoint (vLLM, text-generation-inference, LocalAI, LM Studio, etc.).

GUI

The desktop GUI (included in the base install) provides:

  • Singleton enforcement — Only one AuraRouter instance runs at a time; subsequent launches detect the existing instance
  • Environment selector — Switch between Local and AuraGrid deployments at runtime
  • Service controls — Start, stop, and pause the MCP server or AuraGrid MAS
  • Model loading progress — Visual progress indicator when local GPU models are loading
  • Execute tab — Task input with file attachment, DAG execution visualization, task output
  • Models tab — Local GGUF model browser, HuggingFace downloads, local file import, grid model listing (AuraGrid)
  • Configuration tab — Model CRUD with capability tags, fallback chain editor with known roles, semantic verb configuration, YAML preview, cell-wide save warnings (AuraGrid)
  • Grid panels (AuraGrid) — Deployment strategy editor, cell node status, event log
  • Health dashboard — Per-model health status with clickable indicator (state-aware: reports correctly when service is stopped)
  • Privacy-aware routing — Automatically re-routes prompts containing PII away from cloud models to local/private-tagged models
  • Prompt history — Last 20 tasks with results, restorable from dropdown
  • Keyboard shortcuts — Ctrl+Enter (execute), Ctrl+N (new), Escape (cancel)

All configuration changes are persisted to auraconfig.yaml. See GUI_GUIDE.md for the complete guide.

CLI Commands

Command Description
aurarouter Run MCP server (default)
aurarouter gui Launch desktop GUI
aurarouter backends List discovered backends and hardware health
aurarouter --rescan-hardware Clear backend cache and re-run hardware diagnostics
aurarouter download-model --repo REPO --file FILE Download GGUF model from HuggingFace
aurarouter list-models List locally downloaded GGUF models
aurarouter remove-model --file FILE Remove a downloaded model
aurarouter auto-tune --file FILE Suggest optimal parameters for a GGUF model
aurarouter --install Interactive installer for MCP clients
aurarouter --install-gemini Register for Gemini CLI
aurarouter --install-claude Register for Claude

MCP Tools

Dynamic Model Registration

External services can register new GGUF models via the aurarouter.assets.register MCP tool:

# Example: Register a fine-tuned model
result = mcp_client.call_tool("register_asset", {
    "model_id": "my-finetuned-qwen",
    "file_path": "/path/to/model.gguf",
    "repo": "myorg/my-finetuned-model",
    "tags": "coding,local,fine-tuned"
})

After registration, add the model to a role chain and restart the MCP server to enable routing.

Asset Discovery

Query available local GGUF files via the aurarouter.assets.list MCP tool:

# List all downloaded models
result = mcp_client.call_tool("list_assets", {})

Returns a JSON array of asset entries with repo, filename, path, size, and metadata.

AuraGrid Integration (Optional)

AuraRouter can be deployed as a Managed Application Service (MAS) on AuraGrid for distributed access to routing services.

pip install aurarouter[auragrid]

See AURAGRID.md for the complete integration guide.

Scaling Guide

When you add new on-prem xLM resources:

  1. Open auraconfig.yaml (or use the GUI Configuration tab).
  2. Add the new model under models.
  3. Add it to the appropriate role chain under roles.
  4. Restart the router (or save from the GUI). No code changes required.

Troubleshooting

  • "Empty response received": The local model is likely OOMing or timing out. Check the timeout setting in auraconfig.yaml.
  • "Model not found": Ensure the model_name in YAML matches ollama list exactly.
  • "huggingface-hub is required": Run pip install aurarouter[local] to enable model downloading from HuggingFace.
  • "llama-server binary not found": AuraRouter needs a backend plugin to run local models. Install a backend package (e.g., pip install aurarouter-cuda13) or set the AURAROUTER_LLAMACPP_BIN environment variable to point to a custom llama-server binary.
  • PySide6 issues on headless servers: PySide6 is a core dependency. On headless/server-only deployments, use the MCP server mode (aurarouter) which does not launch the GUI.

License

Copyright 2026 AuraCore Dynamics Inc.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

aurarouter-0.5.0.tar.gz (204.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

aurarouter-0.5.0-py3-none-any.whl (175.2 kB view details)

Uploaded Python 3

File details

Details for the file aurarouter-0.5.0.tar.gz.

File metadata

  • Download URL: aurarouter-0.5.0.tar.gz
  • Upload date:
  • Size: 204.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aurarouter-0.5.0.tar.gz
Algorithm Hash digest
SHA256 a2b6330d44bd64603dad921de5bbbb4aac385c1b7a09b186cb4a30a4d7c957f4
MD5 58412f50bdbf44b31cd7acb40c09c550
BLAKE2b-256 efa3997c1a5071113fd618ad5712536e7df504b4ad39d33ba94b0706da028ad5

See more details on using hashes here.

Provenance

The following attestation bundles were made for aurarouter-0.5.0.tar.gz:

Publisher: publish.yml on AuraCoreDynamics/aurarouter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file aurarouter-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: aurarouter-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 175.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aurarouter-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 afcf50362a8e282f43724925023148da28c8d634a093e9332ae8bb21d732c0f8
MD5 366a595bf05f07e57e93992a23e67eb2
BLAKE2b-256 ab51375a185a2b1f68664b2cf05b4a83b85ce049ad0b8212fd2ac8904d153184

See more details on using hashes here.

Provenance

The following attestation bundles were made for aurarouter-0.5.0-py3-none-any.whl:

Publisher: publish.yml on AuraCoreDynamics/aurarouter

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page