The Switchboard - Unified LLM API Gateway with fail-closed semantics

The Switchboard 🔌

Unified LLM API Gateway with Fail-Closed Semantics

The Switchboard is a high-performance proxy service that provides intelligent routing to multiple LLM providers. It serves as the central nervous system for all Aperion agents (Sentinel, AR, Aether), ensuring they can access LLMs reliably and cost-effectively.

🎯 Core Features

  • OpenAI-Compatible API: Drop-in replacement - just change base_url
  • Intelligent Task Routing: Security tasks → Premium, Docs → Free tier
  • Fail-Closed Semantics: Never silently falls back to Echo in production
  • Cost Optimization: Target 75% savings by routing volume to free tiers
  • Telemetry Injection: X-Correlation-ID propagation for tracing
  • Structured Logging: JSON cost/latency metrics (Constitution D3)

🚀 Quick Start

Installation

# From PyPI
pip install aperion-switchboard

# From source
pip install -e .

# With dev dependencies
pip install -e ".[dev]"

Configuration

Set environment variables for your providers:

# OpenAI (Premium tier)
export OPENAI_API_KEY=sk-...

# Google Gemini (Free tier)
export GEMINI_API_KEY=AIza...

# Cloudflare Workers AI (Low-cost tier)
export WORKERS_AI_API_KEY=your-cf-token
export WORKERS_AI_BASE_URL=https://api.cloudflare.com/client/v4/accounts/ACCT/ai/run

Running

# Development
python -m aperion_switchboard.main

# Production
uvicorn aperion_switchboard.main:app --host 0.0.0.0 --port 8080

# Docker
docker build -t switchboard .
docker run -p 8080:8080 \
  -e OPENAI_API_KEY=sk-... \
  -e GEMINI_API_KEY=AIza... \
  switchboard

📡 API Usage

OpenAI-Compatible Endpoint

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Aperion-Task-Type: security_audit" \
  -d '{
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Analyze this code for vulnerabilities"}]
  }'
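
The same call can be sketched in Python using only the standard library. This is a minimal illustration, not part of the package: `build_chat_request` and `BASE_URL` are hypothetical names, and the local address assumes the gateway is running on port 8080 as in the Docker example. The request is built but not sent, so the snippet runs without a live gateway.

```python
import json
import urllib.request

# Assumption: the gateway listens locally on port 8080 (see the Docker example).
BASE_URL = "http://localhost:8080"


def build_chat_request(
    prompt: str,
    task_type: str = "general",
    model: str = "gpt-4.1-mini",
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the gateway."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Routing hint read by the Switchboard.
            "X-Aperion-Task-Type": task_type,
        },
        method="POST",
    )


request = build_chat_request(
    "Analyze this code for vulnerabilities", task_type="security_audit"
)

# To send it against a running gateway:
#     with urllib.request.urlopen(request) as response:
#         print(json.load(response))
```

Because the endpoint is OpenAI-compatible, an existing OpenAI client should also work once its base URL points at the gateway.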

Task Types

Use the X-Aperion-Task-Type header to trigger intelligent routing:

| Task Type | Routes To | Use Case |
| --- | --- | --- |
| security_audit | OpenAI | Critical security analysis |
| production_decision | OpenAI | High-stakes decisions |
| strategic_analysis | OpenAI | Complex reasoning |
| code_review | OpenAI | Quality reviews |
| doc_update | Gemini | Documentation updates |
| doc_generation | Gemini | Batch doc creation |
| lint_analysis | Gemini | Fast batch processing |
| test_generation | Gemini | High-volume generation |
| general | Gemini | Default (cost-optimized) |
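
The table above can be read as a simple lookup. The sketch below is illustrative only and is not the package's router: `TASK_ROUTES` and `provider_for` are hypothetical names, and treating a missing or unrecognized task type as `general` is an assumption based on `general` being listed as the default.

```python
from typing import Optional

# Routing table transcribed from the task-type table above.
TASK_ROUTES = {
    "security_audit": "openai",
    "production_decision": "openai",
    "strategic_analysis": "openai",
    "code_review": "openai",
    "doc_update": "gemini",
    "doc_generation": "gemini",
    "lint_analysis": "gemini",
    "test_generation": "gemini",
    "general": "gemini",
}


def provider_for(task_type: Optional[str]) -> str:
    """Return the provider name for the value of X-Aperion-Task-Type."""
    # Assumption: a missing or unknown task type falls back to "general",
    # and the header value is matched case-insensitively.
    key = (task_type or "general").strip().lower()
    return TASK_ROUTES.get(key, TASK_ROUTES["general"])


print(provider_for("security_audit"))
print(provider_for(None))
```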

🔒 Constitution Compliance

A6: Fail-Closed Semantics (Iron Rule)

The Switchboard MUST NEVER silently fall back to the Echo provider in production.

  • If no real providers are configured AND APERION_ALLOW_ECHO is not "true":
    • Service crashes on startup
    • Returns 503 for all requests
    • Logs CRITICAL error with remediation steps

# Production mode (default) - will crash if no providers configured
export APERION_ALLOW_ECHO=false

# Development mode - allows echo fallback
export APERION_ALLOW_ECHO=true
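
A startup check in the spirit of rule A6 might look like the following. This is a sketch under stated assumptions, not the code in `fail_closed.py`: `NoProvidersConfigured`, `configured_providers`, and `enforce_fail_closed` are hypothetical names, and a provider is assumed to count as configured when its API key variable is non-empty. It covers only the startup decision, not the 503 responses or logging.

```python
from typing import List, Mapping

# Credential variables listed in this README.
PROVIDER_KEY_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY", "WORKERS_AI_API_KEY")


class NoProvidersConfigured(RuntimeError):
    """Hypothetical error type; the package's own exception is not shown here."""


def configured_providers(env: Mapping[str, str]) -> List[str]:
    """Return the key variables that are set to a non-empty value."""
    return [name for name in PROVIDER_KEY_VARS if env.get(name, "").strip()]


def enforce_fail_closed(env: Mapping[str, str]) -> str:
    """Return "real" or "echo", or raise when neither is permitted."""
    if configured_providers(env):
        return "real"
    # Echo is allowed only on explicit opt-in; anything other than "true" fails closed.
    if env.get("APERION_ALLOW_ECHO", "false") == "true":
        return "echo"
    raise NoProvidersConfigured(
        "No LLM providers configured. Set at least one of "
        + ", ".join(PROVIDER_KEY_VARS)
        + ' or set APERION_ALLOW_ECHO="true" for development.'
    )
```

Calling `enforce_fail_closed(os.environ)` before the application starts serving would surface a misconfiguration immediately rather than on the first request.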

B1: Secrets Management

All credentials are loaded from environment variables:

  • OPENAI_API_KEY
  • GEMINI_API_KEY
  • WORKERS_AI_API_KEY
  • SWITCHBOARD_API_KEY (optional - for Switchboard auth)

D1: Telemetry Injection

  • Extracts X-Correlation-ID from incoming requests
  • Generates one if missing (format: sw_{uuid})
  • Propagates to all upstream provider requests
  • Adds to response headers

D3: Structured Logging

All cost/latency metrics are logged as JSON:

{
  "event": "llm_request_cost",
  "correlation_id": "sw_abc123",
  "provider": "openai",
  "model": "gpt-4.1-mini",
  "estimated_cost_usd": 0.00015,
  "tokens": {"prompt": 100, "completion": 50, "total": 150},
  "latency_ms": 1234,
  "task_type": "security_audit"
}
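
A record of this shape could be produced as follows. This is a minimal sketch, not the package's logging code: `cost_log_record` and the logger name are hypothetical, and the estimated cost is passed in rather than computed because per-token pricing is not given here.

```python
import json
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("switchboard.cost")


def cost_log_record(
    *,
    correlation_id: str,
    provider: str,
    model: str,
    estimated_cost_usd: float,
    prompt_tokens: int,
    completion_tokens: int,
    latency_ms: int,
    task_type: str,
) -> str:
    """Serialize one cost/latency record using the field names shown above."""
    record = {
        "event": "llm_request_cost",
        "correlation_id": correlation_id,
        "provider": provider,
        "model": model,
        "estimated_cost_usd": estimated_cost_usd,
        "tokens": {
            "prompt": prompt_tokens,
            "completion": completion_tokens,
            # The total is derived so it cannot drift from its parts.
            "total": prompt_tokens + completion_tokens,
        },
        "latency_ms": latency_ms,
        "task_type": task_type,
    }
    return json.dumps(record)


line = cost_log_record(
    correlation_id="sw_abc123",
    provider="openai",
    model="gpt-4.1-mini",
    estimated_cost_usd=0.00015,
    prompt_tokens=100,
    completion_tokens=50,
    latency_ms=1234,
    task_type="security_audit",
)
logger.info(line)
```

Emitting one JSON object per line keeps the records easy to parse with standard log tooling.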

🧪 Testing

# Run all tests
pytest

# Run safety tests (fail-closed verification)
pytest -m safety

# Run unit tests only
pytest -m unit

# Run integration tests (requires API keys)
pytest -m integration

# With coverage
pytest --cov=aperion_switchboard --cov-report=html

📊 Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/chat/completions | POST | OpenAI-compatible chat |
| /health | GET | Health check |
| /healthz | GET | Kubernetes health probe |
| /docs | GET | OpenAPI documentation |

๐Ÿ—๏ธ Architecture

src/aperion_switchboard/
โ”œโ”€โ”€ core/
โ”‚   โ”œโ”€โ”€ router.py      # Task routing & fallback logic
โ”‚   โ”œโ”€โ”€ protocol.py    # LLMClient abstract base class
โ”‚   โ””โ”€โ”€ fail_closed.py # Constitution A6 enforcement
โ”œโ”€โ”€ providers/
โ”‚   โ”œโ”€โ”€ openai.py      # OpenAI/compatible providers
โ”‚   โ”œโ”€โ”€ gemini.py      # Google Gemini
โ”‚   โ”œโ”€โ”€ workers.py     # Cloudflare Workers AI
โ”‚   โ””โ”€โ”€ echo.py        # Test-only echo provider
โ”œโ”€โ”€ service/
โ”‚   โ”œโ”€โ”€ app.py         # FastAPI application
โ”‚   โ”œโ”€โ”€ middleware.py  # Auth, telemetry, cost logging
โ”‚   โ””โ”€โ”€ schemas.py     # OpenAI-compatible Pydantic models
โ””โ”€โ”€ main.py            # Entry point

📈 Cost Optimization

The Switchboard targets ~75% cost savings by:

  1. Routing 80% of requests (docs, linting, tests) to free tiers
  2. Reserving premium providers for critical tasks only
  3. Tracking and reporting cost per request
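
The relationship between the 80% routing share and the ~75% savings figure can be worked through with simple arithmetic. The sketch below assumes a simplified model that is not taken from the package: every request would cost the same on a premium provider, the baseline sends all traffic to premium, and `cheap_cost_ratio` is the cost of a cheaper-tier request relative to a premium one. `savings_fraction` is a hypothetical helper.

```python
def savings_fraction(cheap_share: float, cheap_cost_ratio: float) -> float:
    """Fraction of spend saved versus sending every request to premium.

    cheap_share: share of requests routed to the cheaper tier (0..1).
    cheap_cost_ratio: cheaper-tier cost as a fraction of premium cost (0..1).
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    if not 0.0 <= cheap_cost_ratio <= 1.0:
        raise ValueError("cheap_cost_ratio must be between 0 and 1")
    # Blended cost per request, with premium cost normalized to 1.0.
    blended = (1.0 - cheap_share) + cheap_share * cheap_cost_ratio
    return 1.0 - blended


# If the 80% were entirely free, savings would be about 80%.
all_free = savings_fraction(0.8, 0.0)
# Savings of about 75% correspond to a cheaper tier costing 6.25% of premium.
low_cost = savings_fraction(0.8, 0.0625)

print(f"{all_free:.1%} {low_cost:.1%}")
```

Under this model, the gap between 80% and 75% leaves room for some of the routed volume to land on a low-cost tier rather than a strictly free one.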

View cost summary:

from aperion_switchboard.core.router import get_router

router = get_router()
summary = router.get_cost_summary()
print(f"Savings: {summary['savings_percent']:.1f}%")

🔧 Development

# Install dev dependencies
pip install -e ".[dev]"

# Run linter
ruff check src tests

# Run type checker
mypy src

# Run tests with coverage
pytest --cov=aperion_switchboard --cov-report=term-missing

License

MIT

Source Distribution

aperion_switchboard-1.2.1.tar.gz (72.9 kB)

Uploaded Source

Built Distribution

aperion_switchboard-1.2.1-py3-none-any.whl (47.1 kB)

Uploaded Python 3

File details

Details for the file aperion_switchboard-1.2.1.tar.gz.

File metadata

  • Download URL: aperion_switchboard-1.2.1.tar.gz
  • Size: 72.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for aperion_switchboard-1.2.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | b26e962b891e2a0de008c34b0bfd4b479d92bce46f6f2d999095d14b010f7660 |
| MD5 | 79163b691970c72869df476835ca88c2 |
| BLAKE2b-256 | 27b4ed152663e35b003d829cd976b2118cb9dfe41f6121799611efe210a92b19 |

Provenance

The following attestation bundles were made for aperion_switchboard-1.2.1.tar.gz:

Publisher: release.yml on invictustitan2/aperion-llm-router

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file aperion_switchboard-1.2.1-py3-none-any.whl.

File hashes

Hashes for aperion_switchboard-1.2.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7ec31a1323e65150552dbda35b50a28387d2b99f1e88ae04a78d8ef451521db5 |
| MD5 | ae96b73e44c5764d0f2f4529b2b99aa2 |
| BLAKE2b-256 | 526b1517d2355cc038e1ad67451d627f8e296b163a9afcde85cabcdee0dda630 |

Provenance

The following attestation bundles were made for aperion_switchboard-1.2.1-py3-none-any.whl:

Publisher: release.yml on invictustitan2/aperion-llm-router
