OpenAI-compatible AI gateway with automatic complexity-based model routing.
Project description
Cortex Gateway
OpenAI-compatible AI gateway with automatic complexity-based model routing, multi-provider support, client key management, and a built-in Web UI.
Features
- Auto-Routing — Classifies each request as
simpleorcomplexusing a 2-stage strategy: fast regex heuristics → LLM fallback (only when uncertain) - Multi-Provider — Supports both Anthropic (
claude-*) and OpenAI-compatible backends per routing tier - Context Compression — Prompt caching via Anthropic's
cache_control(ephemeral) on system prompts - OpenAI-Compatible — Drop-in proxy for
/v1/chat/completionsand/v1/models - SSE Streaming — Full streaming with
<think>tag parsing →reasoning_contentfield - Multi-Language — Automatic language detection and response localisation (Vietnamese, Chinese, Japanese, Korean, Arabic, Thai, Russian)
- Tool Use — Full support for OpenAI-format
tool_calls/toolmessages, converted transparently to Anthropic format - Client Key Management — Issue, list, and revoke
cgw-*API keys per client - SQLite Persistence — Model config, client keys, and request logs stored in a local SQLite database
- Web UI — Built-in home page, login page, and admin dashboard served at
/
Requirements
- Python 3.10+
- Anthropic or OpenAI API key (depending on provider choice)
Installation
From PyPI
pip install cortex-gateway
From source
git clone https://github.com/leandix/cortex-gateway
cd cortex-gateway
pip install -e .
Configuration
Copy .env.example to .env and fill in your API keys:
cp .env.example .env
# Simple task model (fast, cheap — e.g. Haiku)
SIMPLE_MODEL_NAME=claude-haiku-4-5
SIMPLE_MODEL_API_KEY=sk-ant-...
SIMPLE_MODEL_BASE_URL=https://api.anthropic.com
# Complex task model (powerful — e.g. Sonnet)
COMPLEX_MODEL_NAME=claude-sonnet-4-5
COMPLEX_MODEL_API_KEY=sk-ant-...
COMPLEX_MODEL_BASE_URL=https://api.anthropic.com
# Gateway
GATEWAY_HOST=0.0.0.0
GATEWAY_PORT=8000
# Optional: custom SQLite path (default: cortex_gateway.db in current dir)
# CORTEX_DB_PATH=/var/data/cortex.db
# Optional: custom JWT secret (auto-generated per instance if not set)
# JWT_SECRET=your-secret-here
Note: Model configuration is seeded from
.envon first startup into SQLite and can be updated live via the Dashboard.
Running
# Start with defaults (reads GATEWAY_HOST / GATEWAY_PORT from .env)
cortex-gateway
# Custom host and port
cortex-gateway --host 127.0.0.1 --port 9000
# Dev mode with hot reload
cortex-gateway --reload
# Multiple worker processes
cortex-gateway --workers 4
# Show version
cortex-gateway --version
# Run as Python module
python -m cortex_gateway
After startup, the following URLs are available:
| URL | Description |
|---|---|
http://localhost:8000/ |
Public home page |
http://localhost:8000/login |
Admin login |
http://localhost:8000/dashboard |
Admin dashboard |
Client Configuration
To use Cortex Gateway as a drop-in proxy, point your AI client at http://localhost:8000/v1 and provide a client API key (created from the Dashboard).
Continue (config.yaml)
models:
- name: Cortex
provider: openai
model: cortex-auto # auto-routes between simple / complex model
apiBase: http://localhost:8000/v1
apiKey: cgw-... # your client API key from the Dashboard
Override routing per request
Force a specific model by name:
models:
- name: Cortex Sonnet
provider: openai
model: claude-sonnet-4-5 # bypasses auto-routing, goes directly to complex tier
apiBase: http://localhost:8000/v1
apiKey: cgw-...
Or use custom headers:
X-Cortex-Model-Simple: claude-haiku-4-5
X-Cortex-Model-Complex: claude-sonnet-4-5
API Endpoints
Core (requires client API key)
| Endpoint | Method | Description |
|---|---|---|
/v1/chat/completions |
POST | Chat completions (streaming & non-streaming) |
/v1/models |
GET | List available models |
Admin (requires JWT — obtained from POST /api/login)
| Endpoint | Method | Description |
|---|---|---|
/api/login |
POST | Login with admin password → returns JWT |
/api/change-password |
POST | Change admin password |
/api/status |
GET | Public gateway status (uptime, model names, total requests) |
/api/keys |
GET | List all client API keys |
/api/keys |
POST | Create a new client API key |
/api/keys/{key} |
DELETE | Revoke a client API key |
/api/models |
GET | Get model configuration (masked API keys) |
/api/models |
PUT | Update model configuration for simple/complex tier |
/api/stats |
GET | Request stats grouped by client key and model |
/health |
GET | Health check |
Default admin credentials:
admin123— change this immediately after first login.
Routing Engine
Routing uses a 2-stage pipeline:
- Heuristic Scoring — Fast, zero-cost regex pattern matching. Each pattern contributes a positive (complex) or negative (simple) score. If the score exceeds a threshold (≥ 2 → COMPLEX, ≤ −2 → SIMPLE), the result is returned immediately.
- LLM Classification — Called only when the heuristic score is in the "grey zone". Uses the simple model with a minimal prompt (
max_tokens=5) to outputSIMPLEorCOMPLEX.
Special overrides force SIMPLE for meta-requests like "give this chat a title".
Database
Cortex Gateway uses a local SQLite database (cortex_gateway.db by default) with the following tables:
| Table | Purpose |
|---|---|
admin |
Hashed admin password (SHA-256 + salt) |
model_config |
Active model settings per tier (simple / complex) |
client_keys |
Client API keys with name, status, and creation time |
request_logs |
Per-request log (client key, model, complexity, token counts) |
The database path can be overridden with CORTEX_DB_PATH.
Project Structure
cortex_gateway/
├── app.py # FastAPI routes, JWT auth, streaming, message format conversion
├── cli.py # CLI entry point (argparse + uvicorn)
├── config.py # Env vars, system prompts, routing patterns, language data
├── db.py # SQLite schema, admin/key/stats operations
├── router.py # Complexity classifier, language detector
└── static/ # Built-in Web UI (index.html, login.html, dashboard.html)
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cortex_gateway-1.0.0.tar.gz.
File metadata
- Download URL: cortex_gateway-1.0.0.tar.gz
- Upload date:
- Size: 70.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
56c294370b2facfe05f87be47fcc16a4a23880dcde9a7b4d49dbda98bda2f40d
|
|
| MD5 |
f8e667c0f543fdf1166b9f4434354376
|
|
| BLAKE2b-256 |
4358afcac0fb5d0a829b2f05ad20697589275168156b71add11731de03e88682
|
File details
Details for the file cortex_gateway-1.0.0-py3-none-any.whl.
File metadata
- Download URL: cortex_gateway-1.0.0-py3-none-any.whl
- Upload date:
- Size: 74.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d31e79b13e036d5d4189e196f7f7364b231e2bdbfeaf348d66c0517fb45f0008
|
|
| MD5 |
50674efd568b60f6420f60487fbbd262
|
|
| BLAKE2b-256 |
6d2930e0317886aab2d9bc33da2a354941a2c8ba4244410b7b5d99f477dede18
|