llm-relay

Unified LLM usage management — API proxy, session diagnostics, multi-CLI orchestration.

Korean | llms.txt

Features

  • Proxy: Transparent API proxy with cache/token monitoring and 12-strategy pruning
  • Detect: 7 detectors (orphan, stuck, bloat, synthetic, cache, resume, microcompact)
  • Recover: Session recovery and doctor (7 health checks)
  • Guard: 4-tier threshold daemon with dual-zone classification
  • Cost: Per-1% cost calculation and rate-limit header analysis
  • Orch: Multi-CLI orchestration (Claude Code, Codex CLI, Gemini CLI)
  • Display: Multi-CLI session monitor with provider badges and liveness detection
  • I18n: Multi-language support (English, Korean) with browser auto-detection and the LLM_RELAY_LANG environment variable
  • MCP: 8 tools via stdio transport (cli_delegate, cli_status, cli_probe, orch_delegate, orch_history, relay_stats, session_turns, session_history)
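The I18n feature combines an environment variable with browser auto-detection. The sketch below shows one way such a resolution could work; it is not llm-relay's actual code. The function name `pick_language`, the precedence (environment variable first, then the browser's Accept-Language header) and the English fallback are all assumptions made for illustration.

```python
from typing import Mapping, Optional

SUPPORTED = ("en", "ko")  # English and Korean, per the feature list


def pick_language(env: Mapping[str, str], accept_language: Optional[str] = None) -> str:
    """Return a supported language code (hypothetical helper, not llm-relay API)."""
    # 1. Explicit override via LLM_RELAY_LANG (precedence is an assumption).
    override = env.get("LLM_RELAY_LANG", "").strip().lower()[:2]
    if override in SUPPORTED:
        return override

    # 2. Browser auto-detection: rank Accept-Language tags by q-value.
    candidates = []
    for position, part in enumerate((accept_language or "").split(",")):
        tag, _, params = part.strip().partition(";")
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has quality 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            candidates.append((-quality, position, tag.strip().lower()[:2]))
    for _, _, code in sorted(candidates):
        if code in SUPPORTED:
            return code

    # 3. Nothing matched: fall back to English (assumed default).
    return "en"
```

For example, `pick_language({}, "fr;q=0.9, ko-KR;q=0.8")` skips the unsupported French tag and selects Korean, while setting `LLM_RELAY_LANG=en` would win regardless of the header.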

Install

1. Set up Python environment

Windows (pip)
python -m venv .venv
.venv\Scripts\activate
Windows (conda)
conda create -n llm-relay python=3.12
conda activate llm-relay
Linux / macOS (pip)
python3 -m venv .venv
source .venv/bin/activate
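Before moving on, it can help to confirm that the environment is actually active. The snippet below is a small sanity check, not part of llm-relay; `active_environment` is a hypothetical helper. It relies on two standard signals: `sys.prefix` differs from `sys.base_prefix` inside a venv, and conda sets `CONDA_DEFAULT_ENV` when an environment is activated.

```python
import os
import sys


def active_environment(env=os.environ):
    """Report which kind of Python environment this interpreter runs in."""
    # Inside a venv, sys.prefix points at the venv and base_prefix at the base install.
    if sys.prefix != sys.base_prefix:
        return "venv"
    # conda activate exports CONDA_DEFAULT_ENV (it is also set for "base").
    if env.get("CONDA_DEFAULT_ENV"):
        return "conda"
    return "system"


print(active_environment())
```

If this prints `system`, the activation step above has not taken effect in the current shell.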

2. Install llm-relay

# Default (SQLite, zero-config)
pip install llm-relay

# Extras are quoted so that shells such as zsh do not treat [] as a glob pattern

# With proxy + web dashboard
pip install "llm-relay[proxy]"

# With PostgreSQL support (long-term analytics + vector search)
pip install "llm-relay[pg]"

# With MCP server (Python 3.10+)
pip install "llm-relay[mcp]"

# Everything
pip install "llm-relay[all]"

3. Choose database

          | SQLite (default)             | PostgreSQL
Setup     | Zero-config                  | Requires PG server
Best for  | Getting started, light usage | Long-term data analytics, vector search
Install   | pip install llm-relay        | pip install "llm-relay[pg]"
Config    | (none needed)                | LLM_RELAY_DB=postgresql://user:pass@host/db
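To make the Config row concrete, here is a pre-flight check that reports which backend a given LLM_RELAY_DB value would select, without opening a connection. It is an illustrative sketch, not how llm-relay parses the variable: `describe_backend` is a hypothetical helper, and the default port 5432 and the rejection of other URL schemes are assumptions.

```python
import os
from urllib.parse import urlsplit


def describe_backend(env=os.environ):
    """Summarize the backend implied by LLM_RELAY_DB (hypothetical helper)."""
    url = env.get("LLM_RELAY_DB", "").strip()
    if not url:
        # No configuration needed: SQLite is the documented default.
        return {"backend": "sqlite"}

    parts = urlsplit(url)
    if parts.scheme != "postgresql":
        # Only the postgresql:// form appears in the table above.
        raise ValueError(f"unexpected scheme in LLM_RELAY_DB: {parts.scheme!r}")

    # The password is deliberately left out of the summary.
    return {
        "backend": "postgresql",
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed PostgreSQL default port
        "database": parts.path.lstrip("/"),
        "user": parts.username,
    }
```

Calling `describe_backend({"LLM_RELAY_DB": "postgresql://user:pass@host/db"})` reports host `host`, database `db` and user `user`, which is a quick way to catch a malformed URL before running `llm-relay init`.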

4. Initialize

llm-relay init

Quick Start

One-command setup

llm-relay init              # Auto-detect CLIs, configure proxy, start server

CLI commands

llm-relay scan              # Session health check (7 detectors)
llm-relay doctor            # Configuration health check (7 checks)
llm-relay recover           # Extract session context for resumption
llm-relay serve             # Start proxy server + web dashboard
llm-relay top               # Live terminal monitor (btop-style)
llm-relay service install   # Windows: background service + auto-start (no console window)
llm-relay service stop      # Windows: stop background service
llm-relay service uninstall # Windows: remove service + cleanup

Web dashboard

# Native (Linux/macOS/Windows)
llm-relay serve --port 8080

Then open:

  • /dashboard/: CLI status, cost, delegation history, and the Turn Monitor (alive sessions only; append ?include_dead=1 to bypass the filter)
  • /display/: Turn counter with session cards for Claude Code (CC), Codex, and Gemini. The alive filter uses cc_pid with a TTY fallback for CC and fd-open checks for Codex/Gemini; Windows uses mtime plus process detection.
  • /history/: Session conversation history browser
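The paths above can be assembled into full URLs with the standard library. This is a convenience sketch only: `dashboard_urls` is a hypothetical helper, and the host `127.0.0.1` and the `http` scheme are assumptions about a local `llm-relay serve --port 8080` instance.

```python
from urllib.parse import urlencode, urlunsplit


def dashboard_urls(host="127.0.0.1", port=8080, include_dead=False):
    """Build URLs for the three documented pages (host and scheme are assumed)."""
    netloc = f"{host}:{port}"
    # ?include_dead=1 bypasses the alive-sessions filter on the dashboard.
    query = urlencode({"include_dead": 1}) if include_dead else ""
    return {
        "dashboard": urlunsplit(("http", netloc, "/dashboard/", query, "")),
        "display": urlunsplit(("http", netloc, "/display/", "", "")),
        "history": urlunsplit(("http", netloc, "/history/", "", "")),
    }


for name, url in dashboard_urls(include_dead=True).items():
    print(f"{name}: {url}")
```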

MCP server

llm-relay-mcp               # stdio transport, 8 tools
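Because the server speaks stdio, an MCP client launches it as a subprocess. Many clients read a JSON file with an `mcpServers` mapping for this purpose; the sketch below generates such an entry. The entry name `llm-relay` is a hypothetical label, and the exact file location and schema depend on your client, so treat this as a starting point and check the client's documentation.

```python
import json


def mcp_server_entry(name="llm-relay", command="llm-relay-mcp"):
    """Build an mcpServers-style entry that starts the stdio server."""
    # The client spawns `command` and talks to it over stdin/stdout.
    return {"mcpServers": {name: {"command": command, "args": []}}}


print(json.dumps(mcp_server_entry(), indent=2))
```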

API proxy for Claude Code

# Route Claude Code through the proxy
llm-relay connect   # Auto-configures Claude Code proxy

Agent-driven setup

If you would rather have your existing coding agent (Claude Code, Codex, Gemini) run the install for you, point it at docs/AGENT_SETUP.md. It is a structured playbook the agent follows step by step, using llm-relay env-fingerprint and llm-relay verify to probe and check each step without scraping output.

llm-relay env-fingerprint --format json        # state snapshot
llm-relay verify install --format json         # is the package usable?
llm-relay verify config --format json          # is local state set up?
llm-relay verify integration --cli claude-code # is the CLI wired?
llm-relay verify all                            # everything at once

Exit code is 0 on pass/warn, 1 on fail.
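Since the exit code alone distinguishes pass/warn from fail, a script can gate on it without parsing any output. The following is a minimal sketch of that idea: `run_check` and `run_checks` are hypothetical helpers, and the JSON payload is ignored because its schema is not described here.

```python
import subprocess
import sys


def run_check(argv):
    """Run one command; True means exit code 0 (pass or warn)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not on PATH, so the check cannot pass.
        return False
    return proc.returncode == 0


def run_checks(commands):
    """Run checks in order and stop at the first failure."""
    results = []
    for argv in commands:
        passed = run_check(argv)
        results.append((" ".join(argv), passed))
        if not passed:
            break
    return results


# The documented verify steps, in the order listed above.
VERIFY_STEPS = [
    ["llm-relay", "verify", "install", "--format", "json"],
    ["llm-relay", "verify", "config", "--format", "json"],
    ["llm-relay", "verify", "integration", "--cli", "claude-code"],
]
```

Calling `run_checks(VERIFY_STEPS)` returns a list of `(command, passed)` pairs that ends at the first failing step. Note that a warning also yields `True`, because warn and pass share exit code 0.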

CLI Status

CLI          | Status
Claude Code  | Fully supported
OpenAI Codex | Fully supported
Gemini CLI   | Display supported; oauth-personal has a known server-side 403 bug (#25425)

Platform Support

Platform | Mode   | Notes
Linux    | Native | Full feature set, systemd recommended
macOS    | Native | Full feature set
Windows  | Native | llm-relay service install for background daemon (no console window)

Requirements

  • Python >= 3.9
  • MCP tools require Python >= 3.10
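The two version floors can be checked up front. This is an illustrative sketch rather than anything shipped with llm-relay; `supported_features` is a hypothetical helper that only encodes the two requirements listed above.

```python
import sys

MIN_CORE = (3, 9)   # base package requirement
MIN_MCP = (3, 10)   # MCP tools requirement


def supported_features(version=None):
    """Map a (major, minor) Python version to what llm-relay can use."""
    version = tuple(version or sys.version_info[:2])
    return {"core": version >= MIN_CORE, "mcp": version >= MIN_MCP}


print(supported_features())
```

On Python 3.9 this reports that the core package is usable but the MCP tools are not, which matches the reason the mcp extra is marked "Python 3.10+" in the install step.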

License

MIT

Ecosystem

Part of the QuartzUnit open-source ecosystem.

Download files

Source distribution: llm_relay-0.9.4.tar.gz (265.4 kB)
Built distribution: llm_relay-0.9.4-py3-none-any.whl (199.4 kB, Python 3)

File details

Details for the file llm_relay-0.9.4.tar.gz.

File metadata

  • Size: 265.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Algorithm   | Hash digest
SHA256      | 51e2e0846c3389f66ed66b1676adddcedc69ddec1e22d247a1cc1417512492a0
MD5         | 09bf8c6655af47605e3a3b08d14d7979
BLAKE2b-256 | 05d72fff6b6c0c878ebdb9933107f40f0d188cff5ff6e8f0a6183a752a34f1cc

Provenance

Publisher: publish.yml on ArkNill/llm-relay

File details

Details for the file llm_relay-0.9.4-py3-none-any.whl.

File metadata

  • Size: 199.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Algorithm   | Hash digest
SHA256      | 9dc721bdf97edfe4aacd4911ac2e875ae978f6c0141a836afc59e55dfe1b5b81
MD5         | e5a3d3de91768260a160c78a609c5fc3
BLAKE2b-256 | 8605b73d8abc5db9f7054cb82f82c9e76f543ff684fc37dd00e9526e80f16952

Provenance

Publisher: publish.yml on ArkNill/llm-relay
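The published SHA256 digests can be used to check a downloaded file before installing it. The sketch below uses only the standard library. The helper names `sha256_of` and `matches` are hypothetical, and the local file path is an assumption about where the download was saved.

```python
import hashlib
import hmac

# SHA256 digests as published for the 0.9.4 release files.
EXPECTED_SHA256 = {
    "llm_relay-0.9.4.tar.gz":
        "51e2e0846c3389f66ed66b1676adddcedc69ddec1e22d247a1cc1417512492a0",
    "llm_relay-0.9.4-py3-none-any.whl":
        "9dc721bdf97edfe4aacd4911ac2e875ae978f6c0141a836afc59e55dfe1b5b81",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

With the wheel saved in the current directory, `matches("llm_relay-0.9.4-py3-none-any.whl", EXPECTED_SHA256["llm_relay-0.9.4-py3-none-any.whl"])` should return `True` for an intact download.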
