Unified LLM usage management — API proxy, session diagnostics, multi-CLI orchestration.

Reason this release was yanked:

Superseded by v0.9.1. Use: pip install llm-relay

Project description

llm-relay


Why

This project started from a need to escape deep vendor lock-in with a single AI coding tool. After investigating hidden behaviors in Claude Code — silent token inflation, false rate limits, context stripping, and opaque feature flags — it became clear that relying on one vendor's black box was a risk. llm-relay was built to take back visibility and control: monitor what's actually happening, diagnose problems independently, and orchestrate across multiple CLI tools (Claude Code, Codex, Gemini) so no single provider becomes a single point of failure.

Features

  • Proxy: Transparent API proxy with cache/token monitoring and 12-strategy pruning
  • Detect: 7 detectors (orphan, stuck, synthetic, bloat, cache, resume, microcompact)
  • Recover: Session recovery and doctor (7 health checks)
  • Guard: 4-tier threshold daemon with dual-zone classification
  • Cost: Per-1% cost calculation and rate-limit header analysis
  • Orch: Multi-CLI orchestration (Claude Code, Codex CLI, Gemini CLI)
  • Display: Multi-CLI session monitor with context composition pie chart, connection type badges (SSH/tmux/tailscale/mosh), and provider liveness detection
  • History: Proxy-level conversation capture with delta/full storage, compaction detection, and web replay viewer
  • Composition: Real-time context window analysis — classifies content into 6 categories (user/assistant/tool_use/tool_result/thinking/system) with SNR metrics and duplicate read tracking
  • TUI: llm-relay top — btop-style terminal monitor with Rich Live (works over SSH, no browser needed)
  • i18n: Browser locale detection with en/ko support; server-side override via LLM_RELAY_LANG
  • MCP: 8 tools via stdio transport (cli_delegate, cli_status, cli_probe, orch_delegate, orch_history, relay_stats, session_turns, session_history)
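The Composition feature can be pictured with a small sketch. This is not llm-relay's implementation: the classification rules, the character-count size measure, and the function names below are all assumptions made for illustration, and the SNR metric is left out because the README does not define it. The input is assumed to look like Anthropic-style messages with typed content blocks.

```python
import json
from collections import Counter

# The six categories named in the feature list.
CATEGORIES = ("user", "assistant", "tool_use", "tool_result", "thinking", "system")


def classify_block(role, block):
    """Map one content block to a category (hypothetical rules)."""
    block_type = block.get("type", "text")
    if block_type in ("tool_use", "tool_result", "thinking"):
        return block_type
    # Plain text is attributed to whoever sent it.
    return "assistant" if role == "assistant" else "user"


def composition(system_prompt, messages):
    """Return each category's share of the total serialized size."""
    sizes = Counter({name: 0 for name in CATEGORIES})
    sizes["system"] += len(system_prompt)
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            content = [{"type": "text", "text": content}]
        for block in content:
            # Serialized length is a crude stand-in for a token count.
            sizes[classify_block(message["role"], block)] += len(json.dumps(block))
    total = sum(sizes.values()) or 1
    return {name: sizes[name] / total for name in CATEGORIES}
```

A real analyzer would count tokens rather than characters; the point here is only the shape of the per-category breakdown that a pie chart would display.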

Install

# CLI only (diagnostics, recovery, orchestration)
pip install llm-relay

# With Rich TUI (llm-relay top)
pip install llm-relay[cli]

# With proxy + web dashboard
pip install llm-relay[proxy]

# With MCP server (Python 3.10+)
pip install llm-relay[mcp]

# Everything
pip install llm-relay[all]

Quick Start

One-command setup (recommended)

pip install llm-relay[all]
llm-relay init

The init command:

  1. Detects installed CLIs (Claude Code, Codex, Gemini)
  2. Initializes the database (~/.llm-relay/usage.db)
  3. Configures Claude Code to route through the proxy (ANTHROPIC_BASE_URL)
  4. Registers the MCP server in Claude Code (8 tools)
  5. Starts the proxy server with history enabled
  6. Runs a health check to verify everything works
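Step 2 creates ~/.llm-relay/usage.db. Assuming that file is a SQLite database (the README does not name the engine or document a schema), a read-only peek at its tables might look like the sketch below. The helper name list_tables is hypothetical.

```python
import sqlite3
from pathlib import Path

# Path taken from step 2 above.
DEFAULT_DB = Path.home() / ".llm-relay" / "usage.db"


def list_tables(db_path=DEFAULT_DB):
    """Return the table names in the database, without assuming any schema."""
    # mode=ro opens the file read-only, so inspection cannot alter it.
    uri = f"{Path(db_path).resolve().as_uri()}?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

Calling list_tables() before init has run raises sqlite3.OperationalError, because read-only mode refuses to create a missing file.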

After init, open: http://localhost:8083/dashboard/

Options: --dry-run (preview without changes), --skip-server (configure only), --port 9090 (custom port).
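When init is driven from a script, the documented options can be assembled programmatically. The helper below is a hypothetical convenience, not part of llm-relay; it uses only the three flags listed above.

```python
import shlex


def build_init_command(dry_run=False, skip_server=False, port=None):
    """Assemble an `llm-relay init` argument list from the documented options."""
    command = ["llm-relay", "init"]
    if dry_run:
        command.append("--dry-run")
    if skip_server:
        command.append("--skip-server")
    if port is not None:
        command += ["--port", str(port)]
    return command


# Preview the command line before running it.
print(shlex.join(build_init_command(dry_run=True, port=9090)))
# prints: llm-relay init --dry-run --port 9090
```

The returned list can be passed to subprocess.run(command, check=True) once llm-relay is installed.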

Manual setup

# CLI diagnostics only (no server needed)
pip install llm-relay
llm-relay scan              # Session health check (7 detectors)
llm-relay doctor            # Configuration health check (7 checks)
llm-relay top               # Live terminal monitor (btop-style TUI)

# Web dashboard
pip install llm-relay[proxy]
llm-relay serve             # Starts proxy + dashboard on port 8083

# Then configure Claude Code to use the proxy:
# In ~/.claude/settings.json, add:
#   "env": { "ANTHROPIC_BASE_URL": "http://localhost:8083" }
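The settings.json edit above can also be applied from a script. This is a minimal sketch that assumes the file is plain JSON with an optional top-level "env" object, as the comment above suggests. The function name set_proxy_base_url is hypothetical.

```python
import json
from pathlib import Path


def set_proxy_base_url(settings_path, base_url="http://localhost:8083"):
    """Set env.ANTHROPIC_BASE_URL while leaving every other setting intact."""
    path = Path(settings_path).expanduser()
    # Start from the existing settings if the file is already there.
    settings = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    settings.setdefault("env", {})["ANTHROPIC_BASE_URL"] = base_url
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

For example, set_proxy_base_url("~/.claude/settings.json") points Claude Code at the default port. Back up the file first, since the sketch rewrites it in place.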

Web pages:

  • /dashboard/ — CLI status, cost, quota, error rate, cache hit rate, Turn Monitor
  • /display/ — Turn counter with context composition, connection type badges
  • /history/ — Session conversation replay with compaction timeline
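A quick way to confirm the three pages are being served is to request each one. The sketch below assumes the server listens on localhost and answers plain HTTP GET requests on these paths. The helper names are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Paths from the list above.
PAGES = ("/dashboard/", "/display/", "/history/")


def page_urls(port=8083, host="localhost"):
    """Build the full URL for each web page."""
    return [f"http://{host}:{port}{page}" for page in PAGES]


def check_pages(port=8083, timeout=2.0):
    """Map each page URL to its HTTP status, or None if it is unreachable."""
    results = {}
    for url in page_urls(port):
        try:
            with urlopen(url, timeout=timeout) as response:
                results[url] = response.status
        except HTTPError as error:
            # The server answered, but with an error status.
            results[url] = error.code
        except (URLError, OSError):
            results[url] = None
    return results
```

A value of None for every URL usually means the proxy is not running or is listening on a different port.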

MCP server

llm-relay-mcp               # stdio transport, 8 tools
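For readers new to the stdio transport: MCP clients exchange newline-delimited JSON-RPC 2.0 messages with the server over stdin and stdout. The sketch below builds the opening messages a client would send. It assumes llm-relay-mcp follows the standard MCP handshake; the client name and version are placeholders.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


def jsonrpc_notification(method):
    """Serialize a notification: same envelope, but no id and no reply expected."""
    return json.dumps({"jsonrpc": "2.0", "method": method}) + "\n"


# Standard MCP handshake, followed by a request for the tool list.
initialize = jsonrpc_request(1, "initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
})
initialized = jsonrpc_notification("notifications/initialized")
list_tools = jsonrpc_request(2, "tools/list")
```

In practice an MCP client library handles this framing. Writing the three lines to the server's stdin by hand is mainly useful for debugging.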

CLI Status

CLI            Status
Claude Code    Fully supported
OpenAI Codex   Fully supported
Gemini CLI     Display supported; oauth-personal has a known server-side 403 bug (#25425)

Requirements

  • Python >= 3.9
  • MCP tools require Python >= 3.10
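The two version floors can be checked before choosing an install command. This sketch assumes the cli and proxy extras share the base requirement of Python 3.9, which the README implies but does not state outright. The function name is hypothetical.

```python
import sys


def supported_extras(version=None):
    """List the optional extras usable on the given (major, minor) version."""
    version = tuple(version or sys.version_info[:2])
    if version < (3, 9):
        raise RuntimeError("llm-relay requires Python >= 3.9")
    extras = ["cli", "proxy"]
    # MCP tools need a newer interpreter than the rest of the package.
    if version >= (3, 10):
        extras.append("mcp")
    return extras


print("pip install 'llm-relay[{}]'".format(",".join(supported_extras())))
```

The quotes around the package spec keep shells such as zsh from treating the square brackets as a glob pattern.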

License

MIT

Ecosystem

Part of the QuartzUnit open-source ecosystem.

Download files

Source Distribution

llm_relay-0.7.1.tar.gz (173.3 kB)

Uploaded Source

Built Distribution

llm_relay-0.7.1-py3-none-any.whl (150.7 kB)

Uploaded Python 3

File details

Details for the file llm_relay-0.7.1.tar.gz.

File metadata

  • Download URL: llm_relay-0.7.1.tar.gz
  • Size: 173.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_relay-0.7.1.tar.gz
Algorithm     Hash digest
SHA256        3777522a0010e09423c9f7939dd3416fc1d44ce40a4e7660410af2baa62706c5
MD5           b005c9b21d9033a8a72406cf2db42f26
BLAKE2b-256   c1a2a51df0a3199d784ce98bbab91cfc47176fdbb548ba4e5cb6f360650e47d3
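A downloaded file can be checked against the published SHA256 digest with the standard library. This sketch assumes the source distribution was saved under its original file name in the current directory. The helper names are hypothetical.

```python
import hashlib

# SHA256 digest published above for llm_relay-0.7.1.tar.gz.
EXPECTED_SHA256 = "3777522a0010e09423c9f7939dd3416fc1d44ce40a4e7660410af2baa62706c5"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, verify("llm_relay-0.7.1.tar.gz") should return True for an intact download. pip can enforce the same check at install time through its --require-hashes mode.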

Provenance

The following attestation bundles were made for llm_relay-0.7.1.tar.gz:

Publisher: publish.yml on ArkNill/llm-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llm_relay-0.7.1-py3-none-any.whl.

File metadata

  • Download URL: llm_relay-0.7.1-py3-none-any.whl
  • Size: 150.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_relay-0.7.1-py3-none-any.whl
Algorithm     Hash digest
SHA256        5cc1fb2735cc0d960eb000c1c19664a34dd81aca519435e975c878b33ad4ffe2
MD5           c941b567cf2d0a9c2d97c79553548884
BLAKE2b-256   7b8f8370f92642219369a1abba78e7eae0298f2a7125e4e13624a218cb06e3d2

Provenance

The following attestation bundles were made for llm_relay-0.7.1-py3-none-any.whl:

Publisher: publish.yml on ArkNill/llm-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
