Unified LLM usage management — API proxy, session diagnostics, multi-CLI orchestration.

Reason this release was yanked:

Superseded by v0.9.1. Use: pip install llm-relay

Project description

llm-relay

Korean | llms.txt

Why

This project started from a need to escape deep vendor lock-in with a single AI coding tool. After investigating hidden behaviors in Claude Code — silent token inflation, false rate limits, context stripping, and opaque feature flags — it became clear that relying on one vendor's black box was a risk. llm-relay was built to take back visibility and control: monitor what's actually happening, diagnose problems independently, and orchestrate across multiple CLI tools (Claude Code, Codex, Gemini) so no single provider becomes a single point of failure.

Features

  • Proxy: Transparent API proxy with cache/token monitoring and 12-strategy pruning
  • Detect: 7 detectors (orphan, stuck, synthetic, bloat, cache, resume, microcompact)
  • Recover: Session recovery and doctor (7 health checks)
  • Guard: 4-tier threshold daemon with dual-zone classification
  • Cost: Per-1% cost calculation and rate-limit header analysis
  • Orch: Multi-CLI orchestration (Claude Code, Codex CLI, Gemini CLI)
  • Display: Multi-CLI session monitor with provider badges and liveness detection
  • MCP: 7 tools via stdio transport (cli_delegate, cli_status, cli_probe, orch_delegate, orch_history, relay_stats, session_turns)

Install

# CLI only (diagnostics, recovery, orchestration)
pip install llm-relay

# With proxy + web dashboard
# (the quotes keep shells such as zsh from treating the brackets as a glob)
pip install "llm-relay[proxy]"

# With MCP server (Python 3.10+)
pip install "llm-relay[mcp]"

# Everything
pip install "llm-relay[all]"

Quick Start

CLI diagnostics (no server needed)

llm-relay scan              # Session health check (7 detectors)
llm-relay doctor            # Configuration health check (7 checks)
llm-relay recover           # Extract session context for resumption
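The three commands above can also be driven from a script, for example in a scheduled health-check job. The following is a minimal sketch, assuming `llm-relay` is on `PATH`; the helper names `build_command` and `run_diagnostic` are hypothetical, and since the meaning of the exit codes is not documented here, the sketch only captures them rather than interpreting them.

```python
import shutil
import subprocess

# Subcommands taken from the Quick Start listing above.
DIAGNOSTIC_COMMANDS = ("scan", "doctor", "recover")


def build_command(subcommand):
    """Return the argv list for one llm-relay diagnostic subcommand."""
    if subcommand not in DIAGNOSTIC_COMMANDS:
        raise ValueError(f"unknown diagnostic subcommand: {subcommand!r}")
    return ["llm-relay", subcommand]


def run_diagnostic(subcommand):
    """Run one diagnostic and return (exit_code, stdout).

    Returns None when the llm-relay executable is not installed.
    """
    if shutil.which("llm-relay") is None:
        return None
    result = subprocess.run(
        build_command(subcommand),
        capture_output=True,
        text=True,
        check=False,  # do not raise; the caller decides what an exit code means
    )
    return result.returncode, result.stdout
```

Calling `run_diagnostic("doctor")` returns `None` on a machine without llm-relay, which lets a wrapper script skip the check instead of failing.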

Web dashboard

# Option 1: Direct
pip install "llm-relay[proxy]"
uvicorn llm_relay.proxy.proxy:app --host 0.0.0.0 --port 8083

# Option 2: Docker
cp .env.example .env        # Edit as needed
docker compose up -d

Then open:

  • /dashboard/: CLI status, cost, delegation history, and Turn Monitor (shows alive sessions only; append ?include_dead=1 to include dead sessions)
  • /display/: Turn counter with Claude Code (CC), Codex, and Gemini session cards (alive filter: CC via cc_pid with TTY fallback, Codex/Gemini via fd-open)
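The two pages can be addressed programmatically once the server is running. This is a small sketch that builds their URLs, assuming the host and port from the direct uvicorn example (localhost, 8083); a Docker deployment may expose a different port. The function names are hypothetical.

```python
from urllib.parse import urlencode, urlunsplit


def dashboard_url(host="localhost", port=8083, include_dead=False):
    """URL of the dashboard; include_dead=True adds ?include_dead=1."""
    query = urlencode({"include_dead": 1}) if include_dead else ""
    return urlunsplit(("http", f"{host}:{port}", "/dashboard/", query, ""))


def display_url(host="localhost", port=8083):
    """URL of the turn-counter display page."""
    return urlunsplit(("http", f"{host}:{port}", "/display/", "", ""))


print(dashboard_url())                    # alive sessions only
print(dashboard_url(include_dead=True))   # also list dead sessions
print(display_url())
```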

MCP server

llm-relay-mcp               # stdio transport, 7 tools
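Because the server speaks stdio, an MCP client has to be told which command to spawn. Many clients read this from a JSON file with an `mcpServers` map; that layout is an assumption about your client, not something the llm-relay documentation states, so check the client's own docs for the file location and schema. A minimal sketch that generates such an entry:

```python
import json


def mcp_client_config(server_name="llm-relay"):
    """Build a client-side entry that launches llm-relay-mcp over stdio.

    The "mcpServers" layout is a common client convention and is assumed
    here; server_name is an arbitrary label chosen for illustration.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "llm-relay-mcp",  # entry point named in this README
                "args": [],
            }
        }
    }


print(json.dumps(mcp_client_config(), indent=2))
```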

API proxy for Claude Code

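# Set in the shell that launches Claude Code.
# Use the port your proxy actually listens on (the direct uvicorn example above uses 8083).
export ANTHROPIC_BASE_URL=http://localhost:8080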
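When Claude Code is started from a script rather than an interactive shell, the same variable can be set on the child process only, leaving the parent environment untouched. A minimal sketch, assuming the proxy URL shown above; the helper name `proxied_env` is hypothetical.

```python
import os


def proxied_env(proxy_url="http://localhost:8080", base_env=None):
    """Return a copy of the environment with ANTHROPIC_BASE_URL set.

    base_env defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = proxy_url
    return env


# Example (not executed here): pass the result as env= when spawning the CLI,
# e.g. subprocess.run([...], env=proxied_env()).
```

Passing the environment explicitly keeps the proxy opt-in per process, which is handy when only some sessions should be routed through llm-relay.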

CLI Status

CLI             Status
Claude Code     Fully supported
OpenAI Codex    Fully supported
Gemini CLI      Display supported; oauth-personal has a known server-side 403 bug (#25425)

Requirements

  • Python >= 3.9
  • MCP tools require Python >= 3.10
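The two version floors above can be checked before installing. This is a minimal sketch; the function name `check_python` and the result keys are hypothetical, and it only encodes the two thresholds stated in this section.

```python
import sys


def check_python(version=None):
    """Report which llm-relay parts the given Python version can run.

    version is a (major, minor) tuple; defaults to the running interpreter.
    """
    source = sys.version_info if version is None else version
    major_minor = tuple(source[:2])
    return {
        "core": major_minor >= (3, 9),   # base package: Python >= 3.9
        "mcp": major_minor >= (3, 10),   # MCP tools: Python >= 3.10
    }


print(check_python())
```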

License

MIT

Ecosystem

Part of the QuartzUnit open-source ecosystem.

Download files

Source Distribution

llm_relay-0.5.0.tar.gz (148.1 kB view details)

Uploaded Source

Built Distribution

llm_relay-0.5.0-py3-none-any.whl (133.1 kB view details)

Uploaded Python 3

File details

Details for the file llm_relay-0.5.0.tar.gz.

File metadata

  • Download URL: llm_relay-0.5.0.tar.gz
  • Upload date:
  • Size: 148.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_relay-0.5.0.tar.gz
Algorithm      Hash digest
SHA256         4ebfd34d7577899187c9aa3b976e47373e88e2a5d3a12e5963f8f1cc03336569
MD5            21ab1fa9f5df764dc2fb952c218bdb8a
BLAKE2b-256    58a760ec9e0c4a86f938c3308bda86bc3d76c64ed16fa91c33341934999b3900
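A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the function names and the local file path in the usage comment are assumptions.

```python
import hashlib

# SHA256 digest published above for llm_relay-0.5.0.tar.gz.
EXPECTED_SHA256 = "4ebfd34d7577899187c9aa3b976e47373e88e2a5d3a12e5963f8f1cc03336569"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected.lower()


# Usage, assuming the archive was saved to the current directory:
#     verify("llm_relay-0.5.0.tar.gz")
```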

Provenance

The following attestation bundles were made for llm_relay-0.5.0.tar.gz:

Publisher: publish.yml on ArkNill/llm-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file llm_relay-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: llm_relay-0.5.0-py3-none-any.whl
  • Upload date:
  • Size: 133.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llm_relay-0.5.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         3c58636f48d450f5a4d33bda7c909f0b740fd586a284a4507eea0835086ecddb
MD5            c2916ad5935dbc72d3ef44952a42bec5
BLAKE2b-256    68d761909bab279c92f3448fe035724d88734de58cddfe7607fc1aa61c70e39e

Provenance

The following attestation bundles were made for llm_relay-0.5.0-py3-none-any.whl:

Publisher: publish.yml on ArkNill/llm-relay
