OpenAI-compatible API server that routes through Claude Code

claude-relay

License: MIT. Requires Python 3.10+.

Drop-in OpenAI API server that routes through Claude Code.

Why

You have tools that speak the OpenAI API. You have Claude Code with its tools, MCP servers, and agentic capabilities. claude-relay bridges the two — point any OpenAI-compatible client at it and every request flows through claude -p under the hood.

  • Use Claude Code from any OpenAI client — Cursor, Continue, aider, custom scripts
  • Keep Claude Code's superpowers — tool use, MCP servers, file access, shell execution
  • Zero config — if claude works on your machine, so does this
  • Real token usage — reports actual token counts from Claude (not zeros)
  • Token-level streaming — uses --include-partial-messages for true real-time deltas

Install

# With uv (recommended)
uvx claude-relay serve

# Or install into the active environment
uv pip install claude-relay
claude-relay serve

# Or from source
git clone https://github.com/npow/claude-relay.git
cd claude-relay
uv sync
uv run claude-relay serve

Quick start

claude-relay serve
# Server starts on http://localhost:8082

Point any OpenAI-compatible client at it:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8082/v1", api_key="unused")

# Streaming
for chunk in client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")

# Non-streaming
resp = client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

Or with curl:

curl http://localhost:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","messages":[{"role":"user","content":"Hello"}],"stream":true}'
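If you consume the stream without the openai package, each event arrives as a server-sent `data:` line. The sketch below parses such lines with the standard library only. It assumes the relay follows the usual OpenAI conventions (JSON chunks with `choices[].delta.content`, terminated by `data: [DONE]`); the helper name `iter_sse_text` and the sample lines are illustrative, not taken from the project.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines (hypothetical helper)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines in the assumed format, not captured output.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))  # prints "Hello"
```

In a real client you would feed this function the decoded lines of the HTTP response body instead of `sample`.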

Configuration

claude-relay serve [--host HOST] [--port PORT]
Flag      Default    Description
--host    0.0.0.0    Bind address
--port    8082       Bind port
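Because the default bind address is 0.0.0.0, the server listens on all interfaces. A minimal sketch of launching it from Python on loopback only might look like this; `serve_command` and `start_server` are hypothetical helpers, and only the `--host` and `--port` flags listed above are used.

```python
import subprocess
from typing import List


def serve_command(host: str = "0.0.0.0", port: int = 8082) -> List[str]:
    """Build the argv for `claude-relay serve` using the documented flags."""
    return ["claude-relay", "serve", "--host", host, "--port", str(port)]


def start_server(host: str = "127.0.0.1", port: int = 8082) -> subprocess.Popen:
    """Start the relay as a child process; the caller must terminate it."""
    return subprocess.Popen(serve_command(host, port))


cmd = serve_command(host="127.0.0.1", port=9000)
print(" ".join(cmd))  # prints "claude-relay serve --host 127.0.0.1 --port 9000"
```

Binding to 127.0.0.1 keeps the relay reachable only from the local machine, which matters because every request can trigger Claude Code's file and shell tools.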

API

Endpoint                Method    Description
/v1/chat/completions    POST      Chat completions (OpenAI-compatible)
/v1/models              GET       List available models
/health                 GET       Server and CLI status

All endpoints also work without the /v1 prefix. CORS is enabled for all origins.
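The endpoint table can be exercised with the standard library alone. The sketch below builds URLs with or without the /v1 prefix and fetches JSON from a GET endpoint. The helper names are hypothetical, the response bodies of /health and /v1/models are not documented here, so the code only assumes they return JSON, and /health is always left unprefixed because that is how it is listed.

```python
import json
import urllib.request
from typing import Any


def endpoint_url(base: str, path: str, v1_prefix: bool = True) -> str:
    """Join base URL and endpoint path, optionally adding the /v1 prefix."""
    base = base.rstrip("/")
    path = "/" + path.strip("/")
    if v1_prefix and path != "/health":
        return f"{base}/v1{path}"
    return f"{base}{path}"


def get_json(url: str, timeout: float = 5.0) -> Any:
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


base = "http://localhost:8082"
print(endpoint_url(base, "models"))                             # http://localhost:8082/v1/models
print(endpoint_url(base, "chat/completions", v1_prefix=False))  # http://localhost:8082/chat/completions
print(endpoint_url(base, "health"))                             # http://localhost:8082/health

# With the server running, a status check would be:
#   status = get_json(endpoint_url(base, "health"))
```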

Supported features

Feature                     Status
Streaming (SSE)             Yes
System messages             Yes (via --system-prompt)
Multi-turn conversations    Yes
Multimodal (text parts)     Yes
Model selection             Yes
Token usage reporting       Yes
CORS                        Yes

Models

Pass any model name — it goes directly to claude --model:

Model     Description
opus      Most capable
sonnet    Balanced (default)
haiku     Fastest

Limitations

  • temperature, max_tokens, top_p, and other sampling parameters are ignored (Claude Code CLI does not expose them)
  • No tool/function calling passthrough (Claude Code uses its own tools internally, but they aren't exposed via the OpenAI tool-calling protocol)
  • Each request spawns a new claude process (~2-3s overhead on top of API latency)
  • No image/audio content forwarding — only text parts of multimodal messages are extracted
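To illustrate the last limitation, here is a rough sketch of how text could be pulled out of an OpenAI-style multimodal `content` value while non-text parts are dropped. The function name `extract_text` and the choice to join parts with newlines are assumptions for illustration; the relay's actual extraction logic may differ.

```python
from typing import Any, List, Union


def extract_text(content: Union[str, List[Any], None]) -> str:
    """Return only the text of a message's content (hypothetical helper)."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content  # plain string content passes through unchanged
    texts = []
    for part in content:
        # Keep {"type": "text"} parts; image, audio and other parts are skipped.
        if isinstance(part, dict) and part.get("type") == "text":
            texts.append(part.get("text", ""))
    return "\n".join(texts)  # newline join is an assumption


message_content = [
    {"type": "text", "text": "Describe this"},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    {"type": "text", "text": "briefly"},
]
print(repr(extract_text(message_content)))  # prints 'Describe this\nbriefly'
```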

How it works

OpenAI client  →  claude-relay  →  claude -p  →  Anthropic API
  (SSE)            (FastAPI)      (stream-json)

Each request spawns a claude -p process with --output-format stream-json --include-partial-messages. The proxy translates between the OpenAI wire format and Claude Code's streaming JSON protocol. Requests are stateless — no conversation history bleeds between calls.
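The translation step described above can be sketched roughly as follows. This is not the project's code: the helper names are hypothetical, the argv uses only flags mentioned in this README (the real invocation may pass more, or send the prompt differently), and the shape of Claude Code's partial-message events (`stream_event` wrapping a `content_block_delta` with a `text_delta`) is an assumption.

```python
import json
from typing import List, Optional


def build_claude_argv(prompt: str, model: str = "sonnet",
                      system_prompt: Optional[str] = None) -> List[str]:
    """Assemble a `claude -p` command line from flags named in this README."""
    argv = ["claude", "-p", "--output-format", "stream-json",
            "--include-partial-messages", "--model", model]
    if system_prompt:
        argv += ["--system-prompt", system_prompt]
    argv.append(prompt)  # passing the prompt as the last argument is an assumption
    return argv


def text_delta(event: dict) -> Optional[str]:
    """Extract a text delta from one stream-json event (assumed event shape)."""
    if event.get("type") != "stream_event":
        return None
    inner = event.get("event", {})
    if inner.get("type") != "content_block_delta":
        return None
    delta = inner.get("delta", {})
    return delta.get("text") if delta.get("type") == "text_delta" else None


def sse_frame(text: str, model: str, chunk_id: str, created: int) -> str:
    """Wrap a text delta in an OpenAI chat.completion.chunk SSE frame."""
    chunk = {
        "id": chunk_id,
        "object": "chat.completion.chunk",
        "created": created,
        "model": model,
        "choices": [{"index": 0, "delta": {"content": text}, "finish_reason": None}],
    }
    return f"data: {json.dumps(chunk)}\n\n"


# Hand-written event in the assumed format, not captured CLI output.
event = {"type": "stream_event",
         "event": {"type": "content_block_delta",
                   "delta": {"type": "text_delta", "text": "Hi"}}}
delta = text_delta(event)
if delta is not None:
    print(sse_frame(delta, "sonnet", "chatcmpl-demo", 0), end="")
```

A proxy built this way would read the child process's stdout line by line, parse each line as JSON, and emit one frame per text delta.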

Development

uv sync
uv run pytest tests/ -v

License

MIT
