OpenAI- and Anthropic-compatible API server that routes through agent CLIs
agent-relay
Drop-in OpenAI and Anthropic API server that routes through agent CLIs (currently Claude Code).
Compatibility note: claude-relay remains available as a compatibility package/command alias.
Why
You have tools that speak the OpenAI or Anthropic API. You have Claude Code with its tools, MCP servers, and agentic capabilities. agent-relay bridges the two — point any compatible client at it and every request flows through claude -p under the hood.
- Use Claude Code from any OpenAI or Anthropic client: Cursor, Continue, aider, LangChain, custom scripts
- Keep Claude Code's superpowers: tool use, MCP servers, file access, shell execution
- Zero config: if claude works on your machine, so does this
- Real token usage: reports actual token counts from Claude (not zeros)
- Token-level streaming: uses --include-partial-messages for true real-time deltas
Install
# With uv (recommended)
uvx agent-relay serve
# Or install globally
uv tool install agentrelay-cli
agent-relay serve
# Or from source
git clone https://github.com/npow/claude-relay.git
cd claude-relay
uv sync
uv run agent-relay serve
Quick start
agent-relay serve
# Server starts on http://localhost:18082
Run as background service (macOS)
# Install and auto-start on login
agent-relay service install
The installer will offer to add these to your ~/.zshrc (or ~/.bashrc) so every SDK and agent picks up the relay automatically:
export ANTHROPIC_BASE_URL="http://127.0.0.1:18082"
export OPENAI_BASE_URL="http://127.0.0.1:18082/v1"
# Check status
agent-relay service status
# Update
uv tool upgrade agentrelay-cli
agent-relay service restart
# Stop and remove
agent-relay service uninstall
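For scripts that should work whether or not those exports are present, a small helper can resolve the relay address. This is a minimal sketch: the function name `relay_base_urls` and the fallback constant are illustrative assumptions, not part of agent-relay.

```python
import os

# Default local address and port taken from this README.
DEFAULT_RELAY = "http://127.0.0.1:18082"


def relay_base_urls(env=None):
    """Return (anthropic_base_url, openai_base_url) for the relay.

    Prefers the variables the installer offers to export and falls back
    to the default local address when they are missing.
    """
    env = os.environ if env is None else env
    anthropic_url = env.get("ANTHROPIC_BASE_URL", DEFAULT_RELAY).rstrip("/")
    openai_url = env.get("OPENAI_BASE_URL", DEFAULT_RELAY + "/v1").rstrip("/")
    return anthropic_url, openai_url


print(relay_base_urls({}))
```

Note the asymmetry: the OpenAI base URL carries the /v1 suffix, while the Anthropic one does not.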
Point any OpenAI-compatible client at it:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:18082/v1", api_key="unused")

# Streaming
for chunk in client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")

# Non-streaming
resp = client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
Anthropic SDK
import anthropic

# Pass base_url explicitly, or export ANTHROPIC_BASE_URL and omit it;
# the SDK reads that variable automatically:
#   export ANTHROPIC_BASE_URL=http://localhost:18082
# The SDK expects some API key to be set (argument or ANTHROPIC_API_KEY);
# a placeholder is assumed to suffice here, as in the OpenAI example.
client = anthropic.Anthropic(base_url="http://localhost:18082", api_key="unused")

# Streaming
with client.messages.stream(
    model="sonnet",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")

# Non-streaming
resp = client.messages.create(
    model="sonnet",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.content[0].text)
LangChain
from langchain_anthropic import ChatAnthropic
# export ANTHROPIC_BASE_URL=http://localhost:18082
llm = ChatAnthropic(model="sonnet")
print(llm.invoke("Hello!").content)
curl
# OpenAI format
curl http://localhost:18082/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"sonnet","messages":[{"role":"user","content":"Hello"}],"stream":true}'
# OpenAI Responses format
curl http://localhost:18082/v1/responses \
-H "Content-Type: application/json" \
-d '{"model":"sonnet","input":"Hello"}'
# Anthropic format
curl http://localhost:18082/v1/messages \
-H "Content-Type: application/json" \
-d '{"model":"sonnet","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'
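The same calls can be made from Python with only the standard library, which is handy where neither SDK is installed. The sketch below mirrors the Anthropic-format curl command; `build_json_post` and `send` are hypothetical helper names, and the network call is left commented out because it needs a running relay.

```python
import json
import urllib.request


def build_json_post(base_url, path, payload):
    """Build (but do not send) a JSON POST request for the relay."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and decode the JSON body (needs a running relay)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


anthropic_request = build_json_post(
    "http://localhost:18082",
    "/v1/messages",
    {
        "model": "sonnet",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
# reply = send(anthropic_request)       # uncomment with the relay running
# print(reply["content"][0]["text"])
```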
Configuration
agent-relay serve [--host HOST] [--port PORT]
| Flag | Default | Description |
|---|---|---|
| --host | 0.0.0.0 | Bind address |
| --port | 18082 | Bind port |
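The default bind address 0.0.0.0 listens on every network interface. When the relay only needs to serve local clients, passing --host 127.0.0.1 keeps it off the network. A minimal sketch for launching it that way from Python follows; the helper names are illustrative, and the loopback default is a choice made for this example, not the CLI's own default.

```python
import subprocess


def serve_command(host="127.0.0.1", port=18082):
    """Return the argv list for `agent-relay serve` with explicit flags."""
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return ["agent-relay", "serve", "--host", host, "--port", str(port)]


def start_relay(host="127.0.0.1", port=18082):
    """Launch the relay as a child process (agent-relay must be on PATH)."""
    return subprocess.Popen(serve_command(host, port))


print(" ".join(serve_command(port=9000)))
```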
API
| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Chat completions (OpenAI-compatible) |
| /v1/responses | POST | Responses API (OpenAI-compatible) |
| /v1/messages | POST | Messages (Anthropic-compatible) |
| /v1/models | GET | List available models |
| /health | GET | Server and CLI status |
All endpoints also work without the /v1 prefix. CORS is enabled for all origins.
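The optional prefix matters because the two SDKs build URLs differently: the OpenAI SDK appends paths such as /chat/completions to its base URL (hence the /v1 in the examples above), while the Anthropic SDK appends /v1/messages itself. A small sketch of the resulting URLs, with `ENDPOINTS` and `endpoint_url` as hypothetical names:

```python
# Paths copied from the endpoint table in this README.
ENDPOINTS = {
    "chat": "/v1/chat/completions",
    "responses": "/v1/responses",
    "messages": "/v1/messages",
    "models": "/v1/models",
    "health": "/health",
}


def endpoint_url(base_url, name, use_v1_prefix=True):
    """Return the full URL for a named endpoint, with or without /v1."""
    path = ENDPOINTS[name]
    if not use_v1_prefix and path.startswith("/v1/"):
        path = path[len("/v1"):]  # "/v1/models" becomes "/models"
    return base_url.rstrip("/") + path


for name in ENDPOINTS:
    print(endpoint_url("http://localhost:18082", name))
```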
Supported features
| Feature | Status |
|---|---|
| Streaming (SSE) | Yes |
| System messages | Yes (via --system-prompt) |
| Multi-turn conversations | Yes |
| Multimodal (text parts) | Yes |
| Model selection | Yes |
| Token usage reporting | Yes |
| CORS | Yes |
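Clients that do not use an SDK can consume the streamed output by reading the SSE lines directly. The sketch below assumes the relay follows the usual OpenAI chat completion chunk format, including the `data: [DONE]` terminator; the sample lines are hand-written for illustration, not captured from the relay.

```python
import json


def iter_text_deltas(lines):
    """Yield text fragments from OpenAI-style chat completion SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_text_deltas(sample)))  # prints "Hello!"
```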
Models
Pass any model name — it goes directly to claude --model:
| Model | Description |
|---|---|
| opus | Most capable |
| sonnet | Balanced (default) |
| haiku | Fastest |
Limitations
- temperature, max_tokens, top_p, and other sampling parameters are ignored (Claude Code CLI does not expose them)
- No tool/function calling passthrough (Claude Code uses its own tools internally, but they aren't exposed via the OpenAI tool-calling protocol)
- Each request spawns a new claude process (~2-3s overhead on top of API latency)
- No image/audio content forwarding: only text parts of multimodal messages are extracted
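To see what these limitations mean for a given request, the following sketch reduces an OpenAI-style payload to the parts that can still influence the reply. It illustrates the documented behavior and is not the relay's actual code: the helper names are hypothetical, only the three parameters named above are dropped, and joining text parts with newlines is an assumption.

```python
# Sampling parameters this README names as ignored.
IGNORED_PARAMS = ("temperature", "max_tokens", "top_p")


def text_only(content):
    """Collapse OpenAI-style message content to plain text.

    Strings pass through unchanged; for a list of parts, only the
    {"type": "text"} entries are kept (images and audio are dropped).
    """
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    return "\n".join(
        part.get("text", "") for part in content if part.get("type") == "text"
    )


def effective_request(payload):
    """Return a copy of payload without ignored params or non-text parts."""
    kept = {k: v for k, v in payload.items() if k not in IGNORED_PARAMS}
    kept["messages"] = [
        {**message, "content": text_only(message.get("content"))}
        for message in payload.get("messages", [])
    ]
    return kept


request = {
    "model": "sonnet",
    "temperature": 0.2,
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
        ],
    }],
}
print(effective_request(request))
```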
How it works
OpenAI client    ─┐
                  ├→ claude-relay → claude -p → Anthropic API
Anthropic client ─┘   (FastAPI)    (stream-json)
Each request spawns a claude -p process with --output-format stream-json --include-partial-messages. The proxy translates between the OpenAI or Anthropic wire format and Claude Code's streaming JSON protocol. Requests are stateless — no conversation history bleeds between calls.
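The translation step can be pictured roughly as follows. This is a sketch under stated assumptions, not the project's implementation: `claude_command` and `to_openai_chunk` are hypothetical names, the command uses only the flags mentioned in this README (the real CLI may need others), and the shape of the stream-json event shown in the docstring is assumed rather than taken from the source.

```python
import json
import time


def claude_command(prompt, model="sonnet", system_prompt=None):
    """Build an argv list for one stateless `claude -p` invocation."""
    cmd = [
        "claude", "-p", prompt,
        "--model", model,
        "--output-format", "stream-json",
        "--include-partial-messages",
    ]
    if system_prompt:
        cmd += ["--system-prompt", system_prompt]
    return cmd


def to_openai_chunk(line, model="sonnet"):
    """Translate one stream-json line into an OpenAI-style chunk, or None.

    Assumed input shape for partial messages:
    {"type": "stream_event",
     "event": {"type": "content_block_delta",
               "delta": {"type": "text_delta", "text": "..."}}}
    """
    event = json.loads(line)
    if event.get("type") != "stream_event":
        return None
    inner = event.get("event", {})
    delta = inner.get("delta", {})
    if inner.get("type") != "content_block_delta":
        return None
    if delta.get("type") != "text_delta":
        return None
    return {
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": delta["text"]}, "finish_reason": None}
        ],
    }


sample_line = json.dumps({
    "type": "stream_event",
    "event": {
        "type": "content_block_delta",
        "index": 0,
        "delta": {"type": "text_delta", "text": "Hi"},
    },
})
print(claude_command("Hello", system_prompt="Be brief"))
print(to_openai_chunk(sample_line)["choices"][0]["delta"])
```

Because every request builds a fresh command, a multi-turn conversation has to be passed in full on each call; nothing is carried over from the previous process.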
Development
uv sync
uv run pytest tests/ -v
License