
OpenAI- and Anthropic-compatible API server that routes through agent CLIs

Project description

agent-relay

License: MIT · Python 3.10+

Drop-in OpenAI and Anthropic API server that routes through agent CLIs (currently Claude Code).

Compatibility note: claude-relay remains available as an alias for both the package and the command.

Why

You have tools that speak the OpenAI or Anthropic API. You have Claude Code with its tools, MCP servers, and agentic capabilities. agent-relay bridges the two — point any compatible client at it and every request flows through claude -p under the hood.

  • Use Claude Code from any OpenAI or Anthropic client — Cursor, Continue, aider, LangChain, custom scripts
  • Keep Claude Code's superpowers — tool use, MCP servers, file access, shell execution
  • Zero config — if claude works on your machine, so does this
  • Real token usage — reports actual token counts from Claude (not zeros)
  • Token-level streaming — uses --include-partial-messages for true real-time deltas

Install

# With uv (recommended)
uvx agent-relay serve

# Or install globally
uv tool install agentrelay-cli
agent-relay serve

# Or from source
git clone https://github.com/npow/claude-relay.git
cd claude-relay
uv sync
uv run agent-relay serve

Quick start

agent-relay serve
# Server starts on http://localhost:18082

Run as background service (macOS)

# Install and auto-start on login
agent-relay service install

The installer will offer to add these to your ~/.zshrc (or ~/.bashrc) so every SDK and agent picks up the relay automatically:

export ANTHROPIC_BASE_URL="http://127.0.0.1:18082"
export OPENAI_BASE_URL="http://127.0.0.1:18082/v1"

# Check status
agent-relay service status

# Update
uv tool upgrade agentrelay-cli
agent-relay service restart

# Stop and remove
agent-relay service uninstall

Point any OpenAI-compatible client at it:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:18082/v1", api_key="unused")

# Streaming
for chunk in client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")

# Non-streaming
resp = client.chat.completions.create(
    model="sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

Anthropic SDK

import anthropic

# The SDK reads ANTHROPIC_BASE_URL from the environment automatically:
#   export ANTHROPIC_BASE_URL=http://localhost:18082
# or pass the base URL explicitly:
client = anthropic.Anthropic(base_url="http://localhost:18082", api_key="unused")

# Streaming
with client.messages.stream(
    model="sonnet",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="")

# Non-streaming
resp = client.messages.create(
    model="sonnet",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.content[0].text)

LangChain

from langchain_anthropic import ChatAnthropic

# export ANTHROPIC_BASE_URL=http://localhost:18082
llm = ChatAnthropic(model="sonnet", anthropic_api_key="unused")
print(llm.invoke("Hello!").content)

curl

# OpenAI format
curl http://localhost:18082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","messages":[{"role":"user","content":"Hello"}],"stream":true}'

# OpenAI Responses format
curl http://localhost:18082/v1/responses \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","input":"Hello"}'

# Anthropic format
curl http://localhost:18082/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","max_tokens":1024,"messages":[{"role":"user","content":"Hello"}]}'

Configuration

agent-relay serve [--host HOST] [--port PORT]
| Flag | Default | Description |
|--------|---------|--------------|
| --host | 0.0.0.0 | Bind address |
| --port | 18082   | Bind port    |

API

| Endpoint | Method | Description |
|----------|--------|-------------|
| /v1/chat/completions | POST | Chat completions (OpenAI-compatible) |
| /v1/responses        | POST | Responses API (OpenAI-compatible)    |
| /v1/messages         | POST | Messages (Anthropic-compatible)      |
| /v1/models           | GET  | List available models                |
| /health              | GET  | Server and CLI status                |

All endpoints also work without the /v1 prefix. CORS is enabled for all origins.
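The /v1-optional routing amounts to treating the prefixed and unprefixed paths as the same route. A minimal sketch of that idea (a hypothetical helper for illustration, not the server's actual implementation):

```python
def normalize_path(path: str) -> str:
    """Map an unprefixed API path onto its canonical /v1 form,
    so "/chat/completions" and "/v1/chat/completions" resolve to
    the same endpoint. Illustrative only."""
    if path == "/v1" or path.startswith("/v1/"):
        return path
    return "/v1" + path

print(normalize_path("/chat/completions"))  # /v1/chat/completions
print(normalize_path("/v1/messages"))       # /v1/messages
```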

Supported features

| Feature | Status |
|---------|--------|
| Streaming (SSE)          | Yes |
| System messages          | Yes (via --system-prompt) |
| Multi-turn conversations | Yes |
| Multimodal (text parts)  | Yes |
| Model selection          | Yes |
| Token usage reporting    | Yes |
| CORS                     | Yes |

Models

Pass any model name — it goes directly to claude --model:

| Model | Description |
|--------|--------------------|
| opus   | Most capable       |
| sonnet | Balanced (default) |
| haiku  | Fastest            |

Limitations

  • temperature, max_tokens, top_p, and other sampling parameters are ignored (Claude Code CLI does not expose them)
  • No tool/function calling passthrough (Claude Code uses its own tools internally, but they aren't exposed via the OpenAI tool-calling protocol)
  • Each request spawns a new claude process (~2-3s overhead on top of API latency)
  • No image/audio content forwarding — only text parts of multimodal messages are extracted

How it works

OpenAI client     ─┐
                    ├→  agent-relay  →  claude -p  →  Anthropic API
Anthropic client  ─┘     (FastAPI)     (stream-json)

Each request spawns a claude -p process with --output-format stream-json --include-partial-messages. The proxy translates between the OpenAI or Anthropic wire format and Claude Code's streaming JSON protocol. Requests are stateless — no conversation history bleeds between calls.
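The translation step can be pictured roughly like this, assuming the CLI's stream-json lines wrap Anthropic-style content_block_delta events (the exact field names here are an assumption, not confirmed by this project's source):

```python
import json

def text_deltas(lines):
    """Yield text fragments from claude -p --output-format stream-json
    output, assuming partial-message lines carry Anthropic-style
    content_block_delta events. Field names are illustrative."""
    for line in lines:
        event = json.loads(line)
        inner = event.get("event", {})
        if inner.get("type") == "content_block_delta":
            delta = inner.get("delta", {})
            if delta.get("type") == "text_delta":
                yield delta.get("text", "")

sample = [
    '{"type":"stream_event","event":{"type":"content_block_delta","delta":{"type":"text_delta","text":"Hel"}}}',
    '{"type":"stream_event","event":{"type":"content_block_delta","delta":{"type":"text_delta","text":"lo"}}}',
]
print("".join(text_deltas(sample)))  # Hello
```

The proxy would re-emit each fragment in the caller's wire format, as an OpenAI chat-completion chunk or an Anthropic SSE event.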

Development

uv sync
uv run pytest tests/ -v

License

MIT

Project details


Download files

Download the file for your platform.

Source Distribution

agentrelay_cli-0.5.3.tar.gz (19.2 kB)

Uploaded Source

Built Distribution


agentrelay_cli-0.5.3-py3-none-any.whl (14.2 kB)

Uploaded Python 3

File details

Details for the file agentrelay_cli-0.5.3.tar.gz.

File metadata

  • Download URL: agentrelay_cli-0.5.3.tar.gz
  • Upload date:
  • Size: 19.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentrelay_cli-0.5.3.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 921c799d543082956d3f75e95c71bb5356c2694ca32b77af7ff66a2f5d51f253 |
| MD5 | d8c1ca4fb899744d30623089cc5cfd99 |
| BLAKE2b-256 | 9a460593d89a182164266863e5e47183a727388b20ccdb1ab98f4807156678c9 |


Provenance

The following attestation bundles were made for agentrelay_cli-0.5.3.tar.gz:

Publisher: publish.yml on npow/claude-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file agentrelay_cli-0.5.3-py3-none-any.whl.

File metadata

  • Download URL: agentrelay_cli-0.5.3-py3-none-any.whl
  • Upload date:
  • Size: 14.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentrelay_cli-0.5.3-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 63453c8cfe278fd2ffb69e4c1323ada3dbac6f9d74325e9ee3a8bbd404070857 |
| MD5 | f50743bba8b0ebda2f3836ea38ed3438 |
| BLAKE2b-256 | 1fccfd55bb83834d12afa703230e44bc8eb5f18fe1de5cdb55178278236c8099 |


Provenance

The following attestation bundles were made for agentrelay_cli-0.5.3-py3-none-any.whl:

Publisher: publish.yml on npow/claude-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
