A lightweight proxy server that converts Anthropic Messages API to OpenAI API
local-openai2anthropic
A lightweight proxy that lets applications built with the Claude SDK talk to locally hosted OpenAI-compatible LLMs.
What Problem This Solves
Many local LLM tools (vLLM, SGLang, etc.) provide an OpenAI-compatible API. But if you've built your app using Anthropic's Claude SDK, you can't use them directly.
This proxy translates Claude SDK calls to the OpenAI API format in real time, enabling:
- Local LLM inference with Claude-based apps
- Offline development without cloud API costs
- Privacy-first AI - data never leaves your machine
- Seamless model switching between cloud and local
- Web Search tool - built-in Tavily web search for local models
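To make the translation step concrete, here is a minimal sketch of how an Anthropic Messages request body could be mapped to an OpenAI chat-completions body. The function name `anthropic_to_openai` is hypothetical and this is not the proxy's actual code. It only covers plain-text content and a string `system` prompt, and it leaves out tools, images, and streaming details.

```python
from typing import Any


def anthropic_to_openai(request: dict[str, Any]) -> dict[str, Any]:
    """Map an Anthropic Messages body to an OpenAI chat-completions body.

    Illustrative only: handles text content, not tools or images.
    """
    messages: list[dict[str, Any]] = []

    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    system = request.get("system")
    if system:
        messages.append({"role": "system", "content": system})

    for msg in request["messages"]:
        content = msg["content"]
        # Anthropic content is either a string or a list of typed blocks.
        if isinstance(content, list):
            content = "".join(
                block["text"] for block in content if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})

    body: dict[str, Any] = {
        "model": request["model"],
        "messages": messages,
        "max_tokens": request["max_tokens"],
    }
    # stop_sequences is renamed; the other fields keep their names.
    if "stop_sequences" in request:
        body["stop"] = request["stop_sequences"]
    for key in ("temperature", "top_p", "stream"):
        if key in request:
            body[key] = request[key]
    return body


example = anthropic_to_openai(
    {
        "model": "meta-llama/Llama-2-7b-chat-hf",
        "max_tokens": 1024,
        "system": "You are concise.",
        "messages": [{"role": "user", "content": "Hello!"}],
    }
)
print(example["messages"])
```

The main structural difference is the system prompt: it moves from a top-level field into the message list.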
Supported Local Backends
Currently tested and supported:
| Backend | Description | Status |
|---|---|---|
| vLLM | High-throughput LLM inference | ✅ Fully supported |
| SGLang | Fast structured language model serving | ✅ Fully supported |
Other OpenAI-compatible backends may work but are not fully tested.
Quick Start
1. Install
pip install local-openai2anthropic
2. Configure Your LLM Backend (Optional)
Option A: Start a local LLM server
If you don't have an LLM server running, you can start one locally:
Example with vLLM:
vllm serve meta-llama/Llama-2-7b-chat-hf
# vLLM starts OpenAI-compatible API at http://localhost:8000/v1
Or with SGLang:
python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 8000
# SGLang starts at http://localhost:8000/v1
Option B: Use an existing OpenAI-compatible API
If you already have a deployed OpenAI-compatible API (local or remote), you can use it directly. Just note the base URL for the next step.
Examples:
- Local vLLM/SGLang: http://localhost:8000/v1
- Remote API: https://api.example.com/v1
Note: If you're using Ollama, it natively supports the Anthropic API format, so you don't need this proxy. Just point your Claude SDK directly to http://localhost:11434/v1.
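Before starting the proxy, it can help to confirm that the base URL from this step really serves an OpenAI-compatible API. The sketch below assumes the backend exposes the usual `/models` listing under its `/v1` base URL, as vLLM and SGLang do. The helper names `models_endpoint` and `list_model_ids` are hypothetical and are not part of this package.

```python
import json
import urllib.request


def models_endpoint(base_url: str) -> str:
    """Return the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def list_model_ids(
    base_url: str, api_key: str = "dummy", timeout: float = 5.0
) -> list[str]:
    """Fetch the model IDs the backend advertises.

    Raises urllib.error.URLError if the server is unreachable.
    """
    request = urllib.request.Request(
        models_endpoint(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # OpenAI-style listings wrap the models in a "data" array.
    return [model["id"] for model in payload.get("data", [])]
```

Calling `list_model_ids("http://localhost:8000/v1")` against a running backend should return the model names you can later pass as `model`.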
3. Start the Proxy
Option A: Run in background (recommended)
export OA2A_OPENAI_BASE_URL=http://localhost:8000/v1 # Your local LLM endpoint
export OA2A_OPENAI_API_KEY=dummy # Any value, not used by local backends
oa2a start # Start server in background
# Server starts at http://localhost:8080
# View logs
oa2a logs # Show last 50 lines of logs
oa2a logs -f # Follow logs in real-time (Ctrl+C to exit)
# Check status
oa2a status # Check if server is running
# Stop server
oa2a stop # Stop background server
# Restart server
oa2a restart # Restart with same settings
Option B: Run in foreground
export OA2A_OPENAI_BASE_URL=http://localhost:8000/v1
export OA2A_OPENAI_API_KEY=dummy
oa2a # Run server in foreground (blocking)
# Press Ctrl+C to stop
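When the proxy is started from a script, it may be useful to wait until it accepts connections before sending requests. The following is a minimal sketch that polls the TCP port. It assumes the default port 8080, and `wait_for_port` is a hypothetical helper, not a command or API provided by `oa2a`.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection succeeds, False after the timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)  # not up yet; retry shortly
    return False
```

For example, `wait_for_port("localhost", 8080)` can be called right after `oa2a start`. It only checks that the port is open, not that the backend behind the proxy is healthy.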
4. Use in Your App
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8080",  # Point to proxy
    api_key="dummy-key",  # Not used
)

message = client.messages.create(
    model="meta-llama/Llama-2-7b-chat-hf",  # Your local model name
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

print(message.content[0].text)
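The `message.content[0].text` access above works because the proxy returns an Anthropic-shaped message. As a rough illustration of the reverse translation, the sketch below maps a non-streaming OpenAI chat-completion response to that shape. The function `openai_to_anthropic` and the `STOP_REASONS` table are assumptions made for this example. The proxy's real mapping may differ, and tool calls are not handled here.

```python
from typing import Any

# OpenAI finish_reason -> Anthropic stop_reason (assumed mapping).
STOP_REASONS = {
    "stop": "end_turn",
    "length": "max_tokens",
    "tool_calls": "tool_use",
}


def openai_to_anthropic(response: dict[str, Any]) -> dict[str, Any]:
    """Convert a text-only OpenAI chat completion into an Anthropic message."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "id": response["id"],
        "type": "message",
        "role": "assistant",
        "model": response["model"],
        # Anthropic returns a list of content blocks, not a bare string.
        "content": [{"type": "text", "text": choice["message"]["content"] or ""}],
        "stop_reason": STOP_REASONS.get(choice.get("finish_reason"), "end_turn"),
        "usage": {
            "input_tokens": usage.get("prompt_tokens", 0),
            "output_tokens": usage.get("completion_tokens", 0),
        },
    }
```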
Using with Claude Code
You can configure Claude Code to use your local LLM through this proxy.
Configuration Steps
- Edit the Claude Code config file at ~/.claude/settings.json:
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8080",
    "ANTHROPIC_API_KEY": "dummy-key",
    "ANTHROPIC_MODEL": "meta-llama/Llama-2-7b-chat-hf",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "meta-llama/Llama-2-7b-chat-hf",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "meta-llama/Llama-2-7b-chat-hf",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "meta-llama/Llama-2-7b-chat-hf",
    "ANTHROPIC_REASONING_MODEL": "meta-llama/Llama-2-7b-chat-hf"
  }
}
| Variable | Description |
|---|---|
| ANTHROPIC_MODEL | General model setting |
| ANTHROPIC_DEFAULT_SONNET_MODEL | Default model for Sonnet mode (Claude Code default) |
| ANTHROPIC_DEFAULT_OPUS_MODEL | Default model for Opus mode |
| ANTHROPIC_DEFAULT_HAIKU_MODEL | Default model for Haiku mode |
| ANTHROPIC_REASONING_MODEL | Default model for reasoning tasks |
- Or set environment variables before running Claude Code:
export ANTHROPIC_BASE_URL=http://localhost:8080
export ANTHROPIC_API_KEY=dummy-key
claude
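Since every model variable in the settings file points at the same local model, the `env` block can be generated rather than edited by hand. This is a small sketch and the helper `build_claude_settings` is hypothetical. The variable names and default values are the ones listed above.

```python
import json

# Model-related variables from the table above; all get the same value.
MODEL_VARS = (
    "ANTHROPIC_MODEL",
    "ANTHROPIC_DEFAULT_SONNET_MODEL",
    "ANTHROPIC_DEFAULT_OPUS_MODEL",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
    "ANTHROPIC_REASONING_MODEL",
)


def build_claude_settings(
    model: str,
    base_url: str = "http://localhost:8080",
    api_key: str = "dummy-key",
) -> dict:
    """Build the settings.json structure that points Claude Code at the proxy."""
    env = {"ANTHROPIC_BASE_URL": base_url, "ANTHROPIC_API_KEY": api_key}
    env.update({name: model for name in MODEL_VARS})
    return {"env": env}


settings = build_claude_settings("meta-llama/Llama-2-7b-chat-hf")
print(json.dumps(settings, indent=2))
```

The printed JSON can be merged into `~/.claude/settings.json`. If the file already holds other settings, merge the `env` keys instead of overwriting the whole file.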
Complete Workflow Example
Make sure ~/.claude/settings.json is configured as described above.
Terminal 1 - Start your local LLM:
vllm serve meta-llama/Llama-2-7b-chat-hf
Terminal 2 - Start the proxy (background mode):
export OA2A_OPENAI_BASE_URL=http://localhost:8000/v1
export OA2A_OPENAI_API_KEY=dummy
export OA2A_TAVILY_API_KEY="tvly-your-tavily-api-key" # Optional: enable web search
oa2a start
Terminal 3 - Launch Claude Code:
claude
Now Claude Code will use your local LLM instead of the cloud API.
To stop the proxy:
oa2a stop
Features
- ✅ Streaming responses - Real-time token streaming via SSE
- ✅ Tool calling - Local LLM function calling support
- ✅ Vision models - Multi-modal input for vision-capable models
- ✅ Web Search - Give your local LLM internet access (see below)
- ✅ Thinking mode - Supports reasoning/thinking model outputs
Web Search Capability 🔍
Bridge the gap: Give your local LLM the web search power that Claude Code users enjoy!
When using locally-hosted models with Claude Code, you lose access to the built-in web search tool. This proxy fills that gap by providing a server-side web search implementation powered by Tavily.
The Problem
| Scenario | Web Search Available? |
|---|---|
| Using Claude (cloud) in Claude Code | ✅ Built-in |
| Using local vLLM/SGLang in Claude Code | ❌ Not available |
| Using this proxy + local LLM | ✅ Enabled via Tavily |
How It Works
Claude Code → Anthropic SDK → This Proxy → Local LLM
                                  ↓
                       Tavily API (Web Search)
The proxy intercepts web_search_20250305 tool calls and handles them directly, regardless of whether your local model supports web search natively.
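The interception idea can be sketched as follows: tool calls named `web_search_20250305` are answered by the proxy through a search function, while all other tool calls are passed back to the client. This only illustrates the control flow. `resolve_tool_calls`, the result dictionaries, and the `search` callable are hypothetical and do not reflect the proxy's internal code or the Tavily client API.

```python
from typing import Any, Callable

WEB_SEARCH_TOOL = "web_search_20250305"


def resolve_tool_calls(
    tool_calls: list[dict[str, Any]],
    search: Callable[[str], list[dict[str, str]]],
    max_uses: int = 5,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split tool calls into (handled by the proxy, passed through to the client)."""
    handled: list[dict[str, Any]] = []
    passthrough: list[dict[str, Any]] = []
    uses = 0
    for call in tool_calls:
        if call["name"] != WEB_SEARCH_TOOL:
            passthrough.append(call)  # ordinary tools stay the client's job
            continue
        if uses >= max_uses:
            # Cap searches per request (compare OA2A_WEBSEARCH_MAX_USES).
            handled.append({"tool_use_id": call["id"], "error": "max_uses_exceeded"})
            continue
        uses += 1
        results = search(call["input"]["query"])
        handled.append({"tool_use_id": call["id"], "results": results})
    return handled, passthrough
```

In a real deployment the `search` argument would wrap a call to the Tavily API. A stub that returns canned results is enough to exercise the logic.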
Setup Tavily Search
- Get a free API key at tavily.com - generous free tier available
- Configure the proxy:
export OA2A_OPENAI_BASE_URL=http://localhost:8000/v1
export OA2A_OPENAI_API_KEY=dummy
export OA2A_TAVILY_API_KEY="tvly-your-tavily-api-key" # Enable web search
oa2a
- Use in your app:
import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8080",
    api_key="dummy-key",
)

message = client.messages.create(
    model="meta-llama/Llama-2-7b-chat-hf",
    max_tokens=1024,
    tools=[
        {
            "name": "web_search_20250305",
            "description": "Search the web for current information",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search query"},
                },
                "required": ["query"],
            },
        }
    ],
    messages=[{"role": "user", "content": "What happened in AI today?"}],
)

if message.stop_reason == "tool_use":
    tool_use = message.content[-1]
    print(f"Searching: {tool_use.input}")
    # The proxy automatically calls Tavily and returns results
Tavily Configuration Options
| Variable | Default | Description |
|---|---|---|
| OA2A_TAVILY_API_KEY | - | Your Tavily API key (free at tavily.com) |
| OA2A_TAVILY_MAX_RESULTS | 5 | Number of search results to return |
| OA2A_TAVILY_TIMEOUT | 30 | Search timeout in seconds |
| OA2A_WEBSEARCH_MAX_USES | 5 | Max search calls per request |
Configuration
| Variable | Required | Default | Description |
|---|---|---|---|
| OA2A_OPENAI_BASE_URL | ✅ | - | Your local LLM's OpenAI-compatible endpoint |
| OA2A_OPENAI_API_KEY | ✅ | - | Any value (local backends usually ignore this) |
| OA2A_PORT | ❌ | 8080 | Proxy server port |
| OA2A_HOST | ❌ | 0.0.0.0 | Proxy server host |
| OA2A_TAVILY_API_KEY | ❌ | - | Enable web search (tavily.com) |
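As a way to read the two configuration tables together, here is a sketch of how these variables and their documented defaults could be loaded into a typed settings object. The `ProxySettings` class is hypothetical and is not the package's own configuration code. Only the variable names and defaults come from the tables above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED = ("OA2A_OPENAI_BASE_URL", "OA2A_OPENAI_API_KEY")


@dataclass(frozen=True)
class ProxySettings:
    openai_base_url: str
    openai_api_key: str
    host: str = "0.0.0.0"
    port: int = 8080
    tavily_api_key: Optional[str] = None  # web search is off when unset
    tavily_max_results: int = 5
    tavily_timeout: float = 30.0
    websearch_max_uses: int = 5

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "ProxySettings":
        """Read OA2A_* variables, falling back to the documented defaults."""
        missing = [name for name in REQUIRED if name not in env]
        if missing:
            raise ValueError(f"missing required variables: {', '.join(missing)}")
        return cls(
            openai_base_url=env["OA2A_OPENAI_BASE_URL"],
            openai_api_key=env["OA2A_OPENAI_API_KEY"],
            host=env.get("OA2A_HOST", "0.0.0.0"),
            port=int(env.get("OA2A_PORT", "8080")),
            tavily_api_key=env.get("OA2A_TAVILY_API_KEY"),
            tavily_max_results=int(env.get("OA2A_TAVILY_MAX_RESULTS", "5")),
            tavily_timeout=float(env.get("OA2A_TAVILY_TIMEOUT", "30")),
            websearch_max_uses=int(env.get("OA2A_WEBSEARCH_MAX_USES", "5")),
        )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.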
Architecture
Your App (Claude SDK)
          │
          ▼
┌────────────────────────┐
│ local-openai2anthropic │  ← This proxy
│      (Port 8080)       │
└────────────────────────┘
          │
          ▼
Your Local LLM Server
   (vLLM / SGLang)
(OpenAI-compatible API)
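The top arrow in the diagram is an ordinary HTTP request: with `base_url` set to the proxy, the Anthropic SDK sends a POST to `/v1/messages`. The sketch below builds the equivalent request with the standard library only, which can be handy for debugging without the SDK. The helper `build_messages_request` is hypothetical. The header names follow the public Anthropic Messages API, and whether the proxy checks them is an assumption.

```python
import json
import urllib.request


def build_messages_request(
    proxy_url: str, model: str, prompt: str, max_tokens: int = 1024
) -> urllib.request.Request:
    """Build (but do not send) an Anthropic-style request aimed at the proxy."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=proxy_url.rstrip("/") + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": "dummy-key",  # placeholder; local setups ignore it
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


req = build_messages_request(
    "http://localhost:8080", "meta-llama/Llama-2-7b-chat-hf", "Hello!"
)
print(req.get_method(), req.full_url)
# To send it against a running proxy:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["content"][0]["text"])
```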
Development
git clone https://github.com/dongfangzan/local-openai2anthropic.git
cd local-openai2anthropic
pip install -e ".[dev]"
pytest
License
Apache License 2.0