Local OpenAI-compatible proxy for AI coding tools and DeepSeek thinking models
Project description
DeepSeek Bridge
A local proxy that connects AI coding tools (Cursor, GitHub Copilot, Codex, and any OpenAI-compatible client) to DeepSeek's reasoning models by repairing the reasoning_content chain that these tools commonly drop from tool-call requests.
pip install deepseek-bridge
DeepSeek's thinking-mode API requires every assistant message in a multi-turn tool-call conversation to carry its complete reasoning_content back to the server. When a client omits this field, the API returns a 400 error. DeepSeek Bridge intercepts requests, restores the missing reasoning from a local cache, and forwards them upstream — no client-side changes needed.
Features
Reasoning Repair
- Injects `reasoning_content` into outgoing tool-call requests, restoring previously cached reasoning from regular and streamed DeepSeek responses.
- Displays thinking tokens in the client UI using collapsible Markdown `<details>` blocks.
- Cursor Agent Mode support: automatically converts Responses API payloads to Chat Completions format.
Connection Resilience
- Connection pooling via `urllib3` with keep-alive and minimal retries (see the sketch after this list).
- Bounded thread pool prevents thread exhaustion on long-running streaming connections.
- Configurable SSE read timeout (default 180 seconds) prevents hung threads on silent upstreams.
- Tunnel support (localhost.run by default, ngrok optional) with health check and automatic reconnection.
- Graceful shutdown on SIGTERM: active requests drain and the reasoning cache is flushed.
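To make the pooling behavior concrete, here is a minimal sketch of a `urllib3` pool with keep-alive and a small retry budget. It is illustrative only; the pool sizes, timeouts, and retry policy shown are assumptions, not the proxy's actual settings.

```python
# Illustrative only: a keep-alive connection pool with a small retry budget,
# roughly matching the behavior described above.
import urllib3
from urllib3.util.retry import Retry

retries = Retry(total=2, backoff_factor=0.5, status_forcelist=(502, 503, 504))
pool = urllib3.PoolManager(
    num_pools=4,    # distinct upstream hosts
    maxsize=10,     # persistent (keep-alive) connections per host
    retries=retries,
    timeout=urllib3.Timeout(connect=5.0, read=180.0),  # long read window for SSE
)

resp = pool.request(
    "GET",
    "https://api.deepseek.com/v1/models",
    headers={"Authorization": "Bearer <your-api-key>"},
)
print(resp.status)
```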
API Compatibility
- `system_fingerprint` in every streaming and non-streaming response.
- `x-request-id` UUID header on every response.
- OpenAI-standard error format.
- CORS headers enabled by default.
- `/v1/embeddings`, `/v1/health`, and `/v1/models` endpoints.
- `/v1/completions` legacy endpoint (auto-converts `prompt` to `messages`; see the sketch after this list).
- Multimodal content arrays preserved.
- DeepSeek V4 thinking parameter support (`thinking`, `reasoning_effort`, `response_format`, `logprobs`).
- Silent mapping of legacy model names (`deepseek-chat`, `deepseek-reasoner`) to `deepseek-v4-flash`.
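As an illustration of the legacy-endpoint conversion and model aliasing, here is a rough sketch. The function and variable names are hypothetical; the proxy's actual conversion code may differ.

```python
# Hypothetical sketch: adapt a legacy /v1/completions payload to the
# /v1/chat/completions shape and alias legacy model names.
LEGACY_MODEL_ALIASES = {
    "deepseek-chat": "deepseek-v4-flash",
    "deepseek-reasoner": "deepseek-v4-flash",
}

def adapt_legacy_completion(body: dict) -> dict:
    chat_body = dict(body)
    prompt = chat_body.pop("prompt", "")
    # A completions-style prompt becomes a single user message.
    chat_body["messages"] = [{"role": "user", "content": prompt}]
    model = chat_body.get("model", "deepseek-v4-flash")
    chat_body["model"] = LEGACY_MODEL_ALIASES.get(model, model)
    return chat_body

print(adapt_legacy_completion({"model": "deepseek-chat", "prompt": "Hello"}))
```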
Logging and Observability
- Persistent log files with `--log-dir`.
- Heartbeat and pool utilization counters.
- Full structured request traces with `--trace-dir`.
- Terminal UI dashboard with real-time metrics, config editing, and log viewing.
TUI Dashboard
Starting with v0.2.0, DeepSeek Bridge opens a Terminal UI dashboard by default. The dashboard provides live monitoring and configuration:
- Dashboard tab — Real-time request metrics, uptime, tunnel status, and pool utilization.
- Config tab — Edit proxy settings (model, network, storage) without restarting.
- Logs tab — Streaming log viewer with auto-scroll.
Use --headless to disable the TUI and run in classic CLI mode.
Why This Exists
DeepSeek's thinking-mode API enforces a strict contract: every assistant message that participates in a tool-call chain must include the full reasoning_content field. Some AI coding tools (including Cursor) drop this field from their chat transcript, causing DeepSeek to reject subsequent tool-call requests.
DeepSeek Bridge stores copies of reasoning_content from every response and patches missing entries back into requests before forwarding them upstream.
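For illustration, here is roughly what a repaired tool-call turn looks like when it is forwarded upstream. The values are invented, but the `reasoning_content` field on the assistant message is the piece that clients tend to drop and the proxy restores.

```python
# Hypothetical request body after repair (values invented for illustration).
repaired_request = {
    "model": "deepseek-v4-pro",
    "messages": [
        {"role": "user", "content": "List the files in src/"},
        {
            "role": "assistant",
            "content": "",
            # Restored from the local cache if the client dropped it:
            "reasoning_content": "The user wants a directory listing, so call list_files on src/.",
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {"name": "list_files", "arguments": "{\"path\": \"src/\"}"},
            }],
        },
        {"role": "tool", "tool_call_id": "call_1", "content": "bridge.py\nserver.py"},
        {"role": "user", "content": "Summarize them."},
    ],
}
```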
Installation
# From PyPI
pip install deepseek-bridge
# From source
git clone https://github.com/breixopd/deepseek-bridge.git
cd deepseek-bridge
uv run deepseek-bridge
Usage
# Full TUI dashboard (default)
deepseek-bridge
# Headless mode — no TUI, classic CLI output
deepseek-bridge --headless
# Run without tunnel (localhost only)
deepseek-bridge --tunnel off --port 9000
# Debug output with trace dumps
deepseek-bridge --debug --trace-dir ./dumps
# Use a custom config file
deepseek-bridge --config ./my-config.yaml
# Clear reasoning cache and exit
deepseek-bridge --clear-reasoning-cache
# Disable thinking display in client UI
deepseek-bridge --no-display-reasoning
On first run, DeepSeek Bridge creates:
- `~/.deepseek-bridge/config.yaml` — configuration file
- `~/.deepseek-bridge/reasoning_content.sqlite3` — reasoning cache
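Once the proxy is running, a quick way to confirm it is reachable is to hit the health and model-list endpoints. This sketch uses only the Python standard library and assumes the default port of 9000.

```python
# Smoke test against a locally running DeepSeek Bridge (default port 9000).
import urllib.request

for path in ("/v1/health", "/v1/models"):
    with urllib.request.urlopen(f"http://localhost:9000{path}", timeout=5) as resp:
        print(path, resp.status, resp.read().decode("utf-8")[:200])
```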
Configuration
All settings are configurable via ~/.deepseek-bridge/config.yaml or command-line overrides. Example configuration:
model: deepseek-v4-pro
base_url: https://api.deepseek.com
thinking: enabled
reasoning_effort: max
display_reasoning: true
collapsible_reasoning: true
host: 127.0.0.1
port: 9000
tunnel: localhostrun
debug: false
cors: true
ollama: true
stream_read_timeout: 180
request_timeout: 300
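Command-line flags override values from the YAML file. The sketch below shows that precedence conceptually; the keys mirror the example config above, but the loading code (and the use of PyYAML) is an assumption, not the proxy's actual implementation.

```python
# Illustrative only: merge file settings with CLI overrides (CLI wins).
import argparse
from pathlib import Path
import yaml  # assumes PyYAML is available

def load_settings(config_path: str, cli_args: list[str]) -> dict:
    settings = yaml.safe_load(Path(config_path).expanduser().read_text()) or {}
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int)
    parser.add_argument("--model")
    overrides = vars(parser.parse_args(cli_args))
    # Only apply flags the user actually passed.
    settings.update({k: v for k, v in overrides.items() if v is not None})
    return settings

print(load_settings("~/.deepseek-bridge/config.yaml", ["--port", "9100"]))
```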
Client Setup
Cursor
In Cursor, add a custom model with these settings:
- Model: `deepseek-v4-pro` (or `deepseek-v4-flash`)
- API Key: Your DeepSeek API key
- Base URL: Your tunnel HTTPS URL with the `/v1` path (e.g., `https://abc123.lhr.life/v1`)
Note on tunnels: Cursor blocks non-public URLs such as `localhost`. DeepSeek Bridge uses localhost.run by default (a zero-dependency SSH tunnel, no installation needed). Use `--tunnel off` to disable tunneling, or `--tunnel ngrok` if you prefer ngrok.
GitHub Copilot
Configure the Ollama endpoint in VS Code:
{
"github.copilot.chat.byok.ollamaEndpoint": "http://localhost:9000"
}
Then open Copilot Chat, navigate to "Manage Models", and your DeepSeek models appear automatically.
Agent Mode is supported — the proxy advertises tool_calls capability via the Ollama /api/show endpoint and handles reasoning repair across tool-call chains.
For the new customOAIModels path (VS Code Insiders 1.104+):
{
"github.copilot.chat.customOAIModels": {
"deepseek-v4-pro": {
"name": "DeepSeek V4 Pro",
"url": "http://localhost:9000/v1/chat/completions",
"toolCalling": true,
"vision": false,
"thinking": true,
"maxInputTokens": 1000000,
"maxOutputTokens": 384000,
"streaming": true,
"requiresAPIKey": true
}
}
}
Other OpenAI-Compatible Clients
Any client that speaks the OpenAI /v1/chat/completions API can use DeepSeek Bridge. Set the client's base URL to http://localhost:9000/v1 (or your tunnel URL).
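For example, with the official OpenAI Python SDK (assuming it is installed and the proxy is running locally on port 9000):

```python
# Point the OpenAI SDK at DeepSeek Bridge instead of api.openai.com.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",  # or your tunnel URL + /v1
    api_key="YOUR_DEEPSEEK_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Explain what this proxy does in one sentence."}],
)
print(response.choices[0].message.content)
```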
How It Works
- Request interception: The proxy receives a `/v1/chat/completions` request from the client.
- Format detection: If the request uses the OpenAI Responses API format (common in Cursor Agent Mode), it is converted to Chat Completions format.
- Reasoning repair: Each assistant message in the conversation is checked. Missing `reasoning_content` fields are looked up in the local SQLite cache and restored.
- Cache isolation: Cache keys are scoped by a SHA-256 hash of the conversation prefix, upstream model, configuration, and API key, so different conversations and users never collide. (A rough sketch of this key derivation follows the list.)
- Response processing: Reasoning content from the upstream response is cached for future requests. Display adapters mirror reasoning thoughts into Markdown `<details>` blocks visible in the client.
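The following sketch shows one way such a scoped cache key could be derived. It is conceptual: the fields, separators, and hashing details of the real implementation may differ.

```python
# Conceptual sketch: scope a reasoning-cache key to the conversation prefix,
# upstream model, configuration, and API key so unrelated sessions never collide.
import hashlib
import json

def reasoning_cache_key(messages: list[dict], model: str,
                        config: dict, api_key: str) -> str:
    prefix = json.dumps(messages, sort_keys=True, ensure_ascii=False)
    material = "\x1f".join([
        prefix,
        model,
        json.dumps(config, sort_keys=True),
        hashlib.sha256(api_key.encode()).hexdigest(),  # never store the raw key
    ])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()

key = reasoning_cache_key(
    messages=[{"role": "user", "content": "hi"}],
    model="deepseek-v4-pro",
    config={"thinking": "enabled"},
    api_key="sk-example",
)
print(key)
```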
Known Limitations
Cursor Sub-Agents
Cursor sub-agents do not inherit custom API base URL or API key settings. This is a Cursor-side bug (see forum thread). Use the main agent (Cmd+Shift+0 to toggle) for direct DeepSeek chat. Sub-agents that route through the proxy will work correctly; those that bypass it fall back to Cursor's built-in models.
Cursor Agent Mode Responses API Format
Cursor Agent mode sends OpenAI Responses API-format payloads to the Chat Completions endpoint. DeepSeek Bridge detects and converts these automatically.
Reasoning Display
Cursor's native reasoning UI is available only for Cursor's own models. For custom endpoints, reasoning content is mirrored into visible Markdown details blocks. Use --no-display-reasoning to disable this behavior.
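For illustration, mirroring reasoning into a collapsible block can be as simple as wrapping it in a `<details>` element. This helper is hypothetical and not the proxy's actual formatting code.

```python
# Hypothetical helper: wrap reasoning text in a collapsible Markdown/HTML block.
def wrap_reasoning(reasoning: str, summary: str = "Thinking") -> str:
    return f"<details>\n<summary>{summary}</summary>\n\n{reasoning}\n\n</details>\n"

print(wrap_reasoning("First check the cache, then forward the request."))
```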
Development
# Run tests
uv run python -m unittest discover -s tests
# Format and lint
uv run pre-commit run --all-files
# Type check
uv run mypy src/ --check-untyped-defs
# Run with coverage
uv run coverage run -m unittest discover -s tests
uv run coverage report
CLI Reference
| Flag | Default | Description |
|---|---|---|
| `--headless` | off | Run without TUI |
| `--model` | `deepseek-v4-pro` | Fallback model when request omits it |
| `--thinking` | `enabled` | DeepSeek thinking mode |
| `--reasoning-effort` | `max` | Reasoning effort level |
| `--display-reasoning` | on | Show reasoning content in client UI |
| `--collapsible-reasoning` | on | Use collapsible Markdown for reasoning |
| `--host` | `127.0.0.1` | Bind address |
| `--port` | `9000` | Bind port |
| `--tunnel` | `localhostrun` | Tunnel service (off, localhostrun, ngrok) |
| `--base-url` | `https://api.deepseek.com` | Upstream DeepSeek API URL |
| `--cors` | on | Send CORS headers |
| `--stream-read-timeout` | `180` | SSE read timeout in seconds |
| `--max-thread-pool` | `20` | Max concurrent request threads |
| `--max-pool-connections` | `10` | Max upstream connections |
| `--ollama` / `--no-ollama` | on | Enable/disable Ollama endpoints |
| `--log-dir` | none | Directory for persistent log files |
| `--trace-dir` | none | Directory for request trace dumps |
| `--debug` | off | Enable DEBUG-level log output |
| `--compact` | off | One-line-per-request output |
| `--config` | `~/.deepseek-bridge/config.yaml` | Config file path |
| `--no-log` | off | Disable all log file output |
| `--reasoning-content-path` | `~/.deepseek-bridge/reasoning_content.sqlite3` | Reasoning cache path |
| `--reasoning-cache-max-age-seconds` | `604800` | Max age of cached reasoning (seconds) |
| `--missing-reasoning-strategy` | `recover` | Strategy for missing reasoning (recover/reject) |
| `--max-request-body-bytes` | `20971520` | Max request body size in bytes |
| `--clear-reasoning-cache` | off | Clear reasoning cache and exit |
| `--version` | - | Print version and exit |
License
MIT License
Acknowledgements
Based on yxlao/deepseek-cursor-proxy, the original project that started this work.
Download files
Source Distribution: deepseek_bridge-0.5.0.tar.gz
Built Distribution: deepseek_bridge-0.5.0-py3-none-any.whl
File details
Details for the file deepseek_bridge-0.5.0.tar.gz.
File metadata
- Download URL: deepseek_bridge-0.5.0.tar.gz
- Upload date:
- Size: 330.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `19d24729fbdc1be2d1a4dafbb018dc953fc9e3eaea2d25c0460ed677be918d51` |
| MD5 | `ca3470aec1c6ce30a8da11c50eb7f316` |
| BLAKE2b-256 | `bd883328519f86b0ceec4ed2508ed6070b256f0af3f7c0120ad87e78f2ca3c1e` |
File details
Details for the file deepseek_bridge-0.5.0-py3-none-any.whl.
File metadata
- Download URL: deepseek_bridge-0.5.0-py3-none-any.whl
- Upload date:
- Size: 70.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `764124a45a9a33b9246e780511ba0e66ead575d80c43eb8509a1e6473adbab8d` |
| MD5 | `d7df0ffdff5de59a11c5308bb303897f` |
| BLAKE2b-256 | `2b848b771b2891708ca70a463dfeeb39c58a8d01062a108c48980b7bd897e486` |