Local OpenAI-compatible proxy for AI coding tools and DeepSeek thinking models

DeepSeek Bridge

A local proxy that connects AI coding tools (Cursor, GitHub Copilot, Codex, and any OpenAI-compatible client) to DeepSeek's reasoning models by repairing the reasoning_content chain that these tools commonly drop from tool-call requests.

pip install deepseek-bridge

DeepSeek's thinking-mode API requires every assistant message in a multi-turn tool-call conversation to carry its complete reasoning_content back to the server. When a client omits this field, the API returns a 400 error. DeepSeek Bridge intercepts requests, restores the missing reasoning from a local cache, and forwards them upstream — no client-side changes needed.
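For reference, a second-turn tool-call payload under this contract looks roughly like the sketch below. Field values are illustrative; the shape follows DeepSeek's Chat Completions API as described above:

```python
# Sketch of a follow-up request in a thinking-mode tool-call conversation.
# The assistant message that made the tool call must carry its full
# reasoning_content back to the server; clients that strip it get a 400.
followup_request = {
    "model": "deepseek-v4-pro",
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {
            "role": "assistant",
            "content": None,
            # This is the field coding tools commonly drop. The bridge
            # restores it from its local cache before forwarding upstream.
            "reasoning_content": "The user wants current weather, so I should call the weather tool...",
            "tool_calls": [
                {
                    "id": "call_0",
                    "type": "function",
                    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
                }
            ],
        },
        {"role": "tool", "tool_call_id": "call_0", "content": '{"temp_c": 18}'},
    ],
}

# The contract in one line: every assistant message with tool calls
# must still carry its reasoning_content when it is sent back.
assert all(
    "reasoning_content" in m
    for m in followup_request["messages"]
    if m["role"] == "assistant" and m.get("tool_calls")
)
```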

Features

Reasoning Repair

  • Injects reasoning_content into outgoing tool-call requests, restoring reasoning previously cached from both streaming and non-streaming DeepSeek responses.
  • Displays thinking tokens in the client UI using collapsible Markdown <details> blocks.
  • Cursor Agent Mode support: automatically converts Responses API payloads to Chat Completions format.
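The display behavior can be sketched as follows. This is an illustration of the idea, not the bridge's actual code; the function name and the "Thinking" summary label are invented for the example:

```python
def render_reasoning(reasoning: str, collapsible: bool = True) -> str:
    """Mirror reasoning text into client-visible Markdown.

    Collapsible output uses an HTML <details> block, which most
    Markdown renderers leave interactive; the fallback is a plain
    blockquote.
    """
    if not collapsible:
        return f"> {reasoning}\n\n"
    return (
        "<details>\n"
        "<summary>Thinking</summary>\n\n"
        f"{reasoning}\n\n"
        "</details>\n\n"
    )
```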

Connection Resilience

  • Connection pooling via urllib3 with keep-alive and minimal retries.
  • Bounded thread pool prevents thread exhaustion on long-running streaming connections.
  • Configurable SSE read timeout (default 180 seconds) prevents hung threads on silent upstreams.
  • Tunnel support (cloudflared by default, ngrok optional) with health check and automatic reconnection.
  • Graceful shutdown on SIGTERM — active requests drain, reasoning cache is flushed.
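The shutdown sequence in the last bullet can be sketched as below. The `server` and `cache` objects are hypothetical stand-ins for the bridge's internals, used only to show the ordering: stop accepting, drain, then flush:

```python
import signal


def sketch_graceful_shutdown(server, cache):
    """Register a drain-then-flush SIGTERM handler.

    A sketch under stated assumptions: `server` exposes
    stop_accepting() and wait_for_active_requests(), and `cache`
    exposes flush(). Names are illustrative, not the bridge's API.
    """

    def handle_sigterm(signum, frame):
        server.stop_accepting()            # refuse new requests
        server.wait_for_active_requests()  # let in-flight requests drain
        cache.flush()                      # persist the reasoning cache

    signal.signal(signal.SIGTERM, handle_sigterm)
    return handle_sigterm
```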

API Compatibility

  • system_fingerprint in every streaming and non-streaming response.
  • x-request-id UUID header on every response.
  • OpenAI-standard error format.
  • CORS headers enabled by default.
  • /v1/embeddings, /v1/health, and /v1/models endpoints.
  • /v1/completions legacy endpoint (auto-converts prompt to messages).
  • Multimodal content arrays preserved.
  • DeepSeek V4 thinking parameter support (thinking, reasoning_effort, response_format, logprobs).
  • Silent mapping of legacy model names (deepseek-chat, deepseek-reasoner) to deepseek-v4-flash.

Logging and Observability

  • Persistent log files with --log-dir.
  • Heartbeat and pool utilization counters.
  • Full structured request traces with --trace-dir.
  • Terminal UI dashboard with real-time metrics, config editing, and log viewing.

TUI Dashboard

Starting with v0.2.0, DeepSeek Bridge opens a Terminal UI dashboard by default. The dashboard provides live monitoring and configuration:

  • Dashboard tab — Real-time request metrics, uptime, tunnel status, and pool utilization.
  • Config tab — Edit proxy settings (model, network, storage) without restarting.
  • Logs tab — Streaming log viewer with auto-scroll.

Use --headless to disable the TUI and run in classic CLI mode.

Why This Exists

DeepSeek's thinking-mode API enforces a strict contract: every assistant message that participates in a tool-call chain must include the full reasoning_content field. Some AI coding tools (including Cursor) drop this field from their chat transcript, causing DeepSeek to reject subsequent tool-call requests.

DeepSeek Bridge stores copies of reasoning_content from every response and patches missing entries back into requests before forwarding them upstream.
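The repair step can be sketched in a few lines. This is a simplified stand-in for the bridge's actual logic: the real bridge keys a SQLite cache by a hash of the conversation prefix, while this sketch keys an in-memory dict by the message's serialized tool calls for brevity:

```python
import json


def repair_reasoning(messages, cache):
    """Patch dropped reasoning_content fields before forwarding upstream.

    `cache` maps a key to reasoning text saved from an earlier response.
    Only assistant messages that made tool calls and are missing their
    reasoning get patched; everything else passes through untouched.
    """
    for msg in messages:
        needs_repair = (
            msg.get("role") == "assistant"
            and msg.get("tool_calls")
            and not msg.get("reasoning_content")
        )
        if needs_repair:
            key = json.dumps(msg["tool_calls"], sort_keys=True)
            cached = cache.get(key)
            if cached is not None:
                msg["reasoning_content"] = cached
    return messages
```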

Installation

# From PyPI
pip install deepseek-bridge

# From source
git clone https://github.com/breixopd/deepseek-bridge.git
cd deepseek-bridge
uv run deepseek-bridge

Usage

# Full TUI dashboard (default)
deepseek-bridge

# Headless mode — no TUI, classic CLI output
deepseek-bridge --headless

# Run without tunnel (localhost only)
deepseek-bridge --tunnel none --port 9000

# Debug output with trace dumps
deepseek-bridge --debug --trace-dir ./dumps

# Use a custom config file
deepseek-bridge --config ./my-config.yaml

# Clear reasoning cache and exit
deepseek-bridge --clear-reasoning-cache

# Disable thinking display in client UI
deepseek-bridge --no-display-reasoning

On first run, DeepSeek Bridge creates:

  • ~/.deepseek-bridge/config.yaml — configuration file
  • ~/.deepseek-bridge/reasoning_content.sqlite3 — reasoning cache

Configuration

All settings are configurable via ~/.deepseek-bridge/config.yaml or command-line overrides. Example configuration:

model: deepseek-v4-pro
base_url: https://api.deepseek.com
thinking: enabled
reasoning_effort: max
display_reasoning: true
collapsible_reasoning: true

host: 127.0.0.1
port: 9000
tunnel: cloudflared
debug: false
cors: true
ollama: true
stream_read_timeout: 180
request_timeout: 300

Client Setup

Cursor

In Cursor, add a custom model with these settings:

  • Model: deepseek-v4-pro (or deepseek-v4-flash)
  • API Key: Your DeepSeek API key
  • Base URL: Your tunnel HTTPS URL with /v1 path (e.g., https://app.example.com/v1)

Note on tunnels: Cursor blocks non-public URLs such as localhost. DeepSeek Bridge uses Cloudflare Tunnel by default — a free, persistent HTTPS tunnel with no bandwidth or time limits. Use --tunnel none to disable tunneling. Use --tunnel ngrok if you prefer ngrok.

Cloudflare Tunnel Setup

Cloudflare Named Tunnels are free, persistent, support SSE streaming, and have no bandwidth/time limits. One-time setup:

# Install cloudflared
brew install cloudflare/cloudflare/cloudflared   # macOS
# Or download from: https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/downloads/

# Login and create a tunnel
cloudflared tunnel login
cloudflared tunnel create deepseek-bridge

# Point it at your domain
cloudflared tunnel route dns deepseek-bridge app.example.com

Then add your tunnel URL to ~/.deepseek-bridge/config.yaml:

tunnel: cloudflared
cf_url: https://app.example.com

Use --tunnel cloudflared on the CLI, or select cloudflared in the TUI dashboard.

GitHub Copilot

Configure the Ollama endpoint in VS Code:

{
  "github.copilot.chat.byok.ollamaEndpoint": "http://localhost:9000"
}

Then open Copilot Chat, navigate to "Manage Models", and your DeepSeek models appear automatically.

Agent Mode is supported — the proxy advertises tool_calls capability via the Ollama /api/show endpoint and handles reasoning repair across tool-call chains.

For the new customOAIModels path (VS Code Insiders 1.104+):

{
  "github.copilot.chat.customOAIModels": {
    "deepseek-v4-pro": {
      "name": "DeepSeek V4 Pro",
      "url": "http://localhost:9000/v1/chat/completions",
      "toolCalling": true,
      "vision": false,
      "thinking": true,
      "maxInputTokens": 1000000,
      "maxOutputTokens": 384000,
      "streaming": true,
      "requiresAPIKey": true
    }
  }
}

Other OpenAI-Compatible Clients

Any client that speaks the OpenAI /v1/chat/completions API can use DeepSeek Bridge. Set the client's base URL to http://localhost:9000/v1 (or your tunnel URL).
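For example, with the official openai Python package (assuming you have it installed and the bridge listening on port 9000), pointing a client at the bridge is a one-line base-URL change; the snippet below shows that, then builds the equivalent raw HTTP request with the standard library without sending it:

```python
# With the `openai` package (assumption: installed, bridge on port 9000):
#
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:9000/v1", api_key="sk-...")
#   client.chat.completions.create(model="deepseek-v4-pro", messages=[...])
#
# The same works with bare HTTP. This constructs the request only:
import json
import urllib.request

BASE_URL = "http://localhost:9000/v1"  # or your tunnel HTTPS URL

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(
        {
            "model": "deepseek-v4-pro",
            "messages": [{"role": "user", "content": "hello"}],
        }
    ).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-your-deepseek-key",  # your real key
    },
)
```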

How It Works

  1. Request interception: The proxy receives a /v1/chat/completions request from the client.
  2. Format detection: If the request uses OpenAI Responses API format (common in Cursor Agent Mode), it is converted to Chat Completions format.
  3. Reasoning repair: Each assistant message in the conversation is checked. Missing reasoning_content fields are looked up in the local SQLite cache and restored.
  4. Cache isolation: Cache keys are scoped by a SHA-256 hash of the conversation prefix, upstream model, configuration, and API key. Different conversations and users never collide.
  5. Response processing: Reasoning content from the upstream response is cached for future requests. Display adapters mirror reasoning thoughts into Markdown <details> blocks visible in the client.
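The key derivation in step 4 can be sketched as below. The exact fields and serialization the bridge hashes may differ; the point is that mixing the API key into the hash keeps users and conversation branches isolated:

```python
import hashlib
import json


def cache_key(conversation_prefix, model, config, api_key):
    """Derive a scoped SHA-256 cache key, per step 4 above.

    A sketch of the isolation idea: any difference in the conversation
    prefix, upstream model, configuration, or API key yields a
    different key, so cached reasoning can never leak across them.
    """
    material = json.dumps(
        {
            "prefix": conversation_prefix,
            "model": model,
            "config": config,
            "api_key": api_key,
        },
        sort_keys=True,
    )
    return hashlib.sha256(material.encode()).hexdigest()
```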

Known Limitations

Cursor Sub-Agents

Cursor sub-agents do not inherit custom API base URL or API key settings. This is a Cursor-side bug (see forum thread). Use the main agent (Cmd+Shift+0 to toggle) for direct DeepSeek chat. Sub-agents that route through the proxy will work correctly; those that bypass it fall back to Cursor's built-in models.

Cursor Agent Mode Responses API Format

Cursor Agent mode sends OpenAI Responses API-format payloads to the Chat Completions endpoint. DeepSeek Bridge detects and converts these automatically.

Reasoning Display

Cursor's native reasoning UI is available only for Cursor's own models. For custom endpoints, reasoning content is mirrored into visible Markdown details blocks. Use --no-display-reasoning to disable this behavior.

Development

# Run tests
uv run python -m unittest discover -s tests

# Format and lint
uv run pre-commit run --all-files

# Type check
uv run mypy src/ --check-untyped-defs

# Run with coverage
uv run coverage run -m unittest discover -s tests
uv run coverage report

CLI Reference

| Flag | Default | Description |
|------|---------|-------------|
| `--headless` | off | Run without TUI |
| `--model` | deepseek-v4-pro | Fallback model when request omits it |
| `--thinking` | enabled | DeepSeek thinking mode |
| `--reasoning-effort` | max | Reasoning effort level |
| `--display-reasoning` | on | Show reasoning content in client UI |
| `--collapsible-reasoning` | on | Use collapsible Markdown for reasoning |
| `--host` | 127.0.0.1 | Bind address |
| `--port` | 9000 | Bind port |
| `--tunnel` | cloudflared | Tunnel service (none, cloudflared, ngrok) |
| `--cf-url` | none | Cloudflare tunnel public URL |
| `--base-url` | https://api.deepseek.com | Upstream DeepSeek API URL |
| `--cors` | on | Send CORS headers |
| `--stream-read-timeout` | 180 | SSE read timeout in seconds |
| `--max-thread-pool` | 20 | Max concurrent request threads |
| `--max-pool-connections` | 10 | Max upstream connections |
| `--ollama` / `--no-ollama` | on | Enable/disable Ollama endpoints |
| `--log-dir` | none | Directory for persistent log files |
| `--trace-dir` | none | Directory for request trace dumps |
| `--debug` | off | Enable DEBUG-level log output |
| `--compact` | off | One-line-per-request output |
| `--config` | ~/.deepseek-bridge/config.yaml | Config file path |
| `--no-log` | off | Disable all log file output |
| `--reasoning-content-path` | ~/.deepseek-bridge/reasoning_content.sqlite3 | Reasoning cache path |
| `--reasoning-cache-max-age-seconds` | 604800 | Max age of cached reasoning (seconds) |
| `--missing-reasoning-strategy` | recover | Strategy for missing reasoning (recover/reject) |
| `--max-request-body-bytes` | 20971520 | Max request body size in bytes |
| `--clear-reasoning-cache` | off | Clear reasoning cache and exit |
| `--version` | - | Print version and exit |

License

MIT License

Acknowledgements

Based on yxlao/deepseek-cursor-proxy, the original project that started this work.
