
Local OpenAI-compatible proxy for AI coding tools and DeepSeek thinking models


DeepSeek Bridge


A local proxy that connects AI coding tools (Cursor, GitHub Copilot, Codex, and any OpenAI-compatible client) to DeepSeek's reasoning models by repairing the reasoning_content chain that these tools commonly drop from tool-call requests.

pip install deepseek-bridge

DeepSeek's thinking-mode API requires every assistant message in a multi-turn tool-call conversation to carry its complete reasoning_content back to the server. When a client omits this field, the API returns a 400 error. DeepSeek Bridge intercepts requests, restores the missing reasoning from a local cache, and forwards them upstream — no client-side changes needed.

Features

Reasoning Repair

  • Injects reasoning_content into outgoing tool-call requests, restoring previously cached reasoning from regular and streamed DeepSeek responses.
  • Displays thinking tokens in the client UI using collapsible Markdown <details> blocks.
  • Cursor Agent Mode support: automatically converts Responses API payloads to Chat Completions format.
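A minimal sketch of the collapsible-display idea (the function name is hypothetical, not the proxy's actual API): reasoning text is wrapped in an HTML `<details>` element, which most Markdown renderers show as a collapsed, expandable block.

```python
# Hypothetical helper: wrap reasoning text in a collapsible Markdown block.
def wrap_reasoning(reasoning: str, summary: str = "Thinking") -> str:
    """Return reasoning wrapped in a <details> element that Markdown
    renderers display collapsed under a clickable summary line."""
    return f"<details>\n<summary>{summary}</summary>\n\n{reasoning}\n\n</details>"

print(wrap_reasoning("First, check the cache for a matching entry..."))
```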

Connection Resilience

  • Connection pooling via urllib3 with keep-alive and minimal retries.
  • Bounded thread pool prevents thread exhaustion on long-running streaming connections.
  • Configurable SSE read timeout (default 180 seconds) prevents hung threads on silent upstreams.
  • Ngrok tunnel health check with automatic reconnection.
  • Graceful shutdown on SIGTERM — active requests drain, reasoning cache is flushed.

API Compatibility

  • system_fingerprint in every streaming and non-streaming response.
  • x-request-id UUID header on every response.
  • OpenAI-standard error format.
  • CORS headers enabled by default.
  • /v1/embeddings, /v1/health, and /v1/models endpoints.
  • /v1/completions legacy endpoint (auto-converts prompt to messages).
  • Multimodal content arrays preserved.
  • DeepSeek V4 thinking parameter support (thinking, reasoning_effort, response_format, logprobs).
  • Silent mapping of legacy model names (deepseek-chat, deepseek-reasoner) to deepseek-v4-flash.
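Conceptually, the legacy-endpoint conversion is a simple reshaping of the request body. This sketch (not the proxy's actual code) shows the idea: the `prompt` string becomes a single user message, and everything else passes through.

```python
# Illustrative sketch of legacy /v1/completions -> /v1/chat/completions
# conversion: move "prompt" into a one-element "messages" list.
def completions_to_chat(body: dict) -> dict:
    chat = {k: v for k, v in body.items() if k != "prompt"}
    chat["messages"] = [{"role": "user", "content": body.get("prompt", "")}]
    return chat

print(completions_to_chat({"model": "deepseek-v4-flash", "prompt": "Hi"}))
```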

Logging and Observability

  • Persistent log files with --log-dir.
  • Heartbeat and pool utilization counters.
  • Full structured request traces with --trace-dir.
  • Terminal UI dashboard with real-time metrics, config editing, and log viewing.

TUI Dashboard

Starting with v0.2.0, DeepSeek Bridge opens a Terminal UI dashboard by default. The dashboard provides live monitoring and configuration:

  • Dashboard tab — Real-time request metrics, uptime, ngrok status, and pool utilization.
  • Config tab — Edit proxy settings (model, network, storage) without restarting.
  • Logs tab — Streaming log viewer with auto-scroll.

Use --headless to disable the TUI and run in classic CLI mode.

Why This Exists

DeepSeek's thinking-mode API enforces a strict contract: every assistant message that participates in a tool-call chain must include the full reasoning_content field. Some AI coding tools (including Cursor) drop this field from their chat transcript, causing DeepSeek to reject subsequent tool-call requests.

DeepSeek Bridge stores copies of reasoning_content from every response and patches missing entries back into requests before forwarding them upstream.
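The repair step can be sketched as follows. This is an illustration, not the proxy's actual code: it assumes a cache keyed by tool-call ID for simplicity, whereas the real proxy derives scoped keys (see "How It Works" below).

```python
# Illustrative sketch: restore missing reasoning_content on assistant
# tool-call messages from a local cache before forwarding upstream.
def repair_messages(messages: list, cache: dict) -> list:
    repaired = []
    for msg in messages:
        if (msg.get("role") == "assistant"
                and msg.get("tool_calls")
                and "reasoning_content" not in msg):
            # Simplified keying: look up by the first tool-call ID.
            key = msg["tool_calls"][0]["id"]
            if key in cache:
                msg = {**msg, "reasoning_content": cache[key]}
        repaired.append(msg)
    return repaired
```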

Installation

# From PyPI
pip install deepseek-bridge

# From source
git clone https://github.com/breixopd/deepseek-bridge.git
cd deepseek-bridge
uv run deepseek-bridge

Usage

# Full TUI dashboard (default)
deepseek-bridge

# Headless mode — no TUI, classic CLI output
deepseek-bridge --headless

# Local testing without ngrok
deepseek-bridge --no-ngrok --port 9000

# Verbose output with trace dumps
deepseek-bridge --verbose --trace-dir ./dumps

# Use a custom config file
deepseek-bridge --config ./my-config.yaml

# Clear reasoning cache and exit
deepseek-bridge --clear-reasoning-cache

# Disable thinking display in client UI
deepseek-bridge --no-display-reasoning

On first run, DeepSeek Bridge creates:

  • ~/.deepseek-bridge/config.yaml — configuration file
  • ~/.deepseek-bridge/reasoning_content.sqlite3 — reasoning cache

Configuration

All settings are configurable via ~/.deepseek-bridge/config.yaml or command-line overrides. Example configuration:

model: deepseek-v4-pro
base_url: https://api.deepseek.com
thinking: enabled
reasoning_effort: max
display_reasoning: true
collapsible_reasoning: true

host: 127.0.0.1
port: 9000
ngrok: true
verbose: false
cors: true
ollama: true
stream_read_timeout: 180
request_timeout: 300

Client Setup

Cursor

In Cursor, add a custom model with these settings:

  • Model: deepseek-v4-pro (or deepseek-v4-flash)
  • API Key: Your DeepSeek API key
  • Base URL: Your ngrok HTTPS URL with /v1 path (e.g., https://example.ngrok-free.dev/v1)

Note on ngrok: Cursor blocks non-public URLs such as localhost. Use ngrok or Cloudflare Tunnel to expose the proxy. If your client supports localhost endpoints, disable ngrok with --no-ngrok.

GitHub Copilot

Configure the Ollama endpoint in VS Code:

{
  "github.copilot.chat.byok.ollamaEndpoint": "http://localhost:9000"
}

Then open Copilot Chat and choose "Manage Models"; your DeepSeek models will appear automatically.

Agent Mode is supported — the proxy advertises tool_calls capability via the Ollama /api/show endpoint and handles reasoning repair across tool-call chains.

For the new customOAIModels path (VS Code Insiders 1.104+):

{
  "github.copilot.chat.customOAIModels": {
    "deepseek-v4-pro": {
      "name": "DeepSeek V4 Pro",
      "url": "http://localhost:9000/v1/chat/completions",
      "toolCalling": true,
      "vision": false,
      "thinking": true,
      "maxInputTokens": 1000000,
      "maxOutputTokens": 384000,
      "streaming": true,
      "requiresAPIKey": true
    }
  }
}

Other OpenAI-Compatible Clients

Any client that speaks the OpenAI /v1/chat/completions API can use DeepSeek Bridge. Set the client's base URL to http://localhost:9000/v1 (or your ngrok URL).
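For a quick smoke test from Python, a request can be built against the local endpoint with the standard library alone (port 9000 and the header values are the defaults assumed from this README; substitute your own key):

```python
import json
import urllib.request

# Build a Chat Completions request against a locally running bridge.
payload = {
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:9000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <your DeepSeek API key>",  # placeholder
    },
)
# With the proxy running, send it with:
# response = urllib.request.urlopen(req)
```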

How It Works

  1. Request interception: The proxy receives a /v1/chat/completions request from the client.
  2. Format detection: If the request uses OpenAI Responses API format (common in Cursor Agent Mode), it is converted to Chat Completions format.
  3. Reasoning repair: Each assistant message in the conversation is checked. Missing reasoning_content fields are looked up in the local SQLite cache and restored.
  4. Cache isolation: Cache keys are scoped by a SHA-256 hash of the conversation prefix, upstream model, configuration, and API key. Different conversations and users never collide.
  5. Response processing: Reasoning content from the upstream response is cached for future requests. Display adapters mirror reasoning thoughts into Markdown <details> blocks visible in the client.
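Step 4's scoped keying can be sketched like this (a simplification of whatever the proxy actually hashes, shown only to illustrate why different conversations and users cannot collide):

```python
import hashlib
import json

# Simplified sketch of scoped cache keying: hash the conversation prefix
# together with the model and API key, so distinct users and conversations
# map to distinct keys.
def cache_key(prefix_messages: list, model: str, api_key: str) -> str:
    material = json.dumps(
        {"prefix": prefix_messages, "model": model, "key": api_key},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode()).hexdigest()
```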

Known Limitations

Cursor Sub-Agents

Cursor sub-agents do not inherit custom API base URL or API key settings. This is a Cursor-side bug (see forum thread). Use the main agent (Cmd+Shift+0 to toggle) for direct DeepSeek chat. Sub-agents that route through the proxy will work correctly; those that bypass it fall back to Cursor's built-in models.

Cursor Agent Mode Responses API Format

Cursor Agent mode sends OpenAI Responses API-format payloads to the Chat Completions endpoint. DeepSeek Bridge detects and converts these automatically.
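A rough sketch of that conversion, assuming the common Responses API shape where `input` is either a string or a list of role/content items and `instructions` carries the system prompt (the real detection handles more variants):

```python
# Illustrative Responses-API -> Chat-Completions conversion sketch.
def responses_to_chat(body: dict) -> dict:
    messages = []
    if body.get("instructions"):
        messages.append({"role": "system", "content": body["instructions"]})
    inp = body.get("input", [])
    if isinstance(inp, str):
        # "input" may be a bare string prompt.
        messages.append({"role": "user", "content": inp})
    else:
        # ...or a list of role/content message items.
        for item in inp:
            messages.append({"role": item.get("role", "user"),
                             "content": item.get("content", "")})
    return {"model": body.get("model"), "messages": messages}
```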

Reasoning Display

Cursor's native reasoning UI is available only for Cursor's own models. For custom endpoints, reasoning content is mirrored into visible Markdown details blocks. Use --no-display-reasoning to disable this behavior.

Development

# Run tests
uv run python -m unittest discover -s tests

# Format and lint
uv run pre-commit run --all-files

# Type check
uv run mypy src/ --check-untyped-defs

# Run with coverage
uv run coverage run -m unittest discover -s tests
uv run coverage report

CLI Reference

Flag                               Default                                       Description
--headless                         off                                           Run without TUI
--model                            deepseek-v4-pro                               Fallback model when request omits it
--thinking                         enabled                                       DeepSeek thinking mode
--reasoning-effort                 max                                           Reasoning effort level
--display-reasoning                on                                            Show reasoning content in client UI
--collapsible-reasoning            on                                            Use collapsible Markdown for reasoning
--host                             127.0.0.1                                     Bind address
--port                             9000                                          Bind port
--ngrok                            on                                            Start ngrok tunnel
--base-url                         https://api.deepseek.com                      Upstream DeepSeek API URL
--cors                             on                                            Send CORS headers
--stream-read-timeout              180                                           SSE read timeout in seconds
--max-thread-pool                  20                                            Max concurrent request threads
--max-pool-connections             10                                            Max upstream connections
--ngrok-health-check-interval      30                                            Tunnel health check interval in seconds
--ollama / --no-ollama             on                                            Enable/disable Ollama endpoints
--log-dir                          none                                          Directory for persistent log files
--trace-dir                        none                                          Directory for request trace dumps
--verbose                          off                                           Detailed request logging
--compact                          off                                           One-line-per-request output
--config                           ~/.deepseek-bridge/config.yaml                Config file path
--no-log                           off                                           Disable all log file output
--reasoning-content-path           ~/.deepseek-bridge/reasoning_content.sqlite3  Reasoning cache path
--reasoning-cache-max-age-seconds  604800                                        Max age of cached reasoning (seconds)
--missing-reasoning-strategy       recover                                       Strategy for missing reasoning (recover/reject)
--max-request-body-bytes           20971520                                      Max request body size in bytes
--clear-reasoning-cache            off                                           Clear reasoning cache and exit
--version                          -                                             Print version and exit

License

MIT License

Acknowledgements

Based on yxlao/deepseek-cursor-proxy, the original project that started this work.
