
Ollama DeProxy

A lightweight, feature-rich proxy for Ollama, designed for development, testing, and staging environments. It simplifies access to remote Ollama instances that sit behind another proxy layer, such as OpenWebUI. Anthropic- and OpenAI-compatible endpoints are included.

Why Use It?

If you're a developer working locally and need to access a remote Ollama instance that sits behind an application proxy such as OpenWebUI, you may encounter:

  • Additional authorization requirements
  • Wrapped or modified HTTP headers
  • Response compression or transformation
  • Reverse proxy constraints

Ollama DeProxy provides a clean and simple way to:

  • Bypass extra authorization layers
  • Forward requests transparently
  • Control streaming and decoding behavior
  • Restore direct API-like access to the upstream Ollama service

It acts as a thin, configurable HTTP bridge between your local tools and the remote Ollama instance.
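
The bridging idea can be illustrated with a few lines of Python. This is only a sketch under stated assumptions, not the project's implementation: the function name `build_upstream_headers`, the `Bearer` scheme, and the set of dropped headers are all hypothetical.

```python
# Illustrative sketch only; names and header choices are assumptions,
# not taken from the Ollama DeProxy source.

# Headers that belong to the local connection and are not forwarded upstream.
NOT_FORWARDED = {
    "host", "connection", "keep-alive", "transfer-encoding", "upgrade",
    "authorization",  # any client-supplied value is replaced below
}


def build_upstream_headers(client_headers: dict, token: str) -> dict:
    """Copy client headers, drop local-only ones, and add upstream auth."""
    headers = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in NOT_FORWARDED
    }
    # Assumed scheme: the upstream proxy accepts a bearer token (JWT or API key).
    headers["Authorization"] = f"Bearer {token}"
    return headers
```

A local tool sends a plain request; the bridge adds the credential before the request leaves the machine, so the tool itself never needs to know the token.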

Features

  • Transparent Request Forwarding: Acts as a local HTTP server (default port 11434) that forwards all requests to a remote Ollama-compatible API
  • Authentication Handling: Automatically injects custom authentication headers (JWT, API Keys) to bypass upstream proxy layers
  • Response Processing: Supports streaming, decompression (Brotli/Gzip), and header filtering
  • Model Name Correction: Replaces numeric model identifiers with actual model names
  • Response Caching: Caches responses for specific endpoints with TTL-based eviction
  • HTTP/2 Support: Full support for modern upstream connections.
  • Efficient Decoding: Use DECODE_RESPONSE to choose between automatic decompression (Brotli/Gzip) or raw binary passthrough.
  • Endpoint Detection: Detects Anthropic- and OpenAI-compatible endpoints
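
As an illustration of the decoding choice described above, the following sketch switches between decompression and raw passthrough. It assumes `DECODE_RESPONSE` behaves like a boolean and that the upstream reports its compression in `Content-Encoding`; the function name `decode_body` is hypothetical and the project's actual logic may differ.

```python
import gzip


def decode_body(body: bytes, content_encoding: str, decode_response: bool) -> bytes:
    """Return the decompressed body, or the raw bytes when decoding is off."""
    if not decode_response:
        return body  # raw passthrough: the client sees the bytes unchanged
    encoding = content_encoding.strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "br":
        # Brotli is not in the standard library; this import assumes the
        # third-party `brotli` package is installed.
        import brotli
        return brotli.decompress(body)
    return body  # unknown or empty encoding: nothing to decode
```

With passthrough enabled, the proxy would also need to forward the original `Content-Encoding` header so the client can decode the body itself.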

Quick Start

UVX

pip install uv
uvx ollama-deproxy -h

UV

pip install uv
uv venv
uv pip install ollama-deproxy
uv run ollama-deproxy -h

PIP

mkdir ollama-deproxy
cd ollama-deproxy
python -m venv venv
venv\Scripts\activate  # Windows; on macOS / Linux use: source venv/bin/activate
pip install ollama-deproxy
ollama-deproxy -h
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN] [--local-port LOCAL_PORT]
                      [--log-level LOG_LEVEL] [--hash-algorithm HASH_ALGORITHM] [--env_path ENV_PATH] [--version]

Run the Ollama DeProxy application.

options:
  -h, --help            show this help message and exit
  --remote-url REMOTE_URL
                        Override REMOTE_URL environment variable
  --remote-auth-token REMOTE_AUTH_TOKEN
                        Override REMOTE_AUTH_TOKEN environment variable
  --local-port LOCAL_PORT
                        Override local_port environment variable
  --log-level LOG_LEVEL
                        Override log level environment variable, default: INFO
  --hash-algorithm HASH_ALGORITHM
                        Override HASH_ALGORITHM environment variable, default: auto
  --env_path ENV_PATH   Override path to .env file
  --version, -v         Version of the application

Start from repository

  1. Clone the repository:
git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy
  2. Configure environment variables:
cp .env.example .env
# Edit `.env` with your configuration

Using Docker Compose

Run the following command in your terminal to start the service:

docker compose up -d

This will launch the container with the specified configuration.

Verifying the Connection

You can monitor the initialization and incoming traffic by checking the service logs:

docker compose logs -f
ollama-deproxy-1  | INFO:     Started server process [1]
ollama-deproxy-1  | INFO:     Waiting for application startup.
ollama-deproxy-1  | INFO:     Application startup complete.
ollama-deproxy-1  | INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
ollama-deproxy-1  | INFO:     172.21.0.1:60700 - "POST /api/generate HTTP/1.1" 200 OK

Zero-Auth Local Access

Once the container is active, your local applications can communicate with the remote Ollama instance via:

Local Address: http://localhost:11434

Security: The proxy handles all necessary authentication headers upstream, allowing your local tools to connect seamlessly without managing API keys or complex auth logic.
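
A minimal client sketch, assuming the proxy is listening on the local address above and the upstream exposes Ollama's standard `/api/generate` endpoint. The helper names and the model name in the usage note are placeholders. Note that the request carries no authentication header.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # local address of the proxy


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST to /api/generate with no auth headers at all."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{BASE_URL}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request through the proxy and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.load(resp)["response"]
```

Calling `generate("your-model", "Hello")` requires a running proxy and a model that exists on the remote host.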

Installation

  1. Clone the repository:
git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy

Option 1 - Using uv (recommended)

uv is a blazing-fast Python package installer and resolver, written in Rust.

  1. Install uv (if not already installed):
pip install uv
# or
curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Set up and sync the environment:
uv venv
uv sync
  3. Configure environment variables:
cp .env.example .env
# Edit `.env` with your configuration
  4. Run the server:
uv run -m src.ollama_deproxy.main

Option 2 - Using pip (fallback)

If you prefer pip, or uv is unavailable:

Windows

python -m venv .venv && .venv\Scripts\activate

macOS / Linux

python -m venv .venv && source .venv/bin/activate
  1. Install dependencies:
pip install -r requirements.txt
  2. Configure .env:
cp .env.example .env
# Edit `.env` as needed
  3. Run the server:
python -m src.ollama_deproxy.main

If installed as a wheel:

ollama-deproxy

Build as a Package

Build and install as a distributable package:

UV

uv build
# Outputs:
# Successfully built dist/ollama_deproxy-x.y.z.tar.gz
# Successfully built dist/ollama_deproxy-x.y.z-py3-none-any.whl

PIP

python -m venv .venv
source .venv/bin/activate # or on Windows: .venv\Scripts\activate
pip install -e .
Obtaining file:///C:/.../ollama-deproxy
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting cachetools>=7.0.2 (from ollama-deproxy==0.4.1)
  Using cached cachetools-7.0.5-py3-none-any.whl.metadata (5.6 kB)
Collecting fastapi>=0.135.1 (from ollama-deproxy==0.4.1)
  Using cached fastapi-0.135.1-py3-none-any.whl.metadata (30 kB)
Collecting httpx>=0.28.1 (from httpx[brotli,http2,zstd]>=0.28.1->ollama-deproxy==0.4.1)
  Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
...
Building wheels for collected packages: ollama-deproxy
  Building editable for ollama-deproxy (pyproject.toml) ... done
  Created wheel for ollama-deproxy: filename=ollama_deproxy-0.4.1-py3-none-any.whl size=2640 sha256=a896df60372b3a000cd802335e23a405b0c21ce96c66c8994a139309ea8c0c56
  Stored in directory: ...\Temp\pip-ephem-wheel-cache-4tfkacrk\wheels\4e\77\b5\f2d22f84a99bda20761e769c4abe4d2465331adcc1a67f21a4
Successfully built ollama-deproxy
Installing collected packages: brotli, zstandard, websockets, typing-extensions, types-cachetools, pyyaml, python-multipart, python-dotenv, idna, hyperframe, httptools, hpack, h11, colorama, certifi, cachetools, annotated-types, annotated-doc, typing-inspection, pydantic-core, httpcore, h2, click, anyio, watchfiles, uvicorn, starlette, pydantic, httpx, fastapi, ollama-deproxy
Successfully installed annotated-doc-0.0.4 annotated-types-0.7.0 anyio-4.12.1 brotli-1.2.0 cachetools-7.0.5 certifi-2026.2.25 click-8.3.1 colorama-0.4.6 fastapi-0.135.1 h11-0.16.0 h2-4.3.0 hpack-4.1.0 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 hyperframe-6.1.0 idna-3.11 ollama-deproxy-0.4.1 pydantic-2.12.5 pydantic-core-2.41.5 python-dotenv-1.2.2 python-multipart-0.0.22 pyyaml-6.0.3 starlette-0.52.1 types-cachetools-6.2.0.20251022 typing-extensions-4.15.0 typing-inspection-0.4.2 uvicorn-0.41.0 watchfiles-1.1.1 websockets-16.0 zstandard-0.25.0

Then run the CLI directly:

UV

uv run --no-dev ollama-deproxy

PIP

ollama-deproxy

Expected output:

ollama-deproxy --log-level DEBUG

============================================================
🚀 Ollama DeProxy Server vx.y.z
============================================================

2026-03-13 17:58:29 DEBUG:    Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True 
2026-03-13 17:58:30 INFO:     Started server process [46908]
2026-03-13 17:58:30 INFO:     Waiting for application startup.
2026-03-13 17:58:30 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 17:58:30 INFO:     Application startup complete.
2026-03-13 17:58:30 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 17:58:41 DEBUG:    *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 17:58:41 DEBUG:    Cache set for key: ollama/api/tags:get...
2026-03-13 17:58:41 INFO:     127.0.0.1:37460 - "GET /api/tags HTTP/1.1" 200

Environment Configuration

Response Caching

The proxy includes a built-in cache that improves performance for frequently accessed endpoints. It is controlled by the following environment variables:

  • CACHE_ENABLED
  • CACHE_MAXSIZE
  • CACHE_TTL
  • HASH_ALGORITHM: Selects the hash used to generate cache keys. With the default (auto), the proxy detects the fastest algorithm for your platform and architecture at startup:
    uv run ollama-deproxy
    Ollama DeProxy vx.y.z
    INFO:     Started server process [29256]
    INFO:     Waiting for application startup.
    INFO:ollama_deproxy.best_hash:Cache key hash algorithm auto-selection...
    INFO:ollama_deproxy.cache_base:Cache key hash algorithm auto-selection complete. Can store it on .env file 'HASH_ALGORITHM=blake2b' for skip autodetection next time.
    INFO:     Application startup complete.
    INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
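
One way such auto-selection could work is a quick startup benchmark. The sketch below is an assumption about the approach, not the project's code: the candidate list, the sample payload, and the function name `pick_fastest_hash` are all hypothetical.

```python
import hashlib
import timeit

# Assumed candidate list; all four are always available in hashlib.
CANDIDATES = ("blake2b", "blake2s", "sha256", "sha512")


def pick_fastest_hash(sample: bytes = b"ollama/api/tags:get" * 64,
                      rounds: int = 2000) -> str:
    """Time each candidate on a sample payload and return the fastest name."""
    timings = {}
    for name in CANDIDATES:
        timings[name] = timeit.timeit(
            lambda name=name: hashlib.new(name, sample).digest(),
            number=rounds,
        )
    return min(timings, key=timings.get)
```

The winner depends on the CPU and the Python build, which is why storing the result (for example `HASH_ALGORITHM=blake2b`) avoids repeating the measurement on each start.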
    

Cached Endpoints:

  • /api/tags - Model list
  • /api/models - Model information
  • /api/show - Model details

Benefits:

  • Reduces latency for repeated requests
  • Decreases load on remote Ollama instances
  • Improves response times for model metadata queries
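
The following standard-library sketch shows how TTL-based eviction with a size limit can work. The install log lists cachetools as a dependency, so the project likely relies on it; this stand-in only illustrates the idea. The key format is inferred from the log line `Cache set for key: ollama/api/tags:get...`, and the class, function, and parameter names are hypothetical.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Callable, Optional


def cache_key(path: str, method: str, body: bytes = b"") -> str:
    """Build a key such as 'ollama/api/tags:get:<digest>' (format assumed)."""
    digest = hashlib.blake2b(body, digest_size=16).hexdigest()
    return f"{path.lstrip('/')}:{method.lower()}:{digest}"


class TTLCache:
    """Tiny TTL cache with a size limit; oldest entries are evicted first."""

    def __init__(self, maxsize: int, ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self.maxsize, self.ttl, self.clock = maxsize, ttl, clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # entry outlived its TTL
            return None
        return value

    def set(self, key: str, value: bytes) -> None:
        self._items.pop(key, None)
        self._items[key] = (self.clock() + self.ttl, value)
        while len(self._items) > self.maxsize:
            self._items.popitem(last=False)  # drop the oldest entry
```

Here `maxsize` and `ttl` play the roles of `CACHE_MAXSIZE` and `CACHE_TTL`; the injectable `clock` exists only to make expiry easy to test.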

Error Logging & Diagnostics

When the remote server returns an error (HTTP 400+), the proxy interrupts the stream to capture the full context. This allows you to see exactly why the upstream rejected your request.

Example Failure: If you query a model that doesn't exist on the remote host:

ERROR:ollama_deproxy.handlers:Remote Error [400] on https://openwebui.example.com/ollama/api/show {"name":"qwen2.5-coder:1.5b-base1"} {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}

Where:

Sent Body: {"name":"qwen2.5-coder:1.5b-base1"}
Recv Body: {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}
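
The shape of that log line can be reproduced with a small helper. This is a sketch of the format only, assuming the sent and received bodies are available as bytes once the upstream response has been read; the logger name and function names are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("deproxy_sketch")  # hypothetical logger name


def format_remote_error(status: int, url: str,
                        sent: bytes, received: bytes) -> Optional[str]:
    """Return a one-line summary for upstream errors, or None on success."""
    if status < 400:
        return None
    return (
        f"Remote Error [{status}] on {url} "
        f"{sent.decode('utf-8', 'replace')} {received.decode('utf-8', 'replace')}"
    )


def log_if_error(status: int, url: str, sent: bytes, received: bytes) -> bool:
    """Log the error context; return True when an error was logged."""
    message = format_remote_error(status, url, sent, received)
    if message is None:
        return False
    logger.error(message)
    return True
```

Logging the request body next to the response body is what makes a typo in a model name visible at a glance.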

Example Debug Log:

LOG_LEVEL=DEBUG
Ollama DeProxy vx.y.z
2026-03-13 15:34:08 DEBUG:    Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True
2026-03-13 15:34:08 INFO:     Started server process [43460]
2026-03-13 15:34:08 INFO:     Waiting for application startup.
2026-03-13 15:34:08 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 15:34:08 INFO:     Application startup complete.
2026-03-13 15:34:08 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 15:34:57 DEBUG:    *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 15:34:57 DEBUG:    Cache set for key: ollama/api/tags:get...
2026-03-13 15:34:57 INFO:     127.0.0.1:8327 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:35:37 DEBUG:    Proxying request corrected to 'api/v1/messages' for Anthropic compatibility
2026-03-13 15:35:37 DEBUG:    *** Handling request for path: /api/v1/messages
2026-03-13 15:36:21 INFO:     127.0.0.1:61399 - "POST /v1/messages?beta=true HTTP/1.1" 200
2026-03-13 15:36:23 DEBUG:    *** Finished up stream for /api/v1/messages in 00:46.1
2026-03-13 15:36:37 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:36:37 INFO:     127.0.0.1:61402 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:25 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:25 INFO:     127.0.0.1:61408 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:26 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:26 INFO:     127.0.0.1:61411 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:39:03 DEBUG:    Proxying request corrected to 'ollama/v1/chat/completions' for OpenAI compatibility
2026-03-13 15:39:03 DEBUG:    *** Handling request for path: /ollama/v1/chat/completions
2026-03-13 15:39:13 INFO:     127.0.0.1:61414 - "POST /chat/completions HTTP/1.1" 200
2026-03-13 15:39:13 DEBUG:    *** Finished up stream for /ollama/v1/chat/completions in 00:09.3
2026-03-13 15:41:18 INFO:     Shutting down
2026-03-13 15:41:18 INFO:     Waiting for application shutdown.
2026-03-13 15:41:18 INFO:     Application shutdown complete.
2026-03-13 15:41:18 INFO:     Finished server process [43460]


Sleeping for 10 sec before restarting server. Press Ctrl+C to exit.
Restarting server...
2026-03-13 15:41:28 DEBUG:    Using proactor: IocpProactor
2026-03-13 15:41:28 INFO:     Started server process [43460]
2026-03-13 15:41:28 INFO:     Waiting for application startup.
2026-03-13 15:41:28 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 15:41:28 INFO:     Application startup complete.
2026-03-13 15:41:28 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
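
The path corrections visible in the debug log can be summarized as a small mapping. The rules below are inferred only from the three request/upstream pairs in that log (`/api/tags`, `/v1/messages`, `/chat/completions`); the real detection logic may cover more cases, and `correct_path` is a hypothetical name.

```python
UPSTREAM_PREFIX = "ollama"  # inferred from '/ollama/api/tags' in the log


def correct_path(path: str) -> str:
    """Map a local request path to the upstream path seen in the log."""
    clean = path.split("?", 1)[0].lstrip("/")  # ignore the query string
    if clean == "v1/messages":
        # Anthropic-style request, e.g. POST /v1/messages?beta=true
        return "api/v1/messages"
    if clean == "chat/completions":
        # OpenAI-style request
        return f"{UPSTREAM_PREFIX}/v1/chat/completions"
    # Native Ollama API calls are placed under the upstream prefix.
    return f"{UPSTREAM_PREFIX}/{clean}"
```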

CLI Usage

In CLI mode, use the ollama-deproxy command to start the server. Command-line options override the corresponding environment variables.

uv run ollama-deproxy --help           
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN]
                      [--local-port LOCAL_PORT] [--log-level LOG_LEVEL] [--env_path ENV_PATH] [--version]

Run the Ollama DeProxy application.

options:
  -h, --help            show this help message and exit
  --remote-url REMOTE_URL
                        Override REMOTE_URL environment variable
  --remote-auth-token REMOTE_AUTH_TOKEN
                        Override REMOTE_AUTH_TOKEN environment variable
  --local-port LOCAL_PORT
                        Override local_port environment variable
  --log-level LOG_LEVEL
                        Override log level environment variable
  --env_path ENV_PATH   Override path to .env file
  --version, -v         Version of the application
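
The override behavior can be sketched with argparse: settings start from environment values and any flag given on the command line wins. This is an illustration rather than the project's parser. The environment variable name `LOCAL_PORT` is an assumption (the help text only says "local_port"), and `resolve_settings` is a hypothetical helper.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ollama-deproxy")
    parser.add_argument("--remote-url")
    parser.add_argument("--remote-auth-token")
    parser.add_argument("--local-port", type=int)
    parser.add_argument("--log-level")
    return parser


def resolve_settings(argv: Sequence[str],
                     env: Optional[Mapping[str, str]] = None) -> dict:
    """Start from environment values, then let CLI flags override them."""
    env = os.environ if env is None else env
    args = build_parser().parse_args(argv)
    settings = {
        "remote_url": env.get("REMOTE_URL"),
        "remote_auth_token": env.get("REMOTE_AUTH_TOKEN"),
        "local_port": int(env.get("LOCAL_PORT", "11434")),  # name assumed
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }
    # Flags that were not given stay None and leave the env value in place.
    for name, value in vars(args).items():
        if value is not None:
            settings[name] = value
    return settings
```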

License

MIT License — see the LICENSE file for details.
