Ollama DeProxy 
A lightweight, feature-rich proxy for Ollama, designed for development, testing, and staging environments. It simplifies access to remote Ollama instances that sit behind another proxy layer such as OpenWebUI. Anthropic- and OpenAI-compatible endpoints are included.
Why Use It?
If you're a developer working locally and need to access a remote Ollama instance that sits behind an application proxy such as OpenWebUI, you may encounter:
- Additional authorization requirements
- Wrapped or modified HTTP headers
- Response compression or transformation
- Reverse proxy constraints
Ollama DeProxy provides a clean and simple way to:
- Bypass extra authorization layers
- Forward requests transparently
- Control streaming and decoding behavior
- Restore direct API-like access to the upstream Ollama service
It acts as a thin, configurable HTTP bridge between your local tools and the remote Ollama instance.
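As a rough illustration of the bridging idea, the sketch below shows how a proxy could prepare headers for the upstream request: it drops headers that should not be forwarded and injects an authorization header. The function name `build_upstream_headers`, the header set, and the bearer-token scheme are assumptions made for this example and are not taken from the project's source.

```python
# Headers commonly treated as hop-by-hop; a proxy normally does not forward them.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def build_upstream_headers(incoming: dict[str, str], auth_token: str) -> dict[str, str]:
    """Return headers for the upstream request with the auth token injected.

    Hypothetical helper: the real proxy may filter and inject differently.
    """
    skipped = HOP_BY_HOP | {"host", "authorization"}
    headers = {
        name: value
        for name, value in incoming.items()
        if name.lower() not in skipped
    }
    # Assumption: the upstream accepts a bearer token (for example a JWT or API key).
    headers["Authorization"] = f"Bearer {auth_token}"
    return headers
```

Local tools then send plain requests, and only the proxy needs to know the token.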
Features
- Transparent Request Forwarding: Acts as a local HTTP server (default port 11434) that forwards all requests to a remote Ollama-compatible API
- Authentication Handling: Automatically injects custom authentication headers (JWT, API Keys) to bypass upstream proxy layers
- Response Processing: Supports streaming, decompression (Brotli/Gzip), and header filtering
- Model Name Correction: Replaces numeric model identifiers with actual model names
- Response Caching: Caches responses for specific endpoints with TTL-based eviction
- HTTP/2 Support: Full support for modern upstream connections
- Efficient Decoding: Use DECODE_RESPONSE to choose between automatic decompression (Brotli/Gzip) or raw binary passthrough
- Anthropic- and OpenAI-compatible endpoint detection
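The decoding choice described in the feature list can be pictured with a small sketch. It assumes a boolean flag that mirrors DECODE_RESPONSE and handles only Gzip, because Brotli needs a third-party package. The function name `process_body` and its return shape are hypothetical.

```python
import gzip


def process_body(body: bytes, content_encoding: str, decode_response: bool) -> tuple[bytes, str]:
    """Return (body, content_encoding) to pass on to the local client.

    With decode_response=True a gzip body is decompressed and the encoding
    is cleared; otherwise the bytes pass through untouched.
    """
    if decode_response and content_encoding.lower() == "gzip":
        return gzip.decompress(body), ""
    # Raw passthrough: the client receives the upstream bytes and encoding as-is.
    return body, content_encoding
```

Passthrough avoids the decompression cost when the client can decode the response itself.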
Quick Start
UVX
pip install uv
uvx ollama-deproxy -h
UV
pip install uv
uv venv
uv pip install ollama-deproxy
uv run ollama-deproxy -h
PIP
mkdir ollama-deproxy
cd ollama-deproxy
python -m venv venv
venv\Scripts\activate  # Windows; on macOS/Linux use: source venv/bin/activate
pip install ollama-deproxy
ollama-deproxy -h
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN] [--local-port LOCAL_PORT]
[--log-level LOG_LEVEL] [--hash-algorithm HASH_ALGORITHM] [--env_path ENV_PATH] [--version]
Run the Ollama DeProxy application.
options:
-h, --help show this help message and exit
--remote-url REMOTE_URL
Override REMOTE_URL environment variable
--remote-auth-token REMOTE_AUTH_TOKEN
Override REMOTE_AUTH_TOKEN environment variable
--local-port LOCAL_PORT
Override local_port environment variable
--log-level LOG_LEVEL
Override log level environment variable, default: INFO
--hash-algorithm HASH_ALGORITHM
Override HASH_ALGORITHM environment variable, default: auto
--env_path ENV_PATH Override path to .env file
--version, -v Version of the application
Start from repository
- Clone the repository:
git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy
- Configure environment variables:
cp .env.example .env
# Edit `.env` with your configuration
Using Docker Compose
Run the following command in your terminal to start the service:
docker compose up -d
This will launch the container with the specified configuration.
Verifying the Connection
You can monitor the initialization and incoming traffic by checking the service logs:
docker compose logs -f
ollama-deproxy-1 | INFO: Started server process [1]
ollama-deproxy-1 | INFO: Waiting for application startup.
ollama-deproxy-1 | INFO: Application startup complete.
ollama-deproxy-1 | INFO: Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
ollama-deproxy-1 | INFO: 172.21.0.1:60700 - "POST /api/generate HTTP/1.1" 200 OK
Zero-Auth Local Access
Once the container is active, your local applications can communicate with the remote Ollama instance via:
Local Address: http://localhost:11434
Security: The proxy handles all necessary authentication headers upstream, allowing your local tools to connect seamlessly without managing API keys or complex auth logic.
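A local client therefore needs nothing beyond the local address. The sketch below uses only the Python standard library and assumes the proxy is running on the default port; the helper names are made up for this example, and the response shape (a `models` list with `name` fields) follows the Ollama `/api/tags` endpoint.

```python
import json
import urllib.request

LOCAL_URL = "http://localhost:11434"  # local address of the proxy


def build_tags_request(base_url: str = LOCAL_URL) -> urllib.request.Request:
    """Build a GET request for the model list; no auth header is attached."""
    return urllib.request.Request(f"{base_url}/api/tags", method="GET")


def list_models(base_url: str = LOCAL_URL) -> list[str]:
    """Return the model names reported through the proxy (needs a running proxy)."""
    with urllib.request.urlopen(build_tags_request(base_url), timeout=10) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]
```

Calling `list_models()` while the container is up should return the model names available on the remote instance.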
Installation
- Clone the repository:
git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy
Option 1 - Using uv (recommended)
uv is a blazing-fast Python package installer and resolver, written in Rust.
- Install uv (if not already installed):
pip install uv
# or
curl -LsSf https://astral.sh/uv/install.sh | sh
- Set up and sync the environment:
uv venv
uv sync
- Configure environment variables:
cp .env.example .env
# Edit `.env` with your configuration
- Run the server:
uv run -m src.ollama_deproxy.main
Option 2 - Using pip (fallback)
If you prefer pip, or uv is unavailable:
Windows
python -m venv .venv && .venv\Scripts\activate
macOS / Linux
python -m venv .venv && source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
- Configure .env:
cp .env.example .env
# Edit `.env` as needed
- Run the server:
python -m src.ollama_deproxy.main
If installed as a wheel:
ollama-deproxy
Build as a Package
Build and install as a distributable package:
UV
uv build
# Outputs:
# Successfully built dist/ollama_deproxy-x.y.z.tar.gz
# Successfully built dist/ollama_deproxy-x.y.z-py3-none-any.whl
PIP
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
pip install -e .
Obtaining file:///C:/.../ollama-deproxy
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Installing backend dependencies ... done
Preparing editable metadata (pyproject.toml) ... done
Collecting cachetools>=7.0.2 (from ollama-deproxy==0.4.1)
Using cached cachetools-7.0.5-py3-none-any.whl.metadata (5.6 kB)
Collecting fastapi>=0.135.1 (from ollama-deproxy==0.4.1)
Using cached fastapi-0.135.1-py3-none-any.whl.metadata (30 kB)
Collecting httpx>=0.28.1 (from httpx[brotli,http2,zstd]>=0.28.1->ollama-deproxy==0.4.1)
Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
...
Building wheels for collected packages: ollama-deproxy
Building editable for ollama-deproxy (pyproject.toml) ... done
Created wheel for ollama-deproxy: filename=ollama_deproxy-0.4.1-py3-none-any.whl size=2640 sha256=a896df60372b3a000cd802335e23a405b0c21ce96c66c8994a139309ea8c0c56
Stored in directory: ...\Temp\pip-ephem-wheel-cache-4tfkacrk\wheels\4e\77\b5\f2d22f84a99bda20761e769c4abe4d2465331adcc1a67f21a4
Successfully built ollama-deproxy
Installing collected packages: brotli, zstandard, websockets, typing-extensions, types-cachetools, pyyaml, python-multipart, python-dotenv, idna, hyperframe, httptools, hpack, h11, colorama, certifi, cachetools, annotated-types, annotated-doc, typing-inspection, pydantic-core, httpcore, h2, click, anyio, watchfiles, uvicorn, starlette, pydantic, httpx, fastapi, ollama-deproxy
Successfully installed annotated-doc-0.0.4 annotated-types-0.7.0 anyio-4.12.1 brotli-1.2.0 cachetools-7.0.5 certifi-2026.2.25 click-8.3.1 colorama-0.4.6 fastapi-0.135.1 h11-0.16.0 h2-4.3.0 hpack-4.1.0 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 hyperframe-6.1.0 idna-3.11 ollama-deproxy-0.4.1 pydantic-2.12.5 pydantic-core-2.41.5 python-dotenv-1.2.2 python-multipart-0.0.22 pyyaml-6.0.3 starlette-0.52.1 types-cachetools-6.2.0.20251022 typing-extensions-4.15.0 typing-inspection-0.4.2 uvicorn-0.41.0 watchfiles-1.1.1 websockets-16.0 zstandard-0.25.0
Then run the CLI directly:
UV
uv run --no-dev ollama-deproxy
PIP
ollama-deproxy
Expected output:
ollama-deproxy --log-level DEBUG
============================================================
🚀 Ollama DeProxy Server vx.y.z
============================================================
2026-03-13 17:58:29 DEBUG: Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True
2026-03-13 17:58:30 INFO: Started server process [46908]
2026-03-13 17:58:30 INFO: Waiting for application startup.
2026-03-13 17:58:30 INFO: Cache key hash algorithm selected: blake2b
2026-03-13 17:58:30 INFO: Application startup complete.
2026-03-13 17:58:30 INFO: Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 17:58:41 DEBUG: *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 17:58:41 DEBUG: Cache set for key: ollama/api/tags:get...
2026-03-13 17:58:41 INFO: 127.0.0.1:37460 - "GET /api/tags HTTP/1.1" 200
Environment Configuration
Response Caching
The proxy includes a built-in cache that improves performance for frequently accessed endpoints. It is controlled by these environment variables:
- CACHE_ENABLED
- CACHE_MAXSIZE
- CACHE_TTL
- HASH_ALGORITHM
The proxy also detects automatically which hash algorithm is the best choice for generating cache keys on your platform and architecture.
uv run ollama-deproxy
Ollama DeProxy vx.y.z
INFO: Started server process [29256]
INFO: Waiting for application startup.
INFO:ollama_deproxy.best_hash:Cache key hash algorithm auto-selection...
INFO:ollama_deproxy.cache_base:Cache key hash algorithm auto-selection complete. Can store it on .env file 'HASH_ALGORITHM=blake2b' for skip autodetection next time.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
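One plausible way to auto-select a hash algorithm is to time a few candidates on a sample payload and keep the fastest. The sketch below illustrates that idea only; the candidate list, payload size, and the function name `pick_fastest_hash` are assumptions, and the project may select its algorithm differently.

```python
import hashlib
import time

# Hypothetical candidate list; all four are always available in hashlib.
CANDIDATES = ("blake2b", "blake2s", "sha256", "sha3_256")


def pick_fastest_hash(candidates=CANDIDATES, payload=b"x" * 4096, rounds=200) -> str:
    """Time each candidate on the sample payload and return the fastest name."""
    timings = {}
    for name in candidates:
        try:
            start = time.perf_counter()
            for _ in range(rounds):
                hashlib.new(name, payload).hexdigest()
            timings[name] = time.perf_counter() - start
        except ValueError:
            # Algorithm not supported on this platform: skip it.
            continue
    return min(timings, key=timings.get)
```

Storing the result as HASH_ALGORITHM in the .env file, as the log message suggests, skips this measurement on the next start.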
Cached Endpoints:
- /api/tags - Model list
- /api/models - Model information
- /api/show - Model details
Benefits:
- Reduces latency for repeated requests
- Decreases load on remote Ollama instances
- Improves response times for model metadata queries
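To make the roles of CACHE_MAXSIZE and CACHE_TTL concrete, here is a minimal standard-library sketch of a size-bounded cache with TTL-based eviction and hashed keys. It is not the project's implementation: the class name, the key layout, and the eviction policy are assumptions chosen for illustration.

```python
import hashlib
import time


class TTLCache:
    """Tiny TTL cache sketch: maxsize bounds the entries, ttl bounds their age."""

    def __init__(self, maxsize: int, ttl: float, clock=time.monotonic):
        self.maxsize, self.ttl, self.clock = maxsize, ttl, clock
        self._store: dict[str, tuple[float, bytes]] = {}

    @staticmethod
    def make_key(method: str, path: str, body: bytes = b"") -> str:
        # Hypothetical key layout: path, method, and a short digest of the body.
        digest = hashlib.blake2b(body, digest_size=16).hexdigest()
        return f"{path.strip('/')}:{method.lower()}:{digest}"

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key: str, value: bytes) -> None:
        if key not in self._store and len(self._store) >= self.maxsize:
            # Evict the oldest inserted entry (dicts preserve insertion order).
            self._store.pop(next(iter(self._store)))
        self._store[key] = (self.clock() + self.ttl, value)
```

A repeated `GET /api/tags` within the TTL is then answered locally without contacting the remote instance.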
Error Logging & Diagnostics
When the remote server returns an error (HTTP 400+), the proxy interrupts the stream to capture the full context. This allows you to see exactly why the upstream rejected your request.
Example Failure: If you query a model that doesn't exist on the remote host:
ERROR:ollama_deproxy.handlers:Remote Error [400] on https://openwebui.example.com/ollama/api/show {"name":"qwen2.5-coder:1.5b-base1"} {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}
Where:
Sent Body: {"name":"qwen2.5-coder:1.5b-base1"}
Recv Body: {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}
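The behaviour described above can be sketched as a generator that relays chunks for successful responses and, for status 400 or higher, drains the body first so that the request and response can be logged together. The function name `relay`, the logger name, and the chunk-iterator interface are assumptions made for this example.

```python
import logging
from typing import Iterable, Iterator

logger = logging.getLogger("deproxy_sketch")  # hypothetical logger name


def relay(status: int, url: str, sent_body: bytes, chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Stream chunks through, or buffer and log the whole body on an error."""
    if status >= 400:
        # Stop streaming: collect the complete error body for the log entry.
        recv_body = b"".join(chunks)
        logger.error(
            "Remote Error [%d] on %s %s %s",
            status,
            url,
            sent_body.decode(errors="replace"),
            recv_body.decode(errors="replace"),
        )
        yield recv_body
        return
    yield from chunks
```

The client still receives the upstream error body, while the log keeps the sent and received bodies side by side.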
Example Debug Log:
LOG_LEVEL=DEBUG
Ollama DeProxy vx.y.z
2026-03-13 15:34:08 DEBUG: Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True
2026-03-13 15:34:08 INFO: Started server process [43460]
2026-03-13 15:34:08 INFO: Waiting for application startup.
2026-03-13 15:34:08 INFO: Cache key hash algorithm selected: blake2b
2026-03-13 15:34:08 INFO: Application startup complete.
2026-03-13 15:34:08 INFO: Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 15:34:57 DEBUG: *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 15:34:57 DEBUG: Cache set for key: ollama/api/tags:get...
2026-03-13 15:34:57 INFO: 127.0.0.1:8327 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:35:37 DEBUG: Proxying request corrected to 'api/v1/messages' for Anthropic compatibility
2026-03-13 15:35:37 DEBUG: *** Handling request for path: /api/v1/messages
2026-03-13 15:36:21 INFO: 127.0.0.1:61399 - "POST /v1/messages?beta=true HTTP/1.1" 200
2026-03-13 15:36:23 DEBUG: *** Finished up stream for /api/v1/messages in 00:46.1
2026-03-13 15:36:37 DEBUG: Cache hit for key: ollama/api/tags:get...
2026-03-13 15:36:37 INFO: 127.0.0.1:61402 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:25 DEBUG: Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:25 INFO: 127.0.0.1:61408 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:26 DEBUG: Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:26 INFO: 127.0.0.1:61411 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:39:03 DEBUG: Proxying request corrected to 'ollama/v1/chat/completions' for OpenAI compatibility
2026-03-13 15:39:03 DEBUG: *** Handling request for path: /ollama/v1/chat/completions
2026-03-13 15:39:13 INFO: 127.0.0.1:61414 - "POST /chat/completions HTTP/1.1" 200
2026-03-13 15:39:13 DEBUG: *** Finished up stream for /ollama/v1/chat/completions in 00:09.3
2026-03-13 15:41:18 INFO: Shutting down
2026-03-13 15:41:18 INFO: Waiting for application shutdown.
2026-03-13 15:41:18 INFO: Application shutdown complete.
2026-03-13 15:41:18 INFO: Finished server process [43460]
Sleeping for 10 sec before restarting server. Press Ctrl+C to exit.
Restarting server...
2026-03-13 15:41:28 DEBUG: Using proactor: IocpProactor
2026-03-13 15:41:28 INFO: Started server process [43460]
2026-03-13 15:41:28 INFO: Waiting for application startup.
2026-03-13 15:41:28 INFO: Cache key hash algorithm selected: blake2b
2026-03-13 15:41:28 INFO: Application startup complete.
2026-03-13 15:41:28 INFO: Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
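The debug log above shows three path rewrites: `/api/tags` is handled as `/ollama/api/tags`, `/v1/messages` as `api/v1/messages`, and `/chat/completions` as `ollama/v1/chat/completions`. The sketch below reproduces only those three observed mappings. The function name and the fallback rule are assumptions inferred from the log, not the project's actual routing logic.

```python
def upstream_path(local_path: str) -> str:
    """Map a local request path to the upstream path seen in the debug log."""
    # Drop the query string and surrounding slashes before matching.
    path = local_path.split("?", 1)[0].strip("/")
    if path == "v1/messages":
        # Anthropic-style endpoint, as reported in the log.
        return "api/v1/messages"
    if path == "chat/completions":
        # OpenAI-style endpoint, as reported in the log.
        return "ollama/v1/chat/completions"
    # Assumed default: native Ollama paths live under the 'ollama/' prefix.
    return f"ollama/{path}"
```

Other paths may be treated differently by the real proxy, so this mapping should be read as a summary of the log rather than a specification.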
CLI Usage
In CLI mode, you start the server with the ollama-deproxy command. Command-line options can override some of the environment variables.
uv run ollama-deproxy --help
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN]
[--local-port LOCAL_PORT] [--log-level LOG_LEVEL] [--env_path ENV_PATH] [--version]
Run the Ollama DeProxy application.
options:
-h, --help show this help message and exit
--remote-url REMOTE_URL
Override REMOTE_URL environment variable
--remote-auth-token REMOTE_AUTH_TOKEN
Override REMOTE_AUTH_TOKEN environment variable
--local-port LOCAL_PORT
Override local_port environment variable
--log-level LOG_LEVEL
Override log level environment variable
--env_path ENV_PATH Override path to .env file
--version, -v Version of the application
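The override behaviour can be pictured with a short argparse sketch in which a command-line value wins over the environment and the environment wins over a default. Only four of the options listed above are modelled. The function name `resolve_settings` and the LOCAL_PORT variable name are assumptions; the defaults 11434 and INFO come from this document.

```python
import argparse
import os
from typing import Optional


def resolve_settings(argv: list[str], env: Optional[dict[str, str]] = None) -> dict[str, str]:
    """Merge CLI options over environment variables over defaults."""
    env = dict(os.environ) if env is None else env
    parser = argparse.ArgumentParser(prog="ollama-deproxy")
    parser.add_argument("--remote-url")
    parser.add_argument("--remote-auth-token")
    parser.add_argument("--local-port")
    parser.add_argument("--log-level")
    args = parser.parse_args(argv)
    return {
        "REMOTE_URL": args.remote_url or env.get("REMOTE_URL", ""),
        "REMOTE_AUTH_TOKEN": args.remote_auth_token or env.get("REMOTE_AUTH_TOKEN", ""),
        # LOCAL_PORT is an assumed variable name for the local port setting.
        "LOCAL_PORT": args.local_port or env.get("LOCAL_PORT", "11434"),
        "LOG_LEVEL": args.log_level or env.get("LOG_LEVEL", "INFO"),
    }
```

For example, `resolve_settings(["--local-port", "8080"])` keeps every other value from the environment and changes only the port.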
Reference
- https://github.com/ollama/ollama
- https://docs.openwebui.com/reference/api-endpoints#-ollama-api-proxy-support
- https://lexxai.blogspot.com/2026/02/ollama-deproxy-ollama.html
- https://lexxai.github.io
License
MIT License — see the LICENSE file for details.