
LLM Observe Proxy

llm-observe-proxy is an OpenAI-compatible, record-only proxy for inspecting LLM traffic. It forwards requests to an upstream /v1 API, stores requests and responses in SQLite, and provides a polished local admin UI for browsing and pretty-printing captured traffic, trimming retention, grouping requests into task runs, and changing runtime settings.

It is useful when you want LiteLLM-style observability without introducing a full gateway or external database.

Project repository: https://github.com/shamitv/llm-observe-proxy

Features

  • OpenAI-compatible passthrough route: ANY /v1/{path:path}.
  • SQLite capture for request/response headers, bodies, status, timing, model, endpoint, streaming state, tool-call signals, image assets, and errors.
  • Admin UI for searching and browsing captured traffic, including per-request output TPS.
  • Runs for grouping all requests made during a task, benchmark, or repro workflow.
  • Run detail pages with request counts, LLM wall time, token totals, tokens/sec, model and endpoint breakdowns, and signal/error counts.
  • Detail pages with response render modes for JSON, plain text, Markdown, tool calls, and raw SSE streams.
  • Request image gallery for data URL and remote image references.
  • Settings UI for upstream URL, model upstream routes, incoming host/port preferences, all-IPs exposure, and retention trimming.
  • Config-driven model routes for sending selected proxy-facing model names to different upstream /v1 endpoints with optional upstream model rewrites and API key injection.
  • No authentication by default; intended for local or trusted development networks.

Install

From PyPI with pip:

python -m pip install llm-observe-proxy
llm-observe-proxy

From PyPI with uv:

uv tool install llm-observe-proxy
llm-observe-proxy

Run it once without installing:

uvx llm-observe-proxy

By default, the proxy listens on:

http://localhost:8080

and forwards requests to:

http://localhost:8000/v1

Open the admin UI:

http://localhost:8080/admin

Usage

Point an OpenAI-compatible client at the proxy:

from openai import OpenAI

client = OpenAI(
    api_key="local-dev-key",
    base_url="http://localhost:8080/v1",
)

response = client.chat.completions.create(
    model="gpt-demo",
    messages=[{"role": "user", "content": "Hello through the proxy"}],
)
print(response.choices[0].message.content)
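
Streaming works the same way: the proxy passes the SSE stream through and records it (see the raw SSE render mode on the detail pages). Continuing the example above:

stream = client.chat.completions.create(
    model="gpt-demo",
    messages=[{"role": "user", "content": "Stream through the proxy"}],
    stream=True,
)
for chunk in stream:
    # Guard against chunks without choices (e.g. a final usage chunk).
    delta = chunk.choices[0].delta.content if chunk.choices else None
    if delta:
        print(delta, end="")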

Run on a different port:

llm-observe-proxy --port 8090

Expose on all interfaces:

llm-observe-proxy --expose-all-ips

Set the upstream from the CLI:

llm-observe-proxy --upstream-url http://localhost:8000/v1

Load model-specific upstream routes from a JSON file:

llm-observe-proxy --models-file .\models.json

You can also change the upstream URL, model upstream routes, and next-start incoming host/port settings from /admin/settings.

Model Routes

Model routes let one proxy endpoint send different client-facing models to different OpenAI-compatible upstreams. Routes match the request payload's top-level model exactly. Unknown models, requests without a JSON model, and generic calls such as GET /v1/models use the global upstream fallback.
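
Conceptually, the lookup works like this (an illustrative sketch, not the proxy's actual code):

def resolve_route(payload, routes, fallback_url):
    # Exact match on the JSON payload's top-level "model" field.
    model = payload.get("model") if isinstance(payload, dict) else None
    for route in routes:
        if route["model"] == model:
            # A matching route may rewrite the model name upstream.
            return route["upstream_url"], route.get("upstream_model", model)
    # Unknown models, bodies without "model", and calls like
    # GET /v1/models all use the global upstream fallback.
    return fallback_url, model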

Example route file:

[
  {
    "model": "local-qwen",
    "upstream_url": "http://localhost:8000/v1",
    "upstream_model": "qwen3-coder-30b"
  },
  {
    "model": "openai-mini",
    "upstream_url": "https://api.openai.com/v1",
    "upstream_model": "gpt-4.1-mini",
    "api_key_env": "OPENAI_API_KEY"
  }
]

Run with the file:

$env:OPENAI_API_KEY = "sk-..."
llm-observe-proxy --models-file .\models.json

You can also set LLM_OBSERVE_MODELS_JSON to the same JSON array. If both LLM_OBSERVE_MODELS_FILE and LLM_OBSERVE_MODELS_JSON are set, the file wins.
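
For example, a single inline route instead of a file:

$env:LLM_OBSERVE_MODELS_JSON = '[{"model": "local-qwen", "upstream_url": "http://localhost:8000/v1"}]'
llm-observe-proxy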

You can add, update, and delete UI-managed model routes from /admin/settings. UI-managed routes are stored in SQLite and take effect immediately. Routes loaded from --models-file, LLM_OBSERVE_MODELS_FILE, or LLM_OBSERVE_MODELS_JSON remain read-only in the UI, and duplicate model names are rejected.

When a route has an API key, the proxy injects Authorization: Bearer <key> for the upstream request. Captured request headers remain the original client headers; injected keys are not stored or shown in the admin UI. UI-managed routes store only api_key_env; prefer api_key_env for shared configs.
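
With the openai-mini route above and OPENAI_API_KEY exported on the proxy side, the client does not need a real key of its own:

from openai import OpenAI

# The placeholder key below is captured with the request, but the proxy
# injects Authorization from OPENAI_API_KEY when forwarding upstream.
client = OpenAI(api_key="local-dev-key", base_url="http://localhost:8080/v1")
response = client.chat.completions.create(
    model="openai-mini",  # rewritten upstream to gpt-4.1-mini
    messages=[{"role": "user", "content": "Hello via a routed model"}],
)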

Runs

Use Runs when you want to measure or review LLM usage for one bounded task, such as processing a video, comparing local and cloud models, or reproducing an agent issue.

  1. Open /admin/runs or use the run control on /admin.
  2. Enter a required run name and choose Start run.
  3. Run your application or benchmark through the proxy.
  4. Choose End run when the task is complete.

Starting a new run automatically ends any existing active run. Requests made while a run is active are linked to that run; requests outside a run are still captured normally.
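
Runs can also be driven from scripts via the run endpoints listed under Routes below. A minimal sketch using requests; the form field name for the run name is an assumption, so check /admin/runs if it differs:

import requests

BASE = "http://localhost:8080"

# Start a named run; any active run is ended first. The field name
# "name" is assumed here, not confirmed by the docs.
requests.post(f"{BASE}/admin/runs/start", data={"name": "agent-repro-01"})

# ... drive the workload through the proxy ...

requests.post(f"{BASE}/admin/runs/end")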

The request browser can filter by run, and request rows link back to their run. The run detail page reports LLM wall time from the first request start to the last response completion, plus token totals and tokens/sec metrics. The request table's TPS column shows per-request output tokens per second when token usage and duration are available.

Screenshots and the full developer README are available in the project repository: https://github.com/shamitv/llm-observe-proxy

Routes

  • ANY /v1/{path:path}: OpenAI-compatible pass-through proxy.
  • GET /admin: request browser.
  • GET /admin/requests/{id}: request/response detail view.
  • GET /admin/runs: run browser and active run controls.
  • GET /admin/runs/{id}: run metrics and associated request list.
  • POST /admin/runs/start: start a named run, ending any active run first.
  • POST /admin/runs/end: end the active run.
  • GET /admin/settings: upstream settings and retention tools.
  • POST /admin/settings/incoming: update incoming host/port settings for next startup.
  • POST /admin/settings/upstream: update upstream URL.
  • POST /admin/settings/model-routes: create or update a UI-managed model route.
  • POST /admin/settings/model-routes/delete: delete a UI-managed model route.
  • POST /admin/trim: delete records older than N days.
  • GET /healthz: health check.
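
The non-UI endpoints can be scripted for housekeeping. A small sketch; the trim form field name ("days") is an assumption, so verify it on /admin/settings first:

import requests

BASE = "http://localhost:8080"

print(requests.get(f"{BASE}/healthz").status_code)  # expect 200
# Delete captured records older than 30 days (field name assumed).
requests.post(f"{BASE}/admin/trim", data={"days": 30})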

Configuration

Environment variables:

Variable | Default | Purpose
LLM_OBSERVE_DATABASE_URL | sqlite:///./llm_observe_proxy.sqlite3 | SQLite SQLAlchemy URL.
LLM_OBSERVE_INCOMING_HOST | localhost | Bind host when not exposing all IPs.
LLM_OBSERVE_INCOMING_PORT | 8080 | Bind port.
LLM_OBSERVE_EXPOSE_ALL_IPS | false | Bind to 0.0.0.0 when true.
LLM_OBSERVE_UPSTREAM_URL | http://localhost:8000/v1 | Upstream OpenAI-compatible /v1 base URL.
LLM_OBSERVE_MODELS_JSON | unset | JSON array of model route objects.
LLM_OBSERVE_MODELS_FILE | unset | Path to a JSON file containing model routes. Wins over LLM_OBSERVE_MODELS_JSON.
LLM_OBSERVE_LOG_LEVEL | INFO | Uvicorn log level.

Incoming host/port settings saved in the UI are used on the next process startup; they do not rebind a currently running process.
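
For example, to launch with a non-default upstream and port via the environment (PowerShell, as in the earlier examples; the URL is a placeholder):

$env:LLM_OBSERVE_UPSTREAM_URL = "http://localhost:9001/v1"
$env:LLM_OBSERVE_INCOMING_PORT = "8090"
llm-observe-proxy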

Tests

.\.venv\Scripts\ruff.exe check src tests
.\.venv\Scripts\python.exe -m compileall -q src tests
.\.venv\Scripts\pytest.exe -q

The test suite starts a fake upstream on localhost:8080/v1, so stop any local process using port 8080 before running tests.

Publishing

See the repository publishing guide for name checks, build commands, and the pre-publish checklist.

License

MIT.
