Lightweight observability and diagnostics for local and self-hosted LLMs (Ollama, FastAPI, OpenWebUI).

Project description

llm-scope-observer

Lightweight observability and diagnostics for local and self-hosted LLMs.

llm-scope-observer is a small Python package that wraps your local LLM calls (Ollama, FastAPI backends, OpenWebUI integrations, custom Python code) and records:

- Latency per call
- Token usage (input, output, total, tokens/sec)
- CPU / RAM / (optional) GPU utilization
- Simple hallucination-risk heuristics
- Error information

All metrics are stored locally (SQLite by default) and visualized in a small FastAPI-based dashboard.

Features

Request interceptor: Decorator to wrap any Python function that calls an LLM.

from llm_scope import monitor
import ollama

@monitor(model="llama3")
def generate(prompt: str) -> str:
    result = ollama.generate(model="llama3", prompt=prompt)
    return result["response"]

Token estimation:
- Approximate input and output tokens
- Track total tokens and tokens/sec per call
System metrics snapshot (per request):
- CPU %
- RAM %
- GPU % (optional, via pynvml if installed)
Hallucination risk heuristic (simple, signal-based):
- Very long answer vs. short prompt
- Strong claims without references
- Repetition patterns
- Basic self-contradiction patterns
Local dashboard:
- FastAPI backend + simple HTML UI
- SQLite storage by default
- Shows latency, token trends, errors, and resource correlation per model
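
The token and hallucination-risk signals listed above can be illustrated with a minimal sketch. This is not the package's actual implementation: the function names, the 4-characters-per-token ratio, the word lists, and every threshold below are assumptions chosen for illustration, and the self-contradiction signal is left out for brevity.

```python
import re
from collections import Counter

# Hypothetical word lists; the package's real signals are not documented here.
STRONG_CLAIMS = ("definitely", "always", "never", "guaranteed", "proven")
REFERENCE_HINTS = ("http://", "https://", "according to", "source:")


def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming about 4 characters per token."""
    return max(1, len(text) // 4) if text else 0


def tokens_per_second(output_tokens: int, latency_s: float) -> float:
    """Throughput for one call; 0.0 when the latency is not positive."""
    return output_tokens / latency_s if latency_s > 0 else 0.0


def hallucination_risk(prompt: str, response: str) -> float:
    """Return a score in [0, 1]; each triggered signal adds 1/3."""
    signals = 0
    lowered = response.lower()

    # Signal 1: very long answer for a short prompt (arbitrary thresholds).
    if estimate_tokens(prompt) < 20 and estimate_tokens(response) > 300:
        signals += 1

    # Signal 2: strong claims with no reference-like text in the response.
    has_claim = any(re.search(rf"\b{word}\b", lowered) for word in STRONG_CLAIMS)
    has_reference = any(hint in lowered for hint in REFERENCE_HINTS)
    if has_claim and not has_reference:
        signals += 1

    # Signal 3: the same sentence appears three or more times.
    sentences = [s.strip() for s in re.split(r"[.!?]+", lowered) if s.strip()]
    if sentences and max(Counter(sentences).values()) >= 3:
        signals += 1

    return signals / 3
```

A score like this only flags responses worth a closer look; it cannot tell whether a claim is actually false.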

Installation

pip install llm-scope-observer

Optional GPU metrics:

pip install "llm-scope-observer[gpu]"

Requires Python 3.9+.

Quickstart

1. Instrument your LLM call

from llm_scope import monitor
import time

@monitor(model="test-model")
def generate(prompt: str) -> str:
    time.sleep(0.1)
    return "hello from llm-scope-observer"

Every time generate(...) runs, a record is written to a local SQLite database (llm_scope.db by default).
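
To confirm that records are being written, the database file can be inspected with Python's standard sqlite3 module. The sketch below is schema-agnostic on purpose, since the column layout is not described here; table_row_counts is a hypothetical helper and not part of the package.

```python
import sqlite3


def table_row_counts(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for every table in a SQLite file."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist.
    conn = sqlite3.connect(db_path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier; the names come from sqlite_master itself.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling table_row_counts("llm_scope.db") after a few generate(...) calls should show a growing row count, assuming the default database path is in use.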

2. Run the dashboard

After some traffic:

llm-scope ui --host 127.0.0.1 --port 8000
# or
python -m llm_scope.cli ui --host 127.0.0.1 --port 8000

Open:

http://127.0.0.1:8000/

and you’ll see:

- Average latency per model
- Slowest calls (tail latency)
- Token usage and tokens/sec
- Error counts
- CPU / RAM / GPU vs. latency
- Hallucination score per call
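
As a rough illustration of the first two dashboard figures, average and tail latency per model can be computed from plain (model, latency) pairs. This is a sketch under assumptions: the record shape, the latency_summary name, and the nearest-rank 95th percentile are illustrative choices, not the dashboard's documented method.

```python
import math
from collections import defaultdict
from statistics import mean


def latency_summary(records, top_n=3):
    """Summarize latency per model from (model, latency_ms) pairs."""
    by_model = defaultdict(list)
    for model, latency_ms in records:
        by_model[model].append(latency_ms)

    summary = {}
    for model, values in by_model.items():
        ordered = sorted(values)
        # Nearest-rank percentile: smallest value covering 95% of the calls.
        rank = max(1, math.ceil(0.95 * len(ordered)))
        summary[model] = {
            "avg_ms": mean(ordered),
            "p95_ms": ordered[rank - 1],
            # Slowest calls first, at most top_n of them.
            "slowest_ms": ordered[-top_n:][::-1],
        }
    return summary
```

Averages hide outliers, which is why a tail figure such as the 95th percentile is usually reported next to them.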

How it works (high level)

Middleware / decorator:
- @monitor(model="llama3") wraps any function.
- Captures start/end times, prompt, response, and errors.
- Sends a metrics record to the storage backend.
Metrics:
- Token estimation from prompt and response text.
- System stats from psutil (and optionally pynvml).
- Simple heuristics for hallucination risk.
Storage:
- SQLite via sqlite3 by default.
- One table: llm_calls with timestamps, model, metrics, error, tags.
Dashboard:
- FastAPI app.
- Reads from the same SQLite file.
- Renders an HTML summary page (no external JS required).
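
The decorator-plus-storage flow described above can be sketched with the standard library alone. This is an assumption-laden illustration, not the package's code: monitor_sketch, _store, the calls_sketch table, and its columns are all hypothetical names, and the real llm_calls schema may differ.

```python
import functools
import sqlite3
import time


def monitor_sketch(model: str, db_path: str = "sketch_calls.db"):
    """Hypothetical stand-in for a monitoring decorator."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(prompt, *args, **kwargs):
            start = time.perf_counter()
            response, error = None, None
            try:
                response = func(prompt, *args, **kwargs)
                return response
            except Exception as exc:
                # Record the error, then let it propagate to the caller.
                error = repr(exc)
                raise
            finally:
                latency_ms = (time.perf_counter() - start) * 1000.0
                _store(db_path, model, latency_ms, str(prompt), response, error)

        return wrapper

    return decorator


def _store(db_path, model, latency_ms, prompt, response, error):
    """Append one record; the table and columns are illustrative only."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "CREATE TABLE IF NOT EXISTS calls_sketch ("
                "ts REAL, model TEXT, latency_ms REAL, "
                "prompt TEXT, response TEXT, error TEXT)"
            )
            conn.execute(
                "INSERT INTO calls_sketch VALUES (?, ?, ?, ?, ?, ?)",
                (
                    time.time(),
                    model,
                    latency_ms,
                    prompt,
                    None if response is None else str(response),
                    error,
                ),
            )
    finally:
        conn.close()
```

Writing the record in a finally block means failed calls are logged too, and re-raising keeps the wrapped function's behavior unchanged for the caller.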

Roadmap

This is an early MVP. Planned next steps include:

- Prompt clustering and slow-prompt detection
- Model A vs. Model B comparison
- Basic alerting hooks and export to tools like Grafana
- Optional HTTP ingestion mode (sidecar / agent model)

License

MIT

Project details

Release history

- 0.1.0a4 (pre-release): Feb 20, 2026
- 0.1.0a3 (pre-release, this version): Feb 20, 2026
- 0.1.0a2 (pre-release): Feb 20, 2026
- 0.1.0a1 (pre-release): Feb 20, 2026

Download files

Source Distribution

llm_scope_observer-0.1.0a3.tar.gz (10.3 kB)

Uploaded Feb 20, 2026 Source

Built Distribution

llm_scope_observer-0.1.0a3-py3-none-any.whl (10.6 kB)

Uploaded Feb 20, 2026 Python 3

File details

Details for the file llm_scope_observer-0.1.0a3.tar.gz.

File metadata

Download URL: llm_scope_observer-0.1.0a3.tar.gz
Upload date: Feb 20, 2026
Size: 10.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for llm_scope_observer-0.1.0a3.tar.gz
Algorithm	Hash digest
SHA256	`1fc4cc0be6369d19b5c99d07be6e53b1be7c0f17c5854bdc294183e2ac25b7e7`
MD5	`7ba38b18fd5912f5238f7b935e9cc83f`
BLAKE2b-256	`4f3160c4b29cd6c0b342a4f58e331f7d2584c43246c43ed11e75b3015d6642e6`

File details

Details for the file llm_scope_observer-0.1.0a3-py3-none-any.whl.

File metadata

Download URL: llm_scope_observer-0.1.0a3-py3-none-any.whl
Upload date: Feb 20, 2026
Size: 10.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for llm_scope_observer-0.1.0a3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7afadaf64f794ce29e66c7319ad41539cab03f7311f9b70518a233e0a3b2ef53`
MD5	`4447e3943675813eeb68e046c06652d0`
BLAKE2b-256	`6d0a74643154bc8b0ed723092eba00b6965bfc414217b45c418dd211e50a86d9`
