Project description
llm-scope-observer
Lightweight observability and diagnostics for local and self-hosted LLMs.
llm-scope-observer is a small Python package that wraps your local LLM calls (Ollama, FastAPI backends, OpenWebUI integrations, custom Python code) and records:
- Latency per call
- Token usage (input, output, total, tokens/sec)
- CPU / RAM / (optional) GPU utilization
- Simple hallucination-risk heuristics
- Error information
All metrics are stored locally (SQLite by default) and visualized in a small FastAPI-based dashboard.
Features
- Request interceptor: Decorator to wrap any Python function that calls an LLM.

      from llm_scope import monitor
      import ollama

      @monitor(model="llama3")
      def generate(prompt: str) -> str:
          result = ollama.generate(model="llama3", prompt=prompt)
          return result["response"]

- Token estimation:
  - Approximate input and output tokens
  - Track total tokens and tokens/sec per call
- System metrics snapshot (per request):
  - CPU %
  - RAM %
  - GPU % (optional, via pynvml if installed)
- Hallucination risk heuristic (simple, signal-based):
  - Very long answer vs. short prompt
  - Strong claims without references
  - Repetition patterns
  - Basic self-contradiction patterns
- Local dashboard:
  - FastAPI backend + simple HTML UI
  - SQLite storage by default
  - Shows latency, token trends, errors, and resource correlation per model
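To make the hallucination-risk signals above more concrete, here is a minimal sketch of how three of them (answer length vs. prompt length, strong claims without references, repetition) could be combined into one score. It is not the package's actual implementation: the function names, the word list, the thresholds, and the equal weighting are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical word list; the package's real signals are not documented here.
STRONG_CLAIM_WORDS = ("definitely", "always", "never", "guaranteed", "proven")


def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that repeat an earlier n-gram."""
    words = re.findall(r"\w+", text.lower())
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def risk_score(prompt: str, response: str) -> float:
    """Average three simple signals into a score in [0, 1]."""
    prompt_words = max(len(prompt.split()), 1)
    response_words = len(response.split())

    # Signal 1: very long answer relative to a short prompt
    # (the factor 50 is an arbitrary illustrative threshold).
    length_signal = min(response_words / (prompt_words * 50), 1.0)

    # Signal 2: strong claims with no visible reference (URL or [n] citation).
    lowered = response.lower()
    has_claim = any(word in lowered for word in STRONG_CLAIM_WORDS)
    has_reference = bool(re.search(r"https?://|\[\d+\]", response))
    claim_signal = 1.0 if has_claim and not has_reference else 0.0

    # Signal 3: repeated phrases.
    repeat_signal = repetition_ratio(response)

    return round((length_signal + claim_signal + repeat_signal) / 3, 3)
```

A score like this is only a rough triage signal: it flags responses worth a second look and says nothing about whether a given answer is actually wrong.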
Installation
pip install llm-scope-observer
Optional GPU metrics:
pip install "llm-scope-observer[gpu]"
Requires Python 3.9+.
Quickstart
1. Instrument your LLM call
from llm_scope import monitor
import time
@monitor(model="test-model")
def generate(prompt: str) -> str:
    time.sleep(0.1)
    return "hello from llm-scope-observer"
Every time generate(...) runs, a record is written to a local SQLite database (llm_scope.db by default).
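If you want to check what was recorded without starting the dashboard, the database can be inspected with Python's built-in sqlite3 module. The sketch below deliberately avoids assuming any column names, since the exact schema is not listed here; it only discovers the tables and columns that exist. The helper name `describe_tables` is hypothetical and not part of the package.

```python
import sqlite3
from contextlib import closing


def describe_tables(db_path: str) -> dict[str, list[str]]:
    """Return {table_name: [column names]} for every table in the database."""
    with closing(sqlite3.connect(db_path)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            # Index 1 of each PRAGMA table_info row is the column name.
            table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
            for table in tables
        }
```

Calling `describe_tables("llm_scope.db")` from the directory where your instrumented code ran should list the recorded table and its columns. Note that sqlite3 creates an empty file if the path does not exist, so an empty result usually means the path is wrong.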
2. Run the dashboard
After some traffic:
llm-scope ui --host 127.0.0.1 --port 8000
# or
python -m llm_scope.cli ui --host 127.0.0.1 --port 8000
Open the dashboard in your browser at the host and port you passed above, and you’ll see:
- Average latency per model
- Slowest calls (tail latency)
- Token usage and tokens/sec
- Error counts
- CPU / RAM / GPU vs. latency
- Hallucination score per call
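The per-model numbers in this list are straightforward aggregates over the recorded calls. As a rough illustration of what such a summary involves, the sketch below computes average latency, worst-case latency, tokens/sec, and error counts from a list of records. The record field names (`latency_s`, `output_tokens`, `error`) and the sample values are assumptions made for this example, not the package's schema or real measurements.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
calls = [
    {"model": "llama3", "latency_s": 0.8, "output_tokens": 120, "error": None},
    {"model": "llama3", "latency_s": 2.4, "output_tokens": 300, "error": None},
    {"model": "mistral", "latency_s": 0.5, "output_tokens": 50, "error": "timeout"},
]


def summarize(records):
    """Group call records by model and compute simple per-model aggregates."""
    by_model = defaultdict(list)
    for rec in records:
        by_model[rec["model"]].append(rec)

    summary = {}
    for model, recs in by_model.items():
        latencies = [r["latency_s"] for r in recs]
        total_time = sum(latencies)
        total_tokens = sum(r["output_tokens"] for r in recs)
        summary[model] = {
            "avg_latency_s": mean(latencies),
            "max_latency_s": max(latencies),  # crude stand-in for tail latency
            "tokens_per_sec": total_tokens / total_time if total_time else 0.0,
            "errors": sum(1 for r in recs if r["error"]),
        }
    return summary
```

With more data, a percentile such as p95 is usually a better tail-latency measure than the maximum, which a single outlier can dominate.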
How it works (high level)
- Middleware / decorator:
  - @monitor(model="llama3") wraps any function.
  - Captures start/end times, prompt, response, and errors.
  - Sends a metrics record to the storage backend.
- Metrics:
  - Token estimation from prompt and response text.
  - System stats from psutil (and optionally pynvml).
  - Simple heuristics for hallucination risk.
- Storage:
  - SQLite via sqlite3 by default.
  - One table: llm_calls with timestamps, model, metrics, error, tags.
- Dashboard:
  - FastAPI app.
  - Reads from the same SQLite file.
  - Renders an HTML summary page (no external JS required).
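The decorator-to-storage flow described above can be sketched in a few dozen lines of standard-library Python. This is a simplified illustration of the general pattern, not the package's source code: `make_monitor`, the column layout, and the characters-divided-by-four token estimate are all assumptions, and system metrics and hallucination heuristics are left out.

```python
import functools
import sqlite3
import time


def make_monitor(conn: sqlite3.Connection):
    """Build a decorator factory that logs each call into the given database."""
    # Assumed, simplified column layout for illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_calls ("
        "ts REAL, model TEXT, latency_s REAL, "
        "input_tokens INTEGER, output_tokens INTEGER, error TEXT)"
    )

    def estimate_tokens(text: str) -> int:
        # Rough rule of thumb (about 4 characters per token); an assumption.
        return max(len(text) // 4, 1) if text else 0

    def monitor(model: str):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(prompt: str, *args, **kwargs):
                start = time.perf_counter()
                response, error = "", None
                try:
                    response = func(prompt, *args, **kwargs)
                    return response
                except Exception as exc:
                    error = repr(exc)
                    raise  # record the failure, but do not swallow it
                finally:
                    # Runs on success and on failure, so every call is logged.
                    latency = time.perf_counter() - start
                    conn.execute(
                        "INSERT INTO llm_calls VALUES (?, ?, ?, ?, ?, ?)",
                        (time.time(), model, latency,
                         estimate_tokens(prompt),
                         estimate_tokens(str(response)), error),
                    )
                    conn.commit()
            return wrapper
        return decorator

    return monitor
```

Writing the record in a `finally` block is the key design choice here: failed calls are logged with their error while the exception still propagates to the caller unchanged.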
Roadmap
This is an early MVP. Planned next steps include:
- Prompt clustering and slow-prompt detection
- Model A vs. Model B comparison
- Basic alerting hooks and export to tools like Grafana
- Optional HTTP ingestion mode (sidecar / agent model)
License
MIT
Download files
File details
Details for the file llm_scope_observer-0.1.0a2.tar.gz.
File metadata
- Download URL: llm_scope_observer-0.1.0a2.tar.gz
- Upload date:
- Size: 10.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 562fba0db5b4cf9e9fb9eeac8fdbfa3d8a0ed93dbc8f6af54dfc28fd63869f38 |
| MD5 | adb1da3e0b7919a280b94dd3e90e9604 |
| BLAKE2b-256 | 0f8856c3a3fb0223b829c45eaf0455b929bf0e4a73223213fd0d0a44c6e64aff |
File details
Details for the file llm_scope_observer-0.1.0a2-py3-none-any.whl.
File metadata
- Download URL: llm_scope_observer-0.1.0a2-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d2b3b32c5167c0a4781a632cbf4aaa2dbe52443998675e2f81bcacb49d186ad5 |
| MD5 | 6ddb5cd02d16f1679d44acd4869aa1c3 |
| BLAKE2b-256 | 02d985d88ee093e52d02fe27633ca402f7effa6ccc937e5aad827309668c1146 |