promptmetrics
Production radar for LLM apps. Capture a baseline of live traffic, get alerted when latency, cost, or behavior drifts.
promptmetrics records every LLM call to a local SQLite database, computes a statistical fingerprint of "what good looked like at deploy time," and tells you when the recent window has drifted. Single file, pip-installable, no account, no SaaS bill.
Install
pip install promptmetrics
Requires Python 3.10+.
5-minute quickstart
1. Decorate the call you care about
from openai import OpenAI
from promptmetrics import track

client = OpenAI()

@track("summarize_v1", model="gpt-4o-mini")
def summarize(text: str):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
That's it. Every call is appended to ~/.promptmetrics/promptmetrics.db with input, output, latency, and token counts. The decorator never raises if storage fails — your app keeps running.
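To make the "never raises if storage fails" guarantee concrete, here is a minimal sketch of a decorator with that behavior. It is not promptmetrics' implementation: `track_sketch`, the `traces` table, and its columns are hypothetical names chosen for illustration.

```python
import functools
import sqlite3
import time


def track_sketch(name, db_path):
    """Hypothetical stand-in for a tracking decorator (not promptmetrics' code)."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            # Errors raised by the wrapped call itself still propagate.
            result = fn(*args, **kwargs)
            latency_ms = (time.perf_counter() - start) * 1000.0
            try:
                # Storage is best-effort: any failure below is swallowed.
                conn = sqlite3.connect(db_path)
                try:
                    with conn:  # commits on success, rolls back on error
                        conn.execute(
                            "CREATE TABLE IF NOT EXISTS traces "
                            "(name TEXT, latency_ms REAL, output TEXT)"
                        )
                        conn.execute(
                            "INSERT INTO traces VALUES (?, ?, ?)",
                            (name, latency_ms, repr(result)),
                        )
                finally:
                    conn.close()
            except Exception:
                pass
            return result

        return wrapper

    return decorator
```

The point of the design is that the storage path sits in its own `try` block after the wrapped call returns, so a locked or unwritable database can cost you a trace but not a response.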
2. Capture a baseline once you have history
promptmetrics baseline summarize_v1 --window 168
Summarizes the last 168 hours (7 days) of traces into mean, p50, p95, and p99 latency plus mean token count, and stores the result as the active baseline.
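As a rough illustration of what such a baseline contains, the sketch below computes the same statistics with the standard library. The function name and the dictionary keys are assumptions, and the percentile interpolation promptmetrics uses may differ.

```python
import statistics


def summarize_baseline(latencies_ms, total_tokens):
    """Illustrative baseline summary; key names are assumptions."""
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "sample_count": len(latencies_ms),
        "latency_mean": statistics.fmean(latencies_ms),
        "latency_p50": cuts[49],
        "latency_p95": cuts[94],
        "latency_p99": cuts[98],
        "tokens_mean": statistics.fmean(total_tokens),
    }
```

Tail percentiles such as p99 are only meaningful with enough samples, which is presumably why the demo commands pass `--min-samples`.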
3. Check for drift
promptmetrics check summarize_v1 --window 1
Compares the most recent hour against the baseline and prints a report. Exits non-zero on DRIFTED so it composes with cron, CI, and shell pipelines.
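If you would rather react to the exit code from Python than from a shell script, a small wrapper along these lines might do. `drift_check_passed` is a hypothetical helper, and it relies only on the documented behavior that DRIFTED yields a non-zero exit; other failures would also come back non-zero.

```python
import subprocess
import sys


def drift_check_passed(command):
    """Run a check command; True means it exited with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the report so a cron mail or CI log contains it.
        print(completed.stdout, file=sys.stderr)
    return completed.returncode == 0


# Intended use (requires promptmetrics on PATH and a stored baseline):
# ok = drift_check_passed(["promptmetrics", "check", "summarize_v1", "--window", "1"])
```

Whether a WARNING result also exits non-zero is not stated here, so treat a `True` return as "not DRIFTED" rather than "fully healthy".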
Try it without an LLM
git clone https://github.com/pallaprolus/promptmetrics && cd promptmetrics
pip install -e .
python demo.py
promptmetrics baseline demo --db ./demo.db --window 24 --min-samples 100
promptmetrics check demo --db ./demo.db --window 1
The demo.py script seeds 300 healthy traces and 60 deliberately drifted ones so you can see a real DRIFTED report on your first run.
What it detects
| Detector | Method | Default threshold |
|---|---|---|
| Latency | Kolmogorov–Smirnov test on the latency distribution plus a percentile-ratio check on p95 | WARNING at +15% p95, DRIFTED at +30% p95 |
| Cost | Mean total-tokens ratio vs baseline | WARNING at +15%, DRIFTED at +30% |
The KS test only fires when the recent window is slower than the baseline — a faster system is good news, not an alert.
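The two ingredients in the table can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's code: the function names, the `"OK"` label, the inclusive thresholds, and the asymptotic p-value approximation are all choices made for this example, and how promptmetrics combines the KS result with the percentile ratio is not specified here.

```python
import math
from bisect import bisect_right


def percentile(values, q):
    """Linear-interpolated percentile, q in [0, 100]."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def ratio_severity(baseline_value, recent_value, warn=0.15, drift=0.30):
    """Map a relative increase onto the thresholds from the table."""
    increase = (recent_value - baseline_value) / baseline_value
    if increase >= drift:
        return "DRIFTED"
    if increase >= warn:
        return "WARNING"
    return "OK"  # assumed label for "no drift"


def ks_slower_pvalue(baseline, recent):
    """One-sided two-sample KS: is `recent` shifted toward larger values?

    Uses the asymptotic approximation p ~= exp(-2 * D+^2 * n*m / (n + m)).
    """
    b, r = sorted(baseline), sorted(recent)
    n, m = len(b), len(r)
    # D+ is the largest amount by which the baseline CDF exceeds the recent CDF.
    d_plus = max(bisect_right(b, x) / n - bisect_right(r, x) / m for x in b + r)
    d_plus = max(d_plus, 0.0)
    return math.exp(-2.0 * d_plus * d_plus * n * m / (n + m))
```

The same `ratio_severity` helper covers both rows of the table: feed it the baseline and recent p95 latency for the latency check, or the baseline and recent mean total tokens for the cost check. Because only the positive deviation D+ enters the p-value, a recent window that is faster than the baseline yields a p-value of 1.0 and never triggers.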
Programmatic API
from promptmetrics import PromptMetrics

with PromptMetrics() as r:
    baseline = r.capture_baseline("summarize_v1", window_hours=168)
    report = r.check_drift("summarize_v1", window_hours=1)
    print(report.severity)
    for result in report.results:
        print(result.drift_type, result.severity, result.detail)
Custom token / output extractors
If your call returns something promptmetrics can't introspect, pass extractors:
@track(
    "rag_query",
    extract_output=lambda r: r.answer,
    extract_tokens=lambda r: (r.usage.input_tokens, r.usage.output_tokens),
)
def rag_query(question: str): ...
OpenAI- and Anthropic-style usage objects are detected automatically.
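For intuition, automatic detection of this kind could look roughly like the sketch below. `guess_tokens` is a hypothetical helper, not part of promptmetrics; it relies on the fact that OpenAI chat completion responses expose `usage.prompt_tokens` and `usage.completion_tokens`, while Anthropic message responses expose `usage.input_tokens` and `usage.output_tokens`.

```python
from types import SimpleNamespace


def guess_tokens(response):
    """Return (input_tokens, output_tokens), or None if no usage is found."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    # Try the OpenAI-style attribute pair first, then the Anthropic-style pair.
    for in_name, out_name in (
        ("prompt_tokens", "completion_tokens"),
        ("input_tokens", "output_tokens"),
    ):
        tokens_in = getattr(usage, in_name, None)
        tokens_out = getattr(usage, out_name, None)
        if tokens_in is not None and tokens_out is not None:
            return tokens_in, tokens_out
    return None


# Stand-in response objects for demonstration only.
openai_like = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30)
)
anthropic_like = SimpleNamespace(
    usage=SimpleNamespace(input_tokens=8, output_tokens=21)
)
```

A response with neither shape returns `None`, which is the situation where passing `extract_tokens` explicitly becomes necessary.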
What's deliberately out of scope (for v0.1)
- Slack / Discord / PagerDuty alerting
- Semantic / quality drift (LLM-as-judge, embedding similarity)
- Hosted dashboard
- Multi-baseline versioning, A/B comparison
- Cloud sync
These are planned for v0.2+. The schema already reserves loop_id and step_index columns for the next feature on the roadmap: agent-loop drift detection for multi-step agents.
License
MIT