Production-grade silent failure detection for LLM applications — hallucination alerts, PII leak detection, semantic drift, topic guard, and real-time observability
Project description
promptwatch
Production-grade silent failure detection for LLM applications.
Traditional monitoring (Datadog, New Relic) reports a 200 OK in 1.2 seconds, but it cannot detect hallucinations, PII leaks, topic drift, or quality degradation in your LLM responses. promptwatch fills that gap.
pip install promptwatch
Why promptwatch?
| Problem | Traditional APM | promptwatch |
|---|---|---|
| Hallucination risk | Blind | Scored 0–1 |
| PII leaks in output | Blind | Detected + alerted |
| Topic drift | Blind | Keyword + coverage guard |
| Toxicity | Blind | Pattern-matched |
| Quality degradation | Blind | Refusal + repetition check |
| Semantic drift over time | Blind | PSI-based drift detector |
Quickstart
from promptwatch import PromptWatcher
watcher = PromptWatcher()
result = watcher.watch(
    prompt="What is the capital of France?",
    response="The capital of France is Paris.",
)
print(result.passed) # True
print(result.overall_score) # 0.0
print(result.overall_risk) # RiskLevel.LOW
Alert Hooks
from promptwatch import PromptWatcher, AlertEvent
watcher = PromptWatcher(pii_threshold=0.1)
def my_alert(event: AlertEvent):
    print(f"ALERT: {event.failure_type.value} — score={event.score:.2f}")
watcher.on_alert(my_alert)
watcher.watch("Tell me about John", "Contact john@example.com at 555-555-5555")
# ALERT: pii_leak — score=0.40
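The hook mechanism above follows the familiar callback-registry pattern. The sketch below is a standalone illustration using only the standard library; `Event` and `Alerter` are hypothetical names, not promptwatch classes, and the "strictly above the threshold" rule is an assumption based on the Configuration table.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    """Hypothetical stand-in for an alert event."""
    failure_type: str
    score: float


class Alerter:
    """Calls every registered hook when a score exceeds the threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._hooks: List[Callable[[Event], None]] = []

    def on_alert(self, hook: Callable[[Event], None]) -> None:
        self._hooks.append(hook)

    def report(self, failure_type: str, score: float) -> bool:
        # Assumption: only scores strictly above the threshold raise an alert.
        if score <= self.threshold:
            return False
        event = Event(failure_type, score)
        for hook in self._hooks:
            hook(event)
        return True


received = []
alerter = Alerter(threshold=0.1)
alerter.on_alert(received.append)
alerter.report("pii_leak", 0.40)  # above threshold: hooks fire
alerter.report("pii_leak", 0.05)  # below threshold: ignored
print(len(received))  # 1
```

Keeping hooks as plain callables means the same registry can feed a logger, a pager, or a metrics counter without changing the detection code.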
Async Support
import asyncio
from promptwatch import PromptWatcher
watcher = PromptWatcher()
async def main():
    result = await watcher.awatch("prompt", "response")
    print(result.overall_risk)
asyncio.run(main())
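One common way to expose a synchronous check to async code is to run it in a worker thread. The sketch below shows that pattern with `asyncio.to_thread` (Python 3.9+); `check` and `acheck` are hypothetical stand-ins, and nothing here is meant to describe how `awatch` is implemented.

```python
import asyncio


def check(prompt: str, response: str) -> dict:
    """Hypothetical synchronous check: fails if the response mentions an error."""
    return {"prompt": prompt, "passed": "error" not in response.lower()}


async def acheck(prompt: str, response: str) -> dict:
    # Run the blocking check in a thread so the event loop stays responsive.
    return await asyncio.to_thread(check, prompt, response)


async def main() -> list:
    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(
        acheck("p1", "All good."),
        acheck("p2", "Internal error."),
    )


results = asyncio.run(main())
print([r["passed"] for r in results])  # [True, False]
```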
Batch Watching
from promptwatch import PromptWatcher
from promptwatch.advanced import batch_watch, abatch_watch
watcher = PromptWatcher()
pairs = [("prompt1", "response1"), ("prompt2", "response2")]
results = batch_watch(watcher, pairs, max_workers=8)
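For intuition, fanning a per-pair check out over a thread pool can be sketched with the standard library alone. `check` and `batch_check` are hypothetical names used for illustration; this is not the code behind `batch_watch`.

```python
from concurrent.futures import ThreadPoolExecutor


def check(prompt: str, response: str) -> int:
    """Stand-in for a per-pair check; returns the response length as a toy result."""
    return len(response)


def batch_check(pairs, max_workers=8):
    # Executor.map yields results in input order, even if tasks finish out of order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda pair: check(*pair), pairs))


pairs = [("prompt1", "response1"), ("prompt2", "response2!")]
results = batch_check(pairs)
print(results)  # [9, 10]
```

Threads suit checks that mostly wait on I/O; CPU-heavy scoring would gain little from this approach because of the GIL.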
Topic Guard
from promptwatch import PromptWatcher
watcher = PromptWatcher(
    topic_allowed=["python", "code", "programming", "function"],
    topic_blocked=["politics", "religion", "violence"],
)
result = watcher.watch("How do I code?", "This involves violent politics.")
# topic drift detected
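A keyword-and-coverage guard of the kind described in the comparison table can be approximated in a few lines. This is a minimal sketch under stated assumptions: `topic_check` is a hypothetical helper that matches exact lowercase words, and promptwatch's own matching rules may differ.

```python
import re


def topic_check(response, allowed=None, blocked=None):
    """Return (blocked keywords found, fraction of allowed keywords covered)."""
    words = set(re.findall(r"[a-z']+", response.lower()))
    blocked_hits = sorted(w for w in (blocked or []) if w in words)
    allowed = allowed or []
    covered = [w for w in allowed if w in words]
    # With no allowed list there is nothing to cover, so treat coverage as full.
    coverage = len(covered) / len(allowed) if allowed else 1.0
    return blocked_hits, coverage


hits, coverage = topic_check(
    "This involves violent politics.",
    allowed=["python", "code", "programming", "function"],
    blocked=["politics", "religion", "violence"],
)
print(hits, coverage)  # ['politics'] 0.0
```

Exact-word matching catches "politics" but not "violent" against the blocked keyword "violence", which is one reason a real guard may need stemming or substring rules.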
Advanced Features
Caching
from promptwatch.advanced import WatchCache
cache = WatchCache(max_size=512, ttl=300)
cached_watch = cache.memoize(watcher)
result = cached_watch("prompt", "response")
print(cache.stats())
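The combination of `max_size` and `ttl` suggests a bounded cache with expiry. The sketch below shows one way to build that with `OrderedDict`; `TTLCache` is a hypothetical class that wraps a plain function rather than a watcher object, and it makes no claim about how `WatchCache` works internally.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Least-recently-used cache whose entries expire after `ttl` seconds."""

    def __init__(self, max_size=512, ttl=300.0, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock
        self._data = OrderedDict()  # key -> (stored_at, value)
        self.hits = 0
        self.misses = 0

    def memoize(self, fn):
        def wrapper(prompt, response):
            key = (prompt, response)
            now = self.clock()
            entry = self._data.get(key)
            if entry is not None and now - entry[0] < self.ttl:
                self._data.move_to_end(key)  # mark as recently used
                self.hits += 1
                return entry[1]
            self.misses += 1
            value = fn(prompt, response)
            self._data[key] = (now, value)
            self._data.move_to_end(key)
            if len(self._data) > self.max_size:
                self._data.popitem(last=False)  # evict the least recently used entry
            return value
        return wrapper


calls = []

def slow_check(prompt, response):
    calls.append((prompt, response))
    return len(response)

cache = TTLCache(max_size=2, ttl=300)
cached_check = cache.memoize(slow_check)
cached_check("prompt", "response")
cached_check("prompt", "response")  # served from the cache
print(cache.hits, cache.misses, len(calls))  # 1 1 1
```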
Drift Detection
from promptwatch.advanced import DriftDetector
detector = DriftDetector(threshold=0.1)
detector.set_baseline([0.1, 0.2, 0.1, 0.15])
print(detector.is_drifting([0.5, 0.6, 0.4, 0.55])) # True
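The comparison table describes the drift detector as PSI-based. For readers unfamiliar with the Population Stability Index, here is a minimal standalone sketch of the usual formula, the sum over bins of (actual - expected) * ln(actual / expected). The `psi` function, its equal-width binning, and the `eps` smoothing are assumptions for illustration, not the library's implementation.

```python
import math


def psi(baseline, current, bins=4, eps=1e-6):
    """Population Stability Index between two samples using equal-width bins."""
    lo = min(min(baseline), min(current))
    hi = max(max(baseline), max(current))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
            counts[idx] += 1
        # Replace empty bins with eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    expected = proportions(baseline)
    actual = proportions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


baseline = [0.1, 0.2, 0.1, 0.15]
current = [0.5, 0.6, 0.4, 0.55]
print(psi(baseline, current) > 0.1)  # True
```

A common rule of thumb reads PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift, which is consistent with the `threshold=0.1` used above.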
Pipeline
from promptwatch.advanced import WatchPipeline
pipeline = WatchPipeline()
pipeline.add_step("log", lambda r: r)
pipeline.filter(lambda r: r.passed)
result = pipeline.run(watch_result)  # watch_result: a result returned earlier by watcher.watch(...)
PII Scrubbing
from promptwatch.advanced import PIIScrubber
scrubber = PIIScrubber()
clean = scrubber.mask("Email me at alice@example.com or call 555-123-4567")
# "Email me at [REDACTED_EMAIL] or call [REDACTED_PHONE]"
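Masking of this kind is typically regex-driven. The sketch below reproduces the output shown above with two deliberately simple patterns; `PATTERNS` and `mask` are hypothetical, and real PII detection needs far broader coverage (international phone formats, names, addresses, and so on).

```python
import re

# Simplified, illustrative patterns only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def mask(text: str) -> str:
    """Replace every match of each pattern with a [REDACTED_<LABEL>] token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


clean = mask("Email me at alice@example.com or call 555-123-4567")
print(clean)  # Email me at [REDACTED_EMAIL] or call [REDACTED_PHONE]
```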
Regression Tracking
from promptwatch.advanced import RegressionTracker
tracker = RegressionTracker(tolerance=0.05)
tracker.record("deploy_v1", 0.10)
tracker.record("deploy_v2", 0.25)
print(tracker.is_regressing()) # True
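One plausible reading of the example above is that a regression means the latest score exceeds the previous one by more than `tolerance`. The sketch below implements that reading; `SimpleRegressionTracker` is a hypothetical class, and both the comparison rule and the "higher score means higher risk" convention are assumptions.

```python
class SimpleRegressionTracker:
    """Flags a regression when the newest score rises by more than `tolerance`."""

    def __init__(self, tolerance=0.05):
        self.tolerance = tolerance
        self.history = []  # list of (label, score) in recording order

    def record(self, label, score):
        self.history.append((label, score))

    def is_regressing(self):
        if len(self.history) < 2:
            return False  # nothing to compare against yet
        (_, previous), (_, latest) = self.history[-2:]
        # Assumption: higher scores mean higher risk, so an increase is a regression.
        return latest - previous > self.tolerance


tracker = SimpleRegressionTracker(tolerance=0.05)
tracker.record("deploy_v1", 0.10)
tracker.record("deploy_v2", 0.25)
print(tracker.is_regressing())  # True
```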
Agent Session Monitoring
from promptwatch.advanced import AgentWatchSession
session = AgentWatchSession(watcher, max_risk_budget=2.0)
for prompt, response in agent_turns:  # agent_turns: iterable of (prompt, response) pairs
    session.watch_turn(prompt, response)
    if session.is_over_budget():
        raise RuntimeError("Agent risk budget exceeded")
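A risk budget can be thought of as a running total of per-turn scores compared against a cap. The sketch below illustrates that idea; `RiskBudget` is a hypothetical class, and summing raw scores is an assumption about how `max_risk_budget` is consumed.

```python
class RiskBudget:
    """Accumulates per-turn risk scores and reports when the cap is exceeded."""

    def __init__(self, max_risk_budget: float) -> None:
        self.max_risk_budget = max_risk_budget
        self.spent = 0.0

    def add(self, score: float) -> None:
        self.spent += score

    def is_over_budget(self) -> bool:
        return self.spent > self.max_risk_budget


budget = RiskBudget(max_risk_budget=2.0)
stopped_at = None
for turn, score in enumerate([0.4, 0.9, 0.8, 0.1]):
    budget.add(score)
    if budget.is_over_budget():
        stopped_at = turn  # zero-based index of the turn that broke the budget
        break
print(stopped_at)  # 2
```

A cumulative cap lets many low-risk turns pass while still stopping a session in which moderate risks keep adding up.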
FastAPI Middleware
from fastapi import FastAPI
from promptwatch import PromptWatcher
from promptwatch.middleware import create_fastapi_middleware
app = FastAPI()
watcher = PromptWatcher()
app.add_middleware(create_fastapi_middleware(watcher))
CLI
promptwatch --prompt "What is the capital?" --response "Paris is the capital."
promptwatch --prompt "Tell me about John" --response "Call 555-123-4567" --json
Configuration
| Parameter | Default | Description |
|---|---|---|
| hallucination_threshold | 0.5 | Flag score above this |
| pii_threshold | 0.1 | Any PII triggers flag |
| toxicity_threshold | 0.3 | Toxic content threshold |
| quality_threshold | 0.4 | Low-quality response threshold |
| topic_allowed | None | Keywords expected in response |
| topic_blocked | None | Keywords that always trigger flag |
| block_on_critical | False | Raise exception on CRITICAL risk |
Installation
pip install promptwatch # core only
pip install promptwatch[fastapi] # with FastAPI middleware
pip install promptwatch[flask] # with Flask middleware
pip install promptwatch[opentelemetry] # with OTEL tracing
pip install promptwatch[all] # everything
Keywords
llm monitoring, ai observability, hallucination detection, pii detection, semantic drift, production ai monitoring, llm alerts, ai safety, prompt monitoring, silent failure detection, llm quality, topic drift, ai reliability, llm guardrails, prompt injection, ai production
Download files
File details
Details for the file llm_watchdog-1.0.0.tar.gz.
File metadata
- Download URL: llm_watchdog-1.0.0.tar.gz
- Upload date:
- Size: 20.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4428cc00425f46f51524ec9a4abd52778b522aa09283582113407ac1cf3106b |
| MD5 | 27352dec202c7608429f089a6b31b933 |
| BLAKE2b-256 | 893e672bac6dfaa238fbd2f1faf566e492ec7c63ad8a360a35a0d80c15fbc4e7 |
File details
Details for the file llm_watchdog-1.0.0-py3-none-any.whl.
File metadata
- Download URL: llm_watchdog-1.0.0-py3-none-any.whl
- Upload date:
- Size: 22.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cc6c9983786d418ead5baf7fe7432a3fbc8e56d7af0c0cb51b06af7142453fd6 |
| MD5 | c2a9fd5d5b8cf27b60c23f011db8108c |
| BLAKE2b-256 | cc68e7f4d789022b5f190d3d62f01d5ad80695c2b1afeb622ead1ff93ab4d5e1 |