5-dimensional production drift detection for RAG systems.
ragvitals
Five-dimensional production drift detection for RAG systems. Library, not a platform — bring your own time-series store.
Why
Production RAG rots in five dimensions:
- Query distribution — users start asking different questions
- Retrieval relevance — top-k recall silently falls after a re-index
- Embedding drift — corpus or query embeddings shift vs the snapshot you tuned on
- Response quality — LLM-as-judge scores degrade
- Judge drift — the judge itself drifts, and you can't tell whether the system improved or the ruler moved
Existing tools cover one or two of these. ragvitals tracks all five through the same time-series store, alarming, and replay path, with no platform lock-in.
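To make the first dimension concrete, here is a minimal sketch of one way a query-distribution shift could be scored: the cosine distance between the centroid of the reference embeddings and the centroid of the current window. This is an illustration only, not ragvitals' internal method; `centroid`, `cosine_distance`, and `query_drift` are hypothetical names.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cannot compare a zero-length vector")
    return 1.0 - dot / norm


def query_drift(reference, window):
    """Hypothetical drift score: distance between the two centroids."""
    return cosine_distance(centroid(reference), centroid(window))


# Toy 2-d embeddings (assumed data, for illustration only)
reference = [[1.0, 0.0], [0.9, 0.1]]
same = [[1.0, 0.0], [0.9, 0.1]]
shifted = [[0.0, 1.0], [0.1, 0.9]]

print(query_drift(reference, same))     # close to 0: no shift
print(query_drift(reference, shifted))  # large: users ask different questions
```

A centroid comparison is cheap but blind to changes in spread, so treat it as a first-pass signal.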
Install
pip install ragvitals
# optional: CloudWatch sink
pip install "ragvitals[aws]"
Quickstart
from datetime import datetime

from ragvitals import (
    Detector, Trace,
    QueryDistribution, RetrievalRelevance, ResponseQuality, JudgeDrift,
    InMemorySink,
)

# Reference set: queries the system was tuned on
reference_embeddings = [...]
reference_judge_scores = {"ref-1": 0.92, "ref-2": 0.88, "ref-3": 0.95}

q = QueryDistribution()
q.set_reference(reference_embeddings)
j = JudgeDrift()
j.set_reference(reference_judge_scores)

det = Detector(
    dimensions=[
        q,
        RetrievalRelevance(metric="hit_rate", k=10),
        ResponseQuality(score_keys=["faithfulness", "relevance"]),
        j,
    ],
    sinks=[InMemorySink()],
)

# Ingest traces from your live pipeline
# (stream_of_traces() is a placeholder for your own iterable of Trace objects)
for trace in stream_of_traces():
    det.ingest(trace)

report = det.report()
print(report.degraded)  # ['RetrievalRelevance']
print(report.healthy)   # False

det.commit_window()  # roll trailing baselines forward at end of comparison interval
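The idea of a trailing baseline that rolls forward at the end of each comparison interval can be sketched in plain Python. The class below is a hypothetical illustration, not ragvitals' implementation: it assumes one scalar metric per trace, a z-score of the open window's mean against the means of previously committed windows, and a threshold of 3.0.

```python
import statistics


class TrailingBaseline:
    """Hypothetical z-score check of an open window against committed windows."""

    def __init__(self, threshold=3.0):
        self.history = []  # one mean per committed window
        self.current = []  # raw values in the open window
        self.threshold = threshold

    def ingest(self, value):
        self.current.append(value)

    def zscore(self):
        """z-score of the open window's mean; None when there is too little data."""
        if len(self.history) < 2 or not self.current:
            return None
        mu = statistics.fmean(self.history)
        sigma = statistics.stdev(self.history)
        if sigma == 0.0:
            return None
        return (statistics.fmean(self.current) - mu) / sigma

    def degraded(self):
        z = self.zscore()
        return z is not None and abs(z) > self.threshold

    def commit_window(self):
        """Fold the open window into the baseline and start a new one."""
        if self.current:
            self.history.append(statistics.fmean(self.current))
        self.current = []


baseline = TrailingBaseline()
for window in ([0.90, 0.92], [0.91, 0.89], [0.90, 0.94]):
    for value in window:
        baseline.ingest(value)
    baseline.commit_window()

for value in [0.55, 0.60]:  # a sudden drop in the open window
    baseline.ingest(value)
print(baseline.degraded())
```

Committing only at the end of the interval keeps the window under test out of its own baseline, so a slow decline is not averaged away before it can be flagged.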
What a Trace looks like
Trace(
    timestamp=datetime.utcnow(),
    query="What's the baggage allowance on a Wanna Get Away fare?",
    query_embedding=[...],               # required by QueryDistribution / EmbeddingDrift
    retrieved_doc_ids=["d1", "d2"],
    retrieval_scores=[0.91, 0.83],
    relevance_labels=[1, 0],             # binary 0/1 per retrieved doc; required by RetrievalRelevance
    response="Up to 2 free checked bags...",
    judge_scores={"faithfulness": 0.92, "relevance": 0.88},  # required by ResponseQuality / JudgeDrift
    metadata={"reference_id": "ref-1"},  # required by JudgeDrift
)
Each dimension reads only the fields it needs. When those fields are missing, the dimension reports OK with an empty sample instead of raising an error.
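As an illustration of the "empty sample instead of an error" behaviour, here is a minimal sketch of a hit-rate-at-k computation over binary relevance labels that skips traces without labels. `hit_rate_at_k` and its `(value, sample_size)` return shape are assumptions made for this example, not the library's API.

```python
def hit_rate_at_k(label_lists, k=10):
    """Fraction of labelled traces with at least one relevant doc in the top k.

    Returns (value, sample_size). value is None when no trace carried labels,
    so the caller can report "OK, empty sample" rather than fail.
    """
    labelled = [labels for labels in label_lists if labels]
    if not labelled:
        return None, 0
    hits = sum(1 for labels in labelled if any(labels[:k]))
    return hits / len(labelled), len(labelled)


# Four traces: one has no labels at all and is skipped
window = [[1, 0], [0, 0], None, [0, 1]]
print(hit_rate_at_k(window, k=2))
print(hit_rate_at_k([None, []]))
```

Returning the sample size alongside the value lets downstream alarming ignore windows that are too small to trust.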
Sinks
from ragvitals import InMemorySink, JSONLSink, CloudWatchSink
InMemorySink() # tests, REPL
JSONLSink(path="/var/log/ragvitals.jsonl") # cheap, append-only
CloudWatchSink(namespace="rag/prod") # boto3-backed, requires `pip install ragvitals[aws]`
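For a sense of what an append-only JSONL sink involves, the sketch below writes one JSON object per line. `JsonlFileSink`, its `write` method, and the record fields are hypothetical and chosen for illustration; the real `JSONLSink` interface and record schema may differ.

```python
import json
import os
import tempfile


class JsonlFileSink:
    """Hypothetical append-only sink: one JSON object per line."""

    def __init__(self, path):
        self.path = path

    def write(self, record):
        # Open in append mode on every write so earlier lines are never rewritten
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, sort_keys=True) + "\n")


path = os.path.join(tempfile.mkdtemp(), "metrics.jsonl")
sink = JsonlFileSink(path)
sink.write({"dimension": "RetrievalRelevance", "value": 0.62, "status": "degraded"})
sink.write({"dimension": "ResponseQuality", "value": 0.90, "status": "ok"})

with open(path, encoding="utf-8") as fh:
    lines = fh.read().splitlines()
print(len(lines))
```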
Replay against a frozen pipeline
det.ingest_jsonl("s3-or-local-path-to/traces.jsonl")
report = det.report()
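If you want to inspect or filter a trace file before replaying it, a line-by-line reader is enough. This is a minimal sketch that assumes one JSON object per line with keys mirroring the Trace fields; the on-disk format `ingest_jsonl` expects is not documented here, and `iter_jsonl` is a hypothetical helper.

```python
import io
import json


def iter_jsonl(lines):
    """Yield one dict per non-blank line; report the line number on bad JSON."""
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {lineno}: {exc.msg}") from exc


# In-memory stand-in for a traces.jsonl file (assumed field names)
sample = io.StringIO(
    '{"query": "baggage allowance?", "judge_scores": {"faithfulness": 0.92}}\n'
    "\n"
    '{"query": "change fee?", "judge_scores": {"faithfulness": 0.71}}\n'
)
records = list(iter_jsonl(sample))
print(len(records))
```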
What it explicitly is not
- Not a tracing tool. Bring your own JSONL / OpenTelemetry / Phoenix upstream.
- Not an annotation UI.
- Not a replacement for Ragas (which does offline eval on a golden set).
- Not Arize/Phoenix — those are platforms; this is a library that writes to a sink you choose.
Sibling libraries
If your RAG runs on AWS Bedrock, two companion libraries cover the rest of the stack:
- bedrockcache — audit Anthropic prompt caching across the Bedrock + LiteLLM + Strands stack.
- bedrockstack — Bedrock-aware retry policy, cost ledger, streaming-error normalization.
- ragvitals (this) — 5-dimensional production drift detection for the RAG pipeline above.
Roadmap
- v0.2: pluggable statistical tests (KS, MWU) instead of z-score-only.
- v0.3: Detector.replay(snapshot=...) against a saved baseline snapshot.
- v0.4: drift attribution (which docs / users / queries are most responsible).
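For readers unfamiliar with the KS test named in the v0.2 item, here is a minimal pure-Python sketch of the two-sample Kolmogorov-Smirnov statistic: the largest gap between the two empirical CDFs. It is an illustration of the technique only, not a preview of the planned ragvitals interface; `ks_statistic` is a hypothetical name and the scores are made-up data.

```python
def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the empirical CDFs of a and b."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both cursors past every value <= x (handles ties)
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


baseline_scores = [0.90, 0.92, 0.88, 0.91]
current_scores = [0.70, 0.90, 0.60, 0.65]
print(ks_statistic(baseline_scores, current_scores))  # 0.75
```

Unlike a z-score on the mean, this compares whole distributions, so it can flag a change in shape even when the average stays put.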
Develop
pip install -e ".[dev]"
pytest -v
License
MIT