
Guarded RAG: grounded answers, refuse-when-unsupported, PII redaction, and an eval harness with metrics. Stdlib core, bring-your-own model.


rag-guard

ci License: MIT Python

Guarded RAG: answers grounded in retrieved context, refusal when there's no support, and an eval harness that puts a number on it.

The failure mode of RAG isn't bad retrieval. It's the confident answer with nothing behind it. rag-guard is a small, runnable pipeline that makes that hard: it refuses when retrieval finds no support, checks the answer against the context, redacts PII from the output, and traces every step. Pure-stdlib core, zero runtime dependencies, bring your own model.

"how long is shipping?"  → grounded answer, sources=[ship]      ✓
"quantum chromodynamics?" → refuses (no support), model not called ✓

[Figure: rag-guard demo showing a grounded answer, a refusal, PII redaction, and the eval harness]

The three guards

  1. Refuse-when-unsupported. If the top retrieval score is below threshold, the pipeline refuses and never even calls the model. No support, no answer.
  2. Groundedness check. After the model answers, verify the answer is actually backed by the retrieved context; flag it if not. (Lexical-overlap proxy here, swappable for an NLI/LLM judge behind the same interface.)
  3. PII output filter. Emails, phones, SSNs, and card-like numbers are redacted from whatever the model returns.
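
As an illustration of the third guard, here is a minimal regex-based redaction sketch. It is not rag-guard's actual rule set: the pattern list, the placeholder labels, and the function name redact_pii are all assumptions made for this example, and real PII detection usually needs broader patterns than these.

```python
import re

# Hypothetical patterns for illustration only; rag-guard's own rules may differ.
# Order matters: SSNs and card-like numbers are replaced before phone numbers
# so the looser phone pattern cannot match a fragment of them.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    # North American style phone numbers, with an optional +1 prefix.
    ("PHONE", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)")),
]


def redact_pii(text: str) -> str:
    """Replace each PII match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact_pii("Email jane.doe@example.com or call 555-123-4567."))
# Email [EMAIL] or call [PHONE].
```

Because the filter runs on the model's output rather than its input, it applies no matter which provider produced the text.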

Every result carries a trace (the retrieved documents with their scores, whether the pipeline refused, and whether the answer was judged grounded), so the system is auditable.
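
The first two guards and the trace could be sketched roughly as follows. This is a minimal sketch and not rag-guard's implementation: guarded_answer, support, the (doc_id, text, score) layout of hits, the shape of the refused result, and both thresholds (min_score, min_support) are hypothetical names and values chosen for illustration.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(answer: str, context: str) -> float:
    """Lexical-overlap proxy: fraction of answer tokens found in the context."""
    answer_tokens = tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokens(context)) / len(answer_tokens)


def guarded_answer(query, hits, complete, min_score=0.2, min_support=0.6):
    """hits: (doc_id, text, score) tuples, best first. complete: prompt -> str."""
    trace = {
        "retrieved": [(doc_id, score) for doc_id, _, score in hits],
        "refused": False,
        "grounded": None,
    }
    # Guard 1: refuse before the model is ever called.
    if not hits or hits[0][2] < min_score:
        trace["refused"] = True
        return {"answer": None, "refused": True, "grounded": None, "trace": trace}

    context = "\n".join(text for _, text, _ in hits)
    answer = complete(f"Context:\n{context}\n\nQuestion: {query}")

    # Guard 2: flag (rather than drop) answers the context does not back.
    score = support(answer, context)
    trace["grounded"] = score >= min_support
    return {"answer": answer, "refused": False, "grounded": trace["grounded"],
            "support": score, "trace": trace}
```

Flagging instead of discarding an ungrounded answer keeps the decision visible in the trace and leaves the policy (block, warn, escalate) to the caller.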

Install

pip install guarded-rag

Zero runtime dependencies: it's stdlib all the way down. The PyPI name is guarded-rag, but the import is still rag_guard.

Quickstart

from rag_guard import Retriever, RagGuard
from rag_guard import FakeProvider   # swap for a real model provider

ret = Retriever([
    {"id": "ship",    "text": "Standard shipping takes 3 to 5 business days."},
    {"id": "returns", "text": "Return any item within 30 days for a full refund."},
])
rag = RagGuard(ret, FakeProvider("Shipping takes 3 to 5 business days."))

print(rag.answer("how long does shipping take"))
# {'answer': 'Shipping takes 3 to 5 business days.', 'refused': False,
#  'grounded': True, 'support': 1.0, 'sources': ['ship', 'returns'], 'trace': {...}}

print(rag.answer("quantum chromodynamics")["refused"])   # True: refuses, no support

Measure it (the eval harness)

from rag_guard.evaluate import evaluate
cases = [
    {"query": "how long does shipping take", "gold": "ship", "expect_refusal": False},
    {"query": "quantum chromodynamics",                         "expect_refusal": True},
]
print(evaluate(rag, cases))
# {'n': 2, 'refusal_accuracy': 1.0, 'retrieval_hit_rate': 1.0, 'grounded_rate': 1.0, 'cases': [...]}

Re-run the eval on any model or config change to catch regressions before a user does.
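
To make the three metrics concrete, here is one plausible way they could be computed from per-case results. The definitions are assumptions, not taken from rag_guard.evaluate: refusal accuracy is counted over all cases, retrieval hit rate only over cases that name a gold document, and grounded rate only over answers that were not refused. The function name summarize and the per-case dict layout are hypothetical.

```python
def summarize(results):
    """Aggregate per-case results into the three headline metrics.

    Each result is a dict with 'refused', 'expect_refusal', 'grounded',
    'sources', and optionally 'gold' (assumed layout, for illustration).
    """
    n = len(results)
    # A case is correct when the pipeline refused exactly when expected.
    refusal_ok = sum(r["refused"] == r["expect_refusal"] for r in results)

    # Retrieval is only scored where a gold document id was labeled.
    with_gold = [r for r in results if r.get("gold") is not None]
    hits = sum(r["gold"] in r["sources"] for r in with_gold)

    # Groundedness is only meaningful for answers that were actually given.
    answered = [r for r in results if not r["refused"]]
    grounded = sum(bool(r["grounded"]) for r in answered)

    def rate(num, den):
        return round(num / den, 4) if den else None

    return {
        "n": n,
        "refusal_accuracy": rate(refusal_ok, n),
        "retrieval_hit_rate": rate(hits, len(with_gold)),
        "grounded_rate": rate(grounded, len(answered)),
    }


print(summarize([
    {"refused": False, "expect_refusal": False, "grounded": True,
     "sources": ["ship", "returns"], "gold": "ship"},
    {"refused": True, "expect_refusal": True, "grounded": None, "sources": []},
]))
# {'n': 2, 'refusal_accuracy': 1.0, 'retrieval_hit_rate': 1.0, 'grounded_rate': 1.0}
```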

A real run, not a demo fixture. The two cases above are an illustration. They score 1.0 across the board, so don't read anything into them. bin/eval_real.py runs a 20-case labeled set over a 12-doc corpus through a live model (claude -p):

PYTHONPATH=. python3 bin/eval_real.py   # requires claude CLI on PATH
# {'n': 20, 'refusal_accuracy': 0.9, 'retrieval_hit_rate': 1.0, 'grounded_rate': 0.8824}

The two refusal misses were out-of-corpus identity questions ("who's the CEO?") that scored just over the threshold. The groundedness guard still flagged both, so nothing unsupported got through unflagged. Full output lands in eval/results.json.

Bring your own model

The model sits behind a one-method seam: complete(prompt) -> str. FakeProvider keeps tests and CI deterministic and key-free; a real provider drops in without touching the pipeline or the guards. Retrieval follows the same pattern: the stdlib TF-IDF Retriever is a stand-in for real embeddings or a vector DB behind retrieve().
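
The provider seam can be written down as a structural type. The sketch below is an assumption about how one might express it, not code from rag-guard: the Protocol class Provider, the CannedProvider test double, and the run_model helper are hypothetical names. Only the complete(prompt) -> str signature comes from the description above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Provider(Protocol):
    """Anything with complete(prompt) -> str satisfies the seam."""

    def complete(self, prompt: str) -> str: ...


class CannedProvider:
    """Hypothetical test double: returns a fixed reply and counts calls."""

    def __init__(self, reply: str):
        self.reply = reply
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return self.reply


def run_model(provider: Provider, prompt: str) -> str:
    """Call the model through the seam, rejecting objects that lack complete()."""
    if not isinstance(provider, Provider):
        raise TypeError("provider must define complete(prompt) -> str")
    return provider.complete(prompt)


fake = CannedProvider("Shipping takes 3 to 5 business days.")
print(run_model(fake, "how long does shipping take"))
# Shipping takes 3 to 5 business days.
```

CannedProvider never inherits from Provider; matching the method is enough, which is what lets a deterministic fake and a networked client swap freely.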

Real provider

Any object with complete(prompt) -> str works. Here's an Anthropic provider in stdlib only, no SDK required:

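import json
import os
import urllib.request


class AnthropicProvider:
    """Minimal Anthropic Messages API client using only the standard library."""

    def __init__(self, model="claude-sonnet-4-5", max_tokens=512, timeout=60):
        self.model, self.max_tokens, self.timeout = model, max_tokens, timeout

    def complete(self, prompt: str) -> str:
        # A Request that carries a data payload is sent as a POST.
        req = urllib.request.Request(
            "https://api.anthropic.com/v1/messages",
            data=json.dumps({
                "model": self.model,
                "max_tokens": self.max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            }).encode(),
            headers={
                # The key is read from the environment; a missing key raises KeyError.
                "x-api-key": os.environ["ANTHROPIC_API_KEY"],
                "anthropic-version": "2023-06-01",
                "content-type": "application/json",
            },
        )
        # Without a timeout, urlopen can block indefinitely on a stalled connection.
        with urllib.request.urlopen(req, timeout=self.timeout) as resp:
            # The reply text is in the first content block of the response.
            return json.load(resp)["content"][0]["text"]

# Reuses the `ret` retriever built in the Quickstart.
rag = RagGuard(ret, AnthropicProvider())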

Run / test

git clone https://github.com/Jott2121/rag-guard && cd rag-guard
pip install -e ".[dev]" && python -m pytest -q     # tests pass on Python 3.11-3.13
python bin/demo.py                                  # see grounded answer, refusal, PII redaction, eval

CI (badge above) runs the same suite across Python 3.11, 3.12, and 3.13 on every push.

About

Built by Jeff Otterson (Jott2121). Companion to agent-gate (an MCP gate for agent work), bow, fleet-mode, and agent-cost-attribution. This one's job is simple: if the context can't back the answer, the answer doesn't ship. MIT.
