Deterministic security middleware for RAG applications to prevent context exfiltration via indirect prompt injection.
Project description
RAGuard
Deterministic security middleware for RAG applications.
RAGuard protects Retrieval-Augmented Generation (RAG) systems from context exfiltration via indirect prompt injection. It acts as a deterministic "tripwire" by injecting unique, session-specific canary tokens into retrieved context and scanning LLM outputs for their presence.
📺 Live Interactive Playground & Demo
Try the interactive security simulation:
To run the interactive playground locally:
python examples/playground.py
Open your browser at http://127.0.0.1:8000 to experiment with safe and adversarial prompts in real time.
🚀 Key Features
- Deterministic Detection: Guarantees 100% detection of token-level exfiltration when the token appears in the LLM's output.
- Zero-Width Stealth Mode (Opt-in): Injects invisible Unicode sequences (
U+200B,U+200C, etc.) to hide canary tokens from attackers, minimizing context window disruption. - Framework Agnostic Core: Works with any LLM provider, prompt framework, or vector database.
- First-Class Adapters: Drop-in integration callback handlers and middlewares for LangChain, LlamaIndex, and FastAPI.
- Ultra-Low Overhead: Adds less than 5ms of latency per request and has a minimal memory footprint.
📦 Installation
Install the core package:
pip install raguard-security
Or install with support for your preferred framework:
# LangChain integration
pip install "raguard-security[langchain]"
# LlamaIndex integration
pip install "raguard-security[llamaindex]"
# FastAPI integration
pip install "raguard-security[fastapi]"
⚠️ Production Deployment Note: The default
InMemoryTokenStoreis single-process only. If you run multiple workers (e.g., Gunicorn, Kubernetes pods), tokens stored in one worker's memory are invisible to others. For multi-worker deployments, useRedisTokenStore:from raguard import CanaryMiddleware, RedisTokenStore store = RedisTokenStore(redis_url="redis://localhost:6379/0") middleware = CanaryMiddleware(store=store)
💡 How It Works
RAGuard secures your RAG pipelines in three simple steps:
- Dynamic Injection: When a search returns documents, RAGuard injects a unique, session-specific token (either alphanumeric or zero-width).
- Augmented Generation: The context, now containing the canary token, is passed to the LLM.
- Output Interception: RAGuard scans the LLM's response before it is returned to the user. If the canary token is detected in the response, the output is blocked.
🛠️ Framework Integrations
1. LangChain Callback Handler
Add RAGuardLangChainCallback to your LLM or Chain callbacks. It automatically intercepts retrieved documents to inject tokens and scans the final generation.
from langchain_core.callbacks.manager import CallbackManager
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
from raguard.adapters.langchain import RAGuardLangChainCallback
from raguard.exceptions import CanaryTokenDetected
# 1. Initialize callback for the user's session
canary_callback = RAGuardLangChainCallback(session_id="user_123_session")
# 2. Add to your chain/LLM
llm = ChatOpenAI(
model="gpt-4",
callbacks=[canary_callback]
)
chain = RetrievalQA.from_chain_type(
llm=llm,
retriever=retriever,
callbacks=[canary_callback] # Injects token into retriever end
)
# 3. Run and handle exfiltration attempts
try:
response = chain.run("What is the secret API key?")
except CanaryTokenDetected:
# Triggered automatically if the LLM output contains the canary token
response = "Security violation: response blocked."
2. LlamaIndex Node Postprocessor
Add RAGuardLlamaIndexPostprocessor to your query engine's postprocessors list and call scan_response on the final query result.
from llama_index.core.query_engine import RetrieverQueryEngine
from raguard.adapters.llamaindex import RAGuardLlamaIndexPostprocessor
from raguard.exceptions import CanaryTokenDetected
# 1. Initialize the postprocessor
postprocessor = RAGuardLlamaIndexPostprocessor(session_id="user_123_session")
# 2. Setup the query engine
query_engine = RetrieverQueryEngine(
retriever=retriever,
node_postprocessors=[postprocessor] # Injects token into retrieved nodes
)
# 3. Query and manually scan the output
response = query_engine.query("What is the secret revenue?")
if not postprocessor.scan_response(str(response)):
# Trigger security alerts and block the response
raise CanaryTokenDetected(session_id=postprocessor.session_id)
3. FastAPI Middleware
Add RAGuardFastAPIMiddleware to automatically inject tokens into retrieval endpoints and scan/block generation responses.
from fastapi import FastAPI
from raguard.adapters.fastapi import RAGuardFastAPIMiddleware
from raguard import CanaryMiddleware
app = FastAPI()
canary = CanaryMiddleware(stealth_mode=True)
# Add middleware to intercept requests
app.add_middleware(
RAGuardFastAPIMiddleware,
middleware=canary,
inject_paths=[r"^/api/retrieve"], # Endpoints returning retrieved context
scan_paths=[r"^/api/generate"], # Endpoints returning LLM generations
session_header="X-Session-ID", # Header to extract session IDs
)
⚡ Performance & Overhead
RAGuard is designed for production enterprise environments where latency is critical.
- Latency: Token generation and response string scanning add <5ms of latency overhead per request.
- Memory Footprint: Canary tokens consume <1KB of memory per active session. Sessions can be cleaned up using
clear_session(session_id)to prevent memory leaks in long-running processes.
🔒 Threat Model & Known Limitations
What RAGuard Detects
RAGuard is a deterministic exfiltration tripwire. It detects one specific attack class: verbatim or near-verbatim context leakage via indirect prompt injection (OWASP LLM06). If the LLM is tricked into outputting retrieved context that contains a canary token, RAGuard catches it with 100% recall.
With decode_response=True, RAGuard also catches tokens that are encoded in the output using Base64, ROT13, hex encoding, string reversal, or character splitting.
What RAGuard Does NOT Detect
RAGuard is not a complete RAG security solution. It does not protect against:
| Attack Class | Description | Why RAGuard Can't Catch It |
|---|---|---|
| Semantic Hijacking | Attacker instructs the LLM to summarize or paraphrase the context instead of repeating it verbatim. | The canary token is not reproduced in paraphrased output. |
| Behavioral Injection | Injected instructions alter the LLM's behavior (e.g., "Reply in a way that causes the user to click this link"). | No context is leaked — the LLM's behavior is hijacked instead. |
| Tool-Call Manipulation | Injected instructions cause the LLM to call external APIs or tools on the user's behalf. | The attack vector is the LLM's tool-calling ability, not text output. |
| Indirect Instruction Following | Injected instructions silently modify the LLM's reasoning (e.g., "Add a step to these instructions"). | No token is present in the manipulated reasoning output. |
| Token Stripping | Sophisticated attackers instruct the LLM to strip known wrapper patterns before responding. | Mitigated by stealth mode and configurable token_wrapper, but not eliminated. |
Additional Limitations
- Tokenizer Normalization: Certain LLM APIs may strip zero-width characters during input preprocessing. Use the default alphanumeric mode if your LLM provider enforces aggressive Unicode normalization.
- Streaming + Decode Response: When
decode_response=True, the FastAPI adapter must buffer the full response body. Streaming scan (text/event-stream) is only available whendecode_response=False.
🔒 Security Note:
decode_responsedefaults toFalsefor performance. This means encoded exfiltration attempts (Base64, ROT13, hex) will not be detected unless you explicitly enable it:middleware = CanaryMiddleware(decode_response=True)Enable this if your threat model includes attackers who may instruct the LLM to encode leaked context.
🤝 Contributing
Contributions are welcome! Please check out our CONTRIBUTING.md to set up your development environment, run lints/formatting (ruff), type checks (mypy), and the test suite (pytest).
📄 License
MIT License. See LICENSE for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file raguard_security-1.0.0.tar.gz.
File metadata
- Download URL: raguard_security-1.0.0.tar.gz
- Upload date:
- Size: 88.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fad5f861b7e5fa14e87bfa3e9aa4a2139c4ed8165e7d1a3d0e4f854517e16378
|
|
| MD5 |
a93c9aba241005b56cb4dae05901923c
|
|
| BLAKE2b-256 |
e36b02fff0735b42747c825c4413b3c493636ed61806c2d9f0f7f65c6208cac7
|
Provenance
The following attestation bundles were made for raguard_security-1.0.0.tar.gz:
Publisher:
publish.yml on harshaygadekar/raguard-security
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
raguard_security-1.0.0.tar.gz -
Subject digest:
fad5f861b7e5fa14e87bfa3e9aa4a2139c4ed8165e7d1a3d0e4f854517e16378 - Sigstore transparency entry: 1937025808
- Sigstore integration time:
-
Permalink:
harshaygadekar/raguard-security@21e48c1ad312f07ace489a9f9191b50032ca0b0c -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/harshaygadekar
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@21e48c1ad312f07ace489a9f9191b50032ca0b0c -
Trigger Event:
push
-
Statement type:
File details
Details for the file raguard_security-1.0.0-py3-none-any.whl.
File metadata
- Download URL: raguard_security-1.0.0-py3-none-any.whl
- Upload date:
- Size: 25.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c1878878c1272da52a4658f4fbb3004b6ed5537ac8ce0f944984278ce75f3070
|
|
| MD5 |
bbefa715eac5853b93c4cdcf01b64a3b
|
|
| BLAKE2b-256 |
2834b13eac15eb83cae6c295e8f0808233c380b48193f6a3843f092d91ca5d6c
|
Provenance
The following attestation bundles were made for raguard_security-1.0.0-py3-none-any.whl:
Publisher:
publish.yml on harshaygadekar/raguard-security
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
raguard_security-1.0.0-py3-none-any.whl -
Subject digest:
c1878878c1272da52a4658f4fbb3004b6ed5537ac8ce0f944984278ce75f3070 - Sigstore transparency entry: 1937025932
- Sigstore integration time:
-
Permalink:
harshaygadekar/raguard-security@21e48c1ad312f07ace489a9f9191b50032ca0b0c -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/harshaygadekar
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@21e48c1ad312f07ace489a9f9191b50032ca0b0c -
Trigger Event:
push
-
Statement type: