Scan RAG documents for hidden prompt injections, invisible text attacks, and data exfiltration payloads before they enter your vector database.
Project description
rag-sanitizer
Scan your RAG documents for hidden prompt injections, invisible text attacks, and data exfiltration payloads before they enter your vector database.
Why this exists
RAG pipelines ingest untrusted data from PDFs, wikis, web pages, and exported collaboration tools. That text is embedded and then re-injected into LLM context at query time. If malicious instructions are hidden in the source, they can survive chunking and retrieval.
This creates a security boundary problem: your model obeys system/developer instructions, but retrieved context can still influence output if it contains adversarial patterns. Attackers exploit this through prompt injection directives, obfuscated payloads, invisible Unicode, and exfiltration links.
rag-sanitizer adds a defensive preprocessing layer at ingestion time. It scans raw text, emits structured threat signals, computes a composite risk score, and sanitizes malicious segments before chunking and embedding.
How I use it
cd rag-sanitizer
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e ".[dev,all]"
pytest --cov=rag_sanitizer --cov-report=xml -v
ruff check rag_sanitizer tests
ruff format --check rag_sanitizer tests
Install
pip install rag-sanitizer
30-second usage
from rag_sanitizer import RagSanitizer
sanitizer = RagSanitizer()
result = sanitizer.scan(document_text)
if not result.is_clean:
print(f"Threats found: {result.signal_count}")
print(f"Threat score: {result.threat_score}")
safe_text = result.sanitized_text
What it detects
| Category | Description | Example payload |
|---|---|---|
prompt_injection |
Direct instruction override patterns and jailbreak phrases | Ignore all previous instructions |
invisible_text |
Zero-width chars, hidden CSS, tiny font metadata | display:none or \u200b abuse |
density_attack |
Repetition stuffing to bias embeddings | repeated "poisoned vector payload" |
encoded_payload |
Base64/hex/unicode/entity encoded injections | SWdub3JlIGFsbCB... |
data_exfiltration |
URLs/tags designed to leak context |  |
unicode_smuggling |
Homoglyph/leetspeak masked directives | іgnоrе аll іnstructіоns |
high_entropy_blob |
Obfuscated or encrypted-looking windows | random high-entropy blobs |
LangChain integration
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from rag_sanitizer.integrations.langchain import RagSanitizerTransformer
# Load
_docs = PyPDFLoader("untrusted_document.pdf").load()
# Sanitize BEFORE chunking
sanitizer = RagSanitizerTransformer(on_threat="sanitize")
clean_docs = sanitizer.transform_documents(_docs)
# Log threats
for doc in clean_docs:
score = doc.metadata.get("rag_sanitizer_threat_score", 0)
if score > 0:
print(f"[!] Threat score {score:.2f} in {doc.metadata.get('source', 'unknown')}")
# Chunk + embed
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(clean_docs)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())
LlamaIndex integration
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from rag_sanitizer.integrations.llamaindex import RagSanitizerPostprocessor
pipeline = IngestionPipeline(transformations=[SentenceSplitter(chunk_size=512)])
nodes = pipeline.run(documents=docs)
post = RagSanitizerPostprocessor(on_threat="sanitize")
clean_nodes = post._postprocess_nodes(nodes)
Configuration
from rag_sanitizer import SanitizerConfig
config = SanitizerConfig(
threat_threshold=0.3,
max_text_length=500_000,
density_enabled=True,
max_ngram_ratio=0.05,
max_word_frequency=0.02,
window_similarity_threshold=0.85,
invisible_text_enabled=True,
min_font_size_threshold=1.0,
max_whitespace_sequence=50,
injection_enabled=True,
injection_severity_minimum="low",
encoding_enabled=True,
min_base64_length=20,
exfiltration_enabled=True,
entropy_enabled=True,
entropy_window_size=256,
entropy_threshold=4.5,
max_high_entropy_ratio=0.15,
strip_placeholder="[REMOVED BY RAG-SANITIZER]",
normalize_unicode=True,
)
| Option | Default | Description |
|---|---|---|
threat_threshold |
0.3 |
Score at/above threshold marks document as unsafe |
max_text_length |
500000 |
Hard cap for analyzed input length |
density_enabled |
True |
Enable repetition-density analyzer |
max_ngram_ratio |
0.05 |
3-gram max ratio before flagging |
max_word_frequency |
0.02 |
Word frequency max ratio before flagging |
window_similarity_threshold |
0.85 |
Similarity threshold for duplicate windows |
invisible_text_enabled |
True |
Enable invisible-text analyzer |
min_font_size_threshold |
1.0 |
Font size below this counts as invisible |
max_whitespace_sequence |
50 |
Long whitespace abuse threshold |
injection_enabled |
True |
Enable injection analyzer |
injection_severity_minimum |
"low" |
Minimum severity emitted by injection analyzer |
encoding_enabled |
True |
Enable encoded-payload analyzer |
min_base64_length |
20 |
Minimum base64 token length |
exfiltration_enabled |
True |
Enable exfiltration analyzer |
entropy_enabled |
True |
Enable entropy analyzer |
entropy_window_size |
256 |
Entropy sliding window size |
entropy_threshold |
4.5 |
Entropy threshold for suspicious windows |
max_high_entropy_ratio |
0.15 |
Ratio threshold for document-level high entropy |
strip_placeholder |
"[REMOVED BY RAG-SANITIZER]" |
Replacement token for stripped spans |
normalize_unicode |
True |
Run NFKC + zero-width cleanup before analysis |
Development
pip install -e ".[dev]"
pytest -v
ruff check rag_sanitizer tests
ruff format rag_sanitizer tests
License
MIT.
Need real-time protection for production RAG systems? KoreShield provides enterprise-grade LLM security with ML-based detection, semantic analysis, and compliance reporting. rag-sanitizer catches the obvious stuff - KoreShield catches everything else.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file rag_sanitizer-0.1.0.tar.gz.
File metadata
- Download URL: rag_sanitizer-0.1.0.tar.gz
- Upload date:
- Size: 26.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: Hatch/1.16.5 cpython/3.14.2 HTTPX/0.28.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
03cf114e69d20074446eb92fafe1de2191648e75d1cf6bd8bc0d1e16954a7be7
|
|
| MD5 |
ef2a4b148ae803be0032856201fe823b
|
|
| BLAKE2b-256 |
71a17872ce071d169bd649ce36e672a6907fa84a18f2807fb4e907961b04b815
|
File details
Details for the file rag_sanitizer-0.1.0-py3-none-any.whl.
File metadata
- Download URL: rag_sanitizer-0.1.0-py3-none-any.whl
- Upload date:
- Size: 25.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: Hatch/1.16.5 cpython/3.14.2 HTTPX/0.28.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
95cf0e7f49cbb0c772351891971960813202cd391f4db70bce64213d4eb03a98
|
|
| MD5 |
1a000642c8f79057342575d36054c7e5
|
|
| BLAKE2b-256 |
a2ac4069d8748dd513d21acbe654ab259ff09c91510228ad7c1203eccfadffc6
|