RAG provenance, write gating, and drift detection for AI agents — EU AI Act compliance infrastructure

These details have not been verified by PyPI

Project links

Project description

air-rag-trust

EU AI Act compliance infrastructure for RAG knowledge bases: Document provenance tracking, write gating, and drift detection that makes your retrieval-augmented generation pipeline compliant with Articles 10, 11, 12, and 15.

Part of the AIR Blackbox ecosystem.

EU AI Act high-risk obligations take effect December 2, 2027 (deferred from August 2026 by the EU's 2026 omnibus agreement). Regulators expect preparation to be underway now. If your agents retrieve documents and take actions based on them, you need provenance and integrity controls. See full compliance mapping →

The Problem

RAG poisoning is a persistence attack: an attacker injects malicious documents into your knowledge base, and every future query returns attacker-controlled content. Unlike prompt injection (which is transient), RAG poisoning persists until the document is removed.

air-rag-trust adds three layers of defense:

Provenance Tracking: Every document gets a SHA-256 hash, source attribution, trust classification, and tamper-evident audit chain (HMAC-SHA256).
Write Gating: Policy-based controls on who/what can add or modify documents. Source allowlists, content pattern blocking, rate limits, and actor permissions.
Drift Detection: Monitors retrieval patterns and alerts on anomalies: new sources, trust level shifts, volume spikes, document dominance, and new document bursts.

Quick Start

pip install air-rag-trust

from air_rag_trust import AirRagTrust, WritePolicy, TrustLevel

# Create trust layer with write policy
trust = AirRagTrust(
    write_policy=WritePolicy(
        allowed_sources=["internal://*", "https://docs.company.com/*"],
        blocked_content_patterns=[r"ignore previous instructions", r"system prompt"],
    )
)

# Gate-check and register a document
result = trust.ingest(
    "Our refund policy allows returns within 30 days...",
    source="internal://policies/refund.md",
    actor="data-pipeline",
)

if result["allowed"]:
    print(f"Document {result['doc_id']} registered (trust: {result['trust_level']})")

# Record retrievals for drift monitoring
trust.record_retrieval(
    query="What is our refund policy?",
    doc_ids=[result["doc_id"]],
    sources=["internal://policies/refund.md"],
    trust_levels=["standard"],
)

# Check for anomalies
alerts = trust.check_drift()
for alert in alerts:
    print(f"[{alert.severity.value}] {alert.message}")

# Verify audit chain integrity
assert trust.verify_chain()

# Export compliance evidence
evidence = trust.export_evidence()

CLI

Scan a knowledge base directory for provenance auditing:

# Audit current directory
air-rag-trust .

# Audit specific path with verbose output
air-rag-trust /path/to/knowledge-base --verbose

# JSON output for CI pipelines
air-rag-trust /path/to/kb --json

# Custom file extensions
air-rag-trust /path/to/kb -e .md .txt .pdf

Write Gate Policies

Control what enters your knowledge base:

from air_rag_trust import WritePolicy, TrustLevel

policy = WritePolicy(
    # Source controls
    allowed_sources=["internal://*", "https://docs.company.com/*"],
    blocked_sources=["https://untrusted.com/*"],
    require_source=True,

    # Actor controls
    allowed_actors=["data-pipeline", "admin", "content-team"],

    # Content controls
    max_document_size_bytes=1_000_000,
    blocked_content_patterns=[
        r"ignore previous instructions",
        r"you are now",
        r"system prompt",
        r"<script>",
    ],

    # Rate controls
    max_writes_per_minute=60,
    max_bulk_import_size=100,

    # Trust
    min_trust_for_auto_add=TrustLevel.TRUSTED,
    require_approval_for_untrusted=True,
)

Drift Detection

Monitor retrieval patterns for signs of knowledge base compromise:

from air_rag_trust import DriftConfig

config = DriftConfig(
    baseline_window_size=100,           # Retrievals for baseline
    detection_window_size=20,           # Recent window to compare
    new_source_alert=True,              # Alert on new sources
    untrusted_ratio_threshold=0.3,      # Alert if >30% untrusted
    volume_spike_multiplier=3.0,        # Alert on 3x volume
    single_doc_dominance_threshold=0.5, # Alert if one doc >50%
    new_doc_burst_threshold=5,          # Alert on burst of new docs
)

Alert types:

Alert	Severity	Indicates
`new_source`	Warning	Previously unseen source in retrievals
`trust_shift`	Critical	Untrusted content ratio exceeds threshold
`volume_spike`	Warning	Abnormal retrieval volume
`doc_dominance`	Warning	Single document dominates retrievals
`new_doc_burst`	Critical	Many new documents appear suddenly

EU AI Act Compliance Coverage

Article	Requirement	air-rag-trust Feature
Art. 10: Data Governance	Data quality and governance practices	Write gating, source allowlists, content validation
Art. 11: Technical Documentation	System documentation and audit trails	Provenance records, tamper-evident chain
Art. 12: Record-Keeping	Automatic event logging	HMAC-SHA256 audit chain, write event history
Art. 15: Robustness	Resilience against manipulation	Drift detection, pattern blocking, quarantine

API Reference

# Unified plugin
trust = AirRagTrust(write_policy=..., drift_config=...)
trust.ingest(content, source, actor)     # Gate-check + register
trust.ingest_approved(content, source)   # Bypass gate (pre-approved)
trust.bulk_ingest(documents, actor)      # Batch ingestion
trust.record_retrieval(query, doc_ids, sources, trust_levels)
trust.check_drift()                      # Manual drift check
trust.quarantine(doc_id, reason)         # Exclude from retrieval
trust.get_retrievable_ids()              # Safe doc IDs
trust.verify_chain()                     # Audit chain integrity
trust.get_stats()                        # Combined statistics
trust.export_evidence()                  # Compliance evidence bundle
trust.on_alert(callback)                 # Register alert handler

# Individual components also available
from air_rag_trust import ProvenanceTracker, WriteGate, DriftDetector

AIR Blackbox Ecosystem

Package	Framework	Install
air-langchain-trust	LangChain / LangGraph	`pip install air-langchain-trust`
air-crewai-trust	CrewAI	`pip install air-crewai-trust`
air-openai-agents-trust	OpenAI Agents SDK	`pip install air-openai-agents-trust`
air-autogen-trust	AutoGen / AG2	`pip install air-autogen-trust`
openclaw-air-trust	TypeScript / Node.js	`npm install openclaw-air-trust`
air-rag-trust	RAG Knowledge Bases (this repo)	`pip install air-rag-trust`
air-compliance	Compliance Scanner	`pip install air-compliance`
Gateway	Any HTTP agent	`docker pull ghcr.io/airblackbox/gateway:main`

Development

git clone https://github.com/airblackbox/air-rag-trust.git
cd air-rag-trust
pip install -e ".[dev]"
pytest tests/ -v

License

Apache-2.0

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.1

Jun 10, 2026

0.1.0

Feb 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

air_rag_trust-0.1.1.tar.gz (27.3 kB view details)

Uploaded Jun 10, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

air_rag_trust-0.1.1-py3-none-any.whl (23.6 kB view details)

Uploaded Jun 10, 2026 Python 3

File details

Details for the file air_rag_trust-0.1.1.tar.gz.

File metadata

Download URL: air_rag_trust-0.1.1.tar.gz
Upload date: Jun 10, 2026
Size: 27.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for air_rag_trust-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`69b1ed64b74b700f77fd47a3f9f92caff85d1f4efceebbb5da511943e7704d93`
MD5	`8094d522d3f1e39e00d0c22023de66e9`
BLAKE2b-256	`7b1ebea7d8450552b5c0e2e1a13a7a8431ac3b2b758ac673dd3f50ffe7a06efd`

See more details on using hashes here.

File details

Details for the file air_rag_trust-0.1.1-py3-none-any.whl.

File metadata

Download URL: air_rag_trust-0.1.1-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 23.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for air_rag_trust-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5e4c584747d4efd5dd1859a96f32765bebf86059cfe55051e24ba0a11188a1a5`
MD5	`d5b1068cc170a13a2437292c1ef3ee47`
BLAKE2b-256	`41051d0bc26b3b83c5acf5ae78810dfda9bea2252b213f888bb2dd0590014eef`

See more details on using hashes here.

air-rag-trust 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

air-rag-trust

The Problem

Quick Start

CLI

Write Gate Policies

Drift Detection

EU AI Act Compliance Coverage

API Reference

AIR Blackbox Ecosystem

Development

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes