Paragraph-level compliance audit trails for RAG pipelines
TrailRAG
Compliance audit layer for RAG pipelines built on rag_control. In industries regulated under the EU AI Act, GDPR, or HIPAA, regulators need more than a response: they need to know which document, which page, and which model version produced it. TrailRAG intercepts every retrieval call, traces each chunk to its source, and writes an immutable audit record before returning the result.
The Problem
"The AI referenced this document" is not sufficient. Regulators need: the AI extracted $2.5M from page 7, paragraph 3 of insurance_policy_v2.pdf, version 2026-01-15, at 04:32:59 UTC, by user analyst_001.
Starting August 2026, EU AI Act enforcement makes this non-negotiable:
- Art. 30 — High-risk AI systems must log every inference: input, output, retrieved sources, model identity, timestamp, and the user who triggered it.
- GDPR Art. 30 — Records of processing activities must identify the data subject, legal basis, and retention period for every automated decision.
- HIPAA §164.312(b) — Audit controls must record activity in systems that touch PHI, including AI-generated responses citing medical records.
Quick Start
pip install trailrag
from trailrag import TrailRAGEngine

engine = TrailRAGEngine(
    base_engine=your_rag_control_engine,
    jurisdiction="GDPR",                    # sets retention period automatically
    db_url="postgresql://user:pw@host/db",  # or "sqlite:///audit.db" for dev
)

result = engine.run(query="What is the fire coverage limit?", user_context=ctx)
print(result.audit_id)       # 01KNZZDS0161Z4VEWVVV0REP0A
print(result.chunk_records)  # [{chunk_id, doc_id, page_number, score, ...}, ...]
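The wrapping pattern behind the Quick Start can be sketched roughly as follows. This is not TrailRAG's implementation: `FakeBaseEngine`, `AuditedEngine`, `AuditedResult`, and the in-memory `log` list are hypothetical stand-ins, and the sketch uses `uuid4` for IDs rather than whatever scheme the package uses.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


class FakeBaseEngine:
    """Hypothetical stand-in for a rag_control engine."""

    def run(self, query: str) -> dict:
        return {
            "answer": "The fire coverage limit is $2.5M.",
            "chunks": [
                {"doc_id": "insurance_policy_v2.pdf", "page_number": 7, "score": 0.91}
            ],
        }


@dataclass
class AuditedResult:
    answer: str
    audit_id: str
    chunk_records: list


@dataclass
class AuditedEngine:
    """Wraps a base engine and logs every call before returning."""

    base_engine: FakeBaseEngine
    log: list = field(default_factory=list)  # stand-in for the audit store

    def run(self, query: str, user_id: str) -> AuditedResult:
        raw = self.base_engine.run(query)
        record = {
            "audit_id": uuid.uuid4().hex,  # assumption: any unique ID works here
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "user_id": user_id,
            "query_text": query,
            "retrieved_chunks": raw["chunks"],
        }
        # Write the audit record first; only then hand the result back.
        self.log.append(record)
        return AuditedResult(raw["answer"], record["audit_id"], raw["chunks"])


engine = AuditedEngine(base_engine=FakeBaseEngine())
result = engine.run("What is the fire coverage limit?", user_id="analyst_001")
print(result.audit_id, len(engine.log))
```

Writing the record before returning means a caller never sees an answer that has no matching audit entry.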
What Gets Logged
{
  "audit_id": "01KNZZDS0161Z4VEWVVV0REP0A",
  "timestamp_utc": "2026-04-12T04:32:59Z",
  "user_id": "analyst_001",
  "jurisdiction": "GDPR",
  "retention_until": "2026-10-09T04:32:59Z",
  "model_name": "gpt-4o",
  "prompt_hash": "880d6f35...",
  "query_text": "What is the fire coverage limit?",
  "retrieved_chunks": [
    {
      "doc_id": "insurance_policy_v2.pdf",
      "doc_version": "2026-01-15",
      "page_number": 7,
      "similarity_score": 0.91,
      "retrieval_rank": 1
    }
  ],
  "response_hash": "7eb675c8...",
  "total_tokens": 252,
  "retrieval_latency_ms": 13.4,
  "total_latency_ms": 161.2
}
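A logged digest lets an auditor later confirm that a prompt or response matches what was recorded. The following is a minimal sketch, assuming SHA-256 over the UTF-8 text; this README does not state which algorithm or encoding TrailRAG uses for `prompt_hash` and `response_hash`, and `content_hash` and `matches_logged` are hypothetical helpers.

```python
import hashlib
import hmac


def content_hash(text: str) -> str:
    """Return a hex SHA-256 digest of the UTF-8 encoded text (assumed scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_logged(text: str, logged_hash: str) -> bool:
    """Compare a candidate text against a logged digest in constant time."""
    return hmac.compare_digest(content_hash(text), logged_hash)


logged = content_hash("What is the fire coverage limit?")
print(len(logged))                                                  # 64
print(matches_logged("What is the fire coverage limit?", logged))   # True
print(matches_logged("What is the flood coverage limit?", logged))  # False
```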
Audit records are append-only. The only deletion path is purge_expired(), which removes records past their retention_until. store.delete() always raises ComplianceViolationError.
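The append-only behaviour described above might look like this in miniature. `InMemoryAuditStore` and its `append` method are hypothetical; only the names `purge_expired`, `delete`, and `ComplianceViolationError` come from this README, and their real signatures may differ.

```python
from datetime import datetime, timezone


class ComplianceViolationError(Exception):
    """Raised when a caller tries to remove a record outside the purge path."""


class InMemoryAuditStore:
    """Hypothetical append-only store; not TrailRAG's actual class."""

    def __init__(self) -> None:
        self._records = {}

    def append(self, record: dict) -> None:
        if record["audit_id"] in self._records:
            raise ComplianceViolationError("audit records cannot be overwritten")
        self._records[record["audit_id"]] = dict(record)

    def delete(self, audit_id: str) -> None:
        # Direct deletion is never allowed, whatever the record's age.
        raise ComplianceViolationError("direct deletion is disabled; use purge_expired()")

    def purge_expired(self, now: datetime, dry_run: bool = False) -> int:
        """Remove records whose retention_until has passed; return the count."""
        expired = [k for k, r in self._records.items() if r["retention_until"] <= now]
        if not dry_run:
            for key in expired:
                del self._records[key]
        return len(expired)

    def __len__(self) -> int:
        return len(self._records)


store = InMemoryAuditStore()
store.append({"audit_id": "a1", "retention_until": datetime(2026, 10, 9, tzinfo=timezone.utc)})
store.append({"audit_id": "a2", "retention_until": datetime(2036, 4, 12, tzinfo=timezone.utc)})

try:
    store.delete("a1")
except ComplianceViolationError as exc:
    print("blocked:", exc)

purged = store.purge_expired(now=datetime(2027, 1, 1, tzinfo=timezone.utc))
print(purged, len(store))  # 1 1
```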
Compliance Coverage
| Regulation | Key Requirement | TrailRAG Field |
|---|---|---|
| EU AI Act Art. 30 | Decision traceability per inference | retrieved_chunks → page_number, doc_id, similarity_score |
| GDPR Art. 30 | Records of processing activities | user_id, timestamp_utc, retention_until |
| HIPAA §164.312(b) | Audit controls for PHI systems | audit_id + append-only store |
| Basel III | Credit risk model documentation | model_version, prompt_hash, temperature |
Retention is set automatically: GDPR → 180 days, HIPAA → 6 years, EU AI Act → 10 years, SOC 2 → 1 year.
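As a worked example of the retention rule, the sketch below derives `retention_until` from a timestamp and a jurisdiction. The periods are the ones listed above; the keys `"EU_AI_ACT"` and `"SOC2"`, the helper names, and the choice to count years as calendar years are assumptions, since this README only shows `"GDPR"` as a jurisdiction value.

```python
from datetime import datetime, timedelta, timezone

# Retention periods as stated in this README; key spellings are assumed.
RETENTION = {
    "GDPR": ("days", 180),
    "HIPAA": ("years", 6),
    "EU_AI_ACT": ("years", 10),
    "SOC2": ("years", 1),
}


def add_years(moment: datetime, years: int) -> datetime:
    """Add calendar years, mapping Feb 29 to Feb 28 in non-leap target years."""
    try:
        return moment.replace(year=moment.year + years)
    except ValueError:
        return moment.replace(year=moment.year + years, day=28)


def retention_until(timestamp_utc: datetime, jurisdiction: str) -> datetime:
    unit, amount = RETENTION[jurisdiction]
    if unit == "days":
        return timestamp_utc + timedelta(days=amount)
    return add_years(timestamp_utc, amount)


logged_at = datetime(2026, 4, 12, 4, 32, 59, tzinfo=timezone.utc)
print(retention_until(logged_at, "GDPR").isoformat())   # 2026-10-09T04:32:59+00:00
print(retention_until(logged_at, "HIPAA").isoformat())  # 2032-04-12T04:32:59+00:00
```

The GDPR result agrees with the sample record: 2026-04-12 plus 180 days is 2026-10-09.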
GDPR Subject Access Requests (Art. 15)
records = store.get_by_user("analyst_001", from_date=start, to_date=end)
store.export_json(audit_id, "/tmp/record_for_dpa.json")
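The two calls above filter by user and date range, then export one record. A self-contained sketch of that logic over an in-memory list is shown here; the free functions `get_by_user` and `export_json`, the sample records, and the use of the system temp directory are illustrative assumptions and do not reflect the real store's internals.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical sample records, trimmed to the fields the filter needs.
RECORDS = [
    {"audit_id": "a1", "user_id": "analyst_001", "timestamp_utc": "2026-04-12T04:32:59Z"},
    {"audit_id": "a2", "user_id": "analyst_002", "timestamp_utc": "2026-04-13T09:00:00Z"},
    {"audit_id": "a3", "user_id": "analyst_001", "timestamp_utc": "2026-06-01T12:00:00Z"},
]


def parse_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware datetime."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def get_by_user(records, user_id, from_date, to_date):
    """Return the user's records whose timestamp falls inside the closed range."""
    return [
        r for r in records
        if r["user_id"] == user_id
        and from_date <= parse_utc(r["timestamp_utc"]) <= to_date
    ]


def export_json(record: dict, path: Path) -> None:
    """Write one record as stable, human-readable JSON."""
    path.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")


start = datetime(2026, 4, 1, tzinfo=timezone.utc)
end = datetime(2026, 4, 30, tzinfo=timezone.utc)
found = get_by_user(RECORDS, "analyst_001", start, end)

out_path = Path(tempfile.gettempdir()) / "record_for_dpa.json"
export_json(found[0], out_path)
print([r["audit_id"] for r in found])  # ['a1']
```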
Self-hosting (Docker)
Docker is the fastest way to run the compliance dashboard on your own infrastructure.
Prerequisites
- Docker ≥ 24 and Docker Compose ≥ 2
1 — Clone and configure
git clone https://github.com/LeQuocAnh123/RAG_control.git
cd RAG_control
Create a .env file in the project root (never commit this file):
# Required — set a strong random value; used for POST /api/purge
TRAILRAG_API_KEY=change-me-to-a-long-random-secret
# Optional — defaults shown below
TRAILRAG_DB_URL=sqlite:////data/audit.db
TRAILRAG_CORS_ORIGINS=*
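One way to produce a strong random value for TRAILRAG_API_KEY is Python's standard `secrets` module. This is a suggestion only; TrailRAG does not appear to require any particular key format.

```python
import secrets

# 32 random bytes, URL-safe base64 encoded without padding.
api_key = secrets.token_urlsafe(32)
print(f"TRAILRAG_API_KEY={api_key}")
```

Paste the printed line into your .env file in place of the placeholder.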
2 — Build and start
docker compose up -d --build
3 — Open the dashboard
http://localhost:8080/dashboard
The API docs (OpenAPI / Swagger) are at http://localhost:8080/api/docs.
4 — Verify health
curl http://localhost:8080/health
# {"status": "ok", "total_records": 0}
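The same check can be scripted, for example in a deployment smoke test. This sketch assumes the response shape shown in the curl output above; `parse_health` and `check_health` are hypothetical helper names.

```python
import json
import urllib.request


def parse_health(body: bytes) -> int:
    """Return total_records if the payload reports status "ok", else raise."""
    payload = json.loads(body)
    if payload.get("status") != "ok":
        raise RuntimeError(f"unhealthy response: {payload!r}")
    return int(payload["total_records"])


def check_health(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> int:
    """Call GET /health on a running instance and validate the reply."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return parse_health(resp.read())


# Parsing the sample response shown above, without touching the network:
print(parse_health(b'{"status": "ok", "total_records": 0}'))  # 0
```

With the container running, `check_health()` performs the live request.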
Persistent storage
Audit records are stored in a Docker named volume (trailrag_data) mounted at
/data inside the container. Data survives container restarts and upgrades.
To back up:
docker run --rm -v trailrag_data:/data -v "$(pwd)":/backup busybox \
  tar czf "/backup/trailrag_backup_$(date +%Y%m%d).tar.gz" /data
Switching to PostgreSQL
Set TRAILRAG_DB_URL in your .env:
TRAILRAG_DB_URL=postgresql://user:password@postgres_host:5432/trailrag
TrailRAG will create the audit_log table automatically on first start.
Purging expired records
# Dry run — count eligible records
curl -X POST "http://localhost:8080/api/purge?dry_run=true" \
-H "X-TrailRAG-Key: $TRAILRAG_API_KEY"
# Execute purge
curl -X POST "http://localhost:8080/api/purge" \
-H "X-TrailRAG-Key: $TRAILRAG_API_KEY"
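For scheduled purges from Python, the curl calls above can be expressed with the standard library. The endpoint, query parameter, and header name are taken from those commands; `build_purge_request` is a hypothetical helper, and the response format is not documented here, so the sketch stops at building the request.

```python
import os
import urllib.parse
import urllib.request


def build_purge_request(base_url: str, api_key: str, dry_run: bool = False):
    """Build the POST /api/purge request, optionally as a dry run."""
    url = f"{base_url}/api/purge"
    if dry_run:
        url += "?" + urllib.parse.urlencode({"dry_run": "true"})
    return urllib.request.Request(
        url, method="POST", headers={"X-TrailRAG-Key": api_key}
    )


api_key = os.environ.get("TRAILRAG_API_KEY", "change-me")  # placeholder fallback
req = build_purge_request("http://localhost:8080", api_key, dry_run=True)
print(req.get_method(), req.full_url)
# POST http://localhost:8080/api/purge?dry_run=true

# To send it against a running instance:
# urllib.request.urlopen(req, timeout=10)
```

Running the dry run first and reviewing the count before the real purge mirrors the two-step order of the curl commands.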
Built On
- rag_control — runtime governance, RBAC, policy enforcement
- SQLAlchemy 2.0 — SQLite (dev) / PostgreSQL (production)
- Pydantic v2 — immutable, validated audit schema
License
Apache 2.0
File details
Details for the file trailrag-0.1.0.tar.gz.
File metadata
- Download URL: trailrag-0.1.0.tar.gz
- Upload date:
- Size: 91.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7820685bbe6a969ba886de474299fc13f1b86077da5ad12914ac2342a615ec47 |
| MD5 | a40dad87f03ab46adcfdee3442e0088b |
| BLAKE2b-256 | 51d376c2624c73819ffbd0f541a208fbafba870a9963f3b6a7662d577031181f |
File details
Details for the file trailrag-0.1.0-py3-none-any.whl.
File metadata
- Download URL: trailrag-0.1.0-py3-none-any.whl
- Upload date:
- Size: 63.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5052008d2000f6606d6e72bf07f146700bccd81469a2820ff98c31dad2d4cec0 |
| MD5 | 723131ed15746503119224062d1aa774 |
| BLAKE2b-256 | bfbfb73d24be67fe10b44cd122693b8ba7aad32c0a710972c0b7c72561d8dadb |