Permission-aware retrieval for RAG applications
Reason this release was yanked: no longer maintained
RAGGuard
The security layer your RAG application is missing.
┌──────────────────────────────────────────────────────────────────────────────┐
│                          BRING YOUR OWN PERMISSIONS                          │
├──────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  INLINE POLICIES    CUSTOM FILTERS     ACL DOCUMENTS     ENTERPRISE AUTH     │
│  ┌─────────────┐    ┌─────────────┐    ┌────────────┐    ┌─────────────┐     │
│  │ rules:      │    │ class My    │    │ {"acl": {  │    │ OPA         │     │
│  │  - allow:   │    │ Filter:     │    │  "users":  │    │ Cerbos      │     │
│  │      dept   │    │ def build   │    │   ["alice"]│    │ OpenFGA     │     │
│  │             │    │ ...         │    │ }}         │    │ Permit.io   │     │
│  └─────────────┘    └─────────────┘    └────────────┘    └─────────────┘     │
│     Code/YAML        Full Control      Explicit Lists    Policy Engines      │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘
The Problem: Most RAG systems retrieve documents first and filter by permissions afterwards. By then, unauthorized documents have already left the vector database and reached your application code. That's a data leak.
The Solution: RAGGuard filters during vector search, not after. Zero unauthorized exposure.
Works with any authorization system - use your existing permissions infrastructure (OPA, Cerbos, OpenFGA, custom RBAC, ACLs) or define policies inline. RAGGuard translates your authorization decisions into vector database filters.
┌─────────────────────────────────────────────────────────────────────────────┐
│    WITHOUT RAGGUARD                      WITH RAGGUARD                      │
├─────────────────────────────────────────────────────────────────────────────┤
│    Vector Search                         Vector Search                      │
│    Returns 10 docs ──────────┐           + Permission Filter                │
│    (includes unauthorized)   │           Returns 10 docs                    │
│           │                  │           (all authorized)                   │
│           ▼                  │                  │                           │
│    Filter in Python          │                  │                           │
│    Remove 7 docs             │                  │                           │
│           │                  │                  │                           │
│           ▼                  │                  ▼                           │
│    Return 3 docs             │           Return 10 docs                     │
│    ❌ Data leaked            │           ✅ Zero exposure                   │
│    ❌ Wrong count            │           ✅ Correct count                   │
└─────────────────────────────────────────────────────────────────────────────┘
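The difference between the two columns can be reproduced with a toy in-memory search. This is a minimal sketch and not RAGGuard code: the corpus, the two-dimensional embeddings, and the helper names (`post_filter_search`, `pre_filter_search`) are made up for illustration.

```python
import math

# Toy corpus: (doc_id, embedding, department). All values are illustrative.
DOCS = [
    ("fin-1", (1.0, 0.0), "finance"),
    ("fin-2", (0.9, 0.1), "finance"),
    ("eng-1", (0.8, 0.2), "engineering"),
    ("eng-2", (0.7, 0.3), "engineering"),
    ("eng-3", (0.6, 0.4), "engineering"),
]


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(query, docs, k):
    """Return the k docs most similar to the query."""
    return sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)[:k]


def post_filter_search(query, department, k):
    """Search first, filter afterwards: unauthorized docs reach the caller's process."""
    candidates = top_k(query, DOCS, k)
    return [d for d in candidates if d[2] == department]


def pre_filter_search(query, department, k):
    """Filter as part of the search: only authorized docs are ranked or returned."""
    allowed = [d for d in DOCS if d[2] == department]
    return top_k(query, allowed, k)


query = (0.0, 1.0)  # closest to the engineering docs
post = [d[0] for d in post_filter_search(query, "finance", k=2)]
pre = [d[0] for d in pre_filter_search(query, "finance", k=2)]
print(post)  # the two nearest docs are both engineering, so nothing survives
print(pre)   # two finance docs, as requested
```

In the post-filter path the finance user asked for two documents and got none, while two engineering documents passed through application memory on the way. The pre-filter path returns the requested count from authorized documents only.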
Quick Start
pip install "ragguard[chromadb]"
import chromadb
from ragguard import ChromaDBSecureRetriever, Policy

# 1. Your existing ChromaDB setup
client = chromadb.Client()
collection = client.create_collection("docs")
collection.add(
    ids=["1", "2", "3"],
    documents=["Finance Report", "Engineering Doc", "Public Blog"],
    metadatas=[
        {"department": "finance", "confidential": True},
        {"department": "engineering", "confidential": False},
        {"department": "public", "confidential": False},
    ],
)

# 2. Define access policy
policy = Policy.from_dict({
    "version": "1",
    "rules": [
        {"name": "same-dept", "allow": {"conditions": ["user.department == document.department"]}},
        {"name": "public", "match": {"confidential": False}, "allow": {"everyone": True}},
    ],
    "default": "deny",
})

# 3. Search with automatic permission filtering
retriever = ChromaDBSecureRetriever(collection=collection, policy=policy)
results = retriever.search(
    query="quarterly report",
    user={"id": "alice", "department": "finance"},
    limit=10,
)
# Alice sees finance docs plus every document with confidential=False
That's it. Documents are filtered at the database level. No post-filtering. No data leaks.
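To make "filtered at the database level" concrete, here is a minimal sketch of how the two rules above could map onto a ChromaDB-style `where` filter. The helper `policy_to_where` is hypothetical and is not RAGGuard's implementation; it handles only the two rule shapes used in the Quick Start and assumes that allow rules combine with OR.

```python
def policy_to_where(rules, user):
    """Translate a small subset of allow rules into a ChromaDB-style filter."""
    clauses = []
    for rule in rules:
        allow = rule.get("allow", {})
        # "user.X == document.Y" becomes an equality test on document field Y.
        for condition in allow.get("conditions", []):
            left, op, right = condition.split()
            if op != "==" or not left.startswith("user.") or not right.startswith("document."):
                raise ValueError(f"unsupported condition: {condition!r}")
            attribute = left[len("user."):]
            field = right[len("document."):]
            if attribute in user:
                clauses.append({field: {"$eq": user[attribute]}})
        # "everyone" rules apply to any document whose metadata matches.
        if allow.get("everyone"):
            matches = [{f: {"$eq": v}} for f, v in rule.get("match", {}).items()]
            if not matches:
                raise ValueError("unconditional allow is not handled in this sketch")
            clauses.append(matches[0] if len(matches) == 1 else {"$and": matches})
    if not clauses:
        return None  # nothing is allowed: the caller should skip the search
    return clauses[0] if len(clauses) == 1 else {"$or": clauses}


rules = [
    {"name": "same-dept", "allow": {"conditions": ["user.department == document.department"]}},
    {"name": "public", "match": {"confidential": False}, "allow": {"everyone": True}},
]
where = policy_to_where(rules, {"id": "alice", "department": "finance"})
print(where)
```

A filter of this shape can be passed as the `where` argument of ChromaDB's `collection.query`, so the database itself excludes documents that match neither clause.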
Bring Your Own Authorization
RAGGuard doesn't force you into a specific permissions model. Use what you already have:
Option 1: Inline Policies (shown above)
Define policies directly in code or YAML - great for getting started or simple use cases.
Option 2: Custom Filter Builders
Plug in any authorization logic with full control:
from ragguard.filters import CustomFilterBuilder

class MyAuthFilter(CustomFilterBuilder):
    def build_filter(self, policy, user, backend):
        # Query your auth system, check ACLs, call APIs - whatever you need.
        # my_auth_service is a placeholder for your own authorization client.
        allowed_docs = my_auth_service.get_accessible_docs(user["id"])
        return {"doc_id": {"$in": allowed_docs}}

retriever = ChromaDBSecureRetriever(
    collection=collection,
    policy=policy,
    custom_filter_builder=MyAuthFilter(),
)
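One detail worth handling in a builder like this is the user who may access nothing. Some backends reject or mishandle an empty `$in` list, so the sketch below falls back to a sentinel id. It is a standalone illustration: `StubAuthService`, `AllowListFilter`, and `NO_MATCH_ID` are hypothetical names, and the class mirrors the `build_filter` signature above without inheriting from RAGGuard's base class so that it runs on its own.

```python
NO_MATCH_ID = "__no_document__"  # assumed never to be used as a real doc_id


class StubAuthService:
    """Hypothetical stand-in for an external authorization client."""

    def __init__(self, grants):
        self._grants = grants

    def get_accessible_docs(self, user_id):
        return sorted(self._grants.get(user_id, ()))


class AllowListFilter:
    """Builds an id allow-list filter with the same build_filter signature."""

    def __init__(self, auth_service):
        self._auth = auth_service

    def build_filter(self, policy, user, backend):
        allowed = self._auth.get_accessible_docs(user["id"])
        if not allowed:
            # Deny everything by matching an id that no document carries,
            # instead of sending an empty $in list to the backend.
            return {"doc_id": {"$in": [NO_MATCH_ID]}}
        return {"doc_id": {"$in": allowed}}


builder = AllowListFilter(StubAuthService({"alice": {"3", "1"}}))
alice_filter = builder.build_filter(None, {"id": "alice"}, "chromadb")
bob_filter = builder.build_filter(None, {"id": "bob"}, "chromadb")
print(alice_filter)
print(bob_filter)
```

Returning an explicit match-nothing filter keeps the deny case inside the database query, which is consistent with the default-deny policy used in the Quick Start.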
Option 3: ACL-Based Documents
For documents with explicit access control lists:
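from ragguard import QdrantSecureRetriever  # assumed top-level export, like ChromaDBSecureRetriever
from ragguard.filters import ACLFilterBuilder

# Documents have: {"acl": {"users": ["alice"], "groups": ["eng"], "public": false}}
# fetch_groups_from_ldap is a placeholder for your own group lookup.
retriever = QdrantSecureRetriever(
    collection=collection,
    policy=policy,
    custom_filter_builder=ACLFilterBuilder(
        get_user_groups=lambda user: fetch_groups_from_ldap(user["id"])
    ),
)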
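The ACL shape in the comment above suggests a simple access rule: a document is visible if it is public, if the user is listed, or if the user belongs to a listed group. The function below is a sketch of that assumed semantics for reasoning and testing. `acl_allows` is a hypothetical helper, not part of RAGGuard, and the real `ACLFilterBuilder` may combine these fields differently.

```python
def acl_allows(acl, user_id, user_groups):
    """Return True if the ACL grants access under the assumed OR semantics."""
    if acl.get("public", False):
        return True
    if user_id in acl.get("users", []):
        return True
    # Any overlap between the user's groups and the document's groups is enough.
    return not set(user_groups).isdisjoint(acl.get("groups", []))


doc_acl = {"users": ["alice"], "groups": ["eng"], "public": False}

print(acl_allows(doc_acl, "alice", []))         # listed user
print(acl_allows(doc_acl, "bob", ["eng"]))      # member of a listed group
print(acl_allows(doc_acl, "carol", ["sales"]))  # neither listed nor in a group
```

A reference check like this is useful in tests: every document returned for a user should satisfy it, whichever backend produced the results.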
Option 4: Enterprise Authorization Systems
Connect to dedicated authorization services (available in ragguard-enterprise):
| System | Description |
|---|---|
| OPA | Open Policy Agent - policy as code |
| Cerbos | Access control for cloud-native apps |
| OpenFGA | Google Zanzibar-inspired fine-grained auth |
| Permit.io | Permissions as a service |
| Auth0/Okta | Identity provider integration |
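As an illustration of what connecting to such a service involves, the sketch below prepares a query against OPA's REST Data API, which accepts `POST /v1/data/<path>` with an `input` document and answers with a `result` field. This is not the ragguard-enterprise integration: the policy path `rag/allowed_departments`, the input layout, and both helper names are assumptions made for the example.

```python
import json
import urllib.request


def build_opa_request(base_url, policy_path, user, action="read"):
    """Build a POST request for OPA's Data API (the input layout is an assumption)."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": {"user": user, "action": action}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_opa_result(raw_body, default=None):
    """Extract the decision; OPA omits "result" when the rule is undefined."""
    return json.loads(raw_body).get("result", default)


request = build_opa_request(
    "http://localhost:8181",    # OPA's default listen port
    "rag/allowed_departments",  # hypothetical policy path
    {"id": "alice", "department": "finance"},
)
print(request.full_url)
print(parse_opa_result(b'{"result": ["finance", "public"]}', default=[]))
```

Sending the request with `urllib.request.urlopen(request, timeout=2)` and passing the response body to `parse_opa_result` yields a list that can feed a custom filter builder. Treating a missing `result` as an empty list keeps the behavior default-deny.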
Supported Backends
| Vector DBs | Graph DBs |
|---|---|
| Qdrant, ChromaDB, Pinecone, pgvector, Weaviate, Milvus, FAISS, Elasticsearch, OpenSearch, Azure AI Search | Neo4j, Neptune, TigerGraph, ArangoDB |
Integrations
LangChain • LlamaIndex • LangGraph • CrewAI • DSPy • AWS Bedrock
Documentation
| Guide | Description |
|---|---|
| Getting Started | Installation and basic setup |
| Policy Format | Policy syntax and operators |
| Backends | Database-specific examples |
| Integrations | LangChain, LlamaIndex, etc. |
| Production | Health checks, logging, async |
| Kubernetes | K8s deployment guide |
| Security | Security testing & guarantees |
| Use Cases | Multi-tenant, healthcare, etc. |
| FAQ | Common questions & limitations |
Installation
# With a specific backend (quotes keep shells such as zsh from expanding the brackets)
pip install "ragguard[qdrant]"
pip install "ragguard[chromadb]"
pip install "ragguard[pgvector]"
pip install "ragguard[pinecone]"
# With framework integration
pip install "ragguard[langchain]"
pip install "ragguard[llamaindex]"
# Everything
pip install "ragguard[all]"
Python Compatibility: Fully tested on Python 3.9-3.13. Python 3.14 has limited support because upstream dependencies (chromadb, langchain) do not yet support it.
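A deployment script can check the running interpreter against this range before installing. The helper below is a small sketch, and the label `"untested"` for versions outside the ranges named above is an assumption, since the README says nothing about them.

```python
import sys

FULLY_TESTED = ((3, 9), (3, 13))  # inclusive range stated above


def support_level(version_info=sys.version_info):
    """Classify a Python version against the compatibility note."""
    major_minor = (version_info[0], version_info[1])
    low, high = FULLY_TESTED
    if low <= major_minor <= high:
        return "full"
    if major_minor == (3, 14):
        return "limited"
    return "untested"  # not covered by the compatibility note


print(support_level())
```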
Why RAGGuard?
| Challenge | Without RAGGuard | With RAGGuard |
|---|---|---|
| Data leaks | Filter after retrieval = data exposed | Filter during search = zero exposure |
| Authorization | Rebuild permission logic for RAG | Plug in your existing auth system |
| Multi-database | Custom filter code per DB | One integration, 14 databases |
| Setup time | Days/weeks | 5 minutes |
| Security testing | DIY | Comprehensive test suite |
License
Apache-2.0 - See LICENSE for details.
File details
Details for the file ragguard-0.3.1.tar.gz.
File metadata
- Download URL: ragguard-0.3.1.tar.gz
- Upload date:
- Size: 457.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f87c298092e10946c881693e116e76fc7ed5fecec4bdc1074eac7b35cb2fcab9 |
| MD5 | 7b4b514c7c772405ec53ad597a27a7f3 |
| BLAKE2b-256 | 45f24c9eff0883d15666d7b2fa944f8da5be8a6a95d69af01af1ac8c49a99c19 |
File details
Details for the file ragguard-0.3.1-py3-none-any.whl.
File metadata
- Download URL: ragguard-0.3.1-py3-none-any.whl
- Upload date:
- Size: 300.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c3355c03f79349af33d7e886d4a21a967eed533b8abc9d98796e291bd55155e4 |
| MD5 | f9a9238b9db35e634bdb2991f6759b51 |
| BLAKE2b-256 | 940a5f6b2c7f90765ee48cb62b132febcfe326d3a136d44e8c848b8b830039c9 |