Permission-aware retrieval for RAG applications

Reason this release was yanked: no longer maintained

RAGGuard

The security layer your RAG application is missing.

The Problem: Most RAG systems retrieve documents first and filter by permissions afterwards. By then, unauthorized documents have already left the vector store and reached your application code. That's a data leak, and it also leaves you with fewer results than you asked for.

The Solution: RAGGuard filters during vector search, not after. Zero unauthorized exposure.

Works with any authorization system - use your existing permissions infrastructure (OPA, Cerbos, OpenFGA, custom RBAC, ACLs) or define policies inline. RAGGuard translates your authorization decisions into vector database filters.

┌─────────────────────────────────────────────────────────────────────────────┐
│   WITHOUT RAGGUARD                      WITH RAGGUARD                       │
├─────────────────────────────────────────────────────────────────────────────┤
│   Vector Search                         Vector Search                       │
│   Returns 10 docs ──────────┐           + Permission Filter                 │
│   (includes unauthorized)   │           Returns 10 docs                     │
│             │               │           (all authorized)                    │
│             ▼               │                  │                            │
│   Filter in Python          │                  │                            │
│   Remove 7 docs             │                  │                            │
│             │               │                  │                            │
│             ▼               │                  ▼                            │
│   Return 3 docs             │           Return 10 docs                      │
│   ❌ Data leaked            │           ✅ Zero exposure                    │
│   ❌ Wrong count            │           ✅ Correct count                    │
└─────────────────────────────────────────────────────────────────────────────┘
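The difference in the diagram can be reproduced with a toy brute-force search. This is a minimal sketch, not RAGGuard code: the corpus, the `search` helper, and the `is_finance` check are all made up for illustration, and a real vector database applies the filter inside its own index rather than in Python.

```python
import math

# Toy corpus: (doc_id, embedding, department). All values are invented.
DOCS = [
    ("fin-1", (0.9, 0.1), "finance"),
    ("fin-2", (0.8, 0.2), "finance"),
    ("eng-1", (0.95, 0.05), "engineering"),
    ("eng-2", (0.85, 0.15), "engineering"),
    ("eng-3", (0.7, 0.3), "engineering"),
]


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def search(query, k, allowed=None):
    """Brute-force top-k. When `allowed` is given, it is applied before ranking."""
    candidates = [d for d in DOCS if allowed is None or allowed(d)]
    ranked = sorted(candidates, key=lambda d: cosine(query, d[1]), reverse=True)
    return ranked[:k]


def is_finance(doc):
    """Hypothetical permission check: the user may only read finance docs."""
    return doc[2] == "finance"


query = (1.0, 0.0)

# Post-filtering: unauthorized docs enter the result set, then get dropped.
post = [d for d in search(query, k=2) if is_finance(d)]

# Filtering during search: only authorized docs are ever ranked.
pre = search(query, k=2, allowed=is_finance)

print([d[0] for d in post])  # ['fin-1']
print([d[0] for d in pre])   # ['fin-1', 'fin-2']
```

With post-filtering, the caller asked for two documents and got one, and `eng-1` was handed to application code before being discarded. Filtering first returns the full two authorized results.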

Quick Start

pip install ragguard[chromadb]

import chromadb
from ragguard import ChromaDBSecureRetriever, Policy

# 1. Your existing ChromaDB setup
client = chromadb.Client()
collection = client.create_collection("docs")
collection.add(
    ids=["1", "2", "3"],
    documents=["Finance Report", "Engineering Doc", "Public Blog"],
    metadatas=[
        {"department": "finance", "confidential": True},
        {"department": "engineering", "confidential": False},
        {"department": "public", "confidential": False}
    ]
)

# 2. Define access policy
policy = Policy.from_dict({
    "version": "1",
    "rules": [
        {"name": "same-dept", "allow": {"conditions": ["user.department == document.department"]}},
        {"name": "public", "match": {"confidential": False}, "allow": {"everyone": True}}
    ],
    "default": "deny"
})

# 3. Search with automatic permission filtering
retriever = ChromaDBSecureRetriever(collection=collection, policy=policy)

results = retriever.search(
    query="quarterly report",
    user={"id": "alice", "department": "finance"},
    limit=10
)
# Alice sees finance docs plus any non-confidential doc (matched by the "public" rule)

That's it. Documents are filtered at the database level. No post-filtering. No data leaks.
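To make "filtered at the database level" concrete, here is a hand-written sketch of the kind of ChromaDB-style `where` filter the two Quick Start rules could correspond to for a given user. The function names `to_chroma_where` and `matches` are hypothetical, and the filter RAGGuard actually generates may differ; the small evaluator exists only so the result can be checked without a database.

```python
def to_chroma_where(user):
    """Hand-written equivalent of the two Quick Start rules for one user."""
    return {
        "$or": [
            # Rule "same-dept": user.department == document.department
            {"department": {"$eq": user["department"]}},
            # Rule "public": any document with confidential == False
            {"confidential": {"$eq": False}},
        ]
    }


def matches(where, metadata):
    """Tiny evaluator for the $or / $eq subset used above."""
    if "$or" in where:
        return any(matches(clause, metadata) for clause in where["$or"])
    field, condition = next(iter(where.items()))
    return metadata.get(field) == condition["$eq"]


# Same metadata as the Quick Start collection.
docs = {
    "Finance Report": {"department": "finance", "confidential": True},
    "Engineering Doc": {"department": "engineering", "confidential": False},
    "Public Blog": {"department": "public", "confidential": False},
}

where = to_chroma_where({"id": "bob", "department": "engineering"})
visible = [name for name, meta in docs.items() if matches(where, meta)]
print(visible)  # ['Engineering Doc', 'Public Blog']
```

Because the condition is part of the query itself, the confidential finance report is never a candidate for Bob, rather than being fetched and then discarded.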

Bring Your Own Authorization

RAGGuard doesn't force you into a specific permissions model. Use what you already have:

Option 1: Inline Policies (shown above)

Define policies directly in code or YAML - great for getting started or simple use cases.

Option 2: Custom Filter Builders

Plug in any authorization logic with full control:

from ragguard.filters import CustomFilterBuilder

class MyAuthFilter(CustomFilterBuilder):
    def build_filter(self, policy, user, backend):
        # Query your auth system, check ACLs, call APIs - whatever you need.
        # my_auth_service is a placeholder for your own authorization client.
        allowed_docs = my_auth_service.get_accessible_docs(user["id"])
        return {"doc_id": {"$in": allowed_docs}}

retriever = ChromaDBSecureRetriever(
    collection=collection,
    policy=policy,
    custom_filter_builder=MyAuthFilter()
)
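One design point worth deciding up front is what the builder returns when the auth lookup fails or yields nothing. The sketch below fails closed. It is a standalone stand-in that only mirrors the `build_filter(policy, user, backend)` shape shown above, so it runs without RAGGuard installed. The class name, the `lookup` callable, and the `NO_MATCH` sentinel are all assumptions made for illustration.

```python
class FailClosedFilter:
    """Stand-in with the same build_filter(policy, user, backend) shape as above."""

    # Hypothetical sentinel: an ID that no real document is expected to carry.
    NO_MATCH = "__no_accessible_documents__"

    def __init__(self, lookup):
        # lookup: callable taking a user ID and returning an iterable of doc IDs
        self._lookup = lookup

    def build_filter(self, policy, user, backend):
        try:
            allowed = list(self._lookup(user["id"]))
        except Exception:
            # Auth service unreachable or user malformed: treat as "no access".
            allowed = []
        # Never emit an empty $in list; match the sentinel instead.
        return {"doc_id": {"$in": allowed or [self.NO_MATCH]}}


def fake_lookup(user_id):
    """Invented data standing in for a real authorization service."""
    return {"alice": ["doc-1", "doc-2"]}.get(user_id, [])


builder = FailClosedFilter(fake_lookup)
print(builder.build_filter(None, {"id": "alice"}, "chromadb"))
print(builder.build_filter(None, {"id": "mallory"}, "chromadb"))
```

Substituting a sentinel for an empty list avoids relying on how a particular vector store treats an empty `$in`, which some stores may reject outright.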

Option 3: ACL-Based Documents

For documents with explicit access control lists:

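# Assumption: QdrantSecureRetriever is exported from the top-level package,
# like ChromaDBSecureRetriever in the Quick Start.
from ragguard import QdrantSecureRetriever
from ragguard.filters import ACLFilterBuilder

# Documents have: {"acl": {"users": ["alice"], "groups": ["eng"], "public": false}}
# fetch_groups_from_ldap is a placeholder for your own group lookup.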
retriever = QdrantSecureRetriever(
    collection=collection,
    policy=policy,
    custom_filter_builder=ACLFilterBuilder(
        get_user_groups=lambda user: fetch_groups_from_ldap(user["id"])
    )
)
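The access rule implied by that ACL shape can be written down in a few lines. The sketch below is an assumption about the intended semantics (public, or listed user, or any shared group), not a description of what `ACLFilterBuilder` does internally. The second function expresses the same rule as a Qdrant-style `should` filter in plain dict form; both function names are hypothetical.

```python
def acl_allows(acl, user_id, user_groups):
    """Return True if the ACL grants this user access."""
    if acl.get("public", False):
        return True
    if user_id in acl.get("users", []):
        return True
    # Any overlap between the user's groups and the ACL's groups grants access.
    return bool(set(user_groups) & set(acl.get("groups", [])))


def acl_to_qdrant_filter(user_id, user_groups):
    """Express the same rule as a Qdrant-style `should` (logical OR) filter."""
    conditions = [
        {"key": "acl.public", "match": {"value": True}},
        {"key": "acl.users", "match": {"value": user_id}},
    ]
    if user_groups:
        conditions.append({"key": "acl.groups", "match": {"any": list(user_groups)}})
    return {"should": conditions}


acl = {"users": ["alice"], "groups": ["eng"], "public": False}
print(acl_allows(acl, "alice", []))         # True: listed user
print(acl_allows(acl, "bob", ["eng"]))      # True: shared group
print(acl_allows(acl, "carol", ["sales"]))  # False
print(acl_to_qdrant_filter("bob", ["eng"]))
```

The group condition is skipped when the user has no groups, so the filter never contains an empty match list.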

Option 4: Enterprise Authorization Systems

Connect to dedicated authorization services (available in ragguard-enterprise):

System      Description
OPA         Open Policy Agent - policy as code
Cerbos      Access control for cloud-native apps
OpenFGA     Google Zanzibar-inspired fine-grained auth
Permit.io   Permissions as a service
Auth0/Okta  Identity provider integration

Supported Backends

Vector DBs: Qdrant, ChromaDB, Pinecone, pgvector, Weaviate, Milvus, FAISS, Elasticsearch, OpenSearch, Azure AI Search
Graph DBs: Neo4j, Neptune, TigerGraph, ArangoDB
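Supporting many backends comes down to rendering one logical permission condition in each database's own filter syntax. As a rough sketch of that idea, the hypothetical helpers below take a list of OR-ed equality conditions and render it two ways: as a ChromaDB-style `where` dict and as a parameterized SQL fragment of the kind a pgvector query could use. Neither function is part of RAGGuard, and real translators have to handle many more operators.

```python
def to_chroma(conditions):
    """Render OR-ed equality conditions as a ChromaDB-style `where` dict."""
    if not conditions:
        raise ValueError("at least one condition is required")
    clauses = [{field: {"$eq": value}} for field, value in conditions]
    return clauses[0] if len(clauses) == 1 else {"$or": clauses}


def to_sql(conditions):
    """Render the same conditions as a parameterized SQL WHERE fragment."""
    if not conditions:
        raise ValueError("at least one condition is required")
    for field, _ in conditions:
        # Column names are interpolated, so accept plain identifiers only.
        if not field.isidentifier():
            raise ValueError(f"unsafe column name: {field!r}")
    fragment = " OR ".join(f"{field} = %s" for field, _ in conditions)
    # Values are never interpolated; they are returned as bind parameters.
    params = [value for _, value in conditions]
    return f"({fragment})", params


conditions = [("department", "finance"), ("confidential", False)]
print(to_chroma(conditions))
print(to_sql(conditions))
# ('(department = %s OR confidential = %s)', ['finance', False])
```

Keeping values as bind parameters and rejecting non-identifier column names means user attributes cannot inject SQL, even when they come from an external identity provider.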

Integrations

LangChain • LlamaIndex • LangGraph • CrewAI • DSPy • AWS Bedrock

Documentation

Guide            Description
Getting Started  Installation and basic setup
Policy Format    Policy syntax and operators
Backends         Database-specific examples
Integrations     LangChain, LlamaIndex, etc.
Production       Health checks, logging, async
Kubernetes       K8s deployment guide
Security         Security testing & guarantees
Use Cases        Multi-tenant, healthcare, etc.
FAQ              Common questions & limitations

Installation

# With a specific backend
pip install ragguard[qdrant]
pip install ragguard[chromadb]
pip install ragguard[pgvector]
pip install ragguard[pinecone]

# With framework integration
pip install ragguard[langchain]
pip install ragguard[llamaindex]

# Everything
pip install ragguard[all]

Python Compatibility: Fully tested on Python 3.9-3.13. Python 3.14 has limited support because upstream dependencies (chromadb, langchain) do not yet support it.

Why RAGGuard?

Challenge         Without RAGGuard                       With RAGGuard
Data leaks        Filter after retrieval = data exposed  Filter during search = zero exposure
Authorization     Rebuild permission logic for RAG       Plug in your existing auth system
Multi-database    Custom filter code per DB              One integration, 14 databases
Setup time        Days/weeks                             5 minutes
Security testing  DIY                                    2000+ tests included

License

AGPL-3.0 - See LICENSE for details.

For commercial licensing, contact: [your email]


Built for the RAG community • Examples • Contributing
