rag_control

A runtime governance, security, and execution control layer for Retrieval-Augmented Generation (RAG) systems.

rag_control provides enterprise-grade policy enforcement, security governance, and observability for RAG applications. It lets you control what your RAG system retrieves and how it generates responses, and it enforces compliance policies at runtime.

Overview

RAG systems are powerful but can be risky in production:

  • Hallucinations: LLMs may generate content not grounded in retrieved documents
  • Data Leakage: Sensitive information might be retrieved or exposed
  • Compliance: Regulations require audit trails and enforcement controls
  • Cost: Token usage and retrieval operations need optimization

rag_control addresses these challenges with:

  • Policy-Based Generation: Define and enforce generation policies (temperature, output length, citation requirements, external knowledge restrictions)
  • Runtime Enforcement: Validate responses against policies before returning them to users
  • Governance & Security: Apply organization-level rules, role-based access control, and data classification filters
  • Comprehensive Audit Logging: Track all requests, decisions, and denials for compliance
  • Distributed Tracing: Understand execution flow and identify performance bottlenecks
  • Metrics & Observability: 18+ metrics covering throughput, latency, quality, costs, and errors

Key Features

🛡️ Policy Enforcement

  • Define multiple policies with different strictness levels
  • Control temperature, max output tokens, reasoning depth
  • Enforce citation requirements and validation
  • Prevent responses that draw on knowledge outside the retrieved documents
  • Apply context-aware fallback strategies
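
The citation checks listed above can be pictured as a validation step that runs after generation. The following is a minimal sketch, not rag_control's implementation: it assumes citations appear in the answer as bracketed document IDs such as `[doc-1]`, and the names `check_citations` and `CitationResult` are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Assumed citation format for this sketch: bracketed IDs such as [doc-1].
CITATION_PATTERN = re.compile(r"\[([A-Za-z0-9_-]+)\]")


@dataclass
class CitationResult:
    passed: bool
    cited: set[str] = field(default_factory=set)
    unknown: set[str] = field(default_factory=set)


def check_citations(
    answer: str, retrieved_ids: set[str], require_citations: bool = True
) -> CitationResult:
    """Check that the answer cites only documents that were retrieved."""
    cited = set(CITATION_PATTERN.findall(answer))
    # Citations that point at documents the retriever never returned.
    unknown = cited - retrieved_ids
    # With require_citations on, an answer without any citation fails.
    missing = require_citations and not cited
    return CitationResult(
        passed=not unknown and not missing, cited=cited, unknown=unknown
    )
```

A caller could block the response when `passed` is false, which roughly corresponds to `block_on_missing_citations: true`, or only log the result for a more relaxed policy.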

🔐 Governance & Security

  • Organization-level access control with deny rules
  • Retrieval filtering by data classification and metadata
  • User context validation
  • Policy resolution based on org rules and data sensitivity
  • Access control enforcement that combines user and document context

📊 Observability

  • Audit Logging: Full request/response lifecycle tracking
  • Distributed Tracing: OpenTelemetry integration for flow analysis
  • Metrics: Token usage, latency, error rates, policy decisions

🚀 Production Ready

  • Exception-swallowing pattern ensures governance failures never break request flow
  • Comprehensive error handling with custom exception types
  • Type-safe with mypy strict mode compliance
  • 100% code coverage with extensive test suite
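
One common way to build an exception-swallowing pattern is a decorator that logs and suppresses errors raised by a side-effect hook. The sketch below illustrates the general idea only; `swallow_exceptions` and `emit_audit_event` are hypothetical names, and which calls rag_control actually wraps this way is not stated here.

```python
import functools
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger("swallow_sketch")


def swallow_exceptions(func: Callable[..., Any]) -> Callable[..., Optional[Any]]:
    """Run func; on failure, log the error and return None instead of raising."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Optional[Any]:
        try:
            return func(*args, **kwargs)
        except Exception:
            # Keep the traceback in the logs so the failure stays visible.
            logger.exception("Suppressed error in %s", func.__name__)
            return None

    return wrapper


@swallow_exceptions
def emit_audit_event(event: dict[str, Any]) -> bool:
    # Hypothetical hook: fails when the event has no request_id.
    if "request_id" not in event:
        raise ValueError("request_id is required")
    return True
```

Catching `Exception` rather than `BaseException` leaves `KeyboardInterrupt` and `SystemExit` untouched, so the process can still be stopped normally.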

Installation

pip install rag_control openai_adapter pinecone_adapter

Requirements

  • Python 3.10+

Quick Start

1. Define Policies

Create a policy_config.yaml:

policies:
  - name: strict_citations
    description: Strict policy with citation enforcement
    generation:
      reasoning_level: limited
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.0
    enforcement:
      validate_citations: true
      block_on_missing_citations: true
      prevent_external_knowledge: true
      max_output_tokens: 512
    logging:
      level: full

  - name: soft_research
    description: Relaxed policy for exploratory research
    generation:
      reasoning_level: full
      allow_external_knowledge: false
      require_citations: true
      temperature: 0.1
    enforcement:
      validate_citations: true
      block_on_missing_citations: false
      prevent_external_knowledge: true
      max_output_tokens: 512
    logging:
      level: full

filters:
  - name: enterprise_only
    condition:
      field: org_tier
      operator: equals
      value: enterprise
      source: user

orgs:
  - org_id: acme_corp
    description: Acme Corporation with strict citation requirements
    default_policy: strict_citations
    document_policy:
      top_k: 10
      filter_name: enterprise_only

    # Policy rules - determine which policy to apply (user context only)
    policy_rules:
      - name: allow_enterprise_strict
        description: Apply strict policy for enterprise users
        priority: 50
        effect: allow
        apply_policy: strict_citations
        when:
          all:
            - field: org_tier
              operator: equals
              value: enterprise
              source: user

    # Deny rules - block requests at runtime (user + document context)
    deny_rules:
      - name: deny_untrusted_document_source
        description: Deny if any doc comes from untrusted source
        priority: 60
        when:
          any:
            - field: metadata.source
              operator: equals
              value: public-web
              source: documents
              document_match: any

      - name: deny_external_users_restricted_docs
        description: Deny external users from restricted documents
        priority: 48
        when:
          all:
            - field: user_type
              operator: equals
              value: external
              source: user
            - field: metadata.classification
              operator: equals
              value: restricted
              source: documents
              document_match: any
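
To make the `when` blocks above concrete, here is a rough sketch of how such conditions could be evaluated. It is an illustration under stated assumptions, not the library's evaluator: it assumes `all` requires every condition to hold, `any` requires at least one, and dotted fields such as `metadata.source` are nested lookups. Only the `equals` operator is covered, the `all` value for `document_match` is an assumption, and the function names are hypothetical.

```python
from typing import Any


def lookup(record: dict[str, Any], dotted_field: str) -> Any:
    """Resolve a dotted path such as 'metadata.source' in a nested dict."""
    value: Any = record
    for key in dotted_field.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def condition_holds(
    cond: dict[str, Any], user: dict[str, Any], docs: list[dict[str, Any]]
) -> bool:
    """Evaluate one condition; only the 'equals' operator is sketched."""
    if cond["operator"] != "equals":
        raise ValueError(f"operator not covered by this sketch: {cond['operator']}")
    if cond["source"] == "user":
        return lookup(user, cond["field"]) == cond["value"]
    matches = [lookup(doc, cond["field"]) == cond["value"] for doc in docs]
    if cond.get("document_match", "any") == "any":
        # One matching document is enough.
        return any(matches)
    # Assumed 'all' mode: every document must match, and there must be one.
    return bool(matches) and all(matches)


def rule_matches(
    when: dict[str, Any], user: dict[str, Any], docs: list[dict[str, Any]]
) -> bool:
    """'all' needs every condition to hold; 'any' needs at least one."""
    if "all" in when:
        return all(condition_holds(c, user, docs) for c in when["all"])
    return any(condition_holds(c, user, docs) for c in when.get("any", []))


# The 'when' block of deny_external_users_restricted_docs, as a Python dict.
deny_when = {
    "all": [
        {"field": "user_type", "operator": "equals",
         "value": "external", "source": "user"},
        {"field": "metadata.classification", "operator": "equals",
         "value": "restricted", "source": "documents", "document_match": "any"},
    ]
}
```

Under these assumptions, calling `rule_matches(deny_when, {"user_type": "external"}, docs)` returns true as soon as one retrieved document carries `metadata.classification` equal to `restricted`.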

2. Initialize the Engine

from rag_control import RAGControl
from rag_control.models import UserContext
from openai_adapter import OpenAILLMAdapter, OpenAIQueryEmbeddingAdapter
from pinecone_adapter import PineconeVectorStoreAdapter

# Initialize adapters
llm_adapter = OpenAILLMAdapter(
    api_key="sk-your-openai-key",
    model="gpt-4"
)

embedding_adapter = OpenAIQueryEmbeddingAdapter(
    api_key="sk-your-openai-key",
    model="text-embedding-3-small"
)

vector_store = PineconeVectorStoreAdapter(
    api_key="your-pinecone-key",
    index_name="documents",
    embedding_model="text-embedding-3-small"
)

# Initialize rag_control
engine = RAGControl(
    llm=llm_adapter,
    query_embedding=embedding_adapter,
    vector_store=vector_store,
    config_path="policy_config.yaml"
)

# Create a user context (org_id should match an org defined in policy_config.yaml)
user_context = UserContext(
    org_id="acme_corp",
    user_id="user-123",
    attributes={
        "namespace": "demo",
        "dept": "hr",
    },
)

3. Run Queries

# Execute a query with governance and policy enforcement
result = engine.run(
    query="What are the key findings from our Q1 report?",
    user_context=user_context
)

print(f"Policy applied: {result.policy_name}")
print(f"Enforcement passed: {result.enforcement_passed}")
print(f"Response: {result.response.content}")
print(f"Tokens used: {result.response.token_count}")

# Or stream responses
stream_result = engine.stream(
    query="Summarize the financial impact...",
    user_context=user_context
)

for chunk in stream_result.response:
    print(chunk.content, end="", flush=True)

Architecture

Core Components

  • Engine: Orchestrates the RAG execution pipeline with governance and policy enforcement
  • Policy Registry: Manages generation and enforcement policies
  • Governance Registry: Applies organization-level rules and access control
  • Filter Registry: Manages data classification and retrieval filters
  • Adapters: Pluggable interfaces for LLMs, embeddings, and vector stores

Execution Flow

1. Validate org identity from user context
   ↓
2. Resolve org and apply retrieval filters
   ↓
3. Embed query
   ↓
4. Retrieve documents with org-level top_k
   ↓
5. Resolve policy via governance rules
   ↓
6. Build prompt with policy context
   ↓
7. Call LLM with policy-controlled parameters
   ↓
8. Apply enforcement checks (citations, knowledge, etc.)
   ↓
9. Emit audit events and traces
   ↓
10. Return response or raise policy violation
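
The ordering of these steps can be sketched with stub callables in place of the real adapters. This is a toy model under assumptions, not the engine's code: `Pipeline`, `PolicyViolation`, and the `request.completed` and `request.denied` event names are hypothetical, and policy resolution is reduced to building a prompt.

```python
from dataclasses import dataclass, field
from typing import Callable


class PolicyViolation(Exception):
    """Raised when a request or response is rejected (hypothetical name)."""


@dataclass
class Pipeline:
    known_orgs: set[str]
    embed: Callable[[str], list[float]]
    retrieve: Callable[[list[float], int], list[str]]
    generate: Callable[[str], str]
    enforce: Callable[[str, list[str]], bool]
    top_k: int = 10
    audit_log: list[str] = field(default_factory=list)

    def run(self, query: str, org_id: str) -> str:
        # Steps 1-2: validate and resolve the org before any retrieval happens.
        if org_id not in self.known_orgs:
            self.audit_log.append("request.denied")
            raise PolicyViolation(f"unknown org: {org_id}")
        # Steps 3-4: embed the query and retrieve up to top_k documents.
        docs = self.retrieve(self.embed(query), self.top_k)
        # Steps 5-7: a real engine resolves a policy here; the sketch only
        # builds a prompt and calls the generator.
        answer = self.generate(f"Context: {docs}\nQuestion: {query}")
        # Steps 8-10: enforce, record the outcome, then return or raise.
        passed = self.enforce(answer, docs)
        self.audit_log.append("request.completed" if passed else "request.denied")
        if not passed:
            raise PolicyViolation("enforcement check failed")
        return answer
```

Keeping each stage behind a plain callable mirrors the pluggable adapter idea: the stubs can be swapped for real LLM, embedding, and vector store clients without changing the control flow.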

Observability

Audit Logging

Every request generates audit events:

{
    "event": "request.received",
    "request_id": "req-abc123",
    "org_id": "acme_corp",
    "user_id": "user-123",
    "timestamp": "2026-03-04T10:30:00Z"
}

Distributed Tracing

OpenTelemetry integration tracks execution stages:

request_span
├── org_lookup_span
├── embedding_span
├── retrieval_span
├── policy_resolution_span
├── llm_generation_span
└── enforcement_span
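
The parent and child relationship in this tree can be illustrated without any tracing dependency. The sketch below uses a small standard-library stand-in rather than the OpenTelemetry API; `SpanRecorder` is a hypothetical class that only records nesting depth and duration.

```python
import time
from contextlib import contextmanager
from typing import Iterator


class SpanRecorder:
    """Stand-in for a tracer: records (depth, name, duration) for each span."""

    def __init__(self) -> None:
        self.finished: list[tuple[int, str, float]] = []
        self._depth = 0

    @contextmanager
    def span(self, name: str) -> Iterator[None]:
        depth = self._depth
        self._depth += 1
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the wrapped stage raised.
            self._depth -= 1
            self.finished.append((depth, name, time.perf_counter() - start))


recorder = SpanRecorder()
with recorder.span("request_span"):
    for stage in ("org_lookup_span", "embedding_span", "retrieval_span"):
        with recorder.span(stage):
            pass  # the stage's work would run here
```

Child spans finish before their parent, so `request_span` is recorded last with depth 0 and its duration covers all of the stages nested inside it.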

Metrics (18 total)

  • Throughput: Request count, throughput per second
  • Latency: Request duration, stage durations
  • Quality: Retrieved document scores, top-k metrics
  • LLM: Token counts, efficiency ratios
  • Errors: Error types, error categories, denial reasons
  • Custom: Policy resolutions, embedding dimensions

Documentation

For extensive documentation, guides, and API references, visit the docs directory:

  • Getting Started guides and quick start tutorials
  • Core concepts and architecture
  • API reference and adapters documentation
  • Observability and monitoring guides
  • Configuration reference

Examples

See the examples/ directory for:

  • controller-config.yaml: Complete policy configuration example

Security

  • Exception-swallowing pattern ensures governance failures are handled gracefully
  • All external inputs validated with Pydantic
  • Type-safe with strict mypy enforcement
  • Regular security audits and dependency updates

Contributing

See CONTRIBUTING.md for guidelines:

  • Issues: Anyone can open issues for bugs and feature requests
  • Pull Requests: RetrievalLabs team members only
  • Code Standards: 100% coverage, type checking, formatting compliance required

Support

  • Check DEVELOPMENT.md for setup issues
  • Review spec documentation in rag_control/spec/ for detailed contracts
  • Open an issue for bugs and feature requests

License

This project is licensed under the RetrievalLabs Business-Restricted License (RBRL).

  • Personal/Non-Commercial Use: Permitted
  • Business/Commercial Use: Prohibited without a written contract with RetrievalLabs Co.
  • Modifications/Derivative Works: Prohibited without a written contract with RetrievalLabs Co.

See LICENSE for full terms.


Built by RetrievalLabs — Enterprise RAG Governance and Security
