An open-source, high-transparency modular RAG framework for AI/ML applications

GlassBox RAG

A production-ready, high-transparency modular RAG (Retrieval-Augmented Generation) framework for enterprise AI/ML applications. GlassBox provides full-pipeline observability, an integrated web dashboard, an extensible plugin architecture, and real-time telemetry -- all from a single pip install.

Features
Installation
Quick Start
CLI
Configuration
Web Dashboard
API Reference
Plugin Architecture
Framework Adapters
Observability
Production Deployment
Architecture
Testing
Contributing
License

Features

Core RAG Pipeline

Modular Encoding Layer -- Swap between OpenAI, Cohere, Google, Ollama, ONNX, and Hugging Face encoders at runtime. Per-query encoder overrides supported.
Embedding Cache -- In-memory LRU cache avoids repeated embedding API calls for text that has already been encoded.
Adaptive Retrieval -- Semantic, keyword, and hybrid retrieval strategies with configurable weighting.
Cross-Encoder Reranking -- Optional reranking pass for improved precision.
Advanced Chunking -- Recursive, sentence-based, semantic, and fixed-size strategies with overlap control and chunk-size monitoring.
Document Deduplication -- Content-hash and SimHash-based deduplication with configurable thresholds. Runs automatically during batch ingestion.
Write-Back Protection -- Confidence-gated document updates with optional human review workflow.
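
As a rough illustration of the two deduplication approaches named in the list above, the sketch below pairs an exact content hash with a 64-bit SimHash fingerprint. It is not GlassBox's implementation: the function names, the tokenization, the normalization, and the `max_distance=3` threshold are all assumptions chosen for the example.

```python
import hashlib
import re


def content_hash(text: str) -> str:
    """Exact-duplicate key: SHA-256 of lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def simhash(text: str, bits: int = 64) -> int:
    """Near-duplicate fingerprint: similar texts tend to differ in few bits."""
    counts = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (value >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    """Hypothetical threshold check; the cutoff here is illustrative only."""
    return hamming_distance(simhash(a), simhash(b)) <= max_distance
```

The content hash catches byte-for-byte repeats cheaply, while the SimHash comparison is meant for documents that differ only slightly; a lower `max_distance` makes the near-duplicate check stricter.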

LLM Generation

Streaming Generation -- Server-Sent Events (SSE) endpoint for real-time token streaming.
Multiple Backends -- OpenAI and Ollama backends with auto-detection from environment variables.
Token Counting -- Accurate token usage tracking via tiktoken with per-model cost estimation.
Configurable System Prompts -- Per-request or config-level system prompt overrides.

Integrated Web Dashboard

The dashboard is served directly by the FastAPI server at the root URL -- no separate frontend build or deployment required.

Overview -- Live metrics (request count, latency, token usage, cost), component health status, and recent activity feed pulled from real trace data.
Pipeline Visualization -- Interactive SVG pipeline diagram with step-by-step execution flow and real-time test execution.
Debugger -- Trace listing with filtering, detailed step-by-step timing breakdown, and trace visualization.
Telemetry -- Latency and throughput charts, cost breakdown by operation, and performance percentile tables (p50/p95/p99) -- all sourced from live metrics.
Plugins -- Interactive documentation for all plugin types with code examples.
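
For readers unfamiliar with the p50/p95/p99 figures mentioned for the Telemetry page, the sketch below computes them with the nearest-rank method. How GlassBox actually derives its percentiles is not stated in this document; the function names and the choice of nearest-rank are assumptions.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so integer inputs stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def latency_summary(latencies_ms: list[float]) -> dict[str, float]:
    """Summarize request latencies the way a percentile table would."""
    return {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
```

With few samples the higher percentiles collapse onto the maximum, so p99 only becomes informative once a reasonable number of requests has been recorded.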

Production Infrastructure

Rate Limiting -- Memory-based (single worker) or Redis-backed (multi-worker) rate limiting middleware.
Authentication -- JWT token authentication and API key validation with configurable scoping.
CORS -- Configurable cross-origin resource sharing.
OpenTelemetry -- Optional trace export to OTLP, Jaeger, or console backends.
Prometheus Metrics -- Scrapable /metrics/prometheus endpoint for Grafana integration.
Persistent Trace Backends -- In-memory (default), Redis, or PostgreSQL trace storage with configurable retention.
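
To show why the memory-based rate limiter is described as single-worker, here is a minimal sliding-window limiter that keeps its state in a process-local dictionary. This is a sketch under assumptions, not GlassBox's middleware: the class name, the `allow` method, and the per-key window policy are hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """In-process limiter: at most `limit` requests per `window_s` seconds per key."""

    def __init__(self, limit: int, window_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because `_hits` lives in one process, each worker would count requests separately; sharing the counters across workers is what an external store such as Redis provides.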

Developer Experience

Query Pipeline Hooks -- Register before/after hooks at any pipeline stage (pre-retrieval, post-retrieval, pre-generation, post-generation) for logging, transformation, or validation.
Plugin System -- Four plugin types (Encoder, Vector Store, Database, Multimodal) with a registration API.
Framework Adapters -- First-class integration with LangChain, LlamaIndex, and Haystack.
CLI -- Project scaffolding, server management, and config validation from the command line.
Full Type Safety -- Complete type annotations validated with mypy. Ships with py.typed marker.

Installation

From PyPI

pip install glassbox-rag

With Optional Dependencies

# Common setup: embeddings + generation + token counting
pip install "glassbox-rag[auto]"

# Full installation with all plugins
pip install "glassbox-rag[all]"

# Specific extras
pip install "glassbox-rag[embeddings]"          # All embedding providers
pip install "glassbox-rag[vector-stores]"       # All vector store backends
pip install "glassbox-rag[databases]"           # All database backends
pip install "glassbox-rag[telemetry]"           # OpenTelemetry + Prometheus
pip install "glassbox-rag[auth]"               # JWT authentication
pip install "glassbox-rag[generation]"         # LLM generation
pip install "glassbox-rag[adapters]"           # LangChain, LlamaIndex, Haystack

From Source

git clone https://github.com/averoe/GlassBox.git
cd GlassBox
pip install -e ".[dev]"

Quick Start

Python API

import asyncio
from glassbox_rag import GlassBoxEngine, GlassBoxConfig, Document

async def main():
    # Initialize -- auto-detects config from environment variables
    config = GlassBoxConfig.from_env()
    engine = GlassBoxEngine(config)
    await engine.initialize()

    # Ingest documents
    documents = [
        Document(content="GlassBox is a modular RAG framework.", metadata={"source": "docs"}),
        Document(content="It supports multiple vector stores.", metadata={"source": "docs"}),
    ]
    result = await engine.ingest(documents)
    print(f"Ingested {result['chunks_created']} chunks")

    # Retrieve
    results = await engine.retrieve("What is GlassBox?", top_k=5)
    for r in results:
        print(f"  [{r.score:.3f}] {r.content[:80]}")

    # Generate (requires generation backend configured)
    response = await engine.generate("What is GlassBox?", top_k=5)
    print(response["answer"])

    await engine.shutdown()

asyncio.run(main())

Start the Server

# Using the CLI
glassbox-rag serve --config config/default.yaml

# Or with uvicorn directly
uvicorn glassbox_rag.server:create_app --factory --host 0.0.0.0 --port 8000

Then open http://localhost:8000 for the dashboard, or http://localhost:8000/docs for the OpenAPI explorer.

CLI

# Initialize a new project with scaffolded config
glassbox-rag init --encoder openai --vector-store qdrant --database sqlite

# Start the server
glassbox-rag serve --config config/default.yaml --port 8000

# Validate configuration
glassbox-rag check --config config/default.yaml

# Show version
glassbox-rag version

Configuration

GlassBox uses YAML configuration with ${ENV_VAR} and ${ENV_VAR:default} substitution. A zero-config mode is also available via GlassBoxConfig.from_env().
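
The `${ENV_VAR}` and `${ENV_VAR:default}` forms can be sketched with a small regular-expression pass over the raw config text. This is only an illustration of the syntax: `substitute_env` is a hypothetical helper, and raising an error for an unset variable without a default is an assumption about behavior this document does not specify.

```python
import os
import re

# ${NAME} or ${NAME:default}; the default may itself contain colons.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def substitute_env(text: str, env=None) -> str:
    """Replace ${VAR} and ${VAR:default} using `env` (defaults to os.environ)."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumed policy: fail loudly rather than substitute an empty string.
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _PATTERN.sub(replace, text)
```

Running the substitution on the raw text before YAML parsing keeps the helper independent of any particular YAML library.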

server:
  host: "0.0.0.0"
  port: 8000
  workers: 4

encoding:
  default_encoder: "openai"
  cloud:
    openai:
      api_key: "${OPENAI_API_KEY}"
      model: "text-embedding-3-small"
      embedding_dim: 1536

vector_store:
  type: "qdrant"
  qdrant:
    host: "localhost"
    port: 6333
    collection_name: "glassbox_docs"

database:
  type: "sqlite"
  sqlite:
    path: "./data/glassbox.db"

chunking:
  strategy: "recursive"
  chunk_size: 512
  chunk_overlap: 50

retrieval:
  top_k: 5
  min_score: 0.3
  adaptive:
    enabled: true
    strategies:
      - name: "hybrid"
        weight_semantic: 0.6
        weight_keyword: 0.4

generation:
  backend: "openai"
  model: "gpt-4o-mini"
  temperature: 0.7

trace:
  enabled: true
  backend: "memory"

telemetry:
  prometheus_enabled: true
  service_name: "glassbox-rag"

security:
  api_key_required: false
  cors:
    enabled: true
    origins: ["*"]

writeback:
  enabled: true
  mode: "protected"
  protected:
    confidence_threshold: 0.8

metrics:
  enabled: true
  track_tokens: true
  track_latency: true
  track_cost: true
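
One way to read the `retrieval` settings above is as a weighted blend of two score lists followed by a cutoff and a top-k cut. The sketch below uses the configured values (`weight_semantic: 0.6`, `weight_keyword: 0.4`, `min_score: 0.3`, `top_k: 5`) as defaults; the function itself, the assumption that both score sets are normalized to [0, 1], and the tie-breaking rule are illustrative rather than taken from GlassBox.

```python
def hybrid_scores(
    semantic: dict[str, float],
    keyword: dict[str, float],
    weight_semantic: float = 0.6,
    weight_keyword: float = 0.4,
    min_score: float = 0.3,
    top_k: int = 5,
) -> list[tuple[str, float]]:
    """Blend per-document scores (assumed normalized to [0, 1]) and rank them."""
    combined = {}
    for doc_id in semantic.keys() | keyword.keys():
        # A document missing from one list contributes 0 for that signal.
        score = (weight_semantic * semantic.get(doc_id, 0.0)
                 + weight_keyword * keyword.get(doc_id, 0.0))
        if score >= min_score:
            combined[doc_id] = score
    # Highest score first; ties broken by document ID for stable output.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]
```

Raising `weight_keyword` favors exact term matches, while raising `weight_semantic` favors embedding similarity; the two weights in the sample config sum to 1, which keeps the blended score in the same [0, 1] range.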

Web Dashboard

The dashboard is bundled inside the package and served automatically when the server starts. No separate build step is needed.

Access it at http://localhost:8000 after starting the server.

Pages

Page	Description
Overview	System health, live metrics (requests, latency, tokens, cost), component status, and recent activity feed sourced from real trace data.
Pipeline	Interactive SVG visualization of the RAG pipeline. Run test queries and observe step-by-step execution with timing.
Debugger	Browse execution traces, filter by status, inspect individual trace steps with duration breakdown.
Telemetry	Latency/throughput charts, per-operation cost breakdown, and performance percentile tables (p50/p95/p99).
Plugins	Interactive documentation for Encoder, Vector Store, Database, and Multimodal plugin types with code examples.

All dashboard data is fetched from the same FastAPI backend via /health, /metrics, and /traces endpoints -- no mock data.

API Reference

Core Operations

Method	Endpoint	Description
GET	`/health`	System health check with per-component status
POST	`/retrieve`	Retrieve relevant documents for a query
POST	`/retrieve/stream`	Streaming retrieval + generation via SSE
POST	`/ingest`	Ingest documents into the pipeline
PUT	`/update`	Update documents with write-back protection
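
Clients of `/retrieve/stream` receive Server-Sent Events, which are plain text lines grouped into events by blank lines. The sketch below parses such a stream into `data` payloads. It follows the general SSE wire format only; what GlassBox puts inside each payload (raw tokens, JSON, an end marker) is not documented here, and `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete Server-Sent Event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # An event that was never closed by a blank line is discarded.
```

The function accepts any iterable of text lines, so it can be fed from a file, a list in a test, or the line iterator of whichever HTTP client is in use.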

Traces and Debugging

Method	Endpoint	Description
GET	`/traces`	List execution traces (supports `?limit=N`)
GET	`/traces/{trace_id}`	Get a single trace with full step tree
GET	`/traces/{trace_id}/visualize`	Visual debugger with ASCII/HTML output
GET	`/traces/compare?trace_a=...&trace_b=...`	Side-by-side trace comparison
GET	`/traces/anomalies`	Detect anomalous steps across recent traces

Metrics and Telemetry

Method	Endpoint	Description
GET	`/metrics`	JSON metrics summary (requests, cost, latency)
GET	`/metrics/prometheus`	Prometheus-scrapable text format
GET	`/metrics/chunks`	Chunk size distribution report

Dashboard

Method	Endpoint	Description
GET	`/`	Serves the web dashboard
GET	`/dashboard`	Alias for `/`
GET	`/static/*`	Static assets (CSS, JS)

Authentication

Method	Endpoint	Description
POST	`/auth/token`	Generate a JWT token

Plugin Architecture

GlassBox supports four plugin types. Register plugins via configuration or the Python API.

Encoder Plugins

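from glassbox_rag.core.encoder import BaseEncoder
import numpy as np

class CustomEncoder(BaseEncoder):
    async def encode(self, texts: list[str]) -> np.ndarray:
        # compute_embeddings stands in for your own model or API call
        embeddings = await self.compute_embeddings(texts)
        return np.array(embeddings)  # shape: (len(texts), embedding_dim)

    async def encode_query(self, query: str) -> np.ndarray:
        # Reuses the batch path, so the result has shape (1, embedding_dim)
        return await self.encode([query])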

Vector Store Plugins

from glassbox_rag.plugins.base import VectorStorePlugin

class CustomVectorStore(VectorStorePlugin):
    async def add_vectors(self, vectors, metadata):
        return await self.store(vectors, metadata)

    async def search(self, query_vector, top_k):
        return await self.perform_search(query_vector, top_k)

Database Plugins

from glassbox_rag.plugins.base import DatabasePlugin

class CustomDatabase(DatabasePlugin):
    async def store_document(self, doc_id, content, metadata):
        await self.insert(doc_id, content, metadata)

    async def get_document(self, doc_id):
        return await self.fetch(doc_id)

Query Pipeline Hooks

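import asyncio
from glassbox_rag import GlassBoxConfig, GlassBoxEngine, HookPoint

async def main():
    config = GlassBoxConfig.from_env()
    engine = GlassBoxEngine(config)
    await engine.initialize()

    # Register a hook that runs after retrieval
    @engine.hooks.register(HookPoint.POST_RETRIEVAL)
    async def log_results(context):
        print(f"Retrieved {len(context['results'])} documents")
        return context  # hand the context on to the next stage

    await engine.retrieve("What is GlassBox?", top_k=5)
    await engine.shutdown()

asyncio.run(main())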
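
To make the hook mechanics concrete, the following self-contained sketch shows one way a decorator-based registry could dispatch hooks. It does not import or reproduce GlassBox code: `HookRegistry` is hypothetical, the local `HookPoint` enum is a stand-in, and every member name other than `POST_RETRIEVAL` is inferred from the stage names listed under Developer Experience.

```python
import asyncio
from collections import defaultdict
from enum import Enum


class HookPoint(Enum):
    """Stand-in enum; only POST_RETRIEVAL appears in the GlassBox example."""
    PRE_RETRIEVAL = "pre_retrieval"
    POST_RETRIEVAL = "post_retrieval"
    PRE_GENERATION = "pre_generation"
    POST_GENERATION = "post_generation"


class HookRegistry:
    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, point):
        """Decorator that appends an async hook to the given stage."""
        def decorator(func):
            self._hooks[point].append(func)
            return func
        return decorator

    async def run(self, point, context):
        """Run hooks in registration order, threading the context through each."""
        for hook in self._hooks[point]:
            context = await hook(context)
        return context


hooks = HookRegistry()


@hooks.register(HookPoint.POST_RETRIEVAL)
async def keep_top_two(context):
    # A transformation hook: trim the result list before later stages see it.
    context["results"] = context["results"][:2]
    return context


result = asyncio.run(hooks.run(HookPoint.POST_RETRIEVAL, {"results": [1, 2, 3]}))
```

Returning the context from every hook is what lets a hook transform or validate the data rather than merely observe it, which is why the logging example also ends with `return context`.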

Framework Adapters

GlassBox integrates with popular LLM frameworks:

LangChain

from glassbox_rag.adapters.langchain import GlassBoxRetriever

# `engine` is an initialized GlassBoxEngine; run this inside an async function
retriever = GlassBoxRetriever(engine=engine, top_k=5)
docs = await retriever.aget_relevant_documents("your query")

LlamaIndex

from glassbox_rag.adapters.llamaindex import GlassBoxQueryEngine

# `engine` is an initialized GlassBoxEngine; run this inside an async function
query_engine = GlassBoxQueryEngine(engine=engine)
response = await query_engine.aquery("your query")

Haystack

from glassbox_rag.adapters.haystack import GlassBoxHaystackRetriever

retriever = GlassBoxHaystackRetriever(engine=engine)

Install adapters with:

pip install "glassbox-rag[adapters-langchain]"
pip install "glassbox-rag[adapters-llamaindex]"
pip install "glassbox-rag[adapters-haystack]"
pip install "glassbox-rag[adapters]"   # all three

Observability

Prometheus Metrics

Scrape http://localhost:8000/metrics/prometheus with your Prometheus instance:

# prometheus.yml
scrape_configs:
  - job_name: "glassbox-rag"
    static_configs:
      - targets: ["localhost:8000"]
    metrics_path: "/metrics/prometheus"

OpenTelemetry

Enable OTLP export in configuration:

telemetry:
  otel_enabled: true
  otel_exporter: "otlp"
  otel_endpoint: "http://localhost:4317"
  service_name: "glassbox-rag"

Trace Storage

Traces can be stored in memory (default), Redis, or PostgreSQL:

trace:
  enabled: true
  backend: "redis"        # or "postgresql" or "memory"
  retention_days: 30
  sample_rate: 1.0
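
The `sample_rate` and `retention_days` settings can be pictured as two small predicates: one deciding whether a trace is recorded, the other whether a stored trace is old enough to purge. The sketch below is an assumption about how such settings are commonly applied, not a description of GlassBox internals; in particular, hashing the trace ID to get a deterministic decision is one choice among several (random sampling is another).

```python
import hashlib
from datetime import datetime, timedelta


def should_sample(trace_id: str, sample_rate: float) -> bool:
    """Deterministically keep roughly `sample_rate` of traces, keyed on the trace ID."""
    if sample_rate >= 1.0:
        return True
    if sample_rate <= 0.0:
        return False
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a fraction between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


def is_expired(created_at: datetime, retention_days: int, now: datetime) -> bool:
    """True when a trace is older than the retention window."""
    return now - created_at > timedelta(days=retention_days)
```

With `sample_rate: 1.0` as in the example config, every trace is kept; lowering it trades trace coverage for storage.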

Production Deployment

Docker

docker build -t glassbox-rag .

docker run -p 8000:8000 \
  -v "$(pwd)/config:/app/config:ro" \
  -e OPENAI_API_KEY=your_key \
  glassbox-rag

Docker Compose

# Development (app + Qdrant + PostgreSQL)
docker-compose up -d

# Production with monitoring
docker-compose -f docker-compose.prod.yml --profile monitoring up -d

Production services include:

GlassBox RAG -- Main application with health checks and resource limits
Qdrant -- Vector database with persistence
PostgreSQL -- Document storage
Redis -- Rate limiting and trace caching (optional)
Prometheus + Grafana -- Metrics and dashboards (optional, monitoring profile)

Architecture

Client Request
      |
      v
  FastAPI Server
      |
      +-- Authentication (JWT / API Key)
      +-- Rate Limiting (Memory / Redis)
      |
      v
  GlassBox Engine
      |
      +-- Encoding Layer ----> [OpenAI | Cohere | Ollama | ONNX | Google | HF]
      +-- Chunker -----------> [Recursive | Sentence | Semantic | Fixed]
      +-- Deduplicator ------> [Content Hash | SimHash]
      +-- Retriever ---------> [Semantic | Keyword | Hybrid | Adaptive]
      +-- Reranker ----------> [Cross-Encoder]
      +-- Generator ---------> [OpenAI | Ollama] (with streaming)
      +-- Vector Store ------> [Qdrant | Chroma | FAISS | Pinecone | Weaviate]
      +-- Database ----------> [PostgreSQL | SQLite | MongoDB | MySQL | Supabase]
      +-- Write-Back Manager -> [Protected | Full | Read-Only]
      +-- Hook Manager ------> [Pre/Post Retrieval | Pre/Post Generation]
      |
      +-- Trace Tracker -----> [Memory | Redis | PostgreSQL]
      +-- Metrics Tracker ---> [Prometheus | JSON]
      +-- Telemetry ---------> [OpenTelemetry | Console]
      |
      v
  Web Dashboard (served at /)
      |
      +-- Overview (live metrics + health)
      +-- Pipeline (interactive visualization)
      +-- Debugger (trace inspection)
      +-- Telemetry (charts + tables)
      +-- Plugins (documentation)

Project Structure

glassbox-rag/
  src/glassbox_rag/
    __init__.py          # Public API and version
    server.py            # FastAPI app factory with dashboard serving
    config.py            # Pydantic v2 configuration models
    cli.py               # CLI entry point
    core/
      engine.py          # Main orchestrator
      retriever.py       # Retrieval strategies
      chunker.py         # Text chunking
      encoder.py         # Embedding layer with cache
      generator.py       # LLM generation with streaming
      hooks.py           # Pipeline hook system
      dedup.py           # Document deduplication
      tokens.py          # Token counting
      writeback.py       # Write-back protection
    trace/
      tracker.py         # Concurrency-safe trace system
      visualizer.py      # ASCII/HTML trace rendering
      backends.py        # Persistent trace storage
    plugins/             # Plugin implementations
    adapters/            # LangChain, LlamaIndex, Haystack
    ui/
      static/
        app.js           # Dashboard JavaScript
        style.css        # Dashboard styles
      templates/
        dashboard.html   # Dashboard HTML
  config/
    default.yaml         # Default configuration
  tests/                 # Test suite

Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=glassbox_rag --cov-report=html

# Run specific categories
pytest tests/unit/ -v
pytest tests/integration/ -v

Contributing

Contributions are welcome. Please:

Fork the repository
Create a feature branch
Add tests for new functionality
Ensure all tests pass and type checks succeed
Submit a pull request

License

Licensed under the Apache License 2.0. See the LICENSE file for details.

Release history

1.0.0 (this version) -- Apr 28, 2026
0.3.1 -- Apr 5, 2026
0.3.0 -- Apr 5, 2026
0.1.0 -- Apr 4, 2026

