Skip to main content

Production-ready, extensible RAG framework with native IRIS vector search - unified API for basic, CRAG, GraphRAG, and ColBERT pipelines with RAGAS and DSPy integration

Project description

IRIS Vector RAG Templates

Production-ready Retrieval-Augmented Generation (RAG) pipelines powered by InterSystems IRIS Vector Search

Build intelligent applications that combine large language models with your enterprise data using battle-tested RAG patterns and native vector search capabilities.

License: MIT Python 3.11+ InterSystems IRIS

Why IRIS Vector RAG?

๐Ÿš€ Production-Ready - Six proven RAG architectures ready to deploy, not research prototypes

โšก Blazing Fast - Native IRIS vector search with HNSW indexing, no external vector databases needed

๐Ÿ”ง Unified API - Swap between RAG strategies with a single line of code

๐Ÿ“Š Enterprise-Grade - ACID transactions, connection pooling, and horizontal scaling built-in

๐ŸŽฏ 100% Compatible - Works seamlessly with LangChain, RAGAS, and your existing ML stack

๐Ÿงช Fully Validated - Comprehensive test suite with automated contract validation

Available RAG Pipelines

Pipeline Type Use Case Retrieval Method When to Use
basic Standard retrieval Vector similarity General Q&A, getting started, baseline comparisons
basic_rerank Improved precision Vector + cross-encoder reranking Higher accuracy requirements, legal/medical domains
crag Self-correcting Vector + evaluation + web search fallback Dynamic knowledge, fact-checking, current events
graphrag Knowledge graphs Vector + text + graph + RRF fusion Complex entity relationships, research, medical knowledge
multi_query_rrf Multi-perspective Query expansion + reciprocal rank fusion Complex queries, comprehensive coverage needed
pylate_colbert Fine-grained matching ColBERT late interaction embeddings Nuanced semantic understanding, high precision

Quick Start

1. Install

# Clone repository
git clone https://github.com/intersystems-community/iris-rag-templates.git
cd iris-rag-templates

# Setup environment (requires uv package manager)
make setup-env
make install
source .venv/bin/activate

2. Start IRIS Database

# Start IRIS with Docker Compose
docker-compose up -d

# Initialize database schema
make setup-db

# Optional: Load sample medical data
make load-data

3. Configure API Keys

cat > .env << 'EOF'
OPENAI_API_KEY=your-key-here
ANTHROPIC_API_KEY=your-key-here  # Optional, for Claude models
IRIS_HOST=localhost
IRIS_PORT=1972
IRIS_NAMESPACE=USER
IRIS_USERNAME=_SYSTEM
IRIS_PASSWORD=SYS
EOF

4. Run Your First Query

from iris_rag import create_pipeline

# Create pipeline with automatic validation
pipeline = create_pipeline('basic', validate_requirements=True)

# Load your documents
from iris_rag.core.models import Document

docs = [
    Document(
        page_content="RAG combines retrieval with generation for accurate AI responses.",
        metadata={"source": "rag_basics.pdf", "page": 1}
    ),
    Document(
        page_content="Vector search finds semantically similar content using embeddings.",
        metadata={"source": "vector_search.pdf", "page": 5}
    )
]

pipeline.load_documents(documents=docs)

# Query with LLM-generated answer
result = pipeline.query(
    query="What is RAG?",
    top_k=5,
    generate_answer=True
)

print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")

Unified API Across All Pipelines

Switch RAG strategies with one line - all pipelines share the same interface:

from iris_rag import create_pipeline

# Try different strategies instantly
for pipeline_type in ['basic', 'basic_rerank', 'crag', 'multi_query_rrf', 'graphrag']:
    pipeline = create_pipeline(pipeline_type)

    result = pipeline.query(
        query="What are the latest cancer treatment approaches?",
        top_k=5,
        generate_answer=True
    )

    print(f"\n{pipeline_type.upper()}:")
    print(f"  Answer: {result['answer'][:150]}...")
    print(f"  Retrieved: {len(result['retrieved_documents'])} docs")
    print(f"  Confidence: {result['metadata'].get('confidence', 'N/A')}")

Standardized Response Format

100% LangChain & RAGAS compatible responses:

{
    "query": "What is diabetes?",
    "answer": "Diabetes is a chronic metabolic condition...",  # LLM answer
    "retrieved_documents": [Document(...)],                   # LangChain Documents
    "contexts": ["context 1", "context 2"],                   # RAGAS contexts
    "sources": ["medical.pdf p.12", "diabetes.pdf p.3"],     # Source citations
    "execution_time": 0.523,
    "metadata": {
        "num_retrieved": 5,
        "pipeline_type": "basic",
        "retrieval_method": "vector",
        "generated_answer": True,
        "processing_time": 0.523
    }
}

Pipeline Deep Dives

CRAG: Self-Correcting Retrieval

Automatically evaluates retrieval quality and falls back to web search when needed:

from iris_rag import create_pipeline

pipeline = create_pipeline('crag')

# CRAG evaluates retrieved documents and uses web search if quality is low
result = pipeline.query(
    query="What happened in the 2024 Olympics opening ceremony?",
    top_k=5,
    generate_answer=True
)

# Check which retrieval method was used
print(f"Method: {result['metadata']['retrieval_method']}")  # 'vector' or 'web_search'
print(f"Confidence: {result['metadata']['confidence']}")     # 0.0 - 1.0

HybridGraphRAG: Multi-Modal Search

Combines vector search, text search, and knowledge graph traversal:

pipeline = create_pipeline('graphrag')

result = pipeline.query(
    query_text="cancer treatment targets",
    method="rrf",        # Reciprocal Rank Fusion across all methods
    vector_k=30,         # Top 30 from vector search
    text_k=30,           # Top 30 from text search
    graph_k=10,          # Top 10 from knowledge graph
    generate_answer=True
)

# Rich metadata includes entities and relationships
print(f"Entities: {result['metadata']['entities']}")
print(f"Relationships: {result['metadata']['relationships']}")
print(f"Graph depth: {result['metadata']['graph_depth']}")

MultiQueryRRF: Multi-Perspective Retrieval

Expands queries into multiple perspectives and fuses results:

pipeline = create_pipeline('multi_query_rrf')

# Automatically generates query variations and combines results
result = pipeline.query(
    query="How does machine learning work?",
    top_k=10,
    generate_answer=True
)

# See the generated query variations
print(f"Query variations: {result['metadata']['generated_queries']}")
print(f"Fusion method: {result['metadata']['fusion_method']}")  # 'rrf'

Enterprise Features

Production-Ready Database

IRIS provides everything you need in one database:

  • โœ… Native vector search (no external vector DB needed)
  • โœ… ACID transactions (your data is safe)
  • โœ… SQL + NoSQL + Vector in one platform
  • โœ… Horizontal scaling and clustering
  • โœ… Enterprise-grade security and compliance

Connection Pooling

Automatic concurrency management:

from iris_rag.storage import IRISVectorStore

# Connection pool handles concurrency automatically
store = IRISVectorStore()

# Safe for multi-threaded applications
# Pool manages connections, no manual management needed

Automatic Schema Management

Database schema created and migrated automatically:

pipeline = create_pipeline('basic', validate_requirements=True)
# โœ… Checks database connection
# โœ… Validates schema exists
# โœ… Migrates to latest version if needed
# โœ… Reports validation results

RAGAS Evaluation Built-In

Measure your RAG pipeline performance:

# Evaluate all pipelines on your data
make test-ragas-sample

# Generates detailed metrics:
# - Answer Correctness
# - Faithfulness
# - Context Precision
# - Context Recall
# - Answer Relevance

Model Context Protocol (MCP) Support

Expose RAG pipelines as MCP tools for use with Claude Desktop and other MCP clients:

# Start MCP server
python -m iris_rag.mcp

# Available MCP tools:
# - rag_basic
# - rag_basic_rerank
# - rag_crag
# - rag_multi_query_rrf
# - rag_graphrag
# - rag_hybrid_graphrag
# - health_check
# - list_tools

Configure in Claude Desktop:

{
  "mcpServers": {
    "iris-rag": {
      "command": "python",
      "args": ["-m", "iris_rag.mcp"],
      "env": {
        "OPENAI_API_KEY": "your-key"
      }
    }
  }
}

Architecture Overview

iris_rag/
โ”œโ”€โ”€ core/              # Abstract base classes (RAGPipeline, VectorStore)
โ”œโ”€โ”€ pipelines/         # Pipeline implementations
โ”‚   โ”œโ”€โ”€ basic.py                    # BasicRAG
โ”‚   โ”œโ”€โ”€ basic_rerank.py             # Reranking pipeline
โ”‚   โ”œโ”€โ”€ crag.py                     # Corrective RAG
โ”‚   โ”œโ”€โ”€ multi_query_rrf.py          # Multi-query with RRF
โ”‚   โ”œโ”€โ”€ graphrag.py                 # Graph-based RAG
โ”‚   โ””โ”€โ”€ hybrid_graphrag.py          # Hybrid multi-modal
โ”œโ”€โ”€ storage/           # Vector store implementations
โ”‚   โ”œโ”€โ”€ vector_store_iris.py        # IRIS vector store
โ”‚   โ””โ”€โ”€ schema_manager.py           # Schema management
โ”œโ”€โ”€ mcp/              # Model Context Protocol server
โ”œโ”€โ”€ api/              # Production REST API
โ”œโ”€โ”€ services/         # Business logic (entity extraction, etc.)
โ”œโ”€โ”€ config/           # Configuration management
โ””โ”€โ”€ validation/       # Pipeline contract validation

Documentation

๐Ÿ“š Comprehensive documentation for every use case:

Performance Benchmarks

Native IRIS vector search delivers:

  • ๐Ÿš€ 50-100x faster than traditional solutions for hybrid search
  • โšก Sub-second queries on millions of documents
  • ๐Ÿ“Š Linear scaling with IRIS clustering
  • ๐Ÿ’พ 10x less memory than external vector databases

Testing & Quality

# Run comprehensive test suite
make test

# Test specific categories
pytest tests/unit/           # Unit tests (fast)
pytest tests/integration/    # Integration tests (with IRIS)
pytest tests/contract/       # API contract validation

# Run with coverage
pytest --cov=iris_rag --cov-report=html

For detailed testing documentation, see DEVELOPMENT.md

Research & References

This implementation is based on peer-reviewed research:

Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • Development setup
  • Testing guidelines
  • Code style and standards
  • Pull request process

Community & Support

License

MIT License - see LICENSE for details.


Built with โค๏ธ by the InterSystems Community

Powering intelligent applications with enterprise-grade RAG

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

iris_vector_rag-0.2.3.tar.gz (509.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

iris_vector_rag-0.2.3-py3-none-any.whl (543.2 kB view details)

Uploaded Python 3

File details

Details for the file iris_vector_rag-0.2.3.tar.gz.

File metadata

  • Download URL: iris_vector_rag-0.2.3.tar.gz
  • Upload date:
  • Size: 509.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for iris_vector_rag-0.2.3.tar.gz
Algorithm Hash digest
SHA256 dbdb3a9f4d28a45dc0fadefec9e4f20dc90a5191e98a356cfb6d0e7b07c087cc
MD5 6f36c7e6fda1321468eafcef3a998037
BLAKE2b-256 73ed9f980a27d37d903b419350082ec1578c0f002ce7f4cbeb032858715d8187

See more details on using hashes here.

File details

Details for the file iris_vector_rag-0.2.3-py3-none-any.whl.

File metadata

File hashes

Hashes for iris_vector_rag-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 f39e490f7456521d4c55379f2dda7b505bd186b504f8bb10aaaa28c45f37092c
MD5 986544d93831870f95b46265146ee4b8
BLAKE2b-256 a895209e2b1576b5252c666d75e145b4c1f52f5b3f66ba3f598c02fdcfdb90b0

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page