Enterprise RAG pipelines with native IRIS vector search. 6 production implementations with RAGAS evaluation, LangChain, AWS/Azure configs. No external VectorDB required.


IRIS Vector RAG Templates

Production-ready Retrieval-Augmented Generation (RAG) pipelines powered by InterSystems IRIS Vector Search

Build intelligent applications that combine large language models with your enterprise data using battle-tested RAG patterns and native vector search capabilities.

Author: Thomas Dyar (thomas.dyar@intersystems.com)

Why IRIS Vector RAG?

🚀 Production-Ready - Six proven RAG architectures ready to deploy, not research prototypes

⚡ Blazing Fast - Native IRIS vector search with HNSW indexing, no external vector databases needed

🔧 Unified API - Swap between RAG strategies with a single line of code

📊 Enterprise-Grade - ACID transactions, connection pooling, and horizontal scaling built-in

🎯 Compatible - Works with LangChain, RAGAS, and your existing ML stack

🧪 Fully Validated - Comprehensive test suite with automated contract validation

Available RAG Pipelines

Pipeline Type	Use Case	Retrieval Method	When to Use
basic	Standard retrieval	Vector similarity	General Q&A, getting started, baseline comparisons
basic_rerank	Improved precision	Vector + cross-encoder reranking	Higher accuracy requirements, legal/medical domains
crag	Self-correcting	Vector + evaluation + web search fallback	Dynamic knowledge, fact-checking, current events
graphrag	Knowledge graphs	Vector + text + graph + RRF fusion	Complex entity relationships, research, medical knowledge
multi_query_rrf	Multi-perspective	Query expansion + reciprocal rank fusion	Complex queries, comprehensive coverage needed
pylate_colbert	Fine-grained matching	ColBERT late interaction embeddings	Nuanced semantic understanding, high precision
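
The graphrag and multi_query_rrf rows both mention reciprocal rank fusion (RRF). As a rough illustration of the idea, and not the library's own implementation, the sketch below fuses several ranked lists of document IDs with the commonly used score 1 / (k + rank). The function name `rrf_fuse`, the constant `k=60`, and the sample document IDs are assumptions made for this example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of doc IDs; each list is ordered best-first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more score the higher it ranks in each list.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical vector-search ranking
text_hits = ["doc_b", "doc_d", "doc_a"]    # hypothetical text-search ranking
fused = rrf_fuse([vector_hits, text_hits])
print(fused)
```

Documents that appear near the top of more than one list rise in the fused ranking, which is why RRF suits pipelines that combine several retrieval methods.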

Quick Start

1. Install

# Clone repository
git clone https://github.com/intersystems-community/iris-vector-rag.git
cd iris-vector-rag

# Set up the environment (requires the uv package manager)
make setup-env
make install
source .venv/bin/activate

# GraphRAG dependency (required for graphrag pipelines)
pip install iris-vector-graph

2. Start IRIS Database

# Start IRIS with Docker Compose
docker-compose up -d

# Initialize database schema
make setup-db

# Optional: Load sample medical data
make load-data

3. Configure API Keys

cat > .env << 'EOF'
OPENAI_API_KEY=your-key-here
ANTHROPIC_API_KEY=your-key-here  # Optional, for Claude models
IRIS_HOST=localhost
IRIS_PORT=1972
IRIS_NAMESPACE=USER
IRIS_USERNAME=_SYSTEM
IRIS_PASSWORD=SYS
EOF
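
How the library itself reads these variables is not shown here. As a minimal sketch, assuming the values end up in the process environment, application code could collect the IRIS settings like this. The `IrisSettings` dataclass and `load_iris_settings` helper are hypothetical names for this example; the fallback values mirror the sample .env file above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IrisSettings:
    host: str
    port: int
    namespace: str
    username: str
    password: str


def load_iris_settings(env=None):
    """Build settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    return IrisSettings(
        host=env.get("IRIS_HOST", "localhost"),
        port=int(env.get("IRIS_PORT", "1972")),  # env values are strings
        namespace=env.get("IRIS_NAMESPACE", "USER"),
        username=env.get("IRIS_USERNAME", "_SYSTEM"),
        password=env.get("IRIS_PASSWORD", "SYS"),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_iris_settings({"IRIS_HOST": "iris.internal", "IRIS_PORT": "51972"})
print(settings.host, settings.port, settings.namespace)
```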

4. Run Your First Query

from iris_vector_rag import create_pipeline
from iris_vector_rag.core.models import Document

# Create pipeline with automatic validation
pipeline = create_pipeline('basic', validate_requirements=True)

# Load your documents

docs = [
    Document(
        page_content="RAG combines retrieval with generation for accurate AI responses.",
        metadata={"source": "rag_basics.pdf", "page": 1}
    ),
    Document(
        page_content="Vector search finds semantically similar content using embeddings.",
        metadata={"source": "vector_search.pdf", "page": 5}
    )
]

pipeline.load_documents(documents=docs)

# Query with LLM-generated answer
result = pipeline.query(
    query="What is RAG?",
    top_k=5,
    generate_answer=True
)

print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
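
To make the retrieval step less of a black box, here is a small pure-Python sketch of what "vector similarity" means: score every stored vector against the query vector by cosine similarity and keep the best `top_k`. It is a brute-force illustration only. The helper names and the three-dimensional toy vectors are assumptions, and an HNSW index such as the one mentioned earlier answers the same kind of query approximately instead of scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=5):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real ones have hundreds of dimensions.
doc_vecs = {
    "rag_basics.pdf": [0.9, 0.1, 0.0],
    "vector_search.pdf": [0.1, 0.9, 0.0],
}
hits = top_k([1.0, 0.0, 0.0], doc_vecs, k=1)
print(hits)
```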

Unified API Across All Pipelines

Switch RAG strategies with one line - all pipelines share the same interface:

from iris_vector_rag import create_pipeline

# Start with basic
pipeline = create_pipeline('basic')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Upgrade to basic_rerank for better accuracy
pipeline = create_pipeline('basic_rerank')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# Try graphrag for entity reasoning
pipeline = create_pipeline('graphrag')
result = pipeline.query("What are the latest cancer treatment approaches?", top_k=5)

# All pipelines return the same response format
print(f"Answer: {result['answer']}")
print(f"Sources: {result['sources']}")
print(f"Retrieved: {len(result['retrieved_documents'])} documents")
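
Because every pipeline exposes the same `query` call and response keys, a side-by-side comparison can be written once. The sketch below is one way to do that, assuming only the interface shown above; `compare_pipelines` and the stub class are hypothetical names for this example, and the stub exists solely so the code runs without IRIS or an LLM key.

```python
import time


def compare_pipelines(factory, pipeline_types, query, top_k=5):
    """Run one query through several pipelines and collect comparable rows."""
    rows = []
    for pipeline_type in pipeline_types:
        pipeline = factory(pipeline_type)
        started = time.perf_counter()
        result = pipeline.query(query, top_k=top_k)
        rows.append({
            "pipeline": pipeline_type,
            "num_retrieved": len(result["retrieved_documents"]),
            "seconds": time.perf_counter() - started,
            "answer": result["answer"],
        })
    return rows


class _StubPipeline:
    """Stand-in pipeline that returns a canned response."""

    def __init__(self, name):
        self.name = name

    def query(self, query, top_k=5):
        return {"answer": f"[{self.name}] stub answer",
                "retrieved_documents": ["doc"] * top_k}


rows = compare_pipelines(_StubPipeline, ["basic", "basic_rerank"], "What is RAG?", top_k=3)
for row in rows:
    print(row["pipeline"], row["num_retrieved"])
```

With the library installed and IRIS running, passing `create_pipeline` as the `factory` argument should give the same kind of table for real pipelines.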

Standardized Response Format

LangChain- and RAGAS-compatible responses:

{
    "query": "What is diabetes?",
    "answer": "Diabetes is a chronic metabolic condition...",  # LLM answer
    "retrieved_documents": [Document(...)],                   # LangChain Documents
    "contexts": ["context 1", "context 2"],                   # RAGAS contexts
    "sources": ["medical.pdf p.12", "diabetes.pdf p.3"],     # Source citations
    "execution_time": 0.523,
    "metadata": {
        "num_retrieved": 5,
        "pipeline_type": "basic",
        "retrieval_method": "vector",
        "generated_answer": True,
        "processing_time": 0.523
    }
}
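
A fixed response shape is easy to check in tests. The following sketch lists the top-level keys from the example above and reports anything missing; the function name `find_response_problems` is hypothetical, and the check on `contexts` assumes that RAGAS-style contexts are plain strings, as the sample response suggests.

```python
EXPECTED_KEYS = (
    "query", "answer", "retrieved_documents",
    "contexts", "sources", "execution_time", "metadata",
)


def find_response_problems(result):
    """Return a list of human-readable problems; empty means the shape is fine."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in result]
    if not all(isinstance(context, str) for context in result.get("contexts", [])):
        problems.append("contexts must be plain strings")
    return problems


good = {
    "query": "What is diabetes?",
    "answer": "Diabetes is a chronic metabolic condition...",
    "retrieved_documents": [],
    "contexts": ["context 1", "context 2"],
    "sources": ["medical.pdf p.12"],
    "execution_time": 0.523,
    "metadata": {"pipeline_type": "basic"},
}
bad = {"query": "What is diabetes?", "contexts": [1, 2]}

print(find_response_problems(good))
print(find_response_problems(bad))
```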

Pipeline Selection

Each pipeline uses the same API - just change the pipeline type:

basic - Fast vector similarity search, great for getting started
basic_rerank - Vector + cross-encoder reranking for higher accuracy
crag - Self-correcting with web search fallback for current events
graphrag - Multi-modal: vector + text + knowledge graph fusion
multi_query_rrf - Query expansion with reciprocal rank fusion
pylate_colbert - ColBERT late interaction for fine-grained matching
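
As an illustration of the two-stage idea behind basic_rerank, the sketch below re-orders first-stage candidates with a pairwise scoring function. A real cross-encoder would score each (query, passage) pair with a neural model; the `token_overlap` scorer here is a deliberately simple stand-in, and all names in the block are assumptions for this example.

```python
def rerank(query, candidates, score_fn, top_k=5):
    """Re-order first-stage candidates by a pairwise (query, text) score."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


def token_overlap(query, text):
    """Toy scorer: fraction of query tokens that also occur in the text."""
    query_tokens = set(query.lower().split())
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / max(len(query_tokens), 1)


candidates = [
    "Vector search finds similar content using embeddings.",
    "RAG combines retrieval with generation.",
    "Connection pools manage database sessions.",
]
best = rerank("what is rag retrieval", candidates, token_overlap, top_k=1)
print(best)
```

The first stage can afford to be fast and broad because the second stage only has to score a short candidate list.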

📖 Complete Pipeline Guide → - Decision tree, performance comparison, configuration examples

Enterprise Features

Production-Ready Database

IRIS provides everything you need in one database:

✅ Native vector search (no external vector DB needed)
✅ ACID transactions (your data is safe)
✅ SQL + NoSQL + Vector in one platform
✅ Horizontal scaling and clustering
✅ Enterprise-grade security and compliance

Connection Pooling

Automatic concurrency management:

from iris_vector_rag.storage import IRISVectorStore

# Connection pool handles concurrency automatically
store = IRISVectorStore()

# Safe for multi-threaded applications
# Pool manages connections, no manual management needed
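
For readers unfamiliar with pooling, this is a generic sketch of the pattern using only the standard library. It is not the pool used by IRISVectorStore: the `ConnectionPool` class is hypothetical, and plain `object()` instances stand in for database connections.

```python
import queue
import threading
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: callers borrow a connection and always return it."""

    def __init__(self, connect, size=4):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # returned even if the caller raised


pool = ConnectionPool(connect=object, size=2)
seen = []
seen_lock = threading.Lock()


def worker():
    with pool.connection() as conn:
        with seen_lock:
            seen.append(id(conn))


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(seen), len(set(seen)))
```

Eight workers complete their work while sharing at most two underlying connections, which is the property a pool is meant to provide.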

Automatic Schema Management

Database schema created and migrated automatically:

from iris_vector_rag import create_pipeline

pipeline = create_pipeline('basic', validate_requirements=True)
# ✅ Checks database connection
# ✅ Validates schema exists
# ✅ Migrates to latest version if needed
# ✅ Reports validation results
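
The check-then-migrate flow can be pictured with a small, generic example. The sketch below uses SQLite from the standard library purely as a stand-in database; the table layout, the `MIGRATIONS` list, and the `migrate` function are assumptions for illustration and do not reflect the schema or migration logic this package uses with IRIS.

```python
import sqlite3

# Ordered schema changes; entry N brings the database to version N.
MIGRATIONS = [
    "CREATE TABLE documents (id TEXT PRIMARY KEY, content TEXT)",
    "ALTER TABLE documents ADD COLUMN metadata TEXT",
]


def migrate(conn):
    """Apply any migrations newer than the stored schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS, start=1):
        if version > current:
            conn.execute(statement)
            conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # first run applies every pending migration
print(migrate(conn))  # second run finds nothing left to apply
```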

RAGAS Evaluation Built-In

Measure your RAG pipeline performance:

# Evaluate all pipelines on your data
make test-ragas-sample

# Generates detailed metrics:
# - Answer Correctness
# - Faithfulness
# - Context Precision
# - Context Recall
# - Answer Relevance

IRIS EMBEDDING: Auto-Vectorization

Automatic embedding generation with model caching - eliminates repeated model loading overhead for faster document vectorization.

Key Features:

⚡ Intelligent model caching - models stay in memory across operations
🎯 Multi-field vectorization - combine title, abstract, and content fields
💾 Automatic device selection - GPU, Apple Silicon (MPS), or CPU fallback

from iris_vector_rag import create_pipeline

# Enable IRIS EMBEDDING support
pipeline = create_pipeline(
    'basic',
    embedding_config='medical_embeddings_v1'
)

# Documents auto-vectorize on INSERT
pipeline.load_documents(documents=docs)
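
The two ideas in the feature list, keeping a loaded model in memory and combining several fields into one text, can be sketched without any ML dependency. In the example below `get_model` returns a placeholder dictionary where real code would load an embedding model, and both function names and the model name are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=4)
def get_model(model_name):
    """Load a model once per name; later calls reuse the cached object."""
    return {"name": model_name}  # placeholder for an expensive model object


def combine_fields(record, fields=("title", "abstract", "content")):
    """Join the non-empty fields into a single text to embed."""
    parts = [record[name].strip() for name in fields if record.get(name)]
    return "\n\n".join(parts)


first = get_model("demo-embedder")
second = get_model("demo-embedder")  # served from the cache, no reload
text = combine_fields({
    "title": "Diabetes",
    "abstract": "",
    "content": "A chronic condition.",
})
print(first is second, get_model.cache_info().hits)
print(text)
```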

📖 Complete IRIS EMBEDDING Guide → - Configuration, performance tuning, multi-field vectorization, troubleshooting

Fast Iteration & Evaluation (New)

Develop and benchmark RAG pipelines with minimal latency and cost.

💾 Persistent Disk Caching - Cache LLM responses to local JSON files to avoid redundant API costs and enable offline development.
⚡ Auto-Hardening Bypass - Automatically bypasses IRIS password locks for instant connectivity in local/CI containers.
📊 Unified Evaluation Framework - Standardized multi-hop metrics (Recall@K, EM, F1) and dataset loaders (HotpotQA, MuSiQue).
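
To show what persistent disk caching of LLM responses involves, here is a minimal sketch that keys each response by a hash of the model name and prompt and stores it as a JSON file. The `DiskLLMCache` class, its file layout, and the `fake_llm` function are assumptions for this example and may differ from what the library's disk backend does.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class DiskLLMCache:
    """Store one JSON file per (model, prompt) pair."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, model, prompt):
        key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        return self.directory / f"{key}.json"

    def get_or_call(self, model, prompt, call_llm):
        path = self._path(model, prompt)
        if path.exists():
            # Cache hit: no API call, so this also works offline.
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        response = call_llm(prompt)
        record = {"model": model, "prompt": prompt, "response": response}
        path.write_text(json.dumps(record), encoding="utf-8")
        return response


calls = []


def fake_llm(prompt):
    """Stand-in for a paid API call; records how often it is invoked."""
    calls.append(prompt)
    return f"echo: {prompt}"


cache = DiskLLMCache(tempfile.mkdtemp())
first = cache.get_or_call("demo-model", "What is RAG?", fake_llm)
second = cache.get_or_call("demo-model", "What is RAG?", fake_llm)
print(first == second, len(calls))
```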

from iris_vector_rag import create_pipeline

# Enable disk-based caching
pipeline = create_pipeline('basic', llm_cache_backend='disk')

# Standardized multi-hop evaluation
from iris_vector_rag.evaluation import DatasetLoader, MetricsCalculator
loader = DatasetLoader()
queries = loader.load('musique', sample_size=100)
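
For reference, the three metrics named above are usually defined as follows in question-answering benchmarks. This is a standalone sketch of those conventional definitions (normalized exact match, token-level F1, and Recall@K over document IDs); it does not reproduce the `MetricsCalculator` class, whose interface is not shown here, and the function names are chosen for this example.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Share of relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(token_f1("Paris France", "Paris"))
print(recall_at_k(["d1", "d2", "d3"], ["d2", "d9"], k=2))
```

Recall@K matters for multi-hop datasets because an answer often needs every supporting passage, not just the single best one.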

Model Context Protocol (MCP) Support

Expose RAG pipelines as MCP tools for Claude Desktop and other MCP clients - enables conversational RAG workflows where Claude queries your documents during conversations.

# Start MCP server
python -m iris_vector_rag.mcp

All pipelines available as MCP tools: rag_basic, rag_basic_rerank, rag_crag, rag_graphrag, rag_multi_query_rrf, rag_pylate_colbert.
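
MCP clients invoke tools with JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a request for the `rag_basic` tool. The argument names `query` and `top_k` are assumptions that mirror the Python `pipeline.query` call; the actual input schema of each tool is whatever the server reports via `tools/list`.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names here are assumed, not taken from the server's schema.
request = build_tool_call(1, "rag_basic", {"query": "What is RAG?", "top_k": 5})
print(json.dumps(request, indent=2))
```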

📖 Complete MCP Integration Guide → - Claude Desktop setup, configuration, testing, production deployment

Architecture Overview

Framework-first design with abstract base classes (RAGPipeline, VectorStore) and concrete implementations for 6 production-ready pipelines.

Key Components: Core abstractions, pipeline implementations, IRIS vector store, MCP server, REST API, validation framework.
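
To give a feel for the framework-first design, here is a simplified sketch of two abstract base classes and one concrete pair built on them. Only the class names RAGPipeline and VectorStore come from the description above; the method signatures, the in-memory store, and the keyword-based pipeline are assumptions for illustration and will differ from the package's real abstractions.

```python
from abc import ABC, abstractmethod


class VectorStore(ABC):
    @abstractmethod
    def add(self, doc_id, text):
        """Store one document."""

    @abstractmethod
    def search(self, query, top_k):
        """Return up to top_k (doc_id, text) pairs, best first."""


class RAGPipeline(ABC):
    def __init__(self, store):
        self.store = store

    @abstractmethod
    def query(self, query, top_k=5):
        """Return a response dictionary for the query."""


class InMemoryStore(VectorStore):
    """Toy store that ranks by shared words instead of embeddings."""

    def __init__(self):
        self._docs = {}

    def add(self, doc_id, text):
        self._docs[doc_id] = text

    def search(self, query, top_k):
        words = set(query.lower().split())
        ranked = sorted(
            self._docs.items(),
            key=lambda item: len(words & set(item[1].lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


class KeywordPipeline(RAGPipeline):
    def query(self, query, top_k=5):
        hits = self.store.search(query, top_k)
        return {
            "query": query,
            "contexts": [text for _, text in hits],
            "sources": [doc_id for doc_id, _ in hits],
        }


store = InMemoryStore()
store.add("a", "RAG combines retrieval with generation")
store.add("b", "Vector search finds similar content")
result = KeywordPipeline(store).query("vector search", top_k=1)
print(result["sources"])
```

New retrieval strategies then only need to subclass the pipeline base class, while callers keep using the same `query` interface.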

📖 Comprehensive Architecture Guide → - System design, component interactions, extension points

Documentation

📚 Comprehensive documentation for every use case:

User Guide - Complete installation and usage
API Reference - Detailed API documentation
Pipeline Guide - When to use each pipeline
MCP Integration - Model Context Protocol setup
Production Readiness - Deployment checklist

Testing & Quality

make test  # Run comprehensive test suite
pytest tests/unit/           # Unit tests
pytest tests/integration/    # Integration tests

Research & References

This implementation draws on published research:

Basic RAG: Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, NeurIPS 2020
CRAG: Yan et al., Corrective Retrieval Augmented Generation, arXiv 2024
GraphRAG: Edge et al., From Local to Global: A Graph RAG Approach to Query-Focused Summarization, arXiv 2024
ColBERT: Khattab & Zaharia, ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT, SIGIR 2020

Contributing

We welcome contributions! See CONTRIBUTING.md for development setup, testing guidelines, and pull request process.

Community & Support

🐛 Issues: GitHub Issues
📖 Documentation: Full Documentation
🏢 Enterprise Support: InterSystems Support

License

MIT License - see LICENSE for details.


