State-of-the-art RAG system with MongoDB Atlas and Voyage AI

HybridRAG

State-of-the-art Retrieval-Augmented Generation (RAG) system powered by MongoDB Atlas and Voyage AI.

Features

MongoDB Atlas Storage - Unified vector, graph, and key-value storage
Voyage AI Embeddings - High-quality embeddings with voyage-3-large (1024 dimensions)
Voyage AI Reranking - Precision reranking with rerank-2.5
Multi-Provider LLM Support - Claude, GPT-4, and Gemini
Knowledge Graph Construction - Automatic entity and relationship extraction
Entity Boosting - Enhanced retrieval through entity-aware reranking
Implicit Semantic Expansion - Find related concepts via vector similarity
Conversation Memory - Multi-turn conversation support with MongoDB-backed sessions
Hybrid Search - Combined vector and text search with MongoDB $rankFusion
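
To make the hybrid search idea concrete, here is a minimal pure-Python sketch of reciprocal rank fusion, the technique MongoDB documents as the basis of $rankFusion. It is not HybridRAG's code: the function name, the pipeline names, and the sample document IDs are illustrative assumptions, and the real fusion runs inside the database.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: dict mapping a pipeline name to a list of IDs, best first.
    weights:  optional dict mapping a pipeline name to a multiplier (default 1.0).
    k:        smoothing constant; 60 is the value commonly used for RRF.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each list contributes weight / (k + rank) to the document's score.
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties fall back to the ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a vector search and a text search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
text_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion({"vector": vector_hits, "text": text_hits})
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Documents that appear near the top of both lists (doc_b and doc_a here) outrank documents that appear in only one, which is the behavior hybrid search relies on.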

Quick Start

Prerequisites

Python 3.11+
MongoDB Atlas cluster with Vector Search enabled
Voyage AI API key
LLM API key (Anthropic, OpenAI, or Google)

Installation

# Clone the repository
git clone https://github.com/romiluz13/Hybrid-Search-RAG.git
cd Hybrid-Search-RAG

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -e .

Configuration

Create a .env file with your credentials:

# MongoDB Atlas
MONGODB_URI=mongodb+srv://user:password@cluster.mongodb.net
MONGODB_DATABASE=hybridrag

# Voyage AI
VOYAGE_API_KEY=pa-xxxxxxxxxxxxx

# LLM Provider (choose one)
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxx
# OPENAI_API_KEY=sk-xxxxxxxxxxxxx
# GEMINI_API_KEY=xxxxxxxxxxxxx

# Optional: Langfuse Observability
# LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxx
# LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxx
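
Before starting the system, it can help to check that the variables above are actually set. The sketch below is a hypothetical pre-flight check, not part of HybridRAG: the helper name is made up, treating MONGODB_DATABASE as required is an assumption, and it expects the values to already be in the process environment (or passed in as a dict).

```python
import os

# Provider name -> environment variable, as listed in the .env example.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}
# Assumption: these three are treated as mandatory for this check.
REQUIRED_KEYS = ("MONGODB_URI", "MONGODB_DATABASE", "VOYAGE_API_KEY")


def check_environment(env=None):
    """Return (missing_required_keys, available_llm_providers)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    providers = [name for name, key in PROVIDER_KEYS.items() if env.get(key)]
    return missing, providers


# Example with a hand-built mapping instead of the real environment.
sample_env = {
    "MONGODB_URI": "mongodb+srv://user:password@cluster.mongodb.net",
    "VOYAGE_API_KEY": "pa-xxxxxxxxxxxxx",
    "ANTHROPIC_API_KEY": "sk-ant-xxxxxxxxxxxxx",
}
missing, providers = check_environment(sample_env)
print("Missing:", missing)
print("LLM providers with a key:", providers)
```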

Basic Usage

import asyncio
from hybridrag import create_hybridrag, Settings

async def main():
    # Initialize HybridRAG
    settings = Settings(
        mongodb_database="my_database",
        llm_provider="anthropic",  # or "openai", "gemini"
    )
    rag = await create_hybridrag(settings)

    # Ingest documents
    await rag.ingest("path/to/document.pdf")

    # Query with conversation memory
    session_id = await rag.create_conversation_session()

    result = await rag.query_with_memory(
        query="What is this document about?",
        session_id=session_id,
        mode="mix",  # Combines knowledge graph and vector search
    )

    print(result["answer"])

asyncio.run(main())
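
Conversation memory matters once there is more than one turn. The following sketch reuses a single session for several questions, calling only create_conversation_session and query_with_memory as shown above. The helper ask_in_session and the EchoRAG stand-in are hypothetical and exist only so the example runs without MongoDB or API keys; in practice you would pass the object returned by create_hybridrag.

```python
import asyncio


async def ask_in_session(rag, questions, mode="mix"):
    """Ask several questions in one conversation session and collect the answers."""
    session_id = await rag.create_conversation_session()
    answers = []
    for question in questions:
        result = await rag.query_with_memory(
            query=question,
            session_id=session_id,
            mode=mode,
        )
        answers.append(result["answer"])
    return answers


class EchoRAG:
    """Hypothetical stand-in so the sketch runs without any services."""

    def __init__(self):
        self.turns = 0

    async def create_conversation_session(self):
        return "session-1"

    async def query_with_memory(self, query, session_id, mode):
        self.turns += 1
        return {"answer": f"[{session_id} turn {self.turns} mode={mode}] {query}"}


answers = asyncio.run(
    ask_in_session(EchoRAG(), ["What is this document about?", "Who wrote it?"])
)
for answer in answers:
    print(answer)
```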

Query Modes

Mode	Description	Best For
`mix`	Knowledge graph + vector search (recommended)	General queries
`local`	Entity-focused retrieval	Specific entity queries
`global`	Community summaries	High-level overview
`hybrid`	Local + global	Comprehensive answers
`naive`	Vector search only	Simple similarity search
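
If the mode comes from user input, validating it against the table above avoids passing an unsupported string to the query call. This is a small sketch under that assumption; the QueryMode enum and parse_mode helper are illustrative names, not part of the HybridRAG API.

```python
from enum import Enum


class QueryMode(str, Enum):
    """The five modes listed in the Query Modes table."""

    MIX = "mix"
    LOCAL = "local"
    GLOBAL = "global"
    HYBRID = "hybrid"
    NAIVE = "naive"


def parse_mode(value, default=QueryMode.MIX):
    """Normalize user input to a QueryMode, falling back to the default for None."""
    if value is None:
        return default
    try:
        return QueryMode(value.strip().lower())
    except ValueError:
        valid = ", ".join(mode.value for mode in QueryMode)
        raise ValueError(
            f"Unknown query mode {value!r}; expected one of: {valid}"
        ) from None


print(parse_mode(" Local ").value)
print(parse_mode(None).value)
```

The resulting value can then be passed as the mode argument, for example mode=parse_mode(user_input).value.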

API Server

Start the FastAPI server:

uvicorn src.hybridrag.api.main:app --reload

Endpoints

POST /query - Query the RAG system
POST /ingest - Ingest documents
POST /sessions - Create conversation session
GET /sessions/{id}/history - Get conversation history
GET /health - Health check
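
A client call to POST /query might look like the sketch below, using only the standard library. The request schema is not documented here, so the JSON field names (query, mode, session_id) are assumptions that mirror the Python API; check the server's generated OpenAPI docs for the actual body. The address assumes uvicorn's default of 127.0.0.1:8000.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # uvicorn's default host and port


def build_query_request(query, mode="mix", session_id=None, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the /query endpoint."""
    # Assumed payload shape; the real schema may use different field names.
    payload = {"query": query, "mode": mode}
    if session_id is not None:
        payload["session_id"] = session_id
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("What is this document about?")
print(request.get_method(), request.full_url)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```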

Architecture

┌─────────────────────────────────────────────────────────────┐
│                        HybridRAG                            │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Voyage    │  │   Claude/   │  │    MongoDB Atlas    │  │
│  │  Embeddings │  │  GPT/Gemini │  │  (Vector + Graph)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────┐│
│  │                   Enhancements                          ││
│  │  • Entity Boosting  • Implicit Expansion  • Reranking   ││
│  └─────────────────────────────────────────────────────────┘│
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────────────────────────────────────────────────┐│
│  │                 Conversation Memory                     ││
│  │           MongoDB-backed session storage                ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
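
The Enhancements layer is not specified in detail here. As one plausible illustration of entity boosting, the sketch below raises a chunk's rerank score for each query entity it mentions. The function name, the multiplicative 10% boost, and the substring matching are all assumptions made for illustration and should not be read as HybridRAG's actual algorithm.

```python
def boost_by_entities(chunks, query_entities, boost=0.1):
    """Re-sort reranked chunks, favoring those that mention query entities.

    chunks:         list of dicts with "text" and "score" keys.
    query_entities: entity names extracted from the query.
    boost:          relative score increase per matched entity (assumed value).
    """
    wanted = {entity.lower() for entity in query_entities}
    boosted = []
    for chunk in chunks:
        text = chunk["text"].lower()
        # Naive substring match; a real system would match extracted entities.
        matches = sum(1 for entity in wanted if entity in text)
        boosted.append(
            {
                **chunk,
                "score": chunk["score"] * (1 + boost * matches),
                "entity_matches": matches,
            }
        )
    return sorted(boosted, key=lambda chunk: chunk["score"], reverse=True)


# Hypothetical reranker output.
chunks = [
    {"text": "Atlas supports vector search.", "score": 0.80},
    {"text": "MongoDB Atlas and Voyage AI power retrieval.", "score": 0.78},
]
ranked = boost_by_entities(chunks, ["MongoDB Atlas", "Voyage AI"])
for chunk in ranked:
    print(f'{chunk["score"]:.3f}  {chunk["text"]}')
```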

Configuration Options

from hybridrag import Settings

settings = Settings(
    # MongoDB
    mongodb_database="hybridrag",

    # Embedding
    embedding_model="voyage-3-large",
    embedding_dimensions=1024,

    # Reranking
    rerank_model="rerank-2.5",
    rerank_top_k=10,

    # LLM
    llm_provider="anthropic",  # "openai", "gemini"
    llm_model="claude-sonnet-4-20250514",

    # Query defaults
    default_query_mode="mix",
    chunk_top_k=10,
    entity_top_k=60,
)
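
The embedding_dimensions value has to agree with the vector index on the collection. Below is a minimal sketch of an Atlas Vector Search index definition for 1024-dimensional vectors. The field path "embedding" and the cosine similarity are assumptions, not values taken from HybridRAG, and the project may create its index for you.

```python
VALID_SIMILARITIES = {"cosine", "dotProduct", "euclidean"}


def vector_index_definition(path="embedding", dimensions=1024, similarity="cosine"):
    """Build an Atlas Vector Search index definition as a plain dict."""
    if similarity not in VALID_SIMILARITIES:
        raise ValueError(f"similarity must be one of {sorted(VALID_SIMILARITIES)}")
    if dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,  # assumed name of the field holding the embedding
                "numDimensions": dimensions,  # must match embedding_dimensions
                "similarity": similarity,
            }
        ]
    }


definition = vector_index_definition()
print(definition)
```

With PyMongo, a definition like this is typically wrapped in a SearchIndexModel of type "vectorSearch" and passed to Collection.create_search_index.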

Development

# Run tests
pytest tests/ -v

# Type checking
mypy src/

# Format code
black src/ tests/
isort src/ tests/

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

Apache License 2.0 - see LICENSE for details.

Version 0.3.0, released Dec 15, 2025

Download files

Source Distribution

mongodb_hybridrag-0.3.0.tar.gz (380.7 kB)

Uploaded Dec 15, 2025 Source

Built Distribution

mongodb_hybridrag-0.3.0-py3-none-any.whl (401.8 kB)

Uploaded Dec 15, 2025 Python 3

File details

Details for the file mongodb_hybridrag-0.3.0.tar.gz.

File metadata

Download URL: mongodb_hybridrag-0.3.0.tar.gz
Upload date: Dec 15, 2025
Size: 380.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for mongodb_hybridrag-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`be03a3e8a26e547c98142d20b901b58bc5a258d436934cd495a5f5ef8c67df98`
MD5	`1c28d49fcffe2a22c8937568d5a5cadf`
BLAKE2b-256	`1f2c1e8cc144d0e44e76a124cb186573ea14f7a71263c089c46d4306a0e00e73`

File details

Details for the file mongodb_hybridrag-0.3.0-py3-none-any.whl.

File metadata

Download URL: mongodb_hybridrag-0.3.0-py3-none-any.whl
Upload date: Dec 15, 2025
Size: 401.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for mongodb_hybridrag-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`870b23ed68047eb88d9bc403327dac2ec881b8d7209a45b7ddaed685c91e2d0a`
MD5	`08e9c3c5390e28ca85b339909d34eb22`
BLAKE2b-256	`8b3e213bccea84775f3abf1e5ef14e58537d1f0551eddbed7a25f58a57a36eb1`
