Agentic Retrieval Augmented Generation (RAG) with LanceDB

Haiku RAG

Retrieval-Augmented Generation (RAG) library built on LanceDB.

haiku.rag is a Retrieval-Augmented Generation (RAG) library built to work with LanceDB as a local vector database. It uses LanceDB to store embeddings and performs semantic (vector) search as well as full-text search, combining the two through native hybrid search with Reciprocal Rank Fusion (RRF). Both open-source (Ollama) and commercial (OpenAI, VoyageAI) embedding providers are supported.
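To illustrate how the hybrid search combines the two result lists, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. The `rrf_merge` helper is illustrative only, not part of haiku.rag's API; LanceDB performs this fusion natively.

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked result lists with Reciprocal Rank Fusion.

    Each document's score is the sum of 1 / (k + rank) over every list
    it appears in; k=60 is the constant from the original RRF paper.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A document ranked well in both lists beats one that only tops a single list.
vector_hits = ["doc_a", "doc_b", "doc_c"]
fulltext_hits = ["doc_b", "doc_d", "doc_a"]
merged = rrf_merge([vector_hits, fulltext_hits])
```

Here `doc_b` wins overall because it ranks near the top of both lists, even though it tops neither search individually.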

Features

  • Local LanceDB: No external servers required; LanceDB Cloud, S3, Google Cloud & Azure storage are also supported
  • Multiple embedding providers: Ollama, LM Studio, VoyageAI, OpenAI, vLLM
  • Multiple QA providers: Any provider/model supported by Pydantic AI (Ollama, LM Studio, OpenAI, Anthropic, etc.)
  • Native hybrid search: Vector + full-text search with native LanceDB RRF reranking
  • Reranking: Default search result reranking with MixedBread AI, Cohere, Zero Entropy, or vLLM
  • Question answering: Built-in QA agents on your documents
  • Research graph (multi‑agent): Plan → Search → Evaluate → Synthesize with agentic AI
  • File monitoring: Auto-index files when run as server
  • CLI & Python API: Use from command line or Python
  • MCP server: Expose as tools for AI assistants
  • Flexible document processing: Local (docling) or remote (docling-serve) processing

Installation

Python 3.12 or newer is required.

Full Package (Recommended)

uv pip install haiku.rag

Includes all features: document processing, all embedding providers, and rerankers.

Slim Package (Minimal Dependencies)

uv pip install haiku.rag-slim

Install only the extras you need. See the Installation documentation for available options.

Quick Start

# Add documents
haiku-rag add "Your content here"
haiku-rag add "Your content here" --meta author=alice --meta topic=notes
haiku-rag add-src document.pdf --meta source=manual

# Search
haiku-rag search "query"

# Search with filters
haiku-rag search "query" --filter "uri LIKE '%.pdf' AND title LIKE '%paper%'"

# Ask questions
haiku-rag ask "Who is the author of haiku.rag?"

# Ask questions with citations
haiku-rag ask "Who is the author of haiku.rag?" --cite

# Deep QA (multi-agent question decomposition)
haiku-rag ask "Who is the author of haiku.rag?" --deep --cite

# Deep QA with verbose output
haiku-rag ask "Who is the author of haiku.rag?" --deep --verbose

# Multi‑agent research (iterative plan/search/evaluate)
haiku-rag research \
  "What are the main drivers and trends of global temperature anomalies since 1990?" \
  --max-iterations 2 \
  --confidence-threshold 0.8 \
  --max-concurrency 3 \
  --verbose

# Rebuild database (re-chunk and re-embed all documents)
haiku-rag rebuild

# Start server with file monitoring
haiku-rag serve --monitor

To customize settings, create a haiku.rag.yaml config file (see Configuration).
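For orientation only, such a config might group provider settings as in the sketch below. The key names here are assumptions, not the actual schema; consult the Configuration documentation for the real options.

```yaml
# haiku.rag.yaml — illustrative sketch; key names are assumptions,
# see the Configuration docs for the actual schema.
embeddings:
  provider: ollama          # or: openai, voyageai, lmstudio, vllm
  model: mxbai-embed-large
qa:
  provider: openai
  model: gpt-4o-mini
```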

Python Usage

import asyncio

from haiku.rag.client import HaikuRAG
from haiku.rag.config import Config
from haiku.rag.graph.agui import stream_graph
from haiku.rag.graph.research import (
    ResearchContext,
    ResearchDeps,
    ResearchState,
    build_research_graph,
)


async def main() -> None:
    async with HaikuRAG("database.lancedb") as client:
        # Add document
        doc = await client.create_document("Your content")

        # Search (reranking enabled by default)
        results = await client.search("query")
        for chunk, score in results:
            print(f"{score:.3f}: {chunk.content}")

        # Ask questions
        answer = await client.ask("Who is the author of haiku.rag?")
        print(answer)

        # Ask questions with citations
        answer = await client.ask("Who is the author of haiku.rag?", cite=True)
        print(answer)

        # Multi-agent research pipeline (Plan → Search → Evaluate → Synthesize)
        # Graph settings (provider, model, max_iterations, etc.) come from config
        graph = build_research_graph(config=Config)
        question = (
            "What are the main drivers and trends of global temperature "
            "anomalies since 1990?"
        )
        context = ResearchContext(original_question=question)
        state = ResearchState.from_config(context=context, config=Config)
        deps = ResearchDeps(client=client)

        # Blocking run (final result only)
        report = await graph.run(state=state, deps=deps)
        print(report.title)

        # Streaming progress (AG-UI events)
        async for event in stream_graph(graph, state, deps):
            if event["type"] == "STEP_STARTED":
                print(f"Starting step: {event['stepName']}")
            elif event["type"] == "ACTIVITY_SNAPSHOT":
                print(f"  {event['content']}")
            elif event["type"] == "RUN_FINISHED":
                print("\nResearch complete!\n")
                result = event["result"]
                print(result["title"])
                print(result["executive_summary"])


asyncio.run(main())
MCP Server

Use with AI assistants like Claude Desktop:

haiku-rag serve --stdio

Provides tools for document management and search directly in your AI assistant.
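As an illustration, a Claude Desktop entry for this server might look like the following. The `mcpServers` map is Claude Desktop's standard config format, and the `"haiku-rag"` key is an arbitrary label; the command itself comes from above.

```json
{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--stdio"]
    }
  }
}
```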

Examples

See the examples directory for working examples:

  • Interactive Research Assistant - Full-stack research assistant with Pydantic AI and AG-UI featuring human-in-the-loop approval and real-time state synchronization
  • Docker Setup - Complete Docker deployment with file monitoring and MCP server
  • A2A Server - Self-contained A2A protocol server package with conversational agent interface

Documentation

Full documentation at: https://ggozad.github.io/haiku.rag/

mcp-name: io.github.ggozad/haiku-rag
