Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling

Haiku RAG

Agentic RAG built on LanceDB, Pydantic AI, and Docling.

Features

  • Hybrid search — Vector + full-text with Reciprocal Rank Fusion
  • Reranking — MxBAI, Cohere, Zero Entropy, or vLLM
  • Question answering — QA agents with citations (page numbers, section headings)
  • Research agents — Multi-agent workflows via pydantic-graph: plan, search, evaluate, synthesize
  • Document structure — Stores full DoclingDocument, enabling structure-aware context expansion and visual grounding
  • Multiple providers — Embeddings: Ollama, OpenAI, VoyageAI, LM Studio, vLLM. QA/Research: any model supported by Pydantic AI
  • Local-first — Embedded LanceDB, no servers required. Also supports S3, GCS, Azure, and LanceDB Cloud
  • MCP server — Expose as tools for AI assistants (Claude Desktop, etc.)
  • File monitoring — Watch directories and auto-index on changes
  • Inspector — TUI for browsing documents, chunks, and search results
  • CLI & Python API — Full functionality from command line or code
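The hybrid search feature above combines vector and full-text results with Reciprocal Rank Fusion. As a rough illustration of the idea only (not haiku.rag's or LanceDB's actual implementation), the sketch below fuses two ranked lists of chunk IDs. The function name, the sample IDs, and the constant `k=60` (a commonly used default for RRF) are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each item scores 1 / (k + rank) per list it appears in (rank starts at 1),
    and the per-list scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a vector search and a full-text search.
vector_hits = ["c3", "c1", "c7"]
fulltext_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

Items that rank well in both lists (here `c1` and `c3`) end up ahead of items that appear in only one, which is the property that makes rank fusion useful when the two scoring scales are not directly comparable.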

Installation

Requires Python 3.12 or newer.

Full Package (Recommended)

uv pip install haiku.rag

Includes all features: document processing, all embedding providers, and rerankers.

Slim Package (Minimal Dependencies)

uv pip install haiku.rag-slim

Install only the extras you need. See the Installation documentation for available options.

Quick Start

# Index a PDF
haiku-rag add-src paper.pdf

# Search
haiku-rag search "attention mechanism"

# Ask questions with citations
haiku-rag ask "What datasets were used for evaluation?" --cite

# Deep QA — decomposes complex questions into sub-queries
haiku-rag ask "How does the proposed method compare to the baseline on MMLU?" --deep

# Research mode — iterative planning and search
haiku-rag research "What are the limitations of the approach?" --verbose

# Watch a directory for changes
haiku-rag serve --monitor

See Configuration for customization options.

Python API

import asyncio

from haiku.rag.client import HaikuRAG


async def main():
    async with HaikuRAG("research.lancedb", create=True) as rag:
        # Index documents from a local file and from a URL
        await rag.create_document_from_source("paper.pdf")
        await rag.create_document_from_source("https://arxiv.org/pdf/1706.03762")

        # Search: returns chunks with provenance
        results = await rag.search("self-attention")
        for result in results:
            print(f"{result.score:.2f} | p.{result.page_numbers} | {result.content[:100]}")

        # QA with citations
        answer, citations = await rag.ask("What is the complexity of self-attention?")
        print(answer)
        for cite in citations:
            print(f"  [{cite.chunk_id}] p.{cite.page_numbers}: {cite.content[:80]}")


# "async with" must run inside a coroutine, so drive it with asyncio.run
asyncio.run(main())

For research agents and streaming with AG-UI, see the Agents docs.

MCP Server

Use with AI assistants like Claude Desktop:

haiku-rag serve --mcp --stdio

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}

The server provides tools for document management, search, QA, and research directly in your AI assistant.
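If the Claude Desktop configuration already lists other MCP servers, the `haiku-rag` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge on an in-memory dictionary. The helper name `add_mcp_server` and the `other-tool` entry are hypothetical, and the code deliberately does not read or write the real configuration file, whose location depends on your platform.

```python
import json


def add_mcp_server(config, name, command, args):
    """Return a copy of an MCP client config with one server entry added.

    The input dictionary is left unchanged; an existing entry with the
    same name is overwritten in the returned copy.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-tool": {"command": "other", "args": []}}}

updated = add_mcp_server(
    existing,
    name="haiku-rag",
    command="haiku-rag",
    args=["serve", "--mcp", "--stdio"],
)
print(json.dumps(updated, indent=2))
```

Working on a copy keeps the original configuration available in case the result needs to be reviewed before it is written back.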

Examples

See the examples directory for working examples:

  • Interactive Research Assistant - Full-stack research assistant built with Pydantic AI and AG-UI, featuring human-in-the-loop approval and real-time state synchronization
  • Docker Setup - Complete Docker deployment with file monitoring and MCP server
  • A2A Server - Self-contained A2A protocol server package with conversational agent interface

Documentation

Full documentation at: https://ggozad.github.io/haiku.rag/

mcp-name: io.github.ggozad/haiku-rag
