Opinionated agentic RAG powered by LanceDB, Pydantic AI, and Docling

Haiku RAG

Agentic RAG built on LanceDB, Pydantic AI, and Docling.

New: vision and multimodal search. Picture-aware ingestion captures embedded figure bytes; vision-capable QA models receive them alongside text. Multimodal embedders put picture vectors in the same space as text, enabling text-as-query → figure hits and image-as-query retrieval.

Features

  • Hybrid search — Vector + full-text with Reciprocal Rank Fusion
  • Multimodal & cross-modal search — Multimodal embedders (vLLM) put picture vectors in the same space as text; supports text-as-query → figure hits and image-as-query
  • Question answering — RAG skill with citations (page numbers, section headings)
  • Vision QA — Vision-capable models receive figure bytes alongside chunk text
  • Reranking — MxBAI, Cohere, Zero Entropy, or vLLM
  • Analysis skill — Complex analytical tasks via sandboxed Python code execution (aggregation, computation, multi-document analysis)
  • Conversational RAG — Chat TUI and web application for multi-turn conversations with session memory
  • Document structure — Stores full DoclingDocument, enabling structure-aware context expansion
  • Multiple providers — Embeddings: Ollama, OpenAI, VoyageAI, LM Studio, vLLM (multimodal). QA: any model supported by Pydantic AI
  • Local-first — Embedded LanceDB, no servers required. Also supports S3, GCS, Azure, and LanceDB Cloud
  • CLI & Python API — Full functionality from command line or code
  • MCP server — Expose as tools for AI assistants (Claude Desktop, etc.)
  • Visual grounding — View chunks highlighted on original page images
  • File monitoring — Watch directories and auto-index on changes
  • Time travel — Query the database at any historical point with --before
  • Inspector — TUI for browsing documents, chunks, and search results
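
To illustrate the hybrid search item above, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not haiku.rag's internal code: the function name `reciprocal_rank_fusion`, the chunk ids, and the constant `k=60` (a commonly used default) are assumptions chosen for illustration. Each chunk earns `1 / (k + rank)` from every ranking it appears in, and the summed scores decide the fused order.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list of (id, score) pairs.

    `rankings` is a list of lists, each ordered best-first.
    `k` dampens the influence of top ranks; 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a vector search and a full-text search
vector_hits = ["c3", "c1", "c7"]
fulltext_hits = ["c1", "c9", "c3"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
for chunk_id, score in fused:
    print(f"{chunk_id}: {score:.4f}")
```

Chunks that appear in both lists (`c1`, `c3`) rise above chunks found by only one retriever, which is the point of fusing by rank rather than by raw score: vector distances and full-text scores are not on comparable scales, but ranks are.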

Installation

Python 3.12 or newer is required.

Full Package (Recommended)

pip install haiku.rag

Includes all features: document processing, all embedding providers, and rerankers.

Using uv? uv pip install haiku.rag

Slim Package (Minimal Dependencies)

pip install haiku.rag-slim

Install only the extras you need. See the Installation documentation for available options.

Quick Start

Note: Requires an embedding provider (Ollama, OpenAI, etc.). See the Tutorial for setup instructions.

# Index a PDF
haiku-rag add-src paper.pdf

# Search
haiku-rag search "attention mechanism"

# Ask questions with citations
haiku-rag ask "What datasets were used for evaluation?" --cite

# Analyze — complex analytical tasks via code execution
haiku-rag analyze "How many documents mention transformers?"

# Interactive chat — multi-turn conversations with memory
haiku-rag chat

# Watch a directory for changes
haiku-rag serve --monitor

See Configuration for customization options.

Python API

import asyncio

from haiku.rag.client import HaikuRAG


async def main() -> None:
    async with HaikuRAG("knowledge.lancedb", create=True) as rag:
        # Index documents
        await rag.create_document_from_source("paper.pdf")
        await rag.create_document_from_source("https://arxiv.org/pdf/1706.03762")

        # Search — returns chunks with provenance
        results = await rag.search("self-attention")
        for result in results:
            print(f"{result.score:.2f} | p.{result.page_numbers} | {result.content[:100]}")

        # QA with citations
        answer, citations = await rag.ask("What is the complexity of self-attention?")
        print(answer)
        for cite in citations:
            print(f"  [{cite.chunk_id}] p.{cite.page_numbers}: {cite.content[:80]}")


# `async with` is only valid inside a coroutine, so run it through asyncio
if __name__ == "__main__":
    asyncio.run(main())

For details on the skills the client wraps, see the Skills docs.

MCP Server

Use with AI assistants like Claude Desktop:

haiku-rag serve --mcp --stdio

Add to your Claude Desktop configuration:

{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--mcp", "--stdio"]
    }
  }
}

Provides tools for document management, search, QA, and analysis directly in your AI assistant.
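
If you already have a Claude Desktop configuration with other servers, the entry has to be merged rather than pasted over the file. The following is a rough sketch of that merge in Python. The helper names `add_haiku_rag_server` and `update_config_file` are hypothetical, and the location of the configuration file depends on your platform, so the path is left as a parameter you supply.

```python
import json
from pathlib import Path

# The server entry from the configuration shown above
HAIKU_RAG_ENTRY = {"command": "haiku-rag", "args": ["serve", "--mcp", "--stdio"]}


def add_haiku_rag_server(config: dict) -> dict:
    """Return a copy of `config` with the haiku-rag entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["haiku-rag"] = dict(HAIKU_RAG_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at `path` (hypothetical helper)."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_haiku_rag_server(config), indent=2) + "\n")


# Example with an in-memory config that already lists another server
existing = {"mcpServers": {"other": {"command": "other-tool"}}, "theme": "dark"}
merged = add_haiku_rag_server(existing)
print(json.dumps(merged, indent=2))
```

The merge copies the dictionaries it changes, so the input config is left untouched and any existing servers or unrelated settings are preserved.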

Examples

See the examples directory for working examples:

  • Docker Setup - Complete Docker deployment with file monitoring and MCP server
  • Web Application - Full-stack conversational RAG with CopilotKit frontend

Documentation

Full documentation at: https://ggozad.github.io/haiku.rag/

License

This project is licensed under the MIT License.

mcp-name: io.github.ggozad/haiku-rag
