Agentic Retrieval Augmented Generation (RAG) with LanceDB

Haiku RAG

Retrieval-Augmented Generation (RAG) library built on LanceDB.

haiku.rag is a Retrieval-Augmented Generation (RAG) library built on LanceDB as a local vector database. It stores embeddings in LanceDB and combines semantic (vector) search with full-text search through native hybrid search using Reciprocal Rank Fusion. Both open-source (Ollama) and commercial (OpenAI, VoyageAI) embedding providers are supported.
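
To make the fusion step concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is purely illustrative: haiku.rag relies on LanceDB's native hybrid search for this, and the function below is not part of the library's API.

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document ids into a single ranking.

    Each document contributes 1 / (k + rank) from every list it appears in;
    documents are then sorted by their summed score.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a vector-search ranking with a full-text-search ranking
fused = reciprocal_rank_fusion(
    [["doc2", "doc1", "doc3"], ["doc1", "doc3", "doc2"]]
)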

Note: Configuration now uses YAML files instead of environment variables. If you're upgrading from an older version, run haiku-rag init-config --from-env to migrate your .env file to haiku.rag.yaml. See Configuration for details.

Features

  • Local LanceDB: No external servers required; also supports LanceDB Cloud, S3, Google Cloud, and Azure storage
  • Multiple embedding providers: Ollama, VoyageAI, OpenAI, vLLM
  • Multiple QA providers: Any provider/model supported by Pydantic AI
  • Research graph (multi‑agent): Plan → Search → Evaluate → Synthesize with agentic AI
  • Native hybrid search: Vector + full-text search with native LanceDB RRF reranking
  • Reranking: Search results reranked by default with MixedBread AI, Cohere, or vLLM
  • Question answering: Built-in QA agents over your documents
  • File monitoring: Auto-index files when running as a server
  • 40+ file formats: PDF, DOCX, HTML, Markdown, code files, URLs
  • MCP server: Expose as tools for AI assistants
  • A2A agent: Conversational agent with context and multi-turn dialogue
  • CLI & Python API: Use from command line or Python

Quick Start

# Install
# Python 3.12 or newer required
uv pip install haiku.rag

# Add documents
haiku-rag add "Your content here"
haiku-rag add "Your content here" --meta author=alice --meta topic=notes
haiku-rag add-src document.pdf --meta source=manual

# Search
haiku-rag search "query"

# Ask questions
haiku-rag ask "Who is the author of haiku.rag?"

# Ask questions with citations
haiku-rag ask "Who is the author of haiku.rag?" --cite

# Deep QA (multi-agent question decomposition)
haiku-rag ask "Who is the author of haiku.rag?" --deep --cite

# Deep QA with verbose output
haiku-rag ask "Who is the author of haiku.rag?" --deep --verbose

# Multi‑agent research (iterative plan/search/evaluate)
haiku-rag research \
  "What are the main drivers and trends of global temperature anomalies since 1990?" \
  --max-iterations 2 \
  --confidence-threshold 0.8 \
  --max-concurrency 3 \
  --verbose

# Rebuild database (re-chunk and re-embed all documents)
haiku-rag rebuild

# Start server with file monitoring
haiku-rag serve --monitor

To customize settings, create a haiku.rag.yaml config file (see Configuration).
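
For orientation only, such a file might look roughly like the sketch below. The key names are hypothetical placeholders rather than the actual schema; consult the Configuration documentation for the real options.

# haiku.rag.yaml -- illustrative sketch only, not the actual schema
embeddings:
  provider: ollama          # or openai, voyageai, vllm
  model: mxbai-embed-large
qa:
  provider: openai
  model: gpt-4o-mini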

Python Usage

from haiku.rag.client import HaikuRAG
from haiku.rag.research import (
    PlanNode,
    ResearchContext,
    ResearchDeps,
    ResearchState,
    build_research_graph,
    stream_research_graph,
)
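
# Note: the calls below are asynchronous; in a standalone script, wrap them
# in an async function and run it with asyncio.run(...).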

async with HaikuRAG("database.lancedb") as client:
    # Add document
    doc = await client.create_document("Your content")

    # Search (reranking enabled by default)
    results = await client.search("query")
    for chunk, score in results:
        print(f"{score:.3f}: {chunk.content}")

    # Ask questions
    answer = await client.ask("Who is the author of haiku.rag?")
    print(answer)

    # Ask questions with citations
    answer = await client.ask("Who is the author of haiku.rag?", cite=True)
    print(answer)

    # Multi‑agent research pipeline (Plan → Search → Evaluate → Synthesize)
    graph = build_research_graph()
    question = (
        "What are the main drivers and trends of global temperature "
        "anomalies since 1990?"
    )
    state = ResearchState(
        context=ResearchContext(original_question=question),
        max_iterations=2,
        confidence_threshold=0.8,
        max_concurrency=2,
    )
    deps = ResearchDeps(client=client)

    # Blocking run (final result only)
    result = await graph.run(
        PlanNode(provider="openai", model="gpt-4o-mini"),
        state=state,
        deps=deps,
    )
    print(result.output.title)

    # Streaming progress (log/report/error events)
    async for event in stream_research_graph(
        graph,
        PlanNode(provider="openai", model="gpt-4o-mini"),
        state,
        deps,
    ):
        if event.type == "log":
            iteration = event.state.iterations if event.state else state.iterations
            print(f"[{iteration}] {event.message}")
        elif event.type == "report":
            print("\nResearch complete!\n")
            print(event.report.title)
            print(event.report.executive_summary)

MCP Server

Use with AI assistants like Claude Desktop:

haiku-rag serve --stdio

Provides tools for document management and search directly in your AI assistant.
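
For Claude Desktop, this is typically registered through an mcpServers entry in claude_desktop_config.json along the lines of the following (the server name is arbitrary, and the command may need a full path if haiku-rag is installed in a virtual environment):

{
  "mcpServers": {
    "haiku-rag": {
      "command": "haiku-rag",
      "args": ["serve", "--stdio"]
    }
  }
}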

A2A Agent

Run as a conversational agent with the Agent-to-Agent protocol:

# Start the A2A server
haiku-rag serve --a2a

# Connect with the interactive client (in another terminal)
haiku-rag a2aclient

The A2A agent provides:

  • Multi-turn dialogue with context
  • Intelligent multi-search for complex questions
  • Source citations with titles and URIs
  • Full document retrieval on request

Examples

See the examples directory for working examples:

  • Interactive Research Assistant - Full-stack research assistant built with Pydantic AI and AG-UI, featuring human-in-the-loop approval and real-time state synchronization
  • Docker Setup - Complete Docker deployment with file monitoring, MCP server, and A2A agent
  • A2A Security - Authentication examples (API key, OAuth2, GitHub)

Documentation

Full documentation at: https://ggozad.github.io/haiku.rag/

Download files

Download the file for your platform.

Source Distribution

haiku_rag-0.13.0.tar.gz (277.2 kB)

Built Distribution

haiku_rag-0.13.0-py3-none-any.whl (94.3 kB)

File details

Details for the file haiku_rag-0.13.0.tar.gz.

File metadata

  • Download URL: haiku_rag-0.13.0.tar.gz
  • Size: 277.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.14

File hashes

Hashes for haiku_rag-0.13.0.tar.gz:

  • SHA256: 31fd08178cf4a2ffaf5033b6c7135a1c04ac7c55e8cf75e39dfb631ba0b4c227
  • MD5: 9ed1d70235f9fd7a688a1224970c1467
  • BLAKE2b-256: 34b052a0e0fff534fdda5bf22a6f9a33cb063af3dd3e26b8bed56e5adeb8f85e

File details

Details for the file haiku_rag-0.13.0-py3-none-any.whl.

File metadata

  • Download URL: haiku_rag-0.13.0-py3-none-any.whl
  • Size: 94.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.14

File hashes

Hashes for haiku_rag-0.13.0-py3-none-any.whl:

  • SHA256: f50a8d191db90aae3f9d2e17d27d38074aa436bd9f239b3ffdadabc7bcfbe925
  • MD5: 223ae38d20c35b06c94b1bf1baa0d449
  • BLAKE2b-256: cd5d9749e1b343cdd1dc7b3f6e7b85d94d932d33a61873cf888109d7f6b5303f
