
lilbee


Local RAG for the terminal. Ground your LLM answers in real documents — no hallucinations, no cloud, no Docker.



Why lilbee

Index your documents and code into a local knowledge base, then ask questions grounded in what's actually there. Most tools like this only handle code. lilbee handles PDFs, Word docs, EPUBs — and code too, with AST-aware chunking.

  • Documents and code alike — add anything from a vehicle manual to an entire codebase
  • Fully offline — runs on your machine with Ollama and LanceDB, no cloud APIs or Docker
  • Works with AI agents — MCP server and JSON CLI so agents can search your knowledge base too

Add files (lilbee add), then ask questions or search. Once indexed, search works without Ollama — agents use their own LLM to reason over the retrieved chunks.

Demos

AI agent using lilbee (opencode)


An AI coding agent shells out to lilbee --json search to ground its answers in your documents.

Interactive local offline chat

Note: Entirely local on a 2021 M1 Pro with 32 GB RAM.

Model switching via tab completion, then a Q&A grounded in an indexed PDF.


Code index and search


Add a codebase and search with natural language. Tree-sitter provides AST-aware chunking.
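The idea behind AST-aware chunking can be sketched with Python's stdlib ast module. This is only an illustration of splitting at definition boundaries, not lilbee's implementation — lilbee uses tree-sitter, which covers 150+ languages:

```python
import ast
import textwrap

SOURCE = textwrap.dedent('''
    def add(a, b):
        return a + b

    class Greeter:
        def hello(self, name):
            return f"hi {name}"
''')

def ast_chunks(source: str) -> list[str]:
    """Split source at top-level definitions, so each chunk is a whole
    function or class rather than an arbitrary window of lines."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno are 1-based and inclusive
            chunks.append("\n".join(lines[node.lineno - 1:node.end_lineno]))
    return chunks

for chunk in ast_chunks(SOURCE):
    print(chunk.split("\n")[0])
```

Chunking at definition boundaries keeps each embedded chunk semantically whole, which is why a natural-language query can land on a complete function instead of half of one.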

JSON output


Structured JSON output for agents and scripts.
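As a sketch of how a script might consume that output (the field names below — results, score, text — are an assumed shape rather than the documented schema; see docs/agent-integration.md for the real one):

```python
import json

# Hypothetical lilbee --json search output; the real schema may differ.
sample = """
{"results": [
  {"source": "manual.pdf", "score": 0.91, "text": "Change the oil every 5,000 miles."},
  {"source": "manual.pdf", "score": 0.74, "text": "Use 0W-20 synthetic oil."}
]}
"""

def top_chunks(raw: str, min_score: float = 0.5) -> list[str]:
    """Extract chunk texts above a score threshold from JSON search output."""
    data = json.loads(raw)
    return [r["text"] for r in data["results"] if r["score"] >= min_score]

print(top_chunks(sample))
# → ['Change the oil every 5,000 miles.', 'Use 0W-20 synthetic oil.']
```

An agent would typically paste the returned texts into its own prompt as grounding context, which is why search needs no chat model at query time.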

Install

Prerequisites

  • Python 3.11+
  • Ollama — only the embedding model is required for indexing and search (which is all agents need):
    ollama pull nomic-embed-text    # required — used for embedding during sync
    
    If you want to use lilbee as a standalone local chat (no cloud LLM), also pull a chat model:
    ollama pull mistral             # or qwen3, llama3, etc.
    
  • Optional (for image OCR): brew install tesseract / apt install tesseract-ocr

Install

pip install lilbee        # or: uv tool install lilbee

Development (run from source)

git clone https://github.com/tobocop2/lilbee && cd lilbee
uv sync
uv run lilbee

Quick start

# Check version
lilbee --version

# Chat with a local LLM (requires Ollama)
lilbee

# Add documents to your knowledge base
lilbee add ~/Documents/manual.pdf ~/notes/

# Ask questions — answers come from your documents via a local LLM
lilbee ask "What is the recommended oil change interval?"

# Search documents — returns raw chunks, no LLM needed at query time
lilbee search "oil change interval"

# Remove a document from the knowledge base
lilbee remove manual.pdf

# Use a different local chat model (requires ollama pull <model>)
lilbee ask "Explain this" --model qwen3

# Check what's indexed
lilbee status

Interactive chat

Running lilbee or lilbee chat enters an interactive REPL with conversation history, streaming responses, and slash commands:

| Command | Description |
| --- | --- |
| /status | Show indexed documents and config |
| /add [path] | Add a file or directory (tab-completes paths) |
| /model [name] | Show or switch chat model (tab-completes Ollama models) |
| /version | Show lilbee version |
| /reset | Delete all documents and data (asks for confirmation) |
| /help | Show available commands |
| /quit | Exit chat |

Slash commands and paths tab-complete. A spinner shows while waiting for the first token from the LLM.

Agent integration

lilbee can serve as a local retrieval backend for AI coding agents via MCP or JSON CLI. See docs/agent-integration.md for setup and usage.

Supported formats

| Format | Extensions | Requires |
| --- | --- | --- |
| PDF | .pdf | |
| Office | .docx, .xlsx, .pptx | |
| eBook | .epub | |
| Images (OCR) | .png, .jpg, .jpeg, .tiff, .bmp, .webp | Tesseract |
| Data | .csv, .tsv | |
| Text | .md, .txt, .html, .rst | |
| Code | .py, .js, .ts, .go, .rs, .java and 150+ more via tree-sitter (AST-aware chunking) | |

Configuration

All settings are configurable via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| LILBEE_DATA | (platform default) | Data directory path |
| LILBEE_CHAT_MODEL | mistral | Ollama chat model |
| LILBEE_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
| LILBEE_EMBEDDING_DIM | 768 | Embedding dimensions |
| LILBEE_CHUNK_SIZE | 512 | Tokens per chunk |
| LILBEE_CHUNK_OVERLAP | 100 | Overlap tokens between chunks |
| LILBEE_MAX_EMBED_CHARS | 2000 | Max characters per chunk for embedding |
| LILBEE_TOP_K | 10 | Number of retrieval results |
| LILBEE_SYSTEM_PROMPT | (built-in) | Custom system prompt for RAG answers |

The CLI also accepts the --model / -m, --data-dir / -d, and --version / -V flags.
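A script can scope these settings to a single run by overriding them in a child-process environment. A sketch: the variable names come from the table above, and the commented-out command line mirrors the JSON demo (it assumes lilbee is installed):

```python
import os
import subprocess

# Per-run overrides; names and defaults are listed in the configuration table.
env = {
    **os.environ,
    "LILBEE_CHUNK_SIZE": "1024",    # default 512 tokens per chunk
    "LILBEE_CHUNK_OVERLAP": "200",  # default 100 overlap tokens
    "LILBEE_TOP_K": "15",           # default 10 retrieval results
}

# With lilbee installed, the overrides apply only to this child process:
# subprocess.run(["lilbee", "--json", "search", "oil change interval"], env=env)
print(env["LILBEE_TOP_K"])
```

Passing env= to subprocess keeps the parent shell's configuration untouched, so concurrent scripts can use different chunking or retrieval settings against the same data directory.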

How it works

Documents are hashed and synced automatically — new files get ingested, modified files re-ingested, deleted files removed. Kreuzberg handles extraction and chunking across all document formats (PDF, Office, images via OCR, etc.), while tree-sitter provides AST-aware chunking for code. Chunks are embedded via Ollama and stored in LanceDB.

Queries embed the question, find the most relevant chunks by vector similarity, and pass them as context to the LLM.

Ollama uses llama.cpp with native Metal support, which is significantly faster than in-process alternatives like ONNX Runtime: CoreML can't accelerate nomic-embed-text's rotary embeddings, making CPU the only ONNX path on macOS (~170 ms/chunk versus near-instant GPU inference with Ollama).
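The retrieval step ("find the most relevant chunks by vector similarity") reduces to a cosine-similarity ranking. A toy sketch with hand-made 3-dimensional embeddings — real nomic-embed-text vectors are 768-dimensional, and LanceDB searches with an index rather than this linear scan:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product over the product of magnitudes."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], chunks, k: int = 2) -> list[str]:
    """Rank (text, embedding) pairs by similarity to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

chunks = [
    ("Oil change every 5,000 miles.", [0.9, 0.1, 0.0]),
    ("Tire pressure 36 psi.",         [0.1, 0.9, 0.0]),
    ("Coolant flush every 2 years.",  [0.2, 0.2, 0.9]),
]
print(top_k([1.0, 0.0, 0.0], chunks, k=1))
# → ['Oil change every 5,000 miles.']
```

In lilbee the query vector comes from embedding the question with the same model used at index time — embeddings are only comparable when query and chunks share an embedding space, which is why the embedding model is required even when no chat model is.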

Data location

| Platform | Path |
| --- | --- |
| macOS | ~/Library/Application Support/lilbee/ |
| Linux | ~/.local/share/lilbee/ |
| Windows | %LOCALAPPDATA%/lilbee/ |

Override with LILBEE_DATA=/path or --data-dir.

License

MIT
