
lilbee

This is an experimental tool and a work in progress. There may be issues with some formats, and performance at scale is unknown at this time.


A local, offline knowledge base. Add documents and code, ask questions grounded in what's actually there.



Why lilbee

Index your documents and code into a local knowledge base, then ask questions grounded in what's actually there. Most tools like this only handle code. lilbee handles PDFs, Word docs, EPUBs, and code too, with AST-aware chunking.

  • Documents and code alike — add anything from a vehicle manual to an entire codebase
  • Fully offline — runs on your machine with Ollama and LanceDB, no cloud APIs or Docker
  • Works with AI agents — MCP server and JSON CLI so agents can search your knowledge base too

Add files (lilbee add), then ask questions or search. Once indexed, search works without Ollama — agents use their own LLM to reason over the retrieved chunks.

Demos

AI agent using lilbee (opencode)


An AI coding agent shells out to lilbee --json search to ground its answers in your documents.

Interactive local offline chat

> [!NOTE]
> Entirely local on a 2021 M1 Pro with 32 GB RAM.

Model switching via tab completion, then a Q&A grounded in an indexed PDF.


Code index and search


Add a codebase and search with natural language. Tree-sitter provides AST-aware chunking.

JSON output


Structured JSON output for agents and scripts.
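For example, a script can pipe search results through a small filter. The field names below (`source`, `score`, `text`) are hypothetical placeholders, not lilbee's documented schema; inspect the real output of `lilbee --json search` before depending on them.

```shell
# NOTE: 'source', 'score', 'text' are assumed field names for illustration only.
# In a real script, 'results' would come from: results=$(lilbee --json search "query")
results='[{"source":"manual.pdf","score":0.87,"text":"Change oil every 5,000 miles."}]'

# Pull out the top hit's source and score (python3 is already a prerequisite).
top=$(echo "$results" | python3 -c 'import json,sys; r=json.load(sys.stdin)[0]; print(r["source"], r["score"])')
echo "$top"
```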

Install

Prerequisites

  • Python 3.11+
  • Ollama — the embedding model (nomic-embed-text) is auto-pulled on first sync if not already installed. If you want to use lilbee as a standalone local chat (no cloud LLM), also pull a chat model:
    ollama pull qwen3:8b     # or llama3, mistral, etc.
    
  • Optional (for image OCR): brew install tesseract / apt install tesseract-ocr

First-time download: If you're new to Ollama, expect the first run to take a while — models are large files that need to be downloaded once. For example, qwen3:8b is ~5 GB and the embedding model nomic-embed-text is ~274 MB. After the initial download, models are cached locally and load in seconds. You can check what you have installed with ollama list.
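A quick way to check that the prerequisites are on your `PATH` (a sketch; `tesseract` is only needed if you want image OCR):

```shell
# Report which prerequisite tools are installed; "missing" lines tell you what to install.
report=$(for tool in python3 ollama tesseract; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done)
echo "$report"
```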

Install

pip install lilbee        # or: uv tool install lilbee

Development (run from source)

git clone https://github.com/tobocop2/lilbee && cd lilbee
uv sync
uv run lilbee

Quick start

# Check version
lilbee --version

# Chat with a local LLM (requires Ollama)
lilbee

# Add documents to your knowledge base
lilbee add ~/Documents/manual.pdf ~/notes/

# Ask questions — answers come from your documents via a local LLM
lilbee ask "What is the recommended oil change interval?"

# Search documents — returns raw chunks, no LLM needed at query time
lilbee search "oil change interval"

# Remove a document from the knowledge base
lilbee remove manual.pdf

# Use a different local chat model (requires ollama pull <model>)
lilbee ask "Explain this" --model qwen3

# Check what's indexed
lilbee status

Interactive chat

Running lilbee or lilbee chat enters an interactive REPL with conversation history, streaming responses, and slash commands:

| Command | Description |
| --- | --- |
| `/status` | Show indexed documents and config |
| `/add [path]` | Add a file or directory (tab-completes paths) |
| `/model [name]` | Show or switch chat model (tab-completes Ollama models) |
| `/version` | Show lilbee version |
| `/reset` | Delete all documents and data (asks for confirmation) |
| `/help` | Show available commands |
| `/quit` | Exit chat |

Slash commands and paths tab-complete. A spinner shows while waiting for the first token from the LLM.

Agent integration

lilbee can serve as a local retrieval backend for AI coding agents via MCP or JSON CLI. See docs/agent-integration.md for setup and usage.
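As a rough sketch, MCP-capable agents are usually registered with a JSON entry like the one below. The server command shown (`lilbee` with an `mcp` argument) is an assumption for illustration only; the actual command, arguments, and client setup are documented in docs/agent-integration.md.

```json
{
  "mcpServers": {
    "lilbee": {
      "command": "lilbee",
      "args": ["mcp"]
    }
  }
}
```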

Supported formats

| Format | Extensions | Requires |
| --- | --- | --- |
| PDF | `.pdf` | |
| Office | `.docx`, `.xlsx`, `.pptx` | |
| eBook | `.epub` | |
| Images (OCR) | `.png`, `.jpg`, `.jpeg`, `.tiff`, `.bmp`, `.webp` | Tesseract |
| Data | `.csv`, `.tsv` | |
| Text | `.md`, `.txt`, `.html`, `.rst` | |
| Code | `.py`, `.js`, `.ts`, `.go`, `.rs`, `.java` and 150+ more via tree-sitter (AST-aware chunking) | |

Configuration

All settings are configurable via environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `LILBEE_DATA` | (platform default) | Data directory path |
| `LILBEE_CHAT_MODEL` | `qwen3:8b` | Ollama chat model |
| `LILBEE_EMBEDDING_MODEL` | `nomic-embed-text` | Embedding model |
| `LILBEE_EMBEDDING_DIM` | `768` | Embedding dimensions |
| `LILBEE_CHUNK_SIZE` | `512` | Tokens per chunk |
| `LILBEE_CHUNK_OVERLAP` | `100` | Overlap tokens between chunks |
| `LILBEE_MAX_EMBED_CHARS` | `2000` | Max characters per chunk for embedding |
| `LILBEE_TOP_K` | `10` | Number of retrieval results |
| `LILBEE_SYSTEM_PROMPT` | (built-in) | Custom system prompt for RAG answers |

The CLI also accepts `--model` / `-m`, `--data-dir` / `-d`, and `--version` / `-V` flags.
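Settings can be exported per shell session before invoking lilbee. The values below are illustrative, not recommendations:

```shell
# Tune chunking and retrieval for long-form documents (illustrative values).
export LILBEE_CHUNK_SIZE=1024     # tokens per chunk
export LILBEE_CHUNK_OVERLAP=200   # overlap tokens between chunks
export LILBEE_TOP_K=20            # retrieval results per query
echo "chunk=$LILBEE_CHUNK_SIZE overlap=$LILBEE_CHUNK_OVERLAP top_k=$LILBEE_TOP_K"
```

Any `lilbee` command run in the same shell afterwards picks up these values.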

How it works

Documents are hashed and synced automatically: new files are ingested, modified files re-ingested, and deleted files removed. Kreuzberg handles extraction and chunking across document formats (PDF, Office, images via OCR, etc.), while tree-sitter provides AST-aware chunking for code. Chunks are embedded via Ollama and stored in LanceDB. Queries embed the question, find the most relevant chunks by vector similarity, and pass them as context to the LLM.

Why Ollama for embeddings? Ollama runs on llama.cpp with native Metal support, which is significantly faster than in-process alternatives like ONNX Runtime: CoreML cannot accelerate nomic-embed-text's rotary embeddings, making CPU the only ONNX path on macOS (~170 ms/chunk versus near-instant GPU inference through Ollama).
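The hash-based sync can be illustrated in plain shell (a sketch; which hash function lilbee uses internally is an implementation detail):

```shell
# Hash-based change detection: identical bytes hash identically, so unchanged
# files can be skipped; any edit produces a new hash, signaling a re-ingest.
tmp=$(mktemp)
printf 'manual v1' > "$tmp"
h1=$(sha256sum "$tmp" | cut -d' ' -f1)
printf 'manual v1' > "$tmp"   # rewritten with identical content
h1b=$(sha256sum "$tmp" | cut -d' ' -f1)
printf 'manual v2' > "$tmp"   # actually edited
h2=$(sha256sum "$tmp" | cut -d' ' -f1)
[ "$h1" = "$h1b" ] && echo "unchanged: skip"
[ "$h1" != "$h2" ] && echo "modified: re-ingest"
rm -f "$tmp"
```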

Data location

| Platform | Path |
| --- | --- |
| macOS | `~/Library/Application Support/lilbee/` |
| Linux | `~/.local/share/lilbee/` |
| Windows | `%LOCALAPPDATA%/lilbee/` |

Override with LILBEE_DATA=/path or --data-dir.

License

MIT
