Local RAG for the terminal. Drop files in a folder, ask questions, get answers. JSON API for coding agents.
lilbee
Local RAG for the terminal. Ground your LLM answers in real documents — no hallucinations, no cloud, no Docker.
- Why lilbee
- Demos
- Install
- Quick start
- Interactive chat
- Agent integration
- Supported formats
- Configuration
- How it works
Why lilbee
Index your documents and code into a local knowledge base, then ask questions grounded in what's actually there. Most tools like this only handle code. lilbee handles PDFs, Word docs, ebooks — and code too, with AST-aware chunking.
- Documents and code alike — add anything from a vehicle manual to an entire codebase
- Fully offline — runs on your machine with Ollama and LanceDB, no cloud APIs or Docker
- Works with AI agents — MCP server and JSON CLI so agents can search your knowledge base too
Add files (lilbee add), then ask questions or search. Once indexed, search works without Ollama — agents use their own LLM to reason over the retrieved chunks.
Demos
AI agent using lilbee (opencode)
An AI coding agent shells out to lilbee --json search to ground its answers in your documents.
Interactive local offline chat
Note: runs entirely locally on a 2021 M1 Pro with 32 GB RAM.
Model switching via tab completion, then a Q&A grounded in an indexed PDF.
Code index and search
Add a codebase and search with natural language. Tree-sitter provides AST-aware chunking.
JSON output
Structured JSON output for agents and scripts.
Install
Prerequisites
- Python 3.11+
- Ollama — only the embedding model is required for indexing and search (which is all agents need):
ollama pull nomic-embed-text # required — used for embedding during sync
If you want to use lilbee as a standalone local chat (no cloud LLM), also pull a chat model:
ollama pull mistral # or qwen3, llama3, etc.
- Optional (for image OCR):
brew install tesseract # macOS; on Debian/Ubuntu: apt install tesseract-ocr
Install
pip install lilbee # or: uv tool install lilbee
Development (run from source)
git clone https://github.com/tobocop2/lilbee && cd lilbee
uv sync
uv run lilbee
Quick start
# Check version
lilbee --version
# Chat with a local LLM (requires Ollama)
lilbee
# Add documents to your knowledge base
lilbee add ~/Documents/manual.pdf ~/notes/
# Ask questions — answers come from your documents via a local LLM
lilbee ask "What is the recommended oil change interval?"
# Search documents — returns raw chunks, no LLM needed at query time
lilbee search "oil change interval"
# Remove a document from the knowledge base
lilbee remove manual.pdf
# Use a different local chat model (requires ollama pull <model>)
lilbee ask "Explain this" --model qwen3
# Check what's indexed
lilbee status
Interactive chat
Running lilbee or lilbee chat enters an interactive REPL with conversation history, streaming responses, and slash commands:
| Command | Description |
|---|---|
| /status | Show indexed documents and config |
| /add [path] | Add a file or directory (tab-completes paths) |
| /model [name] | Show or switch chat model (tab-completes Ollama models) |
| /version | Show lilbee version |
| /reset | Delete all documents and data (asks for confirmation) |
| /help | Show available commands |
| /quit | Exit chat |
Slash commands and paths tab-complete. A spinner shows while waiting for the first token from the LLM.
Agent integration
lilbee can serve as a local retrieval backend for AI coding agents via MCP or JSON CLI. See docs/agent-integration.md for setup and usage.
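As a sketch, an agent (or any script) might wrap the JSON CLI like this. The result schema shown is an assumption for illustration, not lilbee's documented output format; check the actual JSON before relying on field names:

```python
import json
import subprocess

def search(query: str) -> list[dict]:
    # Shell out to the JSON CLI and parse stdout; assumes `lilbee` is on PATH.
    out = subprocess.run(
        ["lilbee", "--json", "search", query],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

# Hypothetical payload, shaped like a generic retrieval result; the real
# field names may differ.
sample = '[{"source": "manual.pdf", "score": 0.87, "text": "Change oil every 5000 miles."}]'
results = json.loads(sample)
best = max(results, key=lambda r: r["score"])
```

The agent then reasons over the returned chunks with its own LLM; no Ollama chat model is needed at query time.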
Supported formats
| Format | Extensions | Requires |
|---|---|---|
| PDF | .pdf | — |
| Office | .docx, .xlsx, .pptx | — |
| eBook | .epub | — |
| Images (OCR) | .png, .jpg, .jpeg, .tiff, .bmp, .webp | Tesseract |
| Data | .csv, .tsv | — |
| Text | .md, .txt, .html, .rst | — |
| Code | .py, .js, .ts, .go, .rs, .java and 150+ more via tree-sitter (AST-aware chunking) | — |
Configuration
All settings are configurable via environment variables:
| Variable | Default | Description |
|---|---|---|
| LILBEE_DATA | (platform default) | Data directory path |
| LILBEE_CHAT_MODEL | mistral | Ollama chat model |
| LILBEE_EMBEDDING_MODEL | nomic-embed-text | Embedding model |
| LILBEE_EMBEDDING_DIM | 768 | Embedding dimensions |
| LILBEE_CHUNK_SIZE | 512 | Tokens per chunk |
| LILBEE_CHUNK_OVERLAP | 100 | Overlap tokens between chunks |
| LILBEE_MAX_EMBED_CHARS | 2000 | Max characters per chunk for embedding |
| LILBEE_TOP_K | 10 | Number of retrieval results |
| LILBEE_SYSTEM_PROMPT | (built-in) | Custom system prompt for RAG answers |
CLI also accepts --model / -m, --data-dir / -d, and --version / -V flags.
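To see how LILBEE_CHUNK_SIZE and LILBEE_CHUNK_OVERLAP interact, here is a minimal sliding-window chunker. This is only an illustration of the arithmetic the two settings imply, not lilbee's internal chunking code:

```python
def chunk_tokens(tokens: list[str], size: int = 512, overlap: int = 100) -> list[list[str]]:
    # Sliding window: each chunk starts (size - overlap) tokens after the
    # previous one, so consecutive chunks share `overlap` tokens of context.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

tokens = [f"tok{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
# 1000 tokens with the defaults gives windows starting at 0, 412, and 824;
# the last 100 tokens of one chunk repeat as the first 100 of the next.
```

Larger overlap costs more storage and embedding time but reduces the chance that an answer is split across a chunk boundary.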
How it works
Documents are hashed and synced automatically: new files get ingested, modified files re-ingested, deleted files removed. Kreuzberg handles extraction and chunking across all document formats (PDF, Office, images via OCR, etc.), while tree-sitter provides AST-aware chunking for code. Chunks are embedded via Ollama and stored in LanceDB.

Ollama uses llama.cpp with native Metal support, which is significantly faster than in-process alternatives like ONNX Runtime: CoreML can't accelerate nomic-embed-text's rotary embeddings, so CPU is the only ONNX path on macOS (~170 ms/chunk vs near-instant with Ollama's GPU inference).

Queries embed the question, find the most relevant chunks by vector similarity, and pass them as context to the LLM.
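The query path above can be sketched in a few lines. This is a toy for intuition only: the hash-based "embedding" stands in for nomic-embed-text, and the list stands in for LanceDB:

```python
import hashlib
import math

def embed(text: str, dim: int = 256) -> list[float]:
    # Toy stand-in for an embedding model: deterministic bag-of-words
    # hashing into `dim` buckets, then L2-normalized.
    v = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        v[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# What the index holds, conceptually: (chunk text, vector) pairs.
chunks = [
    "Change the oil every 5000 miles.",
    "Tire pressure should be 35 psi.",
    "Use synthetic oil in cold climates.",
]
index = [(c, embed(c)) for c in chunks]

# Query path: embed the question, rank chunks by similarity, keep top-k,
# then hand the winners to the chat model as grounding context.
query_vec = embed("oil change interval")
top_k = sorted(index, key=lambda cv: cosine(query_vec, cv[1]), reverse=True)[:2]
context = "\n".join(text for text, _ in top_k)
```

Because retrieval is just embedding plus vector math, only the embedding model is needed at query time; the chat model is invoked afterwards with `context` in its prompt.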
Data location
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/lilbee/ |
| Linux | ~/.local/share/lilbee/ |
| Windows | %LOCALAPPDATA%/lilbee/ |
Override with LILBEE_DATA=/path or --data-dir.
License
MIT