A lightweight, completely local, zero-API, tree-based RAG framework
🗺️ NaviDoc
NaviDoc is a lightweight, fully local, zero-API, tree-based RAG framework that navigates document structure. Instead of blindly chopping your files into vector chunks, NaviDoc maps each document into a hierarchical tree of sections and uses a local LLM to navigate that tree to the section that answers your question.
✨ Features
- 🔒 100% Private & Offline: Your documents never leave your machine. Zero cloud APIs, zero telemetry.
- 🌳 Tree-Based Navigation: Mimics how a human reader navigates by following document structure (headers, font sizes) instead of retrieving chunks by vector proximity.
- ⚡ High Precision: Pinpoints specific structural sections, which avoids contaminating the context with unrelated text or overflowing the context window.
- 📄 Multi-Format Support: Supports Markdown, PDF (with font-size analysis), DOCX (with style detection), and PPTX.
- 💾 Index Persistence: Save your indexed tree structures to JSON and reload them instantly.
- 💬 Chat SDK: Keeps conversation history across turns through a simple SDK-style chat interface.
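To make the tree-based idea concrete, here is a minimal sketch of heading-driven navigation over a Markdown string. It is not NaviDoc's implementation: `Node`, `build_tree`, `score`, and `navigate` are hypothetical names, and the keyword-overlap `score` function is only a stand-in for the decision a local LLM would make at each level of the tree.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a document: a heading, its body text, and subsections."""
    title: str
    level: int
    text: str = ""
    children: list = field(default_factory=list)


def build_tree(markdown: str) -> Node:
    """Parse ATX headings (#, ##, ...) into a nested section tree."""
    root = Node(title="ROOT", level=0)
    stack = [root]  # path from the root to the section currently being filled
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            level = len(match.group(1))
            # Pop until the top of the stack is a shallower heading (the parent).
            while stack[-1].level >= level:
                stack.pop()
            node = Node(title=match.group(2).strip(), level=level)
            stack[-1].children.append(node)
            stack.append(node)
        else:
            stack[-1].text += line + "\n"
    return root


def score(question: str, node: Node) -> int:
    """Stand-in for the LLM's choice: count question words found in the title."""
    question_words = set(re.findall(r"\w+", question.lower()))
    title_words = set(re.findall(r"\w+", node.title.lower()))
    return len(question_words & title_words)


def navigate(root: Node, question: str) -> Node:
    """Descend from the root, picking the best-scoring child at every level."""
    node = root
    while node.children:
        best = max(node.children, key=lambda child: score(question, child))
        if score(question, best) == 0:
            break  # no child looks relevant, so stay at the current section
        node = best
    return node


DOC = """
# Install
## Requirements
Python 3.10 or newer.
## Steps
Run the installer.
# Usage
Call the API.
"""

if __name__ == "__main__":
    section = navigate(build_tree(DOC), "What are the install requirements?")
    print(section.title)
    print(section.text.strip())
```

For the sample document, the walk goes from the root to "Install" and then to "Requirements", so only that section's text would be handed to the model as context.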
🚀 Getting Started
1. Prerequisites
NaviDoc requires Ollama to host your local LLM engine.
- Download and install Ollama from ollama.com.
- Pull a small, capable model (we recommend phi3 or llama3): ollama pull phi3
- Ensure the Ollama service is running in the background before running NaviDoc.
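If you want to verify the prerequisites from Python before starting, the sketch below asks Ollama's HTTP API for the list of locally pulled models. It assumes Ollama is listening on its default address, `http://localhost:11434`, and uses the `/api/tags` endpoint. The helper name `list_ollama_models` is hypothetical and is not part of NaviDoc.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the names of locally pulled models, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a response that is not valid JSON.
        return None
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not reachable; start it before running NaviDoc.")
    elif not any(name.startswith("phi3") for name in models):
        print("Ollama is running, but phi3 is not pulled yet: ollama pull phi3")
    else:
        print("Ollama is running and phi3 is available.")
```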
2. Installation
Install NaviDoc via pip:
pip install navidoc
Or using uv:
uv add navidoc
💡 Usage Examples
🔍 One-off Query
from navidoc import NaviDoc
# Initialize (uses the NAVIDOC_MODEL_NAME env var if set, otherwise phi3)
engine = NaviDoc()
# Ingest and structurally index any local document
status = engine.ingest("your_document.pdf")
print(status)
# Query your document offline
response = engine.query("What are the exact system requirements?")
print(response)
💬 Multi-turn Chat (SDK Style)
from navidoc import NaviDoc
engine = NaviDoc()
engine.ingest("manual.docx")
# First turn
print(engine.chat("How do I install the battery?"))
# Second turn (remembers context and history!)
print(engine.chat("Where can I buy a replacement?"))
# Clear history if needed
engine.clear_history()
💾 Save & Fast Load Index
Avoid re-parsing large documents by saving the tree index.
from navidoc import NaviDoc
engine = NaviDoc()
# First time: Parse and Save
engine.ingest("massive_report.pdf")
engine.save_index("storage/indices/massive_report.json")
# Second time: Instant Load in milliseconds
engine.load_index("storage/indices/massive_report.json")
response = engine.query("What is the revenue?")
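The two steps above can be folded into one "load if saved, otherwise build" helper. This is a sketch under stated assumptions: `load_or_build` is a hypothetical function, it relies only on the `ingest`, `save_index`, and `load_index` methods shown in this README, and it creates the index folder itself in case `save_index` does not.

```python
from pathlib import Path


def load_or_build(engine, document_path: str, index_path: str) -> bool:
    """Load a saved index if one exists, else ingest and save.

    Returns True when the saved index was reused, False when it was rebuilt.
    """
    index_file = Path(index_path)
    if index_file.exists():
        engine.load_index(str(index_file))
        return True
    engine.ingest(document_path)
    # Create the target folder first, in case save_index does not do so itself.
    index_file.parent.mkdir(parents=True, exist_ok=True)
    engine.save_index(str(index_file))
    return False


# Usage with NaviDoc (requires the package and a running Ollama service):
#   from navidoc import NaviDoc
#   engine = NaviDoc()
#   load_or_build(engine, "massive_report.pdf", "storage/indices/massive_report.json")
#   print(engine.query("What is the revenue?"))
```

The helper does not detect a stale index. If the source document changes, delete the JSON file so the next call re-parses it.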
⚙️ Configuration
Environment Variables
You can configure NaviDoc without changing your code by setting environment variables:
NAVIDOC_MODEL_NAME: Sets the default Ollama model to use (default: phi3).
How to change it:
- Windows (PowerShell): $env:NAVIDOC_MODEL_NAME="llama3"
- Linux/Mac: export NAVIDOC_MODEL_NAME="llama3"
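The variable can also be set from Python for the current process. The sketch below mirrors the documented behavior (use NAVIDOC_MODEL_NAME if set, otherwise phi3). `configured_model` is a hypothetical helper for illustration, not NaviDoc's own code, and the sketch assumes the variable is read when the engine is created, so set it before constructing `NaviDoc()`.

```python
import os

DEFAULT_MODEL = "phi3"  # default stated in this README


def configured_model() -> str:
    """Return the model name the documented lookup would select."""
    return os.environ.get("NAVIDOC_MODEL_NAME", DEFAULT_MODEL)


if __name__ == "__main__":
    # Override the model for this process only, before creating the engine.
    os.environ["NAVIDOC_MODEL_NAME"] = "llama3"
    print(configured_model())
```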
📜 License
NaviDoc is open-source software distributed completely free under the MIT License.