A lightweight, completely local, zero-API, tree-based RAG framework
🗺️ NaviDoc: The Ultimate Local RAG Framework
NaviDoc is a lightweight, completely local, zero-API, tree-based RAG (Retrieval-Augmented Generation) framework designed to navigate document structures with human-like intelligence.
Stop blindly chopping your documents into arbitrary flat chunks. NaviDoc maps your files into a logical structural tree hierarchy and uses local LLMs or ultra-fast static embeddings to precisely steer and navigate to answers.
🚀 Why NaviDoc is Different (The "Crazy" Features)
🧠 1. Intelligent "Vectorless" Tree Navigation
- 🌳 True Hierarchy: NaviDoc mimics human reading. It follows headers, font sizes, and sections to build a mental map of your document.
- 🛡️ Dead-End Protection: Our smart navigation algorithm includes LLM verification. If it navigates to a section that turns out to be irrelevant, it falls back to parent content rather than giving a hallucinated answer!
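The two ideas above (descend the heading tree, then back off to the parent when the chosen section does not hold up) can be sketched in a few lines. This is an illustrative toy, not NaviDoc's actual implementation: `Node`, `overlap`, and `navigate` are hypothetical names, and a simple word-overlap score stands in for the LLM or embedding scoring and for the LLM verification step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a document: a heading, its own text, and its subsections."""
    title: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def overlap(query: str, node: Node) -> int:
    """Toy relevance score: how many query words appear in the title or text."""
    words = set(query.lower().split())
    haystack = set(f"{node.title} {node.text}".lower().split())
    return len(words & haystack)


def navigate(root: Node, query: str, is_relevant=None) -> Node:
    """Follow the best-scoring child at each level, then back off on a dead end."""
    # Assumption: the verifier is a plain callable; a real system might ask an LLM.
    verify = is_relevant or (lambda q, n: overlap(q, n) > 0)
    path = [root]
    while path[-1].children:
        path.append(max(path[-1].children, key=lambda child: overlap(query, child)))
    # Dead-end protection: climb back up until a node passes verification.
    # The root is returned as a last resort.
    while len(path) > 1 and not verify(query, path[-1]):
        path.pop()
    return path[-1]


doc = Node("Guide", "overview of the product", [
    Node("Installation", "install with pip", [
        Node("Requirements", "requires python and 4 gb of ram"),
    ]),
    Node("Licensing", "released under an open license"),
])

print(navigate(doc, "how much ram to install").title)
print(navigate(doc, "install warranty").title)
```

The second query descends into "Requirements", fails verification there, and returns the parent "Installation" section instead of answering from an irrelevant leaf.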
⚡ 2. Blazing Fast Hybrid Mode
- 🚀 Model2Vec Integration: Use the cutting-edge `potion-base-32M` model for lightning-fast tree navigation instead of heavy LLM calls. Up to 500x faster on CPU!
- 📉 History Limit Control: Configurable chat history limits (`max_history`) to prevent context blowouts and keep your local LLM fast and responsive.
- 🗜️ Context Compression: Auto-summarizes large sections using pure Python (no LLM load) to prevent context overflow!
- 🤖 Qwen 2.5 (1.5B) by Default: Uses the ultra-smart and efficient `qwen2.5:1.5b` model as the default LLM for fast responses on any hardware!
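To make "context compression in pure Python" concrete, here is a minimal extractive sketch that needs no model at all. It is an assumption about how such a step could work, not NaviDoc's actual algorithm: the hypothetical `compress` function scores each sentence by the average frequency of its words across the section and keeps the top few in their original order.

```python
import re
from collections import Counter


def compress(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if len(sentences) <= max_sentences:
        return text.strip()

    # Word frequencies over the whole section act as a cheap importance signal.
    freq = Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z0-9]+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


section = (
    "The cache stores parsed trees. The cache is keyed by file hash. "
    "Unrelated note about lunch. The cache is cleared on upgrade."
)
print(compress(section, max_sentences=2))
```

Frequency-based extraction is crude, but it is fast, deterministic, and keeps the original wording, which is what matters when the goal is only to fit a section into a small context window.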
📄 3. Enterprise Document Support
- Multi-Format Mastery: Native support for Markdown, PDF (with advanced font-size analysis), DOCX (with style detection), and PPTX.
- 🖼️ OCR Weaponry: Ingest images (`.png`, `.jpg`, `.jpeg`) via seamless GLM-OCR integration!
- 🗄️ Auto-Scaling SQLite Tree: Support for massive files! If a file is larger than 10MB, NaviDoc automatically switches to a self-referencing SQLite tree structure to save memory!
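A "self-referencing" tree in SQLite usually means a single table whose rows point at their parent row. The sketch below shows that pattern with the standard-library `sqlite3` module. The schema, the `nodes` table name, the helper functions, and the exact byte value used for the 10MB cutoff are all assumptions for illustration, not NaviDoc's real storage layout.

```python
import os
import sqlite3

SIZE_THRESHOLD = 10 * 1024 * 1024  # assumed byte value for the 10MB cutoff


def needs_sqlite(path: str) -> bool:
    """True when the file on disk is larger than the threshold."""
    return os.path.getsize(path) > SIZE_THRESHOLD


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE nodes (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES nodes(id),  -- NULL for the root
        title     TEXT NOT NULL,
        content   TEXT NOT NULL DEFAULT ''
    )
""")
conn.executemany(
    "INSERT INTO nodes (id, parent_id, title) VALUES (?, ?, ?)",
    [(1, None, "Guide"), (2, 1, "Installation"), (3, 2, "Requirements"), (4, 1, "Licensing")],
)


def children(node_id: int) -> list:
    """Titles of the direct subsections of a node; only these rows are loaded."""
    rows = conn.execute(
        "SELECT title FROM nodes WHERE parent_id = ? ORDER BY id", (node_id,)
    )
    return [title for (title,) in rows]


def breadcrumb(node_id: int) -> list:
    """Titles from the root down to node_id, collected with a recursive CTE."""
    rows = conn.execute("""
        WITH RECURSIVE ancestors(id, parent_id, title, depth) AS (
            SELECT id, parent_id, title, 0 FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id, n.parent_id, n.title, a.depth + 1
            FROM nodes n JOIN ancestors a ON n.id = a.parent_id
        )
        SELECT title FROM ancestors ORDER BY depth DESC
    """, (node_id,))
    return [title for (title,) in rows]


print(children(1))
print(breadcrumb(3))
```

Because navigation only ever asks for one node's children at a time, the full tree never has to sit in memory, which is the point of switching storage for large files.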
🔒 4. 100% Privacy & Zero Cost
- 🔒 100% Private: Your documents never leave your machine. Zero cloud APIs, zero telemetry, zero data leaks.
- 💬 Persistent Chat Memory: Backed by a localized SQLite database to maintain conversation memory across sessions (SDK-style).
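Session-scoped chat memory on SQLite can be sketched as one table keyed by a session ID. The `ChatMemory` class, its schema, and its method names below are hypothetical and only illustrate the idea; `max_history` mirrors the history limit option named in the feature list.

```python
import sqlite3


class ChatMemory:
    """Hypothetical helper that stores chat turns per session in SQLite."""

    def __init__(self, db_path: str = ":memory:"):
        # Pass a file path instead of ":memory:" to keep history across runs.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def add(self, session_id: str, role: str, content: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
                (session_id, role, content),
            )

    def recent(self, session_id: str, max_history: int = 4) -> list:
        """Return the last max_history turns of a session, oldest first."""
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? "
            "ORDER BY id DESC LIMIT ?",
            (session_id, max_history),
        ).fetchall()
        return rows[::-1]


memory = ChatMemory()
memory.add("project_alpha_chat", "user", "Who is the project manager?")
memory.add("project_alpha_chat", "assistant", "The plan names one manager.")
memory.add("project_alpha_chat", "user", "What are their primary responsibilities?")
print(memory.recent("project_alpha_chat", max_history=2))
```

Capping the query with `LIMIT` keeps the prompt size bounded no matter how long the stored conversation grows.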
🌐 5. Visual Web Interface (NEW!)
- 🌐 Local Web UI: Launch a beautiful, premium web interface to chat with your documents using `navidoc ui` (accessible at `http://127.0.0.1:7860` by default)! Powered by Gradio.
- ⚙️ Configurable Port: Change the port by setting the `NAVIDOC_PORT` environment variable (e.g., `NAVIDOC_PORT=8080 navidoc ui`).
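Reading the port from the environment with a safe default might look like the sketch below. `resolve_port` is a hypothetical helper, not part of NaviDoc's API. The resulting value is the kind of thing a Gradio app accepts through `launch(server_name="127.0.0.1", server_port=port)`.

```python
import os

DEFAULT_PORT = 7860  # the default port stated above


def resolve_port(env=None) -> int:
    """Return the port from NAVIDOC_PORT, or the default when it is unset."""
    env = os.environ if env is None else env
    raw = env.get("NAVIDOC_PORT", "").strip()
    if not raw:
        return DEFAULT_PORT
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"NAVIDOC_PORT out of range: {port}")
    return port


print(resolve_port({}))
print(resolve_port({"NAVIDOC_PORT": "8080"}))
```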
🛠️ Installation
NaviDoc comes with all batteries included! Core dependencies like Model2Vec, Sentence-Transformers, and GLM-OCR are installed automatically!
Using pip:
```bash
pip install navidoc
```
Using uv (Highly Recommended):
```bash
uv add navidoc
```
⌨️ Master the CLI
NaviDoc comes with a powerful CLI that acts as a bridge between you and your local AI environment:
| Command | Description |
|---|---|
| `navidoc install-ollama` | Auto-downloads and installs Ollama for your OS (Windows/Linux). |
| `navidoc doctor` | Checks the status of all dependencies (Ollama, Model2Vec, OCR). |
| `navidoc run <model>` | Directly run an Ollama model. |
| `navidoc pull <model>` | Pull a model from the Ollama library. |
| `navidoc list` | List all installed Ollama models. |
| `navidoc ollama <args>` | Forward any command directly to the Ollama service. |
If using uv, you can run any command directly without installing globally:
```bash
uv run navidoc doctor
```
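A dependency check in the spirit of `navidoc doctor` can be approximated with the standard library alone. The `check_environment` function and its report format are hypothetical. The sketch only tests whether a Python module is importable and whether an executable is on `PATH`, which may be less than the real command verifies.

```python
import importlib.util
import shutil


def check_environment(modules=("model2vec",), executables=("ollama",)) -> dict:
    """Map each dependency to True (found) or False (missing)."""
    report = {}
    for name in modules:
        # find_spec locates a top-level module without importing it.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    for name in executables:
        report[f"executable:{name}"] = shutil.which(name) is not None
    return report


for dependency, found in check_environment().items():
    print(f"{'OK     ' if found else 'MISSING'} {dependency}")
```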
💻 Quick Start (Code Examples)
1. Basic Ingestion & Query
```python
from navidoc import NaviDoc

# Initialize with the default LLM (qwen2.5:1.5b, per the feature list above)
engine = NaviDoc()

# For ultra-fast navigation using Model2Vec embeddings
# engine = NaviDoc(use_embeddings=True)

# Ingest a document (auto-detects format)
engine.ingest("enterprise_guide.pdf")

# Query your document offline
response = engine.query("What are the exact system requirements?")
print(response)
```
2. Multi-Turn Persistent Chat (SDK Style)
NaviDoc remembers conversations across sessions using a local SQLite database!
```python
from navidoc import NaviDoc

# Initialize with a specific session ID
engine = NaviDoc(session_id="project_alpha_chat")
engine.ingest("project_plan.docx")

# First turn
print(engine.chat("Who is the project manager?"))

# Second turn (maintains history)
print(engine.chat("What are their primary responsibilities?"))
```
🤝 Contributing & Open Source
We are building the future of local, private document understanding and we want your help! Whether you want to add new parsers, optimize the tree navigation, or improve the docs, all contributions are welcome.
Feel free to open issues or submit PRs on our GitHub Repository! 🚀