🗺️ NaviDoc

NaviDoc is a lightweight, fully local, tree-based RAG framework that needs no cloud APIs. Instead of splitting your files into vector chunks without regard to structure, NaviDoc maps each document into a tree that mirrors its logical hierarchy and uses a local LLM to navigate that tree to the section that answers your question.


✨ Features

  • 🔒 100% Private & Offline: Your documents never leave your machine. Zero cloud APIs, zero telemetry.
  • 🌳 Tree-Based Navigation: Follows the document's own structure (headers, font sizes), much as a human reader would, instead of retrieving chunks by vector proximity.
  • ⚡ High Precision: Pinpoints specific structural sections, so unrelated text does not contaminate the context and the prompt does not outgrow the model's context window.
  • 📄 Multi-Format Support: Supports Markdown, PDF (with font-size analysis), DOCX (with style detection), and PPTX.
  • 💾 Index Persistence: Save your indexed tree structures to JSON and reload them instantly.
  • 💬 Chat SDK: Hold multi-turn conversations with your documents; the engine keeps the conversation history between calls.
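
To make the tree-based idea concrete, here is a minimal sketch of how Markdown headings could be nested into a structural tree. It is an illustration only, not NaviDoc's actual parser: the `Section` type and the `build_tree` and `outline` helpers are hypothetical names, and the sketch ignores edge cases such as `#` lines inside code blocks.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of the document tree (hypothetical type, not NaviDoc's)."""
    title: str
    level: int
    body: list[str] = field(default_factory=list)
    children: list["Section"] = field(default_factory=list)


HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def build_tree(markdown: str) -> Section:
    """Nest sections by heading depth using a stack of open ancestors."""
    root = Section(title="(document)", level=0)
    stack = [root]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match is None:
            # Plain text belongs to the most recently opened section.
            if line.strip():
                stack[-1].body.append(line.strip())
            continue
        level = len(match.group(1))
        # Close every open section at the same or a deeper level.
        while stack[-1].level >= level:
            stack.pop()
        node = Section(title=match.group(2).strip(), level=level)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def outline(node: Section, depth: int = 0) -> list[str]:
    """Return indented section titles, i.e. the map a navigator would see."""
    lines = [] if node.level == 0 else ["  " * (depth - 1) + node.title]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

A navigator could show the LLM only the `outline()` of the tree, let it pick a branch, and then pass just that section's `body` as context, which is the general pattern the features above describe.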

🚀 Getting Started

1. Prerequisites

NaviDoc requires Ollama to host your local LLM engine.

  1. Download and install Ollama from ollama.com.
  2. Pull a small, capable model (we recommend phi3 or llama3):
    ollama pull phi3
    
  3. Ensure the Ollama service is running in the background before running NaviDoc.
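
If you want to confirm from Python that the Ollama service is reachable before starting NaviDoc, a small check like the following may help. It is a sketch that assumes Ollama is listening on its default address, `http://localhost:11434`; the `ollama_is_running` helper is a hypothetical name and is not part of NaviDoc.

```python
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and urllib's URLError.
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama is reachable.")
    else:
        print("Ollama is not reachable; start the service first.")
```

Pass a different `base_url` if your Ollama instance is bound to another host or port.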

2. Installation

Install NaviDoc via pip:

pip install navidoc

Or using uv:

uv add navidoc

💡 Usage Examples

🔍 One-off Query

from navidoc import NaviDoc

# Initialize (uses the NAVIDOC_MODEL_NAME env var if set, otherwise phi3)
engine = NaviDoc()

# Ingest and structurally index any local document
status = engine.ingest("your_document.pdf")
print(status)

# Query your document offline
response = engine.query("What are the exact system requirements?")
print(response)

💬 Multi-turn Chat (SDK Style)

from navidoc import NaviDoc

engine = NaviDoc()
engine.ingest("manual.docx")

# First turn
print(engine.chat("How do I install the battery?"))

# Second turn (remembers context and history!)
print(engine.chat("Where can I buy a replacement?"))

# Clear history if needed
engine.clear_history()

💾 Save & Fast Load Index

Avoid re-parsing large documents by saving the tree index.

from navidoc import NaviDoc

engine = NaviDoc()

# First time: Parse and Save
engine.ingest("massive_report.pdf")
engine.save_index("storage/indices/massive_report.json")

# Later runs: load the saved index instead of re-parsing the document
engine.load_index("storage/indices/massive_report.json")
response = engine.query("What is the revenue?")
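
In practice the two steps usually live in one code path: load the index if it exists, otherwise parse and save it. The sketch below wraps that decision in a hypothetical `load_or_build` helper. It relies only on the `ingest()`, `save_index()`, and `load_index()` methods shown above, and it creates the target folder itself because whether `save_index()` does so is an open assumption.

```python
from pathlib import Path


def load_or_build(engine, document: str, index_path: str) -> str:
    """Load a saved index if present; otherwise ingest the document and save it.

    `engine` is any object exposing ingest(), save_index(), and load_index(),
    such as a NaviDoc instance. Returns "loaded" or "built" so callers can
    log which path ran.
    """
    index_file = Path(index_path)
    if index_file.exists():
        engine.load_index(str(index_file))
        return "loaded"
    engine.ingest(document)
    # Create the folder first; it is not documented whether save_index() does this.
    index_file.parent.mkdir(parents=True, exist_ok=True)
    engine.save_index(str(index_file))
    return "built"
```

Typical use would be `load_or_build(NaviDoc(), "massive_report.pdf", "storage/indices/massive_report.json")`. The helper does not detect changes to the source document, so delete the JSON file when the document is updated.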

⚙️ Configuration

Environment Variables

You can configure NaviDoc without changing your code by setting environment variables:

  • NAVIDOC_MODEL_NAME: Set the default Ollama model to use (Default: phi3).

How to change it:

  • Windows (PowerShell): $env:NAVIDOC_MODEL_NAME="llama3"
  • Linux/Mac: export NAVIDOC_MODEL_NAME="llama3"
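
The documented behavior (use NAVIDOC_MODEL_NAME when set, otherwise phi3) can be pictured with the short sketch below. It mirrors the description above and is not NaviDoc's own code; `resolve_model_name` is a hypothetical helper, and treating an empty value as "unset" is an assumption.

```python
import os

DEFAULT_MODEL = "phi3"  # the documented default


def resolve_model_name(environ=os.environ) -> str:
    """Return NAVIDOC_MODEL_NAME if set and non-empty, else the default.

    Assumption: a blank value falls back to the default.
    """
    return environ.get("NAVIDOC_MODEL_NAME", "").strip() or DEFAULT_MODEL


if __name__ == "__main__":
    print(f"Model that would be used: {resolve_model_name()}")
```

Running this in the same shell where you exported the variable is a quick way to confirm the value is visible to Python processes.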

📜 License

NaviDoc is open-source software distributed under the MIT License.
