A production-grade, local-first Agentic RAG library using structural document navigation.

ApexRAG

Production-grade, local-first Agentic RAG Library. Replaces vector similarity search with structural, agentic navigation of documents.

🧠 The Core Idea

Traditional RAG embeds text into vectors and retrieves the "closest" chunks. This can produce retrieval hallucinations: the retriever returns content that is semantically similar but wrong, because similarity search has no notion of document structure.

ApexRAG takes a fundamentally different approach:

1. Parse the document into a structural tree (based on headings) and extract page numbers.
2. Synthesize a 30-word Semantic Map for every node using a local LLM.
3. Navigate the tree with an LLM agent that reads summaries and decides which branch to enter, trying multiple candidates if necessary.
4. Verify the exact leaf node answers the query via a strict secondary LLM check (99.999% accuracy).
5. Return the exact leaf node content, not a blended, hallucinated average.

Query: "What were Q3 revenues?"
         │
    Root (Annual Report)
    ├── Chapter 1: Executive Summary  ← LLM: "Not here"
    └── Chapter 2: Revenue Analysis   ← LLM: "Enter this"
        ├── Q1 Revenue                ← LLM: "Not Q3"
        ├── Q2 Revenue                ← LLM: "Not Q3"
        └── Q3 Revenue                ← LLM: "This is it!" → Return content

📁 Project Structure

apex_rag/
├── src/
│   ├── __init__.py       # Public API exports
│   ├── api.py            # FastAPI App & UI dashboard
│   ├── client.py         # Thread-safe user-facing ApexIndex class
│   ├── ingestion.py      # Document parsing & tree synthesis
│   ├── navigation.py     # Recursive LLM navigation agent
│   ├── storage.py        # SQLAlchemy async ORM & PageIndexEntry
│   └── utils.py          # ReasoningTrace, retry decorator, helpers
├── tests/
│   ├── test_tree.py      # Parser & storage unit tests
│   └── test_search.py    # Navigation agent unit tests (no Ollama needed)
├── examples/
│   └── basic_usage.py    # End-to-end demo
├── pyproject.toml
└── docker-compose.yml

⚡ Quick Start

1. Install

# Clone and set up
cd ApexRAG
pip install -e ".[dev]"

2. Start Ollama

ollama serve
ollama pull llama3.1   # or phi3, mistral, etc.

3. Ingest & Query

import asyncio
from src.client import ApexIndex

async def main():
    async with await ApexIndex.create(
        db_url="sqlite+aiosqlite:///apex.db",
        model="llama3.1",
    ) as index:
        # Ingest a PDF
        doc_id = await index.ingest("path/to/your/report.pdf")

        # Query it
        result = await index.query(
            "What are the Q3 revenue figures?",
            doc_id,
        )

        if result:
            print(result.content)
            print(f"Found at path: {result.path}")
            print(f"Navigation trace: {result.trace}")

asyncio.run(main())

4. Start the FastAPI Server & Visual Index Dashboard

uvicorn src.api:app --reload

Open your browser to:

Dashboard: http://localhost:8000
API Docs: http://localhost:8000/docs

From the dashboard, you can click on an ingested document to view its full structural tree and its book-style alphabetical page index!

🏗️ Architecture Deep Dive

Ingestion Engine (`ingestion.py`)

Step	Description
Convert	`markitdown` or `docling` converts PDF/DOCX → Markdown
Parse	Regex walks ATX headings (`#`, `##`, `###`) to build `ParsedSection` tree
Persist	Nodes written to DB with LTree-style path (`1.2.3`)
Synthesize	Ollama generates 30-word summaries in parallel (bounded by semaphore)
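The Parse and Persist steps above can be sketched as follows. This is a minimal illustration, not the library's parser: the `Section` class is a hypothetical stand-in for `ParsedSection`, and `parse_markdown` is an assumed name. It walks ATX headings with a regex, keeps a stack of open sections, and assigns each node an LTree-style path from its position among its siblings.

```python
import re
from dataclasses import dataclass, field

# ATX heading: 1-6 '#' characters, whitespace, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.+?)\s*$")


@dataclass
class Section:
    """Hypothetical stand-in for the library's ParsedSection."""
    title: str
    level: int
    path: str = ""
    content: str = ""
    children: list["Section"] = field(default_factory=list)


def parse_markdown(markdown: str, doc_title: str = "Document") -> Section:
    """Build a heading tree; body lines attach to the nearest heading above."""
    root = Section(title=doc_title, level=0, path="1")
    stack = [root]  # stack[-1] is the section currently being filled
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match is None:
            stack[-1].content += line + "\n"
            continue
        level = len(match.group(1))
        # Pop until the top of the stack is a shallower heading (the parent).
        while stack[-1].level >= level:
            stack.pop()
        parent = stack[-1]
        node = Section(title=match.group(2), level=level)
        parent.children.append(node)
        # LTree-style path: parent path plus 1-based sibling position.
        node.path = f"{parent.path}.{len(parent.children)}"
        stack.append(node)
    return root
```

A sketch this small treats any line starting with `#` as a heading, so `#` comments inside fenced code blocks in the Markdown would be misread; a real parser would need to skip fenced regions.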

Storage Layer (`storage.py`)

DocumentNode table:
  id          BIGINT PRIMARY KEY
  doc_id      VARCHAR(255)       -- logical document identifier
  parent_id   BIGINT FK (self)   -- NULL for root nodes
  path        VARCHAR(512)       -- "1.2.3" LTree-style
  title       VARCHAR(512)       -- section heading
  summary     TEXT               -- 30-word Semantic Map
  content     TEXT               -- leaf content (NULL for intermediate)
  metadata    TEXT (JSON)        -- page numbers, char count, source file
  depth       INTEGER            -- nesting level (0 = root)
  position    INTEGER            -- sibling order
  created_at  TIMESTAMP

Supports both sqlite+aiosqlite:// (local) and postgresql+asyncpg:// (production).
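To show why an LTree-style `path` column is useful, here is a small sketch using the standard-library `sqlite3` module rather than the library's async SQLAlchemy models. The table name `document_node`, the reduced column set, and the helper names are assumptions made for the example. Direct children are found through `parent_id`, while a whole subtree is found with a prefix match on `path`.

```python
import sqlite3

# Reduced version of the DocumentNode layout (assumed table name).
SCHEMA = """
CREATE TABLE document_node (
    id        INTEGER PRIMARY KEY,
    doc_id    TEXT NOT NULL,
    parent_id INTEGER REFERENCES document_node(id),  -- NULL for root nodes
    path      TEXT NOT NULL,                         -- LTree-style, e.g. 1.2.3
    title     TEXT NOT NULL,
    content   TEXT,                                  -- NULL for intermediate nodes
    position  INTEGER NOT NULL                       -- sibling order
)
"""

ROWS = [
    (1, "report", None, "1", "Annual Report", None, 0),
    (2, "report", 1, "1.1", "Executive Summary", "Highlights of the year.", 0),
    (3, "report", 1, "1.2", "Revenue Analysis", None, 1),
    (4, "report", 3, "1.2.1", "Q1 Revenue", "Q1 figures.", 0),
    (5, "report", 3, "1.2.2", "Q2 Revenue", "Q2 figures.", 1),
    (6, "report", 3, "1.2.3", "Q3 Revenue", "Q3 revenue was $165M.", 2),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the example tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO document_node VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS
    )
    return conn


def child_titles(conn: sqlite3.Connection, parent_id: int) -> list[str]:
    """Direct children of a node, in sibling order."""
    rows = conn.execute(
        "SELECT title FROM document_node WHERE parent_id = ? ORDER BY position",
        (parent_id,),
    )
    return [title for (title,) in rows]


def descendant_ids(conn: sqlite3.Connection, doc_id: str, path: str) -> list[int]:
    """All nodes below `path`; the trailing dot keeps 1.2 from matching 1.20."""
    rows = conn.execute(
        "SELECT id FROM document_node WHERE doc_id = ? AND path LIKE ? ORDER BY id",
        (doc_id, path + ".%"),
    )
    return [node_id for (node_id,) in rows]
```

Ordering siblings by `position` instead of `path` avoids the string-sort problem where `1.10` would sort before `1.2`.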

Navigation Agent (`navigation.py`)

find(query, doc_id)
  └── _navigate(current_node)
        ├── [Leaf?] → return content immediately
        ├── fetch children
        ├── _ask_llm(query, child_summaries)
        │     └── "Which child ID contains the answer?"
        ├── [ID returned] → recurse into chosen child
        │     └── [child returns None] → try siblings
        └── [NONE returned] → backtrack to parent

LLM response parsing is robust, with a 4-tier fallback:

1. Strict json.loads()
2. Regex extraction from prose-wrapped JSON
3. Explicit "NONE" keyword detection
4. Heuristic: scan for any valid child ID number in the response
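The fallback tiers above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `navigation.py`: the function name `parse_choice` and the JSON key `node_id` are hypothetical, since the actual response schema is not documented here.

```python
import json
import re
from typing import Optional


def _id_from_json(blob: str, valid_ids: set[int]) -> Optional[int]:
    """Return a valid child ID from a JSON object, or None if there is none."""
    try:
        payload = json.loads(blob)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    value = payload.get("node_id")  # assumed key name
    if isinstance(value, int) and not isinstance(value, bool) and value in valid_ids:
        return value
    return None


def parse_choice(response: str, valid_ids: set[int]) -> Optional[int]:
    """Return the chosen child ID, or None if the model declined or nothing parses."""
    text = response.strip()

    # Tier 1: the whole response is strict JSON.
    choice = _id_from_json(text, valid_ids)
    if choice is not None:
        return choice

    # Tier 2: JSON wrapped in prose; try the first {...} span.
    embedded = re.search(r"\{.*?\}", text, re.DOTALL)
    if embedded:
        choice = _id_from_json(embedded.group(0), valid_ids)
        if choice is not None:
            return choice

    # Tier 3: an explicit NONE keyword means "no child matches".
    if re.search(r"\bNONE\b", text):
        return None

    # Tier 4: heuristic scan for any number that is a valid child ID.
    for token in re.findall(r"\d+", text):
        if int(token) in valid_ids:
            return int(token)
    return None
```

In this sketch `None` covers both "the model said NONE" and "nothing could be parsed"; a caller that needs to tell those apart would have to return a richer result.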

High-Accuracy (99.999%) Verification: At the leaf level, a second LLM prompt strictly verifies whether the leaf content answers the query. If the check fails, the agent backtracks and explores the fallback candidates (second-best choices) further up the tree.
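The navigate, verify, and backtrack loop described in this section can be sketched without an LLM by passing the two decisions in as plain functions. Everything here is an assumption for illustration: `Node`, `navigate`, `keyword_rank`, and `keyword_verify` are hypothetical names, and the keyword-based stubs merely stand in for the two LLM prompts.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Node:
    node_id: int
    summary: str
    content: Optional[str] = None  # only leaves carry content
    children: list["Node"] = field(default_factory=list)


# rank(query, children) -> children ordered best-first
Ranker = Callable[[str, list[Node]], list[Node]]
# verify(query, content) -> True when the leaf content answers the query
Verifier = Callable[[str, str], bool]


def navigate(node: Node, query: str, rank: Ranker, verify: Verifier) -> Optional[Node]:
    """Depth-first search guided by `rank`, accepting only verified leaves."""
    if not node.children:
        if node.content is not None and verify(query, node.content):
            return node
        return None  # leaf rejected: the caller tries its next candidate
    for child in rank(query, node.children):
        found = navigate(child, query, rank, verify)
        if found is not None:
            return found
    return None  # every candidate failed: backtrack to the parent


def keyword_rank(query: str, children: list[Node]) -> list[Node]:
    """Stub for the navigation prompt: order children by word overlap."""
    words = set(query.lower().split())
    return sorted(
        children,
        key=lambda n: len(words & set(n.summary.lower().split())),
        reverse=True,
    )


def keyword_verify(query: str, content: str) -> bool:
    """Stub for the verification prompt: all query words must appear."""
    lowered = content.lower()
    return all(word in lowered for word in query.lower().split())


root = Node(1, "Annual report", children=[
    Node(2, "Overview mentioning Q3 revenue highlights", "Highlights of the year."),
    Node(3, "Quarterly revenue breakdown", children=[
        Node(4, "Q1 revenue figures", "Q1 figures."),
        Node(5, "Q2 revenue figures", "Q2 figures."),
        Node(6, "Q3 revenue figures", "Q3 revenue was $165M."),
    ]),
])
```

With this tree, `navigate(root, "Q3 revenue", keyword_rank, keyword_verify)` first enters node 2 because its summary looks most relevant, rejects it at verification, then backtracks into node 3 and returns node 6.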

Reasoning Trace (`utils.py`)

Every navigation decision is printed with color-coded indicators:

━━━ ApexRAG Navigation Start ━━━
Query : What are the Q3 revenue figures?
Root  : node_id=1

  ↳ ENTER node=1 path=1
    Covers the full annual financial report for 2024…
    ⟳ EXPLORE node=1 → evaluating 2 child summaries
    ✔ AGENT → node=3  reason: Revenue Analysis contains quarterly breakdown
    ↳ ENTER node=3 path=1.2
      ⟳ EXPLORE node=3 → evaluating 4 child summaries
      ✔ AGENT → node=6  reason: Q3 Revenue section is exactly what's needed
      ★ LEAF REACHED node=6
        preview: Q3 revenue was $165M. Growth slowed slightly…

━━━ Navigation Complete ━━━  result=SUCCESS  elapsed=3.41s
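A trace like the one above needs little more than an indented, optionally colored line logger. The sketch below is a hypothetical stand-in for `ReasoningTrace`, whose real interface is not shown in this document; the class name `Trace` and its `log` method are assumptions, and the colors are standard ANSI escape codes.

```python
import time
from dataclasses import dataclass, field

# Standard ANSI escape sequences for terminal colors.
GREEN, CYAN, RESET = "\033[32m", "\033[36m", "\033[0m"


@dataclass
class Trace:
    """Hypothetical stand-in for the library's ReasoningTrace."""
    color: bool = True
    lines: list[str] = field(default_factory=list)
    started: float = field(default_factory=time.perf_counter)

    def log(self, depth: int, symbol: str, message: str, tint: str = "") -> None:
        """Record and print one decision, indented two spaces per tree level."""
        text = f"{symbol} {message}"
        if self.color and tint:
            text = f"{tint}{text}{RESET}"
        line = "  " * depth + text
        self.lines.append(line)
        print(line)

    def elapsed(self) -> float:
        """Seconds since the trace was created."""
        return time.perf_counter() - self.started
```

Keeping the lines in a list as well as printing them means the same trace can be attached to a query result, as the `result.trace` field in the Quick Start suggests.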

🧪 Testing

# Run all tests (no Ollama required)
pytest

# With coverage
pytest --cov=src --cov-report=term-missing

# Specific test file
pytest tests/test_search.py -v

Tests use an in-memory SQLite database and mock LLM responses — zero external dependencies.
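As an illustration of testing against a mocked model, the sketch below uses `unittest.mock.AsyncMock` from the standard library. The `pick_child` function is a hypothetical navigation step written for this example, not a function from `navigation.py`, and the prompt format is an assumption.

```python
import asyncio
from unittest.mock import AsyncMock


async def pick_child(llm, query: str, summaries: dict[int, str]) -> int:
    """Hypothetical navigation step: ask the model which child ID to enter."""
    options = "\n".join(f"{node_id}: {text}" for node_id, text in summaries.items())
    reply = await llm(f"Query: {query}\n{options}\nAnswer with one ID.")
    return int(reply)


def test_pick_child_uses_model_reply() -> None:
    # The mock replaces the Ollama call, so no server is needed.
    llm = AsyncMock(return_value="6")
    chosen = asyncio.run(
        pick_child(llm, "What were Q3 revenues?", {4: "Q1 Revenue", 6: "Q3 Revenue"})
    )
    assert chosen == 6
    llm.assert_awaited_once()
```

Because the model is injected as a parameter, the same function can run against a real client in production and a canned reply in tests.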

🐳 Production Deployment

# Copy and edit environment
cp .env.example .env

# Start everything (Ollama + PostgreSQL + API)
docker-compose up -d

# Pull the model inside the Ollama container
docker exec apex_ollama ollama pull llama3.1

Environment variables:

Variable	Default	Description
`APEX_DB_URL`	`sqlite+aiosqlite:///apex.db`	SQLAlchemy async DB URL
`APEX_OLLAMA_HOST`	`http://localhost:11434`	Ollama server URL
`APEX_MODEL`	`llama3.1`	Ollama model for navigation
`APEX_LOG_LEVEL`	`INFO`	Logging verbosity
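The variables and defaults in the table above could be read as in the following sketch. The `Settings` class and `load_settings` function are hypothetical names for this example; how the library itself reads its environment is not shown in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    db_url: str
    ollama_host: str
    model: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read APEX_* variables, falling back to the documented defaults."""
    return Settings(
        db_url=env.get("APEX_DB_URL", "sqlite+aiosqlite:///apex.db"),
        ollama_host=env.get("APEX_OLLAMA_HOST", "http://localhost:11434"),
        model=env.get("APEX_MODEL", "llama3.1"),
        log_level=env.get("APEX_LOG_LEVEL", "INFO"),
    )
```

Passing the mapping as a parameter keeps the function easy to test: `load_settings({"APEX_MODEL": "phi3"})` overrides one value without touching the real environment.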

🔧 Configuration Reference

await ApexIndex.create(
    db_url="postgresql+asyncpg://user:pass@host/db",  # Production DB
    ollama_host="http://localhost:11434",
    model="llama3.1",              # Navigation model
    summariser_model="phi3",       # Cheaper model for ingestion summaries
    max_concurrent_summaries=8,    # Parallelism (tune to your GPU VRAM)
    parser_backend="markitdown",   # "markitdown" | "docling" | "plaintext"
    trace_enabled=True,            # Color-coded console output
    db_echo=False,                 # SQL query logging
)
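The `max_concurrent_summaries` setting corresponds to the semaphore-bounded parallelism mentioned under the Ingestion Engine. A minimal sketch of that pattern with `asyncio` is shown below; `summarise_all` and the `first_words` stub are hypothetical names, and the stub simply truncates text where the library would call Ollama.

```python
import asyncio
from typing import Awaitable, Callable


async def summarise_all(
    texts: list[str],
    summarise: Callable[[str], Awaitable[str]],
    max_concurrent: int = 8,
) -> list[str]:
    """Summarise every text, with at most `max_concurrent` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(text: str) -> str:
        async with semaphore:  # waits here once the limit is reached
            return await summarise(text)

    # gather preserves input order, whatever order the calls finish in.
    return await asyncio.gather(*(bounded(text) for text in texts))


async def first_words(text: str, limit: int = 30) -> str:
    """Stub summariser: the first `limit` words instead of an LLM call."""
    await asyncio.sleep(0)
    return " ".join(text.split()[:limit])
```

A call such as `asyncio.run(summarise_all(sections, first_words, max_concurrent=8))` returns one summary per section in the original order. Lowering the limit trades ingestion speed for lower peak load on the model server.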

📋 Roadmap

Implemented:
- FastAPI REST API wrapper (/documents/ingest/file, /query, /documents)
- Book-style Page Index and Visual tree dashboard
- Unlimited navigation depth with backtrack and verification

Planned:
- Streaming query responses via SSE
- Multi-document cross-reference queries
- Alembic migrations for schema versioning
- Support for docling table extraction (structured data cells as leaf nodes)

📄 License

MIT License — see LICENSE.

Release history

0.1.8: May 13, 2026
0.1.5: May 12, 2026 (this version)
0.1.4: May 12, 2026
0.1.3: May 12, 2026
0.1.2: May 12, 2026
0.1.1: May 12, 2026
0.1.0: May 12, 2026

Download files

Source Distribution: apex_rag-0.1.5.tar.gz (44.1 kB), uploaded May 12, 2026
Built Distribution: apex_rag-0.1.5-py3-none-any.whl (37.1 kB), uploaded May 12, 2026, Python 3

File details

Details for the file apex_rag-0.1.5.tar.gz.

File metadata

Download URL: apex_rag-0.1.5.tar.gz
Upload date: May 12, 2026
Size: 44.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for apex_rag-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`b2f400b5f8ae06ede0951cb5478cf8e6342c0e2a174d33791427cea186b6c97f`
MD5	`169ada35a17c922c7ec8de574ff76f76`
BLAKE2b-256	`218c46ed8893d1bf65c7358c3cc86e2ebc85e55dfbab73d705ac7728ce499173`

File details

Details for the file apex_rag-0.1.5-py3-none-any.whl.

File metadata

Download URL: apex_rag-0.1.5-py3-none-any.whl
Upload date: May 12, 2026
Size: 37.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for apex_rag-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`917ccbe624a1f0738120c9aaa8903f59c4e24bdf3ce28ce065d9793d6dbd49d0`
MD5	`03c37a309e6e69991d06d66b209d319b`
BLAKE2b-256	`1e93347ff0bfd462575a1820718578f0b5678c4b54c766e6c65b550823d084ef`
