
TreeDex

Tree-based, vectorless document RAG framework. Connect any LLM via URL or API key.

Index any document into a navigable tree structure, then retrieve relevant sections using any LLM. No vector databases, no embeddings — just structured tree retrieval.



How It Works


  1. Load — Extract pages from any supported format
  2. Index — LLM analyzes page groups and extracts hierarchical structure
  3. Build — Flat sections become a tree with page ranges and embedded text
  4. Query — LLM selects relevant tree nodes for your question
  5. Return — Get context text, source pages, and reasoning
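
In code, the whole pipeline is two calls. A minimal sketch of the five steps using the API documented later in this README (the file name and question are placeholders; any backend from the provider table works in place of Gemini):

from treedex import TreeDex, GeminiLLM

llm = GeminiLLM(api_key="YOUR_KEY")

# Steps 1-3 (Load, Index, Build) all happen inside from_file
index = TreeDex.from_file("report.pdf", llm=llm)

# Step 4: the LLM selects relevant tree nodes for the question
result = index.query("What is the main argument?")

# Step 5: context text, source pages, and the LLM's reasoning
print(result.context)
print(result.pages_str)   # e.g. "pages 5-8, 12-15"
print(result.reasoning)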

Why TreeDex instead of Vector DB?

The full feature-by-feature comparison with vector RAG and naive chunking is in the Benchmarks section below.


Supported LLM Providers


TreeDex works with every major AI provider out of the box. Pick what works for you:

One-liner backends (zero config)

| Backend | Provider | Default Model | Dependencies |
| --- | --- | --- | --- |
| GeminiLLM | Google | gemini-2.0-flash | google-generativeai |
| OpenAILLM | OpenAI | gpt-4o | openai |
| ClaudeLLM | Anthropic | claude-sonnet-4-20250514 | anthropic |
| MistralLLM | Mistral AI | mistral-large-latest | mistralai |
| CohereLLM | Cohere | command-r-plus | cohere |
| GroqLLM | Groq | llama-3.3-70b-versatile | None (stdlib) |
| TogetherLLM | Together AI | Llama-3-70b-chat-hf | None (stdlib) |
| FireworksLLM | Fireworks | llama-v3p1-70b-instruct | None (stdlib) |
| OpenRouterLLM | OpenRouter | claude-sonnet-4 | None (stdlib) |
| DeepSeekLLM | DeepSeek | deepseek-chat | None (stdlib) |
| CerebrasLLM | Cerebras | llama-3.3-70b | None (stdlib) |
| SambanovaLLM | SambaNova | Llama-3.1-70B-Instruct | None (stdlib) |
| HuggingFaceLLM | HuggingFace | Mistral-7B-Instruct | None (stdlib) |
| OllamaLLM | Ollama (local) | llama3 | None (stdlib) |

Universal backends

| Backend | Use case | Dependencies |
| --- | --- | --- |
| OpenAICompatibleLLM | Any OpenAI-compatible endpoint (URL + key) | None (stdlib) |
| LiteLLM | 100+ providers via the litellm library | litellm |
| FunctionLLM | Wrap any callable(str) -> str | None |
| BaseLLM | Subclass to build your own | None |

Quick Start

Install

# pip
pip install treedex

# uv (faster)
uv pip install treedex

# With optional LLM SDK
pip install treedex[gemini]      # Google Gemini
pip install treedex[openai]      # OpenAI
pip install treedex[claude]      # Anthropic Claude
pip install treedex[mistral]     # Mistral AI
pip install treedex[cohere]      # Cohere
pip install treedex[litellm]     # LiteLLM (100+ providers)
pip install treedex[all]         # Everything

# From source
pip install git+https://github.com/mithun50/TreeDex.git

# Development
git clone https://github.com/mithun50/TreeDex.git
cd TreeDex
pip install -e ".[dev]"

Pick your LLM and go

from treedex import TreeDex

# --- Google Gemini ---
from treedex import GeminiLLM
llm = GeminiLLM(api_key="YOUR_KEY")

# --- OpenAI ---
from treedex import OpenAILLM
llm = OpenAILLM(api_key="sk-...")

# --- Claude ---
from treedex import ClaudeLLM
llm = ClaudeLLM(api_key="sk-ant-...")

# --- Groq (free, fast) ---
from treedex import GroqLLM
llm = GroqLLM(api_key="gsk_...")

# --- Together AI ---
from treedex import TogetherLLM
llm = TogetherLLM(api_key="...")

# --- DeepSeek ---
from treedex import DeepSeekLLM
llm = DeepSeekLLM(api_key="...")

# --- Fireworks ---
from treedex import FireworksLLM
llm = FireworksLLM(api_key="...")

# --- OpenRouter (access any model) ---
from treedex import OpenRouterLLM
llm = OpenRouterLLM(api_key="...", model="anthropic/claude-sonnet-4")

# --- Cerebras ---
from treedex import CerebrasLLM
llm = CerebrasLLM(api_key="...")

# --- SambaNova ---
from treedex import SambanovaLLM
llm = SambanovaLLM(api_key="...")

# --- Mistral AI ---
from treedex import MistralLLM
llm = MistralLLM(api_key="...")  # pip install mistralai

# --- Cohere ---
from treedex import CohereLLM
llm = CohereLLM(api_key="...")  # pip install cohere

# --- HuggingFace ---
from treedex import HuggingFaceLLM
llm = HuggingFaceLLM(api_key="hf_...", model="mistralai/Mistral-7B-Instruct-v0.3")

# --- Local Ollama ---
from treedex import OllamaLLM
llm = OllamaLLM(model="llama3")

# Index and query (same for ALL providers)
index = TreeDex.from_file("document.pdf", llm=llm)
result = index.query("What is the main argument?")
print(result.context)
print(result.pages_str)  # "pages 5-8, 12-15"

Any OpenAI-compatible endpoint

from treedex import OpenAICompatibleLLM

# Works with ANY service that speaks OpenAI format
llm = OpenAICompatibleLLM(
    base_url="https://your-provider.com/v1",
    api_key="...",
    model="model-name"
)

100+ providers via LiteLLM

from treedex import LiteLLM

# pip install litellm
llm = LiteLLM("gpt-4o")                                    # OpenAI
llm = LiteLLM("anthropic/claude-sonnet-4-20250514")         # Claude
llm = LiteLLM("groq/llama-3.3-70b-versatile")              # Groq
llm = LiteLLM("together_ai/meta-llama/Llama-3-70b-chat-hf")# Together
llm = LiteLLM("bedrock/anthropic.claude-3-sonnet")          # AWS Bedrock
llm = LiteLLM("vertex_ai/gemini-pro")                       # Google Vertex
llm = LiteLLM("azure/gpt-4o")                               # Azure OpenAI

Wrap any function

from treedex import FunctionLLM
import requests

# Wrap any callable(str) -> str
llm = FunctionLLM(lambda prompt: my_custom_api(prompt))  # my_custom_api is your own function

# Or a named function
def call_my_model(prompt: str) -> str:
    url = "https://your-model-endpoint.example/generate"  # your own endpoint
    return requests.post(url, json={"prompt": prompt}).json()["text"]

llm = FunctionLLM(call_my_model)

Build your own backend

from treedex import BaseLLM

class MyLLM(BaseLLM):
    def generate(self, prompt: str) -> str:
        # Your logic here — call any API, local model, etc.
        return my_api_call(prompt)

llm = MyLLM()
index = TreeDex.from_file("doc.pdf", llm=llm)

Swap LLM at query time

# Build index with one LLM
index = TreeDex.from_file("doc.pdf", llm=gemini_llm)

# Query with a different one — same index, different brain
result = index.query("...", llm=groq_llm)

Supported Document Formats

| Format | Loader | Extra Dependencies |
| --- | --- | --- |
| PDF | PDFLoader | pymupdf |
| TXT / MD | TextLoader | None |
| HTML | HTMLLoader | None (stdlib) |
| DOCX | DOCXLoader | python-docx |

Use auto_loader(path) for automatic format detection, or pass a specific loader:

from treedex import TreeDex, TextLoader

index = TreeDex.from_file("notes.txt", llm=llm, loader=TextLoader())

API Reference

TreeDex

| Method | Description |
| --- | --- |
| TreeDex.from_file(path, llm, ...) | Build index from a file |
| TreeDex.from_pages(pages, llm, ...) | Build from pre-extracted pages |
| TreeDex.from_tree(tree, pages, llm?) | Create from existing tree |
| index.query(question, llm?) | Retrieve relevant sections |
| index.save(path) | Save index to JSON |
| TreeDex.load(path, llm?) | Load index from JSON |
| index.show_tree() | Print tree structure |
| index.stats() | Get index statistics |
| index.find_large_sections(...) | Find oversized nodes |
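
A short sketch of the persistence and inspection methods above, assuming llm is any backend from the Quick Start (file names are placeholders):

# Build once and persist the index as human-readable JSON
index = TreeDex.from_file("doc.pdf", llm=llm)
index.save("doc_index.json")

# Reload later (optionally with a different LLM) and inspect it
index = TreeDex.load("doc_index.json", llm=llm)
index.show_tree()        # print the tree structure
print(index.stats())     # index statistics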

QueryResult

| Property | Type | Description |
| --- | --- | --- |
| .context | str | Concatenated text from relevant sections |
| .node_ids | list[str] | IDs of selected tree nodes |
| .page_ranges | list[tuple] | [(start, end), ...] page ranges |
| .pages_str | str | Human-readable: "pages 5-8, 12-15" |
| .reasoning | str | LLM's explanation for selection |
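
For example, a query result can be unpacked like this (the node IDs and page numbers shown in the comments are illustrative):

result = index.query("What is the main argument?")

print(result.context)      # concatenated text of the selected sections
print(result.node_ids)     # e.g. ["2.1", "4.3"] (illustrative)
print(result.page_ranges)  # e.g. [(5, 8), (12, 15)]
print(result.pages_str)    # "pages 5-8, 12-15"
print(result.reasoning)    # the LLM's explanation for the selection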

LLM Backends

| Backend | Needs SDK? | One-liner |
| --- | --- | --- |
| GeminiLLM(api_key) | Yes | GeminiLLM("key") |
| OpenAILLM(api_key) | Yes | OpenAILLM("sk-...") |
| ClaudeLLM(api_key) | Yes | ClaudeLLM("sk-ant-...") |
| MistralLLM(api_key) | Yes | MistralLLM("key") |
| CohereLLM(api_key) | Yes | CohereLLM("key") |
| GroqLLM(api_key) | No | GroqLLM("gsk_...") |
| TogetherLLM(api_key) | No | TogetherLLM("key") |
| FireworksLLM(api_key) | No | FireworksLLM("key") |
| OpenRouterLLM(api_key) | No | OpenRouterLLM("key") |
| DeepSeekLLM(api_key) | No | DeepSeekLLM("key") |
| CerebrasLLM(api_key) | No | CerebrasLLM("key") |
| SambanovaLLM(api_key) | No | SambanovaLLM("key") |
| HuggingFaceLLM(api_key) | No | HuggingFaceLLM("hf_...") |
| OllamaLLM(model) | No | OllamaLLM("llama3") |
| LiteLLM(model) | Yes | LiteLLM("gpt-4o") |
| FunctionLLM(fn) | No | FunctionLLM(my_fn) |
| OpenAICompatibleLLM(url, model) | No | Any endpoint |
| BaseLLM (subclass) | No | Your own logic |

Benchmarks

TreeDex vs Vector DB vs Naive Chunking


Results from a real benchmark on the same document (NCERT Electromagnetic Waves, 14 pages, 10 queries). All three methods retrieve from the same content; only the indexing and retrieval approach differs. The numbers are regenerated by CI on every push.


| Feature | TreeDex | Vector RAG | Naive Chunking |
| --- | --- | --- | --- |
| Page Attribution | Exact source pages | Approximate | None |
| Structure Preserved | Full tree hierarchy | None | None |
| Index Format | Human-readable JSON | Opaque vectors | Text chunks |
| Embedding Model | Not needed | Required | Not needed |
| Infrastructure | None (JSON file) | Vector DB required | None |
| Core Dependencies | 2 (pymupdf, tiktoken) | 5-8+ | 2-5 |

Run your own: python benchmarks/run_benchmark.py --help or python benchmarks/compare_vectordb.py --help


Architecture

The architecture mirrors the pipeline above: document loaders extract pages, an LLM-driven indexer builds the section tree, and the query layer has the LLM navigate that tree to select context.

Running Tests

# Install dev dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# With coverage
pytest --cov=treedex

# Run specific test file
pytest tests/test_core.py -v

License

MIT License — Mithun Gowda B
