Agentic RAG Framework built with LangGraph and Ollama

AdaptiveRAG — Agentic RAG Framework for Local LLMs

Figure: AdaptiveRAG architecture diagram showing the setup and query graphs.

Self-optimising, fully local Retrieval-Augmented Generation built with LangGraph. AdaptiveRAG analyses your knowledge base, auto-tunes the pipeline, and routes every query through the best retrieval strategy — all without sending data to any external API.

What makes it agentic
How it works
Installation
Quick start
Configuration
Supported document formats
Validation queries
API reference
Project structure
Contributing
License

What makes it agentic

Most RAG pipelines execute the same fixed sequence regardless of what you ask. AdaptiveRAG uses LLM-driven decision nodes at every step so the path through the graph changes per query and per knowledge base.

Capability	Fixed RAG pipeline	AdaptiveRAG
Chunking strategy	Hard-coded	Chosen per document type (sentence / paragraph / code)
Chunk size	Fixed	Auto-tuned against your actual documents
Query expansion	None	HyDE (hypothetical document embedding) for vague queries
Retrieval passes	Single	Multi-hop follow-up when first pass is insufficient
Result reranking	None	Cross-encoder reranking for analytical / comparison queries
Answer quality	Not checked	Critic node scores the answer; retries with a new strategy if confidence is low
Parameter tuning	Manual	Optimizer agent tunes chunk size, top-k, temperature, and reranking automatically
Privacy	Often depends on an external API	100% local — no data leaves your machine
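To make the "chosen per document type" row concrete, here is a minimal sketch of strategy dispatch. It assumes the strategy is picked from the file extension alone, which the README does not confirm; `STRATEGY_BY_EXT`, `pick_strategy`, and `chunk` are hypothetical names, not AdaptiveRAG's chunker.

```python
import os
import re

# Hypothetical mapping from file extension to chunking strategy.
STRATEGY_BY_EXT = {".txt": "sentence", ".md": "paragraph", ".py": "code"}


def pick_strategy(filename: str) -> str:
    """Choose a chunking strategy from the extension (assumed heuristic)."""
    ext = os.path.splitext(filename)[1].lower()
    return STRATEGY_BY_EXT.get(ext, "paragraph")


def chunk(text: str, strategy: str) -> list[str]:
    """Split text into chunks according to the chosen strategy."""
    if strategy == "sentence":
        # Break after sentence-ending punctuation followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text)
    elif strategy == "code":
        # Break before each top-level def or class statement.
        parts = re.split(r"\n(?=def |class )", text)
    else:
        # Paragraph strategy: break on blank lines.
        parts = re.split(r"\n\s*\n", text)
    return [p.strip() for p in parts if p.strip()]
```

A real chunker would also enforce a maximum chunk size, which is the parameter the setup graph tunes.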

How it works

AdaptiveRAG is composed of two LangGraph state machines.

Setup graph — runs once at startup

load docs ──► profile KB ──► plan config ──► index ──► evaluate ──► orchestrate ──┐
                                                                         ▲          │
                                                                    critique ◄── tune_*
                                                                              (chunk / retrieval /
                                                                               generation / reranking)

Profile — the LLM classifies domain, structure type, and complexity of your documents
Plan — heuristic config is derived from the profile (chunk size, strategy, top-k, temperature)
Index — documents are chunked and embedded into ChromaDB
Evaluate — generated answers are scored against the expected answers of the validation queries using cosine similarity
Orchestrate — the LLM picks which parameter to tune next and loops until scores plateau
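The evaluate and orchestrate steps above can be sketched in a few lines. This assumes the generated and expected answers have already been embedded as vectors; `has_plateaued` and its `min_gain` threshold are hypothetical, since the README does not say how a plateau is detected.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def has_plateaued(history: list[float], min_gain: float = 0.01) -> bool:
    """True once the latest score beats the previous best by less than min_gain."""
    if len(history) < 2:
        return False
    return history[-1] - max(history[:-1]) < min_gain
```

A tuning loop would append the mean similarity over all validation queries to `history` after each parameter change and stop once `has_plateaued(history)` returns true.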

Query graph — runs for every question

classify ──► strategize ──► expand ──► retrieve ──► retrieval critic ──┐
    ▲                                                        │           │
    │                                                    multihop ◄──── ┘
    │                                                        │
    └──── retry ◄──── reflect ◄──── generate ◄──── rerank ◄─┘

Classify — query type detected (factual / analytical / code / comparison / summarisation)
Strategize — LLM decides which tools to use (HyDE, rerank, multihop, top-k)
Retrieve — vector search, optionally with expanded queries
Retrieval critic — retrieval quality is scored; if too low, a follow-up multi-hop query is issued
Generate — answer produced using the style matching the query type
Reflect — answer critic checks groundedness and completeness; retries if below threshold
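The generate, reflect, and retry portion of the query graph can be illustrated with a plain-Python loop. This is a sketch of the control flow only, not the LangGraph implementation: `AttemptState`, `answer_with_reflection`, the `threshold` of 0.7, and the `max_retries` of 2 are all assumptions, and the `generate` and `reflect` callables stand in for LLM-backed nodes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AttemptState:
    """Hypothetical per-query state; not AdaptiveRAG's actual state schema."""
    question: str
    answer: str = ""
    confidence: float = 0.0
    retries: int = 0
    trace: list[str] = field(default_factory=list)


def answer_with_reflection(
    question: str,
    generate: Callable[[AttemptState], str],
    reflect: Callable[[AttemptState], float],
    threshold: float = 0.7,  # assumed confidence cut-off
    max_retries: int = 2,    # assumed retry budget
) -> AttemptState:
    """Generate an answer, score it, and retry while confidence stays low."""
    state = AttemptState(question)
    while True:
        state.answer = generate(state)
        state.confidence = reflect(state)
        state.trace.append(
            f"attempt {state.retries + 1}: confidence={state.confidence:.2f}"
        )
        # Stop when the critic is satisfied or the retry budget is spent.
        if state.confidence >= threshold or state.retries >= max_retries:
            return state
        state.retries += 1
```

In the real graph a retry loops back to classification so that a different strategy can be chosen; here `generate` can read `state.retries` to vary its behaviour.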

Installation

pip install adaptiverag

With cross-encoder reranking (recommended for analytical or comparison queries):

pip install "adaptiverag[reranker]"

Prerequisite: Ollama must be running locally. Any missing models are pulled automatically the first time build_rag() is called — no manual ollama pull required.
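If you want to confirm that Ollama is reachable before calling build_rag(), a small standard-library check is enough. This sketch queries Ollama's `/api/tags` endpoint (which lists locally available models) at the default address; `ollama_is_running` is a hypothetical helper and is not part of the adaptiverag package.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434",
                      timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_running())
```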

Quick start

1. Add documents

Create a knowledge_base/ folder and drop in your files (.txt, .pdf, .md, .docx):

knowledge_base/
├── report.pdf
├── notes.md
└── spec.txt

2. Run

from adaptiverag import build_rag

# Indexes knowledge_base/, auto-tunes the pipeline, returns a ready instance
rag = build_rag()

result = rag.ask("What are the main findings?")
print(result)                # the answer (str(result) also works)
print(result.confidence)     # 0.0 – 1.0 self-assessed confidence
print(result.strategy)       # why the agent chose this retrieval path
print(result.trace)          # full step-by-step reasoning log

3. CLI

adaptiverag

Interactive prompt with the same agentic graph — type trace to see the last query's reasoning.

Configuration

rag = build_rag(
    llm_model        = "gemma4:latest",              # any Ollama chat model (auto-pulled)
    embed_model      = "nomic-embed-text:latest",    # any Ollama embedding model (auto-pulled)
    kb_path          = "./knowledge_base",           # path to your documents
    val_queries_path = "./validation_queries.json",  # optional — auto-generated from KB if omitted
)

Restrict retrieval to a single source file

# keyword prefix
result = rag.ask("from:report.pdf Summarise the methodology")

# or the parameter
result = rag.ask("Summarise the methodology", source_filter="report.pdf")
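The two forms are equivalent from the caller's point of view. As an illustration of how a `from:` prefix could be separated from the question, here is a minimal sketch; `split_source_filter` is a hypothetical helper, and letting the explicit parameter win when both are supplied is an assumption, not documented behaviour.

```python
import re
from typing import Optional

# "from:<file>" followed by whitespace and the actual question.
_FROM_PREFIX = re.compile(r"^from:(\S+)\s+(.*)$", re.DOTALL)


def split_source_filter(question: str,
                        source_filter: Optional[str] = None) -> tuple[str, Optional[str]]:
    """Return (clean_question, filename_or_None)."""
    match = _FROM_PREFIX.match(question.strip())
    if match:
        prefix_file, rest = match.groups()
        # Assumption: an explicit source_filter argument takes precedence.
        return rest.strip(), source_filter or prefix_file
    return question.strip(), source_filter
```

This pattern does not handle filenames that contain spaces; the `source_filter` parameter is the safer choice for those.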

Supported Ollama models

Any model available at ollama.com/library works. Recommended:

Role	Model
LLM (routing + answers)	`gemma4`, `llama3.2`, `mistral`, `qwen2.5`
Embeddings	`nomic-embed-text`, `mxbai-embed-large`

Supported document formats

Format	Extension	Notes
Plain text	`.txt`	UTF-8
PDF	`.pdf`	Text-based; scanned PDFs not supported
Markdown	`.md`	Code blocks, headings, and links stripped cleanly
Word	`.docx`	Requires `python-docx` (included)

Mixed formats in the same folder are fully supported.
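To preview which files in a folder fall under the supported formats, a short pathlib scan works. This is a sketch only: `find_documents` is a hypothetical helper, and whether AdaptiveRAG itself descends into subfolders is not stated in this README, so the recursive search is an assumption.

```python
from pathlib import Path

# Extensions listed in the table above.
SUPPORTED_EXTENSIONS = {".txt", ".pdf", ".md", ".docx"}


def find_documents(kb_path: str = "./knowledge_base") -> list[Path]:
    """List files under kb_path whose extension is a supported format."""
    root = Path(kb_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```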

Validation queries

The setup graph tunes pipeline parameters by scoring generated answers against expected answers. Provide your own queries for best results:

[
  {
    "query": "What problem does this research solve?",
    "expected_answer": "The research addresses the challenge of ..."
  },
  {
    "query": "What method is used for data collection?",
    "expected_answer": "Data was collected through ..."
  }
]

Pass the path via val_queries_path. If you omit it:

AdaptiveRAG checks for ./validation_queries.json
If not found, the LLM auto-generates queries from your documents and saves them to that path
You can then open the file, edit or extend the queries, and they will be used on the next run
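Because the file is meant to be edited by hand, it can help to check its shape before the next run. The sketch below validates the structure shown above (a JSON array of objects with `query` and `expected_answer` keys); `load_validation_queries` is a hypothetical helper, not a function exported by adaptiverag.

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"query", "expected_answer"}


def load_validation_queries(path: str = "./validation_queries.json") -> list[dict]:
    """Load the validation file and check that every entry has both keys."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError("validation file must contain a JSON array")
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"entry {index} is missing {sorted(missing)}")
    return data
```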

API reference

`build_rag(...) → AdaptiveRAG`

Parameter	Type	Default	Description
`llm_model`	`str`	`"gemma4:latest"`	Ollama model for routing and answer generation
`embed_model`	`str`	`"nomic-embed-text:latest"`	Ollama model for embeddings
`kb_path`	`str | None`	`"./knowledge_base"`	Folder containing your documents
`val_queries_path`	`str | None`	`"./validation_queries.json"`	Validation Q&A file (auto-generated if missing)

`AdaptiveRAG.ask(question, source_filter=None) → QueryResult`

Parameter	Type	Description
`question`	`str`	Natural-language question. Prefix with `from:<file>` to filter by source.
`source_filter`	`str | None`	Restrict retrieval to a single filename

`QueryResult` fields

Field	Type	Description
`answer`	`str`	The generated answer (`str(result)` also works)
`confidence`	`float`	Self-assessed confidence, 0.0 – 1.0
`retries`	`int`	Number of reflection retries used
`strategy`	`str`	One-line explanation of the retrieval strategy chosen
`trace`	`list[str]`	Complete step-by-step decision log
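The fields above map naturally onto a small dataclass. The following is a stand-in that mirrors the documented fields so the shape is easy to see; `QueryResultSketch` and its default values are assumptions, not the class shipped in `adaptiverag.api`.

```python
from dataclasses import dataclass, field


@dataclass
class QueryResultSketch:
    """Illustrative mirror of the documented QueryResult fields."""
    answer: str
    confidence: float
    retries: int = 0
    strategy: str = ""
    trace: list[str] = field(default_factory=list)

    def __str__(self) -> str:
        # Matches the documented behaviour that str(result) yields the answer.
        return self.answer


def is_trustworthy(result: QueryResultSketch, min_confidence: float = 0.6) -> bool:
    """Example consumer: gate downstream use on self-assessed confidence."""
    return result.confidence >= min_confidence
```

The `min_confidence` value is arbitrary; since the confidence is self-assessed by the model, a suitable cut-off is something to calibrate on your own data.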

Project structure

adaptiverag/
├── core/
│   ├── config.py        # constants and defaults
│   ├── models.py        # KBProfile, PipelineConfig dataclasses
│   └── runtime.py       # shared runtime singleton (RT)
├── components/
│   ├── chunker.py       # content-aware chunking strategies
│   ├── embedder.py      # Ollama embedding wrapper
│   ├── retriever.py     # ChromaDB retrieval
│   └── reranker.py      # cross-encoder reranking (optional)
├── pipeline/
│   ├── tools.py         # LangChain tools (retrieve, rerank, HyDE, generate)
│   ├── kb_analysis.py   # KB profiling and heuristic config planning
│   └── file_loader.py   # document loading (.txt, .pdf, .md, .docx)
├── graphs/
│   ├── setup_graph.py   # build-time LangGraph agent
│   └── query_graph.py   # per-query LangGraph agent
├── api.py               # public Python API (build_rag, AdaptiveRAG, QueryResult)
└── main.py              # CLI entry point

Requirements

Python ≥ 3.10
Ollama running at http://localhost:11434
Dependencies installed automatically via pip: langgraph, langchain-ollama, chromadb, pypdf, python-docx, numpy, tqdm

Contributing

Contributions are welcome. Please open an issue first to discuss what you would like to change.

git clone https://github.com/navid72m/adaptiveRAG.git
cd adaptiveRAG
pip install -e ".[dev]"

License

MIT © navid72m

Release history

This version

1.0.2

May 19, 2026

1.0.1

May 18, 2026

Download files

Source Distribution

adaptiverag-1.0.2.tar.gz (29.3 kB)

Uploaded May 19, 2026 Source

Built Distribution

adaptiverag-1.0.2-py3-none-any.whl (26.8 kB)

Uploaded May 19, 2026 Python 3

File details

Details for the file adaptiverag-1.0.2.tar.gz.

File metadata

Download URL: adaptiverag-1.0.2.tar.gz
Upload date: May 19, 2026
Size: 29.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for adaptiverag-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`389bdeaf34d2548d5bbcfefd3e4d8d2887ea44a257a33d08dfca52a7113c34d1`
MD5	`7648c7db7cf9c24f1d55d655332a8fce`
BLAKE2b-256	`cd7d7f826893007a68864e3c88cc8d1e0dc97c485582ad1b909775b3371af65c`

File details

Details for the file adaptiverag-1.0.2-py3-none-any.whl.

File metadata

Download URL: adaptiverag-1.0.2-py3-none-any.whl
Upload date: May 19, 2026
Size: 26.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for adaptiverag-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6fe8c6ac6c5bcdc3e1fadc0dcd8de82c76edc24ea5963c741fed33b064e1ca40`
MD5	`3127535aac0e9c7313467c35a03f999a`
BLAKE2b-256	`2104ab5fb4f2ebf3e90a2045a66968e8eddd1f10e51b97cff1d1eaab4ed850a1`
