Agentic RAG Framework built with LangGraph and Ollama
adaptiverag
Agentic RAG that thinks before it retrieves.
adaptiverag is a fully local, self-optimising Retrieval-Augmented Generation framework built on LangGraph and Ollama. It runs two autonomous agent graphs: a setup graph that analyses and indexes your knowledge base at startup, and a query graph that chooses a retrieval strategy for each query at runtime.
Why adaptiverag?
Most RAG pipelines run the same fixed sequence for every query. adaptiverag treats retrieval as a decision problem:
| Fixed pipeline | adaptiverag |
|---|---|
| Same chunk size for all docs | Analyses doc structure, picks chunk strategy automatically |
| Fixed top-k for every query | LLM-chosen top-k per query based on type and complexity |
| No query expansion | Uses HyDE expansion for vague queries |
| Single-pass retrieval | Multi-hop follow-up retrieval when first pass is insufficient |
| No quality check | Critic node scores the answer; retries with a new strategy if confidence is low |
| Manual parameter tuning | Optimizer agent tunes chunk size, top-k, temperature, and reranking automatically |
How it works
Setup graph (runs once at startup)
load docs → profile KB → plan config → index → evaluate → orchestrate ──┐
                                                               ↑        │
                                                           critique ←── tune_*
The orchestrator LLM analyses the knowledge base profile and current scores, then decides which parameter to tune next. It loops until scores stop improving or the iteration budget runs out.
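The stopping rule in that loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not adaptiverag's implementation: `tune`, `evaluate`, `propose`, `max_iters`, and `min_gain` are hypothetical names, and toy functions stand in for the LLM orchestrator and the validation scoring.

```python
from typing import Callable

Config = dict[str, float]


def tune(
    config: Config,
    evaluate: Callable[[Config], float],
    propose: Callable[[Config, float], Config],
    max_iters: int = 5,
    min_gain: float = 0.01,
) -> tuple[Config, float]:
    """Keep the best config; stop when gains stall or the budget is spent."""
    best, best_score = dict(config), evaluate(config)
    for _ in range(max_iters):  # iteration budget
        candidate = propose(best, best_score)
        score = evaluate(candidate)
        if score - best_score < min_gain:
            break  # scores stopped improving
        best, best_score = candidate, score
    return best, best_score


# Toy stand-ins (hypothetical): the score peaks at top_k == 6,
# and the "orchestrator" simply increments top_k each round.
def toy_evaluate(cfg: Config) -> float:
    return 1.0 - abs(cfg["top_k"] - 6) / 10


def toy_propose(cfg: Config, score: float) -> Config:
    return {**cfg, "top_k": cfg["top_k"] + 1}


best_config, best_score = tune({"top_k": 3}, toy_evaluate, toy_propose, max_iters=10)
print(best_config, best_score)
```

In the real setup graph the proposal step is an LLM decision about which parameter to tune; the sketch only shows how a score threshold and an iteration budget bound the loop.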
Query graph (runs per query)
classify → strategize → expand → retrieve → retrieval_critic ──┐
                ↑                   ↓                          │
              retry ← reflect ← generate ← rerank ← multihop ──┘
The strategist LLM picks tools (HyDE, multi-hop, reranker) based on query type. The answer critic scores the result and routes back for a retry if confidence is below the threshold.
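The retry behaviour can be pictured as a loop over strategies that stops once an answer clears a confidence threshold. The sketch below is an assumption-laden illustration, not the library's code: `Attempt`, `answer_with_retries`, `run`, the strategy names, and the default threshold of 0.7 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    answer: str
    confidence: float  # critic score in [0.0, 1.0]
    strategy: str


def answer_with_retries(
    query: str,
    strategies: list[str],
    run: Callable[[str, str], Attempt],
    threshold: float = 0.7,
) -> tuple[Attempt, int]:
    """Try strategies in order; return the chosen attempt and the retry count."""
    best: Optional[Attempt] = None
    for retries, strategy in enumerate(strategies):
        attempt = run(query, strategy)
        if best is None or attempt.confidence > best.confidence:
            best = attempt
        if attempt.confidence >= threshold:
            return attempt, retries  # confident enough, stop here
    if best is None:
        raise ValueError("at least one strategy is required")
    # No attempt cleared the threshold: fall back to the most confident one.
    return best, len(strategies) - 1


# Toy pipeline (hypothetical): each strategy yields a fixed critic score.
def toy_run(query: str, strategy: str) -> Attempt:
    scores = {"plain": 0.4, "hyde": 0.55, "multihop": 0.9}
    return Attempt(f"[{strategy}] answer to {query!r}", scores[strategy], strategy)


result, retries = answer_with_retries(
    "What changed?", ["plain", "hyde", "multihop"], toy_run
)
print(result.strategy, result.confidence, retries)
```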
Installation
pip install adaptiverag
# Optional: cross-encoder reranking (improves precision on complex queries)
pip install "adaptiverag[reranker]"
Requires Ollama running locally. Any missing models are pulled automatically on first run — no manual ollama pull needed.
Quick start
from adaptiverag import build_rag
# Indexes ./knowledge_base, auto-tunes the pipeline, returns a ready instance
rag = build_rag()
result = rag.ask("What are the main findings?")
print(result) # prints the answer
print(result.confidence) # 0.0 – 1.0
print(result.strategy) # one-line explanation of what the agent chose
print(result.trace) # full step-by-step reasoning trace
Custom paths and models
from adaptiverag import build_rag

rag = build_rag(
    llm_model="llama3.2:latest",  # any Ollama model
    embed_model="nomic-embed-text:latest",
    kb_path="/path/to/your/documents",
    val_queries_path="/path/to/validation.json",  # optional; auto-generated if omitted
)
Restrict retrieval to one file
result = rag.ask("Summarise the methodology", source_filter="paper.pdf")
# or using the inline prefix:
result = rag.ask("from:paper.pdf Summarise the methodology")
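How adaptiverag parses the inline prefix is not documented here. As a rough sketch of the idea, a `from:<file>` prefix could be split from the question like this; `split_source_prefix` is a hypothetical helper, and it assumes file names contain no spaces.

```python
import re
from typing import Optional

# "from:" followed by a file name without spaces, then the actual question.
_PREFIX = re.compile(r"^from:(\S+)\s+(.*)$", re.DOTALL)


def split_source_prefix(query: str) -> tuple[Optional[str], str]:
    """Return (source_filter, question); source_filter is None without a prefix."""
    text = query.strip()
    match = _PREFIX.match(text)
    if match is None:
        return None, text
    return match.group(1), match.group(2).strip()


print(split_source_prefix("from:paper.pdf Summarise the methodology"))
print(split_source_prefix("Summarise the methodology"))
```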
CLI
adaptiverag
Supported document formats
| Format | Extension |
|---|---|
| Plain text | .txt |
| PDF | .pdf |
| Markdown | .md |
| Word | .docx |
Drop files into your knowledge_base/ folder. Mixed formats are supported.
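Before indexing, it can help to see which files in a folder match the formats listed above. The following is a small pre-flight sketch, not part of adaptiverag: `split_by_support` is a hypothetical helper, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import Iterable, Union

# Extensions taken from the table of supported formats.
SUPPORTED = {".txt", ".pdf", ".md", ".docx"}


def split_by_support(paths: Iterable[Union[str, Path]]) -> tuple[list[str], list[str]]:
    """Split file names into (supported, skipped) based on their extension."""
    supported, skipped = [], []
    for path in map(Path, paths):
        bucket = supported if path.suffix.lower() in SUPPORTED else skipped
        bucket.append(path.name)
    return sorted(supported), sorted(skipped)


supported, skipped = split_by_support(
    ["notes.md", "paper.PDF", "data.csv", "report.docx"]
)
print("supported:", supported)
print("skipped:", skipped)
```

For a real folder, something like `split_by_support(Path("knowledge_base").iterdir())` would give the same split.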
Validation queries
The optimizer tunes pipeline parameters by scoring generated answers against expected answers. You can provide your own validation queries as a JSON array:
[
  {
    "query": "What problem does this paper solve?",
    "expected_answer": "The paper addresses the challenge of ..."
  }
]
Pass the path via val_queries_path. If you omit it, adaptiverag generates queries automatically from your documents using the LLM and saves them to ./validation_queries.json for you to review and edit.
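A hand-edited validation file is easy to get subtly wrong, so a quick structural check before passing it in may be worthwhile. This sketch assumes only the two keys shown in the example above; `check_validation_queries` is a hypothetical helper, not an adaptiverag function.

```python
import json

REQUIRED_KEYS = ("query", "expected_answer")


def check_validation_queries(text: str) -> list[dict[str, str]]:
    """Parse the JSON text and verify each entry has the two expected string fields."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("top level must be a JSON array")
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"entry {i} must be a JSON object")
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"entry {i}: {key!r} must be a non-empty string")
    return data


sample = '[{"query": "What problem does this paper solve?", "expected_answer": "It addresses ..."}]'
queries = check_validation_queries(sample)
print(len(queries), "validation queries look well-formed")
```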
QueryResult fields
| Field | Type | Description |
|---|---|---|
| answer | str | The generated answer (str(result) also works) |
| confidence | float | Self-assessed confidence, 0.0 – 1.0 |
| retries | int | Number of reflection retries needed |
| strategy | str | One-line explanation of the agent's retrieval strategy |
| trace | list[str] | Step-by-step log of every decision made |
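To make the field list concrete, here is an illustrative stand-in that mirrors the table. It is not the library's class: `QueryResultSketch`, the default values, and the `needs_review` helper with its 0.6 threshold are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QueryResultSketch:
    """Stand-in with the same fields as the table above (not the real class)."""

    answer: str
    confidence: float
    retries: int = 0
    strategy: str = ""
    trace: list[str] = field(default_factory=list)

    def __str__(self) -> str:
        # Mirrors the documented behaviour that str(result) yields the answer.
        return self.answer


def needs_review(result: QueryResultSketch, threshold: float = 0.6) -> bool:
    """Flag answers whose self-assessed confidence falls below the threshold."""
    return result.confidence < threshold


demo = QueryResultSketch(
    answer="The study reports three main findings.",
    confidence=0.45,
    retries=1,
    strategy="HyDE expansion, top-k 8",
    trace=["classify: vague", "strategize: hyde", "reflect: low confidence"],
)
print(str(demo))
print("needs review:", needs_review(demo))
```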
Requirements
- Python ≥ 3.10
- Ollama running locally (http://localhost:11434)
- At least one chat model and one embedding model available in Ollama (auto-pulled if missing)
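adaptiverag pulls missing models itself, but it can still be useful to check up front that Ollama is reachable and which models are already local. The sketch below uses Ollama's `GET /api/tags` endpoint, which lists local models; `fetch_tags` and `missing_models` are hypothetical helper names, and the sample payload is trimmed to the one field the code reads.

```python
import json
from urllib.request import urlopen


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> dict:
    """Call Ollama's GET /api/tags; raises URLError if the server is not reachable."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)


def missing_models(tags: dict, required: list[str]) -> list[str]:
    """Return the required model names that are not in the /api/tags payload."""
    available = {model["name"] for model in tags.get("models", [])}
    return [name for name in required if name not in available]


# Offline example with a trimmed payload; with a running server,
# replace sample_tags with fetch_tags().
sample_tags = {"models": [{"name": "llama3.2:latest"}]}
to_pull = missing_models(sample_tags, ["llama3.2:latest", "nomic-embed-text:latest"])
print("models that would need pulling:", to_pull)
```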
License
MIT