ConsistentRAG: Improving factual consistency in RAG through knowledge graph grounding and multi-agent refinement
A modular Python framework that improves factual consistency in Retrieval-Augmented Generation by grounding queries in a live knowledge graph. It uses a suite of graph algorithms to retrieve and rank structurally sound reasoning paths, which are then refined through a multi-agent loop where specialized AI "critics" collaboratively improve the answer across multiple iterations.
Try It in 5 Minutes (No Infrastructure Needed)
The fastest way to see ConsistentRAG in action uses KG-only mode — no vector database, no Docker, just Python + an LLM API key.
# 1. Clone and install
git clone https://github.com/pinkfloydsito/consistent-rag.git
cd consistent-rag
uv sync
# 2. Set your API key (pick one)
export OPENAI_API_KEY="sk-..." # OpenAI, Azure, or any OpenAI-compatible endpoint
# OR
export DEEPSEEK_API_KEY="sk-..." # DeepSeek (cheaper, no credit card needed)
# 3. Run the simplest example
uv run python examples/kg_rag_example.py
That's it. The example builds a knowledge graph from text and answers questions using graph reasoning.
Prerequisites
API Keys (Required)
ConsistentRAG needs an LLM for extraction, answering, and critique. Set at least one of these:
| Priority | Variable | Example | Notes |
|---|---|---|---|
| 1st | OPENAI_API_KEY | sk-... | OpenAI, Azure, or any OpenAI-compatible provider |
| 2nd | DEEPSEEK_API_KEY | sk-... | Fallback if OPENAI_API_KEY is absent |
| — | OPENAI_API_BASE | https://api.openai.com/v1 | Overrides the API endpoint (e.g., for Azure or local LLMs) |
Copy .env.example to .env and fill in your keys:
cp .env.example .env
# Edit .env with your keys
Infrastructure (Only for specific modes)
| What | When needed | How to start |
|---|---|---|
| Qdrant | Online modes (backend="qdrant") | make docker-up |
| PostgreSQL | Experiment tracking / integration tests | make docker-up (same command) |
The make docker-up command starts both Qdrant (port 6333) and PostgreSQL (port 5432). You don't need them for KG-only or FAISS-offline modes.
Installation
# With uv (recommended)
uv sync --extra dev
# With pip
pip install -e ".[dev]"
Quick Start: Three Paths
Path A: "Just want to try it" — KG-Only, No Servers
No vector DB. No Docker. Just a graph built from text.
from consistent_rag import ConsistentRAG
rag = ConsistentRAG(pipeline="kg_only", strategy="ppr")
context = """
TechStart was founded in 2019 by Alice Johnson in San Francisco.
The company focuses on AI-powered analytics for retail businesses.
"""
result = rag.query("Who founded TechStart?", context=context)
print(result.answer)
# Output: "TechStart was founded by Alice Johnson."
What's happening: The pipeline extracts triples from text → builds a NetworkX graph → uses PPR to rank reasoning paths → generates an answer.
See examples/kg_rag_example.py for a full runnable script.
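The PPR ranking step can be sketched with plain NetworkX. This is a toy illustration of the idea, not the library's actual code; the graph contents, the seed choice, and alpha=0.5 are assumptions for the example:

```python
# Minimal sketch: Personalized PageRank ranks graph nodes relative to
# query-relevant "seed" entities.
import networkx as nx

# Toy knowledge graph built from extracted (subject, relation, object) triples
g = nx.DiGraph()
g.add_edge("TechStart", "Alice Johnson", relation="founded_by")
g.add_edge("TechStart", "San Francisco", relation="located_in")
g.add_edge("TechStart", "AI analytics", relation="focuses_on")

# Personalization vector: probability mass on entities matched in the query
seeds = {"TechStart": 1.0}
scores = nx.pagerank(g, alpha=0.5, personalization=seeds)

# Nodes closest (in random-walk terms) to the seeds rank highest
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0])  # → TechStart
```

A lower alpha keeps the walk closer to the seeds; a higher alpha lets scores spread further into the graph, which is the knob ppr_alpha exposes in the configuration below.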
Path B: "I have documents to index" — FAISS + KG, Fully Local
Good for: offline use, no external services, saving/loading indices.
from consistent_rag import ConsistentRAG
rag = ConsistentRAG(
pipeline="hybrid",
backend="faiss",
strategy="ppr",
embedding_model="all-MiniLM-L6-v2", # local embeddings, no API key
)
# Index your documents
docs = [
"Pydantic AI is a Python agent framework...",
"DeepSeek develops large language models...",
]
rag.index_documents(docs, recreate=True, build_kg=True)
# Query
result = rag.query("What is Pydantic AI?")
print(result.answer)
# Save for later
rag.save_index("./my_index")
See examples/agentic_rag_example.py for a complete example.
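Conceptually, the FAISS backend embeds documents once at index time and retrieves by vector similarity at query time. A stripped-down sketch with stand-in embeddings (real embeddings would come from a model such as all-MiniLM-L6-v2; the vectors here are made up for illustration):

```python
# Conceptual sketch of offline vector retrieval: embed documents, embed the
# query, retrieve by cosine similarity.
import numpy as np

docs = [
    "Pydantic AI is a Python agent framework.",
    "DeepSeek develops large language models.",
]

# Stand-in embeddings (in practice: model.encode(docs))
doc_vecs = np.array([[0.9, 0.1], [0.2, 0.8]])
query_vec = np.array([0.85, 0.15])  # embedding of "What is Pydantic AI?"

# Cosine similarity = dot product of L2-normalized vectors
doc_norm = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
q_norm = query_vec / np.linalg.norm(query_vec)
sims = doc_norm @ q_norm

best = docs[int(np.argmax(sims))]
print(best)  # the Pydantic AI document ranks first
```

FAISS does the same nearest-neighbor search, but with index structures that scale to large collections.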
Path C: "Production setup" — Qdrant + Full Pipeline
Good for: large document collections, online dynamic KG, adaptive strategies.
# 1. Start infrastructure
make docker-up
from consistent_rag import ConsistentRAG
rag = ConsistentRAG(
pipeline="hybrid",
backend="qdrant",
collection_name="my_docs",
strategy="adaptive", # auto-selects best strategy per query
)
# Index documents (KG built incrementally)
rag.index_documents(docs, build_kg=True)
# Query with full agentic loop
result = rag.query("What is the relationship between X and Y?")
print(f"Answer: {result.answer}")
print(f"Iterations: {result.iterations_used}")
print(f"Critic score: {result.final_score:.2f}")
print(f"Seeds: {result.seeds_used}")
See examples/basic_rag_example.py for vector-only mode and examples/kg_ppr_rag_example.py for KG+PPR with Qdrant.
Streamlit Visualization
Run the interactive demo to watch the pipeline execute step-by-step:
uv run streamlit run streamlit_app.py
Features:
- Live query input with strategy selection
- Per-iteration subgraph visualization
- Critic scores, weight deltas, and convergence tracking
- Benchmark dataset selection (FaithEval, SQuAD, etc.)
- Reasoning path evolution charts
Configuration
All parameters are passed to ConsistentRAG() or PipelineConfig:
from consistent_rag import ConsistentRAG
rag = ConsistentRAG(
# Pipeline mode
pipeline="hybrid", # "vector_only", "kg_only", or "hybrid"
backend="faiss", # "faiss" or "qdrant" (vector pipelines only)
strategy="ppr", # "ppr", "nhops", "random_walk", "hybrid", "adaptive"
# Model settings
llm_model="gpt-4o-mini", # or "deepseek-chat", etc.
embedding_model="all-MiniLM-L6-v2",
# Agentic loop
max_iterations=3, # Max answer-critic iterations
improvement_threshold=0.8, # Stop if critic score >= this
# Graph algorithm
ppr_alpha=0.5, # PPR damping factor
max_paths=20, # Max reasoning paths in context
path_max_hops=3, # Max hops per path
)
See consistent_rag/pipeline/config.py for the full parameter list.
Pipeline Configurations
| Configuration | Type | Vector | KG | Graph Strategy |
|---|---|---|---|---|
| baseline_no_tool | Single-pass | No | No | — |
| baseline_vector | Single-pass | Yes | No | — |
| baseline_kg | Single-pass | No | Yes | PPR |
| baseline_hybrid | Single-pass | Yes | Yes | PPR |
| agentic_vector_only | Multi-agent | Yes | No | — |
| agentic_kg_nhops | Multi-agent | Yes | Yes | N-Hops BFS |
| agentic_kg_ppr | Multi-agent | Yes | Yes | PPR |
| agentic_kg_random_walk | Multi-agent | Yes | Yes | Random Walk |
| agentic_kg_hybrid | Multi-agent | Yes | Yes | Hybrid Semantic |
| agentic_kg_adaptive | Multi-agent | Yes | Yes | Adaptive (Full) |
Run any configuration via the evaluation CLI:
uv run python -m consistent_rag.evaluate_all \
--approaches agentic_kg_ppr \
--benchmark faitheval \
--limit 10 --verbose
Architecture
ConsistentRAG is organized as a layered architecture that implements a four-phase pipeline:
+-------------------------------------------------------------+
| Pipeline Layer |
| (AdaptiveHybrid / OnlineDynamic / OfflineKG / Baseline) |
+-------------------------------------------------------------+
| Orchestration Layer |
| (AdaptiveRouter Orchestrator / Multi-Agent Loop) |
+-------------------------------------------------------------+
| Agent Layer |
| Seed Agent | Router Agent | Answer Agent | Critic |
+-------------------------------------------------------------+
| Core Services Layer |
| Retriever | Graph Store | Multi-Algorithm Engine | LLM |
+-------------------------------------------------------------+
| Infrastructure Layer |
| Qdrant | NetworkX | FAISS | DeepSeek API (Default) |
| | OpenAI API (Configurable) |
+-------------------------------------------------------------+
Four Phases
- Dual-Mode Indexing & KG Construction — Offline (FAISS + pre-built graph) or Online (Qdrant + on-the-fly graph)
- Multi-Algorithmic Graph Retrieval Engine — PPR, Directed Random Walks, Hybrid Semantic Traversal, N-Hops BFS
- Adaptive Strategy Routing — Router Agent dynamically selects the optimal graph algorithm per query
- Agentic Iterative Refinement — Answer + Critic loop with context pruning, structural expansion, and instructional augmentation
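The refinement phase can be summarized as a loop between an Answer agent and a Critic. The schematic below shows the control flow with stubs in place of the real LLM-backed agents; the function names, signatures, and scoring logic are illustrative assumptions, not the library's API:

```python
# Schematic sketch of the Answer + Critic refinement loop. The real agents
# call an LLM; stubs stand in here so the control flow is runnable.

def answer_agent(question: str, context: str, feedback: str) -> str:
    # Stub: a real agent would prompt the LLM with context + critic feedback
    return f"Answer to {question!r} (revised: {bool(feedback)})"

def critic_agent(answer: str) -> tuple[float, str]:
    # Stub: a real critic returns a structured score and actionable feedback
    score = 0.9 if "revised: True" in answer else 0.6
    return score, "ground the claim in a reasoning path"

def refine(question, context, max_iterations=3, improvement_threshold=0.8):
    feedback = ""
    for i in range(1, max_iterations + 1):
        answer = answer_agent(question, context, feedback)
        score, feedback = critic_agent(answer)
        if score >= improvement_threshold:  # converged: critic is satisfied
            break
    return answer, score, i

answer, score, iterations = refine("Who founded TechStart?", "...")
print(iterations)  # → 2 (the stub critic accepts the first revision)
```

The max_iterations and improvement_threshold parameters from the Configuration section bound exactly this loop.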
Running Evaluations
The evaluation system runs any combination of the 10 pipeline configurations against the 5 benchmarks. Results are persisted to CSV (and optionally PostgreSQL) with full resume support.
# Quick smoke test: 1 config, 1 benchmark, 2 samples
uv run python -m consistent_rag.evaluate_all \
--approaches baseline_no_tool \
--benchmark faitheval \
--limit 2 --verbose
# Run a specific config against all benchmarks
uv run python -m consistent_rag.evaluate_all \
--approaches agentic_kg_ppr \
--benchmark all \
--limit 50 \
--csv results/agentic_kg_ppr.csv
# Run all configs against all benchmarks (full thesis experiment matrix)
uv run python -m consistent_rag.evaluate_all \
--approaches all \
--benchmark all \
--csv results/full_matrix.csv \
--metrics all
# Resume an interrupted run
uv run python -m consistent_rag.evaluate_all \
--approaches all \
--benchmark all \
--csv results/full_matrix.csv \
--resume
# With PostgreSQL persistence
uv run python -m consistent_rag.evaluate_all \
--approaches all \
--benchmark faitheval \
--csv results/faitheval.csv \
--postgres postgresql://consistent_rag:consistent_rag@localhost:5432/consistent_rag
CLI Options
| Flag | Description |
|---|---|
| --approaches | Comma-separated config names, or all |
| --benchmark | Comma-separated benchmark names, or all |
| --limit | Max samples per benchmark (default: 5) |
| --metrics | basic, llm, deepeval, or all |
| --csv | Output CSV path (auto-generated if omitted) |
| --resume | Skip completed samples in the existing CSV |
| --no-cache | Re-extract triplets instead of using the cache |
| --postgres | PostgreSQL URL for experiment tracking |
| --verbose | Print per-sample progress |
Build Offline Index
uv run python scripts/build_offline_index.py
Benchmarks
| Benchmark | Focus | Samples |
|---|---|---|
| FaithEval | Factual consistency (unanswerable, inconsistent, counterfactual) | ~15K |
| MuSiQue | Multi-hop reasoning across documents | ~25K |
| TimeQA | Temporal reasoning and evolving facts | ~20K |
| SQuAD | Single-hop extractive QA (control baseline) | ~100K |
| FinanceBench | Domain-specific financial document QA (SEC filings) | 150 |
MCP Tool
ConsistentRAG can be used as an MCP tool by any compatible agent:
# Start the MCP server
python -m consistent_rag.mcp_server
The server exposes a query tool with parameters:
- question (required): the question to answer
- context (optional): pre-provided context (skips retrieval)
- mode: online_dynamic or offline_static
- strategy: adaptive, ppr, random_walk, hybrid, or nhops
- top_k: number of documents to retrieve
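A hypothetical arguments payload an MCP client might send to the query tool (values are examples; only the parameter names come from the list above):

```python
# Illustrative MCP tool-call arguments for the `query` tool
arguments = {
    "question": "Who founded TechStart?",  # required
    "context": None,                       # optional: skips retrieval if set
    "mode": "offline_static",              # or "online_dynamic"
    "strategy": "ppr",                     # adaptive / ppr / random_walk / hybrid / nhops
    "top_k": 5,                            # number of documents to retrieve
}
print(sorted(arguments))
```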
Troubleshooting
"No API key found"
ValueError: No OpenAI or DeepSeek API key found
Fix: Set OPENAI_API_KEY or DEEPSEEK_API_KEY in your environment or .env file.
"Connection refused" to localhost:6333
qdrant_client.http.exceptions.ResponseHandlingException
Fix: Qdrant is not running. Start it with make docker-up. If you're using FAISS or KG-only mode, you don't need Qdrant.
"Module not found" after installation
Fix: Make sure you're in the correct virtual environment:
uv sync --extra dev
source .venv/bin/activate # or .venv\Scripts\activate on Windows
DeepSeek format errors
DeepSeek occasionally rejects message formats. The pipeline automatically retries once. If it persists, switch to OpenAI or another provider.
Out of memory during indexing
For large document collections, reduce max_triples_per_doc (default: 40) or use Qdrant instead of FAISS:
rag = ConsistentRAG(
pipeline="hybrid",
backend="qdrant", # better for large collections
max_triples_per_doc=20,
)
Development
make lint # Format check + linting
make format # Auto-fix formatting
make test-unit # Run unit tests (no external services)
make test # Run all tests
make ci # Full CI pipeline
make docker-up # Start Qdrant + PostgreSQL
make docker-down # Stop infrastructure
Project Structure
consistent_rag/
├── api.py # Public API (ConsistentRAG, QueryResult)
├── pipeline/
│ ├── config.py # PipelineConfig (unified configuration)
│ ├── pipeline.py # Main pipeline orchestrator
│ └── pipeline_factory.py # Pipeline configuration factory
├── llm.py # Universal LLM client (OpenAI-compatible)
├── embeddings.py # Embedding service (local + API)
├── retrievers/ # Vector retrievers (Qdrant, FAISS)
├── knowledge_graph/
│ ├── networkx_graph_store.py # NetworkX graph backend
│ ├── ppr_engine.py # Personalized PageRank
│ ├── algorithms.py # Random Walk, Hybrid, N-Hops
│ ├── extractor.py # LLM-based triple extraction
│ └── lsh_index.py # Locality Sensitive Hashing
├── agents/
│ ├── answer/ # Answer generation agent
│ ├── critic/ # Critic agent with structured feedback
│ └── retrieval/ # Retrieval orchestrator
├── benchmarks/ # Dataset loaders
├── evaluation/ # Metrics and experiment runner
└── examples/ # Runnable example scripts