
ConsistentRAG

Tests Python 3.10+ License: MIT

ConsistentRAG: Improving Factual Consistency in RAG via Graph-Based Retrieval and Ranking

A modular Python framework that improves factual consistency in Retrieval-Augmented Generation by grounding queries in a live knowledge graph. It uses a suite of graph algorithms to retrieve and rank structurally sound reasoning paths, which are then refined through a multi-agent loop where specialized AI "critics" collaboratively improve the answer across multiple iterations.


Try It in 5 Minutes (No Infrastructure Needed)

The fastest way to see ConsistentRAG in action uses KG-only mode — no vector database, no Docker, just Python + an LLM API key.

# 1. Clone and install
git clone https://github.com/pinkfloydsito/consistent-rag.git
cd consistent-rag
uv sync

# 2. Set your API key (pick one)
export OPENAI_API_KEY="sk-..."           # OpenAI, Azure, or any OpenAI-compatible endpoint
# OR
export DEEPSEEK_API_KEY="sk-..."         # DeepSeek (cheaper, no credit card needed)

# 3. Run the simplest example
uv run python examples/kg_rag_example.py

That's it. The example builds a knowledge graph from text and answers questions using graph reasoning.


Prerequisites

API Keys (Required)

ConsistentRAG needs an LLM for extraction, answering, and critique. Set at least one of these:

| Priority | Variable | Example | Notes |
|----------|----------|---------|-------|
| 1st | OPENAI_API_KEY | sk-... | OpenAI, Azure, or any OpenAI-compatible provider |
| 2nd | DEEPSEEK_API_KEY | sk-... | Fallback if the OpenAI key is absent |
| Optional | OPENAI_API_BASE | https://api.openai.com/v1 | Override the API endpoint (e.g., for Azure or local LLMs) |

Copy .env.example to .env and fill in your keys:

cp .env.example .env
# Edit .env with your keys

Infrastructure (Only for specific modes)

| What | When needed | How to start |
|------|-------------|--------------|
| Qdrant | Online modes (backend="qdrant") | make docker-up |
| PostgreSQL | Experiment tracking / integration tests | make docker-up (same command) |

The make docker-up command starts both Qdrant (port 6333) and PostgreSQL (port 5432). You don't need them for KG-only or FAISS-offline modes.


Installation

# With uv (recommended)
uv sync --extra dev

# With pip
pip install -e ".[dev]"

Quick Start: Three Paths

Path A: "Just want to try it" — KG-Only, No Servers

No vector DB. No Docker. Just a graph built from text.

from consistent_rag import ConsistentRAG

rag = ConsistentRAG(pipeline="kg_only", strategy="ppr")

context = """
TechStart was founded in 2019 by Alice Johnson in San Francisco.
The company focuses on AI-powered analytics for retail businesses.
"""

result = rag.query("Who founded TechStart?", context=context)
print(result.answer)
# Output: "TechStart was founded by Alice Johnson."

What's happening: The pipeline extracts triples from text → builds a NetworkX graph → uses PPR to rank reasoning paths → generates an answer.
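The ranking step can be sketched in a few lines of pure Python. This is an illustration of the personalized-PageRank idea only, not the library's PPR engine (which operates on a full NetworkX graph); the edge list mirrors the triples from the TechStart example:

```python
# Minimal personalized-PageRank sketch (illustration only).
def personalized_pagerank(edges, seed, alpha=0.5, iters=50):
    nodes = {n for e in edges for n in e}
    out = {n: [v for u, v in edges if u == n] for n in nodes}
    rank = {n: (1.0 if n == seed else 0.0) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - alpha) * (n == seed) for n in nodes}
        for n in nodes:
            if out[n]:
                share = alpha * rank[n] / len(out[n])
                for v in out[n]:
                    nxt[v] += share
            else:
                nxt[seed] += alpha * rank[n]  # dangling mass teleports to the seed
        rank = nxt
    return rank

edges = [("TechStart", "Alice Johnson"), ("TechStart", "San Francisco")]
rank = personalized_pagerank(edges, seed="TechStart")
print(max(rank, key=rank.get))  # TechStart
```

With alpha=0.5, half of each step's probability mass teleports back to the seed entity, so nodes structurally close to the query's seed dominate the ranking.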

See examples/kg_rag_example.py for a full runnable script.

Path B: "I have documents to index" — FAISS + KG, Fully Local

Good for: offline use, no external services, saving/loading indices.

from consistent_rag import ConsistentRAG

rag = ConsistentRAG(
    pipeline="hybrid",
    backend="faiss",
    strategy="ppr",
    embedding_model="all-MiniLM-L6-v2",  # local embeddings, no API key
)

# Index your documents
docs = [
    "Pydantic AI is a Python agent framework...",
    "DeepSeek develops large language models...",
]
rag.index_documents(docs, recreate=True, build_kg=True)

# Query
result = rag.query("What is Pydantic AI?")
print(result.answer)

# Save for later
rag.save_index("./my_index")

See examples/agentic_rag_example.py for a complete example.

Path C: "Production setup" — Qdrant + Full Pipeline

Good for: large document collections, online dynamic KG, adaptive strategies.

# 1. Start infrastructure (shell)
make docker-up

# 2. Index and query (Python)
from consistent_rag import ConsistentRAG

rag = ConsistentRAG(
    pipeline="hybrid",
    backend="qdrant",
    collection_name="my_docs",
    strategy="adaptive",  # auto-selects best strategy per query
)

# Index documents (KG built incrementally)
docs = ["...your documents..."]  # any list of document strings
rag.index_documents(docs, build_kg=True)

# Query with full agentic loop
result = rag.query("What is the relationship between X and Y?")

print(f"Answer: {result.answer}")
print(f"Iterations: {result.iterations_used}")
print(f"Critic score: {result.final_score:.2f}")
print(f"Seeds: {result.seeds_used}")

See examples/basic_rag_example.py for vector-only mode and examples/kg_ppr_rag_example.py for KG+PPR with Qdrant.


Streamlit Visualization

Run the interactive demo to watch the pipeline execute step-by-step:

uv run streamlit run streamlit_app.py

Features:

  • Live query input with strategy selection
  • Per-iteration subgraph visualization
  • Critic scores, weight deltas, and convergence tracking
  • Benchmark dataset selection (FaithEval, SQuAD, etc.)
  • Reasoning path evolution charts

Configuration

All parameters are passed to ConsistentRAG() or PipelineConfig:

from consistent_rag import ConsistentRAG

rag = ConsistentRAG(
    # Pipeline mode
    pipeline="hybrid",          # "vector_only", "kg_only", or "hybrid"
    backend="faiss",            # "faiss" or "qdrant" (vector pipelines only)
    strategy="ppr",             # "ppr", "nhops", "random_walk", "hybrid", "adaptive"

    # Model settings
    llm_model="gpt-4o-mini",    # or "deepseek-chat", etc.
    embedding_model="all-MiniLM-L6-v2",

    # Agentic loop
    max_iterations=3,           # Max answer-critic iterations
    improvement_threshold=0.8,  # Stop if critic score >= this

    # Graph algorithm
    ppr_alpha=0.5,              # PPR damping factor
    max_paths=20,               # Max reasoning paths in context
    path_max_hops=3,            # Max hops per path
)

See consistent_rag/pipeline/config.py for the full parameter list.
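The interplay of max_iterations and improvement_threshold can be pictured as a simple loop. This is an illustrative sketch, not the library's internals; answer_fn and critic_fn are hypothetical stand-ins for the Answer and Critic agents:

```python
# Sketch of the agentic stopping logic (illustration, not library code).
def refine(answer_fn, critic_fn, question, max_iterations=3, improvement_threshold=0.8):
    answer, score = None, 0.0
    for _ in range(max_iterations):
        answer = answer_fn(question, feedback=answer)   # revise using prior attempt
        score = critic_fn(question, answer)              # critic grades the draft
        if score >= improvement_threshold:               # early stop on a good answer
            break
    return answer, score

# Toy agents: each round appends to the draft; the critic scores by length.
ans, score = refine(
    lambda q, feedback=None: (feedback or "") + "x",
    lambda q, a: len(a) / 3,
    "demo",
)
print(ans, score)  # xxx 1.0
```

Raising improvement_threshold trades latency for stricter critic acceptance; the loop always terminates after max_iterations even if the threshold is never met.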


Pipeline Configurations

| Configuration | Type | Vector | KG | Graph Strategy |
|---------------|------|--------|----|----------------|
| baseline_no_tool | Single-pass | No | No | |
| baseline_vector | Single-pass | Yes | No | |
| baseline_kg | Single-pass | No | Yes | PPR |
| baseline_hybrid | Single-pass | Yes | Yes | PPR |
| agentic_vector_only | Multi-agent | Yes | No | |
| agentic_kg_nhops | Multi-agent | Yes | Yes | N-Hops BFS |
| agentic_kg_ppr | Multi-agent | Yes | Yes | PPR |
| agentic_kg_random_walk | Multi-agent | Yes | Yes | Random Walk |
| agentic_kg_hybrid | Multi-agent | Yes | Yes | Hybrid Semantic |
| agentic_kg_adaptive | Multi-agent | Yes | Yes | Adaptive (Full) |

Run any configuration via the evaluation CLI:

uv run python -m consistent_rag.evaluate_all \
    --approaches agentic_kg_ppr \
    --benchmark faitheval \
    --limit 10 --verbose

Architecture

ConsistentRAG operates as a four-phase pipeline:

+-------------------------------------------------------------+
|                     Pipeline Layer                          |
| (AdaptiveHybrid / OnlineDynamic / OfflineKG / Baseline)     |
+-------------------------------------------------------------+
|                   Orchestration Layer                        |
|    (AdaptiveRouter Orchestrator / Multi-Agent Loop)          |
+-------------------------------------------------------------+
|                      Agent Layer                            |
|   Seed Agent | Router Agent | Answer Agent | Critic         |
+-------------------------------------------------------------+
|                  Core Services Layer                        |
|   Retriever | Graph Store | Multi-Algorithm Engine | LLM    |
+-------------------------------------------------------------+
|                  Infrastructure Layer                       |
|   Qdrant | NetworkX | FAISS | DeepSeek API (Default)        |
|                       | OpenAI API (Configurable)           |
+-------------------------------------------------------------+

Four Phases

  1. Dual-Mode Indexing & KG Construction — Offline (FAISS + pre-built graph) or Online (Qdrant + on-the-fly graph)
  2. Multi-Algorithmic Graph Retrieval Engine — PPR, Directed Random Walks, Hybrid Semantic Traversal, N-Hops BFS
  3. Adaptive Strategy Routing — Router Agent dynamically selects the optimal graph algorithm per query
  4. Agentic Iterative Refinement — Answer + Critic loop with context pruning, structural expansion, and instructional augmentation
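Phase 3 can be pictured as a function from a query to a strategy name. The keyword heuristics below are hypothetical stand-ins chosen for illustration; the actual Router Agent uses an LLM to make this decision:

```python
# Hypothetical routing heuristics (the real Router Agent is LLM-driven).
def route(query: str) -> str:
    q = query.lower()
    if "when" in q or "since" in q:
        return "nhops"        # shallow temporal lookups favor bounded BFS
    if " and " in q or "relationship" in q:
        return "hybrid"       # multi-entity questions favor semantic traversal
    if "why" in q or "how" in q:
        return "random_walk"  # exploratory multi-hop reasoning
    return "ppr"              # default: seed-anchored ranking

print(route("What is the relationship between X and Y?"))  # hybrid
print(route("Who founded TechStart?"))                     # ppr
```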

Running Evaluations

The evaluation system runs any combination of the 10 pipeline configurations against the 5 benchmarks. Results are persisted to CSV (and optionally PostgreSQL) with full resume support.

# Quick smoke test: 1 config, 1 benchmark, 2 samples
uv run python -m consistent_rag.evaluate_all \
    --approaches baseline_no_tool \
    --benchmark faitheval \
    --limit 2 --verbose

# Run a specific config against all benchmarks
uv run python -m consistent_rag.evaluate_all \
    --approaches agentic_kg_ppr \
    --benchmark all \
    --limit 50 \
    --csv results/agentic_kg_ppr.csv

# Run all configs against all benchmarks (full thesis experiment matrix)
uv run python -m consistent_rag.evaluate_all \
    --approaches all \
    --benchmark all \
    --csv results/full_matrix.csv \
    --metrics all

# Resume an interrupted run
uv run python -m consistent_rag.evaluate_all \
    --approaches all \
    --benchmark all \
    --csv results/full_matrix.csv \
    --resume

# With PostgreSQL persistence
uv run python -m consistent_rag.evaluate_all \
    --approaches all \
    --benchmark faitheval \
    --csv results/faitheval.csv \
    --postgres postgresql://consistent_rag:consistent_rag@localhost:5432/consistent_rag

CLI Options

| Flag | Description |
|------|-------------|
| --approaches | Comma-separated config names or all |
| --benchmark | Comma-separated benchmark names or all |
| --limit | Max samples per benchmark (default: 5) |
| --metrics | basic, llm, deepeval, or all |
| --csv | Output CSV path (auto-generated if omitted) |
| --resume | Skip completed samples in existing CSV |
| --no-cache | Re-extract triplets instead of using cache |
| --postgres | PostgreSQL URL for experiment tracking |
| --verbose | Print per-sample progress |

Build Offline Index

uv run python scripts/build_offline_index.py

Benchmarks

| Benchmark | Focus | Samples |
|-----------|-------|---------|
| FaithEval | Factual consistency (unanswerable, inconsistent, counterfactual) | ~15K |
| MuSiQue | Multi-hop reasoning across documents | ~25K |
| TimeQA | Temporal reasoning and evolving facts | ~20K |
| SQuAD | Single-hop extractive QA (control baseline) | ~100K |
| FinanceBench | Domain-specific financial document QA (SEC filings) | 150 |

MCP Tool

ConsistentRAG can be used as an MCP tool by any compatible agent:

# Start the MCP server
python -m consistent_rag.mcp_server

The server exposes a query tool with parameters:

  • question (required): The question to answer
  • context (optional): Pre-provided context (skips retrieval)
  • mode: online_dynamic or offline_static
  • strategy: adaptive, ppr, random_walk, hybrid, nhops
  • top_k: Number of documents to retrieve
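A call to the query tool might carry arguments like the following. Only the parameter names come from the list above; the envelope shape depends on your MCP client and is an assumption here:

```python
# Hypothetical arguments for the MCP "query" tool; the parameter names
# are documented above, but the request envelope is illustrative only.
request = {
    "tool": "query",
    "arguments": {
        "question": "Who founded TechStart?",
        "mode": "offline_static",
        "strategy": "ppr",
        "top_k": 5,
    },
}
print(request["arguments"]["question"])
```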

Troubleshooting

"No API key found"

ValueError: No OpenAI or DeepSeek API key found

Fix: Set OPENAI_API_KEY or DEEPSEEK_API_KEY in your environment or .env file.

"Connection refused" to localhost:6333

qdrant_client.http.exceptions.ResponseHandlingException

Fix: Qdrant is not running. Start it with make docker-up. If you're using FAISS or KG-only mode, you don't need Qdrant.

"Module not found" after installation

Fix: Make sure you're in the correct virtual environment:

uv sync --extra dev
source .venv/bin/activate  # or .venv\Scripts\activate on Windows

DeepSeek format errors

DeepSeek occasionally rejects message formats. The pipeline automatically retries once; if the error persists, switch to OpenAI or another OpenAI-compatible provider.

Out of memory during indexing

For large document collections, reduce max_triples_per_doc (default: 40) or use Qdrant instead of FAISS:

rag = ConsistentRAG(
    pipeline="hybrid",
    backend="qdrant",  # better for large collections
    max_triples_per_doc=20,
)

Development

make lint          # Format check + linting
make format        # Auto-fix formatting
make test-unit     # Run unit tests (no external services)
make test          # Run all tests
make ci            # Full CI pipeline
make docker-up     # Start Qdrant + PostgreSQL
make docker-down   # Stop infrastructure

Project Structure

consistent_rag/
├── api.py                       # Public API (ConsistentRAG, QueryResult)
├── pipeline/
│   ├── config.py                # PipelineConfig (unified configuration)
│   ├── pipeline.py              # Main pipeline orchestrator
│   └── pipeline_factory.py      # Pipeline configuration factory
├── llm.py                       # Universal LLM client (OpenAI-compatible)
├── embeddings.py                # Embedding service (local + API)
├── retrievers/                  # Vector retrievers (Qdrant, FAISS)
├── knowledge_graph/
│   ├── networkx_graph_store.py  # NetworkX graph backend
│   ├── ppr_engine.py            # Personalized PageRank
│   ├── algorithms.py            # Random Walk, Hybrid, N-Hops
│   ├── extractor.py             # LLM-based triple extraction
│   └── lsh_index.py             # Locality Sensitive Hashing
├── agents/
│   ├── answer/                  # Answer generation agent
│   ├── critic/                  # Critic agent with structured feedback
│   └── retrieval/               # Retrieval orchestrator
├── benchmarks/                  # Dataset loaders
├── evaluation/                  # Metrics and experiment runner
└── examples/                    # Runnable example scripts
