
rag-pathology

RAGAS gives you a score. rag-pathology tells you what's wrong.

Diagnoses RAG pipeline failures by type and location. Four Soils classification. Epistemic mismatch detection. Zero dependencies.

pip install rag-pathology

The Problem

Your RAG pipeline scores 0.6 on RAGAS. Now what? Is it a retrieval problem? A generation problem? Are you retrieving facts when the user asked for procedures? RAGAS won't tell you. Neither will DeepEval, TruLens, or Promptfoo.

rag-pathology diagnoses the specific pathology at each stage of your pipeline.

Four Soils Classification

Every query is classified into one of four soil types, three of which are failure modes (inspired by the Parable of the Sower, Mark 4:3-8):

| Soil | Meaning | Fix |
|--------|---------|-----|
| PATH | Total retrieval miss: relevant docs exist but weren't retrieved | Fix embeddings, chunk size, or query expansion |
| ROCKY | Good retrieval, but generation ignores the context | Strengthen grounding prompt, add citations |
| THORNY | Good retrieval and generation, but noisy context corrupts output | Add reranking, reduce top-k |
| GOOD | Successful RAG | No action needed |
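
The table above reads as a decision procedure over per-query signals. Here is a minimal sketch of that logic, assuming three scores in [0, 1] (retrieval relevance, answer grounding, context noise) and illustrative thresholds. The names `Soil` and `classify_soil`, the score names, and the threshold values are hypothetical and are not taken from rag-pathology's implementation.

```python
from enum import Enum


class Soil(Enum):
    """Illustrative stand-in for the four soil types in the table above."""
    PATH = "path"
    ROCKY = "rocky"
    THORNY = "thorny"
    GOOD = "good"


def classify_soil(relevance: float, grounding: float, noise: float,
                  min_relevance: float = 0.3,
                  min_grounding: float = 0.5,
                  max_noise: float = 0.5) -> Soil:
    """Map three hypothetical [0, 1] scores to a soil type.

    relevance: how well the best retrieved chunk matches the query
    grounding: how much of the answer is supported by the retrieved chunks
    noise:     fraction of retrieved chunks that are off-topic
    """
    if relevance < min_relevance:
        return Soil.PATH      # nothing useful was retrieved
    if grounding < min_grounding:
        return Soil.ROCKY     # useful context was retrieved but not used
    if noise > max_noise:
        return Soil.THORNY    # answer is grounded, but the context is cluttered
    return Soil.GOOD


print(classify_soil(relevance=0.1, grounding=0.9, noise=0.0))  # Soil.PATH
print(classify_soil(relevance=0.9, grounding=0.2, noise=0.0))  # Soil.ROCKY
print(classify_soil(relevance=0.9, grounding=0.8, noise=0.7))  # Soil.THORNY
print(classify_soil(relevance=0.9, grounding=0.8, noise=0.1))  # Soil.GOOD
```

Checking retrieval before generation, and generation before noise, mirrors the order of the pipeline stages, so the first failing stage determines the label.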

Quick Start

from rag_pathology import RAGDiagnoser, RAGQuery, Chunk

diagnoser = RAGDiagnoser("my_pipeline")

query = RAGQuery(
    query="What is Ghana's GDP growth rate?",
    retrieved_chunks=[
        Chunk("Ghana GDP growth is 6.0% in 2025", score=0.95),
        Chunk("Recipe for jollof rice", score=0.1),
    ],
    generated_answer="Ghana's GDP growth rate is 6.0%.",
)

diagnosis = diagnoser.diagnose_query(query)
print(diagnosis.soil_type)        # SoilType.GOOD
print(diagnosis.failure_stage)    # FailureStage.NONE
print(diagnosis.evidence)         # "Healthy RAG: relevance=0.45, grounding=0.67..."

# Pipeline-level diagnosis (the sample values below are illustrative and
# imply a run with more than one diagnosed query)
pipeline = diagnoser.pipeline_diagnosis()
print(pipeline.summary())
print(pipeline.overall_health)    # 0.67
print(pipeline.recommendations)   # ["33% of queries are THORNY..."]
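
The pipeline-level figures above can be read as an aggregate over per-query soil labels. A rough sketch of one way such an aggregate could be computed, assuming health is simply the fraction of GOOD queries. That definition, the function name `summarize`, and the recommendation wording are assumptions for illustration, not the library's documented behaviour.

```python
from collections import Counter

# Fix hints copied from the Four Soils table; GOOD needs no action.
FIXES = {
    "PATH": "fix embeddings, chunk size, or query expansion",
    "ROCKY": "strengthen grounding prompt, add citations",
    "THORNY": "add reranking, reduce top-k",
}


def summarize(soil_labels: list[str]) -> tuple[float, list[str]]:
    """Return (health, recommendations) for a list of per-query soil labels."""
    if not soil_labels:
        return 0.0, []
    counts = Counter(soil_labels)
    total = len(soil_labels)
    # Assumed definition: health is the share of queries labelled GOOD.
    health = round(counts["GOOD"] / total, 2)
    recommendations = [
        f"{round(100 * counts[soil] / total)}% of queries are {soil}: {fix}"
        for soil, fix in FIXES.items()
        if counts[soil]
    ]
    return health, recommendations


health, recs = summarize(["GOOD", "GOOD", "THORNY"])
print(health)  # 0.67
print(recs)    # ['33% of queries are THORNY: add reranking, reduce top-k']
```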

Epistemic Mismatch Detection

Detects when your retrieval returns the wrong type of knowledge:

  • User asks "How do I register a company?" (PROCEDURAL)
  • RAG retrieves "Companies in Ghana must register with RGD" (FACTUAL)
  • The chunk matches the query's topic but is the wrong epistemic type: the user needs steps, and the chunk only states a requirement
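
The mismatch in the example above can be illustrated with a simple keyword heuristic. This is only a sketch under stated assumptions: the regular expressions, the two-way PROCEDURAL/FACTUAL split, and the function names `query_type`, `chunk_type`, and `epistemic_mismatch` are hypothetical, and rag-pathology's own detection may work differently.

```python
import re

# Queries that ask for steps (illustrative patterns only).
PROCEDURAL_QUERY = re.compile(
    r"^\s*(how (do|can|should|to)\b|what are the steps\b)", re.IGNORECASE
)
# Chunks that look like instructions: step markers, sequencing words,
# or lines that start with "1." / "2)".
PROCEDURAL_CHUNK = re.compile(
    r"\b(step \d+|first|then|next|finally)\b|^\s*\d+[.)]\s",
    re.IGNORECASE | re.MULTILINE,
)


def query_type(query: str) -> str:
    """Label a query PROCEDURAL if it asks how to do something, else FACTUAL."""
    return "PROCEDURAL" if PROCEDURAL_QUERY.search(query) else "FACTUAL"


def chunk_type(text: str) -> str:
    """Label a chunk PROCEDURAL if it reads like instructions, else FACTUAL."""
    return "PROCEDURAL" if PROCEDURAL_CHUNK.search(text) else "FACTUAL"


def epistemic_mismatch(query: str, chunk: str) -> bool:
    """True when the chunk's knowledge type differs from what the query asks for."""
    return query_type(query) != chunk_type(chunk)


query = "How do I register a company?"
print(epistemic_mismatch(query, "Companies in Ghana must register with RGD"))
# True: procedural question, factual chunk
print(epistemic_mismatch(query, "Step 1: reserve a name. Then submit the forms."))
# False: procedural question, procedural chunk
```

A keyword heuristic like this keeps the zero-dependency property but will misclassify paraphrased instructions, so its output is best treated as a hint.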

License

MIT
