Heuristic quality metrics for RAG retrieval and grounded answers. Python port of @mukundakatta/rag-quality-kit.

rag-quality-kit

Heuristic quality metrics for RAG retrieval and grounded answers. Zero runtime dependencies, pure-Python.

Python port of @mukundakatta/rag-quality-kit. The JS sibling has the original heuristics; this README sticks to the Python API.

Install

pip install rag-quality-kit

Usage

from rag_quality_kit import score, missing_evidence

question = "Who wrote Hamlet and when was it first performed?"
contexts = [
    {"id": "doc-1", "text": "Hamlet is a tragedy by William Shakespeare, written around 1600."},
    {"id": "doc-2", "text": "Records suggest Hamlet was first performed in 1602."},
]
answer = "Hamlet was written by Shakespeare and first performed in 1602."

r = score(question, contexts, answer)
r.groundedness         # 0..1 -- answer terms that appear in any context
r.context_relevance    # 0..1 -- question terms covered by the contexts
r.answer_relevance     # 0..1 -- question terms covered by the answer
r.conciseness          # 0..1 -- 1.0 if answer is roughly question-sized, decays as it balloons
r.overall              # unweighted mean of the four

missing_evidence(answer, contexts)   # -> list[str] of answer terms not in any context

Metrics

| Metric | Range | Behavior |
| --- | --- | --- |
| `groundedness` | 0..1 | Fraction of answer terms found in any context. |
| `context_relevance` | 0..1 | Fraction of (longer) question terms covered by the contexts. Mirrors the JS `retrievalCoverage`. |
| `answer_relevance` | 0..1 | Fraction of question terms that the answer addresses. |
| `conciseness` | 0..1 | 1.0 when the answer is up to ~2x the question's term count; linearly decays to 0 at 10x. |
| `overall` | 0..1 | Unweighted mean of the four. |
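
The `conciseness` and `overall` rows can be read as a small piecewise-linear formula. The sketch below is an illustration written from the table alone, not the package's source: it assumes the "~2x" and "10x" thresholds are exact ratio cutoffs, and the function names and the handling of an empty question are hypothetical.

```python
def conciseness(question_terms: int, answer_terms: int) -> float:
    """Piecewise-linear conciseness score sketched from the Metrics table.

    Assumption: "up to ~2x" and "at 10x" are treated as exact cutoffs on the
    ratio answer_terms / question_terms. The real package may differ.
    """
    if question_terms <= 0:
        # Assumption: nothing to compare against, so do not penalize.
        return 1.0
    ratio = answer_terms / question_terms
    if ratio <= 2.0:
        return 1.0
    if ratio >= 10.0:
        return 0.0
    # Linear decay from 1.0 at a 2x ratio down to 0.0 at a 10x ratio.
    return (10.0 - ratio) / 8.0


def overall(groundedness: float, context_relevance: float,
            answer_relevance: float, conciseness_score: float) -> float:
    """Unweighted mean of the four metrics."""
    total = groundedness + context_relevance + answer_relevance + conciseness_score
    return total / 4.0
```

Under these assumptions, an answer six times the question's term count lands halfway down the decay.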

All metrics are heuristic and based on token overlap: fast, deterministic, and free of LLM calls. For evaluation-grade scoring, layer an LLM judge on top.
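
To make "token-overlap based" concrete, here is a minimal from-scratch sketch of an overlap fraction and a list of uncovered terms. It is not the package's implementation: the tokenizer (lowercased alphanumeric runs, no stopword or length filtering), the helper names `terms`, `overlap_fraction`, and `uncovered_terms`, and the 0.0 returned for empty input are all assumptions, so scores may not match what `score` or `missing_evidence` return.

```python
import re


def terms(text: str) -> list[str]:
    """Unique lowercased alphanumeric tokens, in order of first appearance.

    Assumption: a plain regex tokenizer with no stopword filtering.
    """
    seen: dict[str, None] = {}
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        seen.setdefault(token, None)
    return list(seen)


def reference_vocabulary(reference_texts: list[str]) -> set[str]:
    """Union of the terms found in all reference texts."""
    vocab: set[str] = set()
    for text in reference_texts:
        vocab.update(terms(text))
    return vocab


def overlap_fraction(source_text: str, reference_texts: list[str]) -> float:
    """Fraction of source terms that appear in any reference text."""
    source = terms(source_text)
    if not source:
        # Assumption: an empty source scores 0.0 rather than raising.
        return 0.0
    vocab = reference_vocabulary(reference_texts)
    return sum(1 for term in source if term in vocab) / len(source)


def uncovered_terms(source_text: str, reference_texts: list[str]) -> list[str]:
    """Source terms that appear in none of the reference texts."""
    vocab = reference_vocabulary(reference_texts)
    return [term for term in terms(source_text) if term not in vocab]
```

Calling `overlap_fraction(answer, context_texts)` gives a groundedness-style number, and swapping the arguments to `overlap_fraction(question, context_texts)` gives a coverage-style one; the same overlap idea underlies both.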

API differences from the JS sibling

The Python signature is score(question, contexts, answer), with positional arguments, instead of scoreRag({ query, answer, contexts }).
It returns a QualityResult dataclass instead of a plain object.
Metric names: context_relevance replaces retrievalCoverage; groundedness is unchanged. The port adds two heuristics, answer_relevance and conciseness. The aggregate is overall (was score) and averages all four metrics.
The citationCoverage metric is dropped: it depends heavily on citation format and is best owned by the calling app. Use missing_evidence(answer, contexts) for an analogous signal.
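
For code that still builds the JS-style payload, a thin adapter can map it onto the positional Python call. This is a hypothetical helper, not part of either package: the name `score_rag` and the injected `score_fn` parameter are assumptions made so the sketch runs without the package installed.

```python
from typing import Any, Callable, Mapping


def score_rag(payload: Mapping[str, Any], score_fn: Callable[..., Any]) -> Any:
    """Hypothetical adapter from the JS-style payload to a positional scorer.

    payload uses the JS key names: "query", "answer", "contexts".
    score_fn is expected to accept (question, contexts, answer) positionally,
    which is the order the Python signature above describes.
    """
    return score_fn(payload["query"], payload["contexts"], payload["answer"])
```

With the package installed, usage would presumably look like `score_rag({"query": q, "answer": a, "contexts": ctxs}, score)` after `from rag_quality_kit import score`.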

See the JS sibling for the original heuristics and broader design notes.

Release history

0.1.0 (Apr 27, 2026)

Download files

Source Distribution

rag_quality_kit-0.1.0.tar.gz (6.1 kB)

Uploaded Apr 27, 2026 Source

Built Distribution

rag_quality_kit-0.1.0-py3-none-any.whl (5.7 kB)

Uploaded Apr 27, 2026 Python 3

File details

Details for the file rag_quality_kit-0.1.0.tar.gz.

File metadata

Download URL: rag_quality_kit-0.1.0.tar.gz
Upload date: Apr 27, 2026
Size: 6.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for rag_quality_kit-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `bf20a1ca372b8bfb3c5f404cc82eccf47199bdddd155c198687b5561c73f284e` |
| MD5 | `2d1b7e870346e62e1dd7e40d2a3534e7` |
| BLAKE2b-256 | `923b0ce9dcf77e1c5adb0aef3fc423e656cfd81e2f160855eeaf50739f0429ca` |
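
A published digest is only useful if it is compared against the file actually downloaded. The sketch below shows one way to do that with the standard library's `hashlib`; the function names and the local file path are hypothetical, and the expected digest would be copied from the table above.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path: str, expected_hex: str) -> bool:
    """True when the file's SHA-256 equals the published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_sha256("rag_quality_kit-0.1.0.tar.gz", "bf20a1ca...")` with the full SHA256 value from the table should return True for an intact download, assuming the file sits in the current directory.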

File details

Details for the file rag_quality_kit-0.1.0-py3-none-any.whl.

File metadata

Download URL: rag_quality_kit-0.1.0-py3-none-any.whl
Upload date: Apr 27, 2026
Size: 5.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for rag_quality_kit-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `46a0b61162a9da136c8d502513f1ccc9ae4eefb6d833077c7aa7ee731b59ef3b` |
| MD5 | `522142a7fa58d3aa0d2865e425f808b2` |
| BLAKE2b-256 | `59d0fbe4099fd936448524dd8dda2cfcc4e93dee26af141156d611d15eaa6945` |
