A claim-support / faithfulness scorer for Inspect AI — does the transcript actually substantiate the claimed answer?

These details have not been verified by PyPI

Project links

Project description

inspect-claim-support

A claim-support (faithfulness / groundedness) scorer for Inspect AI, packaged as a standalone extension.

claim_support assesses whether a claimed answer is actually substantiated by the conversation transcript — not whether it is correct in absolute terms. It is a model-graded scorer with a rubric that maps SUPPORTED / PARTIAL / UNSUPPORTED onto Inspect's CORRECT / PARTIAL / INCORRECT. A grader parse failure (the grader model not emitting a parseable verdict) is treated as a scoring-instrument failure and returns Score.unscored(), keeping the sample out of the accuracy denominator rather than recording it as a non-answer from the model under test.

Why it earns its place: absence isn't support

The rubric refuses to let absence of evidence pass as support. A negative claim like "I made no network calls" only scores SUPPORTED if the transcript is actually capable of showing that class of event. If the transcript cannot expose the relevant events, the claim is PARTIAL or UNSUPPORTED — never SUPPORTED. This surfaces overclaims instead of laundering them through a plausible rationale.

The scorer assesses support against the Inspect transcript only (transcript-visible events), not against actual runtime truth in the environment.

Install

pip install inspect-claim-support

Use

from inspect_ai import Task
from inspect_claim_support import claim_support

task = Task(
    dataset=...,
    solver=...,
    scorer=claim_support(),   # optionally: claim_support(model="openai/gpt-4o")
)

Once installed, the scorer is also resolvable by its namespaced registry name inspect_claim_support/claim_support via Inspect's setuptools entry point.

Parameters

template — grading template (defaults to a SUPPORTED / PARTIAL / UNSUPPORTED rubric with the absence-isn't-support boundary built in).
model — model to use for grading (defaults to the model being evaluated).

Origin & credit

This scorer originated as UKGovernmentBEIS/inspect_ai#4166 (addressing issue #4143). The Inspect maintainers judged that it better fits an external package than Inspect core, so it is distributed here. The implementation uses only Inspect's public API (the internal chat_history helper is reimplemented locally for transcript rendering).

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.1

Jun 25, 2026

0.1.0 yanked

Jun 25, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

inspect_claim_support-0.1.1.tar.gz (6.4 kB view details)

Uploaded Jun 25, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

inspect_claim_support-0.1.1-py3-none-any.whl (6.6 kB view details)

Uploaded Jun 25, 2026 Python 3

File details

Details for the file inspect_claim_support-0.1.1.tar.gz.

File metadata

Download URL: inspect_claim_support-0.1.1.tar.gz
Upload date: Jun 25, 2026
Size: 6.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for inspect_claim_support-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`4754ae28b36d2ed9022f47c14e823e1857aeb5fe02630f69a34b267ae03d493b`
MD5	`a3cb8041b5661c6fb9a1ee6cae802b81`
BLAKE2b-256	`61f532af5f3112a543728790ec1f20950d2b18070098fb025ec6ed595e35d765`

See more details on using hashes here.

File details

Details for the file inspect_claim_support-0.1.1-py3-none-any.whl.

File metadata

Download URL: inspect_claim_support-0.1.1-py3-none-any.whl
Upload date: Jun 25, 2026
Size: 6.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for inspect_claim_support-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ad5a56a80e8d2135ac66dfdc16328ff7784b9af4cba304a1588344c1c3fb8ec2`
MD5	`fc53362f79467965cb182432ba9c7d9a`
BLAKE2b-256	`c7f8b7c53ee38ef5d5a2dff4c7ba933347857caff230e0ede8344a15e7fdc5ee`

See more details on using hashes here.

inspect-claim-support 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

inspect-claim-support

Why it earns its place: absence isn't support

Install

Use

Parameters

Origin & credit

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes