Inspect AI `Scorer` adapter for whatifd. Phase 4B.2 of the v0.1 plan.
whatifd-inspect-ai
Install
pip install whatifd-inspect-ai
Installing the package pulls in `whatifd` and `inspect-ai>=0.3.216,<0.4`. The pin is a lower bound plus a minor-version cap, since Inspect AI is pre-1.0 and ships breaking changes in minor bumps.
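To make the pin concrete, here is a minimal sketch of which versions the range `>=0.3.216,<0.4` admits. It is a simplification that assumes plain dotted numeric versions (no pre-release or post-release suffixes); `parse_version` and `satisfies_pin` are hypothetical helpers, not part of this package, and real resolvers apply the full PEP 440 rules.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Split a plain dotted version such as '0.3.216' into integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_pin(version: str) -> bool:
    """Return True if the version falls inside >=0.3.216,<0.4 (simplified)."""
    parsed = parse_version(version)
    # Tuple comparison is element-wise, so (0, 4, 0) is not below (0, 4).
    return (0, 3, 216) <= parsed < (0, 4)


for candidate in ("0.3.215", "0.3.216", "0.3.300", "0.4.0"):
    print(candidate, satisfies_pin(candidate))
```

Only the two middle candidates fall inside the range: the first is below the lower bound and the last hits the minor-version cap.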
Usage
from inspect_ai.scorer import Score, Target
from inspect_ai.solver import TaskState
from whatifd_inspect_ai import InspectAIScorer
from whatifd.contract import ScoreCase


def score_fn(case: ScoreCase) -> Score:
    """Wire the user's Inspect AI scorer into the (ScoreCase) -> Score
    callable shape this adapter expects. Typical pattern: build a
    TaskState from the case, run the Inspect AI scorer, return Score."""
    state = TaskState(
        model="anthropic/claude-opus-4-7",
        sample_id=case.trace_id,
        epoch=0,
        input=case.input.user_message,
        messages=[],
        output=...,  # ModelOutput from case.replayed_output.text
    )
    target = Target(case.original_output.text)
    return my_inspect_scorer(state, target)


scorer = InspectAIScorer(
    score_fn=score_fn,
    judge_provider="anthropic",
    judge_model_id="claude-opus-4-7",
    rubric_id="faithfulness-v1",
    rubric_text="Score 0-1 by faithfulness to the original output...",
    scoring_parameters={"temperature": 0.0, "max_tokens": 256},
)

# Plug into the whatifd pipeline alongside a TraceSource.
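The methodology arguments passed to the adapter (judge provider, judge model, rubric, scoring parameters) determine whether a previously computed score can be reused. As an illustration only, the sketch below shows one way such components could be combined into a deterministic key. `methodology_key` and its field names are hypothetical and do not describe the adapter's actual `cache_key_components` implementation.

```python
import hashlib
import json


def methodology_key(
    judge_provider: str,
    judge_model_id: str,
    rubric_text: str,
    scoring_parameters: dict,
) -> str:
    """Hypothetical helper: derive a stable key from the judging methodology."""
    # Hash the rubric so the key changes whenever the rubric wording changes.
    rubric_hash = hashlib.sha256(rubric_text.encode("utf-8")).hexdigest()
    components = {
        "judge_provider": judge_provider,
        "judge_model_id": judge_model_id,
        "rubric_hash": rubric_hash,
        "scoring_parameters": scoring_parameters,
    }
    # sort_keys makes the serialization independent of dict insertion order.
    payload = json.dumps(components, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


rubric = "Score 0-1 by faithfulness to the original output..."
key_a = methodology_key("anthropic", "claude-opus-4-7", rubric,
                        {"temperature": 0.0, "max_tokens": 256})
key_b = methodology_key("anthropic", "claude-opus-4-7", rubric,
                        {"max_tokens": 256, "temperature": 0.0})
key_c = methodology_key("anthropic", "claude-opus-4-7", rubric,
                        {"temperature": 0.7, "max_tokens": 256})

print(key_a == key_b)  # same methodology, different dict order
print(key_a == key_c)  # different temperature
```

Under this scheme, reordering the parameter dict leaves the key unchanged, while any change to the rubric text, judge model, or a scoring parameter produces a different key.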
Cardinal alignment
- #5 Sensitive at the boundary: `JudgeResult.rationale` is wrapped at `_project_score`. Inspect AI's `Score.explanation` carries free text from the judge model; it MUST be wrapped before any whatifd-core code sees it.
- #1 failures-as-data: when the wrapped `score_fn` returns `None` or raises, the adapter surfaces a `JudgeResult(score=None)` with structured rationale. The pipeline converts that into a `FailureRecord`. A non-numeric `Score.value` (e.g., a categorical label) projects to `score=None` instead of crashing on `float()`.
- #10 statistical claims: the adapter is metric-agnostic; the choice of metric is the user's responsibility when defining the Inspect AI scorer. Methodology (judge model, rubric hash, scoring parameters) flows through `cache_key_components`.
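The failures-as-data behaviour described above can be sketched as follows. This is a minimal illustration, not the adapter's actual `_project_score`: `FakeScore`, `ProjectedResult`, and `project` are hypothetical stand-ins so the example runs without Inspect AI or whatifd installed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FakeScore:
    """Hypothetical stand-in for an Inspect AI Score (value + explanation)."""
    value: Any
    explanation: Optional[str] = None


@dataclass
class ProjectedResult:
    """Hypothetical stand-in for a judge result: a number or None, plus a reason."""
    score: Optional[float]
    rationale: str


def project(score_fn: Callable[[], Optional[FakeScore]]) -> ProjectedResult:
    """Run score_fn and turn every failure mode into data instead of an exception."""
    try:
        result = score_fn()
    except Exception as exc:  # the wrapped callable raised
        return ProjectedResult(None, f"score_fn raised {type(exc).__name__}")
    if result is None:
        return ProjectedResult(None, "score_fn returned None")
    value = result.value
    # Categorical labels and other non-numeric values never reach float().
    if not isinstance(value, (int, float)):
        return ProjectedResult(None, f"non-numeric value of type {type(value).__name__}")
    return ProjectedResult(float(value), result.explanation or "")


def failing_score_fn() -> FakeScore:
    raise RuntimeError("judge call failed")


print(project(lambda: FakeScore(0.8, "faithful")))
print(project(lambda: FakeScore("C")))
print(project(lambda: None))
print(project(failing_score_fn))
```

In this sketch every path returns a result object, so a downstream pipeline can decide how to record the `score=None` cases rather than handling exceptions.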
Why no recorded-smoke test in this package
Unlike Langfuse (which has a hosted ingestion API replayed via pytest-recording cassettes), Inspect AI is a local evaluation framework — its scorers run in-process against a model provider (Anthropic / OpenAI / etc.). There is no "Inspect AI host" to record HTTP cassettes against. The real-network surface is the model provider behind Inspect, which Phase 9B's real-adapter smoke covers via the integration suite. This package ships mocked-only conformance; cardinal #5 still applies (Sensitive[str] at the boundary), and the conformance harness pins it.
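As a rough illustration of what a mocked-only conformance check for cardinal #5 might look like, the sketch below wraps judge free text in a redacting container and asserts that it does not leak through `repr` or `str`. The `Sensitive` class here is a hypothetical stand-in written for this example, as are `mocked_judge_explanation` and `check_rationale_is_wrapped`; whatifd's real `Sensitive[str]` type and conformance harness may differ.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Sensitive(Generic[T]):
    """Hypothetical wrapper: holds a value but keeps it out of repr/str."""

    def __init__(self, value: T) -> None:
        self._value = value

    def __repr__(self) -> str:
        return "Sensitive(<redacted>)"

    __str__ = __repr__

    def reveal(self) -> T:
        """Explicit opt-in to read the wrapped value."""
        return self._value


def mocked_judge_explanation() -> str:
    """Stands in for free text a judge model might return; no network call."""
    return "The answer repeats the user's address: 1 Example Street"


def check_rationale_is_wrapped(rationale: object) -> None:
    """Conformance-style check: the rationale is wrapped and does not leak."""
    assert isinstance(rationale, Sensitive)
    assert rationale.reveal() not in repr(rationale)
    assert rationale.reveal() not in str(rationale)


rationale = Sensitive(mocked_judge_explanation())
check_rationale_is_wrapped(rationale)
print(repr(rationale))
```

Because the judge text is mocked, a check of this shape needs no model provider credentials and can run in an ordinary unit-test job.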
Contributor setup
This package lives in the parent whatifd monorepo as a uv workspace member. From the repo root:
uv sync --all-extras --dev --group workspace
The --group workspace flag pulls the in-tree whatifd-inspect-ai editable install via PEP 735 dependency groups (uv-native). Without it, uv sync --all-extras --dev installs the rest of the dev environment but leaves this package out, and pytest packages/whatifd-inspect-ai/tests/ fails with ModuleNotFoundError: whatifd_inspect_ai.
Plain pip install ".[dev]" will NOT work for the workspace package — pip ignores PEP 735 groups (deliberate; the workspace dep can't be resolved from PyPI because it isn't published yet). Use uv for development setup; pip-only consumers install the published whatifd-inspect-ai from PyPI once it lands.
Stability
Pre-1.0; the adapter follows whatifd's v0.1 stability contract. The Inspect AI minor-version cap (<0.4) reserves the next minor for a coordinated migration if Inspect AI changes the Scorer / Score shape.