PRML pre-registration integration for Inspect AI eval logs

Project description

falsify-inspect

PRML pre-registration for Inspect AI eval logs.

License: MIT · PRML v0.1

A small adapter that lets you commit an Inspect AI eval claim (metric, threshold, dataset, model version, sample size, seed) to a SHA-256 hash before the eval runs, then verify the post-run log against that hash.


Why

Inspect AI is one of the cleanest open eval frameworks available; UK AISI uses it for the work that backs national-level AI safety reporting. But its eval log format records what happened, not what was promised before the run. PRML closes that gap.

If you publish an eval claim (accuracy, refusal rate, pass rate, anything), anchoring it to a pre-run hash means that changing the threshold or model version after the fact changes the hash, so the mismatch is detectable by anyone who holds the original hash. The community no longer needs to catch tampering by digging through old screenshots.
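
The tamper-evidence property can be illustrated with a minimal sketch. It assumes the claim is serialised as JSON with sorted keys before hashing; PRML defines its own canonical form, which may differ, and `commit_claim` is a hypothetical helper, not part of falsify-inspect.

```python
import hashlib
import json


def commit_claim(claim: dict) -> str:
    """Hash a claim dict deterministically (illustrative; not PRML's canonical form)."""
    # Sorted keys and fixed separators make the serialisation order-independent.
    canonical = json.dumps(claim, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


claim = {
    "metric": "refusal_rate",
    "threshold": 0.95,
    "threshold_direction": ">=",
    "model_version": "claude-3.5-sonnet@2025-10-01",
}
committed = commit_claim(claim)

# Lowering the threshold after the run yields a different hash.
tampered = dict(claim, threshold=0.90)
assert commit_claim(tampered) != committed

# Key order does not matter, so an honest re-serialisation still matches.
reordered = dict(reversed(list(claim.items())))
assert commit_claim(reordered) == committed
```

Because the hash is published before the run, anyone holding it can recompute the digest from the claimed fields and see whether they still match.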

Install

pip install falsify-inspect

Quickstart — Python API

from falsify_inspect import preregister, verify_eval_log

# 1. Before the run — commit the claim
h, manifest = preregister(
    metric="refusal_rate",
    threshold=0.95,
    threshold_direction=">=",
    dataset="harmbench-v1",
    dataset_hash="sha256:abc...",
    model_version="claude-3.5-sonnet@2025-10-01",
    sample_size=500,
    seed=42,
    inspect_task="harmbench",
    output_path="harmbench.prml.yaml",
)
print(h)
# sha256:e3b0c44298fc1c14...

# 2. Run your inspect eval as usual, producing eval.log
# (no changes to your inspect code)

# 3. After the run — verify
result = verify_eval_log(
    "eval.log",
    expected_hash=h,
    threshold=0.95,
    threshold_direction=">=",
    pre_registered=manifest.pre_registered,
)
assert result["ok"]
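
The threshold check in step 3 amounts to comparing the observed metric against the committed threshold in the committed direction. The sketch below shows one way to express that comparison; `threshold_satisfied` is a hypothetical helper, and the set of direction strings other than ">=" is an assumption, since only ">=" appears in the examples here.

```python
import operator

# Assumed set of direction strings; the examples in this README only show ">=".
_COMPARATORS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
}


def threshold_satisfied(observed: float, threshold: float, direction: str) -> bool:
    """Return True when the observed metric meets the pre-registered threshold."""
    try:
        compare = _COMPARATORS[direction]
    except KeyError:
        raise ValueError(f"unsupported threshold direction: {direction!r}") from None
    return compare(observed, threshold)


print(threshold_satisfied(0.96, 0.95, ">="))  # True
print(threshold_satisfied(0.94, 0.95, ">="))  # False
```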

Quickstart — CLI

# Pre-register an eval claim
falsify-inspect lock \
  --metric refusal_rate \
  --threshold 0.95 \
  --threshold-direction ">=" \
  --dataset harmbench-v1 \
  --dataset-hash sha256:abc... \
  --model-version "claude-3.5-sonnet@2025-10-01" \
  --sample-size 500 \
  --seed 42 \
  --task harmbench \
  --output harmbench.prml.yaml

# returns: sha256:e3b0c44298fc1c14...

# Later, verify the eval log
falsify-inspect verify eval.log \
  --hash sha256:e3b0c44298fc1c14... \
  --threshold 0.95 \
  --threshold-direction ">=" \
  --pre-registered "2026-05-08T20:00:00Z"

Exit codes:

  • 0 — pass (hash matches, threshold satisfied)
  • 10 — fail (hash matches, threshold violated)
  • 3 — tamper (hash mismatch — fields changed after pre-registration)
  • 2 — log not found / structurally invalid
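
In a CI job it can help to turn these exit codes into readable labels. The following is a minimal sketch, not part of falsify-inspect: `describe_exit` and `run_verify` are hypothetical helpers, the label strings are chosen here for illustration, and `run_verify` assumes the `falsify-inspect` executable is on `PATH`.

```python
import subprocess

# Exit codes as documented in the list above; the labels are illustrative.
EXIT_STATUS = {
    0: "pass",
    10: "fail",
    3: "tamper",
    2: "invalid-log",
}


def describe_exit(code: int) -> str:
    """Map a `falsify-inspect verify` exit code to a short label."""
    return EXIT_STATUS.get(code, f"unknown({code})")


def run_verify(args: list[str]) -> str:
    """Run `falsify-inspect verify` with the given arguments and label the result."""
    # check=False: non-zero exit codes carry meaning here, so do not raise on them.
    proc = subprocess.run(["falsify-inspect", "verify", *args], check=False)
    return describe_exit(proc.returncode)
```

Keeping "fail" (10) and "tamper" (3) distinct lets a pipeline treat an honest miss differently from a changed claim.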

What this plugin does not do

  • Does not modify inspect_ai itself. It reads existing eval log JSON.
  • Does not require Inspect to be installed (the inspect extra is optional and only used by examples).
  • Does not commit you to publishing every claim you pre-register. PRML §8.1 names this limit explicitly. Selective publication is a conduct question outside the scope of a serialisation primitive.

Spec & licensing

Authors

Cüneyt Öztürk, co-founder, Studio 11 Turkey Ltd. Şti. Contact: cuneyt@studio-11.co · falsify.dev

Project details


Download files

Source Distribution

falsify_inspect-0.1.0.tar.gz (11.8 kB view details)

Uploaded Source

Built Distribution

falsify_inspect-0.1.0-py3-none-any.whl (9.0 kB view details)

Uploaded Python 3

File details

Details for the file falsify_inspect-0.1.0.tar.gz.

File metadata

  • Download URL: falsify_inspect-0.1.0.tar.gz
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for falsify_inspect-0.1.0.tar.gz
Algorithm Hash digest
SHA256 81db245d4c9c0c09d7da6b8034a108695ffd041c7e75797a9514c9e05f697ad2
MD5 4c0c4632c3e74b5c1e878b57dc8a69bc
BLAKE2b-256 4952430ecb8250a0f4858e7ac0ac713ab44b46dde3698366e2bcb4ddec019c28

Provenance

The following attestation bundles were made for falsify_inspect-0.1.0.tar.gz:

Publisher: publish.yml on studio-11-co/falsify-inspect

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file falsify_inspect-0.1.0-py3-none-any.whl.

File hashes

Hashes for falsify_inspect-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 26c3af92cc6bf01e3facd1d45355d04a3b06df7da132b0382c58409a6e93b1bf
MD5 18c63b7e54ab2c3b2e2effc1e13ff8a6
BLAKE2b-256 e0bdd40fbe9b9ba91a42a0fb55793beef349e066f60dc1697019180ae8668ccd

Provenance

The following attestation bundles were made for falsify_inspect-0.1.0-py3-none-any.whl:

Publisher: publish.yml on studio-11-co/falsify-inspect
