ai-crucible

Diagnostic adversarial game for frontier LLMs — a policy-enforced kernel that mediates a Designer/Solver/Judge cycle, scores against a hidden oracle, and curates a Lab/Arena/Regression catalog.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

mikeyfrilot

These details have not been verified by PyPI

Project description

ai-crucible

Python 3.11–3.13 · Coverage 96%

A diagnostic adversarial game for frontier LLMs — a measurement instrument that happens to be fun.

One Claude session (Designer) crafts puzzles targeting real, currently observed capability gaps. Another (Solver) attempts them. A policy-enforced kernel mediates, scores against a hidden oracle, and curates a catalog through a Lab → Arena → Regression lifecycle. Puzzles are grounded in empirical signal (real GitHub issues, academic literature, observed failures in the field) rather than synthetically generated.

What makes it different

Capability, not "cheating." AI Crucible distinguishes elegance and novelty (rewarded) from answer-bypass (penalized). Lateral thinking is a capability to measure, not a vice to punish.
The instrument measures itself. Prompt framing is a first-class measured arm — the kernel runs the same puzzle under neutral / self_referential / social_standings framings and reports its own prompt-effect as a diagnostic.
A sealed measurement boundary. Motivation and measurement never share a context window; the hidden oracle is graded out-of-band by a different model family with the agent's reasoning hidden. The model can't game what it can't perceive.
Reliability by consistency. pass^k (all k independent trials succeed), Wilson intervals, and cross-family judge panels — built to report distributions, not point estimates.
A living catalog. Solved puzzles are demoted to Regression, never deleted, so the catalog becomes a capability-evolution timeline as the frontier moves.
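
The reliability metrics named above can be made concrete with a short sketch. pass^k follows the definition given here (all k independent trials succeed). The function names, and the choice to estimate pass^k from n recorded trials as C(c, k) / C(n, k), are assumptions for illustration and are not taken from the package's API.

```python
import math


def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate pass^k: the chance that k trials drawn from the recorded
    ones (without replacement) all succeeded."""
    if not 0 <= successes <= trials or not 1 <= k <= trials:
        raise ValueError("need 0 <= successes <= trials and 1 <= k <= trials")
    # math.comb returns 0 when successes < k, so the estimate drops to 0.0.
    return math.comb(successes, k) / math.comb(trials, k)


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # 8 successes in 10 trials: pass^1 is the plain pass rate, pass^3 is stricter.
    print(pass_hat_k(8, 10, 1), pass_hat_k(8, 10, 3))
    print(wilson_interval(8, 10))
```

Reporting the interval next to the rate is what separates a distribution from a point estimate: with only 10 trials, an 80% pass rate is compatible with a true rate anywhere from roughly one half to above 90%.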

Threat model (summary)

AI Crucible's primary adversary is the model under measurement, which may try to reach the answer key or to detect and game the eval. It touches: puzzle prompts/setup, the Solver's own confined working directory, and the model-provider API calls the operator configures. It does not touch: the oracle/answer key (graded out-of-band on a separate host the Solver cannot reach) or motivation "chrome" (rank/standings — never injected into the scored context). Permissions: model-provider keys via environment variables at runtime; no bundled secrets, no telemetry, no outbound calls of its own. Full disclosure — including where a boundary is defense-in-depth rather than a hard guarantee — is in SECURITY.md.

Architecture

AI Crucible is a thin policy layer on Inspect AI (UK AISI), not a from-scratch harness. A single AttemptState object is threaded Designer → Solver → (Critic) → Judge through one generate choke point, so every model and tool call is observable.
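
To illustrate the single choke point idea, here is a minimal sketch in plain Python. It does not use Inspect AI, and every name in it (the fields on this `AttemptState`, `generate`, `run_attempt`, the `ModelFn` type) is hypothetical; the real object and call signatures may differ.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stand-in for a model call: (role, prompt) -> completion text.
ModelFn = Callable[[str, str], str]


@dataclass
class AttemptState:
    """Illustrative state object; the real AttemptState fields are assumptions here."""
    puzzle_id: str
    prompt: str
    events: list[dict] = field(default_factory=list)
    outputs: dict[str, str] = field(default_factory=dict)


def generate(state: AttemptState, role: str, model: ModelFn) -> AttemptState:
    """The one place a model is called, so every call leaves an event behind."""
    text = model(role, state.prompt)
    state.events.append({"role": role, "chars": len(text)})
    state.outputs[role] = text
    return state


def run_attempt(state: AttemptState, model: ModelFn, use_critic: bool = False) -> AttemptState:
    """Thread one state object Designer -> Solver -> (Critic) -> Judge."""
    roles = ["designer", "solver"] + (["critic"] if use_critic else []) + ["judge"]
    for role in roles:
        state = generate(state, role, model)
    return state


if __name__ == "__main__":
    fake_model = lambda role, prompt: f"{role} saw: {prompt}"
    final = run_attempt(AttemptState("demo-001", "Fix the failing test."), fake_model)
    print([event["role"] for event in final.events])
```

Because no role calls a model except through `generate`, the event list is a complete record of the attempt by construction rather than by convention.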

Module	Responsibility
`puzzle_loader`	Loads a puzzle directory (`meta.json` / `prompt` / `setup_script`) into Solver-visible state. Never touches the oracle.
`sandbox`	Narrow `exec` / `read_file` / `write_file` channel into a locked, network-less container.
`roles`	The five role slots (Designer / Solver / Critic / Judge / CohortSolver). Only Solver gets tools; Critic is interface-reserved, default-off.
`budget_governor`	Per-class tool-call + wall-clock budgets, displayed to the agent, enforced kernel-side; hard-kill on pathological loops.
`oracle_scorer`	Out-of-band grading: solved-and-no-regression against the hidden oracle (SWE-bench pattern).
`judge_panel`	Cross-family panel of model-scorers + reducer (PoLL) for novelty validation and bypass detection.
`trace_writer`	Per-attempt transcript in the Inspect `EvalLog` shape; large blobs stored by digest.
`observability`	Per-attempt → per-puzzle → per-model rollups; `pass^k` native.
`attestation`	Cryptographic provenance (cosign + event-store) behind a typed subprocess boundary.
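
As an illustration of the `budget_governor` row, the sketch below shows one way a kernel-side budget could be displayed to the agent and still enforced outside it. The class, its fields, and the exception are hypothetical, and it tracks a single budget rather than the per-class budgets the table describes.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class BudgetExceeded(RuntimeError):
    """Raised kernel-side when an attempt runs past one of its budgets."""


@dataclass
class BudgetGovernor:
    max_tool_calls: int
    max_wall_seconds: float
    clock: Callable[[], float] = time.monotonic  # injectable so tests can fake time
    tool_calls: int = 0
    started: float = field(default=0.0, init=False)

    def __post_init__(self) -> None:
        self.started = self.clock()

    def remaining(self) -> dict[str, float]:
        """The figures that would be shown to the agent."""
        elapsed = self.clock() - self.started
        return {
            "tool_calls": self.max_tool_calls - self.tool_calls,
            "wall_seconds": max(0.0, self.max_wall_seconds - elapsed),
        }

    def charge_tool_call(self) -> None:
        """Called by the kernel before each tool call; raises once a budget is spent."""
        if self.clock() - self.started > self.max_wall_seconds:
            raise BudgetExceeded("wall-clock budget exhausted")
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded("tool-call budget exhausted")
        self.tool_calls += 1
```

The agent only ever reads `remaining()`; the counter itself lives in the kernel, so an agent that ignores the displayed budget is stopped by the exception rather than by its own cooperation.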

The sealed boundary runs in three tiers — Tier 1 scored context (deployment-shaped, framing-neutral), Tier 2 engagement framing (probed for contamination each release), Tier 3 chrome (rank/leaderboard — human-facing UI only, never in a context the model solves in). The full design rationale, with citations, is in docs/research-grounding.md.
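
One way to picture the three tiers is as a tag on every piece of text that could reach a prompt. The sketch below is an assumption-laden illustration, not the project's implementation: `Tier`, `Fragment`, and `build_scored_context` are hypothetical names, and dropping Tier 2 text silently while rejecting Tier 3 outright is a design choice made for this example only.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SCORED = 1      # Tier 1: deployment-shaped, framing-neutral context
    ENGAGEMENT = 2  # Tier 2: engagement framing, probed for contamination
    CHROME = 3      # Tier 3: rank/leaderboard, human-facing UI only


@dataclass(frozen=True)
class Fragment:
    tier: Tier
    text: str


def build_scored_context(fragments: list[Fragment]) -> str:
    """Join Tier 1 fragments only; refuse outright if chrome is offered."""
    if any(f.tier is Tier.CHROME for f in fragments):
        raise ValueError("Tier 3 chrome must never reach a scored context")
    return "\n".join(f.text for f in fragments if f.tier is Tier.SCORED)


if __name__ == "__main__":
    parts = [
        Fragment(Tier.SCORED, "Fix the failing test in parser.py."),
        Fragment(Tier.ENGAGEMENT, "This round counts toward the season."),
    ]
    print(build_scored_context(parts))
```

Failing loudly on chrome, instead of filtering it out, turns an accidental leak into a visible error at the point where the context is assembled.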

Install

# As a Python library + CLI (PyPI):
pip install ai-crucible          # or: uv pip install ai-crucible
ai-crucible --help

# Or zero-prerequisite via npx — downloads a verified binary, no Python needed:
npx @dogfood-lab/ai-crucible --help

Research preview (v0.2.x). The judge panel's alt-test ω is still a circular model-jury bootstrap until a human-labeling round runs, so seated judges are provisional and the composed panel escalates to a Claude Designer below quorum. See the scorecard for the honest, non-cosmetic gate results.

Quick start (from source)

AI Crucible uses uv for environment and dependency management. Python 3.11+.

# Create the venv and install the dev + stats extras
uv sync --extra dev --extra stats

# Run the test suite (with the coverage gate)
uv run pytest --cov=ai_crucible --cov-report=term-missing

# Lint
uv run ruff check .

# One command: lint + tests + build + smoke
bash verify.sh

Documentation

Handbook — guides, architecture, and reference.
docs/research-grounding.md — design rationale, with citations.
docs/gameplan.md — roadmap and open questions.
SECURITY.md — threat model + honest residual-risk disclosure.

License

MIT. Public and pre-1.0 — see the CHANGELOG for version status.

Built by MCP Tool Shop · part of the dogfood-lab workshop for testing in the AI era.

Project details

Release history

This version

0.2.0

Jun 2, 2026

Download files

Source Distribution

ai_crucible-0.2.0.tar.gz (3.1 MB)

Uploaded Jun 2, 2026 Source

Built Distribution

ai_crucible-0.2.0-py3-none-any.whl (212.4 kB)

Uploaded Jun 2, 2026 Python 3

File details

Details for the file ai_crucible-0.2.0.tar.gz.

File metadata

Download URL: ai_crucible-0.2.0.tar.gz
Upload date: Jun 2, 2026
Size: 3.1 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ai_crucible-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`0794584abee4c47f8104a92555fab1e526c358af8b52204072d9f2e0ebd9aaf6`
MD5	`ba84e7bcd6378476be497e95fac58ffa`
BLAKE2b-256	`2669753d243aff4f5b56bc04d52f6cf8536f79c4625538927bc9eb1658c9d92f`
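
A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. This is a minimal sketch: the helper names are hypothetical, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for ai_crucible-0.2.0.tar.gz in the table above.
EXPECTED_SHA256 = "0794584abee4c47f8104a92555fab1e526c358af8b52204072d9f2e0ebd9aaf6"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    sdist = Path("ai_crucible-0.2.0.tar.gz")  # assumed local download location
    if sdist.exists():
        print("digest matches" if verify(sdist) else "digest MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```

pip can perform the same check at install time when a requirements file pins the digest with `--hash=sha256:...` and the install runs with `--require-hashes`.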

Provenance

The following attestation bundles were made for ai_crucible-0.2.0.tar.gz:

Publisher: release.yml on dogfood-lab/ai-crucible

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ai_crucible-0.2.0.tar.gz
- Subject digest: 0794584abee4c47f8104a92555fab1e526c358af8b52204072d9f2e0ebd9aaf6
- Sigstore transparency entry: 1702203060
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: dogfood-lab/ai-crucible@453e8a7bf1ec478f82fef2e79788dfe3c02c17d7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/dogfood-lab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@453e8a7bf1ec478f82fef2e79788dfe3c02c17d7
- Trigger Event: release

File details

Details for the file ai_crucible-0.2.0-py3-none-any.whl.

File metadata

Download URL: ai_crucible-0.2.0-py3-none-any.whl
Upload date: Jun 2, 2026
Size: 212.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ai_crucible-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`edb4c67a5885fcf2c6b0f41ab1d53eb8a08720bb91d43cb7a123df8df5543d00`
MD5	`5738448e600633d66acd5f06b1972099`
BLAKE2b-256	`257b52ac6d003a7ef9ad2fe8d79a62372551ad20fa9e9ce8a829545b26951475`

Provenance

The following attestation bundles were made for ai_crucible-0.2.0-py3-none-any.whl:

Publisher: release.yml on dogfood-lab/ai-crucible

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ai_crucible-0.2.0-py3-none-any.whl
- Subject digest: edb4c67a5885fcf2c6b0f41ab1d53eb8a08720bb91d43cb7a123df8df5543d00
- Sigstore transparency entry: 1702203114
- Sigstore integration time: Jun 2, 2026
Source repository:
- Permalink: dogfood-lab/ai-crucible@453e8a7bf1ec478f82fef2e79788dfe3c02c17d7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/dogfood-lab
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@453e8a7bf1ec478f82fef2e79788dfe3c02c17d7
- Trigger Event: release
