A read-only linter and A-F maturity grader for coding-agent harnesses (Claude Code, Codex).

Harness Scorecard

A read-only linter and A–F maturity grader for coding-agent harnesses. Point it at a Claude Code or Codex setup — Claude Code's hooks, permissions, rules/*.md, agents, and CLAUDE.md, or Codex's config.toml (sandbox, approval policy, trust levels), hooks.json, and AGENTS.md — and it returns a graded scorecard: the overall maturity grade, the specific gaps, and the guards that are missing, each with rationale. The harness type is auto-detected.

"Harness engineering" became a named discipline in 2026, yet teams are assembling harnesses with no way to tell whether theirs is any good. The rubric is the product: every check traces to a documented red-team failure mode, not to generic advice.

What makes the grade real

Most config "linters" credit a harness for declaring a rule. This one models the effective enforcement floor. The headline example:

autoMode.hard_deny is inert when permissions.defaultMode == "bypassPermissions".

A naive scorer reads a rich hard_deny block and awards an A. Harness Scorecard reads the mode, discounts the inert block, and grades against what actually fires — permissions.deny globs plus the PreToolUse hooks. See docs/rubric.md for the full model, including capability gates that cap the grade when a critical hole is present (you can't score an A with readable credentials, no matter how many cheap checks pass).
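The idea of grading against what actually fires can be sketched in a few lines. This is a minimal illustration, not the scorer's real code: the function names, the `hooks.PreToolUse` lookup, the exact nesting of `autoMode.hard_deny`, and the cap value of C are all assumptions made for the example. The actual model lives in docs/rubric.md.

```python
# Hypothetical sketch: none of these names come from harness-scorecard itself.
GRADES = "FDCBA"  # index 0 is the worst grade, index 4 the best


def effective_controls(settings: dict) -> dict:
    """Return only the controls that still fire under the configured mode."""
    permissions = settings.get("permissions", {})
    bypass = permissions.get("defaultMode") == "bypassPermissions"
    hard_deny = settings.get("autoMode", {}).get("hard_deny", [])
    return {
        "bypass": bypass,
        # Treated as inert under bypass, per the headline example above.
        "hard_deny": [] if bypass else list(hard_deny),
        "deny_globs": list(permissions.get("deny", [])),
        "pre_tool_use_hooks": list(settings.get("hooks", {}).get("PreToolUse", [])),
    }


def cap_grade(raw: str, critical_hole: bool, cap: str = "C") -> str:
    """Cap a grade when a critical hole is present (the cap value is assumed)."""
    if not critical_hole:
        return raw
    # Pick whichever of the two grades is worse.
    return min(raw, cap, key=GRADES.index)
```

With a settings dict that sets `defaultMode` to `"bypassPermissions"`, the `hard_deny` list comes back empty however long it is, so only the deny globs and hooks remain to be scored.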

Usage

# Grade a harness directory (e.g. your ~/.claude)
harness-scorecard scan ~/.claude

# JSON for tooling, plus a self-contained HTML scorecard
harness-scorecard scan ~/.claude --format json --html scorecard.html

# SARIF 2.1.0 for CI / GitHub code scanning, failing the run below grade C
harness-scorecard scan ~/.claude --sarif harness.sarif --min-grade C

--min-grade {A,B,C,D,F} sets the bar (default B). Exit codes: 0 meets the bar · 1 below the bar · 2 no harness found.
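The exit-code contract above can be written out as a small function. This is a sketch of the documented behaviour only; `exit_code` and `GRADE_ORDER` are hypothetical names, and using `None` to mean "no harness found" is an assumption of the example.

```python
from typing import Optional

# Hypothetical helper illustrating the documented exit codes.
GRADE_ORDER = ["F", "D", "C", "B", "A"]  # worst to best


def exit_code(grade: Optional[str], min_grade: str = "B") -> int:
    """Map a grade to the documented exit code.

    0: the grade meets the bar, 1: it falls below, 2: no harness was found
    (represented here by grade=None, which is an assumption).
    """
    if grade is None:
        return 2
    meets_bar = GRADE_ORDER.index(grade) >= GRADE_ORDER.index(min_grade)
    return 0 if meets_bar else 1
```

Under this reading, a C harness fails the default bar of B but passes when the run is invoked with `--min-grade C`.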

GitHub Action

Grade your harness in CI and upload the findings to code scanning:

- uses: saagpatel/harness-scorecard@v1
  with:
    path: .claude
    min-grade: B

The action writes SARIF and uploads it (requires security-events: write) even when the grade fails the build, so findings always reach code scanning. A complete workflow — permissions, weekly scheduling, SARIF upload — is in examples/github-workflow.yml.
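For pipelines that want to inspect the findings before or instead of uploading them, the SARIF file can be read with the standard library. The sketch below relies only on the SARIF 2.1.0 layout (`runs[].results[].ruleId`); the function name is hypothetical, and nothing here assumes which rule IDs harness-scorecard actually emits.

```python
import json
from collections import Counter


def summarize_sarif(text: str) -> Counter:
    """Count results per ruleId across all runs in a SARIF 2.1.0 log."""
    log = json.loads(text)
    counts: Counter = Counter()
    for run in log.get("runs", []):
        for result in run.get("results", []):
            # ruleId is optional in SARIF, so fall back to a placeholder.
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts
```

A typical use would be `summarize_sarif(open("harness.sarif").read())` after a scan that passed `--sarif harness.sarif`.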

Guarantees

Read-only. It never writes to the harness it audits.
Privacy-preserving. All output redacts secrets, tokens, emails, and absolute home paths. Nothing leaves the machine.
Dependency-free runtime. The scorer ships stdlib-only — a tool that grades supply-chain hygiene should carry the smallest surface itself.
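As a rough picture of what stdlib-only redaction might look like, the sketch below masks email addresses and collapses the home directory to `~`. It is an illustration under stated assumptions, not the tool's implementation: the `redact` name, the email pattern, and the `<email>` placeholder are invented for the example, and secret or token detection is left out entirely.

```python
import re
from pathlib import Path
from typing import Optional

# Deliberately simple email pattern; an assumption, not the tool's own rule.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(text: str, home: Optional[str] = None) -> str:
    """Replace the absolute home path with '~' and mask email addresses."""
    home = home or str(Path.home())
    text = text.replace(home, "~")
    return EMAIL.sub("<email>", text)
```

Passing `home` explicitly keeps the function deterministic in tests; by default it uses the current user's home directory.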

Scope (v1)

Implements all ten rubric dimensions end-to-end for both Claude Code and Codex: secret protection, egress/exfiltration control, tool-surface & inbound-injection defense, destructive-action & git safety, harness self-protection & integrity, verification gates, subagent isolation & governance, recovery/rollback safety, memory/provenance hygiene, and observability/audit trail. The critical gated trio is D1/D4/D5.

Each harness has its own adapter and check suite over the shared scoring engine. The bypass-aware effective floor maps to Codex's sandbox_mode = "danger-full-access" + approval_policy = "never" just as it does to Claude Code's bypassPermissions. The rubric is versioned and emitted in every report.

Development

uv sync --frozen                                      # install dev tooling from the lockfile
uv run --no-sync python -m unittest discover -s tests # tests (stdlib runner, zero extra deps)
uv run --no-sync ruff check src/ tests/               # lint
uv run --no-sync ty check src/                        # type check

Release history

1.11.0 (Jun 28, 2026)
1.10.0 (Jun 28, 2026)
1.9.0 (Jun 28, 2026)
1.8.0 (Jun 28, 2026)
1.7.0 (Jun 28, 2026)
1.6.0 (Jun 28, 2026)
1.5.0 (Jun 28, 2026)
1.4.0 (Jun 28, 2026)
1.3.0 (Jun 28, 2026)
1.2.0 (Jun 28, 2026)
1.1.0 (Jun 28, 2026)
1.0.1 (Jun 28, 2026, this version)
1.0.0 (Jun 28, 2026)

Download files

Source distribution: harness_scorecard-1.0.1.tar.gz (37.9 kB), uploaded Jun 28, 2026
Built distribution: harness_scorecard-1.0.1-py3-none-any.whl (57.5 kB), uploaded Jun 28, 2026, Python 3

File details

Details for the file harness_scorecard-1.0.1.tar.gz.

File metadata

Download URL: harness_scorecard-1.0.1.tar.gz
Upload date: Jun 28, 2026
Size: 37.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.13

File hashes

Hashes for harness_scorecard-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`ebace55869ab6d5a4aacb5349c1d1f84cbd52287700da422a6c4f75cc6710895`
MD5	`4019b7d31922c07f85473be5b5923759`
BLAKE2b-256	`83843245f0a8f9fd9e771f1db3333c167fc1bfa1afff61a1b830366035c4592d`

File details

Details for the file harness_scorecard-1.0.1-py3-none-any.whl.

File metadata

Download URL: harness_scorecard-1.0.1-py3-none-any.whl
Upload date: Jun 28, 2026
Size: 57.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.13

File hashes

Hashes for harness_scorecard-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bbf8e38f2aaa3ed8d291078a0350bd1e656e22d27b44793b1be540e9cef7eeb2`
MD5	`78fdeeee9d0424fda3c7750182104024`
BLAKE2b-256	`0131196e36106306c34f804f0c82a9241380bff7f89c7edae5abad6b29c6d044`
