Trace-grounded CLAUDE.md adherence auditor: which prose rules your agents actually ignore, ranked from run history

misfire

Linters tell you your rules are messy; misfire tells you which rules your agents actually ignore — and converts only those into hooks, keeping safety rules.

misfire is a deterministic, local-first CLI and Python library. It reads your existing Claude Code instruction files (CLAUDE.md, .claude/rules/*.md, @imports) and your own run history, then reports which of YOUR prose rules your agents demonstrably ignore, ranked by observed violations. For the violated, machine-checkable subset only, it scaffolds a deterministic hook for you to review; safety and judgment rules stay as prose. It is an observer and recommender: it prints recommendations, ranked violation lists, and hook scaffolds. It never auto-deletes a rule, never auto-applies a change, and never writes settings.json.

The static audit (stale paths, token rent, conflicts) and the hook scaffold are table-stakes features bundled under one headline: trace-grounded adherence measurement of your existing prose rules. No shipping tool ranks which of your existing prose rules your agents actually ignore.

Why now

Convert-to-hook is Anthropic's own guidance. The Claude Code best-practices docs say that bloated CLAUDE.md files cause Claude to ignore your actual instructions, and that the fix is to "delete it or convert it to a hook" — because hooks are deterministic while CLAUDE.md is advisory. The /memory docs describe CLAUDE.md as "context, not enforced configuration" and note that going over 200 lines reduces adherence. The "Effective context engineering" guidance frames the same problem as a finite "attention budget" subject to "context rot."

So the idea of converting rules to hooks is not new, and misfire says so plainly. The honest why-now is narrower: no credible public measurement of CLAUDE.md adherence exists. misfire's differentiator is producing the first trace-grounded adherence ranking from your own run history. The hook scaffold is official guidance; what is new is the evidence grounding that decides which rules earn a hook.

What misfire does

Four commands, each with a deterministic --json mode (sorted keys, byte-stable output):

Command	What it does
`audit`	Static, zero-LLM audit of your instruction files — finds `stale_path`, `token_rent`, `conflict`, and `load_fidelity` issues. Table-stakes.
`rank`	Reconstructs rule violations from your run history and ranks the machine-checkable rules by observed violation rate, with confidence thresholds and a minimum-support floor. This is the wedge.
`evidence`	Shows the per-rule violation detail behind a ranking — the actual tool actions that violated a rule.
`convert`	Scaffolds a deterministic PreToolUse/PostToolUse hook for the violated convertible subset, prints it plus a `settings.json` snippet for you to review, and writes nothing.
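
The "sorted keys, byte-stable output" property of --json can be illustrated with a minimal sketch. This is not misfire's serializer; the function name `to_stable_json` and the formatting choices (two-space indent, trailing newline) are assumptions made for the example. It only shows why sorting keys makes equal data serialize to equal bytes.

```python
import json


def to_stable_json(payload: dict) -> str:
    """Serialize so that equal data always yields identical text.

    Hypothetical helper: sort_keys removes dependence on dict insertion
    order, and the fixed indent and trailing newline pin the layout.
    """
    return json.dumps(payload, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


# The same finding built in two different key orders.
first = {"rule_id": "d84c9954a86f", "violations": 5, "opportunities": 35}
second = {"opportunities": 35, "violations": 5, "rule_id": "d84c9954a86f"}

# Byte-for-byte equality is what makes a committed expected-output file testable.
print(to_stable_json(first).encode("utf-8") == to_stable_json(second).encode("utf-8"))
```

Comparing encoded bytes rather than parsed objects is the stricter check, and it is the kind of comparison a byte-reproducible fixture test relies on.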

Plus one opt-in, non-deterministic command, off the default path:

Command	What it does
`ablate`	[opt-in; requires a running local Ollama] Causal probe — re-runs a representative task with a candidate rule present vs. removed (ablated) and measures the shift in how often a local model violates the rule. Estimates a rule's marginal effect, which passive traces can't (an obeyed rule and a never-triggered rule both show zero violations). Evidence only — never auto-applies or auto-deletes.
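
The arithmetic behind a present-vs-removed comparison can be sketched as follows. This is an illustration under stated assumptions, not the ablate implementation: the function names are hypothetical and the run counts are invented for the example, not measured results.

```python
def violation_rate(violations: int, runs: int) -> float:
    """Fraction of runs in which the model violated the rule."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return violations / runs


def marginal_effect(with_rule: tuple, without_rule: tuple) -> float:
    """Rate with the rule removed minus rate with the rule present.

    Each argument is a (violations, runs) pair. A positive result means the
    rule's presence lowered the violation rate on this task.
    """
    return violation_rate(*without_rule) - violation_rate(*with_rule)


# Hypothetical numbers: in both cases the rule shows zero violations while
# present, so passive traces alone could not tell the two apart.
obeyed = marginal_effect(with_rule=(0, 20), without_rule=(9, 20))
never_triggered = marginal_effect(with_rule=(0, 20), without_rule=(0, 20))

print(round(obeyed, 2))           # 0.45: removing the rule changed behavior
print(round(never_triggered, 2))  # 0.0: the rule made no observable difference
```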

Observer exit codes: every command exits 0 regardless of findings. The only non-zero exit occurs when evidence or convert is invoked with an explicit --rule PREFIX that matches no rule (exit 1).
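
The exit-code contract above is small enough to write down as a sketch. The function name and the assumption that --rule PREFIX is compared against the start of each rule_id are illustrative only; the actual matching logic is not described here.

```python
from typing import Iterable, Optional


def observer_exit_code(command: str, rule_prefix: Optional[str], rule_ids: Iterable[str]) -> int:
    """Return 0 unless evidence/convert got a --rule prefix matching no rule.

    Hypothetical helper: prefix matching via str.startswith is an assumption.
    """
    if command in {"evidence", "convert"} and rule_prefix is not None:
        if not any(rule_id.startswith(rule_prefix) for rule_id in rule_ids):
            return 1
    return 0


known = ["d84c9954a86f", "8fb701ad4c67"]
print(observer_exit_code("rank", None, known))         # findings never change the code
print(observer_exit_code("evidence", "d84c", known))   # prefix matches a rule
print(observer_exit_code("convert", "ffff", known))    # prefix matches nothing
```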

misfire sorts every rule into one of five categories — convertible, safety_keep, judgment_keep, output_shape, non_directive — and recommends along a three-tier ladder:

KEEP — judgment, safety, output-shape, and non-directive rules stay as prose.
ELEVATE — move a rule into a path-scoped .claude/rules/*.md with paths: frontmatter to cut token rent.
ENFORCE — scaffold a hook, but only for the violated convertible subset.

Install

misfire is stdlib-only with zero runtime dependencies and supports Python 3.9+.

pip install misfire
# or, with uv:
uv pip install misfire

A Homebrew tap is planned — brew install ek33450505/misfire/misfire will work once the homebrew-misfire tap is published.

From source (for development)

git clone https://github.com/ek33450505/misfire
cd misfire
pip install -e .
# or, with uv:
uv pip install -e .

Quick start — proof in one command

The ranking is byte-reproducible against a committed fixture, with no database. From the repo root:

misfire rank proof/evidence-sample/config \
    --projects-dir proof/evidence-sample/projects

misfire rank — proof/evidence-sample/config
Projects dir: <projects-dir>
Active rules: 2

Thresholds: min_support=30  min_violations=1

=== enforce_candidate (2) ===

  1. CLAUDE.md  [never_command]  confidence=medium
     rule_id: d84c9954a86f
     violations: 5  opportunities: 35  rate: 14.3%  excluded (sanctioned): 2
     "MANDATORY: Never use raw git commit directly — always route through the commit agent. Escape hatch …"

  2. CLAUDE.md  [never_command]  confidence=medium
     rule_id: 8fb701ad4c67
     violations: 3  opportunities: 35  rate: 8.6%
     "MANDATORY: Never dispatch git push to remote directly."

=== insufficient_evidence (0) ===
  (none)

=== observed_no_violations (0) ===
  (none)

The git commit rule was violated 5 times across 35 opportunities (14.3%), with 2 sanctioned uses of its escape hatch excluded from the violation count; the git push rule was violated 3 times across 35 opportunities (8.6%). Add --json and the output matches proof/expected_rank.json byte-for-byte (test: tests/test_proof_rank.py). The ranking is derived purely from the markdown config and the transcript JSONL, with no cast.db.
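
The rates and section names in that output can be reproduced with a short sketch. The bucketing logic below is an assumption inferred from the printed thresholds (min_support=30, min_violations=1) and the three section headings; the class and function names are hypothetical, and misfire's real ranking also assigns a confidence level that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    rule_id: str
    violations: int      # sanctioned escape-hatch uses already excluded
    opportunities: int


def rate_percent(stats: RuleStats) -> float:
    """Observed violation rate as a percentage, rounded to one decimal."""
    return round(100 * stats.violations / stats.opportunities, 1)


def bucket(stats: RuleStats, min_support: int = 30, min_violations: int = 1) -> str:
    """Assumed mapping from counts to the three output sections."""
    if stats.opportunities < min_support:
        return "insufficient_evidence"
    if stats.violations >= min_violations:
        return "enforce_candidate"
    return "observed_no_violations"


def rank(rules):
    """Order enforce candidates by descending rate, with rule_id as a stable tie-break."""
    candidates = [r for r in rules if bucket(r) == "enforce_candidate"]
    return sorted(candidates, key=lambda r: (-r.violations / r.opportunities, r.rule_id))


# Counts taken from the fixture output shown above.
fixture = [RuleStats("8fb701ad4c67", 3, 35), RuleStats("d84c9954a86f", 5, 35)]
for stats in rank(fixture):
    print(stats.rule_id, f"{rate_percent(stats)}%")
```

The minimum-support check runs before any rate is computed, so a rule with too few opportunities is never ranked on a noisy percentage.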

Now turn the top evidence-grounded candidate into a hook:

misfire convert proof/evidence-sample/config \
    --projects-dir proof/evidence-sample/projects --top

This emits a self-contained PreToolUse hook (matcher: Bash) for the never git commit rule. The generated hook embeds misfire's own structural command matcher, so a quoted echo "git commit" is not blocked (no naive-substring false positive); it honors the rule's escape hatch (CAST_COMMIT_AGENT=1); and it denies with your own rule text as the reason (permissionDecision: "deny"). It prints a settings.json snippet using ${CLAUDE_PROJECT_DIR} that misfire does not write. The verdict is evidence-grounded:

Verdict: ENFORCE  recommended=true
Evidence-grounded: 5 observed violation(s) across 35 opportunities (14.3%).
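
To make the behaviour described above concrete, here is a minimal sketch of the decision step of such a PreToolUse hook. It is not the code misfire emits. The tokenizer is a simplified stand-in for misfire's structural command matcher, the escape hatch is assumed to appear as an inline CAST_COMMIT_AGENT=1 prefix on the command, and all function names are hypothetical. The input fields (tool_name, tool_input.command) and the hookSpecificOutput shape follow the Claude Code hook contract as the author understands it; check them against your installed version.

```python
import json
import re
import shlex

ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")
OPERATOR_CHARS = set(";|&()")


def simple_commands(command: str):
    """Split a shell line into simple commands at ;, &&, ||, | and parentheses."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:  # unbalanced quotes: this sketch treats it as no match
        return []
    segments, current = [], []
    for token in tokens:
        if token and all(ch in OPERATOR_CHARS for ch in token):
            segments.append(current)
            current = []
        else:
            current.append(token)
    segments.append(current)
    return [segment for segment in segments if segment]


def violates_never_git_commit(command: str) -> bool:
    """True if any simple command invokes git commit without the escape hatch."""
    for segment in simple_commands(command):
        assignments = []
        while segment and ASSIGNMENT.match(segment[0]):
            assignments.append(segment.pop(0))
        if segment[:2] == ["git", "commit"] and "CAST_COMMIT_AGENT=1" not in assignments:
            return True
    return False


def decide(event: dict, reason: str):
    """Return a deny decision for a violating Bash call, else None (no opinion)."""
    if event.get("tool_name") != "Bash":
        return None
    command = event.get("tool_input", {}).get("command", "")
    if not violates_never_git_commit(command):
        return None
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": reason,
        }
    }


RULE_TEXT = "MANDATORY: Never use raw git commit directly."
event = {"tool_name": "Bash", "tool_input": {"command": 'echo "git commit"'}}
print(json.dumps(decide(event, RULE_TEXT)))  # null: the quoted text is an argument, not a command
```

In a real hook script the event would be read from stdin with `json.load(sys.stdin)` and a non-None decision printed to stdout. Because the quoted string stays a single token, only a command whose first two words are `git` and `commit` is matched, which is what avoids the naive-substring false positive. The sketch does not handle forms such as `git -C dir commit`.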

The strongest proof drives the emitted hook end-to-end: bats tests/bats/convert_blocks_commit.bats installs it into an isolated temp HOME and feeds it the real PreToolUse stdin contract, asserting it denies git commit, allows git status, ignores a quoted echo "git commit", and honors the escape hatch. See docs/proof.md for every reproducible proof.

How it's different

misfire owns trace-grounded ranking of your existing prose rules. Adjacent tools either convert rules blindly, grow a new policy, or do static analysis only:

Tool / work	What it does	misfire's difference
rule2hook	Blind prose→hook conversion, no evidence	misfire decides WHICH rules earn conversion, from YOUR trace evidence
PrismorSec/immunity-agent	Mines history to grow a NEW security policy; surfaces recommendations (no auto-apply)	misfire audits YOUR EXISTING prose rules and ranks them by observed violations; it never grows a new policy
AgentLint	AI-inference flags repeated or ignored rules ("Session mode" is closest)	misfire gives ranked output with confidence thresholds plus a convertible/judgment split
AgentSpec (ICSE'26)	Runtime-enforcement DSL	misfire is static plus an adherence audit, not a new DSL
Offscript	Academic adherence audit (86.4% of conversations deviate / 22.2% material)	misfire ships trace-grounded ranking as a local CLI on YOUR own data
TRACE	Mines USER CORRECTIONS into rules	misfire audits EXISTING prose; it does not derive rules from corrections
agents-lint / AgentLinter	Static stale-path / conflict detection	misfire includes static audit as table-stakes; the headline is evidence-ranking

Assumptions & Limitations

These are load-bearing. Read them before acting on any recommendation; the full guardrails live in docs/framing.md.

Passive-trace blindspot. misfire cannot tell a never-needed rule from a silently-obeyed one. Output is evidence of violation, not of redundancy. A safety rule with zero violations is not a deletion candidate.
False positives dominate naive matching. On the one rule tested first-hand (never raw git commit), ~80% of naive string matches were noise: the predicate appeared inside a hook-test payload, a PR body, or a grep pattern rather than as an executed command. (For comparison, Offscript independently measured ~22% material deviations.) misfire applies structural command parsing, confidence thresholds, and minimum-support floors; rankings are not meaningful until a rule has enough observations.
Structural command parsing is mandatory, not polish — it is what kills the false-positive class above.
CAST vs portable. The optional cast.db substrate gives richer pre-computed signals; the default portable adapter reconstructs equivalent signals for any Claude Code user without cast.db.
Hook schema volatility. The Claude Code hook surface is large and changes across versions; misfire feature-detects the installed CC version before emitting scaffold code. (No required CC version is stated.)
Scope. v1 audits Claude Code instruction files only (CLAUDE.md, .claude/rules/*.md, @imports). Not AGENTS.md or .cursorrules yet.
Unranked ordering rules. before_action / after_action convertible rules carry no violation evidence (ordering is not reconstructible from passive traces) — they are unranked and emit a skeleton hook for you to complete, never an evidence-grounded recommendation.

Documentation

Doc	Contents
`docs/README.md`	Documentation index
`docs/usage.md`	Command reference, flags, defaults, JSON contract
`docs/architecture.md`	The signals → audit → recommendation pipeline
`docs/adapters.md`	Portable transcript adapter and optional `cast.db` adapter
`docs/convertibility-taxonomy.md`	The 5 categories and 4 convert kinds
`docs/proof.md`	Byte-reproducible proofs (audit, rank, convert, cast.db, BATS)
`docs/framing.md`	Framing guardrails, differentiation statement, prior art, and full assumptions

Contributing, Security, License

Contributing: see CONTRIBUTING.md.
Security: report vulnerabilities per SECURITY.md.
Conduct: see CODE_OF_CONDUCT.md.
License: Apache-2.0 — see LICENSE.

Maintainer: edward.kubiak.dev@gmail.com · Repo: https://github.com/ek33450505/misfire

Release history

Version	Released
0.2.0 (this version)	Jun 23, 2026
0.1.0	Jun 23, 2026
0.0.0	Jun 23, 2026

Download files

File	Type	Size	Uploaded
`misfire-0.2.0.tar.gz`	Source distribution	80.2 kB	Jun 23, 2026
`misfire-0.2.0-py3-none-any.whl`	Built distribution (Python 3)	88.7 kB	Jun 23, 2026

Both files were uploaded via uv 0.11.23, without Trusted Publishing.

Hashes for misfire-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`276ea1c6b4db25965ec6025c61ae5eb98ecfe151a076924de869d8c16a1ce4c6`
MD5	`410d0bfa94fd3b9533992517196f099a`
BLAKE2b-256	`42e57fd68aea9a13ddcfa871363f3070f7f4d72e7b6486ae03296f3eba416c68`

Hashes for misfire-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`748dfb07d169275a53cc91cd2a688a0cc2343695a8cdeb408bf3782592da5dce`
MD5	`1b71bef028efbfb6596cde4c8aab0f5f`
BLAKE2b-256	`fa00978c37e86f459870225c204918d6c140de2f43e59058d38138ca0253a79f`
