Audit your Claude Code agent architecture. Detect autonomy risks, observability gaps, and rule coverage issues.

Oaken AI Claude Agent Auditor

Audit your Claude Code agent architecture. Detect autonomy risks, observability gaps, and rule coverage issues before they cost you.

Built by Oaken AI, drawing on the Stanford CS230 guest lecture on building with LLMs and on Claude Code's internal architecture.

⚡ This tool is free and open-source.
If it saved you time, buy me a coffee — it fuels more free AI tools.

What It Does

Scans your Claude Code workspace and generates a visual HTML report showing:

Architecture Score (0-100) based on agent safety and observability patterns
Autonomy Risk — LOW / MEDIUM / HIGH based on permission settings and safety controls
Observability Coverage — which hooks are present vs. missing (tracing, session logging, memory preservation)
Problem Type Coverage — do your rules address domain knowledge gaps, context window limits, hallucinations, and difficulty of control?
Rule Overlap Detection — finds redundant rules wasting context budget
Agent Setup Analysis — skills, orchestration patterns, subagent delegation

Why This Matters

Agentic Claude Code systems can silently develop dangerous patterns: unconstrained autonomy, no tracing, overlapping rules that dilute each other, missing safeguards for hallucination or context overflow.

Common problems this tool catches:

Unconstrained autonomy: bypassPermissions or dontAsk mode with no deny rules — Claude can delete files, push code, send messages without confirmation
No agent tracing: Multi-step Task tool runs with no PostToolUse hooks — you can't debug what went wrong
Missing session logging: No Stop hook — session decisions are lost after every conversation
No memory preservation: PreCompact hook absent — important context silently dropped during long sessions
Rule redundancy: Two rules with 80% keyword overlap loaded every session, canceling each other out
Narrow problem coverage: Rules address hallucination control but ignore context limits — agent hits token walls silently

Requirements

Python 3.10+
Zero dependencies (pure Python stdlib)
Does NOT need to be installed inside your Claude Code workspace
Read-only analysis. Never modifies your existing files; it only writes its report to the output directory.

Install

pip install claude-agent-auditor

Usage

Run from anywhere. Point it at any Claude Code project directory.

# Scan current directory
claude-agent-auditor

# Scan a specific project
claude-agent-auditor /path/to/your/project

# Generate report and open in browser
claude-agent-auditor /path/to/your/project --open

# Also export raw metrics as JSON
claude-agent-auditor --json

# Save reports to a custom directory
claude-agent-auditor --output ./my-reports/

The tool looks for .claude/ in the target directory (and ~/.claude/ for global settings). It scans settings.json, rules, hooks, and skills. The report is saved to agent-audit/ inside the target directory by default.

Pro tip: After reviewing the report, feed it to your Claude Code instance:

"Read agent-audit/audit.html and implement the HIGH priority recommendations"

Claude can self-modify your settings, rules, and hooks based on the findings. Always review the changes before accepting.

Example Output

  Scanning: /home/user/my-project
  Output:   /home/user/my-project/agent-audit/

  Architecture Score:   58/100
  Autonomy Risk:        MEDIUM ⚠
  Observability:        33%
  Problem Types:        2/4 covered
  Agent Rules:          4
  Issues:               5
  Recommendations:      4

What It Checks

Autonomy & Permissions

Analyzes settings.json permission configuration:

Risk	Condition
HIGH	`bypassPermissions` mode or `dontAsk` with no deny/ask rules
MEDIUM	`dontAsk` with some safety rules, or broad allows with no deny rules
LOW	Balanced configuration with explicit deny/ask rules
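
The table above can be sketched as a small decision function. This is only an illustration, not the tool's actual implementation: the function name is hypothetical, the `permissions` layout (`defaultMode`, `allow`, `deny`, `ask`) is assumed, and "broad allow" is approximated here as an allow entry with no specifier or with a wildcard.

```python
def classify_autonomy_risk(settings: dict) -> str:
    """Return 'HIGH', 'MEDIUM', or 'LOW' for a parsed settings.json dict.

    Hypothetical helper; the key names below are assumptions about
    the settings layout, not taken from the tool's source.
    """
    perms = settings.get("permissions", {})
    mode = perms.get("defaultMode", "default")
    allow = perms.get("allow", [])
    deny = perms.get("deny", [])
    ask = perms.get("ask", [])

    # bypassPermissions is HIGH regardless of other rules.
    if mode == "bypassPermissions":
        return "HIGH"

    # dontAsk is HIGH with no deny/ask rules, MEDIUM with some.
    if mode == "dontAsk":
        return "MEDIUM" if (deny or ask) else "HIGH"

    # Assumed heuristic: an allow rule is "broad" if it names a whole
    # tool (no parenthesized specifier) or contains a wildcard.
    broad = any("(" not in rule or "*" in rule for rule in allow)
    if broad and not deny:
        return "MEDIUM"

    return "LOW"


if __name__ == "__main__":
    example = {"permissions": {"defaultMode": "dontAsk", "deny": ["Read(.env)"]}}
    print(classify_autonomy_risk(example))  # MEDIUM
```

Treating an empty configuration as LOW is a choice made for this sketch; the tool may score that case differently.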

Observability Hooks

Checks for six hooks across three priority tiers:

Priority	Hook	Purpose
CRITICAL	PostToolUse: Task	Trace every agent sub-task
CRITICAL	Stop	Log session decisions before they're lost
IMPORTANT	PreCompact	Preserve critical context before compaction
IMPORTANT	SessionStart	Initialize state and restore context
USEFUL	PostToolUse: Write|Edit	Audit file changes
USEFUL	PostToolUse: Bash	Log all executed commands

Problem Type Coverage (Stanford CS230 Framework)

Checks whether your rules address the four fundamental LLM problems identified in the Stanford CS230 guest lecture:

Domain Knowledge Gaps — Does the agent know enough? (RAG, context injection, domain rules)
Context Window Limits — Does it handle long conversations? (compaction, memory, summarization)
Hallucinations — Does it verify before acting? (verification, grounding, skepticism)
Difficulty of Control — Can you constrain its behavior? (deny rules, ask rules, scope limits)
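
One simple way to approximate this check is a keyword scan over rule text. The sketch below is an assumption about the approach, not the tool's implementation: the function name is hypothetical, and the keyword lists are lifted from the parenthetical hints above.

```python
import re

# Hypothetical keyword lists, taken from the hints in the list above.
PROBLEM_KEYWORDS = {
    "Domain Knowledge Gaps": ["rag", "context injection", "domain rules"],
    "Context Window Limits": ["compaction", "memory", "summarization"],
    "Hallucinations": ["verification", "grounding", "skepticism"],
    "Difficulty of Control": ["deny rules", "ask rules", "scope limits"],
}


def covered_problem_types(rule_texts: list[str]) -> list[str]:
    """Return the problem types mentioned by at least one rule."""
    corpus = "\n".join(rule_texts).lower()
    covered = []
    for problem, words in PROBLEM_KEYWORDS.items():
        # Whole-word match so "rag" does not fire on "paragraph".
        if any(re.search(rf"\b{re.escape(w)}\b", corpus) for w in words):
            covered.append(problem)
    return covered


if __name__ == "__main__":
    rules = [
        "Require verification of file paths before editing.",
        "Write a summarization of decisions to memory before compaction.",
    ]
    found = covered_problem_types(rules)
    print(f"{len(found)}/{len(PROBLEM_KEYWORDS)} covered")  # 2/4 covered
```

A keyword scan only shows that a topic is mentioned, not that the rule handles it well, so a result like this is best read as a prompt for manual review.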

Rule Architecture

Counts total rules and identifies agent-aware rules
Detects overlapping rule pairs using Jaccard similarity (threshold: 35%)
Reports which overlap pairs are wasting context budget
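
Jaccard similarity is the size of the intersection of two keyword sets divided by the size of their union. The sketch below shows the idea with the 35% threshold. The tokenizer (lowercase words of four or more letters), the function names, and the choice to flag pairs at or above the threshold are assumptions for illustration; the tool's keyword extraction may differ.

```python
import re
from itertools import combinations


def keywords(text: str) -> set[str]:
    # Assumed tokenizer: lowercase alphabetic words of 4+ letters.
    return set(re.findall(r"[a-z]{4,}", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_pairs(rules: dict[str, str], threshold: float = 0.35):
    """Return (name_a, name_b, similarity) for pairs at or above threshold."""
    sets = {name: keywords(text) for name, text in rules.items()}
    pairs = []
    for a, b in combinations(sets, 2):
        score = jaccard(sets[a], sets[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


if __name__ == "__main__":
    rules = {
        "review.md": "never push code without review",
        "tests.md": "never push code without tests",
        "sessions.md": "summarize long sessions",
    }
    for a, b, score in overlapping_pairs(rules):
        print(f"{a} <-> {b}: {score:.0%}")  # review.md <-> tests.md: 67%
```

Here the first two rules share four of six distinct keywords (about 67%), so they are flagged, while the third shares none with either.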

Agent Setup

Whether skills directory exists and how many skills are defined
Whether rules reference the Task tool (agent delegation)
Whether orchestration patterns are present (spawn, delegate, dispatch)
Whether subagent behavior rules exist
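
The four checks above could be sketched with `pathlib` as follows. This is a hypothetical illustration: the function name, the `.claude/rules/*.md` location, the one-directory-per-skill convention, and the plain substring matching are all assumptions rather than the tool's actual logic.

```python
import re
from pathlib import Path

ORCHESTRATION_WORDS = ("spawn", "delegate", "dispatch")


def agent_setup(project: Path) -> dict:
    """Summarize skills and delegation signals under project/.claude."""
    claude = project / ".claude"

    # Assumption: each skill lives in its own subdirectory of skills/.
    skills_dir = claude / "skills"
    skill_count = (
        sum(1 for p in skills_dir.iterdir() if p.is_dir())
        if skills_dir.is_dir()
        else 0
    )

    # Assumption: rules are Markdown files under .claude/rules/.
    rules_dir = claude / "rules"
    rule_text = ""
    if rules_dir.is_dir():
        rule_text = "\n".join(
            p.read_text(encoding="utf-8") for p in sorted(rules_dir.glob("*.md"))
        )
    lowered = rule_text.lower()

    return {
        "skills_dir_exists": skills_dir.is_dir(),
        "skill_count": skill_count,
        # Case-sensitive whole word, to avoid matching "task" in prose.
        "references_task_tool": re.search(r"\bTask\b", rule_text) is not None,
        "orchestration_patterns": any(w in lowered for w in ORCHESTRATION_WORDS),
        "subagent_rules": "subagent" in lowered,
    }
```

The function only reads files, in keeping with the tool's read-only analysis.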

The Patterns

This tool checks your workspace against agent architecture patterns from Claude Code's framework and the Stanford CS230 LLM engineering principles:

Autonomy Gates — Every permission expansion should have a corresponding safety rule
Observability First — If you don't have traces, you can't debug your agent system
Problem Coverage — Rules should explicitly address all four LLM failure modes
Non-Overlapping Rules — Redundant rules dilute each other and waste context budget
Orchestration Awareness — Multi-agent systems need explicit delegation and coordination rules

Also Check

If you haven't optimized your workspace's context and memory yet, start there:

pip install claude-workspace-optimizer
claude-workspace-optimizer /path/to/your/project --open

The Workspace Optimizer handles memory visibility, context bloat, and rule tiering: foundational issues to fix before auditing agent architecture.

About Oaken AI

Oaken AI builds AI automation systems for businesses, from workspace optimization to full production AI pipelines.

Disclaimer

This tool is provided as-is with no warranty. Oaken AI and its contributors accept zero responsibility for any changes made to your workspace based on this tool's output. The report contains recommendations, not instructions. Always review changes before applying them. Back up your workspace before making modifications.

Author

Built by Benjamin Brown at Oaken AI.

License

MIT

Release history

0.3.1 (this version): Apr 17, 2026
0.3.0: Apr 17, 2026
0.2.1: Apr 17, 2026
0.2.0: Apr 13, 2026

