Claude Code extension that watches agent tool calls and suggests structured alternatives

Kibitzer

The person watching your chess game who can't help offering opinions.

Kibitzer is a Claude Code extension that watches how agents use tools and suggests better alternatives. It enforces path protection per mode, intercepts bash commands that have structured alternatives, and coaches agents toward more effective tool usage — all without an LLM in the decision loop.

Install

pip install kibitzer
cd your-project/
kibitzer init --hooks --mcp

This registers PreToolUse/PostToolUse hooks in .claude/settings.json and starts an MCP server with two tools the agent can call: ChangeToolMode and GetFeedback.

For richer coaching with Fledgling conversation analytics:

pip install kibitzer[fledgling]

What it does

Path protection

Each mode defines which paths the agent can write to. The path guard checks every Edit, Write, and NotebookEdit call — including absolute paths from Claude Code.

Mode        Writable            Use case
─────────── ─────────────────── ───────────────────────────
free        everything          prototyping, no guardrails
implement   src/, lib/          normal dev — tests protected
test        tests/, test/       writing tests — source protected
docs        docs/, README.md    documentation only
explore     nothing             read-only investigation

When a write is denied, the agent sees why and how to fix it:

Path 'tests/test_auth.py' is not writable in the current mode (writable: ['src/', 'lib/']).
Use the ChangeToolMode tool to switch modes.

In testing, agents consistently read this message and call ChangeToolMode to switch — no documentation or pre-training needed.
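
The path check described above can be sketched as a prefix match against the current mode's writable list. This is a minimal illustration, not Kibitzer's actual implementation: the `is_writable` helper, the `MODE_WRITABLE` dictionary, and the `/repo` project root are all hypothetical names, and the real guard may normalize paths differently.

```python
import posixpath

# Writable prefixes per mode, copied from the table above.
# None stands for "everything"; an empty list means read-only.
MODE_WRITABLE = {
    "free": None,
    "implement": ["src/", "lib/"],
    "test": ["tests/", "test/"],
    "docs": ["docs/", "README.md"],
    "explore": [],
}


def is_writable(path: str, mode: str, project_root: str = "/repo") -> bool:
    """Hypothetical helper: may `path` be written while in `mode`?"""
    writable = MODE_WRITABLE[mode]
    if writable is None:
        return True
    # Collapse "." and ".." so "src/../tests/x.py" cannot sneak past the guard.
    rel = posixpath.normpath(path)
    if posixpath.isabs(rel):
        # Absolute paths are compared relative to the project root.
        rel = posixpath.relpath(rel, project_root)
    if rel == ".." or rel.startswith("../"):
        return False  # outside the project (assumption: treated as protected)
    # Entries ending in "/" are directory prefixes; others are exact file names.
    return any(
        rel.startswith(entry) if entry.endswith("/") else rel == entry
        for entry in writable
    )


print(is_writable("tests/test_auth.py", "implement"))
print(is_writable("/repo/src/auth.py", "implement"))
```

The first call mirrors the denial message shown above (a test file in implement mode), and the second shows an absolute path resolving into a writable directory.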

Interception

Interceptor plugins watch Bash calls for commands that have structured alternatives:

Bash command                        Suggested alternative   Plugin
─────────────────────────────────── ─────────────────────── ───────────
git add -A && git commit -m '...'   jetsam save             jetsam
pytest tests/                       blq run test            blq
grep -rn 'def handler' src/         FindDefinitions(...)    fledgling

Three interception modes form a ratchet — start in observe (log silently), graduate to suggest (show alternative), then redirect (deny bash, require structured tool). Each graduation is a one-line config change.
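
To make the ratchet concrete, here is a rough sketch of how one rule might behave in each of the three modes. The `Decision` class and `intercept` function are illustrative names only, and matching on a command prefix is an assumption; the real plugins may match commands differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    allow: bool             # whether the Bash call is allowed to run
    message: Optional[str]  # text shown to the agent, if any


def intercept(command: str, prefix: str, alternative: str, mode: str) -> Optional[Decision]:
    """Apply one hypothetical interceptor rule to a Bash command."""
    if not command.strip().startswith(prefix):
        return None  # the rule does not apply to this command
    if mode == "observe":
        # Log silently: the call runs and the agent sees nothing.
        return Decision(allow=True, message=None)
    if mode == "suggest":
        # The call still runs, but the alternative is shown.
        return Decision(allow=True, message=f"Consider `{alternative}` instead.")
    if mode == "redirect":
        # The Bash call is denied and the structured tool is required.
        return Decision(allow=False, message=f"Use `{alternative}` instead.")
    raise ValueError(f"unknown interception mode: {mode!r}")


for mode in ("observe", "suggest", "redirect"):
    print(mode, intercept("pytest tests/", "pytest", "blq run test", mode))
```

Because only the `mode` argument changes between the three outcomes, graduating a plugin stays a single-setting change, which matches the one-line config change described above.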

Coaching

The coach fires every N tool calls (see frequency under Configuration) and checks for patterns that were identified across roughly 250 experimental runs. Suggestions only reference tools the agent actually has, which are discovered from .mcp.json at runtime.

State-based patterns (always available):

  • Repeated edit failures — "Edit failed 3 times on src/handler.py. Try Read() first to see exact content."
  • Edit streak without tests — "You've made 7 edits without running tests." (mentions blq run test if blq is available)
  • Sequential file reads — "You've read 5 files one at a time." (mentions FindDefinitions if fledgling is available)
  • Bash-heavy usage — "You've run 6 bash commands without using structured tools."
  • Analysis loop — "You've spent 18 turns reading without changes. Start with the most confident fix."
  • Semantic tool underuse — "FindDefinitions shows all functions in one call." (only fires if fledgling is available)
  • Mode oscillation — "Frequent mode switches. Consider using free mode."

TDD patterns:

  • Test overfit — "test_auth.py has been edited 4 times. Stabilize test expectations before adjusting further."
  • Implement before test — "You edited source before writing tests. Consider starting with a failing test."

Fledgling-powered patterns (when fledgling is installed):

  • Repeated search patterns — "You've searched for 'def handle_request' 4 times via Grep."
  • Replaceable bash commands — "You've run 'grep' 3 times. FindDefinitions provides structured output."

All patterns are mode-aware. The analysis loop doesn't fire in explore mode, where not editing is the correct behavior, and edit-without-test doesn't fire in docs mode, because docs don't need tests.
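
As an illustration of a state-based, mode-aware pattern, the following sketch implements an "edit streak without tests" check. It is a guess at the general shape rather than Kibitzer's code: the function name, the `(tool_name, is_test_run)` call representation, and the threshold of 7 (borrowed from the example message) are all assumptions.

```python
from typing import Iterable, List, Optional, Tuple

EDIT_TOOLS = {"Edit", "Write", "NotebookEdit"}


def edit_streak_suggestion(
    calls: List[Tuple[str, bool]],  # (tool_name, is_test_run), oldest first
    mode: str,
    available_tools: Iterable[str],
    threshold: int = 7,  # assumed value, taken from the example message
) -> Optional[str]:
    """Hypothetical detector for edits made since the last test run."""
    if mode == "docs":
        return None  # docs don't need tests, so the pattern stays quiet
    streak = 0
    for tool, is_test_run in reversed(calls):
        if is_test_run:
            break  # stop counting at the most recent test run
        if tool in EDIT_TOOLS:
            streak += 1
    if streak < threshold:
        return None
    message = f"You've made {streak} edits without running tests."
    if "blq" in available_tools:
        # Only mention a tool the agent actually has.
        message += " Try blq run test."
    return message


history = [("Edit", False)] * 7
print(edit_streak_suggestion(history, "implement", ["blq"]))
print(edit_streak_suggestion(history, "docs", ["blq"]))
```

The same history yields a suggestion in implement mode and nothing in docs mode, and the blq hint appears only when blq is in the available tool list.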

Auto-transitions

The mode controller watches for failure patterns:

  • 3+ consecutive failures → auto-switch to explore
  • 20+ turns in explore → auto-switch back to implement

An oscillation guard prevents rapid switching: if the agent left a mode fewer than 5 turns ago, the controller won't auto-switch back to it. After 6 or more total switches, auto-transitions stop entirely.
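
The thresholds above can be combined into a small state machine. The sketch below is one possible reading, not the actual mode controller: the `ModeController` class and its fields are hypothetical, it counts only its own automatic switches toward the limit of 6, and it assumes a successful call resets the failure counter.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ModeController:
    mode: str = "implement"
    turn: int = 0
    consecutive_failures: int = 0
    turns_in_mode: int = 0
    total_switches: int = 0
    left_at: Dict[str, int] = field(default_factory=dict)  # mode -> turn it was last left

    def _switch(self, target: str) -> bool:
        if self.total_switches >= 6:
            return False  # after 6+ switches, auto-transitions stop
        last_left = self.left_at.get(target)
        if last_left is not None and self.turn - last_left < 5:
            return False  # oscillation guard: target was left < 5 turns ago
        self.left_at[self.mode] = self.turn
        self.mode = target
        self.turns_in_mode = 0
        self.consecutive_failures = 0
        self.total_switches += 1
        return True

    def record(self, success: bool) -> str:
        """Record one tool call outcome and return the resulting mode."""
        self.turn += 1
        self.turns_in_mode += 1
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1
        if self.mode != "explore" and self.consecutive_failures >= 3:
            self._switch("explore")
        elif self.mode == "explore" and self.turns_in_mode >= 20:
            self._switch("implement")
        return self.mode


controller = ModeController()
for _ in range(3):
    controller.record(success=False)
print(controller.mode)  # three failures in a row move the agent to explore
```

Keeping the guard inside `_switch` means both transitions share the same limits, so a burst of failures right after leaving explore cannot bounce the agent straight back.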

MCP tools

The agent can call two tools explicitly:

ChangeToolMode(mode, reason?) — Switch modes. Returns the new mode's writable paths and strategy.

GetFeedback(status?, suggestions?, intercepts?) — Check current status, get coaching suggestions, and see which bash commands have been intercepted.

Configuration

Override defaults in .kibitzer/config.toml:

# Monorepo: widen writable paths
[modes.implement]
writable = ["packages/core/src/", "packages/api/src/"]

# Add custom modes
[modes.deploy]
writable = ["infra/", "deploy/"]
strategy = "Verify before applying."

# Graduate jetsam to suggest mode
[plugins.jetsam]
mode = "suggest"

# More aggressive coaching
[coach]
frequency = 3

Python API

Use kibitzer from Python without the hook protocol:

from kibitzer import KibitzerSession

with KibitzerSession(project_dir=".") as session:
    # Check if a tool call is allowed
    result = session.before_call("Edit", {"file_path": "src/auth.py"})

    # Record a completed tool call
    session.after_call("Edit", {"file_path": "src/auth.py"}, success=True)

    # Batch-validate a program's planned calls
    violations = session.validate_calls([
        {"tool": "Edit", "input": {"file_path": "tests/foo.py"}},
    ])

Full API docs at kibitzer.readthedocs.io/python-api.

Coordinates with

Kibitzer suggests but never wraps these tools — each is independent:

  • blq — structured build/test capture
  • jetsam — git workflow acceleration
  • Fledgling — AST-aware code intelligence

None are required. Kibitzer degrades gracefully — path guard and coach work with nothing else installed. When tools are available, suggestions reference them specifically. When they're not, suggestions give generic advice.

Documentation

Full docs at kibitzer.readthedocs.io:

  • Modes — path protection, switching, auto-transitions
  • Coach — all patterns, experimental evidence, model dependency
  • Interceptors — the observe/suggest/redirect ratchet
  • Configuration — full config.toml reference, resilience, optional deps
  • Architecture — how the pieces fit together
  • Integration — blq, jetsam, fledgling, superpowers

License

MIT
