Claude Code extension that watches agent tool calls and suggests structured alternatives
Kibitzer
The person watching your chess game who can't help offering opinions.
Kibitzer is a Claude Code extension that watches how agents use tools and suggests better alternatives. It enforces path protection per mode, intercepts bash commands that have structured alternatives, and coaches agents toward more effective tool usage — all without an LLM in the decision loop.
Install
pip install kibitzer
cd your-project/
kibitzer init --hooks --mcp
This registers PreToolUse/PostToolUse hooks in .claude/settings.json and starts an MCP server with two tools the agent can call: ChangeToolMode and GetFeedback.
For richer coaching with Fledgling conversation analytics:
pip install kibitzer[fledgling]
What it does
Path protection
Each mode defines which paths the agent can write to. The path guard checks every Edit, Write, and NotebookEdit call — including absolute paths from Claude Code.
| Mode | Writable | Use case |
|---|---|---|
| free | everything | prototyping, no guardrails |
| implement | src/, lib/ | normal dev — tests protected |
| test | tests/, test/ | writing tests — source protected |
| docs | docs/, README.md | documentation only |
| explore | nothing | read-only investigation |
When a write is denied, the agent sees why and how to fix it:
Path 'tests/test_auth.py' is not writable in the current mode (writable: ['src/', 'lib/']).
Use the ChangeToolMode tool to switch modes.
In testing, agents consistently read this message and call ChangeToolMode to switch — no documentation or pre-training needed.
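The check described above can be sketched in a few lines. This is a minimal illustration, not kibitzer's actual implementation: the names `MODE_WRITABLE` and `is_writable`, the `"*"` marker for "everything", and the choice to deny absolute paths outside the project are all assumptions made for the example.

```python
import posixpath

# Writable prefixes per mode, copied from the mode table.
# "*" is a hypothetical marker meaning "everything is writable".
MODE_WRITABLE = {
    "free": ["*"],
    "implement": ["src/", "lib/"],
    "test": ["tests/", "test/"],
    "docs": ["docs/", "README.md"],
    "explore": [],
}


def is_writable(mode: str, file_path: str, project_root: str) -> bool:
    """Return True if file_path may be written in the given mode."""
    prefixes = MODE_WRITABLE[mode]
    if "*" in prefixes:
        return True

    # Collapse "." and ".." so "src/../tests/x.py" is judged as "tests/x.py".
    path = posixpath.normpath(file_path)

    # Absolute paths are made project-relative before matching.
    if posixpath.isabs(path):
        root = posixpath.normpath(project_root)
        if not path.startswith(root + "/"):
            return False  # assumption: paths outside the project are denied
        path = path[len(root) + 1:]

    for prefix in prefixes:
        if prefix.endswith("/"):
            if path.startswith(prefix):  # directory prefix match
                return True
        elif path == prefix:  # exact file match, e.g. README.md
            return True
    return False
```

Normalizing before matching matters here: without it, a path such as `src/../tests/test_auth.py` would pass a naive `startswith("src/")` check.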
Interception
Interceptor plugins watch Bash calls for commands that have structured alternatives:
| Bash command | Suggested alternative | Plugin |
|---|---|---|
| git add -A && git commit -m '...' | jetsam save | jetsam |
| pytest tests/ | blq run test | blq |
| grep -rn 'def handler' src/ | FindDefinitions(...) | fledgling |
Three interception modes form a ratchet — start in observe (log silently), graduate to suggest (show alternative), then redirect (deny bash, require structured tool). Each graduation is a one-line config change.
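One way to picture the ratchet is as a single function whose behavior depends only on the plugin's configured mode. The sketch below is illustrative: `Decision`, `intercept`, the message wording, and the use of a plain list as the log are hypothetical and not taken from kibitzer's code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Decision:
    allow: bool                    # may the bash command run?
    message: Optional[str] = None  # text shown to the agent, if any


def intercept(command: str, alternative: str, mode: str, log: List[str]) -> Decision:
    """Decide what to do with a bash command that has a structured alternative."""
    if mode == "observe":
        log.append(command)        # record silently, say nothing
        return Decision(allow=True)
    if mode == "suggest":
        # Let the command run, but show the alternative.
        return Decision(allow=True, message=f"Consider {alternative} instead.")
    if mode == "redirect":
        # Deny the bash call and require the structured tool.
        return Decision(allow=False, message=f"Use {alternative} instead of bash.")
    raise ValueError(f"unknown interception mode: {mode!r}")
```

Because only the `mode` argument changes between stages, graduating a plugin does not require touching the matching logic, which fits the one-line config change described above.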
Coaching
The coach fires every N tool calls and checks for patterns identified across ~250 experimental runs. Its suggestions reference only tools the agent actually has, which it discovers from .mcp.json at runtime.
State-based patterns (always available):
- Repeated edit failures — "Edit failed 3 times on src/handler.py. Try Read() first to see exact content."
- Edit streak without tests — "You've made 7 edits without running tests." (mentions blq run test if blq is available)
- Sequential file reads — "You've read 5 files one at a time." (mentions FindDefinitions if fledgling is available)
- Bash-heavy usage — "You've run 6 bash commands without using structured tools."
- Analysis loop — "You've spent 18 turns reading without changes. Start with the most confident fix."
- Semantic tool underuse — "FindDefinitions shows all functions in one call." (only fires if fledgling is available)
- Mode oscillation — "Frequent mode switches. Consider using free mode."
TDD patterns:
- Test overfit — "test_auth.py has been edited 4 times. Stabilize test expectations before adjusting further."
- Implement before test — "You edited source before writing tests. Consider starting with a failing test."
Fledgling-powered patterns (when fledgling is installed):
- Repeated search patterns — "You've searched for 'def handle_request' 4 times via Grep."
- Replaceable bash commands — "You've run 'grep' 3 times. FindDefinitions provides structured output."
All patterns are mode-aware: the analysis loop doesn't fire in explore mode (not editing is correct there), edit-without-test doesn't fire in docs mode (docs don't need tests).
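As an illustration of how a state-based, mode-aware pattern might work, here is a sketch of the "edit streak without tests" check. Everything in it is an assumption made for the example: the `SessionState` fields, the `record_call` and `coach` functions, and the default frequency and threshold values are hypothetical, not kibitzer's real internals.

```python
from dataclasses import dataclass
from typing import List

EDIT_TOOLS = {"Edit", "Write", "NotebookEdit"}


@dataclass
class SessionState:
    mode: str = "implement"
    total_calls: int = 0
    edits_since_test: int = 0


def record_call(state: SessionState, tool: str, is_test_run: bool = False) -> None:
    """Update counters after each tool call."""
    state.total_calls += 1
    if is_test_run:
        state.edits_since_test = 0      # a test run resets the streak
    elif tool in EDIT_TOOLS:
        state.edits_since_test += 1


def coach(state: SessionState, frequency: int = 5, streak_threshold: int = 5,
          has_blq: bool = False) -> List[str]:
    """Return suggestions, but only on every `frequency`-th call."""
    if state.total_calls == 0 or state.total_calls % frequency != 0:
        return []
    suggestions = []
    # Mode-aware: docs don't need tests, so the pattern stays quiet in docs mode.
    if state.mode != "docs" and state.edits_since_test >= streak_threshold:
        msg = f"You've made {state.edits_since_test} edits without running tests."
        if has_blq:
            msg += " Consider blq run test."  # only mention tools the agent has
        suggestions.append(msg)
    return suggestions
```

The detector reads only counters, which is consistent with the claim that no LLM sits in the decision loop.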
Auto-transitions
The mode controller watches for failure patterns:
- 3+ consecutive failures → auto-switch to explore
- 20+ turns in explore → auto-switch back to implement
An oscillation guard prevents rapid switching: if the agent left a mode fewer than 5 turns ago, the controller won't auto-switch back to it. After 6 or more total switches, auto-transitions stop entirely.
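The transition rules and the guard can be sketched as a small state machine. The thresholds (3 failures, 20 turns, 5 turns, 6 switches) come from the text above; the class name, field names, and the decision to count turns per mode are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModeController:
    mode: str = "implement"
    previous_mode: Optional[str] = None
    turns_in_mode: int = 0
    consecutive_failures: int = 0
    total_switches: int = 0

    def switch_to(self, new_mode: str) -> None:
        """Apply a switch (manual or automatic) and reset per-mode counters."""
        self.previous_mode = self.mode
        self.mode = new_mode
        self.turns_in_mode = 0
        self.consecutive_failures = 0
        self.total_switches += 1

    def record_turn(self, success: bool) -> Optional[str]:
        """Record one turn; return the new mode if an auto-switch happened."""
        self.turns_in_mode += 1
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

        target = None
        if self.mode != "explore" and self.consecutive_failures >= 3:
            target = "explore"
        elif self.mode == "explore" and self.turns_in_mode >= 20:
            target = "implement"
        if target is None:
            return None

        # After 6+ total switches, auto-transitions stop.
        if self.total_switches >= 6:
            return None
        # Oscillation guard: don't bounce back into a mode left < 5 turns ago.
        if target == self.previous_mode and self.turns_in_mode < 5:
            return None

        self.switch_to(target)
        return target
```

In this sketch a blocked transition is not lost: if failures continue past the fifth turn, the next `record_turn` call re-evaluates the rule and the switch goes through.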
Composes with Claude Code's permission model
The Claude Code harness has its own permission system (per-tool allow/deny, prompting before sensitive actions). Kibitzer's modes layer on top of these permissions — not beside them. The two regulate at different granularities:
- Harness permissions: coarse, universal, set at install — "can this agent ever call Bash?"
- Kibitzer modes: fine, contextual, changed per task — "is Bash appropriate given the current mode?"
They compose via set intersection: a tool call succeeds iff the harness permits it AND (if kibitzer is active) the current mode permits it. Kibitzer can narrow the harness's permissions; it can never widen them.
When a call is denied, the response identifies which layer denied it so the agent can respond correctly — switching modes (kibitzer) or asking the user to grant permission (harness) are different remedies.
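The composition rule can be written down directly. The function below is a sketch under stated assumptions: `check_call`, its arguments, and the layer labels `"harness"` and `"kibitzer"` are hypothetical names, and real permissions are richer than sets of tool names.

```python
from typing import Optional, Set, Tuple


def check_call(tool: str, harness_allowed: Set[str],
               mode_allowed: Optional[Set[str]]) -> Tuple[bool, Optional[str]]:
    """Return (allowed, denying_layer).

    mode_allowed is None when kibitzer is not active, in which case only
    the harness decides.
    """
    if tool not in harness_allowed:
        return (False, "harness")    # remedy: ask the user to grant permission
    if mode_allowed is not None and tool not in mode_allowed:
        return (False, "kibitzer")   # remedy: switch modes
    return (True, None)
```

Checking the harness first means a mode can only remove tools from the allowed set. A tool the harness forbids stays forbidden whatever the mode says, which is the "narrow, never widen" property.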
MCP tools
The agent can call two tools explicitly:
ChangeToolMode(mode, reason?) — Switch modes. Returns the new mode's writable paths and strategy.
GetFeedback(status?, suggestions?, intercepts?) — Check current status, get coaching suggestions, and see which bash commands have been intercepted.
Configuration
Override defaults in .kibitzer/config.toml:
# Monorepo: widen writable paths
[modes.implement]
writable = ["packages/core/src/", "packages/api/src/"]
# Add custom modes
[modes.deploy]
writable = ["infra/", "deploy/"]
strategy = "Verify before applying."
# Graduate jetsam to suggest mode
[plugins.jetsam]
mode = "suggest"
# More aggressive coaching
[coach]
frequency = 3
Python API
Use kibitzer from Python without the hook protocol:
from kibitzer import KibitzerSession

with KibitzerSession(project_dir=".") as session:
    # Check if a tool call is allowed
    result = session.before_call("Edit", {"file_path": "src/auth.py"})

    # Record a completed tool call
    session.after_call("Edit", {"file_path": "src/auth.py"}, success=True)

    # Batch-validate a program's planned calls
    violations = session.validate_calls([
        {"tool": "Edit", "input": {"file_path": "tests/foo.py"}},
    ])
Full API docs at kibitzer.readthedocs.io/python-api.
Coordinates with
Kibitzer suggests but never wraps these tools — each is independent:
- blq — structured build/test capture
- jetsam — git workflow acceleration
- Fledgling — AST-aware code intelligence
None are required. Kibitzer degrades gracefully — path guard and coach work with nothing else installed. When tools are available, suggestions reference them specifically. When they're not, suggestions give generic advice.
Documentation
Full docs at kibitzer.readthedocs.io:
- Modes — path protection, switching, auto-transitions
- Coach — all patterns, experimental evidence, model dependency
- Interceptors — the observe/suggest/redirect ratchet
- Configuration — full config.toml reference, resilience, optional deps
- Architecture — how the pieces fit together
- Integration — blq, jetsam, fledgling, superpowers
License
MIT