Deterministic safety hooks for Claude Code
Sensorium 🧠🛡️
A few months ago, a developer asked Claude to clean up an old repo.
Claude ran rm -rf tests/ patches/ plan/ ~/.
That trailing ~/ expands to the home directory. It wiped years of files on their Mac. The post hit 1,500+ upvotes on r/ClaudeAI within hours — because everyone building with agents recognized exactly how it happened: not malice, just confidence with no checkpoint.
It's not an isolated story:
- A founder watched a Cursor agent find an unrelated API token, decide it had permission, and delete an entire production database and its backups — in 9 seconds. (Railway CEO personally helped restore it.)
- Developers on GitHub have documented Claude Code running `git reset --hard`, wiping hours of uncommitted work, right after telling the user the operation was "safe."
- A benchmark on 45 failing test suites found agents reporting "45/45 pass" when only 26 actually did: the other 19 quietly never ran the tests that would've said otherwise.
Same root cause every time: the model's confidence and the actual safety of the action are two different variables, and nothing was checking the second one.
So I built the boundary myself. Excited to share Sensorium — deterministic safety hooks for Claude Code. 👇
The problem nobody puts in the demo video
LLM agents are incredible at writing code. They are also, occasionally, incredible at:
- 🔥 running `rm -rf` with a trailing `~/` nobody meant to include
- 🔥 running `git reset --hard` on your uncommitted work, confidently
- 🔥 `curl -X DELETE`-ing a resource because the docs made it sound safe
- 🔥 declaring "all tests pass" without running the ones that don't
- 🔥 reapplying a stale cached manifest as if it were current state
None of this is malice. It's confidence without a checkpoint. And "just review every diff" doesn't scale when the agent is running fifty tool calls a session.
The insight
You don't need a second LLM watching the first one. You need sensors — small, deterministic, boring rules that wake up on a specific kind of change, check exactly what they care about, and say yes/no/wait.
No vibes. No judgment calls. Pattern matching and policy, all the way down.
Claude wants to use a tool
→ PreToolUse hook
→ Sensorium reads the tool input
→ sensors match
→ allow / block / warn
→ tool executes (or doesn't)
→ PostToolUse hook
→ Sensorium checks the resulting file content
→ writes an audit ledger
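The "sensors match → allow / block / warn" step above can be sketched in a few lines. This is an illustration only, not Sensorium's actual code: the `Sensor` class, the `dispatch` function, and both example rules are hypothetical names. The sketch assumes that a sensor only runs for the tools it subscribes to and that the strictest verdict wins.

```python
import re
from dataclasses import dataclass

# Assumed ordering: a block outranks a warn, which outranks allow.
SEVERITY = {"allow": 0, "warn": 1, "block": 2}


@dataclass(frozen=True)
class Sensor:
    name: str
    tools: tuple[str, ...]  # tool names this sensor wakes up for
    pattern: str            # regex checked against the tool input
    action: str             # "block" or "warn"


def dispatch(tool: str, tool_input: str, sensors: list[Sensor]) -> tuple[str, list[str]]:
    """Return the strictest verdict and the names of the sensors that fired."""
    verdict, fired = "allow", []
    for sensor in sensors:
        if tool not in sensor.tools:
            continue  # not subscribed to this tool, so the sensor never runs
        if re.search(sensor.pattern, tool_input):
            fired.append(sensor.name)
            if SEVERITY[sensor.action] > SEVERITY[verdict]:
                verdict = sensor.action
    return verdict, fired


# Hypothetical example rules, not the built-in ones.
SENSORS = [
    Sensor("example.hard_reset", ("Bash",), r"git\s+reset\s+--hard", "block"),
    Sensor("example.skip_marker", ("Edit", "Write"), r"\.skip\(", "warn"),
]

print(dispatch("Bash", "git reset --hard HEAD~1", SENSORS))
print(dispatch("Bash", "git status", SENSORS))
```

Because every step is a dictionary lookup or a regex search, the same input always produces the same verdict, which is the property the flow relies on.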
What this actually catches, out of the box
| Protects against | How | Gate |
|---|---|---|
| `rm -rf /`, `rm -rf ~`, `dd ... of=/dev/sd*` | `filesystem.wipe` | unconditional block |
| Bulk delete without a backup | `filesystem.bulk_delete` | needs `backup_exists` + `dry_run_passed` |
| `git reset --hard`, `git clean -fxd` | `git.destructive` | needs `backup_exists` |
| `curl -X POST/PUT/DELETE/PATCH` | `external_api.write` | needs a snapshot + rollback plan |
| `kubectl apply/delete`, `terraform apply/destroy`, `aws ... delete` | `infra.mutation` | needs snapshot + dry-run + rollback |
| Direct `psql`/`mysql` writes | `db.write` | needs snapshot + rollback plan |
| Reapplying a stale archive/cache as truth | `data.apply_from_archive` | unconditional block |
| Skipped tests slipped in quietly | `test_skip_introduced` | warn, shows up in the audit report |
--dry-run on the command bypasses the gate — sensors check for it explicitly.
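A proof-gated sensor can be pictured as a set difference between the proofs a rule requires and the proofs registered so far. The sketch below makes several assumptions: the `gate_decision` function is a hypothetical name, proofs are modeled as plain strings, and only two rows of the table are covered. How Sensorium actually records proofs is not shown here.

```python
import re

# Required proofs per sensor, copied from two rows of the table above.
# Unconditional-block sensors are out of scope for this sketch.
REQUIRED_PROOFS = {
    "filesystem.bulk_delete": {"backup_exists", "dry_run_passed"},
    "git.destructive": {"backup_exists"},
}


def gate_decision(sensor: str, command: str, proofs: set[str]) -> tuple[bool, set[str]]:
    """Return (allowed, missing_proofs) for a command that matched `sensor`."""
    if re.search(r"--dry-run\b", command):
        return True, set()  # a dry run passes the gate without any proofs
    missing = REQUIRED_PROOFS.get(sensor, set()) - proofs
    return not missing, missing


print(gate_decision("git.destructive", "git reset --hard", set()))
print(gate_decision("git.destructive", "git reset --hard", {"backup_exists"}))
print(gate_decision("git.destructive", "git clean -fxd --dry-run", set()))
```

Returning the missing proofs alongside the verdict lets the block message say exactly what is still needed, rather than a bare "no".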
Bring your own rules 🔧
This is the part I actually wanted to ship. Your project has opinions Sensorium can't guess — so tell it:
# .sensorium/sensors.yaml
sensors:
  - name: no_force_push
    description: Block git push --force (plain push still allowed)
    tools: [Bash]
    action: block
    patterns:
      - 'git\s+push\b.*(--force\b|-f\b)'
    unless:
      - '--dry-run'
    message: |
      Blocked: force-push detected. Use --force-with-lease and confirm
      with a human first.
Drop it in, no restart, no config reload — the next tool call picks it up.
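To see which commands a rule like `no_force_push` would catch, it helps to run its regexes by hand. The sketch below assumes the patterns are evaluated with search semantics like Python's `re.search`, which the README does not confirm. `rule_triggers` is a hypothetical helper, not part of the Sensorium API.

```python
import re

# Values copied from the no_force_push sensor above.
PATTERNS = [r"git\s+push\b.*(--force\b|-f\b)"]
UNLESS = [r"--dry-run"]


def rule_triggers(command: str, patterns: list[str], unless: list[str]) -> bool:
    """True when any pattern matches and no `unless` pattern does."""
    if any(re.search(p, command) for p in unless):
        return False
    return any(re.search(p, command) for p in patterns)


for cmd in (
    "git push origin main",
    "git push --force origin main",
    "git push -f origin main",
    "git push --force --dry-run origin main",
    "git push --force-with-lease origin main",
):
    print(f"{rule_triggers(cmd, PATTERNS, UNLESS)!s:5}  {cmd}")
```

Under these assumed semantics the last command triggers as well, because `\b` matches between `force` and the hyphen that follows it. If `--force-with-lease` should pass, a negative lookahead such as `--force(?!-with-lease)\b` is one way to exempt it.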
Sensor fields:
- `tools`: which Claude Code tools trigger this sensor (`Bash`, `Edit`, `Write`, `MultiEdit`)
- `on_file_change`: glob patterns for file paths (PostToolUse, checks content)
- `action`: `block` (exit 2, Claude sees the message) or `warn` (logged, shown in report)
- `patterns`: regex list matched against the Bash command or file path
- `unless`: if any of these match, the sensor does not trigger
- `block_if_contains`: regex matched against file content after an edit
- `require_contains`: regex that must be present in file content (absence triggers the sensor)
- `message`: shown to Claude when blocked or warned
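The file-oriented fields (`on_file_change`, `block_if_contains`, `require_contains`) can be read as a post-edit content check. The sketch below is a guess at those semantics, assuming glob matching like Python's `fnmatch` and regex search like `re.search`. The `file_violations` helper and the example sensor are hypothetical. The `require_contains` values mirror the `full_object_overwrite` entry in the sample audit report.

```python
import re
from fnmatch import fnmatch

# Hypothetical sensor definition, written as a dict for illustration.
SENSOR = {
    "on_file_change": ["src/*.js"],
    "block_if_contains": [r"\.skip\("],
    "require_contains": [r"delta|changed_fields", r"precondition|current_hash"],
}


def file_violations(path: str, content: str, sensor: dict) -> list[str]:
    """List the reasons `content` violates `sensor`; an empty list means clean."""
    if not any(fnmatch(path, glob) for glob in sensor.get("on_file_change", [])):
        return []  # path is outside the globs, so the sensor stays asleep
    found = [
        f"forbidden: {rx}"
        for rx in sensor.get("block_if_contains", [])
        if re.search(rx, content)
    ]
    found += [
        f"missing: {rx}"
        for rx in sensor.get("require_contains", [])
        if not re.search(rx, content)
    ]
    return found


print(file_violations("src/apply.js", "store.put(obj)", SENSOR))
print(file_violations("src/apply.js", "apply(delta, current_hash)", SENSOR))
```

The first call reports both required patterns as missing, and the second returns an empty list because each required regex finds a match.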
Install (2 minutes, I promise)
Step 1 — the CLI
pipx install agent-sensorium
# or: pip install agent-sensorium
Step 2 — wire up Claude Code
cd my-project
sensorium init claude-code # this project only
sensorium init claude-code --global # every project
Writes .claude/settings.json with the absolute path to the sensorium binary. Claude Code picks it up immediately.
Step 3 (optional) — your own rules
mkdir -p .sensorium
# add .sensorium/sensors.yaml, see above
That's it. Claude Code works exactly as before — Sensorium just quietly rides along on every tool call.
Receipts, not vibes
sensorium report # show session audit log
sensorium report --clear # show and reset
=== Sensorium Audit Report ===
Tools used: 12
Sensors triggered: 3
Blocked actions: 2
File violations: 1
--- Blocked Actions ---
2026-07-03T10:14:22 [filesystem.wipe] Bash: rm -rf /tmp/old-data
2026-07-03T10:17:05 [git.destructive] Bash: git reset --hard origin/main
--- File Sensor Violations ---
2026-07-03T10:19:11 [full_object_overwrite] src/apply.js
required: ['delta|changed_fields', 'precondition|current_hash']
--- Sensor Activity ---
external_api.write: 3x
filesystem.wipe: 2x
Every block, every warning, every proof registered — append-only, in .sensorium/state.jsonl. Nothing silently disappears.
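An append-only JSONL ledger is simple enough to sketch with the standard library. The record fields below (`ts`, `sensor`, `action`, `detail`) are assumptions, since the actual schema of `state.jsonl` is not documented here, and `append_event` and `sensor_activity` are hypothetical helper names.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def append_event(ledger: Path, sensor: str, action: str, detail: str) -> None:
    """Append one JSON object per line; existing lines are never rewritten."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "sensor": sensor,
        "action": action,
        "detail": detail,
    }
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as fh:  # "a" = append mode
        fh.write(json.dumps(record) + "\n")


def sensor_activity(ledger: Path) -> Counter:
    """Count events per sensor, in the spirit of the 'Sensor Activity' section."""
    with ledger.open(encoding="utf-8") as fh:
        return Counter(json.loads(line)["sensor"] for line in fh if line.strip())


# Demo in a temporary directory so no real project state is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / ".sensorium" / "state.jsonl"
    append_event(demo, "git.destructive", "block", "git reset --hard origin/main")
    append_event(demo, "external_api.write", "warn", "curl -X DELETE https://example.com/items/1")
    append_event(demo, "git.destructive", "block", "git clean -fxd")
    DEMO_COUNTS = sensor_activity(demo)

print(DEMO_COUNTS)
```

One JSON object per line means a crash mid-write can damage at most the final line, and a report is a single pass over the file.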
The architecture, for the nerds (me too)
Sensorium follows the State-Delta pattern:
world change (Claude tool use)
→ typed delta/event (PreToolUse / PostToolUse)
→ matching sensors wake up
→ each sensor selects the narrow context it needs
→ deterministic policy evaluates
→ allow / block / warn
→ ledger records the fact
No broad rescans. No LLM judge. No polling. A sensor declares exactly what it listens for and what invariant it protects — that's the whole contract.
.sensorium/
  state.jsonl     # append-only event ledger
  snapshots/      # file snapshots before edits (content-addressed)
  sensors.yaml    # your project-specific rules (optional)
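"Content-addressed" means a snapshot is stored under a name derived from its own bytes. The sketch below shows the idea with SHA-256. The hash choice, the flat directory layout, and the `snapshot` and `restore` helpers are all assumptions for illustration, not a description of Sensorium's on-disk format.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(file: Path, store: Path) -> str:
    """Copy `file` into `store` under its SHA-256 digest and return the digest."""
    data = file.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / digest
    if not target.exists():  # identical content is stored only once
        target.write_bytes(data)
    return digest


def restore(digest: str, store: Path, file: Path) -> None:
    """Write the snapshot identified by `digest` back to `file`."""
    file.write_bytes((store / digest).read_bytes())


# Demo in a temporary directory: snapshot, edit, then roll back.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    store = root / ".sensorium" / "snapshots"
    src = root / "apply.js"
    src.write_text("const version = 1;\n", encoding="utf-8")
    before = snapshot(src, store)
    src.write_text("const version = 2;\n", encoding="utf-8")  # the "edit"
    restore(before, store, src)
    RESTORED = src.read_text(encoding="utf-8")
    STORED = sorted(p.name for p in store.iterdir())

print(RESTORED, end="")
```

Keying by digest gives deduplication for free, and the ledger only needs to record the digest to make an edit reversible.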
The honest part
This is regex and policy, not a sandbox. It catches the direct, literal case — an agent typing a dangerous command in the open. It is not a defense against something actively trying to route around it (a script file, an interpreter, a clever quote). Self-reported proofs are exactly that — self-reported. Treat it as a seatbelt, not a cage, and you'll use it correctly.
Hook format reference
sensorium init claude-code writes this to .claude/settings.json:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook pretool" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook posttool" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook stop" }]
      }
    ]
  }
}
Exit code 2 from pretool blocks the tool. Exit 0 allows it.
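The exit-code contract is easy to try with a toy hook. As I understand the Claude Code hooks documentation, a hook command receives a JSON event on stdin with `tool_name` and `tool_input` fields, and on exit code 2 the stderr text is shown to Claude. Treat those field names as an assumption to verify. `run_hook` and the single `WIPE` rule are hypothetical and much narrower than the built-in sensors.

```python
import io
import json
import re

# Hypothetical rule: `rm -rf` whose final target is `/`, `~`, or `~/`.
WIPE = re.compile(r"\brm\s+-rf\s+(?:\S+\s+)*(?:/|~/?)(?:\s|$)")


def run_hook(raw: str, err) -> int:
    """Return the exit code for one PreToolUse event given as JSON text."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return 0  # this sketch fails open on unreadable input
    if not isinstance(event, dict) or event.get("tool_name") != "Bash":
        return 0
    command = (event.get("tool_input") or {}).get("command", "")
    if WIPE.search(command):
        err.write("Blocked: refusing to wipe / or ~\n")
        return 2  # exit code 2 blocks the tool
    return 0      # exit code 0 allows it


event = {"tool_name": "Bash", "tool_input": {"command": "rm -rf tests/ patches/ plan/ ~/"}}
stderr = io.StringIO()
print(run_hook(json.dumps(event), stderr), stderr.getvalue().strip())
```

A real entry point would end with `sys.exit(run_hook(sys.stdin.read(), sys.stderr))`. Keeping the decision in a pure function makes it testable without spawning a process.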
License
AGPL-3.0. See LICENSE.
The incidents this is built around
Not hypotheticals — these happened, and each one maps directly to a sensor above:
- Coding Agent Horror Stories: The `rm -rf ~/` Incident (Docker's write-up of the r/ClaudeAI post)
- Cursor AI coding agent deletes entire production database and backups in nine seconds (TechRadar)
- Claude Code's Silent Git Reset (dev.to)
- Why AI Coding Agents Say All Tests Pass When They Actually Fail (the 45-task benchmark)
If you're shipping agents with real filesystem/shell/API access and you're not doing this yet — you're one confidently-wrong tool call away from a bad afternoon.
Would love thoughts from anyone else building guardrails for agentic coding tools. 🙏
#AIagents #ClaudeCode #DeveloperTools #AgentSafety #OpenSource
File details
Details for the file agent_sensorium-0.1.0.tar.gz.
File metadata
- Download URL: agent_sensorium-0.1.0.tar.gz
- Upload date:
- Size: 31.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9b0bc18be75b557fca7e6d77ec28876af668ef16a363262992623b89b3c9bae8 |
| MD5 | 5b9304602e7b1ed64321cda24a74d165 |
| BLAKE2b-256 | cae7d0368ab24c59407e7af2279ba6fa542f02c17760ffee3ceae8202560c073 |
File details
Details for the file agent_sensorium-0.1.0-py3-none-any.whl.
File metadata
- Download URL: agent_sensorium-0.1.0-py3-none-any.whl
- Upload date:
- Size: 41.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4c39a7f5da84890a73f87c055720a09fcde37528f54c711035b104c1b9f7fea6 |
| MD5 | b133817756d0426617b53b5cdff8774b |
| BLAKE2b-256 | 6a105abb1c8e9479fe60cf556ce548d9678629ae2c62025a70c3a4fee2400703 |