Deterministic safety hooks for Claude Code

Sensorium 🧠🛡️

A few months ago, a developer asked Claude to clean up an old repo.

Claude ran rm -rf tests/ patches/ plan/ ~/.

That trailing ~/ expands to the home directory. It wiped years of files on their Mac. The post hit 1,500+ upvotes on r/ClaudeAI within hours — because everyone building with agents recognized exactly how it happened: not malice, just confidence with no checkpoint.

It's not an isolated story:

A founder watched a Cursor agent find an unrelated API token, decide it had permission, and delete an entire production database and its backups in 9 seconds. (Railway's CEO personally helped restore it.)
Developers on GitHub have documented Claude Code running git reset --hard, wiping hours of uncommitted work, right after telling the user the operation was "safe."
A benchmark on 45 failing test suites found agents reporting "45/45 pass" when only 26 actually did; in the other 19, the agent quietly never ran the tests that would have said otherwise.

Same root cause every time: the model's confidence and the actual safety of the action are two different variables, and nothing was checking the second one.

So I built the boundary myself. Excited to share Sensorium — deterministic safety hooks for Claude Code. 👇

The problem nobody puts in the demo video

LLM agents are incredible at writing code. They are also, occasionally, incredible at:

🔥 running rm -rf with a trailing ~/ nobody meant to include
🔥 running git reset --hard on your uncommitted work, confidently
🔥 curl -X DELETE-ing a resource because the docs made it sound safe
🔥 declaring "all tests pass" without running the ones that don't
🔥 reapplying a stale cached manifest as if it were current state

None of this is malice. It's confidence without a checkpoint. And "just review every diff" doesn't scale when the agent is running fifty tool calls a session.

The insight

You don't need a second LLM watching the first one. You need sensors — small, deterministic, boring rules that wake up on a specific kind of change, check exactly what they care about, and say yes/no/wait.

No vibes. No judgment calls. Pattern matching and policy, all the way down.

Claude wants to use a tool
  → PreToolUse hook
  → Sensorium reads the tool input
  → sensors match
  → allow / block / warn
  → tool executes (or doesn't)
  → PostToolUse hook
  → Sensorium checks the resulting file content
  → writes an audit ledger
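The flow above can be sketched as a tiny PreToolUse handler. This is a minimal illustration, not Sensorium's actual code: it assumes the hook receives a JSON payload on stdin with `tool_name` and `tool_input` fields (the shape Claude Code passes to command hooks), and the single rule plus the `decide` and `main` names are hypothetical.

```python
import json
import re
import sys

# Hypothetical rule for illustration: a recursive rm whose targets include / or ~.
WIPE = re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*\s+.*(?<!\S)(/|~/?)(\s|$)")


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message): 0 allows the tool call, 2 blocks it."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if WIPE.search(command):
        return 2, f"Blocked by filesystem.wipe: {command}"
    return 0, ""


def main() -> None:
    """Entry point a hook command would call: JSON on stdin, verdict as exit code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)  # stderr carries the reason for a block
    sys.exit(code)
```

The decision itself is a pure function of the payload, which is what keeps it deterministic and easy to test; `main` only handles the stdin and exit-code plumbing.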

What this actually catches, out of the box

Protects against	Sensor	Gate
`rm -rf /`, `rm -rf ~`, `dd ... of=/dev/sd*`	`filesystem.wipe`	unconditional block
Bulk delete without a backup	`filesystem.bulk_delete`	needs `backup_exists` + `dry_run_passed`
`git reset --hard`, `git clean -fxd`	`git.destructive`	needs `backup_exists`
`curl -X POST/PUT/DELETE/PATCH`	`external_api.write`	needs a snapshot + rollback plan
`kubectl apply/delete`, `terraform apply/destroy`, `aws ... delete`	`infra.mutation`	needs snapshot + dry-run + rollback
Direct `psql`/`mysql` writes	`db.write`	needs snapshot + rollback plan
Reapplying a stale archive/cache as truth	`data.apply_from_archive`	unconditional block
Skipped tests slipped in quietly	`test_skip_introduced`	warn, shows up in the audit report

Adding --dry-run to a command bypasses the gate: the sensors check for that flag explicitly.
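One way to read the table: every sensor is either an unconditional block or a gate that opens only when all of its required proofs are on record. The sketch below models that rule. The `GATES` mapping mirrors three rows of the table, while the `gate` function and the idea of passing proofs as a set are assumptions made for illustration, not Sensorium's API.

```python
# Required proofs per sensor; None marks an unconditional block.
# Sensor and proof names come from the table; the data layout is hypothetical.
GATES = {
    "filesystem.wipe": None,
    "filesystem.bulk_delete": {"backup_exists", "dry_run_passed"},
    "git.destructive": {"backup_exists"},
}


def gate(sensor: str, proofs: set[str]) -> tuple[str, set[str]]:
    """Return ("allow" or "block", missing proofs) for a triggered sensor."""
    required = GATES[sensor]
    if required is None:
        return "block", set()          # no proof can open this gate
    missing = required - proofs
    return ("allow", set()) if not missing else ("block", missing)


print(gate("git.destructive", {"backup_exists"}))
print(gate("filesystem.bulk_delete", {"backup_exists"}))
print(gate("filesystem.wipe", {"backup_exists", "dry_run_passed"}))
```

Returning the missing proofs alongside the verdict lets the block message say exactly what is still needed.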

Bring your own rules 🔧

This is the part I actually wanted to ship. Your project has opinions Sensorium can't guess — so tell it:

# .sensorium/sensors.yaml
sensors:
  - name: no_force_push
    description: Block git push --force (plain push still allowed)
    tools: [Bash]
    action: block
    patterns:
      # the lookahead keeps --force-with-lease from matching (a bare \b would match it)
      - 'git\s+push\b.*\s(--force|-f)(?![\w-])'
    unless:
      - '--dry-run'
    message: |
      Blocked: force-push detected. Use --force-with-lease and confirm
      with a human first.

Drop it in, no restart, no config reload — the next tool call picks it up.
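To make the `patterns` / `unless` semantics concrete, here is a small sketch of how such a rule could be evaluated: the sensor triggers when any pattern matches and no `unless` pattern does. It assumes Python `re.search` semantics, and the force-push pattern is written with a lookahead so that `--force-with-lease` does not match. `triggers` is a hypothetical helper, not part of Sensorium.

```python
import re

SENSOR = {
    "name": "no_force_push",
    "patterns": [r"git\s+push\b.*\s(--force|-f)(?![\w-])"],
    "unless": [r"--dry-run"],
}


def triggers(sensor: dict, command: str) -> bool:
    """True when any pattern matches and no `unless` pattern matches."""
    if any(re.search(p, command) for p in sensor.get("unless", [])):
        return False
    return any(re.search(p, command) for p in sensor["patterns"])


for cmd in [
    "git push --force origin main",
    "git push -f",
    "git push --force-with-lease origin main",
    "git push --force --dry-run",
    "git push origin main",
]:
    print(f"{cmd!r:45} -> {'block' if triggers(SENSOR, cmd) else 'allow'}")
```

Only the first two commands trigger the sensor; the lease variant, the dry run, and the plain push all pass through.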

Sensor fields:

tools — which Claude Code tools trigger this sensor (Bash, Edit, Write, MultiEdit)
on_file_change — glob patterns for file paths (PostToolUse, checks content)
action — block (exit 2, Claude sees the message) or warn (logged, shown in report)
patterns — regex list matched against the Bash command or file path
unless — if any of these match, the sensor does not trigger
block_if_contains — regex matched against file content after an edit
require_contains — regex that must be present in file content (absence triggers the sensor)
message — shown to Claude when blocked or warned

Install (2 minutes, I promise)

Step 1 — the CLI

pipx install agent-sensorium
# or: pip install agent-sensorium

Step 2 — wire up Claude Code

cd my-project
sensorium init claude-code            # this project only
# or: sensorium init claude-code --global   (every project)

Writes .claude/settings.json with the absolute path to the sensorium binary. Claude Code picks it up immediately.

Step 3 (optional) — your own rules

mkdir -p .sensorium
# add .sensorium/sensors.yaml, see above

That's it. Claude Code works exactly as before — Sensorium just quietly rides along on every tool call.

Receipts, not vibes

sensorium report          # show session audit log
sensorium report --clear  # show and reset

=== Sensorium Audit Report ===

Tools used:         12
Sensors triggered:  3
Blocked actions:    2
File violations:    1

--- Blocked Actions ---
  2026-07-03T10:14:22  [filesystem.wipe]  Bash: rm -rf /tmp/old-data
  2026-07-03T10:17:05  [git.destructive]  Bash: git reset --hard origin/main

--- File Sensor Violations ---
  2026-07-03T10:19:11  [full_object_overwrite]  src/apply.js
    required: ['delta|changed_fields', 'precondition|current_hash']

--- Sensor Activity ---
  external_api.write: 3x
  filesystem.wipe: 2x

Every block, every warning, every proof registered — append-only, in .sensorium/state.jsonl. Nothing silently disappears.
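An append-only JSONL ledger is simple to reason about: each event is one JSON object on its own line, and a report is just a pass over the file. The sketch below shows the idea with a hypothetical event schema (`ts`, `sensor`, `action`, `detail`). The real field names in `.sensorium/state.jsonl` are not documented here, so treat these as placeholders.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def record(ledger: Path, sensor: str, action: str, detail: str) -> None:
    """Append one event; existing lines are never rewritten."""
    event = {
        "ts": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "sensor": sensor,
        "action": action,
        "detail": detail,
    }
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def summarize(ledger: Path) -> dict:
    """Count blocked actions and per-sensor activity from the ledger."""
    lines = ledger.read_text(encoding="utf-8").splitlines()
    events = [json.loads(line) for line in lines if line]
    return {
        "blocked": sum(e["action"] == "block" for e in events),
        "activity": Counter(e["sensor"] for e in events),
    }


# Demo in a temporary directory rather than a real .sensorium/ folder.
ledger = Path(tempfile.mkdtemp()) / "state.jsonl"
record(ledger, "git.destructive", "block", "git reset --hard origin/main")
record(ledger, "external_api.write", "warn", "curl -X DELETE https://example.invalid/item/1")
print(summarize(ledger))
```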

The architecture, for the nerds (me too)

Sensorium follows the State-Delta pattern:

world change (Claude tool use)
  → typed delta/event (PreToolUse / PostToolUse)
  → matching sensors wake up
  → each sensor selects the narrow context it needs
  → deterministic policy evaluates
  → allow / block / warn
  → ledger records the fact

No broad rescans. No LLM judge. No polling. A sensor declares exactly what it listens for and what invariant it protects — that's the whole contract.
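As a rough illustration of that contract, each sensor can be modeled as a declaration of which event kinds and tools it listens for, plus a deterministic check over a narrow slice of the event. Everything below (the `Event` and `Sensor` classes, the `dispatch` function, the sample sensor) is a hypothetical model, not Sensorium's internals.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    kind: str          # "PreToolUse" or "PostToolUse"
    tool: str          # e.g. "Bash", "Edit"
    payload: dict


@dataclass(frozen=True)
class Sensor:
    name: str
    kinds: frozenset   # event kinds this sensor listens for
    tools: frozenset   # tools this sensor listens for
    select: Callable[[Event], str]    # picks the narrow context to inspect
    violates: Callable[[str], bool]   # deterministic policy over that context
    action: str = "block"


def dispatch(event: Event, sensors: list) -> list:
    """Wake only the matching sensors; return (sensor name, verdict) pairs."""
    verdicts = []
    for s in sensors:
        if event.kind in s.kinds and event.tool in s.tools:
            context = s.select(event)
            verdicts.append((s.name, s.action if s.violates(context) else "allow"))
    return verdicts


git_destructive = Sensor(
    name="git.destructive",
    kinds=frozenset({"PreToolUse"}),
    tools=frozenset({"Bash"}),
    select=lambda e: e.payload.get("command", ""),
    violates=lambda cmd: re.search(r"git\s+reset\s+--hard", cmd) is not None,
)

event = Event("PreToolUse", "Bash", {"command": "git reset --hard origin/main"})
print(dispatch(event, [git_destructive]))
```

Sensors that do not listen for the event's kind or tool are never called, which is the "no broad rescans" property in miniature.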

.sensorium/
  state.jsonl      # append-only event ledger
  snapshots/       # file snapshots before edits (content-addressed)
  sensors.yaml     # your project-specific rules (optional)
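"Content-addressed" means a snapshot is stored under a name derived from its bytes, so identical content is stored once and a snapshot can be verified by rehashing it. A minimal sketch, assuming SHA-256 as the hash and a flat `snapshots/` directory; both choices and the `snapshot` helper are assumptions rather than documented behavior.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(source: Path, store: Path) -> Path:
    """Copy `source` into `store` under its SHA-256 digest and return the new path."""
    data = source.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    store.mkdir(parents=True, exist_ok=True)
    target = store / digest
    if not target.exists():            # identical content is stored only once
        target.write_bytes(data)
    return target


# Demo in a temporary directory rather than a real project.
workdir = Path(tempfile.mkdtemp())
original = workdir / "apply.js"
original.write_text("export const apply = () => {};\n", encoding="utf-8")

first = snapshot(original, workdir / "snapshots")
second = snapshot(original, workdir / "snapshots")   # same bytes, same address
print(first.name[:12], first == second)
```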

The honest part

This is regex and policy, not a sandbox. It catches the direct, literal case — an agent typing a dangerous command in the open. It is not a defense against something actively trying to route around it (a script file, an interpreter, a clever quote). Self-reported proofs are exactly that — self-reported. Treat it as a seatbelt, not a cage, and you'll use it correctly.
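A short demonstration of that limit: a literal-match rule sees the direct command but not the same deletion wrapped in a script, an interpreter, or a variable. The rule below is a hypothetical stand-in for a wipe sensor, not Sensorium's actual pattern.

```python
import re

# Hypothetical stand-in for a wipe rule: `rm -rf` whose targets include ~ or ~/.
RULE = re.compile(r"\brm\s+-rf\s+(.*\s)?~/?(\s|$)")

# These strings are only matched against the rule, never executed.
commands = {
    "rm -rf build/ ~/": "direct, literal form",
    "bash cleanup.sh": "same command hidden in a script file",
    "python -c \"import shutil, os; shutil.rmtree(os.path.expanduser('~'))\"": "interpreter",
    "rm -rf \"$HOME\"": "variable instead of ~",
}

for cmd, note in commands.items():
    verdict = "caught" if RULE.search(cmd) else "missed"
    print(f"{verdict:6}  {note}: {cmd}")
```

Only the first entry is caught. The other three reach the same outcome through routes a pattern on the command string cannot see, which is why a real sandbox or restricted permissions still matter.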

Hook format reference

sensorium init claude-code writes this to .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook pretool" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Edit|Write|MultiEdit",
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook posttool" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": "/path/to/sensorium hook stop" }]
      }
    ]
  }
}

Exit code 2 from pretool blocks the tool. Exit 0 allows it.
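For anyone who prefers to generate or inspect this file programmatically, the structure is easy to build. The sketch below reproduces the hook configuration above for a given binary path. `hook_settings` is a hypothetical helper, and merging into an existing settings.json is deliberately left out.

```python
import json

MATCHER = "Bash|Edit|Write|MultiEdit"


def hook_settings(binary: str) -> dict:
    """Build the hooks section for an absolute path to the sensorium binary."""
    def command(stage: str) -> dict:
        return {"type": "command", "command": f"{binary} hook {stage}"}

    return {
        "hooks": {
            "PreToolUse": [{"matcher": MATCHER, "hooks": [command("pretool")]}],
            "PostToolUse": [{"matcher": MATCHER, "hooks": [command("posttool")]}],
            "Stop": [{"hooks": [command("stop")]}],
        }
    }


print(json.dumps(hook_settings("/path/to/sensorium"), indent=2))
```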

License

AGPL-3.0. See LICENSE.

The incidents this is built around

Not hypotheticals — these happened, and each one maps directly to a sensor above:

Coding Agent Horror Stories: The rm -rf ~/ Incident — Docker's write-up of the r/ClaudeAI post
Cursor AI coding agent deletes entire production database and backups in nine seconds — TechRadar
Claude Code's Silent Git Reset — dev.to
Why AI Coding Agents Say All Tests Pass When They Actually Fail — the 45-task benchmark

If you're shipping agents with real filesystem/shell/API access and you're not doing this yet — you're one confidently-wrong tool call away from a bad afternoon.

Would love thoughts from anyone else building guardrails for agentic coding tools. 🙏

#AIagents #ClaudeCode #DeveloperTools #AgentSafety #OpenSource

This version: 0.1.0 (Jul 2, 2026)
