Contextual blast-radius scoring for shell commands an AI agent is about to run
Blast Scope
A consequence engine for shell commands. Blast Scope scores what a command would actually do — before an AI agent (or you) runs it. It doesn't pattern-match syntax into a blocklist; it figures out the command's real target, observes that target with a safe, read-only probe, and returns a structured risk score with evidence.
The whole point is contextual blast radius. The same command gets a completely different score depending on what it would actually hit:
COMMAND                           SEVERITY  WHY                                           ADVICE
────────────────────────────────  ────────  ────────────────────────────────────────────  ───────
rm -rf ./logs                     LOW       0 importers · regenerable · outside src       proceed
rm -rf ./config                   CRITICAL  8 modules import it · high PageRank hub       block
git reset --hard (clean tree)     LOW       nothing uncommitted to discard                proceed
git reset --hard (4 dirty files)  HIGH      would throw away 4 files of uncommitted work  confirm
git push --force (protected)      CRITICAL  would orphan commits on a protected branch    block
docker volume rm cache (absent)   LOW       volume doesn't exist, nothing to remove       proceed
docker volume rm pgdata (in use)  CRITICAL  holds data · in use · no image to rebuild     block
pip uninstall flask (uv.lock)     LOW       regenerable: exact version pinned in lock     proceed
DROP TABLE users (42 rows)        CRITICAL  schema + 42 rows · irreversible               block
DELETE FROM logs (in txn)         HIGH      no WHERE, but inside a txn, ROLLBACK-able     confirm
Two commands can be byte-identical and still land at opposite ends of the four-band scale (low vs. critical). That gap is the product.
Not a blocklist. Not a replacement for Shellfirm. Not a syscall monitor. It scores structural consequence — advisory, never blocking — and for the rare critical command it captures an undo snapshot first.
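To make the "same command, different score" idea concrete, here is a minimal sketch of the `git reset --hard` rows from the table above. It is not Blast Scope's implementation: the function names and the two-way low/high split are assumptions for illustration. The probe uses `git --no-optional-locks status --porcelain`, which reports the working tree without taking optional locks; untracked entries (`??`) are skipped because a hard reset does not touch them.

```python
import subprocess


def count_dirty_files(porcelain_output: str) -> int:
    """Count tracked paths with uncommitted changes in `git status --porcelain` output."""
    return sum(
        1
        for line in porcelain_output.splitlines()
        if line.strip() and not line.startswith("??")  # untracked files survive a hard reset
    )


def severity_for_hard_reset(dirty_files: int) -> str:
    # Hypothetical mapping that mirrors the table: clean tree is low, dirty tree is high.
    return "low" if dirty_files == 0 else "high"


def probe_working_tree(repo_dir: str) -> int:
    """Read-only probe of the working tree (assumes git is installed and repo_dir is a repo)."""
    result = subprocess.run(
        ["git", "--no-optional-locks", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_dirty_files(result.stdout)
```

Splitting the parsing (`count_dirty_files`) from the subprocess call keeps the scoring logic testable without a real repository.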
How it works
A command flows through a cheap funnel: almost everything is recognized as non-destructive in microseconds and exits silent. Only a flagged destructive candidate pays for a probe.
shell command
│ split chains (&& || ; |) · de-alias PowerShell · parse flags/targets
▼
┌──────────────────────────────────────────────────────────────────────┐
│ STAGE 1 · triage (near-free regex — runs on every command) │
│ which class? git · docker · pip/uv · sql · else filesystem │
│ destructive? `git status` → no. `git reset --hard` → yes ↓ │
└───────────────────────────────┬──────────────────────────────────────┘
destructive candidate │ (everything else exits here, silent)
▼
┌──────────────────────────────────────────────────────────────────────┐
│ ELIGIBILITY FILTER safe read-only probe? AND undo authorable? │
└──────────────┬──────────────────────────────────────┬─────────────────┘
yes, probe it │ no probe here / now │
▼ ▼
STAGE 2 · safe probe (read-only) heuristic estimate
git status · reflog · rev-list from a static per-class
docker inspect · ps · ls table — and LABELED
sqlite SELECT count(*) [mode=ro] "(estimated)" so you
pip/uv read lockfiles know it wasn't probed
│ │
└─────────────────────┬─────────────────────┘
▼
blast radius × reversibility (combined PER CLASS — no global formula)
filesystem also folds in: dependency-graph centrality + recoverability
▼
score 0.0–1.0 → severity (low / medium / high / critical)
▼
PreToolUse hook: silent (low/med) · advise (high) · advise + snapshot (critical)
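Stage 1 of the funnel above can be pictured as a handful of anchored regular expressions. The sketch below is an assumption-heavy illustration, not the project's triage table: the patterns, the `triage` name, and the class labels are all hypothetical, and real shell parsing needs more care than a regex.

```python
import re
from typing import Optional

# Hypothetical patterns; the real triage rules are not shown in this README.
_DESTRUCTIVE = {
    "git": re.compile(r"^git\s+(reset\s+--hard|push\s+.*--force|branch\s+-D|clean\s+-\w*f)"),
    "docker": re.compile(r"^docker\s+(volume\s+rm|system\s+prune|rm\s+-f)"),
    "pip/uv": re.compile(r"^(uv\s+)?pip\s+uninstall"),
    "sql": re.compile(
        r"\b(DROP|TRUNCATE)\s+TABLE\b|\bDELETE\s+FROM\b(?!.*\bWHERE\b)",
        re.IGNORECASE,
    ),
    "filesystem": re.compile(r"^(rm|mv)\s"),
}


def triage(command: str) -> Optional[str]:
    """Return the class of a destructive candidate, or None to exit the funnel silently."""
    cmd = command.strip()
    for class_name, pattern in _DESTRUCTIVE.items():
        if pattern.search(cmd):
            return class_name
    return None
```

Only commands for which `triage` returns a class would go on to the eligibility filter; everything else costs one pass over a few compiled patterns.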
The eligibility filter is the design boundary. A command class earns a live probe only when both hold: (1) its impact is observable by a strictly side-effect-free read (HTTP-GET sense — never mutate state to assess state), and (2) its undo story is well-known enough to encode in a static table. When a probe can't run here and now (no docker daemon, no DB driver, no creds), the tool degrades to a labeled estimate — it never guesses silently, and it never blocks.
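The degrade-to-estimate rule described above might look roughly like this. Everything here is a sketch under stated assumptions: `Assessment`, `assess_with_fallback`, and the numbers in `STATIC_ESTIMATES` are invented for illustration (the 0.9 for SQL simply echoes the example output later in this README).

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Assessment:
    score: float
    evidence: str
    estimated: bool  # True when no live probe ran


# Hypothetical static fallback scores; the real per-class tables are in docs/heuristics.md.
STATIC_ESTIMATES = {"sql": 0.9, "docker": 0.7}


def assess_with_fallback(
    class_name: str,
    can_probe: bool,
    probe: Callable[[], Tuple[float, str]],
) -> Assessment:
    """Use the read-only probe when it can run here and now, else a labeled estimate."""
    if can_probe:
        score, evidence = probe()
        return Assessment(score, evidence, estimated=False)
    # No daemon, driver, or credentials: fall back, and say so in the evidence.
    return Assessment(
        STATIC_ESTIMATES[class_name],
        f"{class_name}: impact not observed (estimated)",
        estimated=True,
    )
```

The point of the `estimated` flag is that a caller can always tell a measured score from a table lookup.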
See docs/heuristics.md for the per-class tables, the exact filesystem formula, and calibration.
The five command classes
| Class | Destructive ops it scores | Safe (read-only) probe | Reversibility signal |
|---|---|---|---|
| Filesystem | rm -rf, mv, > truncate | dependency graph + git status | git-tracked? regenerable? secret? precious? |
| Git | reset --hard, push --force, branch -D, clean -fdx | status · reflog · rev-list · rev-parse @{u} | reflog window · remote ahead · protected branch |
| Docker | volume rm, system prune -a, rm -f | volume inspect · ps -a · volume ls | volume → none · container → recreatable from image |
| pip / uv | pip uninstall, uv pip uninstall | read lockfile / manifest (no subprocess) | lockfile present → fully regenerable |
| SQL | DROP, TRUNCATE, DELETE without WHERE | SQLite: SELECT count(*) mode=ro; transaction check | inside a transaction? backup posture? |
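As an illustration of the SQL row, a read-only SQLite row count can be done with the standard library by opening the database through a `mode=ro` URI. This is a sketch, not the project's probe; the function name is hypothetical. The table name is checked against `sqlite_master` first because identifiers cannot be bound as query parameters.

```python
import sqlite3
from pathlib import Path


def count_rows_readonly(db_path: str, table: str) -> int:
    """Count rows in `table` over a connection that cannot write to the database."""
    # as_uri() percent-encodes the path; mode=ro makes SQLite reject any write.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        row = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        if row is None:
            raise LookupError(f"no such table: {table}")
        quoted = '"' + row[0].replace('"', '""') + '"'
        return conn.execute(f"SELECT count(*) FROM {quoted}").fetchone()[0]
    finally:
        conn.close()
```

If the file does not exist, `mode=ro` makes the connection fail instead of silently creating an empty database, which is the behavior a probe wants.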
New classes drop in behind one protocol (triage / probe_commands / assess) in src/blast_scope/classes/; the probe surface each class declares is asserted read-only by the test suite, so no probe can ever mutate state.
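One way to picture that protocol is with `typing.Protocol`. The three method names come from this README; the signatures, the `Candidate` fields, the `GitClass` body, and the allowlist check are assumptions made for the sketch, not the project's actual contracts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol, Tuple


@dataclass(frozen=True)
class Candidate:
    class_name: str
    argv: Tuple[str, ...]


class ConsequenceClass(Protocol):
    def triage(self, command: str) -> Optional[Candidate]: ...
    def probe_commands(self, candidate: Candidate) -> List[List[str]]: ...
    def assess(self, candidate: Candidate, probe_output: Dict[str, str]) -> Dict[str, object]: ...


# Hypothetical allowlist of git verbs treated as read-only in this sketch.
READ_ONLY_GIT_VERBS = {"status", "reflog", "rev-list", "rev-parse"}


class GitClass:
    def triage(self, command: str) -> Optional[Candidate]:
        argv = tuple(command.split())
        if argv[:3] == ("git", "reset", "--hard"):
            return Candidate("git", argv)
        return None

    def probe_commands(self, candidate: Candidate) -> List[List[str]]:
        return [["git", "status", "--porcelain"], ["git", "reflog", "-n", "1"]]

    def assess(self, candidate: Candidate, probe_output: Dict[str, str]) -> Dict[str, object]:
        dirty = len(probe_output.get("status", "").splitlines())
        return {"severity": "high" if dirty else "low", "estimated": False}


def probes_are_read_only(commands: List[List[str]]) -> bool:
    """The kind of check a test suite could run over every declared probe."""
    return all(argv[0] == "git" and argv[1] in READ_ONLY_GIT_VERBS for argv in commands)
```

Declaring probes as data (lists of argv) rather than running them inline is what makes a read-only assertion possible before anything executes.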
Status
v0.3.1 — calibrated multi-class guardrail with a precise dependency graph.
| Capability | Module |
|---|---|
| Flag/operand-sensitive command model (POSIX and PowerShell) | command_effects.py, command_parser.py |
| Recoverability classification (git state, secrets, regenerable, precious data) | recoverability.py |
| Dependency graph + weighted PageRank centrality, incremental indexing | graph_resolver.py, centrality.py |
| Two-axis, evidence-based filesystem scoring | risk_scorer.py |
| Command-class probes — git / docker / pip·uv / SQL, behind one protocol | classes/ |
| Out-of-graph path analyzers (infra / config-by-path) + git base | consequences.py, vcs.py, infra.py, config_refs.py |
| PreToolUse hook + tarball snapshot/undo | hook.py, snapshot.py |
| Eval harness + labeled corpus + calibration | eval.py, tests/fixtures/eval_corpus.jsonl |
Calibration. Two harnesses, both run-it-yourself:
- In-repo corpus (tests/fixtures/eval_corpus.jsonl): 38 cases spanning every recoverability category, git working-tree state, infra/config, rm -rf .git, a graph-indexed central module, and the git/docker/pip/SQL classes. Result: 38/38 exact severity, gate F1 1.00, pinned by tests/test_eval.py with headroom so changes can't silently regress.
- SABER: 716 real coding-agent workspaces. Against ~1725 safe commands, blast-scope's false-positive rate is 0.4%; on its core competency (data_destruction) it catches 76.5% of injected attacks on realistic workspaces (82% with the dependency graph built). The per-category recall is deliberately uneven, and the table says so: blast-scope scores destructive consequence, meaning filesystem/data loss plus git/docker/pip/SQL state. Network exfiltration and persistence are a different threat model, out of scope by design rather than an unfinished corner. That's the boundary, drawn on purpose. See bench/.
uv run python -m blast_scope.eval # in-repo corpus
python bench/saber_eval.py --tasks <saber>/dataset/data/tasks.jsonl # SABER
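For readers who want to sanity-check figures like "gate F1" or the false-positive rate, the standard definitions are sketched below. This is not the project's eval harness: the function name and the input shape (pairs of "should flag" and "did flag") are assumptions. As a rough scale check, a 0.4% false-positive rate over ~1725 safe commands is on the order of seven flagged commands.

```python
from typing import Dict, Iterable, Tuple


def gate_metrics(cases: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """cases: (should_flag, did_flag) pairs; returns precision, recall, F1, and FPR."""
    tp = fp = fn = tn = 0
    for should_flag, did_flag in cases:
        if should_flag and did_flag:
            tp += 1
        elif should_flag and not did_flag:
            fn += 1
        elif not should_flag and did_flag:
            fp += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}
```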
Installation
The fastest path for any MCP client is zero-install via uvx (no clone, no venv):
uvx blast-scope # runs the MCP server on stdio
Claude Code users — one line wires up both the MCP tools and the advisory hook:
/plugin marketplace add Atharva-Jayappa/blast-scope
/plugin install blast-scope
For development, or to pin a checkout:
git clone https://github.com/Atharva-Jayappa/blast-scope.git
cd blast-scope && uv sync --all-extras
Usage
As an MCP server
Add to your MCP client config (e.g. Claude Code settings.json):
{
"mcpServers": {
"blast-scope": { "command": "uvx", "args": ["blast-scope"], "type": "stdio" }
}
}
Tools exposed:
| Tool | Purpose |
|---|---|
| assess_command(command, cwd?, project_root?) | Score a (possibly chained) command. Returns score, severity, rationale, evidence, recoverability, affected nodes, and a per-segment chain breakdown. |
| index_project(project_root) | Force a dependency-graph rebuild (auto-built on first use otherwise). |
| list_snapshots(project_root) | List undo snapshots, newest first. |
| restore_snapshot(snapshot_id, project_root) | Undo a risky command by restoring its snapshot. |
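The "possibly chained" handling in assess_command implies splitting a command line on `&&`, `||`, `;`, and `|` before scoring each segment. A minimal sketch with the standard library's `shlex` follows; the function names are hypothetical, and taking the worst segment as the chain's severity is an assumption, since this README does not say how segments are combined.

```python
import shlex
from typing import List

CHAIN_OPERATORS = {"&&", "||", ";", "|"}
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def split_chain(command: str) -> List[List[str]]:
    """Split a shell command into argv segments at chain operators, respecting quotes."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    segments: List[List[str]] = []
    current: List[str] = []
    for token in lexer:
        # Caveat: a quoted token that is exactly an operator (e.g. "&&") is not distinguished.
        if token in CHAIN_OPERATORS:
            if current:
                segments.append(current)
                current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments


def worst_severity(severities: List[str]) -> str:
    # Assumption: a chain is as risky as its riskiest segment.
    return max(severities, key=SEVERITY_ORDER.index)
```

For example, `split_chain("cd build && rm -rf ./logs")` yields one segment for `cd build` and one for `rm -rf ./logs`, each of which could then be scored on its own.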
As a PreToolUse hook (tiered advice + auto-snapshot)
Intercept Bash commands before they run; the hook is advisory and never blocks. Volume
scales with stakes: silent on low/medium, advise on high, advise +
snapshot on critical. The snapshot skips what's already recoverable
(git-clean, regenerable) and, for anything over a hard size cap, warns instead of
archiving it, so the undo net stays fast and trustworthy. Add to .claude/settings.json:
{
"hooks": {
"PreToolUse": [
{ "matcher": "Bash",
"hooks": [{ "type": "command", "command": "python -m blast_scope.hook" }] }
]
}
}
Full details and the undo flow: docs/hook.md.
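The tiering described in this section reduces to a small decision table. The sketch below assumes the hook receives a JSON event carrying `tool_name` and `tool_input.command` (the shape Claude Code passes to PreToolUse hooks, as far as this sketch assumes) and that `assess` is a stand-in callable returning a dict with `severity` and `rationale`. It is not the code in hook.py.

```python
import json
from typing import Callable, Dict


def tier_action(severity: str) -> Dict[str, bool]:
    """Silent on low/medium, advise on high, advise + snapshot on critical."""
    return {
        "advise": severity in ("high", "critical"),
        "snapshot": severity == "critical",
    }


def handle_event(raw_event: str, assess: Callable[[str], Dict[str, str]]) -> Dict[str, object]:
    """Decide what an advisory hook would do for one tool-call event; never blocks."""
    event = json.loads(raw_event)
    if event.get("tool_name") != "Bash":
        return {"advise": False, "snapshot": False, "message": ""}
    command = event.get("tool_input", {}).get("command", "")
    result = assess(command)  # hypothetical scorer, injected for testability
    action: Dict[str, object] = dict(tier_action(result["severity"]))
    action["message"] = result["rationale"] if action["advise"] else ""
    return action
```

Note that no branch returns a "deny" decision: even at critical, the sketch only advises and requests a snapshot, matching the advisory stance above.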
Example output
A filesystem command, scored against the dependency graph:
// assess_command("rm -rf ./config", project_root="/proj")
{
"score": 0.93,
"severity": "critical",
"recommendation": "block",
"recoverability": "untracked",
"rationale": "rm targets config. 8 direct importer(s), 14 total affected. not git-tracked. recursive deletion. CRITICAL risk.",
"evidence": [
"8 importer(s), 14 affected node(s)",
"high centrality (PageRank 0.91) — a hub other code routes through",
"untracked — not in git history",
"recursive — applies to every file underneath"
],
"affected_nodes": [ /* ... */ ],
"chain": [ /* per-segment breakdown */ ]
}
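The "high centrality" evidence above rests on weighted PageRank over the import graph. The following is a generic pure-Python power-iteration sketch of that technique, not the code in centrality.py: the edge direction (importer points at the module it imports), the damping factor, and the iteration count are assumptions. Raw PageRank values sum to 1 across nodes, so a figure like 0.91 presumably reflects some rescaling this README does not describe.

```python
from typing import Dict


def weighted_pagerank(
    edges: Dict[str, Dict[str, float]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """edges[src][dst] = weight, where src imports dst. Returns a rank per node."""
    nodes = set(edges) | {dst for outs in edges.values() for dst in outs}
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    out_weight = {node: sum(edges.get(node, {}).values()) for node in nodes}
    for _ in range(iterations):
        # Nodes with no outgoing weight spread their rank evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src, outs in edges.items():
            if out_weight[src] == 0:
                continue
            for dst, weight in outs.items():
                new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank
```

In a toy graph where `app`, `cli`, and `tests` all import `config`, `config` ends up with the highest rank, which is the "hub other code routes through" signal in the evidence list.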
A command class that couldn't probe — note the labeled estimate (no Postgres driver, server possibly remote, so the tool refuses to guess silently):
// assess_command('psql -c "DROP TABLE users"')
{
"score": 0.9,
"severity": "critical",
"recommendation": "block",
"evidence": [
"drops users — its schema and all rows, irreversible (estimated — no read-only probe for postgres)"
]
}
// the same DROP against a local SQLite file probes for real:
// "drops users — its schema and 42 row(s), irreversible" (estimated: false)
Development
uv sync --all-extras
uv run pytest -q # full suite
uv run python -m blast_scope.eval # scoring accuracy report
Project structure
blast-scope/
├── src/blast_scope/
│ ├── server.py # MCP server + tools (assess, index, snapshots)
│ ├── command_parser.py # shell → structured intent (POSIX + PowerShell)
│ ├── command_effects.py # command/flag/operand → intent + weight
│ ├── recoverability.py # path → how recoverable if destroyed
│ ├── graph_resolver.py # paths → dependency-graph impact (+ PageRank)
│ ├── centrality.py # pure-Python weighted PageRank
│ ├── risk_scorer.py # signals → score + severity + evidence
│ ├── classes/ # command-class probes behind one protocol
│ │ ├── __init__.py # Candidate · ConsequenceClass · registry
│ │ ├── git.py # reflog / upstream-divergence / protected branch
│ │ ├── docker.py # volume / container / system-prune probes
│ │ ├── packages.py # pip·uv uninstall vs. lockfile presence
│ │ └── sql.py # DROP/TRUNCATE/DELETE — SQLite probe + estimates
│ ├── consequences.py # coordinator: class probes + path analyzers
│ ├── vcs.py / infra.py / config_refs.py # git base + path analyzers
│ ├── hook.py # PreToolUse advisory hook
│ ├── snapshot.py # tarball snapshot / restore / list
│ ├── eval.py # evaluation harness + metrics
│ └── vendor/crg/ # vendored from code-review-graph (MIT)
├── tests/ # 298 tests incl. eval regression guard
│ └── fixtures/eval_corpus.jsonl # labeled calibration corpus
└── docs/
├── heuristics.md # scoring model + per-class tables + calibration
└── hook.md # hook registration + undo
Roadmap
- Lift recall on the destruction classes (glob targets over tracked files, find-based deletion variants); the SABER per-category table is the worklist.
- Optional live probes for Postgres/MySQL (in-process, read-only) once a driver policy is settled; today those engines degrade to labeled estimates.
- PowerShell-shell awareness in the hook path (the MCP tool already supports it).
- Optional richer interception modes beyond advisory.
See CLAUDE.md for the full spec, contracts, and design rules.