Guards LLM agents from accidental large-context file reads, web fetches, and tool output.
GOT
Guardians of the Token keeps agentic coding sessions from wasting context. It runs locally and watches the moves that quietly blow up your context window — reading a huge file, fetching a big page, dumping noisy command output, or drifting off-topic in a long session — and pauses them before they cost you.
Why GOT
One bad move can burn your whole context window:
- reading a giant log or transcript into chat
- dumping command output that should have gone to a file
- fetching a large page directly into the model
- sending an off-topic prompt into an already-huge session
GOT turns those into a clear pause instead of a silent loss:
🛡️ Guardians of the Token blocked this command.
Target: /tmp/big.log
Estimate: ~200,000 tokens (50% of the 400,000-token window)
Next options:
- Inspect the beginning / end
- Search for a term
- Summarize a bounded section
- Bypass once for the full file
Everything runs locally with lightweight token estimates — no data leaves your machine.
Install
pip install guardians-of-the-token
guardians-install
guardians-install detects your local clients (Claude Code, Codex, Claude
Desktop) and lets you pick what to enable with a checkbox selector. Add --yes
to enable everything detected without prompting. If global installs are blocked,
use pipx install guardians-of-the-token first.
That's it — the guards are now active. The rest of this README is what they do and how to tune them.
What you get
| Client | What GOT guards |
|---|---|
| Claude Code | Read, Bash, WebFetch, oversized output, off-topic prompts, and cold-start continuity |
| Codex | risky Bash file dumps, URL fetches, oversized output |
| Claude Desktop / MCP | bounded file tools via guardians-mcp (see Advanced) |
1. Context guard
Before a risky Read, Bash, or WebFetch runs, GOT estimates its token cost
from file size, URL metadata, or command shape. If it's too large it blocks the
call and suggests bounded alternatives (inspect, search, summarize). Oversized
command output is trimmed before it reaches the model.
Need the full payload anyway? Bypass once:
touch /tmp/guardians_bypass
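The estimate-then-block step can be sketched roughly as follows. This is a minimal illustration, not GOT's actual code: the 4-bytes-per-token ratio, the function names, and the reuse of `warn_threshold_pct` as the blocking cut-off are all assumptions made for the example.

```python
import os

# Assumed heuristic: roughly 4 bytes of text per token (not taken from GOT).
BYTES_PER_TOKEN = 4


def estimate_tokens(size_bytes: int) -> int:
    """Estimate a token count from a byte size without reading any content."""
    return size_bytes // BYTES_PER_TOKEN


def should_block(size_bytes: int, window_tokens: int,
                 warn_threshold_pct: float = 20) -> bool:
    """Return True when the estimate reaches the given share of the window."""
    share_pct = 100 * estimate_tokens(size_bytes) / window_tokens
    return share_pct >= warn_threshold_pct


def check_file(path: str, window_tokens: int = 200_000) -> bool:
    """Apply the same check to a real file, using only its size on disk."""
    return should_block(os.path.getsize(path), window_tokens)
```

Under these assumptions an 800,000-byte log is estimated at 200,000 tokens, half of a 400,000-token window, so it would be blocked; a 4 kB file would pass.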
2. Prompt guard (Claude Code)
Once a session passes ~30% of the context window, GOT checks each new prompt for topic drift. If a prompt looks unrelated to what you've been doing, it's blocked before Claude reads it — saving a full round-trip of input tokens.
🛡️ Guardians blocked this prompt before Claude processed it.
Reason: this looks unrelated to the current large Claude session.
Similarity: 0.07 (block threshold 0.10)
Context: 168.9k / 200.0k tokens (84%)
Estimated cost if sent: $0.5068
To continue anyway, resend the prompt prefixed with GOT_UNBLOCK.
Running /got-unblock is an alternative to the GOT_UNBLOCK prefix.
A small ONNX model (~22 MB) is fetched automatically on install; re-fetch it
anytime with guardians-download-models.
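The decision the prompt guard makes can be sketched as below. This is a hedged illustration only: it assumes the similarity score is a cosine similarity between embedding vectors, and the function names are hypothetical. The parameter names and defaults mirror the `prompt_guard` keys shown under Configuration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_block_prompt(prompt: str, similarity: float,
                        context_used: int, context_window: int,
                        block_context_pct: float = 0.30,
                        very_low_similarity: float = 0.10,
                        unblock_prefix: str = "GOT_UNBLOCK") -> bool:
    """Block only large sessions, only very dissimilar prompts, never bypassed ones."""
    if prompt.startswith(unblock_prefix):
        return False  # explicit one-off bypass
    if context_used / context_window < block_context_pct:
        return False  # session is still small; drift is cheap
    return similarity < very_low_similarity
```

With the numbers from the message above (similarity 0.07, 168.9k of 200k tokens used), this sketch blocks the prompt, and the same prompt prefixed with GOT_UNBLOCK goes through.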
3. Cold-start resume (Claude Code)
Long sessions eventually compact, and the next session starts cold. As a session
approaches its limit, GOT saves a snapshot of what you were doing to
.got/project_state.md — the latest summary, your recent goals, files touched,
and recent commands — built entirely from the local transcript.
On your next cold start in that project, GOT tells you a snapshot exists and asks if you want to resume. To load it:
/got-resume
Claude reads the snapshot and picks up where you left off. Nothing is injected unless you ask.
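The two timing decisions behind this feature, when to save and when a snapshot is still worth offering, might look like the sketch below. The function names are hypothetical, and the reading of `max_age_hours` as "do not offer snapshots older than this" is an assumption; the defaults come from the `project_state` keys shown under Configuration.

```python
from datetime import datetime, timedelta


def should_save_snapshot(context_used: int, context_window: int,
                         save_context_pct: float = 0.70) -> bool:
    """Save once the session has used the configured share of its window."""
    return context_used / context_window >= save_context_pct


def snapshot_is_fresh(saved_at: datetime, now: datetime,
                      max_age_hours: float = 168) -> bool:
    """Treat a snapshot as usable while it is no older than max_age_hours."""
    return now - saved_at <= timedelta(hours=max_age_hours)
```

With the default of 168 hours, a snapshot saved six days ago would still be offered and one saved eight days ago would not.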
Configuration
User config lives at ~/.guardians.json; per-project overrides go in
.guardians.toml. Common knobs:
{
"warn_threshold_pct": 20,
"max_output_tokens": 8000,
"telemetry_enabled": false,
"prompt_guard": {
"enabled": true,
"block_context_pct": 0.30,
"very_low_similarity": 0.10,
"unblock_prefix": "GOT_UNBLOCK"
},
"project_state": {
"enabled": true,
"save_context_pct": 0.70,
"max_age_hours": 168
}
}
Set any feature's "enabled": false to turn it off. In .guardians.toml you can
also whitelist known-safe paths so agents never get blocked on them:
whitelist = ["README.md", "docs/**"]
ignore = ["node_modules/**", ".git/**"]
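As a rough illustration of how whitelist patterns like these could be matched, the sketch below uses Python's `fnmatch` module. How GOT actually evaluates the patterns is not stated here, so the matching rules and the `is_whitelisted` helper are assumptions; note that in `fnmatch` a `*` also matches `/`, which only approximates shell-style `**` globbing.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def is_whitelisted(path: str, whitelist: Iterable[str]) -> bool:
    """Return True if a project-relative path matches any whitelist pattern."""
    return any(fnmatchcase(path, pattern) for pattern in whitelist)


WHITELIST = ["README.md", "docs/**"]

print(is_whitelisted("docs/guide/intro.md", WHITELIST))
print(is_whitelisted("src/main.py", WHITELIST))
```

Under these rules `README.md` and anything below `docs/` count as known-safe, while `src/main.py` stays guarded.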
Reports
GOT logs every block, trim, and snapshot locally to .got/events.jsonl.
guardians-report # text summary of tokens and dollars saved
guardians-dashboard # local web dashboard at 127.0.0.1:8766
Telemetry is off by default. If you opt in, GOT sends a single anonymous
install event (install ID, version, Python version, OS) — never paths, prompts,
content, commands, or token counts. Toggle with GUARDIANS_TELEMETRY=0|1.
Keep GOT current:
guardians update # update from PyPI
guardians update --check # check only
Advanced
These surfaces are optional and aimed at non-hook workflows:
- guardians-mcp: MCP server giving Claude Desktop projects bounded file tools (got_file_size, got_file_head, got_file_search, …) and a preflight policy. Initialize a project with guardians-project-init /path/to/project.
- guardians-proxy: minimal FastAPI proxy that estimates request size for Anthropic/OpenAI-style payloads before forwarding (experimental).
- guardians-test-server: local fixture server for testing URL guards without downloading large payloads.
Manual or scoped installs are also available: run
guardians-claude-install /path/to/workspace, or pass --global instead of a
path. guardians-codex-install accepts the same arguments.
GOT is early, practical infrastructure for local LLM workflows, focused on preventing accidental context loss in Claude Code and Codex.
File details
Details for the file guardians_of_the_token-1.3.0.tar.gz.
File metadata
- Download URL: guardians_of_the_token-1.3.0.tar.gz
- Upload date:
- Size: 69.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 50a89c93a08e8f5508655650d6e5647d2cb5dda9468bbf4364395e4fa614e02d |
| MD5 | 331ebf3d82aa7ae3b05cc0a1d44cb9ea |
| BLAKE2b-256 | e9315216996aa21b2a51eb85046b4148041f573bf06f01bb6df549f5e7e5228c |
File details
Details for the file guardians_of_the_token-1.3.0-py3-none-any.whl.
File metadata
- Download URL: guardians_of_the_token-1.3.0-py3-none-any.whl
- Upload date:
- Size: 66.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b84e4844e23918f7d990a40b958089dbb2554f1372caa7f78bee83af36bffd13 |
| MD5 | 99fffde369e7aabd60b732f5cc4a7c64 |
| BLAKE2b-256 | d78f2e3ed444b1614bcea6dee0c5ede844bfe3b30111f2a60a3772a1d8e2354a |