2x the effective context with smart weight-loss for Claude Code — prune bloated sessions, protect agent teams from compaction, monitor token usage with MCP tools
Cozempic
50,000+ power users trust Cozempic to keep their Claude Code sessions lean.
Context cleaning for Claude Code — remove the bloat, keep everything that matters, protect Agent Teams from context loss.
What It Does
Claude Code sessions fill up with dead weight: progress ticks, thinking blocks, stale file reads, duplicate CLAUDE.md injections, base64 screenshots, oversized tool outputs, and metadata bloat. A typical session carries 8-46MB — most of it noise that inflates every API call.
Cozempic removes it with 18 composable strategies across 3 prescription tiers, while your actual conversation, decisions, and working context stay untouched. The guard daemon runs automatically — install once, forget about it.
Key Features
- 18 pruning strategies — gentle (5), standard (11), aggressive (18)
- Guard daemon — auto-starts via SessionStart hook, monitors and prunes continuously
- Interactive "prune now?" nudge — a non-blocking heads-up at 25% / 55% / 80% context (once per tier) recommending cozempic reload, so interactive users get cozempic's higher-fidelity prune+resume on their own terms instead of falling back to lossy autocompact. Takes no action on its own; silence with COZEMPIC_NUDGE_OFF=1
- Interactive-safe reload — in interactive sessions the guard warns first and reloads only at an idle breakpoint (never mid-turn); headless sessions reload as before
- Safe-point protection — the guard never terminates-and-resumes through in-flight work: a running Workflow, a background subagent, an agent team, or an open tool call defers the reload so nothing is lost
- compact-summary-collapse — 85-95% savings by removing pre-compaction messages already in the summary
- Agent Teams protection — checkpoints team state through compaction, reactive overflow recovery
- Behavioral digest — extracts your corrections ("don't do X"), persists them to Claude Code's memory system so they survive compaction
- 13 doctor checks — diagnose and auto-fix session corruption, orphaned tool results, zombie teams
- Token-aware diagnostics — exact token counts from usage fields, cache hit rate, context % bar
- Auto-detects 1M context — correct thresholds for both 200K and 1M models
- Efficient idle polling — backs off the poll cadence when the session is quiet and skips redundant no-op checkpoints
- Auto-updates — checks PyPI daily, upgrades in-place
Zero external dependencies. Python 3.10+ stdlib only.
Install
Pick your package manager:
# pip (Python ≥ 3.10)
pip install cozempic
# pipx — isolated user install, always available on PATH
pipx install cozempic
# uv / uvx — no install needed, run on demand
uvx cozempic --help
# Homebrew (macOS / Linux)
brew install Ruya-AI/cozempic/cozempic
# Nix flake
nix profile install github:Ruya-AI/cozempic?dir=packaging/nix
AUR (yay -S cozempic) and MacPorts (port install py-cozempic) submissions are in progress — see packaging/README.md for status and PKGBUILD/Portfile sources.
That's it. Cozempic auto-initializes on first use — hooks are wired globally, guard daemon auto-starts on every Claude Code session. No manual setup needed. Opt out with COZEMPIC_NO_GLOBAL_INIT=1.
As a Claude Code Plugin
Install cozempic (any method above), then inside Claude Code:
/plugin marketplace add Ruya-AI/cozempic
/plugin install cozempic
This gives you MCP tools, skills (/cozempic:diagnose, /cozempic:treat, etc.), and auto-wired hooks.
Quick Start
# Auto-detect and diagnose the current session
cozempic current --diagnose
# Dry-run the standard prescription
cozempic treat current
# Apply with backup
cozempic treat current --execute
# Go aggressive on a specific session
cozempic treat <session_id> -rx aggressive --execute
# Check for session corruption
cozempic doctor
# View behavioral digest rules
cozempic digest show
# Show all strategies & prescriptions
cozempic formulary
Strategies
| # | Strategy | Tier | What It Does | Expected |
|---|---|---|---|---|
| 1 | compact-summary-collapse | gentle | Remove all pre-compaction messages (already in the summary) | 85-95% |
| 2 | attribution-snapshot-strip | gentle | Strip attribution-snapshot metadata entries | 0-2% |
| 3 | progress-collapse | gentle | Collapse consecutive and isolated progress tick messages | 40-48% |
| 4 | file-history-dedup | gentle | Deduplicate file-history-snapshot messages | 3-6% |
| 5 | metadata-strip | gentle | Strip token usage stats, stop_reason, costs | 1-3% |
| 6 | thinking-blocks | standard | Remove/truncate thinking content + signatures | 2-5% |
| 7 | tool-output-trim | standard | Trim large tool results (>8KB or >100 lines), microcompact-aware | 1-8% |
| 8 | tool-result-age | standard | Compact old tool results by age — minify mid-age, stub old | 10-40% |
| 9 | stale-reads | standard | Remove file reads superseded by later edits | 0.5-2% |
| 10 | system-reminder-dedup | standard | Deduplicate repeated system-reminder tags | 0.1-3% |
| 11 | tool-use-result-strip | standard | Strip toolUseResult envelope field (Edit diffs, never sent to API) | 5-50% |
| 12 | image-strip | aggressive | Strip old base64 image blocks, keep most recent 20% | 1-40% |
| 13 | http-spam | aggressive | Collapse consecutive HTTP request runs | 0-2% |
| 14 | error-retry-collapse | aggressive | Collapse repeated error-retry sequences | 0-5% |
| 15 | background-poll-collapse | aggressive | Collapse repeated polling messages | 0-1% |
| 16 | document-dedup | aggressive | Deduplicate large document blocks (CLAUDE.md injection) | 0-44% |
| 17 | mega-block-trim | aggressive | Trim any content block over 32KB | safety net |
| 18 | envelope-strip | aggressive | Strip constant envelope fields (cwd, version, slug) | 2-4% |
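As an illustration of one row in the table, here is a minimal sketch of the size test behind tool-output-trim, using the >8KB and >100 lines limits listed above. The head/tail policy, the marker text, and the function names are assumptions made for this example, not cozempic's actual implementation.

```python
MAX_BYTES = 8 * 1024  # ">8KB" limit from the strategy table
MAX_LINES = 100       # ">100 lines" limit from the strategy table


def needs_trim(output: str) -> bool:
    """Return True when a tool result exceeds either size limit."""
    too_big = len(output.encode("utf-8")) > MAX_BYTES
    too_long = len(output.splitlines()) > MAX_LINES
    return too_big or too_long


def trim_output(output: str, keep_head: int = 40, keep_tail: int = 20) -> str:
    """Keep the first and last lines and replace the middle with a marker.

    keep_head / keep_tail and the marker format are hypothetical choices.
    """
    if not needs_trim(output):
        return output
    lines = output.splitlines()
    if len(lines) <= keep_head + keep_tail:
        # Few but very long lines: fall back to a plain character cut.
        return output[:MAX_BYTES] + "\n[... trimmed ...]"
    omitted = len(lines) - keep_head - keep_tail
    marker = f"[... {omitted} lines trimmed ...]"
    return "\n".join(lines[:keep_head] + [marker] + lines[-keep_tail:])
```

Small results pass through unchanged, so a helper like this is safe to run on every tool result.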
Prescriptions
| Prescription | Strategies | Risk | Typical Savings |
|---|---|---|---|
| gentle | 5 | Minimal | 85-95% (with compact boundary) |
| standard | 11 | Low | 25-45% |
| aggressive | 18 | Moderate | 35-60% |
Dry-run is the default. Nothing is modified until you pass --execute. Backups are always created.
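The strategy counts (5, 11, 18) suggest that each prescription includes the tier below it. The sketch below models that reading and the "each strategy runs on the output of the previous" rule. The ordering inside a prescription, the message shape, and the two toy strategies are assumptions for illustration only.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Strategy = Callable[[List[Message]], List[Message]]

# Strategy names are taken from the Strategies table; cumulative tiers are an assumption.
GENTLE = [
    "compact-summary-collapse", "attribution-snapshot-strip",
    "progress-collapse", "file-history-dedup", "metadata-strip",
]
STANDARD = GENTLE + [
    "thinking-blocks", "tool-output-trim", "tool-result-age",
    "stale-reads", "system-reminder-dedup", "tool-use-result-strip",
]
AGGRESSIVE = STANDARD + [
    "image-strip", "http-spam", "error-retry-collapse",
    "background-poll-collapse", "document-dedup", "mega-block-trim",
    "envelope-strip",
]
PRESCRIPTIONS = {"gentle": GENTLE, "standard": STANDARD, "aggressive": AGGRESSIVE}


def apply_in_order(messages: List[Message], strategies: List[Strategy]) -> List[Message]:
    """Run strategies sequentially; each one sees the previous one's output."""
    for strategy in strategies:
        messages = strategy(messages)
    return messages


# Toy stand-ins, not cozempic's real strategies.
def drop_progress(messages: List[Message]) -> List[Message]:
    return [m for m in messages if m.get("type") != "progress"]


def strip_usage(messages: List[Message]) -> List[Message]:
    return [{k: v for k, v in m.items() if k != "usage"} for m in messages]
```

For example, `apply_in_order(session, [drop_progress, strip_usage])` first removes progress entries and then strips the `usage` field from whatever remains.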
Guard — Continuous Protection
The guard daemon monitors your session and prunes automatically:
# Auto-starts via SessionStart hook after cozempic init
# Or run manually:
cozempic guard --daemon
4-tier proactive pruning (every 30s):
| Tier | Threshold | Action | Reload? |
|---|---|---|---|
| Soft | 25% | gentle file cleanup | No |
| Hard | 55% | standard prune | Yes (interactive: at a breakpoint; deferred if agents active) |
| Hard2 | 80% | aggressive prune | Yes (gated by the safe-point check) |
| User | 90% | manual aggressive | Yes |
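The threshold table can be read as a simple lookup from context usage to tier. This is a minimal sketch of that mapping; the function name, the inclusive `>=` comparison, and treating the manual User tier like the automatic ones are assumptions.

```python
from typing import Optional

# (tier name, fraction of the context window), highest first, from the table above.
TIERS = [("user", 0.90), ("hard2", 0.80), ("hard", 0.55), ("soft", 0.25)]


def tier_for(used_tokens: int, context_window: int) -> Optional[str]:
    """Return the highest tier whose threshold has been reached, or None."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    fraction = used_tokens / context_window
    for name, threshold in TIERS:
        if fraction >= threshold:
            return name
    return None
```

The same fractions apply to 200K and 1M windows: `tier_for(550_000, 1_000_000)` and `tier_for(110_000, 200_000)` both land in the hard tier.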
Interactive sessions — instead of a surprise reload mid-work, the guard surfaces a nudge and reloads only once you pause between turns, after warning you. Near the wall (≈88%) it reloads even mid-turn — a higher-fidelity prune beats hitting autocompact. Detection is automatic (COZEMPIC_INTERACTIVE=auto); set it to on or off to force either mode. Headless/CI sessions reload immediately as before.
Safe-point reload — a reload terminates and resumes the Claude process, so the guard validates first: if a Workflow, a background subagent, an agent team, or an open tool call is in flight, the reload defers (read-only checkpoint) rather than destroying that work. Tune the near-wall force point with COZEMPIC_FORCE_RELOAD_PCT (default 0.88).
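The interactive and safe-point rules above can be summarised as a small decision function. This sketch assumes the hard threshold has already been crossed and that in-flight work always wins over the near-wall force point; the class, field names, and return values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GuardState:
    context_fraction: float  # 0.0 to 1.0 of the context window
    interactive: bool        # result of interactivity detection
    mid_turn: bool           # True while a turn is still running
    in_flight_work: bool     # Workflow, background subagent, agent team, or open tool call


def reload_decision(state: GuardState, force_pct: float = 0.88) -> str:
    """Return 'defer', 'warn', or 'reload' for a session past the hard threshold."""
    if state.in_flight_work:
        return "defer"   # safe-point check: never reload through running work
    if not state.interactive:
        return "reload"  # headless/CI sessions reload immediately
    if state.context_fraction >= force_pct:
        return "reload"  # near the wall: reload even mid-turn
    if state.mid_turn:
        return "warn"    # nudge now, reload at the next idle breakpoint
    return "reload"      # idle breakpoint reached
```

The `force_pct` default mirrors the documented COZEMPIC_FORCE_RELOAD_PCT default of 0.88.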
Reactive overflow recovery — kqueue/polling file watcher detects inbox-flood overflow within milliseconds, auto-prunes with escalating prescriptions, circuit breaker prevents loops.
tmux/screen — reload resumes in the same pane via send-keys. Plain terminals open a new window.
Token thresholds auto-detect — 200K and 1M models detected automatically. Override with COZEMPIC_CONTEXT_WINDOW=200000 for Pro plan.
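The guard polls every 30s and backs off when the session is quiet. A minimal sketch of an exponential back-off schedule follows; `backoff_after` and `cap` are hypothetical parameters, and how COZEMPIC_IDLE_BACKOFF_CYCLES is actually interpreted is not specified here.

```python
BASE_INTERVAL = 30.0  # seconds, the documented poll cadence


def next_interval(quiet_cycles: int, backoff_after: int = 4, cap: float = 300.0) -> float:
    """Return the next poll delay given how many consecutive polls saw no activity.

    The interval stays at the base value for the first few quiet cycles,
    then doubles on each further quiet cycle until it reaches the cap.
    """
    if quiet_cycles < backoff_after:
        return BASE_INTERVAL
    doublings = min(quiet_cycles - backoff_after + 1, 10)  # bound the exponent
    return min(BASE_INTERVAL * 2 ** doublings, cap)
```

Any activity would reset `quiet_cycles` to zero, which returns the guard to the 30s cadence.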
Behavioral Digest
Cozempic extracts your corrections and persists them across compactions:
# View extracted rules
cozempic digest show
# Manually extract from current session
cozempic digest update
# Sync rules to Claude Code's memory system
cozempic digest inject
How it works:
- Detects correction signals in your messages ("don't do X", "stop adding Y", "always use Z")
- All corrections start as "pending" and activate after 2 occurrences (prevents one-shot noise from polluting the digest)
- Rules synced to Claude Code's native memory system (~/.claude/projects/<cwd>/memory/)
- Claude reads these as feedback memories on every turn — they survive compaction natively
- PreCompact and Stop hooks auto-extract before context is lost
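The pending-then-active rule described above can be sketched with a counter. The regular expression, the normalisation, and the function names below are assumptions for illustration; cozempic's real signal detection is likely more involved.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical signal words modelled on the examples ("don't do X", "stop adding Y", "always use Z").
SIGNAL = re.compile(r"\b(don't|do not|never|stop|always)\b.*", re.IGNORECASE)


def extract_correction(message: str) -> Optional[str]:
    """Return a normalised rule string if the message looks like a correction."""
    match = SIGNAL.search(message)
    if match is None:
        return None
    return " ".join(match.group(0).lower().rstrip(".!").split())


def build_digest(messages: Iterable[str], activate_after: int = 2) -> Dict[str, str]:
    """Map each rule to 'pending' or 'active' based on how often it was seen."""
    counts = Counter()
    for message in messages:
        rule = extract_correction(message)
        if rule is not None:
            counts[rule] += 1
    return {
        rule: "active" if seen >= activate_after else "pending"
        for rule, seen in counts.items()
    }
```

A correction stated once stays pending; the same correction stated again is promoted to active.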
Agent Teams Protection
When Claude's auto-compaction fires, Agent Teams lose coordination state. Cozempic prevents this with five layers:
- Continuous checkpoint — saves team state every N seconds
- Hook-driven checkpoint — fires after every Task spawn, TaskCreate/Update, before compaction, at session end
- Tiered pruning — soft threshold trims without disruption; hard threshold does full prune + reload
- Reactive overflow recovery — detects inbox-flood within milliseconds, auto-recovers (~10s downtime)
- is_protected() — compact summaries, compact boundaries, content-replacement entries, and behavioral digest messages are never stripped
Doctor
cozempic doctor # Diagnose issues
cozempic doctor --fix # Auto-fix where possible
| Check | What It Detects | Auto-Fix |
|---|---|---|
| trust-dialog-hang | Resume hangs on Windows | Reset flag |
| claude-json-corruption | Truncated/corrupted JSON | Restore from backup |
| corrupted-tool-use | tool_use.name >200 chars | Parse and repair |
| orphaned-tool-results | tool_result missing matching tool_use — causes 400 errors | Strip orphans |
| zombie-teams | Stale team directories with dead agents | Remove stale dirs |
| oversized-sessions | Session files >50MB | — |
| stale-backups | Old .jsonl.bak files wasting disk | Delete old backups |
| disk-usage | Session storage exceeding healthy thresholds | — |
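To make the orphaned-tool-results check concrete, here is a sketch that scans messages for tool_result blocks whose tool_use_id has no earlier tool_use block. It assumes each message is a dict with a `content` list of blocks shaped like Anthropic API content blocks; the real session file layout and cozempic's own check may differ.

```python
from typing import Dict, List


def find_orphaned_tool_results(messages: List[Dict]) -> List[str]:
    """Return the tool_use_id of every tool_result with no preceding tool_use."""
    seen_ids = set()
    orphans = []
    for message in messages:
        for block in message.get("content", []):
            if not isinstance(block, dict):
                continue  # plain-string content has no tool blocks
            if block.get("type") == "tool_use":
                seen_ids.add(block.get("id"))
            elif block.get("type") == "tool_result":
                if block.get("tool_use_id") not in seen_ids:
                    orphans.append(block.get("tool_use_id"))
    return orphans


def strip_orphans(messages: List[Dict]) -> List[Dict]:
    """Return copies of the messages with orphaned tool_result blocks removed."""
    orphan_ids = set(find_orphaned_tool_results(messages))
    cleaned = []
    for message in messages:
        content = message.get("content", [])
        if isinstance(content, list):
            content = [
                b for b in content
                if not (isinstance(b, dict)
                        and b.get("type") == "tool_result"
                        and b.get("tool_use_id") in orphan_ids)
            ]
        cleaned.append({**message, "content": content})
    return cleaned
```

The input list is left untouched, which fits the dry-run-first approach used elsewhere in the tool.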
Commands
cozempic init Wire hooks + slash command into project
cozempic list List sessions with sizes and token estimates
cozempic current [-d] Show/diagnose current session
cozempic diagnose <session> Analyze bloat sources
cozempic treat <session> [-rx PRESET] Run prescription (dry-run default)
cozempic treat <session> --execute Apply changes with backup
cozempic strategy <name> <session> Run single strategy
cozempic reload [-rx PRESET] Treat + auto-resume in new terminal
cozempic checkpoint [--show] Save team state to disk
cozempic guard [--daemon] Start guard (auto-starts via hook)
cozempic doctor [--fix] Check for known issues
cozempic digest [show|update|clear|flush|recover|inject]
cozempic self-update Upgrade to latest version from PyPI
cozempic formulary Show all strategies & prescriptions
Hook Integration
After cozempic init, these hooks are wired automatically:
| Hook | When | What |
|---|---|---|
| SessionStart | Session opens | Guard daemon + digest inject |
| PostToolUse[Task] | Agent spawn | Team checkpoint |
| PostToolUse[TaskCreate\|TaskUpdate] | Todo changes | Team checkpoint |
| PreCompact | Before compaction | Checkpoint + digest flush |
| Stop | Session end | Checkpoint + digest flush |
Safety
- Dry-run by default — --execute required to modify files
- Atomic writes — write → fsync → os.replace() — no partial writes
- Strict session resolution — refuses to act on ambiguous matches
- Timestamped backups — automatic .jsonl.bak before any modification
- is_protected() — compact summaries, boundaries, marble-origami state, content-replacement, behavioral digest entries are never removed
- parentUuid re-linking — conversation chain integrity maintained after removals
- Sibling tool_use protection — tool_use blocks are kept when their tool_result is kept
- Team messages protected — Task, TaskCreate, SendMessage never pruned
- Strategies compose sequentially — each runs on the output of the previous
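The write → fsync → os.replace() sequence and the backup-before-modify rule from the list above can be sketched with the standard library alone. The function names and the backup file naming are assumptions; only the three-step write order comes from the list.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def atomic_write(path: Path, data: bytes) -> None:
    """Write to a temp file in the same directory, fsync it, then swap it in."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_name, path)     # atomic rename over the original
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def backup_then_write(path: Path, data: bytes) -> Path:
    """Copy the current file to a timestamped backup, then write atomically."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.stem}.{stamp}{path.suffix}.bak")  # hypothetical naming
    shutil.copy2(path, backup)
    atomic_write(path, data)
    return backup
```

Creating the temp file in the target's own directory keeps `os.replace()` on one filesystem, which is what makes the final swap atomic.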
Example Output
Prescription: aggressive
Before: 158.2K tokens (29.56MB, 6602 messages)
After: 121.5K tokens (23.09MB, 5073 messages)
Freed: 36.7K tokens (23.2%) — 6.47MB, 1529 removed, 4038 modified
Context: [============--------] 61%
Strategy Results:
compact-summary-collapse 8.17MB saved (85.2%) (4201 removed)
progress-collapse 1.63MB saved (5.5%) (1525 removed)
metadata-strip 693.9KB saved (2.3%) (2735 modified)
tool-use-result-strip 1.44MB saved (4.9%) (891 modified)
thinking-blocks 1.11MB saved (3.8%) (1127 modified)
tool-output-trim 1.72MB saved (5.8%) (167 modified)
...
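The headline figures in the example output can be reproduced with a few lines of arithmetic. This sketch assumes a 20-character bar, which appears to match the example; the function names are hypothetical.

```python
def context_bar(pct: float, width: int = 20) -> str:
    """Render a text bar such as the 'Context:' line in the example output."""
    clamped = max(0.0, min(100.0, pct))
    filled = round(width * clamped / 100)
    return f"[{'=' * filled}{'-' * (width - filled)}] {pct:.0f}%"


def freed_percent(before: float, after: float) -> float:
    """Percentage of tokens (or bytes) removed by a prune."""
    return 100 * (before - after) / before
```

With the numbers above, `freed_percent(158.2, 121.5)` rounds to 23.2, which agrees with the "Freed" line, and `context_bar(61)` fills 12 of the 20 cells.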
Changelog
v1.8.23
- Hardened numeric input validation — NaN, infinity, and non-representable huge integers are now rejected with a clear error at every CLI flag, COZEMPIC_* env var, and config field (a NaN threshold would otherwise silently disable the gate it controls). Thanks to @ynaamane (#116), and folded in the matching fix for the interactive-guard reload-grace knob
- Standing adversarial QA fleet — docs/qa-fleet.md + a corpus-driven regression test so this whole class can't regress
v1.8.22
- Interactive "prune now?" nudge — non-blocking heads-up at 25% / 55% / 80% context (once per tier, with hysteresis so it never nags), recommending cozempic reload. Brings cozempic's higher-fidelity prune+resume to interactive sessions without surprise reloads. Tunable via COZEMPIC_NUDGE_PCTS; silence with COZEMPIC_NUDGE_OFF=1
- Interactive-safe reload — warns first, then reloads only at an idle breakpoint (never mid-turn); near the wall (COZEMPIC_FORCE_RELOAD_PCT, default 88%) it reloads even mid-turn. Headless/CI behaviour unchanged
- Safe-point protection — the guard never terminates-and-resumes through a running Workflow, background subagent, agent team, or open tool call; the reload defers instead so in-flight work is preserved
- Interactivity detection — COZEMPIC_INTERACTIVE=auto|on|off
- Efficient idle polling — exponential poll back-off when the session is quiet (COZEMPIC_IDLE_BACKOFF_CYCLES) and skips redundant no-op SOFT checkpoints
v1.7.1
- cozempic reload --session <id|path> escape hatch when auto-detect fails in multi-agent sessions. Previously reload had no way to recover from ambiguous session detection, leaving users stuck. Matches guard --session.
- Error message now names the flag to use (was "use an explicit session ID" with no instruction on how)
- Auto-update message clarified: after upgrade, says "active on next run (this process still vX.Y.Z)" — users no longer think the upgrade failed when --version still prints the old number (the running Python process can't hot-swap its own code)
v1.7.0
- Telemetry opt-out: COZEMPIC_NO_TELEMETRY=1 disables anonymous usage counters
- Documented configuration: COZEMPIC_NO_AUTO_UPDATE, COZEMPIC_NO_TELEMETRY, COZEMPIC_CONTEXT_WINDOW env vars
v1.6.x
- 4-tier pruning: soft (25%, no reload) → hard (55%, reload) → emergency (80%, aggressive reload) → user (90%, manual)
- Agent-aware reload: defers reload at 55% when agents are running, forces at 80%
- Same-terminal resume: tmux/screen users get /exit + claude --resume in the same pane
- Clean messaging: only shows strategies that did something, 1-line hook status output
- 1M default: Opus/Sonnet 4.5/4.6 default to 1M context (CC doesn't use [1m] suffix)
- Auto-upgrade everywhere: SessionStart hook backgrounds pip install --upgrade cozempic on every session. MCP/plugin use uv run --upgrade. npm install.js always upgrades.
- cozempic self-update: force-upgrade from PyPI regardless of install method (pip, uv, editable, clone)
- Auto-updater fixed: removed TTY check (was blocking hook-triggered updates), tries uv → pip → pipx
v1.5.0
- tool-result-age strategy — age-based tool result compaction. Recent results stay verbatim, mid-age get JSON minified and diff context collapsed, old replaced with compact stubs. Claude can re-read any file. 10-40% additional savings targeting the 45% of session size that tool results occupy.
- 18 strategies total, standard prescription 11, aggressive 18
- Tests: 273 → 283
v1.4.0 / v1.4.1
- Track 1 — Bug fixes: is_protected() guard on all strategies, isSidechain preserved in envelope-strip, output_tokens in token formula, parentUuid re-linking, sibling tool_use protection
- Track 2 — New strategies: compact-summary-collapse (85-95%), attribution-snapshot-strip, microcompact-aware tool-output-trim
- Behavioral digest: extract corrections, sync to Claude Code memory, CLI commands, hook wiring
- Context window detection: MCP server and plugin now auto-detect 200K/1M (was hardcoded 200K)
- Cache efficiency metrics: cozempic diagnose shows cache hit rate
- transcript_path: hooks parse session path from payload for faster resolution
- Tests: 165 → 273
v1.3.0 / v1.3.1
- Writer-safe live prune + sidecar session store
- Guard startup cleanup, updater fixes, MCP maintenance
v1.2.0 — v1.2.8
- Atomic file writes, strict session resolution, schema-first team detection
- tool-use-result-strip strategy (5-50% on edit-heavy sessions)
- image-strip strategy (keep last 20%)
- Auto-update, install tracking, npm package
- Safety improvements: SIGTERM handler, backup cleanup, permission error handling
Configuration
| Variable | Default | Effect |
|---|---|---|
| COZEMPIC_CONTEXT_WINDOW | auto-detect | Override context window size (e.g. 200000 for Pro plan) |
| COZEMPIC_NO_AUTO_UPDATE | off | Skip automatic version checks. Not recommended — Claude Code ships frequent changes and cozempic updates keep strategies compatible with the latest session format. |
| COZEMPIC_NO_TELEMETRY | off | Skip anonymous usage counters. Cozempic pings a simple counter on each prune — no personal data, session content, or identifiable information is sent. Helps us prioritize development. |
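For readers scripting around these variables, here is a sketch of reading COZEMPIC_CONTEXT_WINDOW defensively, in the spirit of the numeric validation described in the v1.8.23 changelog entry (rejecting NaN, infinity, and unrepresentable values). The function names and error messages are assumptions, not cozempic's own code.

```python
import math
import os
from typing import Mapping, Optional


def context_window_override(env: Optional[Mapping[str, str]] = None) -> Optional[int]:
    """Return the override as a positive integer, or None when it is unset."""
    env = os.environ if env is None else env
    raw = env.get("COZEMPIC_CONTEXT_WINDOW")
    if raw is None or raw.strip() == "":
        return None  # fall back to auto-detection
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(f"COZEMPIC_CONTEXT_WINDOW must be a number, got {raw!r}") from None
    # float() accepts "nan" and "inf", and huge inputs overflow to inf, so check explicitly.
    if not math.isfinite(value) or value <= 0 or value != int(value):
        raise ValueError(f"COZEMPIC_CONTEXT_WINDOW must be a positive integer, got {raw!r}")
    return int(value)


def flag_enabled(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Treat a variable such as COZEMPIC_NO_TELEMETRY as on only when set to '1'."""
    env = os.environ if env is None else env
    return env.get(name, "") == "1"
```

Passing a plain dict as `env` makes the helpers easy to test without touching the real environment.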
Contributing
Contributions welcome. To add a strategy:
- Create a function in the appropriate tier file under src/cozempic/strategies/
- Decorate with @strategy(name, description, tier, expected_savings)
- Return a StrategyResult with a list of PruneActions
- Add to the appropriate prescription in src/cozempic/registry.py
License
MIT — see LICENSE.
Built by Ruya AI.