Universal context intelligence layer — compresses LLM context across CLI, MCP, browser, and IDE
███████╗ ██████╗ ███████╗
██╔════╝██╔═══██╗╚══███╔╝
███████╗██║   ██║  ███╔╝
╚════██║██║▄▄ ██║ ███╔╝
███████║╚██████╔╝███████╗
╚══════╝ ╚══▀▀═╝ ╚══════╝
Compress LLM context to save tokens and reduce costs
Install · How It Works · Supported Tools · Changelog · Discord
sqz compresses command output before it reaches your LLM. Single Rust binary, zero config.
The real win is dedup: when the same file gets read 5 times in a session, sqz sends it once and returns a 13-token reference for every repeat.
| Read | Without sqz | With sqz |
|---|---|---|
| File read #1 | 2,000 tokens | ~800 tokens (compressed) |
| File read #2 | 2,000 tokens | ~13 tokens (dedup ref) |
| File read #3 | 2,000 tokens | ~13 tokens (dedup ref) |
| Total | 6,000 tokens | ~826 tokens (86% saved) |
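The totals above follow from simple arithmetic: the first read is sent compressed and every repeat costs one dedup reference. Here is a minimal sketch of that calculation, using the approximate figures from the example (about 800 tokens for the compressed first read, about 13 per reference). The function name and its parameters are hypothetical and not part of sqz.

```python
# Token arithmetic for repeated reads of one file.
# Defaults are the approximate figures quoted in the example above.
def session_tokens(raw_tokens, reads, compressed_first=800, dedup_ref=13):
    """Return (without_sqz, with_sqz, fraction_saved) for `reads` >= 1 reads."""
    without = raw_tokens * reads
    # The first read is sent compressed; each repeat costs one dedup reference.
    with_sqz = compressed_first + dedup_ref * (reads - 1)
    saved = 1 - with_sqz / without
    return without, with_sqz, saved


without, with_sqz, saved = session_tokens(raw_tokens=2000, reads=3)
print(f"{without} -> {with_sqz} tokens ({saved:.0%} saved)")
```

For three reads of a 2,000-token file this reproduces the 6,000 versus 826 comparison. The saving grows with each further repeat because the per-reference cost is fixed.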
Token Savings
Single-command compression (measured via cargo test -p sqz-engine benchmarks):
| Content | Before | After | Saved |
|---|---|---|---|
| Repeated log lines | 148 | 62 | 58% |
| Large JSON array | 259 | 142 | 45% |
| JSON API response | 64 | 53 | 17% |
| Git diff | 61 | 54 | 12% |
| Prose/docs | 124 | 121 | 2% |
| Stack trace (safe mode) | 82 | 82 | 0% |
Session-level savings (with dedup cache across repeated reads):
| Scenario | Without sqz | With sqz | Saved |
|---|---|---|---|
| Same file read 5x | 10,000 | 826 | 92% |
| Same JSON response 3x | 192 | 79 | 59% |
| Test-fix-test cycle (3 runs) | 15,000 | 5,186 | 65% |
The dedup cache is where the real savings live. Single-command compression ranges from 0% (safe mode) to 58% depending on content, while each repeated read drops to about 13 tokens.
Install
cargo install sqz-cli
Then:
sqz init
That's it. Shell hooks installed, AI tool hooks configured.
How It Works
sqz installs a PreToolUse hook that intercepts bash commands before your AI tool runs them. The output gets compressed transparently — the AI tool never knows.
Claude → git status → [sqz hook rewrites] → compressed output (85% smaller)
What gets compressed:
- Shell output — git, cargo, npm, docker, kubectl, ls, grep, etc.
- JSON — strips nulls, compact encoding
- Logs — collapses repeated lines
- Test output — shows failures only
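To illustrate the "collapses repeated lines" step for logs, here is a small sketch in Python. It is not sqz's implementation (sqz is a Rust binary). The function name, the `max_repeated` parameter, and the marker text are assumptions made for illustration.

```python
from itertools import groupby


def collapse_repeats(text, max_repeated=3):
    """Keep at most `max_repeated` copies of each run of identical consecutive lines."""
    out = []
    for line, run in groupby(text.splitlines()):
        count = len(list(run))
        out.extend([line] * min(count, max_repeated))
        if count > max_repeated:
            # Hypothetical marker format; it records how many lines were dropped.
            out.append(f"[... {count - max_repeated} more identical lines]")
    return "\n".join(out)


log = "\n".join(["connection retry"] * 6 + ["connected"])
print(collapse_repeats(log, max_repeated=2))
```

Only consecutive duplicates are collapsed here, so the order of distinct events in the log is preserved.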
What doesn't get compressed:
- Stack traces, error messages, secrets — routed to safe mode (0% compression)
- Your prompts and the AI's responses — controlled by the AI tool, not sqz
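This README describes safe-mode detection as based on entropy analysis. The sketch below shows one common form of that idea, character-level Shannon entropy, which tends to score random-looking strings such as API keys higher than ordinary text. The threshold, the minimum length, and both function names are illustrative assumptions, not values taken from sqz.

```python
import math
from collections import Counter


def shannon_entropy(s):
    """Bits per character of the string's empirical character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_sensitive(token, threshold=4.0, min_len=20):
    # Assumed heuristic: long, high-entropy tokens are treated as possible
    # secrets and would be passed through uncompressed.
    return len(token) >= min_len and shannon_entropy(token) >= threshold


print(looks_sensitive("hello world hello world"))
print(looks_sensitive("aB3$kL9!xQ7@mZ2#pW5^"))
```

A real detector would likely combine a score like this with pattern checks, since entropy alone cannot recognize a stack trace or a migration file.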
Supported Tools
| Tool | Integration | Setup |
|---|---|---|
| Claude Code | PreToolUse hook (transparent) | sqz init |
| Cursor | PreToolUse hook (transparent) | sqz init |
| Windsurf | PreToolUse hook (transparent) | sqz init |
| Cline | PreToolUse hook (transparent) | sqz init |
| Gemini CLI | BeforeTool hook (transparent) | sqz init |
| OpenCode | TypeScript plugin (transparent) | sqz init |
| VS Code | Extension | Install from Marketplace |
| JetBrains | Plugin | Install from Marketplace |
| Chrome | Browser extension | Supported sites: ChatGPT, Claude.ai, Gemini, Grok, Perplexity |
| Firefox | Browser extension | Supported sites: same as Chrome |
CLI
sqz init # Install hooks
sqz compress <text> # Compress (or pipe from stdin)
sqz compact # Evict stale context to free tokens
sqz gain # Show daily token savings
sqz stats # Cumulative report
sqz discover # Find missed savings
sqz resume # Re-inject session context after compaction
sqz hook claude # Process a PreToolUse hook
sqz proxy --port 8080 # API proxy (compresses full request payloads)
Track Your Savings
$ sqz gain
sqz token savings (last 7 days)
──────────────────────────────────────────────────
04-13 │█████ │ 2329 saved
04-14 │ │ 0 saved
04-15 │██████████████████████████████│ 12954 saved
04-16 │████████████ │ 5532 saved
──────────────────────────────────────────────────
Total: 1178 compressions, 19214 tokens saved
How Compression Works
- Per-command formatters: git status → compact summary, cargo test → failures only, docker ps → name/image/status table
- Structural summaries: code files compressed to imports + function signatures + call graph (~70% reduction). The model sees the architecture, not implementation noise.
- Dedup cache — SHA-256 content hash, persistent across sessions. Second read = 13-token reference.
- JSON pipeline — strip nulls → project out debug fields → flatten → collapse arrays → TOON encoding (lossless compact format)
- Safe mode — stack traces, secrets, migrations detected by entropy analysis and routed through with 0% compression
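The dedup cache in the list above can be pictured as a content-addressed lookup keyed by SHA-256. Here is a minimal in-memory sketch in Python. sqz's cache is persistent across sessions and its reference format is not documented here, so the class name, the `[ref:...]` format, and the 12-character id are all hypothetical.

```python
import hashlib


class DedupCache:
    """In-memory stand-in for a content-addressed dedup cache."""

    def __init__(self):
        self._seen = {}  # SHA-256 hex digest -> short reference id

    def send(self, content: str) -> str:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if digest in self._seen:
            # Repeat read: return a short reference instead of the content.
            return f"[ref:{self._seen[digest]}]"
        # First read: remember the hash and pass the content through.
        self._seen[digest] = digest[:12]
        return content


cache = DedupCache()
first = cache.send("fn main() {}\n")
second = cache.send("fn main() {}\n")
print(repr(first))
print(second)
```

Because the key is a hash of the content rather than the file path, an edited file produces a new digest and is sent in full again.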
For the full technical details, see docs/.
Configuration
# ~/.sqz/presets/default.toml
[preset]
name = "default"
version = "1.0"
[compression.condense]
enabled = true
max_repeated_lines = 3
[compression.strip_nulls]
enabled = true
[budget]
warning_threshold = 0.70
default_window_size = 200000
Privacy
- Zero telemetry — no data transmitted, no crash reports
- Fully offline — works in air-gapped environments
- All processing local
Development
git clone https://github.com/ojuschugh1/sqz.git
cd sqz
cargo test --workspace
cargo build --release
License
Elastic License 2.0 (ELv2) — use, fork, modify freely. Two restrictions: no competing hosted service, no removing license notices.