Skip to main content

Universal context intelligence layer — compresses LLM context across CLI, MCP, browser, and IDE

Project description

  ███████╗ ██████╗ ███████╗
  ██╔════╝██╔═══██╗╚══███╔╝
  ███████╗██║   ██║  ███╔╝
  ╚════██║██║▄▄ ██║ ███╔╝
  ███████║╚██████╔╝███████╗
  ╚══════╝ ╚══▀▀═╝ ╚══════╝
  

Compress LLM context to save tokens and reduce costs

Real session stats: 3,003 compressions · 178,442 tokens saved · 24.7% avg reduction · up to 92% with dedup

Crates.io npm PyPI VS Code Firefox JetBrains Discord

Install · How It Works · Supported Tools · Changelog · Discord


sqz compresses command output before it reaches your LLM. Single Rust binary, zero config.

The real win is dedup: when the same file gets read 5 times in a session, sqz sends it once and returns a 13-token reference for every repeat.

Without sqz:                    With sqz:

File read #1:  2,000 tokens     File read #1:  ~800 tokens (compressed)
File read #2:  2,000 tokens     File read #2:  ~13 tokens  (dedup ref)
File read #3:  2,000 tokens     File read #3:  ~13 tokens  (dedup ref)
───────────────────────         ───────────────────────
Total:         6,000 tokens     Total:         ~826 tokens (86% saved)

Token Savings

24.7% average reduction across 3,003 real compressions · 92% saved on repeated file reads · 86% on shell/git output · 13-token refs for cached content

One developer's week, measured from actual sqz gain output:

$ sqz gain
sqz token savings (last 7 days)
──────────────────────────────────────────────────
  04-13 │                              │   2,329 saved
  04-14 │                              │       0 saved
  04-15 │███                           │  12,954 saved
  04-16 │██                            │   9,223 saved
  04-17 │████                          │  14,752 saved
  04-18 │██████████████████████████████│ 105,569 saved
  04-19 │████████                      │  30,882 saved
  04-20 │█                             │   4,334 saved
──────────────────────────────────────────────────
  Total: 3,003 compressions, 178,442 tokens saved (24.7% avg reduction)

Per-command compression

Single-command compression (measured via cargo test -p sqz-engine benchmarks):

Content Before After Saved
Repeated log lines 148 62 58%
Large JSON array 259 142 45%
JSON API response 64 53 17%
Git diff 61 54 12%
Prose/docs 124 121 2%
Stack trace (safe mode) 82 82 0%

Session-level with dedup

Where the real savings live — the cache sends each file once, repeats cost 13 tokens:

Scenario Without sqz With sqz Saved
Same file read 5× 10,000 826 92%
Same JSON response 3× 192 79 59%
Test-fix-test cycle (3 runs) 15,000 5,186 65%

Single-command compression ranges from 2–58% depending on content. Repeated reads drop to 13 tokens each. Your mileage will vary with how repetitive your tool calls are — agentic sessions with many file re-reads see the biggest wins.

Install

Prebuilt binaries (no compiler required — works on every platform):

# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/ojuschugh1/sqz/main/install.sh | sh

# Windows (PowerShell)
irm https://raw.githubusercontent.com/ojuschugh1/sqz/main/install.ps1 | iex

# Any platform via npm
npm install -g sqz-cli

Build from source (cargo install sqz-cli) works too, but needs a C toolchain:

  • Linux: build-essential (apt) or equivalent
  • macOS: Xcode Command Line Tools (xcode-select --install)
  • Windows: Visual Studio Build Tools with the "Desktop development with C++" workload. Without these, cargo install fails with linker link.exe not found. If you don't already have them, use the PowerShell or npm install above instead.

Then initialize:

sqz init --global     # hooks apply to every project on this machine
# or
sqz init              # hooks apply to just this project (.claude/settings.local.json)

--global writes to ~/.claude/settings.json (the user scope per the Anthropic scope table), so the sqz hook fires in every Claude Code session on this machine. This is the common case on first install. Your existing permissions, env, statusLine, and unrelated hooks in ~/.claude/settings.json are preserved — sqz merges its entries rather than overwriting.

Plain sqz init (project scope) is useful when you want sqz active only inside one repo.

That's it. Shell hooks installed, AI tool hooks configured.

How It Works

sqz installs a PreToolUse hook that intercepts bash commands before your AI tool runs them. The output gets compressed transparently — the AI tool never knows.

Claude → git status → [sqz hook rewrites] → compressed output (85% smaller)

What gets compressed:

  • Shell output — git, cargo, npm, docker, kubectl, ls, grep, etc.
  • JSON — strips nulls, compact encoding
  • Logs — collapses repeated lines
  • Test output — shows failures only

What doesn't get compressed:

  • Stack traces, error messages, secrets — routed to safe mode (0% compression)
  • Your prompts and the AI's responses — controlled by the AI tool, not sqz

Supported Tools

Tool Integration Setup
Claude Code PreToolUse hook (transparent) sqz init
Cursor PreToolUse hook (transparent) sqz init
Windsurf PreToolUse hook (transparent) sqz init
Cline PreToolUse hook (transparent) sqz init
Gemini CLI BeforeTool hook (transparent) sqz init
OpenCode TypeScript plugin (transparent) sqz init
VS Code Extension Install from Marketplace
JetBrains Plugin Install from Marketplace
Chrome Browser extension ChatGPT, Claude.ai, Gemini, Grok, Perplexity
Firefox Browser extension Same sites

CLI

sqz init --global     # Install hooks for every project on this machine
sqz init              # Install hooks for just this project
sqz compress <text>   # Compress (or pipe from stdin)
sqz compact           # Evict stale context to free tokens
sqz gain              # Show daily token savings
sqz stats             # Cumulative report
sqz discover          # Find missed savings
sqz resume            # Re-inject session context after compaction
sqz hook claude       # Process a PreToolUse hook
sqz proxy --port 8080 # API proxy (compresses full request payloads)

Track Your Own Savings

Run sqz gain in your shell any time to see your own daily breakdown (see the Token Savings section above for what the output looks like), and sqz stats for the full cumulative report:

$ sqz stats
┌─────────────────────────┬──────────────────┐
│           sqz compression stats            │
├─────────────────────────┼──────────────────┤
│ Total compressions                  3,003 │
│ Tokens saved                      178,442 │
│ Avg reduction                       24.7% │
│ Cache entries                          43 │
│ Cache size                        39.1 KB │
└─────────────────────────┴──────────────────┘

Stats are stored locally in SQLite under ~/.sqz/sessions.db — nothing leaves your machine.

How Compression Works

  1. Per-command formattersgit status → compact summary, cargo test → failures only, docker ps → name/image/status table
  2. Structural summaries — code files compressed to imports + function signatures + call graph (~70% reduction). The model sees the architecture, not implementation noise.
  3. Dedup cache — SHA-256 content hash, persistent across sessions. Second read = 13-token reference.
  4. JSON pipeline — strip nulls → project out debug fields → flatten → collapse arrays → TOON encoding (lossless compact format)
  5. Safe mode — stack traces, secrets, migrations detected by entropy analysis and routed through with 0% compression

For the full technical details, see docs/.

Configuration

# ~/.sqz/presets/default.toml
[preset]
name = "default"
version = "1.0"

[compression.condense]
enabled = true
max_repeated_lines = 3

[compression.strip_nulls]
enabled = true

[budget]
warning_threshold = 0.70
default_window_size = 200000

Privacy

  • Zero telemetry — no data transmitted, no crash reports
  • Fully offline — works in air-gapped environments
  • All processing local

Development

git clone https://github.com/ojuschugh1/sqz.git
cd sqz
cargo test --workspace
cargo build --release

License

Elastic License 2.0 (ELv2) — use, fork, modify freely. Two restrictions: no competing hosted service, no removing license notices.

Links

Star History

Star History Chart

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sqz-1.0.0.tar.gz (9.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sqz-1.0.0-py3-none-any.whl (9.4 kB view details)

Uploaded Python 3

File details

Details for the file sqz-1.0.0.tar.gz.

File metadata

  • Download URL: sqz-1.0.0.tar.gz
  • Upload date:
  • Size: 9.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for sqz-1.0.0.tar.gz
Algorithm Hash digest
SHA256 bdda2c8dfb869ab25447b3a55d4720d20736650d946ca428f99a79b67582e2e2
MD5 192283765005b6f1e1d4e5b6a84a754b
BLAKE2b-256 1b9dfb3e31d4987793c8f3d4560808d06820515b14d07bf1209b7532b3e6fe29

See more details on using hashes here.

File details

Details for the file sqz-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: sqz-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 9.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for sqz-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fd1ab9066495e2355ae549dfd198a1311261b8cbafa6dfc9d287001eb142f614
MD5 50e525dc296fd507ea517013bfe1e895
BLAKE2b-256 726f677e131e034a2acf8e66f1611080544730d7240fbd75a71f1504a93e2b71

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page