SuperCompress — learned context compression for LLMs.

SuperCompress

Learned context compression for LLMs: trim long prompts before inference with a small CPU policy, with quality measured against baselines and a documented environmental-impact estimate.


Resource	Location
Live site	supercompress.vercel.app
Documentation	arjunkshah-supercompress-55.mintlify.app
API dashboard	`/dashboard` on the live site
Hosted API	Same origin on Vercel: `/api/health` and `/api/v1/compress`

Open the interactive playground →

Why SuperCompress?

Long agent context is expensive. Blind truncation keeps head and tail but drops answers in the middle. SuperCompress learns which lines to keep for the current question — under a fixed token budget.
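To make the contrast concrete, here is a minimal sketch that compares head-and-tail truncation with a query-aware selection under the same budget. The keyword-overlap scorer is only a stand-in for illustration, not SuperCompress's learned policy; the function names are hypothetical, and the budget is counted in lines rather than tokens to keep the example short.

```python
import re

def truncate_head_tail(lines, budget):
    """Blind truncation: keep the first and last lines, `budget` lines in total."""
    if budget >= len(lines):
        return list(lines)
    head = (budget + 1) // 2
    tail = budget - head
    return lines[:head] + lines[len(lines) - tail:]

def words(text):
    """Lowercased word set used by the toy scorer."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))

def select_by_overlap(lines, query, budget):
    """Stand-in policy: keep the `budget` lines sharing the most words with the query."""
    q = words(query)
    ranked = sorted(range(len(lines)), key=lambda i: (-len(words(lines[i]) & q), i))
    keep = sorted(ranked[:budget])  # restore original order
    return [lines[i] for i in keep]

context = [
    "import sqlite3",
    "def connect(path): ...",
    "def fetch(row_id): ...",
    "# fetch returns None when the row is missing",
    "def close(conn): ...",
    "# end of module",
]
query = "What does fetch return when the row is missing?"
budget = len(context) * 35 // 100  # 35% of 6 lines, rounded down to 2

print(truncate_head_tail(context, budget))        # answer line in the middle is dropped
print(select_by_overlap(context, query, budget))  # answer line is kept
```

Both calls keep two lines, so the savings are identical; only the query-aware selection retains the line that answers the question.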

Metric	SuperCompress	Truncation / FIFO
KV savings @ 35% budget	~65%	~65%
Oracle recall	100%	~25%
Policy size	~5K params	rule-based
Runs on	CPU (pre-inference)	CPU

Estimated impact at 1M compressions: ~800M tokens, 29 kWh, and 12 kg CO₂ avoided. See the environment guide for the assumptions.
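For a sense of scale, the estimate can be broken down per compression. The sketch below only divides the figures quoted above; the carbon intensity it prints is implied by those figures and is not a number this README states, so treat the environment guide as the authority on methodology.

```python
# Figures quoted in this README (estimates at 1M compressions).
compressions = 1_000_000
tokens_avoided = 800_000_000
energy_kwh = 29
co2_kg = 12

tokens_per_compression = tokens_avoided / compressions
wh_per_million_tokens = energy_kwh * 1000 / (tokens_avoided / 1_000_000)
implied_kg_co2_per_kwh = co2_kg / energy_kwh  # derived here, not stated by the source

print(f"{tokens_per_compression:.0f} tokens avoided per compression")   # 800
print(f"{wh_per_million_tokens:.2f} Wh per million avoided tokens")      # 36.25
print(f"{implied_kg_co2_per_kwh:.2f} kg CO2 per kWh (implied)")          # 0.41
```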

Hosted API (Vercel)

The live site ships serverless API routes backed by Vercel Blob for key storage. No separate deploy step is needed: push to main and Vercel builds the static web/ folder plus the api/ routes.

Optional self-host: Docker, Fly.io (fly.toml), or Render (render.yaml) for the Python FastAPI stack.

Quick start

Hosted API (key + package)

pip install git+https://github.com/arjunkshah/supercompress.git
export SUPERCOMPRESS_API_KEY=sc_live_YOUR_KEY

from supercompress import SuperCompress

context = "long context text…"  # the prompt or document to compress
out = SuperCompress().compress(context, "Your question")  # reads SUPERCOMPRESS_API_KEY
print(out.compressed_text)

Get a key at supercompress.vercel.app/dashboard.

Install (local compression)

pip install git+https://github.com/arjunkshah/supercompress.git
# local dev + tests + API server (run inside a cloned checkout)
pip install -e ".[dev,serve]"

Python (in-process)

from supercompress import compress_context, compare_policies

result = compress_context(
    "long context text…",
    "What does fetch return when the row is missing?",
    budget_ratio=0.35,
)
print(result.compressed_text)
print(f"{result.kv_savings_pct:.1f}% KV saved · {result.kept_tokens}/{result.original_tokens} tokens")
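The relationship between `budget_ratio` and the reported savings can be sketched as follows, assuming `kv_savings_pct` is simply the share of tokens dropped. That formula is an assumption (the README does not spell it out), and `savings_pct_from_counts` is a hypothetical helper, not part of the package.

```python
def savings_pct_from_counts(kept_tokens, original_tokens):
    """Percentage of tokens removed, assuming savings = 1 - kept / original."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 100.0 * (1.0 - kept_tokens / original_tokens)

original = 2000
kept = original * 35 // 100  # a 0.35 budget keeps 700 of 2000 tokens
print(f"{savings_pct_from_counts(kept, original):.1f}% KV saved · {kept}/{original} tokens")
```

Under this assumption a budget of 0.35 yields roughly 65% savings, which is consistent with the "~65%" row in the comparison table.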

Hosted API (recommended)

1. Get a key — dashboard → Create key → copy sc_live_…

2. Install & call (stdlib HTTP client — no local PyTorch needed for the API):

pip install git+https://github.com/arjunkshah/supercompress.git
export SUPERCOMPRESS_API_KEY=sc_live_YOUR_KEY

from supercompress import SuperCompress

sc = SuperCompress()  # reads SUPERCOMPRESS_API_KEY
out = sc.compress("long context…", "What does fetch return?")
print(out.compressed_text)  # send to your LLM

Or raw HTTP:

curl -X POST https://supercompress.vercel.app/api/v1/compress \
  -H "X-API-Key: sc_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"context":"…","query":"Summarize","budget_ratio":0.35}'
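The same request can be made from Python with only the standard library. This is a minimal sketch mirroring the curl call above: the URL, header names, and body fields are taken from that command, while the helper names are hypothetical and the response is assumed to be JSON, since its schema is not documented here.

```python
import json
import os
import urllib.request

API_URL = "https://supercompress.vercel.app/api/v1/compress"

def build_compress_request(context, query, budget_ratio=0.35, api_key=None):
    """Build (but do not send) the POST request shown in the curl example."""
    key = api_key or os.environ.get("SUPERCOMPRESS_API_KEY")
    if not key:
        raise RuntimeError("set SUPERCOMPRESS_API_KEY or pass api_key")
    body = json.dumps(
        {"context": context, "query": query, "budget_ratio": budget_ratio}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"X-API-Key": key, "Content-Type": "application/json"},
        method="POST",
    )

def compress_over_http(context, query, budget_ratio=0.35, api_key=None, timeout=30):
    """Send the request and return the parsed body (assumed to be JSON)."""
    req = build_compress_request(context, query, budget_ratio, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Usage (performs a network call, so it is left commented out):
# print(compress_over_http("long context…", "Summarize"))
```

Separating request construction from sending keeps the payload easy to inspect or unit-test without touching the network.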

On the live site, the dashboard hits the same origin — no SC_API_BASE config needed.

Local dev (no Firebase):

SC_AUTH_DEV=1 SC_KEY_STORE=memory python scripts/local_web_server.py
# → http://127.0.0.1:8790/dashboard

Deploy API (Docker / Fly.io / Render): see the API dashboard guide.

Browser demo

Open web/index.html or deploy the static web/ folder. Compression runs client-side — no API key required for the playground.

Documentation

Full docs: arjunkshah-supercompress-55.mintlify.app

Doc	Description
Quickstart	First compression in minutes
API reference	Python + HTTP endpoints
API dashboard	Keys, auth, usage
Integrations	OpenAI, LangChain, LlamaIndex
Environment	kWh / CO₂ methodology

Repo copies also live under docs/.

Benchmarks

python scripts/benchmark_web.py    # regenerates web/assets/data/benchmarks.json
python scripts/generate_charts.py  # SVG charts for landing page
pytest tests/ -q                   # 65 tests

Full benchmarks: supercompress.vercel.app/benchmarks

Policy comparison (8 seeds, budget 0.35):

Policy	Oracle recall	Entity recall	Latency
FIFO / Truncation	25%	73%	~57 ms
Summarization	61%	65%	~63 ms
H2O	98%	73%	~56 ms
SuperCompress	100%	73%	~60 ms
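As a reading aid for the table, here is one plausible way to compute an oracle-recall style metric. It assumes oracle recall means the fraction of lines an oracle marks as necessary for the answer that survive compression; that definition, the function name, and the example indices are illustrative assumptions, not taken from the benchmark scripts.

```python
def oracle_recall(kept_line_ids, oracle_line_ids):
    """Fraction of oracle-marked lines that are still present after compression."""
    oracle = set(oracle_line_ids)
    if not oracle:
        raise ValueError("oracle_line_ids must not be empty")
    return len(oracle & set(kept_line_ids)) / len(oracle)

# Illustrative line indices only, not benchmark data.
oracle_lines = [3, 4, 7, 12]
kept_by_truncation = [0, 1, 2, 12, 13]
kept_by_policy = [3, 4, 7, 12, 13]

print(oracle_recall(kept_by_truncation, oracle_lines))  # 0.25
print(oracle_recall(kept_by_policy, oracle_lines))      # 1.0
```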

Charts: web/assets/img/chart-kv-savings.svg, chart-oracle-recall.svg, chart-impact.svg

Project layout

supercompress/          # Core library (~5K-param policy, baselines)
  api/                  # Hosted API — keys, Firebase auth, usage
web/                    # Landing page + browser demo + dashboard
scripts/                # benchmark_web.py, local_web_server.py, charts
tests/                  # test_supercompress, test_api_hard, test_api_server
checkpoints/default.pt  # Trained weights (included)
docs/                   # API, integrations, environment, dashboard

Development

git clone https://github.com/arjunkshah/supercompress.git
cd supercompress
pip install -e ".[dev,serve]"
pytest tests/ -q
python scripts/local_web_server.py   # optional: /dashboard, /v1/compress

Optional extras:

pip install -e ".[firebase]"   # Firebase Admin for production key store

NOTE: I DO NOT GIVE AYUSH ROUT (github.com/ayushrout12) ANY PERMISSION TO COPY OR USE MY PRODUCT IN ANY WAY, SHAPE, OR FORM. I DO NOT GIVE HIM CONSENT TO FORK, REFERENCE, OR CLONE/REFERENCE/USE THIS REPO IN ANY WAY, SHAPE, OR FORM.

What we claim (and don't)

We claim: learned CPU eviction beats truncation on oracle recall at similar KV savings; documented environmental estimates; reproducible benchmarks and tests.

We don't claim: live datacenter metering; CO₂ numbers without documented assumptions; that every workload matches benchmark seeds.

License

MIT — see LICENSE.

Project details

This version: 0.5.0 (Jun 26, 2026)

Download files

Source Distribution

supercompress-0.5.0.tar.gz (36.3 kB)

Uploaded Jun 26, 2026 Source

Built Distribution

supercompress-0.5.0-py3-none-any.whl (33.3 kB)

Uploaded Jun 26, 2026 Python 3

File details

Details for the file supercompress-0.5.0.tar.gz.

File metadata

Download URL: supercompress-0.5.0.tar.gz
Upload date: Jun 26, 2026
Size: 36.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for supercompress-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`df0eac3b8593c3d8fdb1e1832af3b2204b593f2bcd21f232af9677a02ecb3aca`
MD5	`c820aa0dacfd4c5211a1fde1c967814d`
BLAKE2b-256	`6c7f5dc0b0414852917f98ddc39f331e7652f71a0d6ca7594e4901a0b3ead198`

File details

Details for the file supercompress-0.5.0-py3-none-any.whl.

File metadata

Download URL: supercompress-0.5.0-py3-none-any.whl
Upload date: Jun 26, 2026
Size: 33.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for supercompress-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5e423876eb416c5e833dd2e0fb39aa72d5fff5c5352836d073b027a5b78b525b`
MD5	`85de9003e334cd57a0c88c8bb8bb257c`
BLAKE2b-256	`82646186e86ffd983c9e3033ebc0c82b6da86b47da490173a4173399fc3261af`
