Shrink LLM context windows by removing noise, redundancy, and long-tail detail — without losing signal.
Context Compressor
Shrink LLM context windows by removing noise, redundancy, and long-tail detail without losing the signal. Typically 40–80% fewer tokens on noisy, repetitive input, and as little as 16% in the bundled benchmarks when the input has less repetition (see Benchmarks).
Long agent loops, verbose tool output, and 10,000-line security scans blow past even a 200K context window — and every redundant token you send costs latency and money. Context Compressor is a small, fast, rule-based pipeline that strips the fat out of any text before it reaches the model.
from context_compressor import ContextCompressor
compressor = ContextCompressor()
result = compressor.compress(noisy_log_text)
print(result.stats.summary()) # 136 -> 65 tokens (52.2% smaller, backend=tiktoken)
print(result.compressed) # the cleaned text, ready to send to your LLM
Why
| Problem | What happens | What this does |
|---|---|---|
| Context overflow | Multi-turn agents accumulate history until the window overflows and the run breaks. | Collapse repeated turns and boilerplate phrasing. |
| Verbose tool output | A vuln scan or SELECT * dumps thousands of near-identical lines, 90% noise. | Drop noise, dedupe rows, trim long-tail detail. |
| Token cost | Every wasted token is latency + dollars on every call. | 40–80% token reduction on noisy input, measured with tiktoken. |
It works on anything text: chat transcripts, application logs, JSON blobs, SQL result dumps, and security scanner output.
Highlights
- Zero required dependencies. Pure Python standard library. `tiktoken` is optional; without it a built-in heuristic counter is used automatically.
- Lossless-leaning by default. Removals are high-precision; counts are preserved (`port 22 open [x3]`) rather than silently dropped.
- Composable pipeline. Toggle each stage, tune thresholds, or add your own noise patterns via plain dataclass config.
- Measured, not guessed. Every run returns before/after token counts and a per-stage breakdown.
- Security-aware. A dedicated summarizer turns raw scanner output into a severity-ranked brief.
- CLI included. `cat scan.log | context-compress --stats`.
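The optional-dependency behaviour in the first highlight could look roughly like the sketch below. This is an assumption about the general approach, not the library's actual `TokenCounter`: the helper name `count_tokens` and the "about 4 characters per token" heuristic are illustrative, while `tiktoken.get_encoding("cl100k_base")` is a real tiktoken call.

```python
import math

try:
    import tiktoken  # optional dependency

    _ENCODING = tiktoken.get_encoding("cl100k_base")
except Exception:  # tiktoken missing, or the encoding could not be loaded
    _ENCODING = None


def count_tokens(text: str) -> tuple[int, str]:
    """Return (token_count, backend_name). Hypothetical helper, not the library API."""
    if _ENCODING is not None:
        return len(_ENCODING.encode(text)), "tiktoken"
    # Assumed heuristic: roughly 4 characters per token for English-like text.
    return math.ceil(len(text) / 4), "heuristic"


count, backend = count_tokens("port 22 open on 10.0.0.5")
print(f"{count} tokens (backend={backend})")
```

Returning the backend name alongside the count makes it obvious in the metrics whether a number is exact or estimated.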
Install
pip install llm-context-compressor # zero dependencies
pip install "llm-context-compressor[tiktoken]" # exact OpenAI/Anthropic-style token counts
The install name is `llm-context-compressor`; the import is `import context_compressor` (like `scikit-learn` → `import sklearn`).
Or from source:
git clone https://github.com/uninhibited-scholar/context-compressor
cd context-compressor
pip install -e ".[dev]"
pytest
How it works
The pipeline runs cheap, high-precision stages first, then optionally falls back
to extractive summarization only if a target_ratio is requested and the rules
didn't get there:
raw text
│
▼ 1. NoiseFilter drop timestamps, progress bars, status chatter, separators
▼ 2. DetailTrimmer shorten long strings, cap JSON depth, collapse table runs, strip log metadata
▼ 3. PatternRemover collapse repeated agent phrasing ("as I mentioned…")
▼ 4. RedundancyFilter dedupe identical/near-identical lines, keep counts
▼ 5. ExtractiveSummarizer (optional) TextRank-style, only if still over target
│
▼
compressed text + full token/stage metrics
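The staged design above can be sketched as a list of plain text-to-text functions applied in order, with characters removed recorded per stage. The two toy stages, their regexes, and the `run_pipeline` helper are assumptions made for illustration; they are not the library's `NoiseFilter` or `RedundancyFilter`.

```python
import re
from collections import Counter
from typing import Callable

Stage = Callable[[str], str]

# Assumed noise rules for illustration: separator lines and progress bars.
NOISE = [
    re.compile(r"^[-=*_]{3,}\s*$"),
    re.compile(r"^\s*\[[#=>\-. ]+\]\s*\d{1,3}%\s*$"),
]


def drop_noise(text: str) -> str:
    """Remove lines that match any noise pattern."""
    kept = [ln for ln in text.splitlines() if not any(p.search(ln) for p in NOISE)]
    return "\n".join(kept)


def dedupe_with_counts(text: str) -> str:
    """Keep the first occurrence of each line and append [xN] when it repeats."""
    lines = text.splitlines()
    counts = Counter(lines)
    seen: set[str] = set()
    out = []
    for ln in lines:
        if ln in seen:
            continue
        seen.add(ln)
        out.append(f"{ln} [x{counts[ln]}]" if counts[ln] > 1 and ln.strip() else ln)
    return "\n".join(out)


def run_pipeline(text: str, stages: list[Stage]) -> tuple[str, list[tuple[str, int]]]:
    """Apply stages in order and record characters removed per stage."""
    report = []
    for stage in stages:
        before = len(text)
        text = stage(text)
        report.append((stage.__name__, before - len(text)))
    return text, report


raw = "=====\nport 22 open\nport 22 open\n[####    ] 50%\nport 22 open\nport 80 open"
compressed, report = run_pipeline(raw, [drop_noise, dedupe_with_counts])
print(compressed)  # port 22 open [x3] / port 80 open
print(report)
```

Running the cheap line-level rules first means any later, more expensive stage sees less text.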
Usage
Presets
from context_compressor import ContextCompressor, CompressionConfig
ContextCompressor(CompressionConfig.conservative()) # safe, lossless-ish
ContextCompressor() # balanced default
ContextCompressor(CompressionConfig.aggressive()) # smallest output
Hit a target size
cfg = CompressionConfig(target_ratio=0.3) # aim for 30% of the original
result = ContextCompressor(cfg).compress(long_transcript)
Tune any stage
from context_compressor import CompressionConfig, NoiseConfig
cfg = CompressionConfig()
cfg.noise.drop_log_levels = True
cfg.redundancy.near_duplicate = True
cfg.trim.max_list_items = 10
cfg.noise.extra_patterns.append((r"^TRACE:.*$", "trace_line")) # your own rule
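For intuition, here is one way a list of `(regex, label)` pairs like the `extra_patterns` entry above might be applied: drop every matching line and count removals per label. The `apply_patterns` function and the `blank_line` rule are hypothetical; how the library consumes these pairs internally is not shown in this README.

```python
import re
from collections import Counter

# (pattern, label) pairs in the same shape as the extra_patterns entry above.
extra_patterns = [(r"^TRACE:.*$", "trace_line"), (r"^\s*$", "blank_line")]


def apply_patterns(text: str, patterns: list[tuple[str, str]]) -> tuple[str, Counter]:
    """Drop every line matching a pattern and count removals per label."""
    compiled = [(re.compile(rx), label) for rx, label in patterns]
    removed: Counter = Counter()
    kept = []
    for line in text.splitlines():
        # First matching rule wins; None means the line is kept.
        label = next((lb for rx, lb in compiled if rx.search(line)), None)
        if label is None:
            kept.append(line)
        else:
            removed[label] += 1
    return "\n".join(kept), removed


text = "TRACE: enter handler\nGET /health 200\n\nTRACE: exit handler\nGET /login 500"
cleaned, removed = apply_patterns(text, extra_patterns)
print(cleaned)
print(dict(removed))
```

Attaching a label to each rule is what makes a per-rule breakdown possible in the metrics.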
Security scan brief
brief = ContextCompressor().compress_security_scan(nessus_output, examples_per_type=3)
print(brief)
【Security Scan Summary】
Detected 11 findings across 6 categories.
🔴 SQL Injection [CRITICAL]: 2 (CVE-2024-2117)
1. https://shop.local/search?q=test
2. https://shop.local/item?id=42
🟠 Cross-Site Scripting (XSS) [HIGH]: 2
...
🟢 Open Port / Service [LOW]: 3
1. 10.0.0.5
… and 2 more
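A brief shaped like the one above can be produced by grouping findings by category, sorting groups by severity, and capping the examples shown per group. The sketch below assumes findings are already parsed into `(category, severity, location)` tuples; the `build_brief` function, the severity ordering table, and the extra sample addresses are illustrative and not taken from the library's `SecuritySummarizer`.

```python
from collections import defaultdict

SEVERITY_ORDER = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def build_brief(findings: list[tuple[str, str, str]], examples_per_type: int = 3) -> str:
    """findings are (category, severity, location) tuples; returns a ranked text brief."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for category, severity, location in findings:
        groups[(category, severity)].append(location)

    lines = [f"Detected {len(findings)} findings across {len(groups)} categories."]
    # Most severe first; unknown severities sort last, ties break on category name.
    ranked = sorted(groups.items(), key=lambda kv: (SEVERITY_ORDER.get(kv[0][1], 99), kv[0][0]))
    for (category, severity), locations in ranked:
        lines.append(f"{category} [{severity}]: {len(locations)}")
        for i, loc in enumerate(locations[:examples_per_type], start=1):
            lines.append(f"  {i}. {loc}")
        hidden = len(locations) - examples_per_type
        if hidden > 0:
            lines.append(f"  ... and {hidden} more")
    return "\n".join(lines)


# Sample input for illustration only.
findings = [
    ("Open Port / Service", "LOW", "10.0.0.5"),
    ("SQL Injection", "CRITICAL", "https://shop.local/search?q=test"),
    ("Open Port / Service", "LOW", "10.0.0.6"),
    ("SQL Injection", "CRITICAL", "https://shop.local/item?id=42"),
    ("Open Port / Service", "LOW", "10.0.0.7"),
]
print(build_brief(findings, examples_per_type=1))
```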
Command line
context-compress scan.log --stats
cat transcript.txt | context-compress --preset aggressive
context-compress nessus.txt --security
context-compress big.log --target 0.3 > small.log
Compressed text goes to stdout; metrics go to stderr, so pipes stay clean.
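The stdout/stderr split is a common CLI convention and is easy to reproduce in your own wrappers. The sketch below is a generic illustration, not the tool's source: `emit` is a hypothetical helper, and the "compression" here just drops blank lines.

```python
import sys
from typing import TextIO


def emit(text: str, compressed: str, show_stats: bool,
         out: TextIO = sys.stdout, err: TextIO = sys.stderr) -> None:
    """Write the payload to `out` and, optionally, metrics to `err`."""
    out.write(compressed + "\n")
    if show_stats:
        saved = 100 * (1 - len(compressed) / len(text)) if text else 0.0
        # Metrics stay off stdout so a shell redirect captures only the payload.
        err.write(f"{len(text)} -> {len(compressed)} chars ({saved:.1f}% smaller)\n")


raw = "INFO ok\n\n\nINFO ok\nERROR disk full\n"
# Stand-in for real compression: drop blank lines only.
small = "\n".join(ln for ln in raw.splitlines() if ln.strip())
emit(raw, small, show_stats=True)
```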
Reading the metrics
result = compressor.compress(text)
s = result.stats
s.original_tokens # 136
s.compressed_tokens # 65
s.reduction_pct # 52.2
s.token_backend # "tiktoken" or "heuristic"
for stage in s.stages:
    print(stage.name, stage.chars_removed, stage.details)
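The figures above are consistent with reduction being the share of original tokens removed, rounded to one decimal place. The helper below is a worked check under that assumption; it is not the library's implementation of `reduction_pct`.

```python
def reduction_pct(original_tokens: int, compressed_tokens: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    if original_tokens <= 0:
        return 0.0
    return round(100 * (original_tokens - compressed_tokens) / original_tokens, 1)


print(reduction_pct(136, 65))        # 52.2, the quick-start example
print(reduction_pct(13_479, 7_864))  # 41.7, the application-log benchmark row
```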
Benchmarks
Reproducible with python benchmarks/benchmark.py (token counts via tiktoken):
| Dataset | Tokens before | Tokens after | Reduction | Time |
|---|---|---|---|---|
| Application log | 13,479 | 7,864 | 41.7% | 30 ms |
| Security scan | 4,625 | 3,863 | 16.5% | 11 ms |
| Agent transcript | 3,172 | 2,665 | 16.0% | 8 ms |
| JSON result dump | 1,124 | 220 | 80.4% | 1 ms |
Reduction scales with how repetitive the input is — heavily duplicated logs and
scan output compress much further than already-unique prose. See
benchmarks/BENCHMARKS.md for the chart.
JSON blobs
result = compressor.compress_json(huge_json_string) # caps depth, lists, long strings
print(result.compressed)
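To make "caps depth, lists, long strings" concrete, here is a minimal recursive trimmer built on the standard `json` module. The `trim` function, its parameter names, and its placeholder strings are assumptions for illustration; the library's `compress_json` may behave differently.

```python
import json
from typing import Any


def trim(value: Any, max_depth: int = 3, max_items: int = 5, max_str: int = 40) -> Any:
    """Recursively cap nesting depth, list length, and string length."""
    if isinstance(value, str):
        return value if len(value) <= max_str else value[:max_str] + "..."
    if isinstance(value, (dict, list)) and max_depth == 0:
        # Depth budget exhausted: replace the container with a short placeholder.
        return f"<{type(value).__name__} with {len(value)} items>"
    if isinstance(value, dict):
        return {k: trim(v, max_depth - 1, max_items, max_str) for k, v in value.items()}
    if isinstance(value, list):
        head = [trim(v, max_depth - 1, max_items, max_str) for v in value[:max_items]]
        if len(value) > max_items:
            head.append(f"... {len(value) - max_items} more items")
        return head
    return value  # numbers, booleans, None pass through unchanged


doc = {
    "rows": [{"id": i, "note": "x" * 100} for i in range(50)],
    "meta": {"a": {"b": {"c": 1}}},
}
small = trim(doc)
print(json.dumps(small, indent=2))
```

Leaving a placeholder where content was cut keeps the count of what was removed visible to the model.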
RAG: LangChain & LlamaIndex
Drop-in adapters compress retrieved chunks before they reach the model. They are dependency-free (they duck-type the document objects), so installing this package never pulls in either framework.
# LangChain — implements the BaseDocumentTransformer interface
from context_compressor.integrations import CompressorDocumentTransformer
transformer = CompressorDocumentTransformer()
smaller_docs = transformer.transform_documents(retrieved_docs)
# each doc.metadata["compression"] now records the token savings
# LlamaIndex — works on nodes / Documents
from context_compressor.integrations import compress_nodes
nodes = compress_nodes(retriever.retrieve("my query"))
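Duck typing is what lets an adapter stay dependency-free: it only relies on attributes the objects expose. The sketch below assumes documents have `page_content` and `metadata` attributes, as LangChain `Document` objects do. `compress_documents`, `FakeDoc`, and `squeeze` are hypothetical, and the metadata here records character counts rather than the token savings the real adapter reports.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Iterable


def compress_documents(docs: Iterable[Any], compress: Callable[[str], str]) -> list[Any]:
    """Compress any object exposing `page_content` and `metadata`; no framework import."""
    out = []
    for doc in docs:
        new_doc = copy.copy(doc)  # shallow copy keeps the original type
        before = len(doc.page_content)
        new_doc.page_content = compress(doc.page_content)
        new_doc.metadata = {
            **doc.metadata,
            "compression": {"chars_before": before, "chars_after": len(new_doc.page_content)},
        }
        out.append(new_doc)
    return out


@dataclass
class FakeDoc:  # stand-in with the same two attributes as a LangChain Document
    page_content: str
    metadata: dict = field(default_factory=dict)


def squeeze(text: str) -> str:
    """Toy compressor: collapse runs of whitespace."""
    return " ".join(text.split())


original = FakeDoc("a   b \n\n c", {"source": "kb"})
docs = compress_documents([original], squeeze)
print(docs[0].page_content, docs[0].metadata)
```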
Integrating with an agent loop
from context_compressor import ContextCompressor, CompressionConfig
compressor = ContextCompressor(CompressionConfig(target_ratio=0.4))
def before_model_call(history: str) -> str:
    # Compress accumulated context before each turn to stay under the window.
    return compressor.compress(history).compressed
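One possible refinement of the hook above is to compress only when the history exceeds a token budget, so short conversations pass through untouched. The sketch below takes the compressor and counter as plain callables so it runs on its own; `make_guard`, `rough_tokens`, and `drop_repeats` are hypothetical stand-ins rather than library functions.

```python
from typing import Callable


def make_guard(compress: Callable[[str], str],
               count_tokens: Callable[[str], int],
               budget: int) -> Callable[[str], str]:
    """Return a hook that compresses history only when it exceeds the token budget."""
    def before_model_call(history: str) -> str:
        if count_tokens(history) <= budget:
            return history  # under budget: send unchanged, skip the work
        return compress(history)
    return before_model_call


# Toy stand-ins so the sketch runs without the library installed.
def rough_tokens(text: str) -> int:
    return len(text.split())


def drop_repeats(text: str) -> str:
    # dict.fromkeys keeps the first occurrence of each line, in order.
    return "\n".join(dict.fromkeys(text.splitlines()))


guard = make_guard(drop_repeats, rough_tokens, budget=6)
print(guard("user: hi\nbot: hello"))
print(guard("bot: retrying now\nbot: retrying now\nbot: retrying now\nuser: stop"))
```

With the real library, the `compress` argument could be `lambda h: compressor.compress(h).compressed`, using the compressor from the snippet above.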
API at a glance
| Object | Purpose |
|---|---|
| `ContextCompressor` | The pipeline. `.compress(text) -> CompressionResult`. |
| `CompressionConfig` | All knobs; `.aggressive()` / `.conservative()` presets. |
| `CompressionResult` | `.compressed` text + `.stats`. |
| `NoiseFilter`, `RedundancyFilter`, `DetailTrimmer`, `PatternRemover` | Stages, usable standalone. |
| `ExtractiveSummarizer` | Dependency-free TextRank-style summarizer. |
| `SecuritySummarizer` | Scanner output → severity-ranked brief. |
| `TokenCounter` | tiktoken-backed counter with heuristic fallback. |
Development
pip install -e ".[dev]"
pytest --cov=context_compressor # 24 tests, ~94% coverage
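Summary (translated from Chinese)
Context Compressor is a zero-dependency Python library that compresses context before text is sent to a large language model: it removes noise (timestamps, progress bars, status messages), merges duplicate lines, trims long strings and deeply nested JSON, collapses repetitive agent phrasing, and optionally applies extractive summarization. It typically cuts 50–80% of tokens, and with tiktoken it gives exact counts aligned with OpenAI/Anthropic. A built-in summarizer dedicated to network security scan results condenses scan logs of tens of thousands of lines into a brief ranked by risk level.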
License