
Claw Compactor

14-Stage Fusion Pipeline for LLM Token Compression


15–82% compression depending on content · Zero LLM inference cost · Reversible · 1,676 tests

Architecture · Benchmarks · Quick Start · API


What is Claw Compactor?

Claw Compactor is an open-source LLM token compression engine built around a 14-stage Fusion Pipeline. Each stage is a specialized compressor — from AST-aware code analysis to JSON statistical sampling to simhash-based deduplication — chained through an immutable data flow architecture where each stage's output feeds the next.

Input
  |
  v
┌─────────────────────────────────────────────────────────────────────────┐
│                         FUSION PIPELINE                                 │
│                                                                         │
│  QuantumLock ─> Cortex ─> Photon ─> RLE ─> SemanticDedup ─> Ionizer    │
│       |            |         |        |          |              |        │
│   KV-cache    auto-detect  base64   path     simhash       JSON         │
│   alignment   16 languages  strip  shorten   dedup        sampling      │
│                                                                         │
│  ─> LogCrunch ─> SearchCrunch ─> DiffCrunch ─> StructuralCollapse      │
│        |              |              |                |                  │
│    log folding    result dedup   context fold    import merge            │
│                                                                         │
│  ─> Neurosyntax ─> Nexus ─> TokenOpt ─> Abbrev ─────────> Output       │
│        |             |          |           |                            │
│    AST compress   ML token   format     NL shorten                      │
│    (tree-sitter)  classify   optimize   (text only)                     │
│                                                                         │
│  [ RewindStore ] ── hash-addressed LRU for reversible retrieval         │
└─────────────────────────────────────────────────────────────────────────┘

Key design principles:

  • Immutable data flow — FusionContext is a frozen dataclass. Every stage produces a new FusionResult; nothing is mutated in-place.
  • Gate-before-compress — Each stage has should_apply() that inspects context type, language, and role before doing any work. Stages that don't apply are skipped at zero cost.
  • Content-aware routing — Cortex auto-detects content type (code, JSON, logs, diffs, search results) and language (Python, Go, Rust, TypeScript, etc.), then downstream stages make type-aware compression decisions.
  • Reversible compression — Ionizer stores originals in a hash-addressed RewindStore. The LLM can call a tool to retrieve any compressed section by its marker ID.
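
The first two principles can be sketched in a few lines. This is an illustrative toy, not the library's actual classes (the real FusionContext and FusionResult carry more fields), but the frozen-dataclass and should_apply() patterns work the same way:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Ctx:
    """Toy stand-in for FusionContext: frozen, so stages can't mutate it."""
    content: str
    content_type: str = "text"

class UppercaseStage:
    """Toy stage: gate first, then return a *new* context."""
    def should_apply(self, ctx: Ctx) -> bool:
        return ctx.content_type == "text"                    # gate-before-compress

    def apply(self, ctx: Ctx) -> Ctx:
        return replace(ctx, content=ctx.content.upper())     # old object untouched

stage = UppercaseStage()
ctx = Ctx("hello")
out = stage.apply(ctx) if stage.should_apply(ctx) else ctx
print(ctx.content, out.content)  # hello HELLO (the input context is unchanged)
```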

Benchmarks

Real-World Compression (FusionEngine v7 vs Legacy Regex)

Content Type          Legacy   FusionEngine   Improvement
Python source          7.3%        25.0%         3.4x
JSON (100 items)      12.6%        81.9%         6.5x
Build logs             5.5%        24.1%         4.4x
Agent conversation     5.7%        31.0%         5.4x
Git diff               6.2%        15.0%         2.4x
Search results         5.3%        40.7%         7.7x
Weighted average       9.2%        36.3%         3.9x

SWE-bench Real Tasks

Tested on real SWE-bench instances with actual repository code:

Instance                Size    Compression
django__django-11620     4.5K      14.5%
sympy__sympy-14396       5.5K      19.1%
scikit-learn-25747      11.8K      15.9%
scikit-learn-13554        73K      11.8%
scikit-learn-25308        81K      14.4%

vs LLMLingua-2 (ROUGE-L Fidelity)

Compression Rate    Claw Compactor   LLMLingua-2    Delta
0.3 (aggressive)         0.653           0.346     +88.2%
0.5 (balanced)           0.723           0.570     +26.8%

Claw Compactor preserves more semantic content at the same compression ratio, with zero LLM inference cost.
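
For reference, ROUGE-L measures fidelity via the longest common subsequence (LCS) between the original and compressed text. A minimal word-level sketch of the metric, not the project's evaluation harness:

```python
def lcs_len(a: list[str], b: list[str]) -> int:
    # Classic dynamic-programming longest common subsequence.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over whitespace-separated tokens."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

print(rouge_l("the cat sat on the mat", "the cat on the mat"))  # 0.9090...
```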


Quick Start

Install from PyPI

pip install claw-compactor

Or clone from source

git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor
pip install -e .

Run

# Benchmark your workspace (non-destructive)
claw-compactor benchmark /path/to/workspace

# Full compression pipeline
claw-compactor compress /path/to/workspace

Requirements: Python 3.9+. Optional: pip install claw-compactor[accurate] for exact token counts via tiktoken.


API

FusionEngine — Single Text

from scripts.lib.fusion.engine import FusionEngine

engine = FusionEngine()

result = engine.compress(
    text="def hello():\n    # greeting function\n    print('hello')",
    content_type="code",    # or let Cortex auto-detect
    language="python",      # optional hint
)

print(result["compressed"])     # compressed output
print(result["stats"])          # per-stage timing + token counts
print(result["markers"])        # Rewind markers for reversibility

FusionEngine — Chat Messages

messages = [
    {"role": "system", "content": "You are a coding assistant..."},
    {"role": "user", "content": "Fix the auth bug in login.py"},
    {"role": "assistant", "content": "I found the issue. Here's the fix:\n```python\n..."},
    {"role": "tool", "content": '{"results": [{"file": "login.py", ...}, ...]}'},
]

result = engine.compress_messages(messages)

# Cross-message dedup runs first, then per-message pipeline
print(result["stats"]["reduction_pct"])   # aggregate compression %
print(result["per_message"])              # per-message breakdown
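
Cross-message dedup (the SemanticDedup stage) is built on SimHash fingerprints. A minimal illustration of the underlying technique, not the stage's actual implementation:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    # Classic SimHash: every token casts a +/-1 vote on each fingerprint bit,
    # so near-duplicate texts usually land within a small Hamming distance.
    votes = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[: bits // 8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if votes[i] > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

dup = hamming(simhash("Fix the auth bug in login.py"),
              simhash("fix the auth bug in login.py"))
print(dup)  # 0: case differences normalize away, identical fingerprint
```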

Rewind — Reversible Retrieval

engine = FusionEngine(enable_rewind=True)
result = engine.compress(large_json, content_type="json")

# LLM sees compressed output with markers like [rewind:abc123...]
# When the LLM needs the original, it calls the Rewind tool:
original = engine.rewind_store.retrieve("abc123def456...")
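
A small helper for wiring Rewind into an agent loop. The [rewind:&lt;hash&gt;] marker shape below is inferred from the comment above; check the compressed output for the exact format your version emits:

```python
import re

# Assumed marker shape, based on the [rewind:abc123...] example above.
MARKER_RE = re.compile(r"\[rewind:([0-9a-f]+)\]")

def extract_marker_ids(compressed: str) -> list[str]:
    """Collect every Rewind marker ID embedded in compressed text."""
    return MARKER_RE.findall(compressed)

ids = extract_marker_ids("showing 3 of 500 rows [rewind:abc123def456]")
print(ids)  # ['abc123def456']
# Each ID can then be passed to engine.rewind_store.retrieve(...)
```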

Custom Stage

from scripts.lib.fusion.base import FusionStage, FusionContext, FusionResult

class MyStage(FusionStage):
    name = "my_compressor"
    order = 22  # runs between StructuralCollapse (20) and Neurosyntax (25)

    def should_apply(self, ctx: FusionContext) -> bool:
        return ctx.content_type == "log"

    def apply(self, ctx: FusionContext) -> FusionResult:
        compressed = my_compression_logic(ctx.content)
        return FusionResult(
            content=compressed,
            original_tokens=estimate_tokens(ctx.content),
            compressed_tokens=estimate_tokens(compressed),
        )

# Add to pipeline
pipeline = engine.pipeline.add(MyStage())

The 14 Stages

 #   Stage                Order   Purpose                                                                                        Applies To
 1   QuantumLock              3   Isolates dynamic content in system prompts to maximize KV-cache hit rate                      system messages
 2   Cortex                   5   Auto-detects content type and programming language (16 languages)                             untyped content
 3   Photon                   8   Detects and compresses base64-encoded images                                                  all
 4   RLE                     10   Path shorthand ($WS), IP prefix compression, enum compaction                                  all
 5   SemanticDedup           12   SimHash fingerprint deduplication across content blocks                                       all
 6   Ionizer                 15   JSON array statistical sampling with schema discovery + error preservation                    json
 7   LogCrunch               16   Folds repeated log lines with occurrence counts                                               log
 8   SearchCrunch            17   Deduplicates search/grep results                                                              search
 9   DiffCrunch              18   Folds unchanged context lines in git diffs                                                    diff
10   StructuralCollapse      20   Merges import blocks, collapses repeated assertions/patterns                                  code
11   Neurosyntax             25   AST-aware code compression via tree-sitter (safe regex fallback); never shortens identifiers  code
12   Nexus                   35   ML token-level compression (stopword removal fallback without model)                          text
13   TokenOpt                40   Tokenizer format optimization: strips bold/italic markers, normalizes whitespace              all
14   Abbrev                  45   Natural language abbreviation; only fires on text, never code, JSON, or structured data       text

Each stage is independent and stateless. Stages communicate only through the immutable FusionContext that flows forward through the pipeline.
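
The execution model reduces to a fold over order-sorted stages. A schematic version with toy types (the real pipeline also collects per-stage timing and token stats):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Ctx:
    content: str
    content_type: str = "text"

def run_pipeline(stages, ctx: Ctx) -> Ctx:
    # Stages run in ascending `order`; gated-out stages cost nothing.
    for stage in sorted(stages, key=lambda s: s.order):
        if stage.should_apply(ctx):
            ctx = replace(ctx, content=stage.apply(ctx))
    return ctx

class StripBlankLines:
    order = 10
    def should_apply(self, ctx): return True
    def apply(self, ctx):
        return "\n".join(line for line in ctx.content.splitlines() if line.strip())

class LogFolder:
    order = 20
    def should_apply(self, ctx): return ctx.content_type == "log"  # gated out for text
    def apply(self, ctx): return ctx.content.upper()

out = run_pipeline([LogFolder(), StripBlankLines()], Ctx("a\n\nb"))
print(out.content)  # "a\nb": blank line stripped, LogFolder never ran
```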


Workspace Commands

python3 scripts/mem_compress.py <workspace> <command> [options]
Command     Description
full        Run complete compression pipeline
benchmark   Dry-run compression report
compress    Rule-based compression only
dict        Dictionary encoding with auto-learned codebook
observe     Session transcript JSONL to structured observations
tiers       Generate L0/L1/L2 tiered summaries
dedup       Cross-file duplicate detection
estimate    Token count report
audit       Workspace health check
optimize    Tokenizer-level format optimization
auto        Watch mode: compress on file changes

Options: --json, --dry-run, --since YYYY-MM-DD, --quiet


Architecture

See ARCHITECTURE.md for the full technical deep-dive:

  • Immutable data flow design
  • Stage execution model and gating
  • Rewind reversible compression protocol
  • Cross-message semantic deduplication
  • How to extend the pipeline
12,000+ lines of Python  ·  1,676 tests  ·  14 fusion stages  ·  0 external ML dependencies

Installation

# Clone
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor

# Optional: exact token counting
pip install tiktoken

# Optional: AST-aware code compression (Neurosyntax)
pip install tree-sitter-language-pack

# Development
pip install -e ".[dev,accurate]"

Zero required dependencies. tiktoken and tree-sitter are optional enhancements — the pipeline runs with built-in heuristic fallbacks for both.
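
The optional-dependency pattern looks roughly like this. estimate_tokens is an illustrative name, and the ~4 characters per token heuristic is a common rule of thumb; the library's actual fallback may differ:

```python
try:
    import tiktoken
    _enc = tiktoken.get_encoding("cl100k_base")

    def estimate_tokens(text: str) -> int:
        return len(_enc.encode(text))     # exact count when tiktoken is installed
except ImportError:
    def estimate_tokens(text: str) -> int:
        return max(1, len(text) // 4)     # heuristic fallback: ~4 chars per token
```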


Project Stats

Metric                    Value
Tests                     1,676 passed
Python source             12,000+ lines
Fusion stages             14
Languages detected        16
Required dependencies     0
Compression (code)        15–25%
Compression (JSON peak)   81.9%
ROUGE-L @ 0.3 rate        0.653
License                   MIT


License

MIT

