sageRefiner
Intelligent Context Compression Library for LLM Systems
sageRefiner provides state-of-the-art context compression algorithms to reduce token usage while maintaining semantic quality for Large Language Model applications.
Features
- 8 Compression Algorithms
  - LongRefiner: LLM-based selective compression with importance scoring
  - REFORM: Attention-based compression with KV cache optimization
  - Provence: Sentence-level pruning using DeBERTa reranker
  - LLMLingua2: Fast BERT-based token classification
  - LongLLMLingua: Question-aware perplexity-based compression
  - RECOMP-Abstractive: T5-based summarization
  - RECOMP-Extractive: BERT-based sentence selection
  - EHPC: Evaluator Heads based efficient compression
- High Compression Ratios: 2-10x compression while preserving key information
- Flexible Configuration: YAML/dict-based configuration
- Production Ready: Battle-tested in the SAGE framework
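To make the "2-10x" figure concrete, here is a minimal sketch of how a compression ratio relates to token counts. The helper names `compression_ratio` and `tokens_saved_fraction` are illustrative and are not part of sageRefiner; the token counts are made-up inputs, not benchmark results.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Return how many times smaller the compressed context is (10.0 means 10x)."""
    if original_tokens <= 0 or compressed_tokens <= 0:
        raise ValueError("token counts must be positive")
    return original_tokens / compressed_tokens


def tokens_saved_fraction(original_tokens: int, compressed_tokens: int) -> float:
    """Return the fraction of tokens removed (0.9 means 90% were dropped)."""
    return 1.0 - 1.0 / compression_ratio(original_tokens, compressed_tokens)


# Hypothetical example: a 5,000-token context squeezed into a 500-token budget.
ratio = compression_ratio(5000, 500)
saved = tokens_saved_fraction(5000, 500)
print(f"{ratio:.1f}x, {saved:.0%} of tokens removed")
```

Read this way, the low end of the range (2x) removes half of the tokens and the high end (10x) removes nine tenths.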
Installation
# Basic installation
pip install isage-refiner
# With benchmark support
pip install "isage-refiner[benchmark]"
# Development mode
git clone https://github.com/intellistream/sageRefiner.git
cd sageRefiner
pip install -e .
Quick Start
from sage_refiner import LLMLingua2Compressor
# Initialize compressor
compressor = LLMLingua2Compressor()
# Compress context
result = compressor.compress(
    context="Your long document text here...",
    question="What is the main topic?",
    target_token=500,
)
print(f"Compression rate: {result['compression_rate']:.2%}")
print(f"Compressed: {result['compressed_context']}")
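When a context already fits the budget, compressing it may be unnecessary. The sketch below wraps any compress callable with that check. It assumes only what the Quick Start shows: keyword arguments `context`, `question`, `target_token`, and a result dict with a `compressed_context` key. The names `maybe_compress`, `count_tokens_whitespace`, and `fake_compress` are hypothetical, and whitespace splitting is a rough stand-in for a real tokenizer.

```python
from typing import Callable, Dict


def count_tokens_whitespace(text: str) -> int:
    """Rough token count via whitespace splitting (an assumption, not a real tokenizer)."""
    return len(text.split())


def maybe_compress(
    context: str,
    question: str,
    target_token: int,
    compress_fn: Callable[..., Dict],
    count_tokens: Callable[[str], int] = count_tokens_whitespace,
) -> str:
    """Return the context unchanged if it fits the budget, else the compressed text."""
    if count_tokens(context) <= target_token:
        return context
    result = compress_fn(context=context, question=question, target_token=target_token)
    return result["compressed_context"]


def fake_compress(context: str, question: str, target_token: int) -> Dict:
    """Stand-in compressor for demonstration only: keeps the first `target_token` words."""
    words = context.split()[:target_token]
    return {"compressed_context": " ".join(words)}


short = maybe_compress("a b c", "q?", 5, fake_compress)
long = maybe_compress("a b c d e f g h", "q?", 3, fake_compress)
print(short)
print(long)
```

With the library installed, `compress_fn` would presumably be `compressor.compress` from the Quick Start, provided its return value has the shape shown there.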
Algorithms
| Algorithm | Model | Best For |
|---|---|---|
| LongRefiner | Qwen/Llama + LoRA | High-quality semantic compression |
| REFORM | Any Llama/Qwen | Fast attention-based selection |
| Provence | DeBERTa | Document-level filtering |
| LLMLingua2 | BERT | Speed-critical applications |
| LongLLMLingua | GPT-2/Llama | Long document scenarios |
| RECOMP-Abstractive | T5 | Summarization-style compression |
| RECOMP-Extractive | BERT | Sentence extraction |
| EHPC | Llama-8B | Evaluator heads selection |
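The table can also be kept as a small lookup structure, for example to list which algorithms fit a model family you already have loaded. This is an illustrative sketch built only from the table contents; `ALGORITHMS` and `algorithms_for_model` are hypothetical names and are not exported by sageRefiner.

```python
from typing import Dict, List, NamedTuple


class AlgorithmInfo(NamedTuple):
    model: str
    best_for: str


# Transcribed from the table above; keys are display names, not API identifiers.
ALGORITHMS: Dict[str, AlgorithmInfo] = {
    "LongRefiner": AlgorithmInfo("Qwen/Llama + LoRA", "High-quality semantic compression"),
    "REFORM": AlgorithmInfo("Any Llama/Qwen", "Fast attention-based selection"),
    "Provence": AlgorithmInfo("DeBERTa", "Document-level filtering"),
    "LLMLingua2": AlgorithmInfo("BERT", "Speed-critical applications"),
    "LongLLMLingua": AlgorithmInfo("GPT-2/Llama", "Long document scenarios"),
    "RECOMP-Abstractive": AlgorithmInfo("T5", "Summarization-style compression"),
    "RECOMP-Extractive": AlgorithmInfo("BERT", "Sentence extraction"),
    "EHPC": AlgorithmInfo("Llama-8B", "Evaluator heads selection"),
}


def algorithms_for_model(keyword: str) -> List[str]:
    """Return algorithm names whose model column contains `keyword` (case-insensitive)."""
    keyword = keyword.lower()
    return [name for name, info in ALGORITHMS.items() if keyword in info.model.lower()]


print(algorithms_for_model("t5"))
print(algorithms_for_model("llama"))
```

Note that the match is a plain substring test, so a keyword such as "bert" also matches "DeBERTa".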
Configuration
from sage_refiner import RefinerConfig
config = RefinerConfig(
    algorithm="llmlingua2",
    target_token=500,
    force_tokens=["important", "keyword"],
)
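For dict-based configuration, it can help to validate the dict before building a config object. The sketch below uses a stand-in dataclass, `SketchConfig`, that mirrors only the three fields shown above; it is not `RefinerConfig`, and whether `RefinerConfig` accepts a dict directly or performs similar validation is not stated here. `config_from_dict` is likewise a hypothetical helper.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class SketchConfig:
    """Hypothetical stand-in mirroring the three fields in the example above."""
    algorithm: str = "llmlingua2"
    target_token: int = 500
    force_tokens: List[str] = field(default_factory=list)


def config_from_dict(raw: Dict[str, Any]) -> SketchConfig:
    """Build a config from a plain dict, rejecting unknown keys and non-positive budgets."""
    known = {f.name for f in fields(SketchConfig)}
    unknown = set(raw) - known
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    cfg = SketchConfig(**raw)
    if cfg.target_token <= 0:
        raise ValueError("target_token must be positive")
    return cfg


cfg = config_from_dict(
    {
        "algorithm": "llmlingua2",
        "target_token": 500,
        "force_tokens": ["important", "keyword"],
    }
)
print(cfg)
```

Rejecting unknown keys early catches typos such as `target_tokens` before any model is loaded.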
Examples
# Basic compression
python examples/basic_compression.py
# Compare algorithms
python examples/algorithm_comparison.py
Requirements
- Python 3.11+
- PyTorch 2.0+
- Transformers 4.43+
Benchmarking
The benchmarks module provides a comprehensive evaluation framework for all compression
algorithms:
# Quick comparison of multiple algorithms
pip install "isage-refiner[benchmark]"
sage-refiner-bench compare \
  --algorithms baseline,longrefiner,reform,provence \
  --samples 100
# Detailed evaluation with budget sweep
sage-refiner-bench sweep \
  --algorithm longrefiner \
  --budgets 512,1024,2048,4096
For detailed benchmarking documentation, see benchmarks/README.md and benchmarks/STRUCTURE.md.
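If you drive the sweep from a script, the command line can be assembled programmatically. This sketch uses only the `sweep` subcommand and the `--algorithm` and `--budgets` flags shown above; `build_sweep_command` is a hypothetical helper, and the code only builds and prints the command without running it.

```python
import shlex
from typing import List, Sequence


def build_sweep_command(algorithm: str, budgets: Sequence[int]) -> List[str]:
    """Return the argument list for a budget sweep, using only the flags shown above."""
    if not budgets:
        raise ValueError("at least one budget is required")
    if any(b <= 0 for b in budgets):
        raise ValueError("budgets must be positive token counts")
    return [
        "sage-refiner-bench",
        "sweep",
        "--algorithm",
        algorithm,
        "--budgets",
        ",".join(str(b) for b in budgets),
    ]


cmd = build_sweep_command("longrefiner", [512, 1024, 2048, 4096])
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the benchmark extra is installed.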
Citation
@software{sageRefiner2025,
title = {sageRefiner: Context Compression for LLM},
author = {SAGE Team},
year = {2025},
url = {https://github.com/intellistream/sageRefiner}
}
License
Apache License 2.0
Links
- SAGE Framework: https://github.com/intellistream/SAGE
- Issues: https://github.com/intellistream/sageRefiner/issues
Documentation & Development
- Development Guide - Setup, workflow, and guidelines
- Pre-commit Hooks - Setup and usage guide
- CI/CD Pipeline - Details of the automated checks
For quick setup:
bash utils/installation/quickstart.sh # Full installation
bash utils/hooks/setup-hooks.sh # Setup pre-commit hooks