Skip to main content

Intelligent token reduction library for LLM applications with context-aware compression

Project description

Token Reducer

Intelligent token reduction library for LLM applications with advanced semantic and structural compression

Python 3.9+ PyPI version License: MIT Code style: black

Overview

Token Reducer is a Python library designed to reduce token counts in text and code inputs for Large Language Model (LLM) applications while preserving semantic meaning, logical structure, and task-relevant information. Achieve 50-75% token reduction with advanced semantic compression techniques.

What's New in v0.2.0

  • Semantic Text Compression (68%+ reduction) - Entity abstraction, proposition extraction, hierarchical summarization
  • AST-Based Code Compression (73%+ reduction) - Safe Python code minification using Abstract Syntax Trees
  • Domain-Specific Handlers (74%+ reduction) - Specialized compression for logs, transcripts, legal documents
  • Enhanced Configuration - 8 new advanced parameters for fine-grained control
  • Reversibility Mappings - Optional entity/variable restoration
  • Progressive Compression - Compress to specific token budgets

Key Features

  • Context-Aware Compression: Task-specific strategies for summarization, RAG, extraction, reasoning, and more
  • Multi-Level Compression: Choose between light (5-15%), moderate (20-40%), or aggressive (50-75%) reduction
  • Multi-Stage Pipeline: 6-stage intelligent compression (Identification → Segmentation → Redundancy Removal → Semantic Compression → Optimization → Reversibility)
  • Text & Code Support: Domain-specific compression for natural language and source code
  • Tokenizer Agnostic: Works with OpenAI, Anthropic, HuggingFace, and custom tokenizers
  • Fail-Safe Mode: Automatic quality validation with semantic similarity checking
  • High Performance: <100ms per 1000 tokens for text, <200ms for code
  • Offline Operation: No cloud dependencies, works standalone with intelligent fallbacks

Installation

Basic Installation

pip install token-reducer

With Optional Dependencies

# For advanced NLP features (entity extraction, NER)
pip install token-reducer[nlp]

# For semantic similarity (embeddings-based deduplication)
pip install token-reducer[similarity]

# For HuggingFace tokenizers
pip install token-reducer[transformers]

# For Anthropic tokenizers
pip install token-reducer[anthropic]

# Install everything
pip install token-reducer[all]

Quick Start

Basic Text Compression

from token_reducer import compress_text, TaskContext, CompressionLevel

# Compress text for summarization task
result = compress_text(
    text="Your long article text here...",
    task=TaskContext.SUMMARIZATION,
    level=CompressionLevel.MODERATE,
    tokenizer="gpt-4"
)

print(f"Original: {result.original_tokens} tokens")
print(f"Compressed: {result.compressed_tokens} tokens")
print(f"Reduction: {result.reduction_percentage}%")
print(f"\nCompressed text:\n{result.compressed_text}")

Advanced Semantic Compression (NEW in v0.2.0)

from token_reducer import (
    compress_text,
    CompressionConfig,
    CompressionLevel,
    TaskContext
)

# Configure advanced compression
config = CompressionConfig(
    task=TaskContext.SUMMARIZATION,
    level=CompressionLevel.AGGRESSIVE,
    tokenizer="gpt-4",
    # Advanced features
    enable_entity_abstraction=True,  # Replace entities with placeholders
    enable_semantic_dedup=True,      # Remove semantically redundant sentences
    enable_proposition_extraction=True,  # Simplify complex sentences
    semantic_threshold=0.85,         # Similarity threshold for deduplication
    reversible=True,                 # Generate reversibility mappings
    target_tokens=500                # Compress to specific token count
)

result = compress_text("Your text here...", config=config)

AST-Based Code Compression (NEW in v0.2.0)

from token_reducer import PythonASTCompressor

# Create AST compressor
compressor = PythonASTCompressor(
    remove_comments=True,
    remove_docstrings=True,
    rename_variables=True,  # Scope-aware variable renaming
    remove_dead_code=True   # Remove unused imports
)

code = '''
def calculate_sum(first_number, second_number):
    """Calculate the sum of two numbers."""
    # Add the numbers together
    result = first_number + second_number
    return result
'''

compressed_code, rename_map = compressor.compress(code)
print(f"Compressed code:\n{compressed_code}")
# Output: def calculate_sum(a,b):
#     c=a+b
#     return c

Domain-Specific Compression (NEW in v0.2.0)

Log File Compression

from token_reducer import LogHandler

handler = LogHandler()

logs = """
[2025-11-23 03:15:01] ERROR: Connection failed
[2025-11-23 03:15:02] ERROR: Connection failed
[2025-11-23 03:15:03] ERROR: Connection failed
[2025-11-23 03:15:04] ERROR: Connection failed
"""

compressed_logs, stats = handler.compress_logs(logs, collapse_threshold=3)
print(compressed_logs)
# Output: [TS] ERROR: Connection failed (repeated 4x)

Meeting Transcript Compression

from token_reducer import TranscriptHandler

handler = TranscriptHandler()

transcript = """
John Smith: Um, so like, I think we should, you know, move forward.
Jane Doe: Yeah, I mean, that sounds good.
"""

compressed, speaker_map = handler.compress_transcript(transcript)
print(compressed)
# Output: JS: so I think we should, move forward. JD: Yeah, that sounds good.

Entity Abstraction

from token_reducer import EntityAbstractor

abstractor = EntityAbstractor(use_spacy=False)  # Uses regex fallback

text = "International Business Machines Corporation held a session in Islamabad."
abstracted, entity_map = abstractor.abstract_entities(text, preserve=True)

print(abstracted)
# Output: [ORG1] held a session in [LOC1].
print(entity_map)
# Output: {'[ORG1]': 'International Business Machines Corporation', '[LOC1]': 'Islamabad'}

Performance Benchmarks

Feature Token Reduction Use Case
AST Code Compression 73.4% Python code minification
Log Compression 74.1% Server logs, error logs
Hierarchical Summarization 68.4% Long documents, articles
Entity Abstraction 30.8% Named entity replacement
Proposition Extraction 18.2% Sentence simplification
Transcript Compression 27.3% Meeting notes, conversations

Compression Strategies

Task Types

Token Reducer adapts compression strategies based on your use case:

  • SUMMARIZATION: Preserves causal links and chronological order
  • RAG: Optimizes for retrieval context (entities, facts, key phrases)
  • EXTRACTION: Keeps only fields relevant to extraction target
  • REASONING: Preserves premises, key details, and logical connections
  • TRANSLATION: Bypasses compression entirely
  • CODE_COMPLETION: Preserves function signatures and interfaces
  • DEBUGGING: Maintains variable names and error-relevant context
  • QUESTION_ANSWERING: Preserves facts and entities for potential questions

Compression Levels

Level Token Reduction Semantic Similarity Use Case
Light 5-15% >98% Maximum safety, minimal loss
Moderate 20-40% >90% Balanced compression and quality
Aggressive 50-75% >80% Maximum savings, acceptable loss

Advanced Features (v0.2.0)

Content Type Identification

Automatically detects and routes content to appropriate handlers:

from token_reducer import ContentIdentifier

identifier = ContentIdentifier()
content_type = identifier.identify(text)
# Returns: CODE, LOGS, PROSE, TRANSCRIPT, CHAT, LEGAL, or ACADEMIC

Hierarchical Summarization

Multi-level text summarization:

from token_reducer import HierarchicalSummarizer

summarizer = HierarchicalSummarizer()

# Sentence-level
summary = summarizer.summarize(text, level="sentence")

# Paragraph-level
summary = summarizer.summarize(text, level="paragraph", max_sentences=3)

# Document-level
summary = summarizer.summarize(text, level="document", target_ratio=0.5)

Semantic Deduplication

Remove semantically redundant content:

from token_reducer import SemanticDeduplicator

deduplicator = SemanticDeduplicator(
    use_embeddings=False,  # Uses heuristic fallback
    similarity_threshold=0.85
)

unique_sentences = deduplicator.deduplicate(sentences)

Legal Document Compression

Specialized handler for legal text:

from token_reducer import LegalHandler

handler = LegalHandler()
compressed, removed = handler.compress_legal_document(
    document,
    preserve_clauses=True
)

# Extract clauses and definitions
clauses = handler.extract_clauses(document)
definitions = handler.extract_definitions(document)

How It Works

Multi-Stage Pipeline (v0.2.0)

  1. Identification: Detect content type (Code, Logs, Prose, etc.)
  2. Segmentation: Break into logical units (sentences, paragraphs, functions)
  3. Redundancy Removal: Remove repeated patterns and filler content
  4. Semantic Compression: Apply NLP techniques (entity abstraction, summarization)
  5. Optimization: Final cleanup and variable shortening
  6. Reversibility: Generate mapping for restoration (optional)

Text Compression Pipeline

  1. Normalize: Fix spacing, remove HTML, standardize quotes
  2. Prune: Remove duplicates, redundancy, verbose explanations
  3. Compress: Extract entities/facts, compact phrasing, reduce adjectives
  4. Summarize: Apply task-specific tightening
  5. Repack: Shorten sentences, optimize structure

Code Compression Pipeline

  1. Parse AST: Build Abstract Syntax Tree
  2. Remove Docstrings: Safe docstring removal
  3. Remove Comments: Strip all comments
  4. Rename Variables: Scope-aware shortening
  5. Dead Code Elimination: Remove unused imports/functions
  6. Unparse: Convert back to code

Use Cases

Reduce LLM API Costs

# Before: 10,000 tokens × $0.03/1K = $0.30 per request
# After (70% reduction): 3,000 tokens × $0.03/1K = $0.09 per request
# Savings: 70% cost reduction

Fit More Context in Token Limits

from token_reducer import batch_compress_text

# Compress multiple documents to fit in context window
results = batch_compress_text(
    texts=[doc1, doc2, doc3, doc4, doc5],
    task=TaskContext.RAG,
    level=CompressionLevel.MODERATE
)

combined = "\n\n".join(r.compressed_text for r in results)

RAG Pipeline Optimization

# Compress retrieved documents before sending to LLM
retrieved_docs = vector_store.similarity_search(query, k=10)

compressed_docs = [
    compress_text(
        doc.page_content,
        task=TaskContext.RAG,
        level=CompressionLevel.MODERATE
    )
    for doc in retrieved_docs
]

context = "\n\n".join(d.compressed_text for d in compressed_docs)

Configuration Options

CompressionConfig Parameters

config = CompressionConfig(
    # Basic settings
    task=TaskContext.SUMMARIZATION,
    level=CompressionLevel.MODERATE,
    tokenizer="gpt-4",
    
    # Preservation settings
    preserve_entities=True,
    preserve_numbers=True,
    preserve_facts=True,
    preserve_instructions=True,
    
    # Quality settings
    quality_threshold=0.90,
    enable_fail_safe=True,
    
    # Advanced features (v0.2.0)
    enable_ast_parsing=False,
    enable_semantic_dedup=False,
    enable_entity_abstraction=False,
    enable_proposition_extraction=False,
    use_embeddings=False,
    semantic_threshold=0.85,
    target_tokens=None,
    reversible=False
)

API Reference

Main Functions

  • compress_text(text, task, level, tokenizer, **kwargs) - Compress text
  • compress_code(code, task, level, language, tokenizer, **kwargs) - Compress code
  • batch_compress_text(texts, task, level, tokenizer, **kwargs) - Batch compression
  • compress_with_config(text, config) - Compress with configuration object

Advanced Classes (v0.2.0)

  • PythonASTCompressor - AST-based code compression
  • EntityAbstractor - Named entity abstraction
  • PropositionExtractor - Sentence simplification
  • SemanticDeduplicator - Semantic deduplication
  • HierarchicalSummarizer - Multi-level summarization
  • LogHandler - Log file compression
  • TranscriptHandler - Transcript compression
  • LegalHandler - Legal document compression
  • ContentIdentifier - Content type detection
  • Segmenter - Content segmentation

Changelog

v0.2.0 (2025-11-23)

Major Features:

  • Added semantic text compression (entity abstraction, proposition extraction, deduplication)
  • Added AST-based Python code compression (73%+ reduction)
  • Added domain-specific handlers (logs, transcripts, legal documents)
  • Added multi-stage pipeline architecture
  • Added 8 new configuration parameters
  • Added reversibility mappings
  • Added progressive compression to token budgets

Performance:

  • AST Code Compression: 73.4% token reduction
  • Log Compression: 74.1% token reduction
  • Hierarchical Summarization: 68.4% token reduction

v0.1.0 (2025-11-22)

  • Initial release with core token reduction functionality
  • Basic text and code compression
  • Task-aware compression strategies
  • Multi-level compression support
  • Multiple tokenizer support

Development

Setup Development Environment

git clone https://github.com/UsamaTufail31/token-reducer.git
cd token-reducer
pip install -e ".[dev,all]"
pre-commit install

Run Tests

pytest
pytest --cov=token_reducer --cov-report=html

Code Quality

black src/ tests/
isort src/ tests/
ruff src/ tests/
mypy src/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Token Reducer in your research, please cite:

@software{token_reducer,
  title = {Token Reducer: Intelligent Token Reduction for LLM Applications},
  author = {Tufail, Usama},
  year = {2025},
  version = {0.2.0},
  url = {https://github.com/UsamaTufail31/token-reducer}
}

Support


Created by Usama Tufail | GitHub | PyPI

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

token_reducer-0.2.1.tar.gz (51.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

token_reducer-0.2.1-py3-none-any.whl (57.3 kB view details)

Uploaded Python 3

File details

Details for the file token_reducer-0.2.1.tar.gz.

File metadata

  • Download URL: token_reducer-0.2.1.tar.gz
  • Upload date:
  • Size: 51.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for token_reducer-0.2.1.tar.gz
Algorithm Hash digest
SHA256 44b8b22f4194f8c2ec67c56d33825c9dca96e5f91d3615bb56c855d5c05dd509
MD5 0da9728617ec0e748d89e1cd3e616e0c
BLAKE2b-256 e52bfdadd541a4b50375cef4a6931d217190c88346df84ff682d6306d4bd1f9c

See more details on using hashes here.

File details

Details for the file token_reducer-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: token_reducer-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 57.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for token_reducer-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 12b78d900a705920642f8b92b622a0d749d7eb1cf5992452a249e8733c87533a
MD5 7f970f2eceb16515e9794f0a41cbad07
BLAKE2b-256 bf1ae5d109fda80ca648e1de284d899466eb81f72df0004441b7355daa1e9e7b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page