Token Reducer

Intelligent token reduction library for LLM applications with context-aware compression

Overview

Token Reducer is a Python library that reduces token counts in text and code inputs for Large Language Model (LLM) applications while preserving semantic meaning, logical structure, and task-relevant information. Reduction ranges from 5-15% at the light level to 50-70% at the aggressive level, where some semantic loss is accepted (see Compression Levels).

Key Features

🎯 Context-Aware Compression: Task-specific strategies for summarization, RAG, extraction, reasoning, and more
📊 Multi-Level Compression: Choose between light (5-15%), moderate (20-40%), or aggressive (50-70%) reduction
🔄 Multi-Pass Pipeline: Specialized passes for optimal compression (normalize → prune → compress → summarize → repack)
💻 Text & Code Support: Domain-specific compression for both natural language and source code
🔌 Tokenizer Agnostic: Works with OpenAI, Anthropic, HuggingFace, and custom tokenizers
🛡️ Fail-Safe Mode: Automatic quality validation with semantic similarity checking
⚡ High Performance: <100ms per 1000 tokens for text, <200ms for code
📴 Offline Operation: No cloud dependencies, works standalone

Installation

Basic Installation

pip install token-reducer

With Optional Dependencies

# Quotes keep shells such as zsh from expanding the square brackets

# For HuggingFace tokenizers
pip install "token-reducer[transformers]"

# For Anthropic tokenizers
pip install "token-reducer[anthropic]"

# For NLP features (entity extraction, sentence segmentation)
pip install "token-reducer[nlp]"

# For semantic similarity checking
pip install "token-reducer[similarity]"

# For code compression
pip install "token-reducer[code]"

# Install everything
pip install "token-reducer[all]"

Quick Start

Text Compression

from token_reducer import compress_text, TaskContext, CompressionLevel

# Compress text for summarization task
result = compress_text(
    text="Your long article text here...",
    task=TaskContext.SUMMARIZATION,
    level=CompressionLevel.MODERATE,
    tokenizer="gpt-4"
)

print(f"Original: {result.original_tokens} tokens")
print(f"Compressed: {result.compressed_tokens} tokens")
print(f"Reduction: {result.reduction_percentage}%")
print(f"\nCompressed text:\n{result.compressed_text}")

Code Compression

from token_reducer import compress_code, TaskContext, CompressionLevel

# Compress code for LLM context
result = compress_code(
    code="""
    def calculate_total_price(items, tax_rate=0.1):
        # Calculate subtotal
        subtotal = sum(item['price'] for item in items)
        # Apply tax
        tax = subtotal * tax_rate
        # Return total
        return subtotal + tax
    """,
    task=TaskContext.CODE_COMPLETION,
    level=CompressionLevel.AGGRESSIVE,
    language="python"
)

print(f"Compressed code:\n{result.compressed_code}")

Advanced Configuration

from token_reducer import CompressionConfig, CompressionLevel, TaskContext, compress_text

config = CompressionConfig(
    task=TaskContext.RAG,
    level=CompressionLevel.MODERATE,
    tokenizer="claude-3",
    preserve_entities=True,
    preserve_numbers=True,
    quality_threshold=0.90,
    enable_fail_safe=True
)

text = "Your long document text here..."
result = compress_text(text, config=config)
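
The `quality_threshold` and `enable_fail_safe` options suggest a check-then-fall-back pattern: compress, score the result against the original, and keep the original if the score is too low. The following is a minimal sketch of that idea, not the library's implementation. `similarity` uses `difflib.SequenceMatcher` as a crude lexical stand-in for a semantic score, and `fail_safe` and `drop_second_half` are hypothetical names introduced for illustration.

```python
from difflib import SequenceMatcher
from typing import Callable


def similarity(a: str, b: str) -> float:
    """Crude word-level similarity in [0, 1]; a stand-in for an embedding-based score."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def fail_safe(text: str, compress: Callable[[str], str], threshold: float = 0.90) -> str:
    """Return the compressed text only if it stays close enough to the original."""
    candidate = compress(text)
    if similarity(text, candidate) >= threshold:
        return candidate
    return text  # fall back to the unmodified input


def drop_second_half(text: str) -> str:
    """Hypothetical, deliberately lossy compressor used to trigger the fallback."""
    words = text.split()
    return " ".join(words[: len(words) // 2])
```

With the default threshold, `fail_safe(text, drop_second_half)` returns the original text, because dropping half the words scores well below 0.90.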

Compression Strategies

Task Types

Token Reducer adapts compression strategies based on your use case:

SUMMARIZATION: Preserves causal links and chronological order
RAG: Optimizes for retrieval context (entities, facts, key phrases)
EXTRACTION: Keeps only fields relevant to extraction target
REASONING: Preserves premises, key details, and logical connections
TRANSLATION: Bypasses compression entirely
CODE_COMPLETION: Preserves function signatures and interfaces
DEBUGGING: Maintains variable names and error-relevant context
QUESTION_ANSWERING: Preserves facts and entities for potential questions

Compression Levels

Level	Token Reduction	Semantic Similarity	Use Case
Light	5-15%	>98%	Maximum safety, minimal loss
Moderate	20-40%	>90%	Balanced compression and quality
Aggressive	50-70%	>80%	Maximum savings, acceptable loss
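
One way to read the table is as a lookup: given an input size and a token budget, pick the gentlest level whose quoted range can cover the required reduction. The sketch below assumes the upper bound of each range is achievable, which is optimistic; `pick_level` and `MAX_REDUCTION` are hypothetical names, not part of the library.

```python
from typing import Optional

# Upper bound of the reduction range quoted for each level, ordered from safest.
MAX_REDUCTION = {"light": 0.15, "moderate": 0.40, "aggressive": 0.70}


def pick_level(original_tokens: int, budget_tokens: int) -> Optional[str]:
    """Return the gentlest level that could fit the budget, or None if no compression is needed."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    needed = 1 - budget_tokens / original_tokens  # fraction of tokens to remove
    if needed <= 0:
        return None  # the input already fits
    for level, max_reduction in MAX_REDUCTION.items():
        if needed <= max_reduction:
            return level
    raise ValueError("budget is out of reach even at the aggressive level")
```

For example, fitting 1,000 tokens into a 700-token budget needs a 30% reduction, which falls in the moderate range.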

How It Works

Text Compression Pipeline

Normalize: Fix spacing, remove HTML, standardize quotes
Prune: Remove duplicates, redundancy, verbose explanations
Compress: Extract entities/facts, compact phrasing, reduce adjectives
Summarize: Apply task-specific tightening
Repack: Shorten sentences, optimize structure
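
The first two passes can be illustrated with the standard library alone. This is a simplified sketch of what "normalize" and "prune" might look like, not the library's code: `normalize` and `prune_duplicates` are hypothetical helpers, and the sentence splitter is a naive regular expression.

```python
import html
import re

# Map curly quotes to their straight equivalents.
QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def normalize(text: str) -> str:
    """Strip HTML tags, decode entities, standardize quotes, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)  # replace tags with a space
    text = html.unescape(text)            # decode entities such as &amp;
    text = text.translate(QUOTES)
    return re.sub(r"\s+", " ", text).strip()


def prune_duplicates(text: str) -> str:
    """Drop sentences that repeat an earlier one (case-insensitive exact match)."""
    seen = set()
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        key = sentence.lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)
```

Running `prune_duplicates(normalize(raw))` on HTML that repeats a paragraph leaves a single copy of each sentence in plain text.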

Code Compression Pipeline

Remove Noise: Strip comments, blank lines, logging, debug prints
Rename Identifiers: Shorten variable/function/class names
Remove Unused: Eliminate dead code, unused imports/functions
Optimize Expressions: Simplify boolean/arithmetic expressions
Summarize Functions: Replace bodies with summaries (aggressive mode)
Reformat: Minimize whitespace, compact structure

Performance

Text: <100ms per 1000 tokens (excluding quality checks)
Code: <200ms per 1000 tokens (excluding quality checks)
Quality Check: <50ms for semantic similarity validation
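
To check figures like these on your own hardware, a small timing harness is enough. The sketch below times any string-to-string callable and scales the result to milliseconds per 1000 tokens. `ms_per_1000_tokens` is a hypothetical helper, and it counts whitespace-separated words as a stand-in for a real tokenizer, so the numbers are only approximate.

```python
import time
from typing import Callable


def ms_per_1000_tokens(fn: Callable[[str], str], text: str, repeats: int = 5) -> float:
    """Best-of-N wall-clock time of fn(text), scaled to milliseconds per 1000 tokens."""
    tokens = len(text.split())  # whitespace count, not a model tokenizer
    if tokens == 0:
        raise ValueError("text must contain at least one token")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(text)
        best = min(best, time.perf_counter() - start)
    # seconds -> milliseconds, then per-token -> per-1000-tokens
    return best * 1000.0 * 1000.0 / tokens
```

Pass a wrapper around the compression call as `fn`, for example `lambda t: compress_text(t).compressed_text`, to measure the text path.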

Use Cases

Reduce LLM API Costs

# Before: 10,000 tokens × $0.03/1K = $0.30 per request
# After (60% reduction): 4,000 tokens × $0.03/1K = $0.12 per request
# Savings: $0.18 per request, a 60% reduction in input-token cost
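
The same arithmetic as a reusable function. The price of $0.03 per 1K tokens is the illustrative figure from the example above, not a quoted rate for any particular model, and the calculation covers input tokens only. `request_cost` and `savings` are hypothetical helper names.

```python
def request_cost(tokens: int, price_per_1k: float) -> float:
    """Cost in dollars of sending `tokens` input tokens."""
    return tokens / 1000 * price_per_1k


def savings(original_tokens: int, reduction: float, price_per_1k: float) -> dict:
    """Compare per-request input cost before and after a fractional token reduction."""
    if not 0 <= reduction <= 1:
        raise ValueError("reduction must be between 0 and 1")
    compressed_tokens = round(original_tokens * (1 - reduction))
    before = request_cost(original_tokens, price_per_1k)
    after = request_cost(compressed_tokens, price_per_1k)
    return {
        "compressed_tokens": compressed_tokens,
        "before": before,
        "after": after,
        "saved": before - after,
    }
```

Calling `savings(10_000, 0.60, 0.03)` reproduces the figures above: 4,000 tokens, about $0.30 before and $0.12 after.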

Fit More Context in Token Limits

# Compress multiple documents to fit in context window
# (doc1 ... doc5 are strings you have already loaded)
from token_reducer import batch_compress_text, TaskContext, CompressionLevel

results = batch_compress_text(
    texts=[doc1, doc2, doc3, doc4, doc5],
    task=TaskContext.RAG,
    level=CompressionLevel.MODERATE,
    parallel=True
)

# Combine compressed documents within token limit
combined = "\n\n".join(r.compressed_text for r in results)
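
Joining every compressed document does not by itself guarantee the result fits the limit. A simple greedy pass can enforce a budget by keeping documents in order until the next one would overflow. This is a sketch under stated assumptions: `pack_within_budget` is a hypothetical helper, the default counter splits on whitespace rather than using a model tokenizer, and the tokens in the separator are ignored.

```python
from typing import Callable, Iterable, List


def pack_within_budget(
    texts: Iterable[str],
    budget: int,
    count_tokens: Callable[[str], int] = lambda t: len(t.split()),
    separator: str = "\n\n",
) -> str:
    """Join texts in order, stopping before the first one that would exceed the budget."""
    kept: List[str] = []
    used = 0
    for text in texts:
        cost = count_tokens(text)
        if used + cost > budget:
            break  # preserve order: do not skip ahead to smaller documents
        kept.append(text)
        used += cost
    return separator.join(kept)
```

In practice you would pass the compressed strings (for example `r.compressed_text for r in results`) and a counter that matches your target model's tokenizer.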

RAG Pipeline Optimization

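# Compress retrieved documents before sending to LLM
# (vector_store and query come from your own retrieval setup)
from token_reducer import compress_text, TaskContext, CompressionLevel

retrieved_docs = vector_store.similarity_search(query, k=10)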

compressed_docs = [
    compress_text(
        doc.page_content,
        task=TaskContext.RAG,
        level=CompressionLevel.MODERATE
    )
    for doc in retrieved_docs
]

# Use compressed docs in prompt
context = "\n\n".join(d.compressed_text for d in compressed_docs)

Documentation

Repository: https://github.com/UsamaTufail31/token-reducer
Issues: https://github.com/UsamaTufail31/token-reducer/issues
Examples: https://github.com/UsamaTufail31/token-reducer/tree/main/examples

Development

Setup Development Environment

# Clone repository
git clone https://github.com/UsamaTufail31/token-reducer.git
cd token-reducer

# Install with development dependencies
pip install -e ".[dev,all]"

# Install pre-commit hooks
pre-commit install

Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=token_reducer --cov-report=html

# Run specific test file
pytest tests/test_text_compression.py

Code Quality

# Format code
black src/ tests/

# Sort imports
isort src/ tests/

# Lint code
ruff check src/ tests/

# Type check
mypy src/

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Token Reducer in your research, please cite:

@software{token_reducer,
  title = {Token Reducer: Intelligent Token Reduction for LLM Applications},
  author = {Tufail, Usama},
  year = {2024},
  url = {https://github.com/UsamaTufail31/token-reducer}
}

Acknowledgments

Inspired by research in context compression and semantic similarity
Built with modern NLP libraries (spaCy, sentence-transformers, tiktoken)
Designed for the LLM application development community

Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Created by Usama Tufail

Release history

0.2.1 (Nov 22, 2025)
0.2.0 (Nov 22, 2025, this version)
0.1.0 (Nov 22, 2025)

Download files

Source Distribution

token_reducer-0.2.0.tar.gz (47.9 kB)

Uploaded Nov 22, 2025 Source

Built Distribution

token_reducer-0.2.0-py3-none-any.whl (55.5 kB)

Uploaded Nov 22, 2025 Python 3

File details

Details for the file token_reducer-0.2.0.tar.gz.

File metadata

Download URL: token_reducer-0.2.0.tar.gz
Upload date: Nov 22, 2025
Size: 47.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for token_reducer-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`3a5d1670aa52314961fe499324c1221bc4c8d64facfe44a9b67196bd1e0f322b`
MD5	`317526ee5415d9af7e8d462ea82c76ee`
BLAKE2b-256	`0a5e6132709faf02e3dc6399d8ab33d8082c9d7350750e71f7996a333f38befc`

File details

Details for the file token_reducer-0.2.0-py3-none-any.whl.

File metadata

Download URL: token_reducer-0.2.0-py3-none-any.whl
Upload date: Nov 22, 2025
Size: 55.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for token_reducer-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a43c557ce8a86e0b45050eb3a5dc4b509af483a2d7be25e86efb2d11f9455a21`
MD5	`07752a033b2bca5d08aa1d3e6f98ddb0`
BLAKE2b-256	`7bebd7387819a6d7f26d997b6145b4e0e3a85eddcc74322671fa39cefbbad163`
