
TokenCrush

LLM token optimization CLI - compress prompts, save costs.

License: MIT · Requires Python 3.10+

TokenCrush uses LLMLingua-2 to compress your prompts before sending them to LLMs, reducing token usage by 50-80% while preserving response quality.

Features

  • Prompt Compression: Reduce tokens by 2-5x using Microsoft's LLMLingua-2
  • Multi-Provider Support: Works with OpenAI, Anthropic (Claude), and Google (Gemini)
  • Simple CLI: One command to compress and chat
  • Cost Savings: Significantly reduce API costs
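To make the cost claim concrete, here is a back-of-the-envelope calculation. The per-token price below is a hypothetical placeholder, not any provider's actual pricing:

```python
def estimate_savings(original_tokens: int, compressed_tokens: int,
                     price_per_1k_tokens: float) -> dict:
    """Estimate input-token cost before and after compression."""
    cost_before = original_tokens / 1000 * price_per_1k_tokens
    cost_after = compressed_tokens / 1000 * price_per_1k_tokens
    return {
        "cost_before": cost_before,
        "cost_after": cost_after,
        "saved": cost_before - cost_after,
        "reduction_pct": 100 * (1 - compressed_tokens / original_tokens),
    }

# A 10,000-token prompt compressed 4x at a hypothetical $0.01 / 1K input tokens
savings = estimate_savings(10_000, 2_500, 0.01)
print(f"${savings['saved']:.3f} saved "
      f"({savings['reduction_pct']:.0f}% fewer input tokens)")
# → $0.075 saved (75% fewer input tokens)
```

At scale the same percentage applies to every request, so the savings grow linearly with usage.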

Installation

pip install tokencrush

Or with uv:

uv add tokencrush

Quick Start

1. Configure API Keys

# Set API keys (stored in ~/.config/tokencrush/config.toml)
tokencrush config set openai sk-your-openai-key
tokencrush config set anthropic sk-ant-your-anthropic-key
tokencrush config set google your-google-api-key

# Or use environment variables
export OPENAI_API_KEY=sk-your-key

2. Compress a Prompt

tokencrush compress "Your very long prompt text here..."

Output:

┌─────────────────────────────────────┐
│        Compression Result           │
├──────────────────┬──────────────────┤
│ Original Tokens  │ 500              │
│ Compressed Tokens│ 150              │
│ Compression Ratio│ 30.0%            │
│ Tokens Saved     │ 350              │
└──────────────────┴──────────────────┘
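Reading the example above, the reported ratio appears to be the fraction of tokens kept; the numbers in the table can be checked directly:

```python
original, compressed = 500, 150

ratio_pct = 100 * compressed / original  # fraction of tokens kept, as a percentage
saved = original - compressed            # tokens removed by compression

print(ratio_pct, saved)  # → 30.0 350
```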

3. Chat with Compression

# Compress and send to GPT-4
tokencrush chat "Explain quantum computing in detail" --model gpt-4

# Use Claude
tokencrush chat "Write a poem" --model claude-3-sonnet

# Use Gemini
tokencrush chat "Summarize this" --model gemini-1.5-pro

CLI Reference

tokencrush compress

Compress text to reduce tokens.

tokencrush compress <text> [OPTIONS]

Options:
  -r, --rate FLOAT    Compression rate (0.0-1.0). Default: 0.5
  -g, --gpu           Use GPU for faster compression

tokencrush chat

Compress and send to LLM.

tokencrush chat <prompt> [OPTIONS]

Options:
  -m, --model TEXT    Model to use. Default: gpt-4
  -r, --rate FLOAT    Compression rate. Default: 0.5
  -g, --gpu           Use GPU for compression
  --no-compress       Skip compression

tokencrush config

Manage API keys.

tokencrush config set <provider> <key>   # Save API key
tokencrush config show                    # Show configured keys
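After `config set`, the stored file at `~/.config/tokencrush/config.toml` is plausibly shaped like the sketch below. The table and key names are assumptions for illustration; run `tokencrush config show` to see what is actually stored:

```toml
# ~/.config/tokencrush/config.toml (illustrative -- section and key names are assumptions)
[api_keys]
openai = "sk-your-openai-key"
anthropic = "sk-ant-your-anthropic-key"
google = "your-google-api-key"
```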

Supported Models

Provider    Models
OpenAI      gpt-4, gpt-4-turbo, gpt-3.5-turbo
Anthropic   claude-3-opus, claude-3-sonnet, claude-3-haiku
Google      gemini-1.5-pro, gemini-1.5-flash

How It Works

  1. Compression: TokenCrush uses LLMLingua-2's trained model to identify and remove redundant tokens while preserving meaning.
  2. API Call: The compressed prompt is sent to your chosen LLM provider.
  3. Response: You get a comparable-quality response with fewer input tokens, and therefore a lower cost.
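Step 1 can be sketched in miniature. The real LLMLingua-2 scores each token with a trained classifier; the word-length scorer below is a trivial stand-in used only to make the keep-the-most-informative-tokens idea runnable:

```python
def compress(words: list[str], rate: float = 0.5) -> list[str]:
    """Toy token-dropping sketch: keep the `rate` fraction of words judged
    most informative, preserving their original order. Word length stands in
    for LLMLingua-2's learned importance score -- illustrative only."""
    keep = max(1, round(len(words) * rate))
    # Rank indices by the stand-in importance score (longer word = "more important"),
    # take the top `keep`, then restore the original word order.
    ranked = sorted(range(len(words)), key=lambda i: -len(words[i]))
    kept = sorted(ranked[:keep])
    return [words[i] for i in kept]

prompt = "please could you kindly explain the fundamentals of quantum computing".split()
print(" ".join(compress(prompt, rate=0.5)))
# → please explain fundamentals quantum computing
```

Even this crude heuristic tends to drop filler words first; the trained model does the same far more reliably, which is why meaning is preserved at 2-5x compression.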

Development

# Clone the repository
git clone https://github.com/xodn348/tokencrush.git
cd tokencrush

# Install with dev dependencies
uv sync --all-extras

# Run tests
uv run pytest -v

# Build package
uv build

License

MIT License - see LICENSE for details.

Acknowledgments

Built on Microsoft's LLMLingua-2 for prompt compression.
