Slash LLM costs with intelligent context compression, smart routing, and cost tracking
TokenPak — Cut your LLM token spend by 30–50%, zero config
TokenPak is a local proxy that compresses your LLM context before it hits the API — fewer tokens, lower cost, same results. No code changes, no cloud, no credentials stored.
30-second demo
pip install tokenpak
tokenpak serve # start proxy at localhost:8766
tokenpak integrate claude-code --apply # wire Claude Code to the proxy
✅ Applied: Updated ~/.claude/settings.json (2 changes).
Then verify it's working:
tokenpak demo
┌──────────────────────────────────────────────────────┐
│ TokenPak — Live Compression Demo │
├──────────────────────────────────────────────────────┤
│ Scenario DevOps agent (config + logs) │
│ Savings drivers dedup + alias │
├──────────────────────────────────────────────────────┤
│ Original 747 tokens │
│ Compressed 502 tokens │
│ Saved 245 tokens (32.8%) │
│ Cost saved (est.) $0.00073 per call │
├──────────────────────────────────────────────────────┤
│ Stages: dedup, alias, segmentize, directives │
└──────────────────────────────────────────────────────┘
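The figures in the demo output can be checked by hand. The sketch below recomputes them from the two token counts. The price per token is an assumption (about $3.00 per million input tokens, which is roughly what the reported estimate implies); the demo itself does not state which price it uses.

```python
# Recompute the demo's savings figures from its token counts.
ORIGINAL_TOKENS = 747
COMPRESSED_TOKENS = 502

# Assumption: input tokens are billed at $3.00 per million. This value is not
# taken from TokenPak; it is chosen because it roughly matches the demo output.
ASSUMED_USD_PER_MILLION_TOKENS = 3.00

saved_tokens = ORIGINAL_TOKENS - COMPRESSED_TOKENS
saved_percent = 100 * saved_tokens / ORIGINAL_TOKENS
saved_usd = saved_tokens * ASSUMED_USD_PER_MILLION_TOKENS / 1_000_000

print(f"Saved {saved_tokens} tokens ({saved_percent:.1f}%)")
print(f"Cost saved (est.) ${saved_usd:.6f} per call")
```

Under that assumed price the estimate comes to about $0.000735 per call, in line with the $0.00073 shown above.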
Works with
Claude Code · Cursor · Cline · Continue.dev · Aider · OpenAI SDK · Anthropic SDK · LiteLLM · Codex
Run tokenpak integrate to see the full client list with setup guides for each.
Install
pip install tokenpak
See docs/quickstart.md for virtual-env setup and per-client configuration.
Requirements: Python 3.10+. No external dependencies for core functionality.
Exposing the proxy beyond 127.0.0.1? Set TOKENPAK_PROXY_AUTH_TOKEN to a
shared secret to require Authorization: Bearer <token> on remote requests
(see docs/configuration/proxy-auth.md).
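As a rough illustration of what a remote client would send, the sketch below builds a request that carries the shared secret in an Authorization: Bearer header, using only the Python standard library. The host name and URL path are hypothetical placeholders (only port 8766 comes from the demo above), and the request is built but not sent. See docs/configuration/proxy-auth.md for the actual behavior.

```python
import os
import urllib.request

# Port 8766 comes from the demo above; the host and path are hypothetical.
PROXY_URL = "http://proxy.internal.example:8766/"


def build_authenticated_request(url: str, token: str) -> urllib.request.Request:
    """Return a request carrying the shared secret as a Bearer token."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Read the same secret the proxy was started with; fall back to a placeholder
# so the sketch runs without any environment set up.
token = os.environ.get("TOKENPAK_PROXY_AUTH_TOKEN", "example-secret")
request = build_authenticated_request(PROXY_URL, token)
print(request.get_header("Authorization"))
```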
What's included (Free)
- Context compression: 30–50% token reduction on real agent workloads, <50ms latency. Reproduce: make benchmark-headline
- Client integration: one command wires Claude Code, Cursor, Aider, and 6 other clients
- Model routing: send requests to the right model automatically, with fallback rules
- Cost tracking: per model, per session, per agent; local SQLite, zero cloud
- TIP Spend Guard: pre-send circuit breaker; blocks runaway requests before the provider call. Yes/No release or [TIP: allow=once max=$X] directive. Catches both single-request spikes and the death-by-1000-cuts pattern via session-cumulative tracking. See docs/spend-guard.md.
- Vault indexing + semantic search: index your codebase; search without an LLM call
- MultiPak Pro Phase 1 OSS surface: read-only Vault Pak adapter, companion journal promotion-candidate marking, tokenpak pak CLI, /pak/v1/* proxy stubs. Full MultiPak (capture pipeline, recall ranking, Handoff Paks, anchor hydration) requires tokenpak-paid (Pro). See docs/multipak.md.
- CLI + proxy server: tokenpak serve, tokenpak cost, tokenpak savings
- A/B testing and replay/debug: compare compression configs, replay past requests
- 50 built-in compression recipes: YAML, customizable
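To make the Spend Guard idea from the list above concrete, here is a minimal sketch of a pre-send circuit breaker with session-cumulative tracking. The class, its field names, and the limits are all hypothetical and are not TokenPak's implementation; amounts are kept in integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    """Hypothetical pre-send check; not TokenPak's actual implementation."""

    max_request_cents: int        # ceiling for any single request
    max_session_cents: int        # ceiling for the running session total
    session_spent_cents: int = 0  # cumulative cost of requests sent so far

    def check(self, estimated_cents: int) -> tuple[bool, str]:
        """Decide whether a request may be sent, before calling the provider."""
        if estimated_cents > self.max_request_cents:
            return False, "single-request spike"
        if self.session_spent_cents + estimated_cents > self.max_session_cents:
            return False, "session budget exceeded"
        return True, "ok"

    def record(self, actual_cents: int) -> None:
        """Add a completed request's cost to the session total."""
        self.session_spent_cents += actual_cents


guard = SpendGuard(max_request_cents=50, max_session_cents=100)

# One large request is blocked outright.
print(guard.check(200))

# Many small requests pass individually but accumulate against the session.
for _ in range(10):
    allowed, _reason = guard.check(10)
    if allowed:
        guard.record(10)

# The session total is now at its ceiling, so the next small request is blocked.
print(guard.check(10))
```

The second check is what catches the death-by-1000-cuts pattern: each request is small enough to pass the per-request limit, yet the running total eventually trips the session limit.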
80%+ of operations cost zero tokens. See docs/quickstart.md and docs/api-tpk-v1.md to get started.
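The cost-tracking feature listed above keeps per-model, per-session, and per-agent totals in local SQLite. The sketch below shows one way such a store could be laid out with Python's built-in sqlite3 module. The table, its columns, and the sample rows are invented for illustration and do not reflect TokenPak's real schema or data.

```python
import sqlite3

# Hypothetical schema; TokenPak's real tables may look nothing like this.
conn = sqlite3.connect(":memory:")  # a real tracker would use a file on disk
conn.execute(
    """
    CREATE TABLE usage (
        model TEXT NOT NULL,
        session_id TEXT NOT NULL,
        agent TEXT NOT NULL,
        input_tokens INTEGER NOT NULL,
        cost_usd REAL NOT NULL
    )
    """
)

# Made-up sample rows, one per request.
rows = [
    ("model-a", "s1", "devops", 502, 0.0015),
    ("model-a", "s1", "devops", 400, 0.0012),
    ("model-b", "s2", "review", 1000, 0.0100),
]
conn.executemany("INSERT INTO usage VALUES (?, ?, ?, ?, ?)", rows)


def totals_by(column: str) -> dict[str, float]:
    """Sum cost grouped by one of the three tracked dimensions."""
    # Whitelist the column name, since it is interpolated into the SQL text.
    if column not in {"model", "session_id", "agent"}:
        raise ValueError(f"unsupported grouping column: {column}")
    query = f"SELECT {column}, SUM(cost_usd) FROM usage GROUP BY {column}"
    return {key: total for key, total in conn.execute(query)}


print(totals_by("model"))
print(totals_by("agent"))
```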
Pricing
| | Free | Pro | Team |
|---|---|---|---|
| Context compression | ✅ | ✅ | ✅ |
| Client integration (all 9) | ✅ | ✅ | ✅ |
| Model routing | ✅ | ✅ | ✅ |
| Cost tracking | ✅ | ✅ | ✅ |
| Vault indexing + search | ✅ | ✅ | ✅ |
| CLI + proxy | ✅ | ✅ | ✅ |
| Advanced compression recipes | — | ✅ | ✅ |
| Budget enforcement + alerts | — | ✅ | ✅ |
| Priority support | — | ✅ | ✅ |
| Multi-agent coordination | — | — | ✅ |
| Shared vault (team) | — | — | ✅ |
| RBAC + audit logs | — | — | ✅ |
| Price | Free | $99/mo | $299/mo |
See tokenpak.ai/pricing for full tier details and enterprise options.
Support
- Docs: docs/quickstart.md · API reference
- Issues: github.com/tokenpak/tokenpak/issues
- Discussions: github.com/tokenpak/tokenpak/discussions
- Email: hello@tokenpak.ai