Cut LLM costs with deterministic context compression, smart routing, and cost tracking
TokenPak — Cut your LLM token spend with deterministic context compression, zero config
TokenPak is a local proxy that compresses your LLM context before it hits the API — fewer tokens, lower cost, no code changes, no cloud, no credentials stored.
30-second demo
pip install tokenpak
tokenpak serve # start proxy at localhost:8766
tokenpak integrate claude-code --apply # wire Claude Code to the proxy
✅ Applied: Updated ~/.claude/settings.json (2 changes).
Then verify it's working:
tokenpak demo
┌──────────────────────────────────────────────────────┐
│ TokenPak — Live Compression Demo (illustrative) │
├──────────────────────────────────────────────────────┤
│ Scenario DevOps agent (config + logs) │
│ Savings drivers dedup + alias │
├──────────────────────────────────────────────────────┤
│ Original 747 tokens │
│ Compressed 502 tokens │
│ Fewer tokens 245 tokens │
├──────────────────────────────────────────────────────┤
│ Stages: dedup, alias, segmentize, directives │
└──────────────────────────────────────────────────────┘
Illustrative fixture: token counts vary by workload. Measure your own with
tokenpak savings; receipt-backed ranges will be published once the benchmark lane lands.
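The arithmetic behind the demo panel can be checked by hand. The sketch below is not part of TokenPak; `savings_summary` is a hypothetical helper that only recomputes the figures shown in the illustrative fixture above.

```python
def savings_summary(original_tokens: int, compressed_tokens: int) -> dict:
    """Return absolute and relative token savings for one request."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    saved = original_tokens - compressed_tokens
    return {
        "saved_tokens": saved,
        # Percentage of the original prompt that no longer has to be sent.
        "saved_pct": round(100 * saved / original_tokens, 1),
    }


# Figures taken from the demo panel: 747 tokens in, 502 tokens out.
demo = savings_summary(747, 502)
print(demo)  # {'saved_tokens': 245, 'saved_pct': 32.8}
```

For this fixture the reduction is about a third of the prompt; real workloads will differ, which is why the numbers from tokenpak savings are the ones to trust.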
Works with
Claude Code · Cursor · Cline · Continue.dev · Aider · OpenAI SDK · Anthropic SDK · LiteLLM · Codex
Run tokenpak integrate to see the full client list with setup guides for each.
Install
pip install tokenpak
See docs/quickstart.md for virtual-env setup and per-client configuration.
Requirements: Python 3.10+. No external dependencies for core functionality.
Exposing the proxy beyond 127.0.0.1? Set TOKENPAK_PROXY_AUTH_TOKEN to a
shared secret to require Authorization: Bearer <token> on remote requests
(see docs/configuration/proxy-auth.md).
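As a rough illustration of what a remote client then has to send, the sketch below builds (but does not send) a request carrying the bearer header. The host name, the `/v1/messages` path, and the idea of reading the same environment variable on the client side are assumptions for illustration only; docs/configuration/proxy-auth.md is the authority on the real setup.

```python
import os
import urllib.request
from typing import Optional


def build_proxy_request(url: str, body: bytes, token: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request, adding 'Authorization: Bearer <token>' when a token is given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Assumption: the client obtains the shared secret from its own environment.
token = os.environ.get("TOKENPAK_PROXY_AUTH_TOKEN", "example-shared-secret")

# Hypothetical remote address and path; only the port comes from the quickstart above.
req = build_proxy_request("http://proxy.example.internal:8766/v1/messages", b"{}", token)

print(req.get_header("Authorization"))
# Sending is left out on purpose: urllib.request.urlopen(req) would perform the call.
```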
What's included (Free)
- Context compression: deterministic token reduction on real agent workloads, <50ms latency. Measure your own savings with tokenpak savings (reproduce the headline benchmark with make benchmark-headline)
- Client integration: one command wires Claude Code, Cursor, Aider, and 6 other clients
- Model routing: send requests to the right model automatically, with fallback rules
- Cost tracking: per model, per session, per agent; local SQLite, zero cloud
- TIP Spend Guard: pre-send circuit breaker; blocks runaway requests before provider call. Yes/No release or [TIP: allow=once max=$X] directive. Catches both single-request spikes and the death-by-1000-cuts pattern via session-cumulative tracking. See docs/spend-guard.md.
- Vault indexing + semantic search: index your codebase; search without an LLM call
- MultiPak Pro Phase 1 OSS surface: read-only Vault Pak adapter, companion journal promotion-candidate marking, tokenpak pak CLI, /pak/v1/* proxy stubs. Full MultiPak (capture pipeline, recall ranking, Handoff Paks, anchor hydration) requires tokenpak-paid (Pro). See docs/multipak.md.
- CLI + proxy server: tokenpak serve, tokenpak cost, tokenpak savings
- A/B testing and replay/debug: compare compression configs, replay past requests
- 50 built-in compression recipes: YAML, customizable
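To make the Spend Guard idea concrete, here is a minimal sketch of a pre-send circuit breaker that checks both a per-request ceiling and a session-cumulative budget. It is an illustration under stated assumptions, not TokenPak's implementation: the `SpendGuard` class, its field names, and the cent-based limits are all hypothetical, and the release mechanisms (Yes/No prompt, TIP directive) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    """Toy pre-send check: amounts are integer cents to avoid float rounding."""
    max_request_cents: int
    max_session_cents: int
    session_spent_cents: int = 0

    def check(self, estimated_cents: int) -> tuple[bool, str]:
        # Single-request spike: one call that is too expensive on its own.
        if estimated_cents > self.max_request_cents:
            return False, "single-request limit exceeded"
        # Death by 1000 cuts: many small calls that add up over a session.
        if self.session_spent_cents + estimated_cents > self.max_session_cents:
            return False, "session budget exceeded"
        return True, "ok"

    def record(self, actual_cents: int) -> None:
        self.session_spent_cents += actual_cents


guard = SpendGuard(max_request_cents=50, max_session_cents=100)
results = []
for cost in [30, 30, 30, 30, 200]:
    allowed, reason = guard.check(cost)
    if allowed:
        guard.record(cost)  # only calls that pass the check reach the provider
    results.append((cost, allowed, reason))

for row in results:
    print(row)
```

In this run the first three 30-cent requests pass, the fourth is blocked because the session would exceed 100 cents, and the 200-cent request is blocked by the per-request ceiling.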
Repeated context is reused from cache instead of re-sent on every call. See docs/quickstart.md and docs/api-tpk-v1.md to get started.
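One way to picture deterministic reuse of repeated context is content-hash deduplication: the first occurrence of a segment is kept and labeled, and later identical segments are replaced by a short reference. The sketch below is a toy under that assumption; the `dedup_segments` function and the `[s0]` / `@s0` notation are invented for illustration and do not describe TokenPak's actual stages or wire format.

```python
import hashlib


def dedup_segments(segments: list[str]) -> list[str]:
    """Replace repeated segments with alias references.

    Deterministic: the same input list always produces the same output,
    because aliases are assigned in order of first appearance.
    """
    aliases: dict[str, str] = {}  # content hash -> alias
    out: list[str] = []
    for seg in segments:
        digest = hashlib.sha256(seg.encode("utf-8")).hexdigest()
        if digest in aliases:
            # Seen before: emit only the short reference.
            out.append(f"@{aliases[digest]}")
        else:
            alias = f"s{len(aliases)}"
            aliases[digest] = alias
            # First occurrence: keep the text and label it for later references.
            out.append(f"[{alias}] {seg}")
    return out


log_lines = [
    "ERROR disk full on /var",
    "retrying in 5s",
    "ERROR disk full on /var",
    "retrying in 5s",
]
compressed = dedup_segments(log_lines)
print(compressed)
```

Labeling the first occurrence adds a few characters, so this kind of scheme only pays off when segments repeat and are longer than their aliases.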
Open source & editions
TokenPak's core is Apache-2.0 open source; TokenPak Pro and hosted services are proprietary. Commercial packaging is not published yet.
Support
- Docs: docs/quickstart.md · API reference
- Issues: github.com/tokenpak/tokenpak/issues
- Discussions: github.com/tokenpak/tokenpak/discussions
- Email: hello@tokenpak.ai
License
The TokenPak open-source core is licensed under the Apache License 2.0 — see LICENSE. TokenPak Pro and hosted services are proprietary.
Trademark
"TokenPak", the TokenPak name, logo, and brand assets are trademarks of TokenPak and are not licensed under Apache-2.0 (Apache-2.0 §6 grants no trademark rights). Nominative and reference use — for example "works with TokenPak" or "a plugin for TokenPak" — is fine. Using the name or logo in a way that implies endorsement, sponsorship, or affiliation, or naming a fork, product, or service "TokenPak" (or something confusingly similar), is not.
Download files
File details
Details for the file tokenpak-1.8.0.tar.gz.
File metadata
- Download URL: tokenpak-1.8.0.tar.gz
- Upload date:
- Size: 2.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 837e8ed5c8e9d51094604d0e904e7968ce6ed6d712f30daa03f35af5c42f6e79 |
| MD5 | ccc7b6cd5d751872a6432daf2e4c57f6 |
| BLAKE2b-256 | 3073183358a1e57df8a70d566d4501f2ac782d4532742d342c309ed39d9fc8b4 |
Provenance
The following attestation bundles were made for tokenpak-1.8.0.tar.gz:
Publisher: release.yml on tokenpak/tokenpak
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenpak-1.8.0.tar.gz
- Subject digest: 837e8ed5c8e9d51094604d0e904e7968ce6ed6d712f30daa03f35af5c42f6e79
- Sigstore transparency entry: 1749784970
- Sigstore integration time:
- Permalink: tokenpak/tokenpak@7bb494dfcb389b37cd30e75662109ce64126a466
- Branch / Tag: refs/tags/v1.8.0
- Owner: https://github.com/tokenpak
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7bb494dfcb389b37cd30e75662109ce64126a466
- Trigger Event: push
File details
Details for the file tokenpak-1.8.0-py3-none-any.whl.
File metadata
- Download URL: tokenpak-1.8.0-py3-none-any.whl
- Upload date:
- Size: 2.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5f5ed9edbd3c1c5d50c77abbc7109dabac9af83e10a9b026ed9ff4f30914683e |
| MD5 | b0e0a526ca90ca7497661199bf0157ae |
| BLAKE2b-256 | f73a9985646ed77d793c3787ad9802dc5d9fd8884ca1710d518ccede1e396af7 |
Provenance
The following attestation bundles were made for tokenpak-1.8.0-py3-none-any.whl:
Publisher: release.yml on tokenpak/tokenpak
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenpak-1.8.0-py3-none-any.whl
- Subject digest: 5f5ed9edbd3c1c5d50c77abbc7109dabac9af83e10a9b026ed9ff4f30914683e
- Sigstore transparency entry: 1749785054
- Sigstore integration time:
- Permalink: tokenpak/tokenpak@7bb494dfcb389b37cd30e75662109ce64126a466
- Branch / Tag: refs/tags/v1.8.0
- Owner: https://github.com/tokenpak
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7bb494dfcb389b37cd30e75662109ce64126a466
- Trigger Event: push