Slash LLM costs with intelligent context compression, smart routing, and cost tracking

Project description

TokenPak — Cut your LLM token spend by 30–50%. One command to configure your LLM proxy.

TokenPak is a local proxy that compresses your LLM context before it hits the API — fewer tokens, lower cost, same results. No code changes, no cloud, no credentials stored.

Status: early preview. The core compression engine and proxy are in place. tokenpak setup is the interactive wizard that detects your API keys, picks a compression profile, and starts the proxy. Per-client auto-integration (the forthcoming tokenpak integrate command) has not shipped yet, so after tokenpak setup runs, point your client at http://127.0.0.1:8766 using the one-line export below. See QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart).

Quick start

pip install tokenpak
tokenpak setup                      # interactive wizard — detects keys, picks a profile, starts the proxy

Then point your LLM client at the proxy with one env var. For the Anthropic SDK:

export ANTHROPIC_BASE_URL=http://127.0.0.1:8766

Or for OpenAI-compatible clients:

export OPENAI_BASE_URL=http://127.0.0.1:8766

Then use your client normally. TokenPak compresses requests on the way out and logs savings to a local SQLite ledger.

If you prefer manual configuration (no wizard), tokenpak start brings the proxy up with defaults and you set ANTHROPIC_BASE_URL / OPENAI_BASE_URL yourself.
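If you launch a client from a script rather than a shell, the same two variables can be set programmatically. The sketch below is a minimal illustration, not part of TokenPak: proxied_env is a hypothetical helper name, and it assumes the proxy is listening at the default address shown above.

```python
import os

# Default proxy address as given in this README.
PROXY_URL = "http://127.0.0.1:8766"


def proxied_env(base_env=None, proxy_url=PROXY_URL):
    """Return a copy of an environment with both base-URL variables set.

    Hypothetical helper for illustration; it only sets the two variables
    this README names and leaves everything else untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = proxy_url
    env["OPENAI_BASE_URL"] = proxy_url
    return env


if __name__ == "__main__":
    env = proxied_env()
    print(env["ANTHROPIC_BASE_URL"])
    print(env["OPENAI_BASE_URL"])
```

Passing the result as env= to subprocess.run starts a child process that sees the proxy URLs without changing your own shell session.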

Reproduce the 30–50% headline claim locally: make benchmark-headline.

See QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart) for per-client setup (Claude Code, Cursor, Aider, and others).

What savings look like

After a few proxied requests, tokenpak savings reports the cumulative reduction:

┌──────────────────────────────────────────────────────┐
│  TokenPak — Savings                                  │
├──────────────────────────────────────────────────────┤
│  Sample scenario       DevOps agent (config + logs)  │
│  Savings drivers                      dedup + alias  │
├──────────────────────────────────────────────────────┤
│  Original                                747 tokens  │
│  Compressed                              502 tokens  │
│  Saved                          245 tokens  (32.8%)  │
│  Cost saved (est.)                $0.00073 per call  │
├──────────────────────────────────────────────────────┤
│  Stages: dedup, alias, segmentize, directives        │
└──────────────────────────────────────────────────────┘

Actual numbers depend on your workload. Agent-style prompts with lots of repeated context see the biggest gains.
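The panel's figures can be checked by hand. The sketch below recomputes the saved tokens and percentage from the two token counts, then backs out the per-token price implied by the estimated cost. That implied price is an inference from the sample panel, not a documented TokenPak setting.

```python
def savings(original_tokens, compressed_tokens):
    """Return (tokens saved, percent saved rounded to one decimal)."""
    saved = original_tokens - compressed_tokens
    percent = round(100 * saved / original_tokens, 1)
    return saved, percent


saved, percent = savings(747, 502)
print(saved, percent)  # 245 32.8

# Price per million tokens implied by "$0.00073 per call".
# This is derived from the sample panel, not a documented rate.
implied_price_per_million = 0.00073 / saved * 1_000_000
print(round(implied_price_per_million, 2))  # 2.98
```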

Works with

Any LLM client that respects a custom base URL:

Claude Code · Cursor · Cline · Continue.dev · Aider · OpenAI SDK · Anthropic SDK · LiteLLM · Codex

Per-client configuration steps are in QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart). Auto-wiring via a single tokenpak integrate <client> command is tracked for a future release.

Install

pip install tokenpak

TokenPak's runtime dependencies include anthropic, openai, fastapi, flask, litellm, llmlingua, pandas, pydantic, requests, rich, scipy, sentence-transformers, tree-sitter-languages, watchdog, and a few others — all installed automatically. Note that sentence-transformers and scipy are large (several hundred MB of dependencies); expect pip install to take a few minutes on first install.

Requires Python 3.10+.

See QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart) for virtual-env setup and first-run details.

What's included

Context compression — deterministic pipeline (dedup → alias → segmentize → directives); typical 30–50% token reduction on agent workloads.
Local proxy — runs at 127.0.0.1:8766; zero cloud component.
Model routing — configurable rules with fallback chains.
Cost & savings tracking — per model, per session, per agent; local SQLite (~/.tokenpak/monitor.db).
Dashboard — local web UI for visualizing savings (tokenpak dashboard).
Vault indexing + semantic search — index a directory; search without an LLM call.
A/B testing and request replay — compare compression configs; re-run past requests.
50 built-in compression recipes — YAML, customizable.
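Because the savings ledger is a plain SQLite file, it can be inspected with standard tools. The sketch below lists the tables in ~/.tokenpak/monitor.db without assuming anything about TokenPak's schema, which this README does not document. list_tables is a hypothetical helper, and the file is opened read-only so inspection cannot alter it.

```python
import sqlite3
from pathlib import Path

# Ledger location as stated in the feature list above.
LEDGER = Path.home() / ".tokenpak" / "monitor.db"


def list_tables(db_path):
    """Return the table names in a SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    if LEDGER.exists():
        print(list_tables(LEDGER))
    else:
        print(f"No ledger found at {LEDGER}")
```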

See QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart) and API reference at https://github.com/tokenpak/docs (rendered at tokenpak.ai/api) to get started.

Pro tier

Pro adds team-scale features on top of the OSS core: shared multi-seat dashboards, advanced routing policies, enterprise credential management, priority support, and budget enforcement (429 budget_exceeded on configured caps). It ships as the tokenpak-paid package, distributed through a private license-gated index at pypi.tokenpak.ai. See tokenpak.ai/paid to request access.

Installing the Pro package (after you have a license key):

pip install --index-url https://pypi.tokenpak.ai --extra-index-url https://pypi.org/simple tokenpak-paid
tokenpak activate <your-license-key>

Running pip install tokenpak-paid-stub from public PyPI fetches a discovery stub that prints these install instructions — so pip works as a learning path, not a dead end. The real paid code stays license-gated.

Current limitations

Honest about what isn't ready yet:

No tokenpak integrate <client> auto-wire command — configure clients by env var as shown above. Auto-wire is planned.
No published CI/CD — releases are manual; automation is tracked in the release-workflow standards.
tokenpak demo is a compression-recipes demo (shows recipes applied to a sample input), not the decorated savings panel above. The panel shows what tokenpak savings output can look like after real usage.

We'd rather ship an honest preview than an advertised product that doesn't match install-time reality.

Non-localhost access

TokenPak's default is localhost-only. If you want to expose the proxy to other machines on your LAN, set an auth token:

export TOKENPAK_PROXY_AUTH_TOKEN=$(openssl rand -hex 32)
tokenpak start                # or tokenpak setup for first-time config

Clients then include the token on non-localhost requests:

X-TokenPak-Auth: <your-token>

Localhost (127.0.0.1, ::1) traffic bypasses auth, so your local tools keep working without changes. If TOKENPAK_PROXY_AUTH_TOKEN is not set on the proxy, non-localhost requests return 403 forbidden. If it is set, non-localhost requests with a missing or wrong X-TokenPak-Auth header return 401 unauthorized. The token is stripped from the request before any upstream forward, so provider APIs (Anthropic, OpenAI, etc.) never see it.
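The rules above amount to a small decision table. The following sketch restates them as a function. It illustrates the documented behavior and is not TokenPak's source: auth_status and its parameters are hypothetical names, 200 stands in for "request allowed", and the constant-time comparison is a design choice of this example.

```python
import hmac

# Addresses this README lists as localhost.
LOCALHOST = {"127.0.0.1", "::1"}


def auth_status(remote_addr, configured_token, header_token):
    """Return the HTTP status the documented rules imply for a request.

    remote_addr:      client IP as a string
    configured_token: value of TOKENPAK_PROXY_AUTH_TOKEN on the proxy, or None
    header_token:     value of the X-TokenPak-Auth header, or None
    """
    if remote_addr in LOCALHOST:
        return 200  # localhost bypasses auth
    if not configured_token:
        return 403  # no token configured: non-localhost is forbidden
    if header_token is None:
        return 401  # header missing
    if not hmac.compare_digest(header_token.encode(), configured_token.encode()):
        return 401  # header present but wrong
    return 200


if __name__ == "__main__":
    print(auth_status("127.0.0.1", None, None))
    print(auth_status("192.168.1.20", None, None))
    print(auth_status("192.168.1.20", "secret", "wrong"))
    print(auth_status("192.168.1.20", "secret", "secret"))
```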

Support

Docs: QUICKSTART at https://github.com/tokenpak/docs (rendered at tokenpak.ai/quickstart) · API reference at https://github.com/tokenpak/docs (rendered at tokenpak.ai/api) · FAQ at https://github.com/tokenpak/docs (rendered at tokenpak.ai/faq)
Issues: github.com/tokenpak/tokenpak/issues
Discussions: github.com/tokenpak/tokenpak/discussions
Email: hello@tokenpak.ai

License

Apache 2.0. See LICENSE.

Project details

Maintainers: TokenPak

Release history

Version	Released	Notes
1.3.22	Apr 25, 2026
1.3.20	Apr 24, 2026
1.3.18	Apr 24, 2026
1.3.17	Apr 24, 2026
1.3.16	Apr 24, 2026
1.3.15	Apr 24, 2026
1.3.14	Apr 24, 2026
1.3.13	Apr 24, 2026
1.3.12	Apr 24, 2026
1.3.11	Apr 23, 2026
1.3.10	Apr 23, 2026
1.3.9	Apr 23, 2026
1.3.8	Apr 23, 2026	This version
1.3.7	Apr 23, 2026
1.3.6	Apr 23, 2026
1.3.2	Apr 22, 2026
1.3.1	Apr 22, 2026
1.3.0	Apr 22, 2026
1.2.93	Apr 22, 2026
1.2.92	Apr 22, 2026
1.2.91	Apr 22, 2026
1.2.9	Apr 22, 2026
1.2.8	Apr 22, 2026
1.2.7	Apr 22, 2026
1.2.6	Apr 22, 2026
1.2.5	Apr 22, 2026
1.2.4	Apr 22, 2026
1.2.3	Apr 22, 2026
1.2.2	Apr 22, 2026
1.2.1	Apr 21, 2026
1.2.0	Apr 21, 2026	Yanked: release-blocking bugs; superseded by 1.2.1 (same day)
1.1.0	Apr 21, 2026
1.0.2	Mar 25, 2026	Yanked: will be republished soon
1.0.1	Mar 22, 2026	Yanked: will be republished soon
1.0.0	Mar 22, 2026	Yanked: will be republished soon

Download files

Source Distribution

tokenpak-1.3.8.tar.gz (1.3 MB), uploaded Apr 23, 2026, Source

Built Distribution

tokenpak-1.3.8-py3-none-any.whl (1.6 MB), uploaded Apr 23, 2026, Python 3

File details

Details for the file tokenpak-1.3.8.tar.gz.

File metadata

Download URL: tokenpak-1.3.8.tar.gz
Upload date: Apr 23, 2026
Size: 1.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for tokenpak-1.3.8.tar.gz
Algorithm	Hash digest
SHA256	`3531c668cafbeae284ae9a899d514dd524bed44b8ee21098aebde089312925c6`
MD5	`a14a1b4f7d8a1b9685c000958587786c`
BLAKE2b-256	`87f46a2ed47c460ffe852d0dc21c852ee65552c4c6bcde36bf11a2a18f79cded`
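To check a downloaded sdist against the SHA256 digest listed above, hash the file locally and compare. This is a minimal sketch using only the standard library; sha256_of is a hypothetical helper name, and the file path is assumed to be wherever you saved the download.

```python
import hashlib
import os

# SHA256 digest for tokenpak-1.3.8.tar.gz as listed in the table above.
EXPECTED_SHA256 = "3531c668cafbeae284ae9a899d514dd524bed44b8ee21098aebde089312925c6"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    path = "tokenpak-1.3.8.tar.gz"  # assumed download location
    if os.path.exists(path):
        print("match" if sha256_of(path) == EXPECTED_SHA256 else "MISMATCH")
    else:
        print(f"{path} not found")
```

pip can also enforce digests at install time through its hash-checking mode (--require-hashes with a requirements file).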

Provenance

The following attestation bundles were made for tokenpak-1.3.8.tar.gz:

Publisher: release.yml on tokenpak/tokenpak

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenpak-1.3.8.tar.gz
- Subject digest: 3531c668cafbeae284ae9a899d514dd524bed44b8ee21098aebde089312925c6
- Sigstore transparency entry: 1365004789
- Sigstore integration time: Apr 23, 2026
Source repository:
- Permalink: tokenpak/tokenpak@2c50a8d8cc57d5fda88edff79967e5baaf46663c
- Branch / Tag: refs/tags/v1.3.8
- Owner: https://github.com/tokenpak
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2c50a8d8cc57d5fda88edff79967e5baaf46663c
- Trigger Event: push

File details

Details for the file tokenpak-1.3.8-py3-none-any.whl.

File metadata

Download URL: tokenpak-1.3.8-py3-none-any.whl
Upload date: Apr 23, 2026
Size: 1.6 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for tokenpak-1.3.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`14f77f24e8cd5ad1d7245ef0532a80b6987dfa0e1cfbba3dc7be11f0b94c2bba`
MD5	`81e2fce9a0b36f7dc42142f3d0dbbb58`
BLAKE2b-256	`d235293ffb0d173cc3bd97b8967529456829dcee62d025fb2adbe5a6cacfba4c`

Provenance

The following attestation bundles were made for tokenpak-1.3.8-py3-none-any.whl:

Publisher: release.yml on tokenpak/tokenpak

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tokenpak-1.3.8-py3-none-any.whl
- Subject digest: 14f77f24e8cd5ad1d7245ef0532a80b6987dfa0e1cfbba3dc7be11f0b94c2bba
- Sigstore transparency entry: 1365004800
- Sigstore integration time: Apr 23, 2026
Source repository:
- Permalink: tokenpak/tokenpak@2c50a8d8cc57d5fda88edff79967e5baaf46663c
- Branch / Tag: refs/tags/v1.3.8
- Owner: https://github.com/tokenpak
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2c50a8d8cc57d5fda88edff79967e5baaf46663c
- Trigger Event: push
