TokenJam — local-first OTel-native observability for Autonomous AI agents
TokenJam
Token Efficiency For AI Agents
TokenJam reads your agent's telemetry and tells you when to downsize, when to trim prompts, what to cache, and what to script. The result is a lower AI bill. Runs entirely on your machine.
pipx install tokenjam
Don't have pipx? brew install pipx on macOS, apt install pipx on Debian/Ubuntu, or see docs/installation.md. pip install tokenjam also works in a clean venv.
No cloud · No signup · No vendor lock-in
Four Analyzers. One Install.
TokenJam reads telemetry from every major agent runtime, framework, provider, and observability tool and surfaces savings across four areas.
- 🪶 Downsize: Flags sessions where a cheaper model in the same family is worth a look. Never claims quality equivalence; surfaces examples so you can spot-check.
- 💾 Cache: Shows your current caching ratio per (provider, model) and suggests Anthropic prompt-cache breakpoints from stable prefixes in your real usage.
- 📜 Script: Finds clusters of deterministic
- ✂️ Trim: Predicts which regions of your prompts the model gives little weight to. Surfaces what's safe to cut.
Run all four with tj optimize. Run several with tj optimize downsize cache trim.
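To make the Cache analyzer's headline number concrete, here is a minimal sketch of a caching ratio grouped per (provider, model). The record fields (`provider`, `model`, `input_tokens`, `cached_input_tokens`), the sample values, and the definition of the ratio as cached input tokens divided by total input tokens are all assumptions for illustration. They are not TokenJam's actual schema or formula.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def cache_ratios(calls: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Return cached / total input tokens for each (provider, model) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [cached tokens, all input tokens]
    for call in calls:
        key = (call["provider"], call["model"])
        totals[key][0] += call["cached_input_tokens"]
        totals[key][1] += call["input_tokens"]
    # Guard against division by zero for pairs with no input tokens.
    return {
        key: (cached / total if total else 0.0)
        for key, (cached, total) in totals.items()
    }


# Hypothetical records; real telemetry will have different fields and names.
calls = [
    {"provider": "anthropic", "model": "model-a", "input_tokens": 1000, "cached_input_tokens": 800},
    {"provider": "anthropic", "model": "model-a", "input_tokens": 1000, "cached_input_tokens": 200},
    {"provider": "openai", "model": "model-b", "input_tokens": 500, "cached_input_tokens": 0},
]
ratios = cache_ratios(calls)
for (provider, model), ratio in sorted(ratios.items()):
    print(f"{provider}/{model}: {ratio:.0%} of input tokens served from cache")
```

A low ratio on a model that receives long, repetitive prompts is the kind of case where a prompt-cache breakpoint may be worth investigating.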
30-second quickstart
For Claude Code users — zero code, auto-backfills your last 30 days:
pipx install 'tokenjam[mcp]'
tj onboard --claude-code
tj optimize # cost-saving candidates from your actual usage
To upgrade later: pipx upgrade tokenjam (then tj stop && tj serve & to reload the daemon, and tj --version to verify). See docs/installation.md.
For any Python agent:
from tokenjam.sdk import watch
from tokenjam.sdk.integrations.anthropic import patch_anthropic

patch_anthropic()

@watch(agent_id="my-agent")
def run(task: str) -> str:
    ...
→ Python SDK · TypeScript SDK · Codex · OTel-compatible agents
Why local-first matters
Your spans contain prompts, completions, tool inputs, and customer data. Shipping that to a SaaS vendor for "observability" is a data-egress decision most teams aren't ready to make.
| | TokenJam | LangSmith | Langfuse | Datadog LLM Obs |
|---|---|---|---|---|
| Signup required | ❌ | ✅ | ✅ | ✅ |
| Data leaves your machine | ❌ | ✅ | cloud only | ✅ |
| Cost-optimization analyzers (Downsize, Cache, Script, Trim) | ✅ | ❌ | ❌ | ❌ |
| Real-time sensitive-action alerts | ✅ | ❌ | ❌ | ❌ |
| Behavioral drift detection | ✅ | ❌ | ❌ | ❌ |
| OTel GenAI SemConv native | ✅ | partial | partial | partial |
| Works with any agent / framework | ✅ | LangChain-first | partial | ❌ |
| Free, MIT licensed | ✅ | freemium | freemium | paid |
Web UI
tj serve runs a local dashboard at http://127.0.0.1:7391/ with status, traces, cost breakdown, alerts, budget, and drift.
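If you want a script to confirm the dashboard is reachable before opening it, a small standard-library check is enough. This is a sketch, not part of TokenJam: the function name `dashboard_is_up` is hypothetical, and the only detail taken from above is the address http://127.0.0.1:7391/. It tests only that some HTTP server answers there and does not inspect the response.

```python
import urllib.error
import urllib.request

DASHBOARD_URL = "http://127.0.0.1:7391/"  # address quoted in this README


def dashboard_is_up(url: str = DASHBOARD_URL, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at `url`, otherwise False."""
    # Bypass any configured proxies, since the target is on localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # A server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("dashboard reachable:", dashboard_is_up())
```

`HTTPError` is caught before `URLError` because it is a subclass of it; reversing the order would report an error response as "down".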
Beyond optimization
TokenJam is also a full observability stack. The four analyzers ride on top.
- Real-time cost tracking: every LLM call priced as it happens
- Safety alerts: 13 alert types, 6 channels (ntfy, Discord, Telegram, webhook, file, stdout)
- Behavioral drift detection: Z-score baselines, no LLM required
- Schema validation: declare or infer JSON Schema for tool outputs
- OTel-native: point any OTLP exporter at tj serve and you're done
- MCP server: 14 tools letting Claude Code query its own telemetry mid-session
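The Z-score approach named in the drift bullet can be sketched in a few lines. This is a generic illustration of the technique, not TokenJam's implementation: the metric (tool calls per session), the sample baseline, the threshold of 3 standard deviations, and the function names are all assumptions.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def z_score(baseline: Sequence[float], value: float) -> float:
    """Standard score of `value` relative to the baseline samples."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation
    if sigma == 0:
        # A constant baseline: any deviation at all is infinitely unusual.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


def has_drifted(baseline: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Flag `value` when it lies more than `threshold` deviations from the mean."""
    return abs(z_score(baseline, value)) > threshold


# Hypothetical metric: tool calls per session over recent runs.
baseline = [10, 12, 11, 9, 13, 10, 11]
for observed in (11, 25):
    print(observed, "drifted" if has_drifted(baseline, observed) else "within baseline")
```

Because the check is plain arithmetic over stored metrics, it needs no model call, which matches the "no LLM required" claim above.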
CLI
tj optimize # all four cost-optimization analyzers
tj optimize downsize # one analyzer
tj status # current cost, tokens, active alerts
tj cost --since 7d # spend by agent / model / day / tool
tj alerts # everything that fired while you were away
tj drift # behavioral drift Z-scores
tj backfill claude-code # ingest historical ~/.claude/projects/ sessions
tj serve # start the web UI + REST API
Documentation
| Topic | Where |
|---|---|
| 🪶 Downsize / Cache / Script / Trim deep-dives | docs/optimize/ |
| Claude Code & Codex integration | docs/claude-code-integration.md |
| Python SDK reference | docs/python-sdk.md |
| TypeScript SDK reference | docs/typescript-sdk.md |
| Framework support (LangChain / CrewAI / etc.) | docs/framework-support.md |
| Alert channels & rule reference | docs/alerts.md |
| Backfill from Langfuse / Helicone / OTLP | docs/backfill/ |
| Configuration | docs/configuration.md |
| Architecture deep-dive | docs/architecture.md |
| Installation extras (Trim, framework patches) | docs/installation.md |
| Export to Grafana / Datadog / NDJSON | docs/export.md |
| NemoClaw sandbox observer | docs/nemoclaw-integration.md |
Roadmap
Shipped in 0.3.x: Downsize · Cache · Script · Trim · Claude Code + Codex onboarding · MCP server · Web UI · Backfill adapters (Langfuse, Helicone, OTLP) · Period comparison · Routing-config export · Read-only policy preview
Up next:
- tj policy add | edit | apply: unified rule surface
- tj replay: replay captured sessions against new model versions
- TypeScript framework patches (LangChain JS, OpenAI Agents SDK)
- Vercel AI SDK & Mastra integrations
- Docker image
- GitHub Actions for CI drift/cost checks
tokenjam.dev · PyPI · npm · Issues
MIT License · Built by Metabuilder Labs