TokenJam — local-first OTel-native observability for Autonomous AI agents

TokenJam

Token Efficiency For AI Agents

TokenJam reads your agent's telemetry and tells you when to downsize, when to trim prompts, what to cache, and what to script. The result is a lower AI bill. Runs entirely on your machine.

pipx install tokenjam

Don't have pipx? Install it with brew install pipx on macOS or apt install pipx on Debian/Ubuntu, or see docs/installation.md. pip install tokenjam also works in a clean venv.

No cloud · No signup · No vendor lock-in


Four Analyzers. One Install.

TokenJam reads telemetry from every major agent runtime, framework, provider, and observability tool and surfaces savings across four areas.

🪶 Downsize

Flags sessions where a cheaper model in the same family is worth a look. Never claims quality equivalence — surfaces examples so you can spot-check.

tj optimize downsize

Details →
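
To make the Downsize idea concrete, here is a minimal sketch of the arithmetic behind "a cheaper model is worth a look": re-price one session's token counts on a cheaper model and report the difference. This is not TokenJam's implementation. The model names, the prices in `PRICES`, and the session fields are all hypothetical placeholders, and the estimate says nothing about output quality.

```python
# Hypothetical prices in USD per million tokens. These are NOT real provider
# prices; substitute the published rates for the models you actually use.
PRICES = {
    "big-model": {"input": 10.0, "output": 30.0},
    "small-model": {"input": 1.0, "output": 3.0},
}


def session_cost(model, input_tokens, output_tokens):
    """Price one session's tokens at the given model's per-million rates."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def downsize_saving(session, cheaper_model):
    """Estimated USD saved if the same tokens had run on cheaper_model."""
    current = session_cost(session["model"], session["input_tokens"], session["output_tokens"])
    candidate = session_cost(cheaper_model, session["input_tokens"], session["output_tokens"])
    return current - candidate


# Hypothetical session record; the field names are assumptions.
session = {"model": "big-model", "input_tokens": 200_000, "output_tokens": 50_000}
saving = downsize_saving(session, "small-model")
print(f"estimated saving: ${saving:.2f}")
```

The estimate assumes the cheaper model would consume the same number of tokens, which is a simplification; treat the result as a candidate to spot-check, not a guarantee.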

💾 Cache

Shows your current caching ratio per (provider, model) and suggests Anthropic prompt-cache breakpoints from stable prefixes in your real usage.

tj optimize cache

Details →
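
A caching ratio per (provider, model) can be sketched as cached input tokens divided by total input tokens, grouped by that pair. The record fields below (`input_tokens`, `cached_input_tokens`) and the model names are hypothetical, not TokenJam's schema, and the sketch assumes `input_tokens` already includes the cached portion.

```python
from collections import defaultdict


def cache_ratio_by_model(calls):
    """Return {(provider, model): cached input tokens / total input tokens}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [cached, total]
    for call in calls:
        key = (call["provider"], call["model"])
        totals[key][0] += call["cached_input_tokens"]
        totals[key][1] += call["input_tokens"]
    # Guard against division by zero for models with no input tokens recorded.
    return {
        key: (cached / total if total else 0.0)
        for key, (cached, total) in totals.items()
    }


# Hypothetical call records; field and model names are assumptions.
calls = [
    {"provider": "anthropic", "model": "model-a", "input_tokens": 1000, "cached_input_tokens": 800},
    {"provider": "anthropic", "model": "model-a", "input_tokens": 1000, "cached_input_tokens": 200},
    {"provider": "openai", "model": "model-b", "input_tokens": 500, "cached_input_tokens": 0},
]
ratios = cache_ratio_by_model(calls)
for key, ratio in sorted(ratios.items()):
    print(key, f"{ratio:.0%}")
```

A low ratio on a model that repeatedly receives the same long prefix is the kind of case where a cache breakpoint is worth considering.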

📜 Script

Finds clusters of deterministic (tool_name, arg_shape) sequences that match the shape of work a plain script could replace.

tj optimize script

Details →
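
One way to picture a (tool_name, arg_shape) sequence: reduce each tool call to its name plus the names and types of its arguments, then count how often the same sequence recurs across sessions. The sketch below is an illustration under that assumption, not TokenJam's clustering. The record layout and the definition of "shape" as sorted (argument name, type name) pairs are both hypothetical.

```python
from collections import Counter


def arg_shape(args):
    """Reduce arguments to sorted (name, type name) pairs, dropping the values."""
    return tuple(sorted((name, type(value).__name__) for name, value in args.items()))


def session_signature(tool_calls):
    """The ordered sequence of (tool_name, arg_shape) pairs for one session."""
    return tuple((call["tool_name"], arg_shape(call["args"])) for call in tool_calls)


def repeated_signatures(sessions, min_count=2):
    """Signatures seen at least min_count times, with their counts."""
    counts = Counter(session_signature(s) for s in sessions)
    return {sig: n for sig, n in counts.items() if n >= min_count}


# Hypothetical sessions; tool names and record layout are assumptions.
sessions = [
    [{"tool_name": "read_file", "args": {"path": "a.txt"}},
     {"tool_name": "write_file", "args": {"path": "b.txt", "content": "x"}}],
    [{"tool_name": "read_file", "args": {"path": "c.txt"}},
     {"tool_name": "write_file", "args": {"path": "d.txt", "content": "y"}}],
    [{"tool_name": "search", "args": {"query": "foo", "limit": 5}}],
]
repeated = repeated_signatures(sessions)
for signature, count in repeated.items():
    print(count, [tool for tool, _ in signature])
```

Here the first two sessions differ only in argument values, so they share a signature. Exact-match counting is the simplest form of grouping; a real analyzer would likely tolerate small variations.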

✂️ Trim

Predicts which regions of your prompts the model gives little weight to. Surfaces what's safe to cut.

tj optimize trim

Details →

Run all four with tj optimize. Run several with tj optimize downsize cache trim.


30-second quickstart

For Claude Code users — zero code, auto-backfills your last 30 days:

pipx install 'tokenjam[mcp]'
tj onboard --claude-code
tj optimize          # cost-saving candidates from your actual usage

To upgrade later, run pipx upgrade tokenjam, then restart the daemon with tj stop && tj serve & and confirm the new version with tj --version. See docs/installation.md.

For any Python agent:

from tokenjam.sdk import watch
from tokenjam.sdk.integrations.anthropic import patch_anthropic

patch_anthropic()

@watch(agent_id="my-agent")
def run(task: str) -> str:
    ...

Python SDK · TypeScript SDK · Codex · OTel-compatible agents


Why local-first matters

Your spans contain prompts, completions, tool inputs, and customer data. Shipping that to a SaaS vendor for "observability" is a data-egress decision most teams aren't ready to make.

TokenJam LangSmith Langfuse Datadog LLM Obs
Signup required
Data leaves your machine cloud only
Cost-optimization analyzers (Downsize, Cache, Script, Trim)
Real-time sensitive-action alerts
Behavioral drift detection
OTel GenAI SemConv native partial partial partial
Works with any agent / framework LangChain-first partial
Free, MIT licensed freemium freemium paid

Web UI

tj serve runs a local dashboard at http://127.0.0.1:7391/ with status, traces, cost breakdown, alerts, budget, and drift.

Beyond optimization

TokenJam is also a full observability stack. The four analyzers ride on top.

  • Real-time cost tracking — every LLM call priced as it happens
  • Safety alerts — 13 alert types, 6 channels (ntfy, Discord, Telegram, webhook, file, stdout)
  • Behavioral drift detection — Z-score baselines, no LLM required
  • Schema validation — declare or infer JSON Schema for tool outputs
  • OTel-native — point any OTLP exporter at tj serve and you're done
  • MCP server — 14 tools letting Claude Code query its own telemetry mid-session
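
The Z-score baseline mentioned in the list above can be illustrated with the standard library alone. This is a generic sketch of the statistic, not TokenJam's drift detector. The metric (tokens per session), the threshold of 3.0, and the function names are assumptions chosen for the example.

```python
import math
from statistics import mean, stdev


def z_score(baseline, value):
    """Standard deviations between value and the baseline mean.

    Needs at least two baseline points, since stdev() is the sample deviation.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A flat baseline: any different value is infinitely far away.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


def has_drifted(baseline, value, threshold=3.0):
    """Flag a value whose absolute Z-score exceeds the (assumed) threshold."""
    return abs(z_score(baseline, value)) > threshold


# Hypothetical baseline: tokens used per session over recent runs.
baseline_tokens = [1000, 1100, 900, 1050, 950]
print(has_drifted(baseline_tokens, 1600))  # far above the baseline
print(has_drifted(baseline_tokens, 1020))  # within normal variation
```

Because the check is plain arithmetic over past measurements, it needs no model call, which fits the "no LLM required" description.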

CLI

tj optimize            # all four cost-optimization analyzers
tj optimize downsize   # one analyzer
tj status              # current cost, tokens, active alerts
tj cost --since 7d     # spend by agent / model / day / tool
tj alerts              # everything that fired while you were away
tj drift               # behavioral drift Z-scores
tj backfill claude-code # ingest historical ~/.claude/projects/ sessions
tj serve               # start the web UI + REST API

Full CLI reference →


Documentation

Topic Where
🪶 Downsize / Cache / Script / Trim deep-dives docs/optimize/
Claude Code & Codex integration docs/claude-code-integration.md
Python SDK reference docs/python-sdk.md
TypeScript SDK reference docs/typescript-sdk.md
Framework support (LangChain / CrewAI / etc.) docs/framework-support.md
Alert channels & rule reference docs/alerts.md
Backfill from Langfuse / Helicone / OTLP docs/backfill/
Configuration docs/configuration.md
Architecture deep-dive docs/architecture.md
Installation extras (Trim, framework patches) docs/installation.md
Export to Grafana / Datadog / NDJSON docs/export.md
NemoClaw sandbox observer docs/nemoclaw-integration.md

Roadmap

Shipped in 0.3.x: Downsize · Cache · Script · Trim · Claude Code + Codex onboarding · MCP server · Web UI · Backfill adapters (Langfuse, Helicone, OTLP) · Period comparison · Routing-config export · Read-only policy preview

Up next:

  • tj policy add | edit | apply — unified rule surface
  • tj replay — replay captured sessions against new model versions
  • TypeScript framework patches (LangChain JS, OpenAI Agents SDK)
  • Vercel AI SDK & Mastra integrations
  • Docker image
  • GitHub Actions for CI drift/cost checks

tokenjam.dev · PyPI · npm · Issues

MIT License · Built by Metabuilder Labs
