TokenJam — local-first OTel-native observability for Autonomous AI agents

TokenJam

Token Efficiency For AI Agents

TokenJam reads your agent's telemetry and tells you when to downsize, when to trim prompts, what to cache, what to script, and what plans you've already paid to figure out — then shows it all in a local browser dashboard. Runs entirely on your machine.

pipx install tokenjam

Don't have pipx? Install it with brew install pipx on macOS or apt install pipx on Debian/Ubuntu, or see docs/installation.md. pip install tokenjam also works in a clean venv.

No cloud · No signup · No vendor lock-in


Five Analyzers + Lens. One Install.

TokenJam reads telemetry from every major agent runtime, framework, provider, and observability tool and surfaces savings across five areas — then brings them together in a local browser dashboard.

🪶 Downsize

Flags sessions where a cheaper model in the same family is worth a look. Never claims quality equivalence — surfaces examples so you can spot-check.

tj optimize downsize

Details →

💾 Cache

Shows your current caching ratio per (provider, model) and suggests Anthropic prompt-cache breakpoints from stable prefixes in your real usage.

tj optimize cache

Details →
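
To make "caching ratio per (provider, model)" concrete, here is a minimal sketch. It is not TokenJam's implementation: the record fields (provider, model, input_tokens, cache_read_tokens) and the ratio definition (cached reads divided by all input tokens) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical call records; field names are illustrative, not TokenJam's schema.
calls = [
    {"provider": "anthropic", "model": "model-a", "input_tokens": 200, "cache_read_tokens": 800},
    {"provider": "anthropic", "model": "model-a", "input_tokens": 1000, "cache_read_tokens": 0},
    {"provider": "openai", "model": "model-b", "input_tokens": 500, "cache_read_tokens": 0},
]


def cache_ratios(records):
    """Return {(provider, model): cached input tokens / total input tokens}."""
    totals = defaultdict(lambda: [0, 0])  # key -> [cached, total]
    for r in records:
        key = (r["provider"], r["model"])
        totals[key][0] += r["cache_read_tokens"]
        totals[key][1] += r["input_tokens"] + r["cache_read_tokens"]
    # Guard against division by zero for keys with no input tokens at all.
    return {k: (cached / total if total else 0.0) for k, (cached, total) in totals.items()}


ratios = cache_ratios(calls)
for key, ratio in sorted(ratios.items()):
    print(key, f"{ratio:.0%}")
```

A low ratio on a model that receives long, stable prefixes is the kind of case where a cache breakpoint is worth a look.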

📜 Script

Finds clusters of deterministic (tool_name, arg_shape) sequences that match the shape of work a plain script could replace.

tj optimize script

Details →
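
One way to picture a (tool_name, arg_shape) cluster is sketched below. This is an illustration under assumptions, not the analyzer's actual algorithm: "arg shape" is taken here to mean the argument names and value types, and the session format is hypothetical.

```python
from collections import Counter


def arg_shape(args):
    """Reduce tool arguments to a shape: sorted (name, type name) pairs."""
    return tuple(sorted((k, type(v).__name__) for k, v in args.items()))


def sequence_signature(tool_calls):
    """Signature of a session: its ordered (tool_name, arg_shape) pairs."""
    return tuple((c["tool"], arg_shape(c["args"])) for c in tool_calls)


# Hypothetical sessions; each is an ordered list of tool calls.
sessions = [
    [{"tool": "read_file", "args": {"path": "a.txt"}},
     {"tool": "grep", "args": {"pattern": "x", "path": "a.txt"}}],
    [{"tool": "read_file", "args": {"path": "b.txt"}},
     {"tool": "grep", "args": {"pattern": "y", "path": "b.txt"}}],
    [{"tool": "web_search", "args": {"query": "z"}}],
]

counts = Counter(sequence_signature(s) for s in sessions)
# Signatures seen more than once are candidates for a plain script.
repeated = {sig: n for sig, n in counts.items() if n > 1}
```

The first two sessions differ only in argument values, so they share one signature; the third stands alone.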

✂️ Trim

Predicts which regions of your prompts the model gives little weight to. Surfaces what's safe to cut.

tj optimize trim

Details →

🔁 Reuse

Detects clusters of sessions where your agent re-plans the same work and exports reviewable skeleton templates you can drop into a slash command or script.

tj optimize reuse

Details →

🔭 Lens

A local browser dashboard that brings every analyzer's findings, your real spend, and your alerts together in one place. No cloud, no signup, fully offline.

tj serve

Details →

Run all five analyzers with tj optimize. Run several with tj optimize downsize cache reuse.


30-second quickstart

For Claude Code users — zero code, auto-backfills your last 30 days:

pipx install tokenjam
tj onboard --claude-code
tj optimize          # cost-saving candidates from your actual usage
tj serve             # open the dashboard at http://127.0.0.1:7391/

To upgrade later, run pipx upgrade tokenjam, then tj stop && tj serve & to reload the daemon and tj --version to verify. See docs/installation.md.

For any Python agent:

from tokenjam.sdk import watch
from tokenjam.sdk.integrations.anthropic import patch_anthropic

patch_anthropic()

@watch(agent_id="my-agent")
def run(task: str) -> str:
    ...

Python SDK · TypeScript SDK · Codex · OTel-compatible agents


Lens — the local dashboard

tj serve runs Lens at http://127.0.0.1:7391/: an Overview triage screen with spend, recoverable waste, and health at a glance; an Optimize tab showing every analyzer's findings side by side; and the standard Status, Traces, Cost, Alerts, Drift, and Budget screens. Plan-tier-aware, fully offline, no signup.

[Screenshots: Status screen · Cost screen with spend-over-time chart · Traces table · Alerts table]

See tokenjam.dev/products/lens for the visual walkthrough.


Beyond optimization

TokenJam is also a full observability stack. The five analyzers and Lens ride on top.

  • Real-time cost tracking — every LLM call priced as it happens
  • Safety alerts — 13 alert types, 6 channels (ntfy, Discord, Telegram, webhook, file, stdout)
  • Behavioral drift detection — Z-score baselines, no LLM required
  • Schema validation — declare or infer JSON Schema for tool outputs
  • OTel-native — point any OTLP exporter at tj serve and you're done
  • MCP server — lets Claude Code query its own telemetry mid-session
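
The Z-score baseline idea behind drift detection can be sketched in a few lines. This is a generic illustration, not TokenJam's code: the metric (tokens per session), the baseline window, and the threshold of 3.0 are all assumptions.

```python
from statistics import mean, stdev


def z_score(baseline, value):
    """How many sample standard deviations `value` lies from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline points
    if sigma == 0:
        # Flat baseline: any deviation at all is infinitely unusual.
        if value == mu:
            return 0.0
        return float("inf") if value > mu else float("-inf")
    return (value - mu) / sigma


# Hypothetical tokens-per-session history for one agent.
baseline_tokens = [1000, 1100, 900, 1050, 950]
THRESHOLD = 3.0  # assumed alerting threshold

z = z_score(baseline_tokens, 1400)
drifted = abs(z) > THRESHOLD
print(f"z = {z:.2f}, drifted = {drifted}")
```

Because the check is plain arithmetic over stored metrics, no LLM call is needed to evaluate it.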

CLI

tj optimize            # all five cost-optimization analyzers
tj optimize downsize   # one analyzer (positional args)
tj tokenmaxx           # shareable spend-tier callout
tj status              # current cost, tokens, active alerts
tj cost --since 7d     # spend by agent / model / day / tool
tj alerts              # everything that fired while you were away
tj drift               # behavioral drift Z-scores
tj report --reuse      # HTML + Markdown skeleton export for the Reuse analyzer
tj backfill claude-code # ingest historical ~/.claude/projects/ sessions
tj serve               # start Lens + REST API

Full CLI reference →


Documentation

Topic                                             Where
🪶 Downsize / Cache / Script / Trim deep-dives     docs/optimize/
🔁 Reuse analyzer deep-dive                        docs/optimize/reuse.md
Claude Code & Codex integration                   docs/claude-code-integration.md
Python SDK reference                              docs/python-sdk.md
TypeScript SDK reference                          docs/typescript-sdk.md
Framework support (LangChain / CrewAI / etc.)     docs/framework-support.md
Alert channels & rule reference                   docs/alerts.md
Backfill from Langfuse / Helicone / OTLP          docs/backfill/
Configuration                                     docs/configuration.md
Architecture deep-dive                            docs/architecture.md
Installation extras (Trim, framework patches)     docs/installation.md
Export to Grafana / Datadog / NDJSON              docs/export.md
NemoClaw sandbox observer                         docs/nemoclaw-integration.md
Release notes                                     GitHub Releases

Roadmap

Shipped in 0.3.x: Downsize · Cache · Script · Trim · Claude Code + Codex onboarding · MCP server · Web UI · Backfill adapters (Langfuse, Helicone, OTLP) · Period comparison · Routing-config export · Read-only policy preview

Shipped in 0.4.x:

  • TokenJam Lens — local dashboard rebrand: Overview triage front-door, Optimize detail tab, real spend-over-time charts, cross-screen drill-through
  • Reuse analyzer — fifth analyzer: detects clusters of sessions with repeated planning, exports reviewable skeleton templates you can convert into slash commands or scripts
  • Daemon DB concurrency — per-thread DuckDB cursors so the Overview's fan-out doesn't block on a single shared connection (v0.4.1)
  • Cache cost transparency: cache_read + cache_write token columns surfaced in CLI + UI + API (the previously-hidden ~91% cost driver on cache-heavy workloads)
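
To see why separate cache_read and cache_write columns matter, the sketch below splits a session's cost by token category. The per-million-token prices and the usage numbers are invented for illustration and do not reflect any provider's real pricing or TokenJam's pricing tables.

```python
# Hypothetical USD prices per million tokens; not real provider pricing.
PRICES = {"input": 3.00, "output": 15.00, "cache_read": 0.30, "cache_write": 3.75}


def cost_breakdown(tokens, prices=PRICES):
    """Return (cost per category, total cost, share of total due to cache tokens)."""
    costs = {k: tokens.get(k, 0) * prices[k] / 1_000_000 for k in prices}
    total = sum(costs.values())
    cache = costs["cache_read"] + costs["cache_write"]
    return costs, total, (cache / total if total else 0.0)


# Hypothetical cache-heavy session: few fresh input tokens, many cached reads.
usage = {"input": 10_000, "output": 5_000, "cache_read": 2_000_000, "cache_write": 100_000}
costs, total, cache_share = cost_breakdown(usage)
print(f"total ${total:.2f}, cache share {cache_share:.0%}")
```

With only input and output columns visible, most of this made-up session's spend would go unexplained.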

Up next:

  • tj policy add | edit | apply — unified rule surface
  • tj replay — replay captured sessions against new model versions
  • TypeScript framework patches (LangChain JS, OpenAI Agents SDK)
  • Vercel AI SDK & Mastra integrations
  • Docker image
  • GitHub Actions for CI drift/cost checks

tokenjam.dev · PyPI · npm · Issues

MIT License · Built by Metabuilder Labs

TokenJam was created by Anil Murty — reach him at anil@metabldr.com.
