See where your LLM tokens actually go — down to the individual tool call. A small, local, zero-config proxy + dashboard that tracks exact token usage and cost across Claude / OpenAI / Gemini, and breaks input-token spend down per tool (Read, Bash, MCP, ...).

tokview

Wrap your coding agent and watch where every token goes — live, in your terminal, down to the individual tool call.

tokview terminal TUI demo: live spend, sessions, drill-downs, cache reads, and tool hotspots

A Codex or Claude Code session burns through millions of tokens, and all you get back is a bill — or, on a subscription, nothing at all. tokview is a tiny local proxy that sits in front of your agent and shows you, as it runs, exactly where the tokens go: by session, by request, by model, and — uniquely — by tool call. No account, no cloud, no code changes.

Try it in 30 seconds

uv tool install token-viewer      # PyPI name is token-viewer; the command is `tokview`
# or: pipx install token-viewer

tokview show --watch              # live terminal dashboard (one terminal)
tokview wrap claude               # run your agent through tokview (another terminal)
#   or:  tokview wrap codex

That's the whole workflow: wrap your agent, show the tokens. Agent flags pass straight through (for example, tokview wrap codex --model gpt-5.5 --search), and multiple Codex/Claude sessions can run at once, each appearing separately in tokview show.

Where your tokens actually go

Most counters stop at "this call used 180k tokens." tokview tells you which tool spent them — and catches the cost nothing else surfaces: a big tool result (a file Read, an MCP dump, a grep over your repo) gets re-sent into every later turn, silently multiplying your input bill. Often the single largest line item in a session is one file your agent re-read a dozen times.

And it sees traffic normal token counters can't:

  • Subscription agents. Codex and Claude Code OAuth / WebSocket traffic, not just API keys — with an estimated equivalent API spend per session, so you know what your subscription session would have cost on metered pricing.
  • Streaming, tool calls, and provider-compatible SDKs, captured at the proxy with zero app instrumentation.

Tool-level numbers are token estimates, not dollars — providers bill per model call, and cache discounts make per-tool dollars misleading.
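
To make the re-send effect described above concrete, here is a minimal arithmetic sketch. It is not tokview's own accounting: the function name and the numbers are hypothetical, and it ignores cache discounts and any context compaction the agent may perform.

```python
def resent_input_tokens(result_tokens: int, turns_in_context: int) -> int:
    """Input tokens attributable to one tool result that stays in context.

    The result is sent as input once per turn for as long as it remains in
    the conversation, so its input cost scales with the number of turns.
    """
    return result_tokens * turns_in_context


# Hypothetical session: a 12,000-token file read stays in context for 15 turns.
total = resent_input_tokens(12_000, 15)
print(total)  # 180000
```

Under these assumptions a single 12k-token read accounts for 180k input tokens, which is why one large tool result can dominate a session.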

Browser dashboard

The browser dashboard is still available when you want a wider visual scan, but the terminal TUI is the primary workflow.

Works with

| Client | Use | Notes |
|---|---|---|
| Codex subscription | tokview wrap codex | HTTP + WebSocket Responses traffic, including ChatGPT-auth Codex backend calls. |
| Claude Code subscription / OAuth | tokview wrap claude | Native Anthropic Messages forwarding for subscription/OAuth and API-key traffic. |
| OpenAI-compatible SDKs | OPENAI_BASE_URL=http://127.0.0.1:4000/v1 | API-key traffic through LiteLLM. |
| Anthropic-compatible SDKs | ANTHROPIC_BASE_URL=http://127.0.0.1:4000 | Native Anthropic-compatible proxying. |
| Gemini-compatible SDKs | GOOGLE_BASE_URL=http://127.0.0.1:4000 | Direct proxy mode. |

If a client can point at a provider-compatible base URL, tokview can usually observe it — no instrumentation required.
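
As an illustration of pointing a client at the proxy, the sketch below builds an environment for a child process using the base-URL variables listed in the table above. The helper name `env_with_proxy` and the script name in the usage comment are hypothetical; whether a given SDK honors these variables depends on that SDK.

```python
import os

# Variable names and URLs copied from the "Works with" table.
PROXY_ENV = {
    "openai": ("OPENAI_BASE_URL", "http://127.0.0.1:4000/v1"),
    "anthropic": ("ANTHROPIC_BASE_URL", "http://127.0.0.1:4000"),
    "gemini": ("GOOGLE_BASE_URL", "http://127.0.0.1:4000"),
}


def env_with_proxy(provider: str, base_env=None) -> dict:
    """Return a copy of the environment with the provider's base URL set.

    `base_env` defaults to the current process environment; the original
    mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    name, url = PROXY_ENV[provider]
    env[name] = url
    return env


# Usage (hypothetical script name):
#   import subprocess, sys
#   subprocess.run([sys.executable, "my_agent.py"], env=env_with_proxy("openai"))
```

Passing the environment to a child process keeps the override scoped to that one run rather than changing your shell configuration.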

What you get

  • Live spend by session, request, provider, and model.
  • Input, output, cache-read, cache-write, and reasoning token counters when reported.
  • Estimated equivalent API spend for subscription traffic.
  • Tool argument/result token estimates — including Codex shell command families like read, grep, find, pytest, and npm.
  • A local SQLite history at ~/.tokview/db.sqlite.
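
Because the history is an ordinary SQLite file, it can be inspected with Python's standard library. The database schema is not documented here, so this sketch assumes nothing about table or column names and only lists whatever tables exist. The helper name `list_tables` is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path) -> list[str]:
    """Return the names of all tables in a SQLite file, sorted by name."""
    # mode=ro opens the file read-only, so inspection cannot alter the history.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    db = Path("~/.tokview/db.sqlite").expanduser()
    if db.exists():
        print(list_tables(db))
```

From there, `PRAGMA table_info(<table>)` shows the columns of any table the listing returns.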

vs. other token counters

| Approach | Good for | What tokview adds |
|---|---|---|
| Provider dashboards | Billing totals | Live, local session / request / tool views. |
| SDK observability (Langfuse, etc.) | Instrumented apps | Wrapping a CLI you can't modify; localhost-only capture. |
| Claude/Codex log readers | Post-hoc summaries | Live proxy traffic + SDK coverage as it happens. |
| Tokenizers | Prompt-size guesses | Real provider usage, cache counters, streaming data. |

How it works

Codex / Claude / SDKs ──► tokview local proxy ──► provider backend
                                  │
                                  ├─► SQLite  ~/.tokview/db.sqlite
                                  ├─► tokview show --watch   (terminal)
                                  └─► optional browser dashboard

  • API-key traffic routes through LiteLLM where that's the right layer.
  • Codex subscription traffic uses tokview's native Codex adapter so HTTP and WebSocket Responses traffic are observable.
  • Claude Code subscription/OAuth traffic uses tokview's native Anthropic adapter.
  • Costs marked ~ are estimated equivalent API spend — subscription products don't bill per request like API-key calls.
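
One way to think about an "estimated equivalent API spend" is token counters multiplied by per-token list prices. The sketch below shows that arithmetic only. It is not tokview's implementation: the class names are hypothetical, the prices are made up, and it assumes the four counters are disjoint (providers differ on whether cached tokens are included in the input count).

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


@dataclass
class Pricing:
    """USD per one million tokens. Values used below are hypothetical."""
    input: float
    output: float
    cache_read: float
    cache_write: float


def equivalent_cost(usage: Usage, price: Pricing) -> float:
    """Sum each counter times its price, converting from per-million rates."""
    return (
        usage.input_tokens * price.input
        + usage.output_tokens * price.output
        + usage.cache_read_tokens * price.cache_read
        + usage.cache_write_tokens * price.cache_write
    ) / 1_000_000


usage = Usage(input_tokens=200_000, output_tokens=10_000, cache_read_tokens=1_000_000)
price = Pricing(input=3.0, output=15.0, cache_read=0.5, cache_write=4.0)
print(f"~${equivalent_cost(usage, price):.2f}")  # ~$1.25
```

In this made-up example, cache reads are five times the uncached input volume yet cost less, which illustrates why cache counters are tracked separately.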

Commands

tokview wrap codex [CODEX_ARGS...]     # run Codex through tokview
tokview wrap claude [CLAUDE_ARGS...]   # run Claude Code through tokview
tokview unwrap codex                   # undo a wrap

tokview show --watch                   # live terminal dashboard
tokview show --latest                  # the most recently active session
tokview show --session SESSION_ID      # one session in detail

tokview status                         # running? counts, errors, diagnostics
tokview logs [-f] [-n N]               # tail the server log
tokview export --since YYYY-MM-DD --format csv|json
tokview reset                          # wipe the local SQLite history
tokview version

tokview start / tokview stop exist for debugging, but the normal workflow is tokview wrap ... plus tokview show.
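
For scripting, the export command can be driven from Python. This sketch uses only the flags shown in the command list above. The helper names are hypothetical, and it assumes the export is written to standard output, which this README does not state.

```python
import shutil
import subprocess
from datetime import date


def export_command(since: date, fmt: str = "json") -> list[str]:
    """Build the argv for `tokview export` using the documented flags."""
    if fmt not in ("csv", "json"):
        raise ValueError(f"unsupported format: {fmt!r}")
    return ["tokview", "export", "--since", since.isoformat(), "--format", fmt]


def run_export(since: date, fmt: str = "json") -> str:
    """Run the export and return whatever it prints (assumed: stdout)."""
    if shutil.which("tokview") is None:
        raise RuntimeError("tokview is not on PATH")
    result = subprocess.run(
        export_command(since, fmt), capture_output=True, text=True, check=True
    )
    return result.stdout


print(export_command(date(2025, 1, 31), "csv"))
```

Building the argv as a list avoids shell quoting issues, and `date.isoformat()` produces the YYYY-MM-DD form the command expects.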

Privacy & data

By default tokview stores accounting metadata only — no prompt or response text:

  • timestamp, latency, status
  • provider, model, session id
  • input / output / cache / reasoning token counters
  • cost, or estimated equivalent API cost
  • tool names with estimated argument/result tokens

Provider API keys come from your environment; tokview forwards them and never persists them. Everything stays in ~/.tokview/db.sqlite on your machine.
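
The metadata list above can be pictured as a record like the following. This is an illustrative shape only: every class and field name is hypothetical and does not describe tokview's actual storage schema. The point is what is absent, namely any prompt or response text.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ToolEstimate:
    name: str
    argument_tokens: int  # estimated
    result_tokens: int    # estimated


@dataclass
class RequestRecord:
    # Hypothetical field names mirroring the bullet list above.
    timestamp: str
    latency_ms: float
    status: int
    provider: str
    model: str
    session_id: str
    input_tokens: int
    output_tokens: int
    cache_tokens: int
    reasoning_tokens: int
    cost_usd: float
    cost_is_estimate: bool
    tools: list[ToolEstimate] = field(default_factory=list)


record = RequestRecord(
    timestamp="2025-01-31T12:00:00Z", latency_ms=840.0, status=200,
    provider="anthropic", model="example-model", session_id="s-1",
    input_tokens=1200, output_tokens=300, cache_tokens=0, reasoning_tokens=0,
    cost_usd=0.0, cost_is_estimate=True,
    tools=[ToolEstimate(name="Read", argument_tokens=20, result_tokens=950)],
)
print(sorted(asdict(record)))
```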

Configuration

~/.tokview/config.yaml is created on first run and defaults to localhost-only:

proxy:      { port: 4000, bind: 127.0.0.1 }
dashboard:  { port: 3000, bind: 127.0.0.1 }
storage:    { path: ~/.tokview/db.sqlite }
retention:  { days: 90 }
capture:    { prompts: false, responses: false }
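
If you edit the bind addresses, a quick check that they are still loopback-only can catch accidental exposure. The sketch below works on a plain dict that mirrors the defaults above, so it needs no YAML parser. The helper name is hypothetical, and it assumes bind values are IP literals (a hostname such as "localhost" would raise ValueError).

```python
import ipaddress

# Mirrors the proxy/dashboard defaults shown above.
DEFAULTS = {
    "proxy": {"port": 4000, "bind": "127.0.0.1"},
    "dashboard": {"port": 3000, "bind": "127.0.0.1"},
}


def non_loopback_binds(config: dict) -> list[str]:
    """Return the names of sections whose bind address is not loopback."""
    exposed = []
    for section, settings in config.items():
        bind = settings.get("bind") if isinstance(settings, dict) else None
        if bind is None:
            continue  # section has no bind setting
        if not ipaddress.ip_address(bind).is_loopback:
            exposed.append(section)
    return exposed


print(non_loopback_binds(DEFAULTS))  # []
```

A bind of 0.0.0.0 would be reported, since it listens on every interface rather than only on the local machine.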

Security

  • Binds to 127.0.0.1 by default.
  • Stores everything locally; no account, cloud service, or telemetry.
  • Uses LiteLLM's installed pricing map instead of runtime pricing fetches.

Full threat model in SECURITY.md.

Status

v0.0.x — alpha. Strongest today for Codex, Claude Code, OpenAI-/Anthropic-/Gemini-compatible SDKs, and other LiteLLM-supported providers routed through the proxy.

Contributing

python -m venv .venv && . .venv/bin/activate
pip install -e ".[dev]"
ruff check src tests
pytest -q

See CONTRIBUTING.md.

License

MIT. Bundled open-source dependencies are credited in NOTICES.md.
