See where your LLM tokens actually go — down to the individual tool call. A small, local, zero-config proxy + dashboard that tracks exact token usage and cost across Claude / OpenAI / Gemini, and breaks input-token spend down per tool (Read, Bash, MCP, ...).
tokview
Wrap your coding agent and watch where every token goes — live, in your terminal, down to the individual tool call.
A Codex or Claude Code session burns through millions of tokens, and all you get back is a bill — or, on a subscription, nothing at all. tokview is a tiny local proxy that sits in front of your agent and shows you, as it runs, exactly where the tokens go: by session, by request, by model, and — uniquely — by tool call. No account, no cloud, no code changes.
Try it in 30 seconds
uv tool install token-viewer # PyPI name is token-viewer; the command is `tokview`
# or: pipx install token-viewer
tokview show --watch # live terminal dashboard (one terminal)
tokview wrap claude # run your agent through tokview (another terminal)
# or: tokview wrap codex
That's the whole workflow: wrap your agent, show the tokens. Agent flags pass straight through (tokview wrap codex --model gpt-5.5 --search), and multiple Codex/Claude sessions can run at once, each appearing separately in tokview show.
Where your tokens actually go
Most counters stop at "this call used 180k tokens." tokview tells you which tool spent them — and catches the cost nothing else surfaces: a big tool result (a file Read, an MCP dump, a grep over your repo) gets re-sent into every later turn, silently multiplying your input bill. Often the single largest line item in a session is one file your agent re-read a dozen times.
And it sees traffic normal token counters can't:
- Subscription agents. Codex and Claude Code OAuth / WebSocket traffic, not just API keys — with an estimated equivalent API spend per session, so you know what your subscription session would have cost on metered pricing.
- Streaming, tool calls, and provider-compatible SDKs, captured at the proxy with zero app instrumentation.
Tool-level numbers are token estimates, not dollars — providers bill per model call, and cache discounts make per-tool dollars misleading.
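To make the re-send effect concrete, here is a minimal arithmetic sketch. It is not tokview's own estimator. It assumes the tool result stays in the context verbatim on every later turn, and it ignores cache discounts and context compaction. The function name and the session numbers are hypothetical.

```python
def resent_input_tokens(result_tokens: int, later_turns: int) -> int:
    """Input tokens attributable to one tool result that stays in context.

    The result is sent once when it first enters the context and again on
    each later turn, so it is counted (1 + later_turns) times.
    """
    if result_tokens < 0 or later_turns < 0:
        raise ValueError("token and turn counts must be non-negative")
    return result_tokens * (1 + later_turns)


# Hypothetical session: an 8,000-token file Read followed by 12 more turns.
total = resent_input_tokens(8_000, 12)
print(total)  # 104000
```

Under these assumptions, one 8,000-token Read accounts for 104,000 input tokens over the session, which is how a single file can become the largest line item.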
Browser dashboard
The browser dashboard is still available when you want a wider visual scan, but the terminal TUI is the primary workflow.
Works with
| Client | Use | Notes |
|---|---|---|
| Codex subscription | tokview wrap codex | HTTP + WebSocket Responses traffic, including ChatGPT-auth Codex backend calls. |
| Claude Code subscription / OAuth | tokview wrap claude | Native Anthropic Messages forwarding for subscription/OAuth and API-key traffic. |
| OpenAI-compatible SDKs | OPENAI_BASE_URL=http://127.0.0.1:4000/v1 | API-key traffic through LiteLLM. |
| Anthropic-compatible SDKs | ANTHROPIC_BASE_URL=http://127.0.0.1:4000 | Native Anthropic-compatible proxying. |
| Gemini-compatible SDKs | GOOGLE_BASE_URL=http://127.0.0.1:4000 | Direct proxy mode. |
If a client can point at a provider-compatible base URL, tokview can usually observe it — no instrumentation required.
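As a sketch of the base-URL approach for SDK clients, the helper below builds an environment with the three variables from the table set to the default proxy address. The helper name `proxied_env` is hypothetical, and whether a given SDK actually reads these variables depends on that SDK and its version.

```python
import os

# Default proxy address, taken from the table above.
TOKVIEW_PROXY = "http://127.0.0.1:4000"


def proxied_env(base_env=None, proxy=TOKVIEW_PROXY):
    """Return a copy of the environment with provider base URLs pointed at the proxy."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_BASE_URL"] = f"{proxy}/v1"  # OpenAI-compatible clients
    env["ANTHROPIC_BASE_URL"] = proxy       # Anthropic-compatible clients
    env["GOOGLE_BASE_URL"] = proxy          # Gemini-compatible clients
    return env


env = proxied_env({"PATH": "/usr/bin"})
print(env["OPENAI_BASE_URL"])  # http://127.0.0.1:4000/v1
```

The returned dict can be passed as `env=` to `subprocess.run` so that only the child process is routed through the proxy and the parent shell is left unchanged.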
What you get
- Live spend by session, request, provider, and model.
- Input, output, cache-read, cache-write, and reasoning token counters when reported.
- Estimated equivalent API spend for subscription traffic.
- Tool argument/result token estimates, including Codex shell command families like read, grep, find, pytest, and npm.
- A local SQLite history at ~/.tokview/db.sqlite.
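Because the history is a plain SQLite file, it can be inspected with standard tooling. The table layout is not documented here, so the sketch below assumes nothing about the schema: it opens a database read-only and lists its tables. The function name `list_tables` is hypothetical.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the user table names in a SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse writes, so inspection cannot alter history.
    uri = Path(db_path).expanduser().resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling `list_tables("~/.tokview/db.sqlite")` after at least one recorded session should show whatever tables the installed version creates. For stable output, tokview export is the supported route.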
vs. other token counters
| Approach | Good for | What tokview adds |
|---|---|---|
| Provider dashboards | Billing totals | Live, local session / request / tool views. |
| SDK observability (Langfuse, etc.) | Instrumented apps | Wrapping a CLI you can't modify; localhost-only capture. |
| Claude/Codex log readers | Post-hoc summaries | Live proxy traffic + SDK coverage as it happens. |
| Tokenizers | Prompt-size guesses | Real provider usage, cache counters, streaming data. |
How it works
Codex / Claude / SDKs ──► tokview local proxy ──► provider backend
│
├─► SQLite ~/.tokview/db.sqlite
├─► tokview show --watch (terminal)
└─► optional browser dashboard
- API-key traffic routes through LiteLLM where that's the right layer.
- Codex subscription traffic uses tokview's native Codex adapter so HTTP and WebSocket Responses traffic are observable.
- Claude Code subscription/OAuth traffic uses tokview's native Anthropic adapter.
- Costs marked ~ are estimated equivalent API spend; subscription products don't bill per request like API-key calls.
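The idea behind an estimated equivalent API spend can be sketched as token counters multiplied by per-token prices. This is an illustration only, not tokview's pricing code. The prices below are made up, the `Usage` fields are hypothetical names, and the four counters are assumed to be disjoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


# Hypothetical prices in USD per million tokens; not any real model's rates.
EXAMPLE_PRICES = {
    "input": 2.00,
    "output": 8.00,
    "cache_read": 0.50,
    "cache_write": 2.50,
}


def equivalent_api_cost(usage: Usage, prices: dict) -> float:
    """Price each counter at its per-million rate and sum the results."""
    total = (
        usage.input_tokens * prices["input"]
        + usage.output_tokens * prices["output"]
        + usage.cache_read_tokens * prices["cache_read"]
        + usage.cache_write_tokens * prices["cache_write"]
    )
    return total / 1_000_000


session = Usage(input_tokens=100_000, output_tokens=10_000, cache_read_tokens=400_000)
print(f"~${equivalent_api_cost(session, EXAMPLE_PRICES):.2f}")  # ~$0.48
```

With these illustrative rates, cache reads are four times as numerous as fresh input tokens yet cost the same in total, which is one reason per-tool dollar figures can mislead.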
Commands
tokview wrap codex [CODEX_ARGS...] # run Codex through tokview
tokview wrap claude [CLAUDE_ARGS...] # run Claude Code through tokview
tokview unwrap codex # undo a wrap
tokview show --watch # live terminal dashboard
tokview show --latest # the most recently active session
tokview show --session SESSION_ID # one session in detail
tokview status # running? counts, errors, diagnostics
tokview logs [-f] [-n N] # tail the server log
tokview export --since YYYY-MM-DD --format csv|json
tokview reset # wipe the local SQLite history
tokview version
tokview start / tokview stop exist for debugging, but the normal workflow is tokview wrap ... plus tokview show.
Privacy & data
By default tokview stores accounting metadata only — no prompt or response text:
- timestamp, latency, status
- provider, model, session id
- input / output / cache / reasoning token counters
- cost, or estimated equivalent API cost
- tool names with estimated argument/result tokens
Provider API keys come from your environment; tokview forwards them and never persists them. Everything stays in ~/.tokview/db.sqlite on your machine.
Configuration
~/.tokview/config.yaml is created on first run and defaults to localhost-only:
proxy: { port: 4000, bind: 127.0.0.1 }
dashboard: { port: 3000, bind: 127.0.0.1 }
storage: { path: ~/.tokview/db.sqlite }
retention: { days: 90 }
capture: { prompts: false, responses: false }
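If you edit the bind addresses, a quick check that both listeners are still loopback-only can be sketched as follows. The config is represented as an already-parsed dict that mirrors the keys above. Loading the YAML file is left out, and the function name is hypothetical.

```python
import ipaddress


def non_loopback_binds(config):
    """Return the config sections whose bind address is not a loopback address."""
    exposed = []
    for section in ("proxy", "dashboard"):
        bind = config.get(section, {}).get("bind")
        if bind is None:
            continue  # nothing configured for this section
        try:
            is_loopback = ipaddress.ip_address(bind).is_loopback
        except ValueError:
            # Not an IP literal; accept only the conventional hostname.
            is_loopback = bind == "localhost"
        if not is_loopback:
            exposed.append(section)
    return exposed


config = {
    "proxy": {"port": 4000, "bind": "127.0.0.1"},
    "dashboard": {"port": 3000, "bind": "127.0.0.1"},
}
print(non_loopback_binds(config))  # []
```

An empty list means both services listen on loopback only. A value such as 0.0.0.0 would be reported, because it exposes the listener on every network interface.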
Security
- Binds to 127.0.0.1 by default.
- Stores everything locally; no account, cloud service, or telemetry.
- Uses LiteLLM's installed pricing map instead of runtime pricing fetches.
Full threat model in SECURITY.md.
Status
v0.0.x — alpha. Strongest today for Codex, Claude Code, OpenAI-/Anthropic-/Gemini-compatible SDKs, and other LiteLLM-supported providers routed through the proxy.
Contributing
python -m venv .venv && . .venv/bin/activate
pip install -e ".[dev]"
ruff check src tests
pytest -q
See CONTRIBUTING.md.
License
MIT. Bundled open-source dependencies are credited in NOTICES.md.