See where your LLM tokens actually go — down to the individual tool call. A small, local, zero-config proxy + dashboard that tracks exact token usage and cost across Claude / OpenAI / Gemini, and breaks input-token spend down per tool (Read, Bash, MCP, ...).
tokview
See where your tokens actually go — down to the individual tool call.
tokview is a small, local token viewer for LLM API calls. It runs a tiny proxy on your laptop; point your apps at it (one env var) and it shows the exact token usage and cost of every call to Claude, OpenAI, Gemini, or any provider you configure — and, uniquely, which tools ate your tokens.
Most tools tell you a call used 180k tokens. tokview tells you that 140k of them were a single Read result re-sent on every turn, and your mcp__github__search calls quietly added 40k more. Tracing platforms can show this if you instrument your code with their SDK; gateways track per-request cost but not per-tool tokens. tokview is the only one we know of that does per-tool token attribution as a drop-in local proxy — no SDK, no code changes, works with any app or CLI you can point at a URL (even Claude Code itself).
No accounts. No cloud. No Docker. One install.
Quick start
pipx install token-viewer # the command it installs is `tokview`
tokview start
You'll see:
+--------------------------------------------------------------------------+
| tokview v0.0.1 |
| |
| started in background (pid 12345) |
+--------------------------------------------------------------------------+
Logs: /Users/you/.tokview/tokview.log
Proxy: http://127.0.0.1:4000
Dashboard: http://127.0.0.1:3000
Point any app at the proxy:
export ANTHROPIC_BASE_URL=http://localhost:4000
export OPENAI_BASE_URL=http://localhost:4000/v1
export GOOGLE_BASE_URL=http://localhost:4000
Open the dashboard: http://localhost:3000.
Now make calls as usual (Anthropic SDK, OpenAI SDK, curl, Claude Code, whatever). They flow through the proxy. The dashboard fills in within milliseconds.
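If you launch tools from a script rather than a shell, the same three variables can be set for a single child process. A minimal sketch, assuming the default proxy port 4000 from the quick start; the helper name `proxied_env` is hypothetical and not part of tokview:

```python
import os

# Base URLs from the quick start; the port assumes the default config.
PROXY_BASE_URLS = {
    "ANTHROPIC_BASE_URL": "http://localhost:4000",
    "OPENAI_BASE_URL": "http://localhost:4000/v1",
    "GOOGLE_BASE_URL": "http://localhost:4000",
}


def proxied_env(base_env=None):
    """Return a copy of the environment with the provider base URLs
    pointed at the local proxy. The caller's environment is not modified."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(PROXY_BASE_URLS)
    return env
```

Passing `env=proxied_env()` to `subprocess.run(...)` routes just that command through the proxy and leaves the rest of your session untouched.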
Track Claude Code itself
ANTHROPIC_BASE_URL=http://localhost:4000 claude
Every Claude Code interaction lands in the dashboard.
Why it's different
Three things have to be true at once, and tokview is the only tool we know of where they all are:
- Tool-level token attribution. Not just "this call used 180k tokens", but which tool spent them. It catches the dominant hidden agent cost: a big tool result (a Read, an MCP dump) re-billed as input on every later turn.
- Drop-in proxy, no instrumentation. Tracing platforms surface tool detail only if you wrap your code in their SDK. tokview gets it from one env var, for any app or CLI you can point at a URL, even ones you can't modify, like Claude Code.
- Fully local. SQLite on your laptop. No account, no cloud, nothing leaves your machine.
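The re-billing effect in the first point is plain multiplication, which is why it grows so quickly. A small sketch with made-up numbers (the function name is hypothetical, and this counts tokens only, ignoring any cache discount):

```python
def resent_input_tokens(result_tokens: int, later_turns: int) -> int:
    """Input tokens billed for one tool result that stays in the
    conversation history and is re-sent on each of `later_turns` turns."""
    if result_tokens < 0 or later_turns < 0:
        raise ValueError("token and turn counts must be non-negative")
    return result_tokens * later_turns


# Hypothetical numbers: a 7,000-token Read result carried through 20 later turns.
total = resent_input_tokens(7_000, 20)
print(total)  # 140000
```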
What it shows
- $ spent today / this week / month-to-date, updating live via SSE — no refresh
- Per-provider, per-model, per-session, per-tag breakdowns
- Session waterfall — click any session to see every call in it on a timeline, with cost, tokens, latency and TTFT (a trace view for your agent loops)
- Per-tool tokens: for agent sessions, which tools were called (Read, Bash, mcp__…) and how many tokens each consumed (arguments + results). Token estimates only. This catches the big hidden cost: a large tool result re-sent as input on every later turn.
- Savings coach: deterministic, local tips, such as repeated prompts you could cache, caching savings already realized, and cheaper-model what-ifs. No model is called to produce these; it's arithmetic over your own data.
- Latency & TTFT — time-to-first-token, total latency, and tokens/sec per model (p50/p95), plus per-call in the live tail
- Cache-hit visibility (Anthropic prompt caching, OpenAI cached input tokens, Gemini context cache) and reasoning-token costs (o-series, Claude extended thinking)
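For readers unfamiliar with the latency figures above, here is one way p50/p95 and tokens/sec could be computed from per-call measurements. This is a sketch under assumptions: it uses the nearest-rank percentile method and measures throughput after the first token, and tokview's own definitions may differ.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def tokens_per_second(output_tokens, total_latency_s, ttft_s):
    """Generation throughput, measured from the first token onward."""
    generation_time = total_latency_s - ttft_s
    if generation_time <= 0:
        return 0.0
    return output_tokens / generation_time
```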
What it doesn't do (intentionally)
- No team / multi-user features. Single user, localhost only.
- No virtual API keys. Your real provider keys are read from env vars and forwarded straight to the provider.
- No alerting / Slack integration. Not yet.
- No data leaves your machine. Everything lives in ~/.tokview/db.sqlite.
- No prompt content stored by default. (Opt-in with redaction; see Privacy below.)
Want any of these? Open an issue. The architecture is designed to evolve into a Postgres + Docker + auth setup later — see the design spec for the "🅑 path".
How it works
Your apps ──► tokview ──► Provider APIs
│
├─ writes a row → SQLite
└─ pushes a spend event → SSE → Dashboard
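The two side effects in the diagram can be sketched in a few lines. Everything named here is an assumption for illustration: the `calls` table, its columns, the `spend` event name, and the sample values are hypothetical, not tokview's actual schema. Only the Server-Sent Events framing (field lines followed by a blank line) is standard.

```python
import json
import sqlite3


def record_call(conn, call):
    """Insert one call into a hypothetical `calls` table."""
    conn.execute(
        "INSERT INTO calls (provider, model, input_tokens, output_tokens, cost_usd) "
        "VALUES (:provider, :model, :input_tokens, :output_tokens, :cost_usd)",
        call,
    )
    conn.commit()


def sse_frame(event, payload):
    """Format a Server-Sent Events frame: an event name, one data line,
    and the blank line that terminates the frame."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


conn = sqlite3.connect(":memory:")  # in-memory here; tokview uses a file on disk
conn.execute(
    "CREATE TABLE calls (provider TEXT, model TEXT, "
    "input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)"
)
call = {"provider": "anthropic", "model": "example-model",
        "input_tokens": 1200, "output_tokens": 300, "cost_usd": 0.0081}
record_call(conn, call)
frame = sse_frame("spend", call)
```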
The proxy reads the exact token usage and cost from each provider's response object — Anthropic's cache_creation_input_tokens / cache_read_input_tokens, OpenAI's prompt_tokens_details.cached_tokens, Gemini's usageMetadata, the reasoning-tokens fields on o-series and Claude extended-thinking — and applies the right pricing tier for each. Cost is provider-truth, not a tokenizer estimate.
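To make the pricing-tier idea concrete, the sketch below normalizes an Anthropic-style and an OpenAI-style usage object into one record and prices each tier separately. The field names come from the providers' responses; the `Usage` class, the function names, and the rates are assumptions for illustration. The rates in particular are placeholders, not real prices for any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    uncached_input: int
    cache_write: int
    cache_read: int
    output: int


def from_anthropic(usage: dict) -> Usage:
    # Anthropic reports uncached, cache-write and cache-read input separately.
    return Usage(
        uncached_input=usage.get("input_tokens", 0),
        cache_write=usage.get("cache_creation_input_tokens") or 0,
        cache_read=usage.get("cache_read_input_tokens") or 0,
        output=usage.get("output_tokens", 0),
    )


def from_openai(usage: dict) -> Usage:
    # OpenAI's prompt_tokens includes cached tokens, so subtract them out.
    cached = (usage.get("prompt_tokens_details") or {}).get("cached_tokens") or 0
    return Usage(
        uncached_input=usage.get("prompt_tokens", 0) - cached,
        cache_write=0,
        cache_read=cached,
        output=usage.get("completion_tokens", 0),
    )


def cost_usd(u: Usage, rates: dict) -> float:
    """`rates` holds USD per million tokens for each tier."""
    per_token = {tier: rate / 1_000_000 for tier, rate in rates.items()}
    return (
        u.uncached_input * per_token["input"]
        + u.cache_write * per_token["cache_write"]
        + u.cache_read * per_token["cache_read"]
        + u.output * per_token["output"]
    )


# Placeholder rates for illustration only; not real prices for any model.
EXAMPLE_RATES = {"input": 3.0, "cache_write": 3.75, "cache_read": 0.30, "output": 15.0}
```

The subtraction in `from_openai` matters: treating `prompt_tokens` as entirely uncached would bill the cached portion at the full input rate.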
Your SDK doesn't know it's talking to a proxy. The response bytes are forwarded unchanged; the proxy tees the stream as it flies by so token capture never adds latency to your request.
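The tee pattern can be illustrated with a plain generator. This is a synchronous sketch of the idea, not tokview's implementation (which sits behind an ASGI server); `tee_stream` and `on_complete` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator


def tee_stream(chunks: Iterable[bytes],
               on_complete: Callable[[bytes], None]) -> Iterator[bytes]:
    """Forward every chunk unchanged and hand the full body to
    `on_complete` once the stream ends or the client disconnects."""
    captured = []
    try:
        for chunk in chunks:
            captured.append(chunk)  # cheap: keeps a reference, no parsing yet
            yield chunk             # the client receives the exact bytes
    finally:
        on_complete(b"".join(captured))  # parse usage after forwarding is done
```

Deferring all parsing to `on_complete` keeps the per-chunk work to a list append, so the forwarding path stays as close to a pass-through as a Python generator allows.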
Tool-level attribution comes from the same stream. An agent's tool calls flow through the proxy as structured blocks — tool_use/tool_result (Anthropic) or tool_calls/role:tool (OpenAI) — so tokview parses them out and tokenizes each tool's arguments and results locally. That gives you per-tool, per-session token estimates with no extra instrumentation. (It's an estimate, by design: the provider bills per call, not per block, and cache discounts make per-tool cost meaningless — so tokview reports tokens, not dollars, at the tool level. A proxy can see what a tool returned; it can't see the tool execute — that's client-side.)
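As a rough illustration of that parsing step, the sketch below walks an Anthropic-style message list, links each tool_result back to its tool_use by id, and sums an estimated token count per tool. The 4-characters-per-token heuristic is a crude stand-in for a real tokenizer, and the function names are hypothetical; tokview's actual tokenization is not shown here.

```python
import json
from collections import Counter


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return (len(text) + 3) // 4


def tokens_by_tool(messages: list) -> Counter:
    """Attribute estimated tokens to tools, counting both arguments
    (tool_use blocks) and results (tool_result blocks)."""
    names = {}          # tool_use id -> tool name
    totals = Counter()
    for message in messages:
        content = message.get("content")
        if not isinstance(content, list):
            continue    # plain-string content carries no tool blocks
        for block in content:
            if block.get("type") == "tool_use":
                names[block["id"]] = block["name"]
                args = json.dumps(block.get("input", {}))
                totals[block["name"]] += estimate_tokens(args)
            elif block.get("type") == "tool_result":
                name = names.get(block.get("tool_use_id"), "unknown")
                body = block.get("content", "")
                if not isinstance(body, str):
                    body = json.dumps(body)
                totals[name] += estimate_tokens(body)
    return totals
```

Running this over the message history of each request, rather than once per session, is what exposes the re-send cost: the same tool_result block is counted again every time it reappears in a later request.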
CLI
tokview start [-f]            start the proxy + dashboard (daemonizes; -f for foreground)
tokview stop                  graceful SIGTERM
tokview status                pid, uptime, request counts, errors, diagnostics
tokview logs [-f] [-n N]      tail the server log
tokview export --since DATE   csv/json dump of all calls since DATE
tokview reset                 wipe the SQLite database (with confirmation)
tokview version
tokview config-path
Configuration
~/.tokview/config.yaml is auto-generated on first start. Defaults are localhost-only on ports 3000 / 4000.
proxy: { port: 4000, bind: 127.0.0.1 }
dashboard: { port: 3000, bind: 127.0.0.1 }
storage: { path: ~/.tokview/db.sqlite }
retention: { days: 90 }
capture: { prompts: false, responses: false }
Provider API keys come from environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY). tokview forwards them to the provider and never persists them.
Privacy
Default: only token counts + cost + metadata. No prompt text. No response text.
If you want full request/response logging, enable it in the config — regex-based redaction runs before persistence, so the DB never holds raw secrets:
capture:
prompts: true
responses: true
redact_patterns:
- '(sk|pk)-[A-Za-z0-9]{20,}'
- '[\w.+-]+@[\w-]+\.[\w.-]+'
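To see what these two patterns do and do not catch, here they are applied with Python's `re` module. A minimal sketch: the `[REDACTED]` placeholder and the `redact` helper are assumptions, not tokview's actual output format.

```python
import re

# The two patterns from the config: API-key-like strings and email addresses.
REDACT_PATTERNS = [
    re.compile(r"(sk|pk)-[A-Za-z0-9]{20,}"),
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every match of every pattern with the placeholder."""
    for pattern in REDACT_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Note that the first pattern requires 20 or more alphanumeric characters directly after `sk-` or `pk-`, so a key whose body contains hyphens or underscores near the start would pass through unredacted. If your keys look like that, add a broader pattern to `redact_patterns`.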
Security stance
- All dependencies on the data path (proxy engine, web framework, ASGI server) are version-pinned. Patches arrive automatically on pipx upgrade; major-version jumps require a tokview release.
- Runtime fetching of model-pricing data is disabled: prices come from the pinned wheel, not a network fetch.
- Default bind is 127.0.0.1; non-loopback binds require explicit tokview start --allow-remote and the matching config setting.
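The loopback rule in the last point could be enforced with a check along these lines. This is a sketch of the idea using the standard `ipaddress` module; `check_bind` and its `allow_remote` parameter are hypothetical names, not tokview's code.

```python
import ipaddress


def check_bind(bind: str, allow_remote: bool = False) -> None:
    """Raise unless `bind` is a loopback address or remote binds
    were explicitly allowed."""
    if ipaddress.ip_address(bind).is_loopback:
        return
    if not allow_remote:
        raise PermissionError(
            f"refusing non-loopback bind {bind!r} without explicit opt-in"
        )
```

With this shape, `check_bind("127.0.0.1")` and `check_bind("::1")` pass silently, while `check_bind("0.0.0.0")` raises unless the caller opts in.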
Full threat model in SECURITY.md.
Status
v0.0.x — alpha. Single-user laptop tool. Works against Claude, OpenAI, Gemini, and 100+ other providers.
Roadmap lives in CHANGELOG.md. Near-term:
- Cost-map refresh with hash verification
- tokview test-providers: smoke-test each configured provider with a $0.001 token
- Optional Postgres backend for multi-user use
Contributing
PRs welcome. The loop is:
pip install -e ".[dev]"
ruff check src tests
pytest -q
See CONTRIBUTING.md.
License
MIT. © 2026 Tejas Chopra.
Bundled open-source dependencies are credited in NOTICES.md.
File details
Details for the file token_viewer-0.0.2.tar.gz.
File metadata
- Download URL: token_viewer-0.0.2.tar.gz
- Upload date:
- Size: 468.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 278815f7ba4b1f9f4a1bca0efa618c7b03f175a8685d618376da6ffece36626a |
| MD5 | d0af91837c1277349fa889d6bb05c98d |
| BLAKE2b-256 | 9c027459cb67591122d6b37cfdc7db46260fa96bfa2de77e38d8f42c71b724df |
File details
Details for the file token_viewer-0.0.2-py3-none-any.whl.
File metadata
- Download URL: token_viewer-0.0.2-py3-none-any.whl
- Upload date:
- Size: 431.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d128a758dae9ad2cae5c8cc8c3e39ff6a52ec019782b87b87c92465ae3d02d5c |
| MD5 | 28cb27c253dc4b66e84c0bcbcc15891c |
| BLAKE2b-256 | 01c7d1a2609da1b292cac7abab70910024b5a8c9bfb6a1b0afdbd5fb77cb851d |