Skip to main content

A small, local token viewer for LLM API calls. Runs a tiny proxy on your laptop; shows the exact cost of every Claude / OpenAI / Gemini call in a dashboard. That's it.

Project description

headroom-token-view

A small, local token viewer for LLM API calls.

It runs a tiny proxy on your laptop. Point your apps at it (one env var) and it shows the exact token usage and cost of every call you make to Claude, OpenAI, Gemini, and any other provider you configure, in a simple dashboard.

That's it. No accounts. No cloud. No Docker. One pipx install.

CI PyPI License: MIT Python

Quick start

pipx install headroom-token-view
htv start

You'll see:

+--------------------------------------------------------------------------+
| Headroom Token View v0.0.1                                               |
|                                                                          |
|   started in background (pid 12345)                                      |
+--------------------------------------------------------------------------+

Logs: /Users/you/.headroom-token-view/htv.log
Proxy: http://127.0.0.1:4000
Dashboard: http://127.0.0.1:3000

Point any app at the proxy:

export ANTHROPIC_BASE_URL=http://localhost:4000
export OPENAI_BASE_URL=http://localhost:4000/v1
export GOOGLE_BASE_URL=http://localhost:4000

Open the dashboard: http://localhost:3000.

Now make calls as usual (Anthropic SDK, OpenAI SDK, curl, Claude Code, whatever). They flow through the proxy. The dashboard fills in within milliseconds.

Track Claude Code itself

ANTHROPIC_BASE_URL=http://localhost:4000 claude

Every Claude Code interaction lands in the dashboard.

What it shows

  • $ spent today / this week / month-to-date
  • Per-provider, per-model, per-session, per-tag breakdowns
  • Cache hit visibility (Anthropic prompt caching, OpenAI cached input tokens, Gemini context cache)
  • Reasoning-token costs (o-series, Claude extended thinking)
  • A live tail of recent calls with status + latency
  • Real-time updates via SSE — no refresh needed

What it doesn't do (intentionally)

  • No team / multi-user features. Single user, localhost only.
  • No virtual API keys. Your real provider keys are read from env vars and forwarded straight to the provider.
  • No alerting / Slack integration. Not yet.
  • No data leaves your machine. Everything in ~/.headroom-token-view/db.sqlite.
  • No prompt content stored by default. (Opt-in with redaction; see Privacy below.)

Want any of these? Open an issue. The architecture is designed to evolve into a Postgres + Docker + auth setup later — see the design spec for the "🅑 path".

How it works

Your apps ──► headroom-token-view ──► Provider APIs
                       │
                       ├─ writes a row → SQLite
                       └─ pushes a spend event → SSE → Dashboard

The proxy reads the exact token usage and cost from each provider's response object — Anthropic's cache_creation_input_tokens / cache_read_input_tokens, OpenAI's prompt_tokens_details.cached_tokens, Gemini's usageMetadata, the reasoning-tokens fields on o-series and Claude extended-thinking — and applies the right pricing tier for each. Cost is provider-truth, not a tokenizer estimate.

Your SDK doesn't know it's talking to a proxy. The response bytes are forwarded unchanged; the proxy tees the stream as it flies by so token capture never adds latency to your request.

CLI

htv start [-f]            start the proxy + dashboard (daemonizes; -f for foreground)
htv stop                  graceful SIGTERM
htv status                pid, uptime, request counts, errors, diagnostics
htv logs [-f] [-n N]      tail the server log
htv export --since DATE   csv/json dump of all calls since DATE
htv reset                 wipe the SQLite database (with confirmation)
htv version
htv config-path

Configuration

~/.headroom-token-view/config.yaml is auto-generated on first start. Defaults are localhost-only on ports 3000 / 4000.

proxy:        { port: 4000, bind: 127.0.0.1 }
dashboard:    { port: 3000, bind: 127.0.0.1 }
storage:      { path: ~/.headroom-token-view/db.sqlite }
retention:    { days: 90 }
capture:      { prompts: false, responses: false }

Provider API keys come from environment variables (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY). HTV never reads or persists them.

Privacy

Default: only token counts + cost + metadata. No prompt text. No response text.

If you want full request/response logging, enable it in the config — regex-based redaction runs before persistence, so the DB never holds raw secrets:

capture:
  prompts: true
  responses: true
  redact_patterns:
    - '(sk|pk)-[A-Za-z0-9]{20,}'
    - '[\w.+-]+@[\w-]+\.[\w.-]+'

Security stance

  • All dependencies on the data path (proxy engine, web framework, ASGI server) are version-pinned. Patches arrive automatically on pipx upgrade; major-version jumps require an HTV release.
  • Runtime fetching of model-pricing data is disabled — prices come from the pinned wheel, not a network fetch.
  • Default bind is 127.0.0.1; non-loopback binds require explicit htv start --allow-remote and the matching config setting.

Full threat model in SECURITY.md.

Status

v0.0.x — alpha. Single-user laptop tool. Works against Claude, OpenAI, Gemini, and 100+ other providers.

Roadmap lives in CHANGELOG.md. Near-term:

  • Cost-map refresh with hash verification
  • htv test-providers — smoke each configured provider with a $0.001 token
  • Optional Postgres backend for multi-user use

Contributing

PRs welcome. The loop is:

pip install -e ".[dev]"
ruff check src tests
pytest -q

See CONTRIBUTING.md.

License

MIT. © 2026 Tejas Chopra.

Bundled open-source dependencies are credited in NOTICES.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

headroom_token_view-0.0.1.tar.gz (447.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

headroom_token_view-0.0.1-py3-none-any.whl (417.9 kB view details)

Uploaded Python 3

File details

Details for the file headroom_token_view-0.0.1.tar.gz.

File metadata

  • Download URL: headroom_token_view-0.0.1.tar.gz
  • Upload date:
  • Size: 447.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.3

File hashes

Hashes for headroom_token_view-0.0.1.tar.gz
Algorithm Hash digest
SHA256 51129108b0d9ed0ac300570e1207fc80f1c2933846222d3fc3e01ef78caa54ee
MD5 389a5db17b471cbef4a57fc760e9a9a3
BLAKE2b-256 6da73be597fba39d6f99ad0f3c1ec09668833615e53b0c8f6072aefcb7006900

See more details on using hashes here.

File details

Details for the file headroom_token_view-0.0.1-py3-none-any.whl.

File metadata

File hashes

Hashes for headroom_token_view-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 443268c63a26b8bd8471c12e5b7f03b3406addb0e0f9717a77a6c0776b67ec96
MD5 957142a7c3edf4baa8e5aa8d06ca035a
BLAKE2b-256 47945e625483f98f8561e9ca050c5e7b14c59e2b47fcf5607f42e9d5a9c425bf

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page