Local LLM API proxy — captures usage, attributes cost to projects.
Project description
Halton Meter
Local LLM API proxy. Observes outbound traffic, attributes every request to a project, computes exact cost, surfaces it via terminal reports and a loopback HTTP API.
Designed for solo developers, agencies, and in-house dev teams who use Claude Code, the Anthropic SDK, OpenAI, Gemini, xAI, or other LLM clients heavily and want transparency over what's being spent and on what.
Platform support
macOS is the primary supported platform. Install + lifecycle + capture + reporting are exercised end-to-end on the maintainer's daily-driver before every release.
Linux is in public beta as of v0.1.24. Verified on Ubuntu 24.04 LTS against (a) a Docker + systemd-container soak run on every commit and (b) an AWS Lightsail instance with real systemd-logind, real loginctl enable-linger, real reboot survival, real Claude Code sessions captured end-to-end (Anthropic + Gemini interception confirmed). Other distros (Debian, Fedora, Arch) should work — same systemd --user mechanism — but are not yet covered by the release gate. Please open an issue with [linux-beta] in the subject if you hit anything; we want to graduate Linux to "primary supported" with broader distro coverage in v0.2.
Linux runs apps-mode only (env-var-only HTTPS_PROXY — no system-level proxy via networksetup-equivalent or transparent-proxy via iptables / nftables; that scope is on the post-v1.0 roadmap). The daemon + edge ship as systemd --user units under ~/.config/systemd/user/; the watchdog unit is macOS-only (the launchd-based networksetup poller has no Linux counterpart). See LINUX.md for the full Linux setup guide.
Windows apps-mode is available as of v0.2.11 (beta). uvx halton-meter works on Windows 10 / 11. No admin rights required for apps-mode. The daemon and edge run as Task Scheduler user-level tasks; the mitmproxy CA cert is installed into the user cert store via certutil -user -addstore Root; the system proxy is written to HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings with a WM_SETTINGCHANGE broadcast so Electron apps (Claude Code, VS Code) pick it up without restart. Full-mode (machine-wide proxy, NSSM service, MDM cert deployment) is on the post-v1.0 roadmap. See INSTALL.md for Windows setup instructions.
Quick install
Prerequisites
| Requirement | Notes |
|---|---|
uv |
Install with curl -LsSf https://astral.sh/uv/install.sh | sh (macOS / Linux) or powershell -c "irm https://astral.sh/uv/install.ps1 | iex" (Windows). uv ships its own Python interpreter, so you do not need to install Python yourself. |
| macOS or Linux | macOS (any recent version) — primary support. Linux with systemd --user and loginctl enable-linger — public beta. Windows: distribution layer works, OS-level proxy install is future work. |
| Admin password (one-time) | Granted during init to install the mitmproxy CA cert into the macOS System keychain. |
Everything else (mitmproxy, FastAPI, SQLAlchemy, structlog, Rich, Click, certifi) is bundled in the wheel. No Docker, no Postgres, no separate backend.
Install
Pick one — they all install from the same PyPI wheel:
# Option A — uvx (canonical; zero Python prereq, no PATH change).
uvx halton-meter --version
# Option B — uv tool install (persistent; puts `halton-meter` on PATH).
uv tool install halton-meter
halton-meter --version
# Option C — pipx, if you already use it.
pipx install halton-meter
halton-meter --version
Then run the setup flow (substitute uvx halton-meter for halton-meter if you chose Option A):
# one-time setup: cert trust + launchd supervisors + per-shell env block
halton-meter init --apps
# start metering
halton-meter start
# verify health
halton-meter status
After this, every terminal you open and every Spotlight-launched app routes its LLM API traffic through the meter. Existing terminals get the env on next shell open.
To stop metering: halton-meter stop. To remove entirely: halton-meter uninstall.
Graceful uninstall.
uninstallstops metering and removes the launchd supervisor plists, but the edge process keeps running in pure-passthrough mode until next reboot. This is intentional — apps withHTTPS_PROXY=http://127.0.0.1:8081baked into their environment (running terminals, IDEs, browsers) don't lose connectivity mid-flow. On reboot the edge dies with your session and isn't respawned. To kill it immediately:pkill -f "halton-meter edge". To also wipe~/.halton-meter/db.sqliteand logs, add--purge.
What it does
Captures and costs every outbound LLM API call across these providers:
- Anthropic —
api.anthropic.com(/v1/messages, streaming + non-streaming, extended thinking, cache read/write) - OpenAI —
api.openai.com(/v1/chat/completions,/v1/responses,/v1/embeddings; o-series reasoning tokens; prompt cache reads) - Gemini —
generativelanguage.googleapis.com(/v1beta/models/<model>:generateContent,:streamGenerateContent,:embedContent,:batchEmbedContents; Gemini 2.5 thinking tokens; >200k tiered pricing; cached-content reads) - xAI —
api.x.ai(/v1/chat/completionsOpenAI-compat shape;/v1/messagesAnthropic-compat shape)
Project attribution via HALTON_PROJECT env var, a .haltonrc file in the working directory, or the working directory name. macOS GUI apps get attributed via launchctl bundle-ID humanisation (mac:com.docker.docker → Docker).
Storage — local SQLite (~/.halton-meter/db.sqlite) with WAL. Token counts, latency, cost in integer millicents (no float drift), and full request/response bodies with credential redaction. Body capture default ON; opt out per project with halton-meter project <slug> set body-capture off. Background retention sweep keeps disk bounded (default 90 days).
Loopback HTTP API at 127.0.0.1:8765 for the cloud dashboard or any reader. JSON in, JSON out. Audit trail of every request, policy event, and reconciliation run.
Cost report — single-command client-billing surface:
halton-meter report # default window
halton-meter report --since 7d # last 7 days
halton-meter report --from 2026-04-01 --to 2026-05-01
halton-meter report --vs-previous # Δ% vs prior period per row
halton-meter report --by project # filter to one breakdown
halton-meter report --csv > spend.csv # export
halton-meter report --json | jq # pipe to dashboard
Layout adapts to terminal width (60 / 80 / 100 / 120 cols). Includes share-of-spend bars, p95 latency, and per-project 7-day trend sparklines.
Failure mode — if the daemon is stopped, upgraded, or crashes, the always-on edge process transparently tunnels traffic direct to upstream. Your apps never see ConnectionRefused. Logging is observation infrastructure, never a single point of failure.
Pricing rates
Halton Meter ships with a bundled per-model pricing matrix sourced from each provider's public price page. Every captured request is costed against this matrix at write time and the rate-source is stamped on the row (requests.rates_source = "bundled-YYYY-MM-DD" or "override"), so every figure in a report or CSV export is traceable to a specific snapshot.
Three layers of freshness defence:
- Visible.
halton-meter reportprintsRates: bundled YYYY-MM-DD · N override(s)under the masthead. - Overridable.
halton-meter pricing set / list / resetrecords operator-set rates. Future captures use them immediately; historicalhalton-meter recompute-costsconsults overrides first. - Probed.
halton-meter doctordoes a 2s loopback fetch ofrates-manifest.jsonand warns if the bundled date trails the latest published bundle. Fail-open: any network/parse error leaves the check silent. Disable withHALTON_METER_NO_RATES_PROBE=1.
Cost math itself never touches the network. The freshness probe is an opt-out signal — it never affects pricing, only the doctor row.
Three install modes
| Mode | Invoke | What's metered | Risk |
|---|---|---|---|
| env-var-only (default) | halton-meter init |
Whatever you explicitly route via halton-meter run <cmd> (Claude Code, the Anthropic SDK, python my_script.py, an interactive subshell with --shell, …). |
None — never touches the system proxy. |
| apps | halton-meter init --apps |
Spotlight/Dock-spawned apps that respect HTTPS_PROXY env (Claude Desktop, JetBrains IDEs, Cursor, Windsurf), and new shells launched after the install. Not browsers. |
None — system proxy still untouched. |
| full | halton-meter init --full |
All of apps plus browsers and GUI apps that ignore HTTPS_PROXY env (every other client) via the macOS system proxy. |
May break HSTS browser sites if the CA isn't trusted by SecTrust. The daemon refuses to enable the proxy when cert verification fails. |
Switching modes is one command — the install path reconciles all mode-specific state in either direction. No uninstall step required.
Decision tree
- "I just want to meter my Claude Code / SDK calls." →
init(env-only default) - "I want every app I open from Spotlight metered, but I don't want my browser to break." →
init --apps - "I want every HTTPS request my Mac makes, including browser traffic." →
init --full
Branded CLI surfaces
Every user-facing command renders with the brand-orange Halton Meter mark and editorial sections:
halton-meter status— aggregate verdict (HEALTHY / DEGRADED / BROKEN), Component Health table with pill state indicators, ground-truth port resolution, sentinel state.halton-meter doctor— row-by-row diagnostic with pill severity and copy-pasteable next-actions.halton-meter report— masthead with date range, summary KV block, by-project + by-model tables with share-of-spend bars and 7-day sparklines.halton-meter init / uninstall— sectioned step rows (supervisor (launchd)/certificate trust/shell environment), idempotency-aware.halton-meter start / stop— masthead + step panel with PIDs and/healthlatency.
Layouts adapt to terminal width. The aggregate verdict is always honest — never says HEALTHY when truth disagrees.
Useful commands
halton-meter init [--apps|--full] [--no-shell-rc] # install (re-runnable; idempotent)
halton-meter start | stop # supervisor control
halton-meter status [--json] # mode-aware HEALTHY / DEGRADED / BROKEN
halton-meter doctor [--json] [--curl] # detailed diagnostics
halton-meter report [--since 7d | --from … --to …] # cost report (see above)
halton-meter run <cmd> [args...] # exec wrapper that injects metering env
halton-meter run --shell # interactive subshell with metering env
halton-meter project <slug> set body-capture off # opt out of body capture per project
halton-meter bodies show <request_id> # inspect a captured body
halton-meter bodies stats # body-capture disk usage
halton-meter uninstall [--purge] # remove (graceful — see above)
halton-meter reset-proxy # emergency: disable system proxy
halton-meter version | --version # version
halton-meter doctor is the diagnostic of last resort: it prints every signal that matters (daemon health, cert trust, system proxy state, launchctl env, shell rc marker, env in current process) and ends with a top-line verdict and concrete next-actions. --curl adds an end-to-end TLS smoke test against https://example.com (hard-coded — never a provider domain).
Architecture (edge ↔ daemon split)
┌────────────────┐ ┌─────────────────────────┐
│ apps │ │ halton-meter-edge │
│ (Claude Code, │ ─CONNECT──► always-on TCP proxy │
│ SDKs, ...) │ │ listens on :8081 │
│ HTTPS_PROXY=...│ └────────┬─────────┬──────┘
└────────────────┘ │ │
healthy? not healthy
│ │
▼ ▼
┌────────────────┐ ┌─────────────────┐
│ halton-meter │ │ upstream API │
│ daemon │ │ (direct tunnel,│
│ │ │ no metering) │
│ mitmproxy + │ └─────────────────┘
│ FastAPI + │
│ SQLite │
└────────────────┘
Three macOS LaunchAgents:
com.haltonlabs.meter.edge— always-on TCP CONNECT proxy on:8081. Apps see this inHTTPS_PROXY. When the daemon is healthy, chains CONNECTs to it for full metering. When the daemon is down, tunnels direct.com.haltonlabs.meter— daemon (mitmproxy + FastAPI + SQLite) on:8090internal +:8765API.com.haltonlabs.meter.watchdog— Layer-2 watchdog. Polls/healthevery 1.5s; disables the system proxy after ~7.5s of unresponsiveness so traffic flows direct.
halton-meter stop no longer breaks running apps. They keep working in passthrough until you halton-meter start again. halton-meter uninstall drops the supervisor plists; the edge keeps running in passthrough until reboot or manual pkill.
What it doesn't do
- Doesn't wrap SDKs — your code stays exactly as it is
- Doesn't intercept anything you don't ask it to — only configured LLM endpoints
- Doesn't send your data anywhere by default — runs entirely locally
- Doesn't break your workflow — if the proxy fails, traffic falls through to the real provider
Known limitations
Honest list, not a roadmap. Apps that ignore both the macOS system proxy AND HTTPS_PROXY env have no general capture path:
- Go binaries with
GODEBUG=netdns=go. Go's pure-Go resolver path bypassesHTTPS_PROXYenv in some builds. Calls reach the provider unmetered. - libcurl callers using
--insecureor withCURLOPT_PROXYnot honoured.--insecurecallers skip TLS verification entirely. - WSL2 networking on Windows hosts. Currently doesn't route WSL2 guest traffic through the Windows-side proxy. Linux-native and macOS are the supported platforms.
- Hardcoded HTTP stacks that open raw TLS sockets to known provider IPs. Anything that bypasses both system proxy AND
HTTPS_PROXYenv will reach the provider unmetered. - Late-coming network interfaces. macOS network services are enumerated once when the daemon starts. If you plug in a new interface (Thunderbolt dock, USB-tethered iPhone), restart with
halton-meter stop && halton-meter start.
Project
Halton Meter is source-available under the Polyform Perimeter License 1.0.0. Built by Halton Labs.
- Website: https://haltonmeter.com
- Contact / support / bug reports: operator@haltonlabs.com
- Security disclosures: operator@haltonlabs.com (subject prefix
[security]) — please do not post to public channels - Source: available on request — email operator@haltonlabs.com with your use case
This package is the daemon component — the local proxy that intercepts, attributes, and costs LLM API traffic. The cloud dashboard and Phase 3 multi-tenant SaaS are developed separately.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file halton_meter-0.3.8.tar.gz.
File metadata
- Download URL: halton_meter-0.3.8.tar.gz
- Upload date:
- Size: 1.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9b68730d2a4cbb21a83265b75baffbc84c73df14d437fd627641f3bd1ab8196c
|
|
| MD5 |
8703c95800dde31bdd14cd6b32a2e393
|
|
| BLAKE2b-256 |
0a698ce4557af82195aa499b7eeb207395bc8cf0b01af3c44e23ce31057a5df9
|
Provenance
The following attestation bundles were made for halton_meter-0.3.8.tar.gz:
Publisher:
publish.yml on haltonlabs/halton-meter
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
halton_meter-0.3.8.tar.gz -
Subject digest:
9b68730d2a4cbb21a83265b75baffbc84c73df14d437fd627641f3bd1ab8196c - Sigstore transparency entry: 1732033220
- Sigstore integration time:
-
Permalink:
haltonlabs/halton-meter@3ea77908620787efade0ac816e4d5a7cc14f706c -
Branch / Tag:
refs/tags/v0.3.8 - Owner: https://github.com/haltonlabs
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@3ea77908620787efade0ac816e4d5a7cc14f706c -
Trigger Event:
push
-
Statement type:
File details
Details for the file halton_meter-0.3.8-py3-none-any.whl.
File metadata
- Download URL: halton_meter-0.3.8-py3-none-any.whl
- Upload date:
- Size: 665.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9f7e4ca4cf949de253c53f9ef23f6c5991764c282ea278a313a7dc848dbbde99
|
|
| MD5 |
bafdc64979ef3df6d8a4f5a263d5fb47
|
|
| BLAKE2b-256 |
0a55d95a25fff7c97139f9d0ef78a0cf86e66f3ea7332ec069ee9257cf361e11
|
Provenance
The following attestation bundles were made for halton_meter-0.3.8-py3-none-any.whl:
Publisher:
publish.yml on haltonlabs/halton-meter
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
halton_meter-0.3.8-py3-none-any.whl -
Subject digest:
9f7e4ca4cf949de253c53f9ef23f6c5991764c282ea278a313a7dc848dbbde99 - Sigstore transparency entry: 1732033374
- Sigstore integration time:
-
Permalink:
haltonlabs/halton-meter@3ea77908620787efade0ac816e4d5a7cc14f706c -
Branch / Tag:
refs/tags/v0.3.8 - Owner: https://github.com/haltonlabs
-
Access:
private
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@3ea77908620787efade0ac816e4d5a7cc14f706c -
Trigger Event:
push
-
Statement type: