Local LLM API proxy — captures usage, attributes cost to projects.

Halton Meter

Local LLM API proxy. Observes outbound traffic, attributes every request to a project, computes exact cost, surfaces it via terminal reports and a loopback HTTP API.

Designed for solo developers, agencies, and in-house dev teams who use Claude Code, the Anthropic SDK, OpenAI, Gemini, xAI, or other LLM clients heavily and want transparency over what's being spent and on what.

License: Polyform Perimeter 1.0.1


Platform support

macOS is the primary supported platform. Install, lifecycle, capture, and reporting are exercised end to end on the maintainer's daily-driver machine before every release.

Linux is in public beta as of v0.1.24. Verified on Ubuntu 24.04 LTS against (a) a Docker + systemd-container soak run on every commit and (b) an AWS Lightsail instance with real systemd-logind, real loginctl enable-linger, real reboot survival, real Claude Code sessions captured end-to-end (Anthropic + Gemini interception confirmed). Other distros (Debian, Fedora, Arch) should work — same systemd --user mechanism — but are not yet covered by the release gate. Please open an issue with [linux-beta] in the subject if you hit anything; we want to graduate Linux to "primary supported" with broader distro coverage in v0.2.

Linux runs apps-mode only (env-var-only HTTPS_PROXY — no system-level proxy via networksetup-equivalent or transparent-proxy via iptables / nftables; that scope is on the post-v1.0 roadmap). The daemon + edge ship as systemd --user units under ~/.config/systemd/user/; the watchdog unit is macOS-only (the launchd-based networksetup poller has no Linux counterpart). See LINUX.md for the full Linux setup guide.

Windows apps-mode is available as of v0.2.11 (beta). uvx halton-meter works on Windows 10 / 11. No admin rights required for apps-mode. The daemon and edge run as Task Scheduler user-level tasks; the mitmproxy CA cert is installed into the user cert store via certutil -user -addstore Root; the system proxy is written to HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings with a WM_SETTINGCHANGE broadcast so Electron apps (Claude Code, VS Code) pick it up without restart. Full-mode (machine-wide proxy, NSSM service, MDM cert deployment) is on the post-v1.0 roadmap. See INSTALL.md for Windows setup instructions.


Quick install

Prerequisites

| Requirement | Notes |
| --- | --- |
| uv | Install with curl -LsSf https://astral.sh/uv/install.sh \| sh (macOS / Linux) or powershell -c "irm https://astral.sh/uv/install.ps1 \| iex" (Windows). uv ships its own Python interpreter, so you do not need to install Python yourself. |
| Operating system | macOS (any recent version): primary support. Linux with systemd --user and loginctl enable-linger: public beta. Windows 10 / 11: apps-mode beta; full-mode (machine-wide proxy) is future work. See Platform support above. |
| Admin password (one-time) | Granted during init to install the mitmproxy CA cert into the macOS System keychain. |

Everything else (mitmproxy, FastAPI, SQLAlchemy, structlog, Rich, Click, certifi) is bundled in the wheel. No Docker, no Postgres, no separate backend.

Install

Pick one — they all install from the same PyPI wheel:

# Option A — uvx (canonical; zero Python prereq, no PATH change).
uvx halton-meter --version

# Option B — uv tool install (persistent; puts `halton-meter` on PATH).
uv tool install halton-meter
halton-meter --version

# Option C — pipx, if you already use it.
pipx install halton-meter
halton-meter --version

Then run the setup flow (substitute uvx halton-meter for halton-meter if you chose Option A):

# one-time setup: cert trust + launchd supervisors + per-shell env block
halton-meter init --apps

# start metering
halton-meter start

# verify health
halton-meter status

After this, every terminal you open and every Spotlight-launched app routes its LLM API traffic through the meter. Existing terminals get the env on next shell open.

To stop metering: halton-meter stop. To remove entirely: halton-meter uninstall.

Graceful uninstall. uninstall stops metering and removes the launchd supervisor plists, but the edge process keeps running in pure-passthrough mode until next reboot. This is intentional — apps with HTTPS_PROXY=http://127.0.0.1:8081 baked into their environment (running terminals, IDEs, browsers) don't lose connectivity mid-flow. On reboot the edge dies with your session and isn't respawned. To kill it immediately: pkill -f "halton-meter edge". To also wipe ~/.halton-meter/db.sqlite and logs, add --purge.


What it does

Captures and costs every outbound LLM API call across these providers:

  • Anthropic: api.anthropic.com (/v1/messages, streaming + non-streaming, extended thinking, cache read/write)
  • OpenAI: api.openai.com (/v1/chat/completions, /v1/responses, /v1/embeddings; o-series reasoning tokens; prompt cache reads)
  • Gemini: generativelanguage.googleapis.com (/v1beta/models/<model>:generateContent, :streamGenerateContent, :embedContent, :batchEmbedContents; Gemini 2.5 thinking tokens; >200k tiered pricing; cached-content reads)
  • xAI: api.x.ai (/v1/chat/completions OpenAI-compat shape; /v1/messages Anthropic-compat shape)

Project attribution via the HALTON_PROJECT env var, a .haltonrc file in the working directory, or the working directory name. macOS GUI apps get attributed via launchctl bundle-ID humanisation (mac:com.docker.docker → Docker).
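The attribution fallback can be pictured as a small resolver. This is a minimal sketch, not Halton Meter's actual code: it assumes the three sources are tried in the order listed above, and it assumes a .haltonrc holds nothing but the project slug as plain text (the real file format is not described here). The function name resolve_project is hypothetical.

```python
import os
from pathlib import Path


def resolve_project(cwd: Path, env: dict[str, str]) -> str:
    """Return a project slug, trying the three sources in listed order."""
    # 1. An explicit HALTON_PROJECT environment variable wins.
    explicit = env.get("HALTON_PROJECT", "").strip()
    if explicit:
        return explicit

    # 2. Otherwise look for a .haltonrc in the working directory.
    #    ASSUMPTION: the file contains only the slug as plain text.
    rc_file = cwd / ".haltonrc"
    if rc_file.is_file():
        slug = rc_file.read_text(encoding="utf-8").strip()
        if slug:
            return slug

    # 3. Fall back to the name of the working directory itself.
    return cwd.resolve().name


if __name__ == "__main__":
    print(resolve_project(Path.cwd(), dict(os.environ)))
```

Passing the environment in as a plain dict keeps the function easy to test without touching the real process environment.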

Storage — local SQLite (~/.halton-meter/db.sqlite) with WAL. Token counts, latency, cost in integer millicents (no float drift), and full request/response bodies with credential redaction. Body capture default ON; opt out per project with halton-meter project <slug> set body-capture off. Background retention sweep keeps disk bounded (default 90 days).
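To illustrate why integer millicents avoid float drift, here is a minimal sketch of per-request cost arithmetic. The rates and the function names are hypothetical, and rounding half up to the nearest millicent is an assumption; the source only states that costs are stored as integer millicents.

```python
MILLICENTS_PER_USD = 100_000  # 100 cents x 1000 millicents per cent
TOKENS_PER_MTOK = 1_000_000   # rates are quoted per million tokens


def cost_millicents(tokens: int, rate_millicents_per_mtok: int) -> int:
    """Integer cost of `tokens`, rounded half up to the nearest millicent."""
    return (tokens * rate_millicents_per_mtok + TOKENS_PER_MTOK // 2) // TOKENS_PER_MTOK


def format_usd(millicents: int) -> str:
    """Render a non-negative millicent amount as dollars without using float."""
    dollars, remainder = divmod(millicents, MILLICENTS_PER_USD)
    return f"${dollars}.{remainder:05d}"


# Hypothetical rates: $3.00 and $15.00 per million tokens.
INPUT_RATE = 300_000
OUTPUT_RATE = 1_500_000

total = cost_millicents(1234, INPUT_RATE) + cost_millicents(567, OUTPUT_RATE)
print(total, format_usd(total))
```

With these made-up rates the two parts come to 370 and 851 millicents, so the script prints 1221 and $0.01221. Because every intermediate value is an int, summing millions of rows gives the same total regardless of order.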

Loopback HTTP API at 127.0.0.1:8765 for the cloud dashboard or any reader. JSON in, JSON out. Audit trail of every request, policy event, and reconciliation run.

Cost report — single-command client-billing surface:

halton-meter report                           # default window
halton-meter report --since 7d                # last 7 days
halton-meter report --from 2026-04-01 --to 2026-05-01
halton-meter report --vs-previous             # Δ% vs prior period per row
halton-meter report --by project              # filter to one breakdown
halton-meter report --csv > spend.csv         # export
halton-meter report --json | jq               # pipe to dashboard

Layout adapts to terminal width (60 / 80 / 100 / 120 cols). Includes share-of-spend bars, p95 latency, and per-project 7-day trend sparklines.

Failure mode — if the daemon is stopped, upgraded, or crashes, the always-on edge process transparently tunnels traffic direct to upstream. Your apps never see ConnectionRefused. Logging is observation infrastructure, never a single point of failure.
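The fail-open behaviour amounts to a per-connection routing decision in the edge. The sketch below is illustrative only: it uses the daemon's internal port :8090 from the architecture notes, approximates "healthy" with a plain TCP connect (an assumption; the real edge may check /health instead), and the function names are hypothetical.

```python
import socket

# Daemon's internal proxy address, per the architecture section.
DAEMON_ADDR = ("127.0.0.1", 8090)


def daemon_is_healthy(addr: tuple[str, int] = DAEMON_ADDR, timeout: float = 0.25) -> bool:
    """ASSUMPTION: treat a successful TCP connect as 'healthy'."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def pick_next_hop(target_host: str, target_port: int, healthy: bool) -> tuple[str, int]:
    """Choose where the edge should open its outbound socket for one CONNECT."""
    if healthy:
        # Chain through the daemon so the request is metered.
        return DAEMON_ADDR
    # Daemon unavailable: tunnel straight to the provider, unmetered.
    return (target_host, target_port)
```

For example, pick_next_hop("api.anthropic.com", 443, daemon_is_healthy()) yields the daemon address while the daemon is up and the provider address otherwise, so the client never sees a refused connection.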


Pricing rates

Halton Meter ships with a bundled per-model pricing matrix sourced from each provider's public price page. Every captured request is costed against this matrix at write time and the rate-source is stamped on the row (requests.rates_source = "bundled-YYYY-MM-DD" or "override"), so every figure in a report or CSV export is traceable to a specific snapshot.

Three layers of freshness defence:

  1. Visible. halton-meter report prints Rates: bundled YYYY-MM-DD · N override(s) under the masthead.
  2. Overridable. halton-meter pricing set / list / reset records operator-set rates. Future captures use them immediately; historical halton-meter recompute-costs consults overrides first.
  3. Probed. halton-meter doctor does a 2s loopback fetch of rates-manifest.json and warns if the bundled date trails the latest published bundle. Fail-open: any network/parse error leaves the check silent. Disable with HALTON_METER_NO_RATES_PROBE=1.

Cost math itself never touches the network. The freshness probe is an opt-out signal — it never affects pricing, only the doctor row.
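The override-first lookup and the rates_source stamp can be sketched as follows. The model name, the rate, the snapshot date, and the names resolve_rate and ResolvedRate are all hypothetical; only the two stamp shapes ("bundled-YYYY-MM-DD" and "override") come from the description above.

```python
from dataclasses import dataclass

# Hypothetical bundled snapshot: millicents per million input tokens.
BUNDLED_DATE = "2026-01-01"
BUNDLED_RATES = {"example-model": 300_000}


@dataclass(frozen=True)
class ResolvedRate:
    millicents_per_mtok: int
    rates_source: str  # stamped on the stored request row


def resolve_rate(model: str, overrides: dict[str, int]) -> ResolvedRate:
    """Consult operator overrides first, then the bundled matrix."""
    if model in overrides:
        return ResolvedRate(overrides[model], "override")
    if model in BUNDLED_RATES:
        return ResolvedRate(BUNDLED_RATES[model], f"bundled-{BUNDLED_DATE}")
    raise KeyError(f"no rate known for model {model!r}")
```

Note that nothing here performs I/O, which matches the point that cost math never touches the network.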


Three install modes

| Mode | Invoke | What's metered | Risk |
| --- | --- | --- | --- |
| env-var-only (default) | halton-meter init | Whatever you explicitly route via halton-meter run <cmd> (Claude Code, the Anthropic SDK, python my_script.py, an interactive subshell with --shell, …). | None: never touches the system proxy. |
| apps | halton-meter init --apps | Spotlight/Dock-spawned apps that respect HTTPS_PROXY env (Claude Desktop, JetBrains IDEs, Cursor, Windsurf), and new shells launched after the install. Not browsers. | None: system proxy still untouched. |
| full | halton-meter init --full | All of apps plus browsers and GUI apps that ignore HTTPS_PROXY env (every other client) via the macOS system proxy. | May break HSTS browser sites if the CA isn't trusted by SecTrust. The daemon refuses to enable the proxy when cert verification fails. |

Switching modes is one command — the install path reconciles all mode-specific state in either direction. No uninstall step required.

Decision tree

  • "I just want to meter my Claude Code / SDK calls." → init (env-only default)
  • "I want every app I open from Spotlight metered, but I don't want my browser to break." → init --apps
  • "I want every HTTPS request my Mac makes, including browser traffic." → init --full

Branded CLI surfaces

Every user-facing command renders with the brand-orange Halton Meter mark and editorial sections:

  • halton-meter status — aggregate verdict (HEALTHY / DEGRADED / BROKEN), Component Health table with pill state indicators, ground-truth port resolution, sentinel state.
  • halton-meter doctor — row-by-row diagnostic with pill severity and copy-pasteable next-actions.
  • halton-meter report — masthead with date range, summary KV block, by-project + by-model tables with share-of-spend bars and 7-day sparklines.
  • halton-meter init / uninstall — sectioned step rows (supervisor (launchd) / certificate trust / shell environment), idempotency-aware.
  • halton-meter start / stop — masthead + step panel with PIDs and /health latency.

Layouts adapt to terminal width. The aggregate verdict is always honest — never says HEALTHY when truth disagrees.


Useful commands

halton-meter init [--apps|--full] [--no-shell-rc]   # install (re-runnable; idempotent)
halton-meter start | stop                           # supervisor control
halton-meter status [--json]                        # mode-aware HEALTHY / DEGRADED / BROKEN
halton-meter doctor [--json] [--curl]               # detailed diagnostics
halton-meter report [--since 7d | --from <YYYY-MM-DD> --to <YYYY-MM-DD>]  # cost report (see above)
halton-meter run <cmd> [args...]                    # exec wrapper that injects metering env
halton-meter run --shell                            # interactive subshell with metering env
halton-meter project <slug> set body-capture off    # opt out of body capture per project
halton-meter bodies show <request_id>               # inspect a captured body
halton-meter bodies stats                           # body-capture disk usage
halton-meter uninstall [--purge]                    # remove (graceful — see above)
halton-meter reset-proxy                            # emergency: disable system proxy
halton-meter version | --version                    # version

halton-meter doctor is the diagnostic of last resort: it prints every signal that matters (daemon health, cert trust, system proxy state, launchctl env, shell rc marker, env in current process) and ends with a top-line verdict and concrete next-actions. --curl adds an end-to-end TLS smoke test against https://example.com (hard-coded — never a provider domain).


Architecture (edge ↔ daemon split)

┌────────────────┐         ┌─────────────────────────┐
│ apps           │         │   halton-meter-edge     │
│ (Claude Code,  │ ─CONNECT──►  always-on TCP proxy  │
│  SDKs, ...)    │         │   listens on :8081      │
│ HTTPS_PROXY=...│         └────────┬─────────┬──────┘
└────────────────┘                  │         │
                              healthy?       not healthy
                                    │         │
                                    ▼         ▼
                       ┌────────────────┐  ┌─────────────────┐
                       │  halton-meter  │  │  upstream API   │
                       │  daemon        │  │  (direct tunnel,│
                       │                │  │   no metering)  │
                       │  mitmproxy +   │  └─────────────────┘
                       │  FastAPI +     │
                       │  SQLite        │
                       └────────────────┘

Three macOS LaunchAgents:

  • com.haltonlabs.meter.edge — always-on TCP CONNECT proxy on :8081. Apps see this in HTTPS_PROXY. When the daemon is healthy, chains CONNECTs to it for full metering. When the daemon is down, tunnels direct.
  • com.haltonlabs.meter — daemon (mitmproxy + FastAPI + SQLite) on :8090 internal + :8765 API.
  • com.haltonlabs.meter.watchdog — Layer-2 watchdog. Polls /health every 1.5s; disables the system proxy after ~7.5s of unresponsiveness so traffic flows direct.
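The watchdog timing works out to roughly five consecutive missed polls (7.5s / 1.5s). Here is a minimal sketch of that counting logic, under stated assumptions: the class name and method are hypothetical, "unresponsive" is read as consecutive failed /health polls, and re-arming after a healthy poll is a guess about the real behaviour.

```python
POLL_INTERVAL_S = 1.5
UNRESPONSIVE_AFTER_S = 7.5
MAX_MISSES = round(UNRESPONSIVE_AFTER_S / POLL_INTERVAL_S)  # 5 polls


class Watchdog:
    """Counts consecutive failed health polls and trips once at the limit."""

    def __init__(self, max_misses: int = MAX_MISSES) -> None:
        self.max_misses = max_misses
        self.misses = 0
        self.tripped = False

    def observe(self, healthy: bool) -> bool:
        """Record one poll; return True only on the poll that trips."""
        if healthy:
            # ASSUMPTION: a healthy poll resets the counter and re-arms.
            self.misses = 0
            self.tripped = False
            return False
        self.misses += 1
        if self.misses >= self.max_misses and not self.tripped:
            self.tripped = True  # caller would disable the system proxy here
            return True
        return False
```

A driver loop would call observe() every POLL_INTERVAL_S seconds with the result of a /health request and disable the system proxy when it returns True.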

halton-meter stop no longer breaks running apps. They keep working in passthrough until you halton-meter start again. halton-meter uninstall drops the supervisor plists; the edge keeps running in passthrough until reboot or manual pkill.


What it doesn't do

  • Doesn't wrap SDKs — your code stays exactly as it is
  • Doesn't intercept anything you don't ask it to — only configured LLM endpoints
  • Doesn't send your data anywhere by default — runs entirely locally
  • Doesn't break your workflow — if the proxy fails, traffic falls through to the real provider

Known limitations

Honest list, not a roadmap. Apps that ignore both the macOS system proxy AND HTTPS_PROXY env have no general capture path:

  • Go binaries with GODEBUG=netdns=go. Go's pure-Go resolver path bypasses HTTPS_PROXY env in some builds. Calls reach the provider unmetered.
  • libcurl callers using --insecure or with CURLOPT_PROXY not honoured. --insecure callers skip TLS verification entirely.
  • WSL2 networking on Windows hosts. WSL2 guest traffic is not currently routed through the Windows-side proxy. Linux-native and macOS are the supported platforms.
  • Hardcoded HTTP stacks that open raw TLS sockets to known provider IPs. Anything that bypasses both system proxy AND HTTPS_PROXY env will reach the provider unmetered.
  • Late-coming network interfaces. macOS network services are enumerated once when the daemon starts. If you plug in a new interface (Thunderbolt dock, USB-tethered iPhone), restart with halton-meter stop && halton-meter start.

Project

Halton Meter is source-available under the Polyform Perimeter License 1.0.1. Built by Halton Labs.

This package is the daemon component — the local proxy that intercepts, attributes, and costs LLM API traffic. The cloud dashboard and Phase 3 multi-tenant SaaS are developed separately.
