
Local LLM API proxy — captures usage, attributes cost to projects.


Halton Meter

Local LLM API proxy. Observes outbound traffic, attributes every request to a project, computes exact cost, surfaces it via terminal reports and a loopback HTTP API.

Designed for solo developers, agencies, and in-house dev teams who use Claude Code, the Anthropic SDK, OpenAI, Gemini, xAI, or other LLM clients heavily and want transparency over what's being spent and on what.

License: Polyform Perimeter 1.0.0


Platform support

macOS is the primary supported platform. Install + lifecycle + capture + reporting are exercised end-to-end on the maintainer's daily-driver before every release.

Linux is in public beta as of v0.1.24. Verified on Ubuntu 24.04 LTS against (a) a Docker + systemd-container soak run on every commit and (b) an AWS Lightsail instance with real systemd-logind, real loginctl enable-linger, real reboot survival, real Claude Code sessions captured end-to-end (Anthropic + Gemini interception confirmed). Other distros (Debian, Fedora, Arch) should work — same systemd --user mechanism — but are not yet covered by the release gate. Please open an issue with [linux-beta] in the subject if you hit anything; we want to graduate Linux to "primary supported" with broader distro coverage in v0.2.

Linux runs apps-mode only (env-var-only HTTPS_PROXY — no system-level proxy via networksetup-equivalent or transparent-proxy via iptables / nftables; that scope is on the post-v1.0 roadmap). The daemon + edge ship as systemd --user units under ~/.config/systemd/user/; the watchdog unit is macOS-only (the launchd-based networksetup poller has no Linux counterpart). See LINUX.md for the full Linux setup guide.

Windows is not yet implemented.


Quick install

Prerequisites

Requirement | Notes
Python 3.11, 3.12, or 3.13 | 3.14 is not supported yet (pinned out: pydantic-core / rich wheels not ready). On macOS: brew install python@3.13.
pipx | Recommended installer. brew install pipx && pipx ensurepath, or python3 -m pip install --user pipx.
macOS or Linux | macOS (any recent version) or Linux with systemd --user and loginctl enable-linger. Windows is not yet implemented.
Admin password (one-time) | Granted during init to install the mitmproxy CA cert into the macOS System keychain.

Everything else (mitmproxy, FastAPI, SQLAlchemy, structlog, Rich, Click, certifi) is bundled in the wheel. No Docker, no Postgres, no separate backend.

Install

# install
pipx install halton-meter

# verify
halton-meter --version

# one-time setup: cert trust + launchd supervisors + per-shell env block
halton-meter init --apps

# start metering
halton-meter start

# verify health
halton-meter status

After this, every terminal you open and every Spotlight-launched app routes its LLM API traffic through the meter. Existing terminals get the env on next shell open.

To stop metering: halton-meter stop. To remove entirely: halton-meter uninstall.

Graceful uninstall. uninstall stops metering and removes the launchd supervisor plists, but the edge process keeps running in pure-passthrough mode until next reboot. This is intentional — apps with HTTPS_PROXY=http://127.0.0.1:8081 baked into their environment (running terminals, IDEs, browsers) don't lose connectivity mid-flow. On reboot the edge dies with your session and isn't respawned. To kill it immediately: pkill -f "halton-meter edge". To also wipe ~/.halton-meter/db.sqlite and logs, add --purge.


What it does

Captures and costs every outbound LLM API call across these providers:

  • Anthropic: api.anthropic.com (/v1/messages, streaming + non-streaming, extended thinking, cache read/write)
  • OpenAI: api.openai.com (/v1/chat/completions, /v1/responses, /v1/embeddings; o-series reasoning tokens; prompt cache reads)
  • Gemini: generativelanguage.googleapis.com (/v1beta/models/<model>:generateContent, :streamGenerateContent, :embedContent, :batchEmbedContents; Gemini 2.5 thinking tokens; >200k tiered pricing; cached-content reads)
  • xAI: api.x.ai (/v1/chat/completions OpenAI-compat shape; /v1/messages Anthropic-compat shape)

Project attribution via HALTON_PROJECT env var, a .haltonrc file in the working directory, or the working directory name. macOS GUI apps get attributed via launchctl bundle-ID humanisation (mac:com.docker.docker → Docker).

Storage — local SQLite (~/.halton-meter/db.sqlite) with WAL. Token counts, latency, cost in integer millicents (no float drift), and full request/response bodies with credential redaction. Body capture default ON; opt out per project with halton-meter project <slug> set body-capture off. Background retention sweep keeps disk bounded (default 90 days).
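To see why integer millicents avoid float drift, here is a minimal sketch of the arithmetic. The rates, the per-million-token unit, and the round-half-up rule are assumptions chosen for illustration; the bundled pricing matrix may store and round rates differently.

```python
MILLICENTS_PER_DOLLAR = 100_000  # 1 millicent = 1/1000 of a cent
TOKENS_PER_MILLION = 1_000_000


def cost_millicents(tokens: int, rate_millicents_per_mtok: int) -> int:
    """Integer cost of `tokens` at a per-million-token rate, rounded half up."""
    numerator = tokens * rate_millicents_per_mtok + TOKENS_PER_MILLION // 2
    return numerator // TOKENS_PER_MILLION


def format_usd(millicents: int) -> str:
    """Render millicents as dollars using integer arithmetic only."""
    dollars, rest = divmod(millicents, MILLICENTS_PER_DOLLAR)
    return f"${dollars}.{rest:05d}"


# Hypothetical rates: $3.00 (input) and $15.00 (output) per million tokens.
INPUT_RATE = 3 * MILLICENTS_PER_DOLLAR
OUTPUT_RATE = 15 * MILLICENTS_PER_DOLLAR

total = cost_millicents(1_234, INPUT_RATE) + cost_millicents(567, OUTPUT_RATE)
print(total, format_usd(total))  # 1221 $0.01221
```

Because every intermediate value is an int, summing millions of rows gives the same total regardless of order.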

Loopback HTTP API at 127.0.0.1:8765 for the cloud dashboard or any reader. JSON in, JSON out. Audit trail of every request, policy event, and reconciliation run.

Cost report — single-command client-billing surface:

halton-meter report                           # default window
halton-meter report --since 7d                # last 7 days
halton-meter report --from 2026-04-01 --to 2026-05-01
halton-meter report --vs-previous             # Δ% vs prior period per row
halton-meter report --by project              # filter to one breakdown
halton-meter report --csv > spend.csv         # export
halton-meter report --json | jq               # pipe to dashboard

Layout adapts to terminal width (60 / 80 / 100 / 120 cols). Includes share-of-spend bars, p95 latency, and per-project 7-day trend sparklines.
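Width-adaptive layout of this kind usually reduces to picking the widest breakpoint that fits. The sketch below uses the four widths named above; the function name and the behaviour below 60 columns are assumptions, not taken from the tool.

```python
import shutil

BREAKPOINTS = (60, 80, 100, 120)  # column widths named in the text


def pick_layout(columns: int) -> int:
    """Return the widest breakpoint that fits in `columns`."""
    fitting = [bp for bp in BREAKPOINTS if bp <= columns]
    # ASSUMPTION: terminals narrower than 60 columns get the 60-column layout.
    return max(fitting) if fitting else BREAKPOINTS[0]


def current_layout() -> int:
    """Pick a layout for the terminal this process is attached to."""
    width = shutil.get_terminal_size(fallback=(80, 24)).columns
    return pick_layout(width)
```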

Failure mode — if the daemon is stopped, upgraded, or crashes, the always-on edge process transparently tunnels traffic direct to upstream. Your apps never see ConnectionRefused. Logging is observation infrastructure, never a single point of failure.
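The fail-open behaviour can be pictured as a per-connection routing decision in the edge. The following is a sketch under stated assumptions: the health check here is a bare TCP connect to the daemon's internal port (:8090, per the architecture section), which may not be how the real edge decides, and both function names are hypothetical.

```python
import socket

DAEMON_ADDR = ("127.0.0.1", 8090)  # daemon's internal proxy port per the text


def daemon_is_healthy(addr: tuple[str, int] = DAEMON_ADDR, timeout: float = 0.25) -> bool:
    """ASSUMPTION: a successful TCP connect counts as healthy."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def next_hop(target: tuple[str, int], healthy: bool) -> tuple[str, int]:
    """Chain to the daemon when it is healthy, else tunnel straight to the target."""
    return DAEMON_ADDR if healthy else target
```

For a CONNECT to api.anthropic.com:443, next_hop returns the daemon address while it is up and the provider address otherwise, so the client never sees a refused connection.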


Pricing rates

Halton Meter ships with a bundled per-model pricing matrix sourced from each provider's public price page. Every captured request is costed against this matrix at write time and the rate-source is stamped on the row (requests.rates_source = "bundled-YYYY-MM-DD" or "override"), so every figure in a report or CSV export is traceable to a specific snapshot.
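A minimal sketch of the stamping idea follows. The snapshot date, model name, and rate are made up, the names ResolvedRate and resolve_rate are hypothetical, and it assumes an operator override, when present, takes precedence over the bundled rate.

```python
from dataclasses import dataclass

# Hypothetical snapshot: millicents per million input tokens, keyed by model.
BUNDLED_DATE = "2026-04-01"
BUNDLED_RATES = {"example-model": 300_000}


@dataclass(frozen=True)
class ResolvedRate:
    millicents_per_mtok: int
    rates_source: str  # "bundled-YYYY-MM-DD" or "override"


def resolve_rate(model: str, overrides: dict[str, int]) -> ResolvedRate:
    """Prefer an operator override, else fall back to the bundled snapshot."""
    if model in overrides:
        return ResolvedRate(overrides[model], "override")
    return ResolvedRate(BUNDLED_RATES[model], f"bundled-{BUNDLED_DATE}")
```

Storing the source next to the cost means a later audit can tell which snapshot produced any given figure.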

Three layers of freshness defence:

  1. Visible. halton-meter report prints Rates: bundled YYYY-MM-DD · N override(s) under the masthead.
  2. Overridable. halton-meter pricing set / list / reset records operator-set rates. Future captures use them immediately; historical halton-meter recompute-costs consults overrides first.
  3. Probed. halton-meter doctor does a 2s loopback fetch of rates-manifest.json and warns if the bundled date trails the latest published bundle. Fail-open: any network/parse error leaves the check silent. Disable with HALTON_METER_NO_RATES_PROBE=1.

Cost math itself never touches the network. The freshness probe is an opt-out signal — it never affects pricing, only the doctor row.


Three install modes

Mode | Invoke | What's metered | Risk
env-var-only (default) | halton-meter init | Whatever you explicitly route via halton-meter run <cmd> (Claude Code, the Anthropic SDK, python my_script.py, an interactive subshell with --shell, …). | None; never touches the system proxy.
apps | halton-meter init --apps | Spotlight/Dock-spawned apps that respect HTTPS_PROXY env (Claude Desktop, JetBrains IDEs, Cursor, Windsurf), and new shells launched after the install. Not browsers. | None; system proxy still untouched.
full | halton-meter init --full | All of apps plus browsers and GUI apps that ignore HTTPS_PROXY env (every other client) via the macOS system proxy. | May break HSTS browser sites if the CA isn't trusted by SecTrust. The daemon refuses to enable the proxy when cert verification fails.

Switching modes is one command — the install path reconciles all mode-specific state in either direction. No uninstall step required.

Decision tree

  • "I just want to meter my Claude Code / SDK calls." → init (env-only default)
  • "I want every app I open from Spotlight metered, but I don't want my browser to break." → init --apps
  • "I want every HTTPS request my Mac makes, including browser traffic." → init --full

Branded CLI surfaces

Every user-facing command renders with the brand-orange Halton Meter mark and editorial sections:

  • halton-meter status — aggregate verdict (HEALTHY / DEGRADED / BROKEN), Component Health table with pill state indicators, ground-truth port resolution, sentinel state.
  • halton-meter doctor — row-by-row diagnostic with pill severity and copy-pasteable next-actions.
  • halton-meter report — masthead with date range, summary KV block, by-project + by-model tables with share-of-spend bars and 7-day sparklines.
  • halton-meter init / uninstall — sectioned step rows (supervisor (launchd) / certificate trust / shell environment), idempotency-aware.
  • halton-meter start / stop — masthead + step panel with PIDs and /health latency.

Layouts adapt to terminal width. The aggregate verdict is always honest — never says HEALTHY when truth disagrees.
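One simple way to keep an aggregate verdict honest is to report the worst component state. The sketch below assumes that rule; the actual aggregation logic is not described in this document, and the component names in the usage note are illustrative.

```python
from enum import IntEnum


class Verdict(IntEnum):
    HEALTHY = 0
    DEGRADED = 1
    BROKEN = 2


def aggregate(components: dict[str, Verdict]) -> Verdict:
    """ASSUMPTION: the overall verdict is the worst component state.

    With no components reported there is no evidence of health,
    so the sketch returns BROKEN rather than HEALTHY.
    """
    return max(components.values(), default=Verdict.BROKEN)
```

With {"daemon": Verdict.HEALTHY, "edge": Verdict.DEGRADED} the result is Verdict.DEGRADED, so a single unhealthy component is never hidden by healthy ones.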


Useful commands

halton-meter init [--apps|--full] [--no-shell-rc]   # install (re-runnable; idempotent)
halton-meter start | stop                           # supervisor control
halton-meter status [--json]                        # mode-aware HEALTHY / DEGRADED / BROKEN
halton-meter doctor [--json] [--curl]               # detailed diagnostics
halton-meter report [--since 7d | --from <date> --to <date>]  # cost report (see above)
halton-meter run <cmd> [args...]                    # exec wrapper that injects metering env
halton-meter run --shell                            # interactive subshell with metering env
halton-meter project <slug> set body-capture off    # opt out of body capture per project
halton-meter bodies show <request_id>               # inspect a captured body
halton-meter bodies stats                           # body-capture disk usage
halton-meter uninstall [--purge]                    # remove (graceful — see above)
halton-meter reset-proxy                            # emergency: disable system proxy
halton-meter version | --version                    # version

halton-meter doctor is the diagnostic of last resort: it prints every signal that matters (daemon health, cert trust, system proxy state, launchctl env, shell rc marker, env in current process) and ends with a top-line verdict and concrete next-actions. --curl adds an end-to-end TLS smoke test against https://example.com (hard-coded — never a provider domain).


Architecture (edge ↔ daemon split)

┌────────────────┐         ┌─────────────────────────┐
│ apps           │         │   halton-meter-edge     │
│ (Claude Code,  │ ─CONNECT──►  always-on TCP proxy  │
│  SDKs, ...)    │         │   listens on :8081      │
│ HTTPS_PROXY=...│         └────────┬─────────┬──────┘
└────────────────┘                  │         │
                              healthy?       not healthy
                                    │         │
                                    ▼         ▼
                       ┌────────────────┐  ┌─────────────────┐
                       │  halton-meter  │  │  upstream API   │
                       │  daemon        │  │  (direct tunnel,│
                       │                │  │   no metering)  │
                       │  mitmproxy +   │  └─────────────────┘
                       │  FastAPI +     │
                       │  SQLite        │
                       └────────────────┘

Three macOS LaunchAgents:

  • com.haltonlabs.meter.edge — always-on TCP CONNECT proxy on :8081. Apps see this in HTTPS_PROXY. When the daemon is healthy, chains CONNECTs to it for full metering. When the daemon is down, tunnels direct.
  • com.haltonlabs.meter — daemon (mitmproxy + FastAPI + SQLite) on :8090 internal + :8765 API.
  • com.haltonlabs.meter.watchdog — Layer-2 watchdog. Polls /health every 1.5s; disables the system proxy after ~7.5s of unresponsiveness so traffic flows direct.
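The watchdog timing above works out to roughly five consecutive missed polls (7.5s / 1.5s). That count is an inference from the two figures, not a documented constant. A minimal sketch of the counting logic, with a hypothetical Watchdog class:

```python
POLL_INTERVAL_S = 1.5       # from the text
UNRESPONSIVE_AFTER_S = 7.5  # from the text ("~7.5s")
MAX_MISSES = round(UNRESPONSIVE_AFTER_S / POLL_INTERVAL_S)  # inferred: 5 misses


class Watchdog:
    """Counts consecutive failed /health polls."""

    def __init__(self) -> None:
        self.misses = 0

    def observe(self, healthy: bool) -> bool:
        """Record one poll result; return True when the proxy should be disabled."""
        self.misses = 0 if healthy else self.misses + 1
        return self.misses >= MAX_MISSES
```

A single healthy poll resets the counter, so brief stalls shorter than the threshold never trip the system-proxy fallback.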

halton-meter stop no longer breaks running apps. They keep working in passthrough until you halton-meter start again. halton-meter uninstall drops the supervisor plists; the edge keeps running in passthrough until reboot or manual pkill.


What it doesn't do

  • Doesn't wrap SDKs — your code stays exactly as it is
  • Doesn't intercept anything you don't ask it to — only configured LLM endpoints
  • Doesn't send your data anywhere by default — runs entirely locally
  • Doesn't break your workflow — if the proxy fails, traffic falls through to the real provider

Known limitations

Honest list, not a roadmap. Apps that ignore both the macOS system proxy AND HTTPS_PROXY env have no general capture path:

  • Go binaries with GODEBUG=netdns=go. Go's pure-Go resolver path bypasses HTTPS_PROXY env in some builds. Calls reach the provider unmetered.
  • libcurl callers using --insecure or with CURLOPT_PROXY not honoured. --insecure callers skip TLS verification entirely.
  • WSL2 networking on Windows hosts. Currently doesn't route WSL2 guest traffic through the Windows-side proxy. Linux-native and macOS are the supported platforms.
  • Hardcoded HTTP stacks that open raw TLS sockets to known provider IPs. Anything that bypasses both system proxy AND HTTPS_PROXY env will reach the provider unmetered.
  • Late-coming network interfaces. macOS network services are enumerated once when the daemon starts. If you plug in a new interface (Thunderbolt dock, USB-tethered iPhone), restart with halton-meter stop && halton-meter start.

Project

Halton Meter is source-available under the Polyform Perimeter License 1.0.0. Built by Halton Labs.

This package is the daemon component — the local proxy that intercepts, attributes, and costs LLM API traffic. The cloud dashboard and Phase 3 multi-tenant SaaS are developed separately.

