Local LLM API proxy — captures usage, attributes cost to projects.

Halton Meter

Local LLM API proxy. Observes outbound traffic, attributes every request to a project, computes exact cost, surfaces it via terminal reports and a loopback HTTP API.

Designed for solo developers, agencies, and in-house dev teams who use Claude Code, the Anthropic SDK, OpenAI, Gemini, xAI, or other LLM clients heavily and want transparency over what's being spent and on what.

License: Polyform Perimeter 1.0.0


Quick install

Prerequisites

Requirement | Notes
Python 3.11, 3.12, or 3.13 | 3.14 is not supported yet (pinned out: pydantic-core / rich wheels not ready). On macOS: brew install python@3.13.
pipx | Recommended installer. brew install pipx && pipx ensurepath, or python3 -m pip install --user pipx.
macOS | Currently the only supported platform. Linux and Windows support will arrive in the init flow.
Admin password (one-time) | Granted during init to install the mitmproxy CA cert into the macOS System keychain.

Everything else (mitmproxy, FastAPI, SQLAlchemy, structlog, Rich, Click, certifi) is bundled in the wheel. No Docker, no Postgres, no separate backend.

Install

# install
pipx install halton-meter

# verify
halton-meter --version

# one-time setup: cert trust + launchd supervisors + per-shell env block
halton-meter init --apps

# start metering
halton-meter start

# verify health
halton-meter status

After this, every terminal you open and every Spotlight-launched app routes its LLM API traffic through the meter. Terminals that are already open pick up the environment the next time a new shell starts.

To stop metering: halton-meter stop. To remove entirely: halton-meter uninstall.

Graceful uninstall. halton-meter uninstall stops metering and removes the launchd supervisor plists, but the edge process keeps running in pure-passthrough mode until the next reboot. This is intentional: apps with HTTPS_PROXY=http://127.0.0.1:8081 baked into their environment (running terminals, IDEs, browsers) don't lose connectivity mid-flow. On reboot the edge dies with your session and isn't respawned. To kill it immediately: pkill -f "halton-meter edge". To also wipe ~/.halton-meter/db.sqlite and logs, run halton-meter uninstall --purge.


What it does

Captures and costs every outbound LLM API call across these providers:

  • Anthropic: api.anthropic.com (/v1/messages, streaming + non-streaming, extended thinking, cache read/write)
  • OpenAI: api.openai.com (/v1/chat/completions, /v1/responses, /v1/embeddings; o-series reasoning tokens; prompt cache reads)
  • Gemini: generativelanguage.googleapis.com (/v1beta/models/<model>:generateContent, :streamGenerateContent, :embedContent, :batchEmbedContents; Gemini 2.5 thinking tokens; >200k tiered pricing; cached-content reads)
  • xAI: api.x.ai (/v1/chat/completions OpenAI-compat shape; /v1/messages Anthropic-compat shape)
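
The ">200k tiered pricing" note for Gemini can be illustrated with a minimal sketch. It assumes the tier is chosen by prompt size and that the chosen rate then applies to the whole request. The rates and the function name are hypothetical placeholders, not Halton Meter's actual price table or code.

```python
# Hypothetical rates in millicents per million input tokens (not real prices).
TIER_THRESHOLD_TOKENS = 200_000
RATE_STANDARD = 125_000      # used when the prompt is <= 200k tokens
RATE_LONG_CONTEXT = 250_000  # used when the prompt is > 200k tokens


def tiered_input_cost(prompt_tokens: int) -> int:
    """Return the input cost of one request in integer millicents.

    Assumption: the tier is selected by total prompt size, and the selected
    rate applies to every prompt token in that request.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens > TIER_THRESHOLD_TOKENS:
        rate = RATE_LONG_CONTEXT
    else:
        rate = RATE_STANDARD
    # Integer arithmetic with round-half-up; no floats are involved.
    return (prompt_tokens * rate + 500_000) // 1_000_000
```

Under these placeholder rates, crossing the threshold by a single token doubles the cost of the request, which is why a tier-aware costing step matters for long-context calls.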

Project attribution via HALTON_PROJECT env var, a .haltonrc file in the working directory, or the working directory name. macOS GUI apps get attributed via launchctl bundle-ID humanisation (mac:com.docker.docker → Docker).
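
As a rough illustration of the attribution fallback described above, the sketch below resolves a project slug from the three sources. The precedence (env var, then .haltonrc, then directory name) is assumed from the order in which they are listed, and the .haltonrc format (slug on the first line) is a guess; resolve_project is a hypothetical name, not part of the package.

```python
from pathlib import Path


def resolve_project(env: dict[str, str], cwd: Path) -> str:
    """Pick a project slug for a request (illustrative only).

    Assumed precedence: HALTON_PROJECT, then .haltonrc, then directory name.
    """
    explicit = env.get("HALTON_PROJECT", "").strip()
    if explicit:
        return explicit

    rc_file = cwd / ".haltonrc"
    if rc_file.is_file():
        # Assumption: the file holds the slug on its first line.
        content = rc_file.read_text(encoding="utf-8").strip()
        if content:
            return content.splitlines()[0].strip()

    # Last resort: the name of the working directory itself.
    return cwd.resolve().name
```

For example, resolve_project({"HALTON_PROJECT": "acme"}, Path.cwd()) returns "acme" regardless of the directory, while an empty environment falls through to the directory-based sources.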

Storage — local SQLite (~/.halton-meter/db.sqlite) with WAL. Token counts, latency, cost in integer millicents (no float drift), and full request/response bodies with credential redaction. Body capture default ON; opt out per project with halton-meter project <slug> set body-capture off. Background retention sweep keeps disk bounded (default 90 days).
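
To show why integer millicents avoid float drift, here is a minimal sketch of integer-only costing. One millicent is 1/1000 of a cent, so one dollar is 100,000 millicents. The per-million-token price and the function names are hypothetical, and the rounding rule (round half up) is an assumption rather than the package's documented behaviour.

```python
MILLICENTS_PER_DOLLAR = 100_000  # 100 cents x 1000 millicents


def cost_millicents(tokens: int, price_millicents_per_mtok: int) -> int:
    """Cost of `tokens` at a per-million-token price, rounded half up."""
    return (tokens * price_millicents_per_mtok + 500_000) // 1_000_000


def format_dollars(millicents: int) -> str:
    """Render a non-negative millicent amount as dollars with 5 decimals."""
    dollars, remainder = divmod(millicents, MILLICENTS_PER_DOLLAR)
    return f"${dollars}.{remainder:05d}"


# Floats accumulate representation error; integers do not.
float_drifts = (0.1 + 0.2) != 0.3
integers_exact = (10_000 + 20_000) == 30_000
```

Summing many small per-request costs as integers gives the same total in any order, which is what makes the stored values safe to aggregate in reports.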

Loopback HTTP API at 127.0.0.1:8765 for the cloud dashboard or any reader. JSON in, JSON out. Audit trail of every request, policy event, and reconciliation run.

Cost report — single-command client-billing surface:

halton-meter report                           # default window
halton-meter report --since 7d                # last 7 days
halton-meter report --from 2026-04-01 --to 2026-05-01
halton-meter report --vs-previous             # Δ% vs prior period per row
halton-meter report --by project              # filter to one breakdown
halton-meter report --csv > spend.csv         # export
halton-meter report --json | jq               # pipe to dashboard

Layout adapts to terminal width (60 / 80 / 100 / 120 cols). Includes share-of-spend bars, p95 latency, and per-project 7-day trend sparklines.
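
A width-adaptive layout with sparklines could be approximated as below. The breakpoints come from the sentence above; the selection rule (widest breakpoint that fits) and the sparkline scaling are assumptions, and pick_layout and sparkline are hypothetical names.

```python
import shutil

BREAKPOINTS = (60, 80, 100, 120)  # widths named in this README
BARS = "▁▂▃▄▅▆▇█"


def pick_layout(columns: int) -> int:
    """Return the widest breakpoint that fits, else the narrowest one."""
    fitting = [b for b in BREAKPOINTS if b <= columns]
    return max(fitting) if fitting else BREAKPOINTS[0]


def sparkline(values: list[int]) -> str:
    """Map integer values (e.g. daily millicents) onto eight bar heights."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BARS[0] * len(values)
    return "".join(BARS[(v - lo) * (len(BARS) - 1) // span] for v in values)


if __name__ == "__main__":
    width = shutil.get_terminal_size(fallback=(80, 24)).columns
    print(pick_layout(width), sparkline([120, 90, 300, 310, 0, 45, 200]))
```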

Failure mode — if the daemon is stopped, upgraded, or crashes, the always-on edge process transparently tunnels traffic direct to upstream. Your apps never see ConnectionRefused. Logging is observation infrastructure, never a single point of failure.
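
The fall-through behaviour can be pictured as a routing decision made per CONNECT request. The sketch below is a simplification under stated assumptions: the daemon address uses the internal port 8090 mentioned in the architecture notes, and Hop, parse_connect, and next_hop are hypothetical names, not the edge's real code.

```python
from typing import NamedTuple


class Hop(NamedTuple):
    host: str
    port: int


DAEMON = Hop("127.0.0.1", 8090)  # internal daemon port per the architecture notes


def parse_connect(request_line: str) -> Hop:
    """Parse 'CONNECT host:port HTTP/1.1' into the requested target."""
    method, authority, _version = request_line.split()
    if method != "CONNECT":
        raise ValueError(f"not a CONNECT request: {request_line!r}")
    host, _, port = authority.rpartition(":")
    return Hop(host, int(port))


def next_hop(request_line: str, daemon_healthy: bool) -> Hop:
    """Chain to the daemon when it is healthy, otherwise tunnel direct."""
    target = parse_connect(request_line)
    return DAEMON if daemon_healthy else target
```

Because the decision is made by the always-on edge rather than by the client, the client's HTTPS_PROXY setting never has to change when the daemon goes away.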


Three install modes

Mode | Invoke | What's metered | Risk
env-var-only (default) | halton-meter init | Whatever you explicitly route via halton-meter run <cmd> (Claude Code, the Anthropic SDK, python my_script.py, an interactive subshell with --shell, …). | None; never touches the system proxy.
apps | halton-meter init --apps | Spotlight/Dock-spawned apps that respect HTTPS_PROXY env (Claude Desktop, JetBrains IDEs, Cursor, Windsurf), and new shells launched after the install. Not browsers. | None; system proxy still untouched.
full | halton-meter init --full | All of apps plus browsers and GUI apps that ignore HTTPS_PROXY env (every other client) via the macOS system proxy. | May break HSTS browser sites if the CA isn't trusted by SecTrust. The daemon refuses to enable the proxy when cert verification fails.

Switching modes is one command — the install path reconciles all mode-specific state in either direction. No uninstall step required.

Decision tree

  • "I just want to meter my Claude Code / SDK calls." → init (env-only default)
  • "I want every app I open from Spotlight metered, but I don't want my browser to break." → init --apps
  • "I want every HTTPS request my Mac makes, including browser traffic." → init --full

Branded CLI surfaces

Every user-facing command renders with the brand-orange Halton Meter mark and editorial sections:

  • halton-meter status — aggregate verdict (HEALTHY / DEGRADED / BROKEN), Component Health table with pill state indicators, ground-truth port resolution, sentinel state.
  • halton-meter doctor — row-by-row diagnostic with pill severity and copy-pasteable next-actions.
  • halton-meter report — masthead with date range, summary KV block, by-project + by-model tables with share-of-spend bars and 7-day sparklines.
  • halton-meter init / uninstall — sectioned step rows (supervisor (launchd) / certificate trust / shell environment), idempotency-aware.
  • halton-meter start / stop — masthead + step panel with PIDs and /health latency.

Layouts adapt to terminal width. The aggregate verdict is always honest — never says HEALTHY when truth disagrees.


Useful commands

halton-meter init [--apps|--full] [--no-shell-rc]   # install (re-runnable; idempotent)
halton-meter start | stop                           # supervisor control
halton-meter status [--json]                        # mode-aware HEALTHY / DEGRADED / BROKEN
halton-meter doctor [--json] [--curl]               # detailed diagnostics
halton-meter report [--since 7d | --from <date> --to <date>]  # cost report (see above)
halton-meter run <cmd> [args...]                    # exec wrapper that injects metering env
halton-meter run --shell                            # interactive subshell with metering env
halton-meter project <slug> set body-capture off    # opt out of body capture per project
halton-meter bodies show <request_id>               # inspect a captured body
halton-meter bodies stats                           # body-capture disk usage
halton-meter uninstall [--purge]                    # remove (graceful — see above)
halton-meter reset-proxy                            # emergency: disable system proxy
halton-meter version | --version                    # version

halton-meter doctor is the diagnostic of last resort: it prints every signal that matters (daemon health, cert trust, system proxy state, launchctl env, shell rc marker, env in current process) and ends with a top-line verdict and concrete next-actions. --curl adds an end-to-end TLS smoke test against https://example.com (hard-coded — never a provider domain).


Architecture (edge ↔ daemon split)

┌────────────────┐         ┌─────────────────────────┐
│ apps           │         │   halton-meter-edge     │
│ (Claude Code,  │ ─CONNECT──►  always-on TCP proxy  │
│  SDKs, ...)    │         │   listens on :8081      │
│ HTTPS_PROXY=...│         └────────┬─────────┬──────┘
└────────────────┘                  │         │
                              healthy?       not healthy
                                    │         │
                                    ▼         ▼
                       ┌────────────────┐  ┌─────────────────┐
                       │  halton-meter  │  │  upstream API   │
                       │  daemon        │  │  (direct tunnel,│
                       │                │  │   no metering)  │
                       │  mitmproxy +   │  └─────────────────┘
                       │  FastAPI +     │
                       │  SQLite        │
                       └────────────────┘

Three macOS LaunchAgents:

  • com.haltonlabs.meter.edge — always-on TCP CONNECT proxy on :8081. Apps see this in HTTPS_PROXY. When the daemon is healthy, chains CONNECTs to it for full metering. When the daemon is down, tunnels direct.
  • com.haltonlabs.meter — daemon (mitmproxy + FastAPI + SQLite) on :8090 internal + :8765 API.
  • com.haltonlabs.meter.watchdog — Layer-2 watchdog. Polls /health every 1.5s; disables the system proxy after ~7.5s of unresponsiveness so traffic flows direct.
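
The watchdog timing above works out to five consecutive missed polls (7.5 s / 1.5 s). A minimal sketch of that logic follows, assuming a simple consecutive-failure counter; the Watchdog class, its probe and on_trip callbacks, and the reset-on-success behaviour are illustrative assumptions, not the shipped implementation.

```python
from typing import Callable

POLL_INTERVAL_S = 1.5
UNRESPONSIVE_AFTER_S = 7.5
MAX_FAILURES = round(UNRESPONSIVE_AFTER_S / POLL_INTERVAL_S)  # 5 missed polls


class Watchdog:
    """Counts consecutive failed health probes and trips once at the limit."""

    def __init__(self, probe: Callable[[], bool], on_trip: Callable[[], None]) -> None:
        self.probe = probe      # e.g. an HTTP GET of /health returning True/False
        self.on_trip = on_trip  # e.g. a routine that disables the system proxy
        self.failures = 0
        self.tripped = False

    def tick(self) -> None:
        """Run one poll; the caller invokes this every POLL_INTERVAL_S seconds."""
        if self.probe():
            self.failures = 0
            self.tripped = False
            return
        self.failures += 1
        if self.failures >= MAX_FAILURES and not self.tripped:
            self.tripped = True
            self.on_trip()
```

Injecting the probe and the trip action as callables keeps the timing logic testable without touching the network or the system proxy settings.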

halton-meter stop no longer breaks running apps. They keep working in passthrough until you halton-meter start again. halton-meter uninstall drops the supervisor plists; the edge keeps running in passthrough until reboot or manual pkill.


What it doesn't do

  • Doesn't wrap SDKs — your code stays exactly as it is
  • Doesn't intercept anything you don't ask it to — only configured LLM endpoints
  • Doesn't send your data anywhere by default — runs entirely locally
  • Doesn't break your workflow — if the proxy fails, traffic falls through to the real provider

Known limitations

Honest list, not a roadmap. Apps that ignore both the macOS system proxy AND HTTPS_PROXY env have no general capture path:

  • Go binaries with GODEBUG=netdns=go. Go's pure-Go resolver path bypasses HTTPS_PROXY env in some builds. Calls reach the provider unmetered.
  • libcurl callers using --insecure or with CURLOPT_PROXY not honoured. --insecure callers skip TLS verification entirely.
  • WSL2 networking on Windows hosts. Halton Meter currently doesn't route WSL2 guest traffic through the Windows-side proxy. Linux-native and macOS are the supported platforms.
  • Hardcoded HTTP stacks that open raw TLS sockets to known provider IPs. Anything that bypasses both system proxy AND HTTPS_PROXY env will reach the provider unmetered.
  • Late-coming network interfaces. macOS network services are enumerated once when the daemon starts. If you plug in a new interface (Thunderbolt dock, USB-tethered iPhone), restart with halton-meter stop && halton-meter start.

Project

Halton Meter is source-available under the Polyform Perimeter License 1.0.0. Built by Halton Labs.

This package is the daemon component — the local proxy that intercepts, attributes, and costs LLM API traffic. The cloud dashboard and Phase 3 multi-tenant SaaS are developed separately.
