MCP server for AI agent token-cost telemetry + quota-window awareness across Anthropic, OpenAI, Gemini, Ollama, AWS Bedrock — per-agent + per-provider attribution, spend-spike anomaly detection, cheaper-model routing recommendations, monthly forecast, and 429-prediction + throttle-target tools that read anthropic-ratelimit-* headers to project rate-limit exhaustion before it happens. Reads cost-log JSONL with the standard schema; OpenClaw ~/.openclaw/cost-logs/ is supported natively.

These details have not been verified by PyPI

Project description

openclaw-cost-tracker-mcp

MCP server for AI agent token-cost telemetry, built for the operator who woke up to a surprise bill. It catches the failure modes vendor dashboards miss: silent re-routing to API billing (the HERMES.md bug, 1,464 upvotes), unattended /loop overnight burns ($6k in 26 hours, a real story), per-agent attribution buried behind aggregate provider totals, and multi-day dashboard lag that surfaces the spike after the money is gone.

Cross-provider: Anthropic, OpenAI, Gemini, Ollama, AWS Bedrock. Features: per-agent + per-provider attribution, spend-spike anomaly detection (per-agent median × threshold), cheaper-routing recommendations with 30-day savings estimates, and a monthly forecast. Reads any cost-log JSONL with the standard {request_id, timestamp, provider, model, agent_id, prompt_tokens, completion_tokens, cost_usd} schema. OpenClaw operators get native ~/.openclaw/cost-logs/ parsing; other deployments wrap their provider calls with the included logging shim or commission a Custom MCP Build adapter.

Keywords: Claude API cost surprise, /loop overnight burn, HERMES.md billing routing, dashboard lag, AI agent FinOps, LLM cost attribution, token spend monitoring.

What it does

Production AI deployments are leaking money in ways the vendor dashboards don't surface in time. Real stories from the last 30 days on r/ClaudeAI:

A developer woke up to a $6,000 overnight burn from a single /loop command running 26 hours unattended on Claude Opus 4.7 (1,175 upvotes, 314 comments, May 1 2026). The Anthropic dashboard lagged by days; the spike was invisible until the limit email landed. Verbatim from the OP: "By the time it shows a spike, the money is already spent."
Another lost $200 because a string in a git commit silently routed Claude Code billing from the Max subscription to the API tier: the HERMES.md bug (1,464 upvotes, 202 comments). The vendor has acknowledged it: a thread on X by Om Patel (@om_patel5; 1.4M views, 4.6K likes) drew an Anthropic-side reply confirming "this was a bug with the 3rd party harness detection [and how we pull git status into the system prompt]". In other words, it is a real bug in how Anthropic detects third-party harnesses and ingests git context, not an isolated user error. Cost-tracker is the defense layer regardless of patch timing.
A third left a test loop running and woke up to a surprise $80 Claude bill. Verbatim: "No alerts. No cap. No warning. Just a bill."
A fourth got $1,800 in API charges in two days after the built-in claude-code-guide agent recommended a command that bypasses Max-plan billing.

In every case the dashboard lagged, the per-agent attribution was invisible, and Anthropic's own hard-cap mechanism either failed or arrived too late. This MCP server surfaces per-agent + per-provider cost attribution live, queryable from inside Claude or any MCP-aware client, before the bill arrives:

> claude: where did our LLM spend go this week?
[MCP tools: cost_overview + top_cost_drivers]

Total spend last 7 days: $42.18
By provider:
  Anthropic    $30.40 (72%) — claude-sonnet-4 dominates
  OpenAI       $9.20  (22%) — chat-bot agent
  Gemini       $1.86  (4%)  — cron-summarizer (cheap-route working)
  Ollama       $0.00  (local, free)

Top 3 cost drivers:
  data-extraction-agent      $28.50 (68%)
  chat-bot                   $9.20  (22%)
  cron-summarizer            $1.86  (4%)

1 anomaly flagged — see find_cost_anomalies for details.

> claude: any cheaper-routing opportunities?
[MCP tool: model_routing_recommendations]

Recommendation: data-extraction-agent currently runs claude-sonnet-4
with avg 400-token completions — extraction-style work that
gemini-2.5-flash usually handles at ~95% quality for ~5% the cost.
Estimated savings: $27.10/30d if migrated. Test on a 10% slice first.

v1.1: quota-window awareness. The cost tracker now also answers "will my agent hit a 429 before the window resets?" before the 429 actually happens. This is the HERMES.md / overnight-burn detector. It projects a per-dimension exhaustion ETA from your active burn rate and the rate-limit headers your provider already returns:

> claude: am I about to 429?
[MCP tool: predict_429_in_window]

severity: CRITICAL
provider: anthropic
burn_rate_tokens_per_min: 8000
dimension_predictions:
  tokens          : will 429 in ~6.5 min  (resets in 22 min) ⚠
  output_tokens   : will 429 in ~11.7 min (resets in 22 min) ⚠
  input_tokens    : safe through reset (headroom 70k at reset)
  requests        : safe through reset (headroom 858 at reset)

summary: tokens projected to 429 in ~6.5 min at current burn (8000/min).
Window resets in 22 min — throttle now.
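
The sample output above is consistent with a simple linear projection: divide what remains in each dimension by the current burn rate and compare the result with the time left until reset. The sketch below assumes that model; the function name and the "remaining" figures are back-derived from the sample numbers for illustration and are not taken from the server's code.

```python
from typing import Optional


def minutes_until_429(remaining: float, burn_per_min: float,
                      minutes_to_reset: float) -> Optional[float]:
    """Return minutes until the dimension is exhausted, or None if the
    window resets first (linear extrapolation of the current burn rate)."""
    if burn_per_min <= 0:
        return None  # nothing is being consumed, so nothing to project
    eta = remaining / burn_per_min
    return eta if eta < minutes_to_reset else None


# Illustrative (remaining, burn per minute) pairs, assumed for this example.
dimensions = {
    "tokens": (52_000, 8_000),
    "output_tokens": (35_000, 3_000),
    "input_tokens": (180_000, 5_000),
    "requests": (880, 1),
}
MINUTES_TO_RESET = 22

for name, (remaining, burn) in dimensions.items():
    eta = minutes_until_429(remaining, burn, MINUTES_TO_RESET)
    if eta is None:
        headroom = remaining - burn * MINUTES_TO_RESET
        print(f"{name}: safe through reset (headroom {headroom:g} at reset)")
    else:
        print(f"{name}: will 429 in ~{eta:.1f} min")
```

A linear model ignores bursts and idle periods, so a real projection would probably smooth the burn rate over a recent window first.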

> claude: what burn rate keeps me safe?
[MCP tool: recommend_throttle_target target_buffer_pct=10]

target_buffer_pct: 10
targets:
  tokens          : current 8000/min, max safe 545/min — must throttle
  output_tokens   : current 3000/min, max safe 1136/min — must throttle
  input_tokens    : current 5000/min, max safe 6818/min — safe
  requests        : current 1/min,    max safe 35/min   — safe

summary: throttle required on tokens, output_tokens to keep 10% buffer at reset.
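
The "max safe" figures above can be reproduced with one line of arithmetic: reserve the buffer (a percentage of the limit), then spread what is left of the remaining quota over the minutes until reset. This is a sketch under that assumption; the per-dimension limits below are inferred from the sample numbers, and the function name is hypothetical.

```python
def max_safe_burn(limit: float, remaining: float,
                  minutes_to_reset: float, target_buffer_pct: float) -> float:
    """Highest per-minute burn that still leaves target_buffer_pct of the
    limit unused when the window resets."""
    if minutes_to_reset <= 0:
        raise ValueError("minutes_to_reset must be positive")
    buffer = limit * target_buffer_pct / 100
    spendable = max(remaining - buffer, 0)  # never recommend a negative rate
    return spendable / minutes_to_reset


# (limit, remaining, current burn per minute): assumed values for illustration.
dimensions = {
    "tokens": (400_000, 52_000, 8_000),
    "output_tokens": (100_000, 35_000, 3_000),
    "input_tokens": (300_000, 180_000, 5_000),
    "requests": (1_000, 880, 1),
}

for name, (limit, remaining, current) in dimensions.items():
    safe = max_safe_burn(limit, remaining, minutes_to_reset=22,
                         target_buffer_pct=10)
    verdict = "safe" if current <= safe else "must throttle"
    print(f"{name}: current {current}/min, max safe {int(safe)}/min, {verdict}")
```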

Why `openclaw-cost-tracker-mcp`

Four things existing tools (provider billing dashboards, generic FinOps tools, custom scripts, even paid SaaS like Lava or AgentShield) don't do well together:

Per-agent attribution, not just per-provider totals. Provider dashboards show "$X spent on Anthropic" — they can't tell you which of your six agents drove 78% of that. Cost tracker reads per-request cost-log JSONL and aggregates with the agent_id intact.
Cost-spike anomaly detection per agent. A single 120k-token paste into chat — or an unattended /loop running overnight — costs more than a week of normal traffic. The default 3x-median-per-agent threshold flags those before they show up in the month-end bill. Catches the same shape as the $6k overnight thread and the silent re-routing patterns that vendor dashboards miss.
Routing recommendations grounded in actual usage. Generic "use cheaper models" advice is useless. This server identifies specific agents whose volume + completion-length pattern suggests a cheaper provider would deliver the same outcome, with concrete 30-day savings estimates.
Local-only, MCP-native, no traffic interception. Lava + similar paid gateways sit between your app and the provider — your traffic routes through them. AgentShield is a callback for LangChain/CrewAI/OpenAI SDK, not MCP. Cost-tracker is read-only: it parses your existing cost logs and surfaces them via MCP tools your Claude conversation can query directly. No proxy, no traffic routing, no subscription, no vendor lock-in.
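
To make the per-agent median threshold concrete, here is a minimal sketch of the idea: compute each agent's median request cost, then flag any request at or above 3x that median. It only assumes the `agent_id` and `cost_usd` fields from the log schema; the function name and sample records are illustrative, and the server's own implementation may differ.

```python
from collections import defaultdict
from statistics import median


def find_anomalies(entries, threshold=3.0):
    """Flag entries costing at least threshold x their own agent's median."""
    costs_by_agent = defaultdict(list)
    for entry in entries:
        costs_by_agent[entry["agent_id"]].append(entry["cost_usd"])
    medians = {agent: median(costs) for agent, costs in costs_by_agent.items()}
    return [
        entry for entry in entries
        if medians[entry["agent_id"]] > 0
        and entry["cost_usd"] >= threshold * medians[entry["agent_id"]]
    ]


# Made-up records: one chat-bot request is far above that agent's norm.
entries = [
    {"request_id": "req-1", "agent_id": "chat-bot", "cost_usd": 0.010},
    {"request_id": "req-2", "agent_id": "chat-bot", "cost_usd": 0.012},
    {"request_id": "req-3", "agent_id": "chat-bot", "cost_usd": 0.011},
    {"request_id": "req-4", "agent_id": "chat-bot", "cost_usd": 0.450},
    {"request_id": "req-5", "agent_id": "data-extraction-agent", "cost_usd": 0.40},
    {"request_id": "req-6", "agent_id": "data-extraction-agent", "cost_usd": 0.42},
    {"request_id": "req-7", "agent_id": "data-extraction-agent", "cost_usd": 0.41},
]

for entry in find_anomalies(entries):
    print(entry["request_id"], entry["agent_id"], entry["cost_usd"])
```

Comparing each request against its own agent's median is what keeps a consistently expensive agent (the extraction agent here) from being flagged just for being expensive.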

Built for the production-AI operator running real workloads with real spend that matters — whether on OpenClaw, Claude Code with claude -p, raw Anthropic API, OpenAI Assistants, or any combination.

Tool surface

Tool	What it returns
`cost_overview`	Total spend + by-provider + top agents + top models + anomaly count for a window
`costs_by_agent`	Per-agent breakdown with avg-cost-per-request + share of total
`costs_by_provider`	Per-provider breakdown with token counts
`find_cost_anomalies`	Requests flagged as 3x+ above their agent's median cost
`top_cost_drivers`	Top N spending agents + models, no other noise
`model_routing_recommendations`	Specific cheaper-model suggestions with 30d savings estimates
`forecast_monthly_cost`	Projects 30-day total + per-provider with confidence note
`current_quota_usage` (v1.1)	Per-dimension rate-limit utilization + most-pressured-dimension headline
`predict_429_in_window` (v1.1)	Projects per-dimension 429 ETA from recent burn rate. CRITICAL/WARNING/INFO severity
`recommend_throttle_target` (v1.1)	Max safe burn rate per dimension to land at `target_buffer_pct` headroom at reset
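
As a rough illustration of the per-dimension utilization that `current_quota_usage` reports, the sketch below reads the `anthropic-ratelimit-*-limit` and `-remaining` response headers and picks the most-pressured dimension. It assumes the headers arrive as a dict with lowercase keys; the function names and header values are made up for the example and are not the server's code.

```python
from typing import Dict, Mapping, Optional

# Header slugs for the four dimensions shown in the quota tools.
DIMENSIONS = {
    "requests": "requests",
    "tokens": "tokens",
    "input_tokens": "input-tokens",
    "output_tokens": "output-tokens",
}


def quota_utilization(headers: Mapping[str, str]) -> Dict[str, float]:
    """Return the used fraction (0.0 to 1.0) per dimension found in the headers."""
    usage = {}
    for dimension, slug in DIMENSIONS.items():
        limit = headers.get(f"anthropic-ratelimit-{slug}-limit")
        remaining = headers.get(f"anthropic-ratelimit-{slug}-remaining")
        if limit is None or remaining is None:
            continue  # this dimension was not reported
        if int(limit) > 0:
            usage[dimension] = 1 - int(remaining) / int(limit)
    return usage


def most_pressured(usage: Dict[str, float]) -> Optional[str]:
    """Name the dimension closest to exhaustion, or None if nothing was reported."""
    return max(usage, key=usage.get) if usage else None


# Illustrative header values (not captured from a real response).
sample_headers = {
    "anthropic-ratelimit-requests-limit": "1000",
    "anthropic-ratelimit-requests-remaining": "880",
    "anthropic-ratelimit-tokens-limit": "400000",
    "anthropic-ratelimit-tokens-remaining": "52000",
    "anthropic-ratelimit-input-tokens-limit": "300000",
    "anthropic-ratelimit-input-tokens-remaining": "180000",
    "anthropic-ratelimit-output-tokens-limit": "100000",
    "anthropic-ratelimit-output-tokens-remaining": "35000",
}

usage = quota_utilization(sample_headers)
for dimension, used in sorted(usage.items(), key=lambda kv: -kv[1]):
    print(f"{dimension:14s} {used:.0%} used")
print("most pressured:", most_pressured(usage))
```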

Resources:

cost://overview — 7-day snapshot
cost://forecast — 30-day projection
cost://anomalies — recent flagged anomalies
cost://quota (v1.1) — current quota-state snapshot

Prompts:

diagnose-cost-spike — walk a recent spike to its root cause + corrective action
weekly-cost-digest — 200-word weekly cost digest
diagnose-quota-pressure (v1.1) — walk current quota + burn rate, recommend a specific throttle action

Quickstart

Install

pip install openclaw-cost-tracker-mcp

Configure for Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "openclaw-cost": {
      "command": "python",
      "args": ["-m", "openclaw_cost_tracker_mcp"],
      "env": {
        "OPENCLAW_COST_BACKEND": "mock"
      }
    }
  }
}

Backends

Backend	Status	Description
`mock`	✅ v1.0	Sample data with deliberate anomalies + routing opportunities for protocol verification. v1.1: includes near-exhausted token-quota snapshot + recent loop-burst entries so `predict_429_in_window` returns CRITICAL out of the box
`openclaw-jsonl`	✅ v1.0	Parses OpenClaw's native cost-log JSONL files (default `~/.openclaw/cost-logs/`, configurable via `OPENCLAW_COST_LOGS`). v1.1: also parses optional `quota-snapshots.jsonl` in the same dir (one record per provider response, schema in `backends/openclaw_jsonl.py` docstring)
`provider-direct`	⏳ v1.2	Reads Anthropic + OpenAI billing APIs directly (no log shim required)
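
Switching from `mock` to `openclaw-jsonl` only changes the `env` block of the Claude Desktop entry. The small script below prints such an entry; it assumes `OPENCLAW_COST_LOGS` takes a directory path, as the table describes, and `build_server_entry` is a hypothetical helper, not part of the package.

```python
import json
import os


def build_server_entry(backend: str, logs_dir: str = "") -> dict:
    """Build the mcpServers entry shown in the Quickstart for a given backend."""
    env = {"OPENCLAW_COST_BACKEND": backend}
    if logs_dir:
        # Expand "~" here; whether the server expands it is not stated above.
        env["OPENCLAW_COST_LOGS"] = os.path.expanduser(logs_dir)
    return {
        "command": "python",
        "args": ["-m", "openclaw_cost_tracker_mcp"],
        "env": env,
    }


config = {
    "mcpServers": {
        "openclaw-cost": build_server_entry("openclaw-jsonl",
                                            "~/.openclaw/cost-logs/"),
    }
}
print(json.dumps(config, indent=2))
```

Paste the printed JSON into `claude_desktop_config.json`, merging it with any existing `mcpServers` entries.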

JSONL log format

Each line is one JSON record:

{"request_id":"req-abc123","timestamp":"2026-05-04T12:34:56Z","provider":"anthropic","model":"claude-sonnet-4","agent_id":"data-extraction-agent","skill_id":"extract-structured-data","prompt_tokens":8500,"completion_tokens":600,"total_tokens":9100,"cost_usd":0.0345,"duration_ms":4823}
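
Because every line is a standalone JSON object, records can be parsed one at a time and rolled up with `agent_id` intact. The following sketch shows one way to compute a per-agent breakdown (total, request count, average cost per request, share of total); the function name and the second sample record are illustrative assumptions.

```python
import json
from collections import defaultdict


def costs_by_agent(lines):
    """Aggregate JSONL cost records into a per-agent breakdown."""
    totals = defaultdict(lambda: {"cost_usd": 0.0, "requests": 0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        bucket = totals[record["agent_id"]]
        bucket["cost_usd"] += record["cost_usd"]
        bucket["requests"] += 1

    grand_total = sum(b["cost_usd"] for b in totals.values())
    return {
        agent: {
            **bucket,
            "avg_cost_usd": bucket["cost_usd"] / bucket["requests"],
            "share": bucket["cost_usd"] / grand_total if grand_total else 0.0,
        }
        for agent, bucket in totals.items()
    }


sample_lines = [
    '{"request_id":"req-abc123","timestamp":"2026-05-04T12:34:56Z",'
    '"provider":"anthropic","model":"claude-sonnet-4",'
    '"agent_id":"data-extraction-agent","prompt_tokens":8500,'
    '"completion_tokens":600,"cost_usd":0.0345}',
    '{"request_id":"req-def456","timestamp":"2026-05-04T12:35:10Z",'
    '"provider":"openai","model":"example-model","agent_id":"chat-bot",'
    '"prompt_tokens":900,"completion_tokens":150,"cost_usd":0.0115}',
]

for agent, stats in costs_by_agent(sample_lines).items():
    print(f"{agent}: ${stats['cost_usd']:.4f} ({stats['share']:.0%})")
```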

If your OpenClaw deployment doesn't emit this format, wrap your provider calls with a small logging shim — sample shim in examples/.
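
For orientation, a shim of this kind can be very small: after each provider call, append one JSON line carrying the schema fields. The sketch below is not the sample shipped in examples/; `log_cost_record` is a hypothetical helper, the log file name is arbitrary, and computing `cost_usd` from your provider's pricing is left to the caller.

```python
import json
import tempfile
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_cost_record(log_file, *, provider: str, model: str, agent_id: str,
                    prompt_tokens: int, completion_tokens: int,
                    cost_usd: float, request_id: Optional[str] = None) -> dict:
    """Append one cost record as a JSON line and return the record."""
    record = {
        "request_id": request_id or f"req-{uuid.uuid4().hex[:12]}",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "provider": provider,
        "model": model,
        "agent_id": agent_id,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        "cost_usd": cost_usd,
    }
    path = Path(log_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


# Demo against a temporary directory so nothing is written to a real log dir.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "cost-log.jsonl"  # file name is illustrative
    log_cost_record(log_path, provider="anthropic", model="claude-sonnet-4",
                    agent_id="data-extraction-agent", prompt_tokens=8500,
                    completion_tokens=600, cost_usd=0.0345)
    print(log_path.read_text(encoding="utf-8").strip())
```

In a real deployment you would call the helper right after each provider response, passing the token counts from that response's usage data.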

Roadmap

Version	Scope	Status
v1.0	mock + openclaw-jsonl backends, 7 tools / 3 resources / 2 prompts, anomaly detection + routing + forecast, GitHub Actions CI matrix, PyPI Trusted Publishing	✅
v1.1	Quota-window awareness — `QuotaSnapshot` data model, `get_latest_quota_state()` backend method, 3 new tools (`current_quota_usage`, `predict_429_in_window`, `recommend_throttle_target`), `cost://quota` resource, `diagnose-quota-pressure` prompt. Reads `anthropic-ratelimit-*` headers (or equivalents) and projects per-dimension 429 ETA from recent burn rate. Folded in from research-pass-2 P01 candidate after incumbent validation against Lava + AgentShield + Nornr	✅
v1.2	`provider-direct` backend (Anthropic + OpenAI billing API integrations); backend federation; budget alerts + threshold breach detection; `/loop` overnight-burn detector (folded from P05 watch-list candidate)	⏳
v1.x	Per-channel cost attribution; webhook emitter for budget + quota alerts	⏳

Need this adapted to your stack?

If your AI deployment doesn't use OpenClaw's cost-log format — different agent harness, custom logging, AWS Bedrock metering, vendor billing API — and you want the same attribution + anomaly + routing visibility, that's a Custom MCP Build engagement.

Tier	Scope	Investment	Timeline
Simple	Single backend adapter for your existing cost-data source	$8,000–$10,000	1–2 weeks
Standard	Custom backend + custom anomaly rules + integration with your alerting	$15,000–$20,000	2–4 weeks
Complex	Multi-backend federation + budget enforcement + custom routing logic	$25,000–$35,000	4–8 weeks

To engage:

Email temur@pixelette.tech with subject Custom MCP Build inquiry
Include: a 1-paragraph description of your stack + which tier you're considering
Expect a reply within 2 business days with a 30-min discovery call slot

This server is part of a production-AI infrastructure MCP suite — companion to silentwatch-mcp (cron silent-failure detection) and openclaw-health-mcp (deployment health). Install all three for full operational visibility.

Production AI audits

If you're running production AI and want an outside practitioner to score readiness, find the failure patterns already present (cost overruns being one of the most common), and write the corrective-action plan:

Tier	Scope	Investment	Timeline
Audit Lite	One system, top-5 findings, written report	$1,500	1 week
Audit Standard	Full audit, all 14 patterns, 5 Cs findings, 90-day follow-up	$3,000	2–3 weeks
Audit + Workshop	Standard audit + 2-day team workshop + first monthly audit included	$7,500	3–4 weeks

Same email channel: temur@pixelette.tech with subject AI audit inquiry.

Contributing

PRs welcome. Backends are pluggable — see src/openclaw_cost_tracker_mcp/backends/ for the contract.

To add a new backend:

Subclass CostBackend in backends/<your_backend>.py
Implement get_entries() (cost telemetry); optionally override get_latest_quota_state() if your data source carries rate-limit headers (the default returns None, for graceful degradation)
Register in backends/__init__.py
Add tests in tests/test_backend_<your_backend>.py
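
The steps above might look roughly like the sketch below. To stay self-contained it defines a stand-in `CostBackend` rather than importing the real one, so the method signatures and return types here are assumptions; check src/openclaw_cost_tracker_mcp/backends/ for the actual contract. `CsvCostBackend` is a hypothetical example backend.

```python
import csv
import io
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class CostBackend(ABC):
    """Stand-in for the project's base class; the real signatures may differ."""

    @abstractmethod
    def get_entries(self) -> List[Dict[str, Any]]:
        """Return cost-log records."""

    def get_latest_quota_state(self) -> Optional[Dict[str, Any]]:
        # Default: no quota data available.
        return None


class CsvCostBackend(CostBackend):
    """Hypothetical backend that reads cost records from CSV text."""

    INT_FIELDS = ("prompt_tokens", "completion_tokens")

    def __init__(self, csv_text: str) -> None:
        self._csv_text = csv_text

    def get_entries(self) -> List[Dict[str, Any]]:
        entries = []
        for row in csv.DictReader(io.StringIO(self._csv_text)):
            entry: Dict[str, Any] = dict(row)
            for name in self.INT_FIELDS:
                entry[name] = int(row[name])  # CSV values arrive as strings
            entry["cost_usd"] = float(row["cost_usd"])
            entries.append(entry)
        return entries


SAMPLE_CSV = (
    "request_id,timestamp,provider,model,agent_id,"
    "prompt_tokens,completion_tokens,cost_usd\n"
    "req-abc123,2026-05-04T12:34:56Z,anthropic,claude-sonnet-4,"
    "data-extraction-agent,8500,600,0.0345\n"
)

backend = CsvCostBackend(SAMPLE_CSV)
print(backend.get_entries()[0]["agent_id"])
print(backend.get_latest_quota_state())
```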

Bug reports + feature requests: open a GitHub issue.

License

MIT — see LICENSE.

Related

Production-AI MCP Suite (Gumroad bundle): this server plus 4 others in one curated bundle with a decision tree, day-one drill, and Custom MCP Build CTA. $99, or $49 with LAUNCH50 for the first 30 days.
silentwatch-mcp — cron silent-failure detection
openclaw-health-mcp — deployment health
openclaw-skill-vetter-mcp — ClawHub skill security vetting
openclaw-upgrade-orchestrator-mcp — read-only upgrade advisor + provider-side regression detection (v1.2)
openclaw-output-vetter-mcp — agent claim verification (inline grounding-check + swallowed-exception scanner + multi-turn transcript review)
AI Production Discipline Framework — Notion template, $29 — the methodology these MCP tools implement (cost overruns are pattern P3.x in the catalog)
SPEC.md — full server design

Built by Temur Khan — independent practitioner on production AI systems. Contact: temur@pixelette.tech

Project details

Release history

Version	Released
1.1.7	May 9, 2026
1.1.6	May 8, 2026
1.1.5	May 8, 2026
1.1.4	May 8, 2026
1.1.3 (this version)	May 6, 2026
1.1.2	May 5, 2026
1.1.1	May 5, 2026
1.1.0	May 4, 2026
1.0.0	May 4, 2026

Download files

Source Distribution

openclaw_cost_tracker_mcp-1.1.3.tar.gz (47.3 kB)

Uploaded May 6, 2026 Source

Built Distribution

openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl (34.1 kB)

Uploaded May 6, 2026 Python 3

File details

Details for the file openclaw_cost_tracker_mcp-1.1.3.tar.gz.

File metadata

Download URL: openclaw_cost_tracker_mcp-1.1.3.tar.gz
Upload date: May 6, 2026
Size: 47.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openclaw_cost_tracker_mcp-1.1.3.tar.gz
Algorithm	Hash digest
SHA256	`0708dcde69e0b986d676394ddf3e35c9f46eba24340bc9d81f8a95fc31194ab7`
MD5	`d224a7e5a65de2031adb36c37a91dfb2`
BLAKE2b-256	`ab5acd881c53a8740c8936d55e27e8890f566e063edf4a8e7c8803fca79b45ec`

Provenance

The following attestation bundles were made for openclaw_cost_tracker_mcp-1.1.3.tar.gz:

Publisher: release.yml on temurkhan13/openclaw-cost-tracker-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openclaw_cost_tracker_mcp-1.1.3.tar.gz
- Subject digest: 0708dcde69e0b986d676394ddf3e35c9f46eba24340bc9d81f8a95fc31194ab7
- Sigstore transparency entry: 1449324533
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: temurkhan13/openclaw-cost-tracker-mcp@7dd769f9b566de3e3e9d5b3fc0a9a70726f7e801
- Branch / Tag: refs/tags/v1.1.3
- Owner: https://github.com/temurkhan13
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7dd769f9b566de3e3e9d5b3fc0a9a70726f7e801
- Trigger Event: push

File details

Details for the file openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl.

File metadata

Download URL: openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl
Upload date: May 6, 2026
Size: 34.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e457c0ba7d76094e40db37cf9a503068db6b1b7c960cbd14fe383353350e707d`
MD5	`2464a90737e6ee213d14e9a06741c796`
BLAKE2b-256	`9dd97b1e77abc2fda89dde770eb70a4513ceae6df483c59d4b86ac44f6ca4e39`

Provenance

The following attestation bundles were made for openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl:

Publisher: release.yml on temurkhan13/openclaw-cost-tracker-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: openclaw_cost_tracker_mcp-1.1.3-py3-none-any.whl
- Subject digest: e457c0ba7d76094e40db37cf9a503068db6b1b7c960cbd14fe383353350e707d
- Sigstore transparency entry: 1449324624
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: temurkhan13/openclaw-cost-tracker-mcp@7dd769f9b566de3e3e9d5b3fc0a9a70726f7e801
- Branch / Tag: refs/tags/v1.1.3
- Owner: https://github.com/temurkhan13
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@7dd769f9b566de3e3e9d5b3fc0a9a70726f7e801
- Trigger Event: push
