Scan codebases for LLM cost-waste anti-patterns. Find retry storms, missing prompt caching, unbounded conversation history, agent loops without iteration caps, and more — before you ship.

coffer-cli

Scan your code for LLM cost-waste anti-patterns before you ship.

coffer-cli is a static scanner for production AI code. It catches the mistakes that show up at month-end on your OpenAI / Anthropic bill — retry storms, missing prompt caching, unbounded conversation history, agent loops without iteration caps, SDK inits without timeouts, and more.

It is intentionally not a magic dollar estimator. Static analysis cannot know call volume; we leave that to live tracking. Instead, we surface structural risks that a careful reviewer would catch — but faster, in CI, on every commit.
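A small sketch of why a dollar figure cannot be read off the source code alone. The function name and the per-million-token prices are placeholders invented for illustration; they are not real pricing and not part of coffer-cli.

```python
def monthly_cost(calls_per_day: int,
                 input_tokens: int,
                 output_tokens: int,
                 usd_per_m_input: float,
                 usd_per_m_output: float,
                 days: int = 30) -> float:
    """Estimate a monthly bill from call volume, token counts, and prices."""
    # Token cost of a single call, scaled by price per million tokens.
    cost_units = input_tokens * usd_per_m_input + output_tokens * usd_per_m_output
    return calls_per_day * days * cost_units / 1_000_000


# Identical code path, identical prompt sizes, placeholder prices.
# Only the call volume differs, and it is not visible in the source.
low_traffic = monthly_cost(100, 1000, 500, 1.0, 2.0)
high_traffic = monthly_cost(100_000, 1000, 500, 1.0, 2.0)
print(low_traffic, high_traffic)
```

The two results differ by a factor of 1,000 even though a scanner would see exactly the same code in both cases, which is why findings carry a severity rather than a number.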

pipx install coffer-cli

coffer scan ./my-app
coffer scan ./my-app --json     # for CI / Claude Code skill consumption
coffer prices                    # current model pricing table
coffer compare gpt-4o gpt-4o-mini
coffer install-skill             # install the Claude Code skill (see below)

What it catches (v0.1.0)

Detectors are organized by the four levers that drive LLM cost:

Lever | Detector | Severity
A: input tokens | dynamic_before_static_cache_break: f-string interpolation in SYSTEM_PROMPT defeats OpenAI auto-cache and Anthropic cache_control | 🚨 high
A: input tokens | unbounded_conversation_history: messages.append(...) without truncation or summarization | 🟡 med
A: input tokens | uncached_large_prompt: ≥2,000-char hardcoded prompt without nearby cache_control | 🟡 med
B: output tokens | missing_max_tokens: LLM call without a max_tokens cap | 🟡 med
B: output tokens | reasoning_effort_high_default: reasoning_effort="high" literal (up to ~20× extra reasoning tokens on trivial tasks) | 🟡 med
D: number of calls | llm_in_for_loop: N× cost; gather is a latency fix, not a cost fix | 🟡 med
D: number of calls | agent_loop_no_max_iter: while True: containing an LLM call without an iteration cap (the $47K-incident pattern) | 🚨 high
D: number of calls | temperature_nonzero_with_cache_hint: cache layer nearby but temperature > 0 silently breaks it | 🟡 med
E: architecture | retry_loop_no_backoff: retry storm amplifies the bill 10× | 🚨 high
E: architecture | sdk_init_no_timeout: default 600s lets a hung provider block your thread | 🚨 high

Each finding includes a concrete fix and explains the cost angle explicitly (we do not conflate latency fixes with cost fixes).
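As an illustration of the kind of fix two of the high-severity detectors point toward (agent_loop_no_max_iter and retry_loop_no_backoff), here is a minimal sketch. It is not the text coffer-cli emits. The names run_agent, call_with_backoff, and TransientError are hypothetical, and the step and call arguments stand in for whatever LLM client call your code makes.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a provider's retryable error type."""


def run_agent(step: Callable[[int], bool], max_iterations: int = 10) -> int:
    """Run an agent loop with a hard cap instead of `while True:`.

    `step` represents one LLM call and returns True when the agent is done.
    Returns the number of calls made.
    """
    for iteration in range(1, max_iterations + 1):
        if step(iteration):
            return iteration
    # Cap reached: stop spending rather than loop indefinitely.
    return max_iterations


def call_with_backoff(call: Callable[[], T],
                      max_attempts: int = 4,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call` a bounded number of times with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # Give up: the retry budget is spent.
            # The delay doubles on each attempt; jitter spreads out retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Both helpers bound the number of billable calls: the loop cap limits iterations per task, and the attempt limit keeps a provider outage from multiplying requests.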

Use with Claude Code (the skill)

The coffer-cost-review Claude Code skill in skills/ combines this scanner with Claude's semantic judgment. In Claude Code, ask "review my LLM costs" and the skill will:

  1. Run coffer scan <path> --json for deterministic findings
  2. Read each flagged file in context to filter false positives
  3. Add semantic-only checks the scanner cannot do (frontier model used for trivial tasks, free-form output where structured output would work, public endpoints without rate limits, ...)
  4. Produce a severity-ranked review with concrete code-diff fixes
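The ranking in the last step can be sketched in a few lines. This README does not document the schema of the --json output, so the field names below (detector, severity, file) and the sample payload are assumptions made purely for illustration; check the real output before relying on them.

```python
import json

# Assumed severity labels; "med" and "medium" are both accepted because
# the exact spelling in the JSON output is not documented here.
SEVERITY_ORDER = {"high": 0, "medium": 1, "med": 1, "low": 2}


def rank_findings(findings: list[dict]) -> list[dict]:
    """Sort findings so the most severe come first (stable within a level)."""
    return sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
    )


# Hypothetical payload shaped like a list of findings.
raw = """[
  {"detector": "missing_max_tokens", "severity": "med", "file": "app/chat.py"},
  {"detector": "retry_loop_no_backoff", "severity": "high", "file": "app/client.py"},
  {"detector": "llm_in_for_loop", "severity": "med", "file": "app/batch.py"}
]"""

ranked = rank_findings(json.loads(raw))
for finding in ranked:
    print(finding["severity"], finding["detector"], finding["file"])
```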

Install (bundled with the CLI):

pipx install coffer-cli       # if you don't have it yet
coffer install-skill          # copies the skill to ~/.claude/skills/

Then open Claude Code and ask "review my LLM costs".

To uninstall: coffer uninstall-skill.

What it deliberately does NOT do

  • No invented dollar estimates. Call volume is unknowable from static code. We report severity, not numbers.
  • No proxy mode. Your LLM calls go directly to your providers.
  • No auto-rewrites. Suggestions only; you stay in control.

For live production cost tracking with per-feature and per-user attribution (the part static analysis genuinely can't do), see Cofferwise.

Exit codes

  • 0 — clean, or only medium/low findings
  • 1 — at least one high finding (use for CI gating)
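A minimal sketch of CI gating on these exit codes from a Python wrapper. The function names are hypothetical, and treating any code other than 0 or 1 as a scanner error is an assumption, since only 0 and 1 are documented above.

```python
import subprocess


def interpret_exit_code(code: int) -> str:
    """Map a coffer scan exit code to a CI decision."""
    if code == 0:
        return "pass"   # clean, or only medium/low findings
    if code == 1:
        return "block"  # at least one high finding
    # Assumption: anything else means the scan itself did not complete.
    return "error"


def scan_and_decide(path: str) -> str:
    """Run `coffer scan <path>` and return the CI decision."""
    completed = subprocess.run(["coffer", "scan", path])
    return interpret_exit_code(completed.returncode)
```

A pipeline step could call scan_and_decide("./my-app") and fail the build on anything other than "pass".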

Development

git clone https://github.com/neal-c611/coffer-cli
cd coffer-cli
uv sync --extra dev
uv run pytest

Detection lives in src/coffer_cli/patterns.py (regex-based, single-file scope); findings are rendered by src/coffer_cli/cli.py (typer + rich).
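To give a feel for what a regex-based, single-file detector can look like, here is a simplified sketch. It is not the code in patterns.py: the Finding class, the regexes, and scan_source are all hypothetical, and the heuristics are deliberately crude.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    detector: str
    line: int
    severity: str


# Hypothetical heuristics, chosen only for this sketch.
WHILE_TRUE = re.compile(r"^\s*while\s+True\s*:")
LLM_CALL = re.compile(r"\.(create|generate)\(")
CAP_HINT = re.compile(r"max_iter|max_steps|range\(")


def _indent(line: str) -> int:
    return len(line) - len(line.lstrip())


def scan_source(source: str) -> list[Finding]:
    """Flag `while True:` loops whose body has an LLM call and no visible cap."""
    findings = []
    lines = source.splitlines()
    for number, line in enumerate(lines, start=1):
        if not WHILE_TRUE.match(line):
            continue
        # Collect the loop body: following lines indented deeper than the loop.
        body = []
        for following in lines[number:]:
            if following.strip() and _indent(following) <= _indent(line):
                break
            body.append(following)
        text = "\n".join(body)
        if LLM_CALL.search(text) and not CAP_HINT.search(text):
            findings.append(Finding("agent_loop_no_max_iter", number, "high"))
    return findings


POSITIVE = """\
while True:
    reply = client.chat.completions.create(model=m, messages=msgs)
    if done(reply):
        break
"""

NEGATIVE = """\
steps = 0
while True:
    steps += 1
    if steps > max_steps:
        break
    reply = client.chat.completions.create(model=m, messages=msgs)
"""

print(scan_source(POSITIVE))
print(scan_source(NEGATIVE))
```

The first sample is flagged at line 1 and the second is not, because the body mentions max_steps. A text-level heuristic like this is cheap to run on every commit but cannot follow calls across files, which is the trade-off of single-file scope.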

Contributions welcome. New detectors should:

  • Default to medium severity; reserve high for patterns that are demonstrably cost-amplifying in production
  • Include a test in tests/test_patterns.py showing both a positive case AND a negative case (the negative case is what keeps the false-positive rate low)
  • Propose a cost fix, not a latency fix. Wrapping things in asyncio.gather does not reduce the bill.
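The last point can be checked with a counting stub. In this sketch CountingClient is a hypothetical stand-in for an LLM client, and deduplication is just one example of a change that lowers the number of calls; it is not a fix prescribed by coffer-cli.

```python
import asyncio


class CountingClient:
    """Hypothetical LLM client stub that counts billable calls."""

    def __init__(self) -> None:
        self.calls = 0

    async def complete(self, prompt: str) -> str:
        self.calls += 1
        await asyncio.sleep(0)  # yield control, as a network call would
        return prompt.upper()


async def sequential(client: CountingClient, prompts: list[str]) -> list[str]:
    return [await client.complete(p) for p in prompts]


async def concurrent(client: CountingClient, prompts: list[str]) -> list[str]:
    # Faster wall-clock time, same number of calls.
    return list(await asyncio.gather(*(client.complete(p) for p in prompts)))


async def deduplicated(client: CountingClient, prompts: list[str]) -> list[str]:
    # A cost fix: call once per unique prompt, then reuse the answers.
    unique = list(dict.fromkeys(prompts))
    answers = await asyncio.gather(*(client.complete(p) for p in unique))
    lookup = dict(zip(unique, answers))
    return [lookup[p] for p in prompts]


def count_calls(strategy, prompts: list[str]) -> int:
    client = CountingClient()
    asyncio.run(strategy(client, prompts))
    return client.calls


prompts = ["a", "b", "a"]
print(count_calls(sequential, prompts))
print(count_calls(concurrent, prompts))
print(count_calls(deduplicated, prompts))
```

The sequential and concurrent versions make the same number of calls; only the deduplicated version makes fewer.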

License

Apache 2.0.
