
llmdoctor

Find LLM cost leaks before your bill does.

llmdoctor is a static analyzer for Python code that calls Anthropic or OpenAI. It catches the patterns that quietly burn money in production:

  • Prompt-cache placement bugs that invalidate the cache on every call, sketched after this list (the bug claude-mem itself shipped; see their issue #1890)
  • Missing max_tokens caps where output tokens cost 3–10× input
  • Premium models (Opus, GPT-5) used for tiny prompts where a cheaper model would produce indistinguishable output
  • Large static system prompts left uncached
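
Here is the cache-placement bug condensed to a minimal sketch. The anthropic SDK calls are real; the model ID, file name, and variable names are illustrative assumptions:

import datetime
import anthropic

client = anthropic.Anthropic()
STATIC_INSTRUCTIONS = open("instructions.txt").read()  # large, unchanging

def ask(user_query: str) -> str:
    # BUG (TS001): the dynamic date block sits BEFORE the block carrying
    # cache_control, so the cacheable prefix changes on every call and the
    # prompt cache never hits; full input price is paid every time.
    response = client.messages.create(
        model="claude-opus-4-20250514",
        max_tokens=1024,
        system=[
            {"type": "text", "text": f"Today is {datetime.date.today()}."},
            {
                "type": "text",
                "text": STATIC_INSTRUCTIONS,
                "cache_control": {"type": "ephemeral"},
            },
        ],
        messages=[{"role": "user", "content": user_query}],
    )
    return response.content[0].text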

It's an advisor, not a runtime patcher. It reads your code, prints findings with rough cost-impact estimates, and exits.

Install

pip install llmdoctor
# or no-install:
pipx run llmdoctor doctor .

Usage

llmdoctor doctor .              # scan current directory
llmdoctor doctor src/agent.py   # scan one file
llmdoctor doctor . --json       # for CI / piping into other tools
llmdoctor doctor . --fail-on HIGH   # exit 1 if any HIGH-severity issue
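
For simple CI gating, --fail-on HIGH and the exit code are enough. If you want to consume --json programmatically, a sketch follows; the field names ("findings", "severity", "code", "file") are assumptions, not a documented schema, so check the actual output first:

import json
import subprocess

result = subprocess.run(
    ["llmdoctor", "doctor", ".", "--json"],
    capture_output=True,
    text=True,
)
report = json.loads(result.stdout)

# Field names below are hypothetical; inspect the real --json output.
high = [f for f in report.get("findings", []) if f.get("severity") == "HIGH"]
for finding in high:
    print(f"{finding.get('code')}: {finding.get('file')}")
raise SystemExit(1 if high else 0)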

What it looks like

╭─ llmdoctor doctor ─────────────────────────────────────────────╮
│ Scanned 14 file(s) under src/                                    │
│ Found 3 issue(s)  ·  2 HIGH · 1 MEDIUM                           │
│ Estimated potential savings: ~$340/month  (rough estimate)       │
╰──────────────────────────────────────────────────────────────────╯

╭─ [HIGH] TS001 Dynamic content before cache_control invalidates the cache ─╮
│   file:  src/agent.py:42                                                  │
│   code:  {"type": "text", "text": f"User said: {user_query}"},            │
│   why:   System block at index 0 contains dynamic content but appears     │
│          BEFORE the first block with cache_control. ...                   │
│   fix:   Move static content BEFORE the cache_control marker. Move        │
│          dynamic content into the messages array.                         │
│   estimate: ~$135.00/month  (assuming: 3000-token system prompt, 100      │
│             calls/day, 30-day month, 0.1× cache-read pricing)             │
│   docs:  https://docs.anthropic.com/.../prompt-caching                    │
╰───────────────────────────────────────────────────────────────────────────╯
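
Applying the suggested fix to the sketch above (same assumptions, reusing the names defined there): static content goes first, under the cache marker, and dynamic content moves into the messages array.

    # FIX: the static prefix is now stable and cacheable; dynamic content
    # rides in messages, where it no longer touches the cached prefix.
    response = client.messages.create(
        model="claude-opus-4-20250514",
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": STATIC_INSTRUCTIONS,
                "cache_control": {"type": "ephemeral"},
            },
        ],
        messages=[
            {
                "role": "user",
                "content": f"Today is {datetime.date.today()}.\n\n{user_query}",
            }
        ],
    )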

Checks shipped in 0.1.0

Code   Severity  What it catches
TS001  HIGH      Dynamic content placed before a cache_control marker (silently invalidates the prompt cache).
TS003  MEDIUM    Large static system prompt without cache_control (missed cache opportunity).
TS010  HIGH      OpenAI call with no max_tokens / max_completion_tokens (output cost unbounded; see the sketch after this table).
TS011  MEDIUM    max_tokens set suspiciously high (likely a copy-paste default that enables runaway completions).
TS020  MEDIUM    Premium model (Opus, GPT-5, GPT-4-Turbo, GPT-4o) on a tiny static prompt where a cheaper tier would likely match quality.
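
For TS010, the flagged and fixed shapes look like this. The openai SDK calls are real; the model ID and prompt are assumptions:

from openai import OpenAI

client = OpenAI()
messages = [{"role": "user", "content": "Summarize this support ticket: ..."}]

# Flagged by TS010: no output cap, so one runaway completion can bill
# thousands of output tokens at the most expensive per-token rate.
uncapped = client.chat.completions.create(model="gpt-4o", messages=messages)

# Passes: output bounded to what the caller actually needs.
capped = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    max_completion_tokens=300,
)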

How cost estimates are calculated

Estimates are heuristic, not invoice predictions. Each issue prints its assumptions (e.g. "100 calls/day, 30-day month, 3000-token system prompt"). Treat the numbers as order-of-magnitude. The tool's value is the finding and the fix; the dollar number is the attention-grabber.
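
The TS001 estimate shown earlier reproduces as straightforward arithmetic. The $15/MTok input price below is an assumption (Opus-tier list price); the real numbers live in pricing.py:

# Mirroring the printed assumptions: 3000-token system prompt,
# 100 calls/day, 30-day month, 0.1x cache-read pricing.
system_tokens = 3_000
calls_per_month = 100 * 30
input_price_per_tok = 15.00 / 1_000_000  # assumed Opus-tier $/token

monthly_tokens = system_tokens * calls_per_month      # 9,000,000 tokens
uncached = monthly_tokens * input_price_per_tok       # $135.00
cached = uncached * 0.1                               # $13.50 at 0.1x
print(f"uncached ${uncached:.2f}/mo vs cached ${cached:.2f}/mo")

Under these assumptions the headline ~$135/month is the uncached spend; the realizable saving is the difference, roughly $121.50/month.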

Pricing table is in src/llmdoctor/pricing.py — verified 2026-04-30. Submit a PR if a model is missing or the price moves.
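
For orientation, a pricing entry might look like the following. The dict shape and field names are hypothetical (the real structure is whatever pricing.py defines), and the dollar figures are public list prices that may have moved:

# Hypothetical shape; see src/llmdoctor/pricing.py for the real structure.
PRICE_PER_MTOK = {
    "claude-opus-4": {"input": 15.00, "output": 75.00, "cache_read": 1.50},
    "gpt-4o":        {"input": 2.50,  "output": 10.00},
}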

What this tool deliberately does NOT do (yet)

  • It does not patch your code. It reports, you fix.
  • It does not run your code. Static analysis only — safe on closed-source repos.
  • It does not measure live traffic. That's a different product (the SDK, coming next). The doctor is the first wedge.
  • It does not check JavaScript / TypeScript. Python only in 0.1.0.
  • It does not flag retry-storm patterns yet (planned: TS030).
  • It does not detect tool-definition duplication across calls (planned: TS040).

If your codebase doesn't import anthropic or openai directly (e.g. you use LangChain, LiteLLM, or hit the HTTP API), the doctor will produce no findings. Adapter checks for those frameworks are a next step.

Development

git clone https://github.com/Shahriyar-Khan27/llm-doctor
cd llm-doctor
pip install -e ".[dev]"
pytest

License

MIT.

Why we built this

We were designing a full LLM-cost SDK and went deep on prior art (summary doc). The single highest-leverage finding: prompt-cache placement bugs are everywhere, mostly invisible, and cost serious money. Even a competent tool like claude-mem shipped one to prod. Static analysis catches the whole class in seconds. So that's what shipped first.
