Track, visualize, and optimize LLM API spending. Two lines of code, zero config.

LLM Cost Profiler

Find the money you're burning on LLM APIs. Two lines of code, zero config, instant visibility.

LLM Cost Report — Last 7 Days
========================================
Total: $847.32 | 2.4M tokens | 12,847 calls

By Feature:
  summarizer         $412.80  (48.7%)  ████████████████████
  chatbot            $203.11  (24.0%)  ████████████
  classifier          $89.40  (10.5%)  █████
  content_gen         $78.22   (9.2%)  ████
  extraction          $41.50   (4.9%)  ██
  untagged            $22.29   (2.6%)  █

Warnings:
  ⚠ summarizer: 34% of calls are retries ($140.15 wasted)
  ⚠ chatbot: avg 3,200 input tokens but only 180 output tokens (context bloat)
  ⚠ classifier: using gpt-4o but output is always <10 tokens (cheaper model works)

I ran this on my own project and found $1,240/month in waste: duplicate calls that should have been cached, an expensive model doing a job a cheaper one handles fine, and retry loops burning money on failures. All of it was fixable in an afternoon.

Setup — 2 lines, 30 seconds

pip install llm-spend-profiler

from llm_cost_profiler import wrap
from openai import OpenAI

client = wrap(OpenAI())  # that's it. everything is tracked now.

Your code works exactly as before. Every API call is logged in the background to a local SQLite database (~/.llmcost/data.db). If logging fails for any reason, it fails silently, so your app is never affected.

Works with Anthropic too:

from anthropic import Anthropic
client = wrap(Anthropic())

And async clients:

from openai import AsyncOpenAI
client = wrap(AsyncOpenAI())
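
To make the "works exactly as before, fails silently" behaviour concrete, here is a minimal sketch of a recording proxy. It is not the library's implementation: `RecordingProxy`, `sink`, and the fake client classes are hypothetical names invented for this illustration, and only synchronous calls are handled.

```python
import time


class RecordingProxy:
    """Hypothetical stand-in for what wrap() might return; not the library's class."""

    def __init__(self, target, sink, path=""):
        self._target = target
        self._sink = sink    # any callable that accepts one dict per call
        self._path = path    # dotted attribute path, e.g. "chat.completions.create"

    def __getattr__(self, name):
        # Only reached for attributes that are not defined on the proxy itself.
        attr = getattr(self._target, name)
        path = f"{self._path}.{name}" if self._path else name
        if callable(attr):
            def call(*args, **kwargs):
                start = time.perf_counter()
                result = attr(*args, **kwargs)  # SDK errors propagate unchanged
                try:
                    self._sink({
                        "method": path,
                        "model": kwargs.get("model"),
                        "seconds": time.perf_counter() - start,
                    })
                except Exception:
                    pass  # a logging failure must never reach the caller
                return result
            return call
        if hasattr(attr, "__dict__"):
            # Nested namespace such as client.chat.completions: keep proxying.
            return RecordingProxy(attr, self._sink, path)
        return attr


# Fake SDK client so the sketch runs without network access or API keys.
class _FakeCompletions:
    def create(self, model, messages):
        return {"model": model, "echo": messages[-1]["content"]}


class _FakeChat:
    completions = _FakeCompletions()


class FakeClient:
    chat = _FakeChat()


def broken_sink(record):
    raise RuntimeError("disk full")


records = []
client = RecordingProxy(FakeClient(), records.append)
reply = client.chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": "hi"}]
)

# Even with a sink that always raises, the call still returns normally.
safe_reply = RecordingProxy(FakeClient(), broken_sink).chat.completions.create(
    model="gpt-4o-mini", messages=[{"role": "user", "content": "hi"}]
)
```

The original object is never modified, which is the sense in which this pattern avoids monkey-patching: the proxy forwards every attribute lookup and only adds a record after a call has returned.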

What you get

`llmcost report` — Where your money goes

llmcost report           # last 7 days
llmcost report --days 30 # last 30 days

Shows total spend, breakdown by feature and model, and automatic warnings about retry waste, context bloat, and overpriced model usage.
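
The by-feature breakdown is essentially a grouped sum over the logged calls. The sketch below shows one way to compute it with the standard-library `sqlite3` module. The `calls` table, its columns, and the sample costs are assumptions made for this example; the real schema of ~/.llmcost/data.db is not documented here.

```python
import sqlite3


def spend_by_feature(conn):
    """Return [(feature, total_usd, share_percent)], largest spend first."""
    rows = conn.execute(
        """
        SELECT COALESCE(feature, 'untagged') AS feature, SUM(cost_usd) AS total
        FROM calls
        GROUP BY COALESCE(feature, 'untagged')
        ORDER BY total DESC
        """
    ).fetchall()
    grand_total = sum(total for _, total in rows)
    if grand_total == 0:
        return []
    return [(feature, total, 100 * total / grand_total) for feature, total in rows]


# Hypothetical schema and made-up costs, in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (feature TEXT, model TEXT, cost_usd REAL)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        ("summarizer", "gpt-4o", 3.00),
        ("summarizer", "gpt-4o", 1.00),
        ("chatbot", "gpt-4o", 2.50),
        (None, "gpt-4o-mini", 0.50),  # a call made outside any tag
    ],
)

breakdown = spend_by_feature(conn)
for feature, total, share in breakdown:
    print(f"{feature:<12} ${total:>6.2f}  ({share:4.1f}%)")
```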

`llmcost hotspots` — Which lines of code cost the most

Top Cost Hotspots:
  1. features/summarizer.py:47   summarize_doc()    $412.80/week   4,201 calls  ████████████████████
  2. api/chat.py:123             handle_message()   $203.11/week   3,892 calls  ██████████
  3. pipeline/classify.py:34     classify_text()     $89.40/week   2,847 calls  ████

Auto-detected from the Python call stack. No manual annotation needed.
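
A rough sketch of how call-site detection from the stack can work is shown below. It is an assumption about the approach, not the library's code: `find_call_site` and `skip_prefixes` are hypothetical names, and `sys._getframe` is a CPython implementation detail.

```python
import sys


def find_call_site(skip_prefixes=()):
    """Return (filename, lineno, function) for the nearest frame whose file
    does not start with any of skip_prefixes, or None if there is none."""
    frame = sys._getframe(1)  # start at whoever called find_call_site
    while frame is not None:
        filename = frame.f_code.co_filename
        if not any(filename.startswith(prefix) for prefix in skip_prefixes):
            return filename, frame.f_lineno, frame.f_code.co_name
        frame = frame.f_back  # move one level up the call stack
    return None


def summarize_doc():
    # In a real profiler this lookup would happen inside the wrapped client.
    return find_call_site()


site = summarize_doc()
print(site)
```

A profiler would pass the directories of its own package and of the SDKs as `skip_prefixes`, so that the first frame reported belongs to application code.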

`llmcost compare` — What changed

Week-over-Week Comparison:
  Total: $847.32 → was $623.10 (+36% ⚠)

  Biggest increases:
    summarizer: +$180 (+77%) — call volume doubled
    chatbot: +$44 (+28%) — avg tokens per call increased
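
The percentage in the comparison is a plain relative change. As a small worked example using the two totals from the sample output above (the helper name `week_over_week` is made up for this sketch):

```python
def week_over_week(current, previous):
    """Return (absolute_change, percent_change); percent is None if previous is 0."""
    delta = current - previous
    percent = None if previous == 0 else delta / previous * 100
    return delta, percent


delta, percent = week_over_week(847.32, 623.10)
print(f"{delta:+.2f} USD ({percent:+.0f}%)")
```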

`llmcost optimize` — What to fix and how much you'll save

LLM Cost Optimization Report
========================================
Current monthly spend (projected): $2,847
Potential savings found: $1,240/month (43.5%)

  #1 CACHE — classifier.py:34                        [SAVE $310/mo]
     85% of calls are exact duplicates (723 of 847/week)
     → Add @cache decorator
     Confidence: HIGH

  #2 RETRY FIX — content_gen.py:112                   [SAVE $180/mo]
     28% retry rate from JSON parse errors
     → Fix prompt to return raw JSON
     Confidence: HIGH

  #3 MODEL DOWNGRADE — classifier.py:34               [SAVE $71/mo]
     Output is always <10 tokens, one of 5 fixed labels
     → Switch gpt-4o to gpt-4o-mini
     Confidence: MEDIUM

  #4 CONTEXT BLOAT — chatbot.py:123                   [SAVE $155/mo]
     Avg 3,200 input tokens, growing over conversation
     → Truncate history to last 5 messages
     Confidence: MEDIUM

Five analyses: cache detection, retry waste, model downgrade suggestions, context bloat detection, batching opportunities.
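
As an illustration of the first analysis, cache detection can be approximated by hashing each request and counting repeats. This is a sketch under assumptions: the record layout (`model`, `messages`) and the helper names are hypothetical, and "duplicate" here means a call whose model and messages exactly match an earlier call.

```python
import hashlib
import json
from collections import Counter


def request_key(model, messages):
    """Stable fingerprint of a request: same model and messages, same key."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def duplicate_rate(calls):
    """Fraction of calls that repeat an earlier, identical request."""
    counts = Counter(request_key(c["model"], c["messages"]) for c in calls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - len(counts)) / total


def _call(text):
    return {"model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": f"Classify: {text}"}]}


# Made-up sample: three identical requests and one distinct request.
sample = [_call("hello"), _call("hello"), _call("hello"), _call("goodbye")]
rate = duplicate_rate(sample)
print(f"{rate:.0%} of calls are exact duplicates")
```

Storing only the fingerprint rather than the prompt text would let this kind of check run even when prompt storage is turned off, though whether the library does so is not stated.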

`llmcost dashboard` — Visual dashboard

llmcost dashboard  # opens http://127.0.0.1:8177

Dark-themed local web dashboard with:

Cost summary cards and feature treemap
Spend timeline chart (daily/hourly)
Model usage breakdown
Hotspots table
Optimization waterfall chart

Auto-refreshes every 30 seconds. Single HTML file, no npm, no build step.

Tag your calls

Group costs by feature, customer, environment — whatever matters to you:

from llm_cost_profiler import tag

with tag(feature="summarizer", customer="acme_corp"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this..."}]
    )

Tags nest. Inner tags merge with outer tags.
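
Nesting and merging can be sketched with the standard-library `contextvars` module. This is an assumption about how such a context manager could be built, not the library's code; `tag_sketch` and `_current_tags` are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# The default dict is shared, so it is only ever read, never mutated.
_current_tags = contextvars.ContextVar("current_tags", default={})


@contextmanager
def tag_sketch(**tags):
    """Merge tags into the active set for the duration of the with-block."""
    merged = {**_current_tags.get(), **tags}  # inner keys override outer keys
    token = _current_tags.set(merged)
    try:
        yield merged
    finally:
        _current_tags.reset(token)  # restore the enclosing block's tags


with tag_sketch(feature="summarizer", environment="prod"):
    with tag_sketch(customer="acme_corp", environment="staging") as inner:
        print(inner)
    outer_after = dict(_current_tags.get())

after_all = dict(_current_tags.get())
```

A `ContextVar` is used instead of a module-level global so that separate threads and asyncio tasks each see their own active tags.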

Cache responses

Stop paying for duplicate calls:

from llm_cost_profiler import cache

@cache(ttl=3600)  # cache for 1 hour
def classify_text(text):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Classify: {text}"}]
    )

classify_text("hello")  # API call, cached
classify_text("hello")  # instant, free
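
For readers curious how a time-to-live cache like this behaves, here is a minimal in-memory sketch. It is not the library's `cache` decorator: `ttl_cache` and its `clock` parameter are hypothetical, and it assumes all arguments are hashable.

```python
import functools
import time


def ttl_cache(ttl, clock=time.monotonic):
    """Cache results per argument tuple for ttl seconds (in-memory sketch)."""
    def decorator(func):
        store = {}  # key -> (time_stored, value)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = store.get(key)
            if hit is not None and now - hit[0] < ttl:
                return hit[1]            # still fresh: skip the real call
            value = func(*args, **kwargs)
            store[key] = (now, value)
            return value
        return wrapper
    return decorator


# Demo with a fake clock so expiry can be shown without waiting an hour.
fake_now = [0.0]
api_calls = []


@ttl_cache(ttl=3600, clock=lambda: fake_now[0])
def classify(text):
    api_calls.append(text)  # stands in for a paid API request
    return f"label-for-{text}"


classify("hello")       # first call: goes through
classify("hello")       # within the TTL: served from the cache
calls_before_expiry = len(api_calls)
fake_now[0] = 3601.0
classify("hello")       # TTL elapsed: goes through again
calls_after_expiry = len(api_calls)
```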

Store prompts (optional)

Enable prompt storage for deeper optimization analysis:

client = wrap(OpenAI(), store_prompts=True)

Disabled by default for privacy. When enabled, the optimizer can detect near-duplicate prompts and analyze what causes retry failures.

How it works

Wrapper: Transparent proxy pattern — intercepts SDK method calls without monkey-patching. Your client object behaves identically.
Storage: SQLite with WAL mode at ~/.llmcost/data.db. Thread-safe. All data stays local.
Pricing: Built-in table for OpenAI and Anthropic models. Prefix-matching handles versioned model names automatically.
Call site detection: Walks the Python call stack to find the file and line that triggered each API call.
Zero dependencies: Only uses the Python standard library. The OpenAI/Anthropic SDKs are detected at runtime, not required at install time.
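
The prefix-matching idea in the pricing step can be sketched as a longest-prefix lookup. The table below is an assumption for illustration only: the prices are placeholder numbers, not real rates, and `lookup_price` and `call_cost` are hypothetical helpers rather than the library's API.

```python
# (input_usd, output_usd) per one million tokens. PLACEHOLDER values, not real prices.
PRICES_PER_MILLION = {
    "gpt-4o": (1.00, 2.00),
    "gpt-4o-mini": (0.10, 0.20),
}


def lookup_price(model):
    """Return the price entry whose key is the longest prefix of model, or None."""
    matches = [name for name in PRICES_PER_MILLION if model.startswith(name)]
    if not matches:
        return None
    return PRICES_PER_MILLION[max(matches, key=len)]


def call_cost(model, input_tokens, output_tokens):
    """Cost of one call in USD, or None when the model is not in the table."""
    price = lookup_price(model)
    if price is None:
        return None
    input_price, output_price = price
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


versioned = lookup_price("gpt-4o-mini-2024-07-18")
print(versioned)
print(call_cost("gpt-4o-mini-2024-07-18", 1_000_000, 500_000))
```

Taking the longest matching prefix matters because "gpt-4o" is itself a prefix of "gpt-4o-mini"; a first-match lookup could price the cheaper model at the more expensive rate.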

Requirements

Python 3.9+
No required dependencies
Optional: openai and/or anthropic SDKs

License

MIT

Release history

0.1.1 (this version): Apr 4, 2026
0.1.0: Apr 4, 2026

Download files

Source Distribution

llm_spend_profiler-0.1.1.tar.gz (35.6 kB)

Uploaded Apr 4, 2026 Source

Built Distribution

llm_spend_profiler-0.1.1-py3-none-any.whl (31.6 kB)

Uploaded Apr 4, 2026 Python 3

File details

Details for the file llm_spend_profiler-0.1.1.tar.gz.

File metadata

Download URL: llm_spend_profiler-0.1.1.tar.gz
Upload date: Apr 4, 2026
Size: 35.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for llm_spend_profiler-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`6eb0e4fbbae3b0ba1b96fa94d2765ae046ff066d2278d57509a97c5c6ad1499c`
MD5	`151d462d87d5a63206a58c41da229365`
BLAKE2b-256	`a082a9381cb972d13072793f81a22c2c746e3f04082e318fbf9bd87e2db7669c`

File details

Details for the file llm_spend_profiler-0.1.1-py3-none-any.whl.

File metadata

Download URL: llm_spend_profiler-0.1.1-py3-none-any.whl
Upload date: Apr 4, 2026
Size: 31.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for llm_spend_profiler-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`065423ed5072eb80dfe9cca491e8124dce9e9fbcc5be2a2d9fa2fc69d19da535`
MD5	`d029bc4a35406624c86c529a743a308c`
BLAKE2b-256	`408b718220674ac1af058249d0f3185b35f21c3dbe615f0fb363181ab524b595`
