Project description

LLM Cost Tracker

Track LLM API costs per request. Know where your tokens go.

Zero dependencies. Pure Python. Works with any LLM provider.

pip install llm-cost-tracker

Quickstart

from llm_cost_tracker import CostTracker

tracker = CostTracker("./llm_costs.db")

# Track an OpenAI call
tracker.record(
    prompt_tokens=847,
    completion_tokens=234,
    model="gpt-4o-mini",
    provider="openai",
)

# Track an Anthropic call
tracker.record(
    prompt_tokens=1200,
    completion_tokens=890,
    model="claude-3-5-sonnet",
    provider="anthropic",
)

# Track a request you handled locally (no LLM call)
tracker.record(
    prompt_tokens=0,
    completion_tokens=0,
    model="gpt-4o-mini",
    provider="openai",
    route="local",
    prompt_text="where is the login function defined",
    intent="code_lookup",
)

# See where your money is going
report = tracker.report(window="7d")
print(f"Total cost: ${report['total_cost_usd']:.4f}")
print(f"Total tokens: {report['total_tokens']:,}")
print(f"Requests: {report['total_requests']}")
print(f"Local vs external: {report['local_count']} / {report['external_count']}")
print(f"Estimated savings: ${report['total_saved_full_modeled_usd']:.4f}")
print(f"Cost by model: {report['cost_by_model']}")

# The important part — how much are you wasting?
print("\n--- Waste Analysis ---")
print(f"Avoidable external requests: {report['avoidable_external_requests']}")
print(f"Money wasted on unnecessary LLM calls: ${report['avoidable_cost_usd']:.4f}")
print(f"Additional savings from model downgrades: ${report['potential_model_downgrade_savings_usd']:.4f}")
print(report['optimization_summary'])

What it tracks

Every call to tracker.record() stores:

  • Tokens used — prompt + completion, per request
  • Cost in USD — calculated from built-in pricing tables (40+ models)
  • Route — was this handled locally or sent to an LLM?
  • Counterfactual savings — if you handled it locally, how much did you save vs sending it to the LLM?
  • Model, provider, intent, session — slice your costs any way you want
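The USD figure is simple arithmetic over the pricing table: tokens times the per-million rate, summed per direction. A minimal sketch of that calculation (the rates here are the built-in gpt-4o-mini prices shown further down; this is the arithmetic, not the library's internals):

```python
# Cost = tokens * (USD per 1M tokens) / 1_000_000, computed per direction.
INPUT_RATE = 0.15   # USD per 1M prompt tokens (gpt-4o-mini)
OUTPUT_RATE = 0.60  # USD per 1M completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return (prompt_tokens * INPUT_RATE + completion_tokens * OUTPUT_RATE) / 1_000_000

print(f"${request_cost(847, 234):.6f}")  # the first quickstart call → $0.000267
```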

Reports

# Last 7 days, grouped by model
report = tracker.report(window="7d", group_by="model")

# Last 24 hours, specific session
report = tracker.report(window="1d", session_key="user-123")

# All time
report = tracker.report()

Report fields:

  • total_requests, total_cost_usd, total_tokens
  • total_prompt_tokens, total_completion_tokens
  • local_count, external_count
  • total_saved_prompt_only_usd, total_saved_full_modeled_usd
  • requests_by_route, cost_by_model, cost_by_provider
  • tokens_by_model, savings_by_intent
  • avoidable_external_requests — requests sent to LLM that didn't need one
  • avoidable_cost_usd — money wasted on those unnecessary calls
  • avoidable_percent — what % of your external calls were avoidable
  • potential_model_downgrade_savings_usd — savings from using cheaper models
  • optimization_summary — human-readable summary of waste found
  • breakdown (when group_by is specified)
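The two savings fields differ in what they model: `total_saved_prompt_only_usd` counts only the prompt tokens you avoided sending, while `total_saved_full_modeled_usd` also prices the completion the LLM would have generated. A sketch of the distinction (semantics inferred from the field names, not taken from the library's code; rates are the gpt-4o-mini prices):

```python
RATE_IN, RATE_OUT = 0.15, 0.60  # USD per 1M tokens

def saved_prompt_only(prompt_tokens: int) -> float:
    # Conservative estimate: only the input you didn't send.
    return prompt_tokens * RATE_IN / 1_000_000

def saved_full_modeled(prompt_tokens: int, est_completion_tokens: int) -> float:
    # Also models the output the avoided call would have produced.
    return saved_prompt_only(prompt_tokens) + est_completion_tokens * RATE_OUT / 1_000_000

print(f"{saved_full_modeled(2000, 500):.4f}")  # → 0.0006
```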

Snapshots (for dashboards & cron jobs)

# Capture a daily snapshot
snapshot = tracker.capture_snapshot(window_hours=24, job_name="daily-cost-report")
print(f"Net savings: ${snapshot['net_savings_conservative_usd']:.4f}")

# View recent snapshots
for s in tracker.snapshots(limit=7):
    print(f"{s['job_name']}: saved ${s['saved_full_modeled_usd']:.4f}, spent ${s['external_cost_usd']:.4f}")
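For the cron-job use case, a crontab entry that captures a snapshot at midnight could look like this (the interpreter and database paths are illustrative; this is a config fragment, not library output):

```shell
# m h dom mon dow  command — capture a daily cost snapshot at 00:00
0 0 * * * /usr/bin/python3 -c "from llm_cost_tracker import CostTracker; CostTracker('/var/data/llm_costs.db').capture_snapshot(window_hours=24, job_name='daily-cost-report')"
```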

Built-in pricing (40+ models)

Pricing is built in for OpenAI, Anthropic, Google, Meta, Mistral, and DeepSeek models. Prices are USD per 1M tokens and auto-matched by model name.

from llm_cost_tracker import lookup_pricing

inp, out, source = lookup_pricing("gpt-4o-mini")
print(f"Input: ${inp}/1M tokens, Output: ${out}/1M tokens")
# Input: $0.15/1M tokens, Output: $0.6/1M tokens

Custom pricing:

from llm_cost_tracker.pricing import DEFAULT_PRICING
DEFAULT_PRICING["my-custom-model"] = (1.00, 3.00)  # input, output per 1M tokens
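"Auto-matched by model name" typically reduces to an exact hit first, then the longest pricing key that prefixes the model string, so dated variants resolve to their base model. A plausible sketch of such matching (not the library's actual implementation; the three-entry table stands in for DEFAULT_PRICING):

```python
PRICING = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
}

def match_pricing(model: str):
    # Exact match first; otherwise the longest key that prefixes the name,
    # so "gpt-4o-mini-2024-07-18" hits "gpt-4o-mini", not "gpt-4o".
    if model in PRICING:
        return PRICING[model]
    prefixes = [k for k in PRICING if model.startswith(k)]
    return PRICING[max(prefixes, key=len)] if prefixes else None

print(match_pricing("gpt-4o-mini-2024-07-18"))  # → (0.15, 0.6)
```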

Integration examples

With OpenAI

import openai
from llm_cost_tracker import CostTracker

client = openai.OpenAI()
tracker = CostTracker("./costs.db")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    model="gpt-4o-mini",
    provider="openai",
)

With Anthropic

import anthropic
from llm_cost_tracker import CostTracker

client = anthropic.Anthropic()
tracker = CostTracker("./costs.db")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.input_tokens,
    completion_tokens=response.usage.output_tokens,
    model="claude-sonnet-4-20250514",
    provider="anthropic",
)

With LiteLLM

import litellm
from llm_cost_tracker import CostTracker

tracker = CostTracker("./costs.db")

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    model="gpt-4o-mini",
    provider="openai",
)
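All three integrations follow the same shape: read two token counts off the response and pass them to record(). A tiny helper (hypothetical, not part of the library) keeps call sites short for any OpenAI-style client; note that Anthropic names the usage fields input_tokens/output_tokens instead:

```python
def track_openai_style(tracker, response, model: str, provider: str) -> None:
    # Assumes response.usage.prompt_tokens / .completion_tokens,
    # as returned by the OpenAI SDK and LiteLLM.
    tracker.record(
        prompt_tokens=response.usage.prompt_tokens,
        completion_tokens=response.usage.completion_tokens,
        model=model,
        provider=provider,
    )
```

With that in place, the LiteLLM example above collapses to `track_openai_style(tracker, response, "gpt-4o-mini", "openai")`.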

How it works

  • SQLite database — all data stored locally in a single file. No external services.
  • Zero dependencies — pure Python stdlib. No numpy, no pandas, no requests.
  • WAL mode — concurrent reads while writing. Safe for multi-threaded apps.
  • Built-in pricing — 40+ models with auto-matching. Falls back gracefully for unknown models.
  • Counterfactual tracking — when you handle a request locally, it estimates what the LLM call would have cost, so you can see real savings.
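Enabling WAL is a single pragma on the SQLite connection; presumably the tracker does the equivalent of this when it opens its file (a stdlib-only sketch, not the library's code):

```python
import os
import sqlite3
import tempfile

# Open a database file and switch its journal to write-ahead logging,
# which lets readers proceed while a write is in flight.
path = os.path.join(tempfile.mkdtemp(), "costs.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # → wal
```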

License

MIT

Project details


Download files

Download the file for your platform.

Source Distribution

llm_costlog-0.1.0.tar.gz (14.7 kB)

Uploaded Source

Built Distribution

llm_costlog-0.1.0-py3-none-any.whl (12.3 kB)

Uploaded Python 3

File details

Details for the file llm_costlog-0.1.0.tar.gz.

File metadata

  • Download URL: llm_costlog-0.1.0.tar.gz
  • Upload date:
  • Size: 14.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for llm_costlog-0.1.0.tar.gz
Algorithm Hash digest
SHA256 583babe14cd29cddf61d7c716613a1567d577ea6be51d9943d745bcf451c633c
MD5 907ba3143edd6fe8d10f13b0f4f084be
BLAKE2b-256 236968e64edd21754a92e471f335d7c849cb6d1373de726767925f1f53aa7a04


File details

Details for the file llm_costlog-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: llm_costlog-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 12.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for llm_costlog-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0b8b12b8c39d161dc59e41f02f3083885e883ebe212c98998163ade5078094b9
MD5 4994238a897bf9d750feadbd5e7be6a4
BLAKE2b-256 421a9de1c1c123f95dd8a22ecadaee45d246e45a0e87c8581b7228b9898ef237

