LLM Cost Tracker
Track LLM API costs per request. Know where your tokens go.
Zero dependencies. Pure Python. Works with any LLM provider.
```shell
pip install llm-cost-tracker
```
Quickstart
```python
from llm_cost_tracker import CostTracker

tracker = CostTracker("./llm_costs.db")

# Track an OpenAI call
tracker.record(
    prompt_tokens=847,
    completion_tokens=234,
    model="gpt-4o-mini",
    provider="openai",
)

# Track an Anthropic call
tracker.record(
    prompt_tokens=1200,
    completion_tokens=890,
    model="claude-3-5-sonnet",
    provider="anthropic",
)

# Track a request you handled locally (no LLM call)
tracker.record(
    prompt_tokens=0,
    completion_tokens=0,
    model="gpt-4o-mini",
    provider="openai",
    route="local",
    prompt_text="where is the login function defined",
    intent="code_lookup",
)

# See where your money is going
report = tracker.report(window="7d")
print(f"Total cost: ${report['total_cost_usd']:.4f}")
print(f"Total tokens: {report['total_tokens']:,}")
print(f"Requests: {report['total_requests']}")
print(f"Local vs external: {report['local_count']} / {report['external_count']}")
print(f"Estimated savings: ${report['total_saved_full_modeled_usd']:.4f}")
print(f"Cost by model: {report['cost_by_model']}")

# The important part — how much are you wasting?
print("\n--- Waste Analysis ---")
print(f"Avoidable external requests: {report['avoidable_external_requests']}")
print(f"Money wasted on unnecessary LLM calls: ${report['avoidable_cost_usd']:.4f}")
print(f"Additional savings from model downgrades: ${report['potential_model_downgrade_savings_usd']:.4f}")
print(report['optimization_summary'])
```
What it tracks
Every call to `tracker.record()` stores:
- Tokens used — prompt + completion, per request
- Cost in USD — calculated from built-in pricing tables (40+ models)
- Route — was this handled locally or sent to an LLM?
- Counterfactual savings — if you handled it locally, how much did you save vs sending it to the LLM?
- Model, provider, intent, session — slice your costs any way you want
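The per-request USD figure follows directly from prices quoted per 1M tokens. As a rough illustration only (the helper below is not part of the library, and its internal calculation may differ):

```python
# Illustrative only: deriving a per-request cost in USD from prices
# quoted per 1M tokens. Not the library's actual implementation.
def request_cost_usd(prompt_tokens: int, completion_tokens: int,
                     input_price: float, output_price: float) -> float:
    """Prices are USD per 1M tokens, matching the built-in pricing tables."""
    return (prompt_tokens * input_price + completion_tokens * output_price) / 1_000_000

# Using the gpt-4o-mini prices shown later in this README ($0.15 in,
# $0.60 out) and the token counts from the quickstart:
print(f"${request_cost_usd(847, 234, 0.15, 0.60):.6f}")  # $0.000267
```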
Reports
```python
# Last 7 days, grouped by model
report = tracker.report(window="7d", group_by="model")

# Last 24 hours, specific session
report = tracker.report(window="1d", session_key="user-123")

# All time
report = tracker.report()
```
Report fields:
- `total_requests`, `total_cost_usd`, `total_tokens`
- `total_prompt_tokens`, `total_completion_tokens`
- `local_count`, `external_count`
- `total_saved_prompt_only_usd`, `total_saved_full_modeled_usd`
- `requests_by_route`, `cost_by_model`, `cost_by_provider`
- `tokens_by_model`, `savings_by_intent`
- `avoidable_external_requests` — requests sent to an LLM that didn't need one
- `avoidable_cost_usd` — money wasted on those unnecessary calls
- `avoidable_percent` — what % of your external calls were avoidable
- `potential_model_downgrade_savings_usd` — savings from using cheaper models
- `optimization_summary` — human-readable summary of waste found
- `breakdown` (when `group_by` is specified)
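As a sanity check on how the waste fields relate to each other, with made-up numbers (this is illustrative arithmetic, not library code):

```python
# Hypothetical counts: 12 of 48 external requests were judged avoidable.
avoidable_external_requests = 12
external_count = 48

# avoidable_percent is the share of external calls that were avoidable.
avoidable_percent = 100.0 * avoidable_external_requests / external_count
print(avoidable_percent)  # 25.0
```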
Snapshots (for dashboards & cron jobs)
```python
# Capture a daily snapshot
snapshot = tracker.capture_snapshot(window_hours=24, job_name="daily-cost-report")
print(f"Net savings: ${snapshot['net_savings_conservative_usd']:.4f}")

# View recent snapshots
for s in tracker.snapshots(limit=7):
    print(f"{s['job_name']}: saved ${s['saved_full_modeled_usd']:.4f}, spent ${s['external_cost_usd']:.4f}")
```
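For the cron-job use case, one way to schedule the daily capture is a crontab entry. The script path, log path, and schedule below are placeholders (a hypothetical `snapshot.py` that calls `capture_snapshot`), not something the package ships:

```shell
# Run a hypothetical snapshot script daily at 00:05 (add via `crontab -e`)
5 0 * * * /usr/bin/python3 /opt/app/snapshot.py >> /var/log/llm-costs.log 2>&1
```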
Built-in pricing (40+ models)
Pricing is built in for OpenAI, Anthropic, Google, Meta, Mistral, and DeepSeek models. Prices are USD per 1M tokens and auto-matched by model name.
```python
from llm_cost_tracker import lookup_pricing

inp, out, source = lookup_pricing("gpt-4o-mini")
print(f"Input: ${inp}/1M tokens, Output: ${out}/1M tokens")
# Input: $0.15/1M tokens, Output: $0.6/1M tokens
```
Custom pricing:
```python
from llm_cost_tracker.pricing import DEFAULT_PRICING

DEFAULT_PRICING["my-custom-model"] = (1.00, 3.00)  # input, output per 1M tokens
```
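One plausible reading of "auto-matched by model name" is a longest-prefix match over the pricing table. The sketch below is a guess at that behavior, not the library's actual matcher; the prices and fallback tuple are the illustrative values from this section:

```python
# Guesswork sketch of name-based auto-matching: the longest pricing key
# that prefixes the model name wins; unknown models get a fallback price.
PRICING = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}
FALLBACK = (1.00, 3.00)  # arbitrary, like the custom-pricing example above

def match_pricing(model: str) -> tuple:
    candidates = [key for key in PRICING if model.startswith(key)]
    if not candidates:
        return FALLBACK
    return PRICING[max(candidates, key=len)]

# A dated model name still resolves to its base model's prices:
print(match_pricing("gpt-4o-mini-2024-07-18"))  # (0.15, 0.6)
```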
Integration examples
With OpenAI
```python
import openai
from llm_cost_tracker import CostTracker

client = openai.OpenAI()
tracker = CostTracker("./costs.db")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    model="gpt-4o-mini",
    provider="openai",
)
```
With Anthropic
```python
import anthropic
from llm_cost_tracker import CostTracker

client = anthropic.Anthropic()
tracker = CostTracker("./costs.db")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.input_tokens,
    completion_tokens=response.usage.output_tokens,
    model="claude-sonnet-4-20250514",
    provider="anthropic",
)
```
With LiteLLM
```python
import litellm
from llm_cost_tracker import CostTracker

tracker = CostTracker("./costs.db")

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

tracker.record(
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    model="gpt-4o-mini",
    provider="openai",
)
```
How it works
- SQLite database — all data stored locally in a single file. No external services.
- Zero dependencies — pure Python stdlib. No numpy, no pandas, no requests.
- WAL mode — concurrent reads while writing. Safe for multi-threaded apps.
- Built-in pricing — 40+ models with auto-matching. Falls back gracefully for unknown models.
- Counterfactual tracking — when you handle a request locally, it estimates what the LLM call would have cost, so you can see real savings.
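The WAL setup is a one-line pragma in the stdlib `sqlite3` module. This is a generic illustration of the mode described above, not the library's actual connection code or schema:

```python
import os
import sqlite3
import tempfile

# Open a SQLite file and switch it to write-ahead logging, which lets
# readers keep reading while a writer commits.
path = os.path.join(tempfile.mkdtemp(), "llm_costs.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
print(mode)  # wal
conn.close()
```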
License
MIT
File details
Details for the file llm_costlog-0.1.0.tar.gz.
File metadata
- Download URL: llm_costlog-0.1.0.tar.gz
- Upload date:
- Size: 14.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
583babe14cd29cddf61d7c716613a1567d577ea6be51d9943d745bcf451c633c
|
|
| MD5 |
907ba3143edd6fe8d10f13b0f4f084be
|
|
| BLAKE2b-256 |
236968e64edd21754a92e471f335d7c849cb6d1373de726767925f1f53aa7a04
|
File details
Details for the file llm_costlog-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_costlog-0.1.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0b8b12b8c39d161dc59e41f02f3083885e883ebe212c98998163ade5078094b9` |
| MD5 | `4994238a897bf9d750feadbd5e7be6a4` |
| BLAKE2b-256 | `421a9de1c1c123f95dd8a22ecadaee45d246e45a0e87c8581b7228b9898ef237` |