LLMSpend
Know where your AI money goes.
Track LLM API costs per feature, per model, per user. 2 lines of code. Zero dependencies. Local-first.
pip install llmspend
Quick Start
import anthropic
from llmspend import monitor
# Wrap your client — that's it
client = monitor.wrap(anthropic.Anthropic(), project="my-app")
# Use it exactly as before
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)
# Cost, tokens, and latency are now tracked automatically
Works with OpenAI too:
import openai
from llmspend import monitor
client = monitor.wrap(openai.OpenAI(), project="my-app")
# All chat.completions.create calls are now tracked
Tag by Feature
See exactly which part of your app is burning money:
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": query}],
    llmspend={"feature": "chatbot", "user_id": "u_123"}
)
CLI
# Cost summary (last 24h by default)
llmspend stats
# Last 7 days, grouped by feature
llmspend stats --last 7d --by feature
# Most expensive individual calls
llmspend top
# Export all events as JSON
llmspend export
Output:
LLMSpend — Last 7d
──────────────────────────────────────────────────────
Total: $12.4320 across 2,847 calls
Group                      Calls       Cost   Avg ms
───────────────────────── ────── ────────── ────────
chatbot                     1204    $7.2100   1180ms
search                       893    $3.8900    640ms
summarizer                   750    $1.3320    380ms
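For intuition, the grouped summary above boils down to a group-by over recorded events. The sketch below is not LLMSpend's code: the event field names (feature, cost_usd, latency_ms) and the summarize helper are assumptions made for illustration, and sorting by cost simply mirrors the order seen in the sample output.

```python
from collections import defaultdict


def summarize(events, key="feature"):
    """Group events by a tag and return (group, calls, total_cost, avg_latency_ms) rows."""
    groups = defaultdict(lambda: {"calls": 0, "cost": 0.0, "latency": 0.0})
    for event in events:
        group = groups[event.get(key) or "(untagged)"]
        group["calls"] += 1
        group["cost"] += event.get("cost_usd") or 0.0  # unknown models carry no cost
        group["latency"] += event["latency_ms"]
    rows = [
        (name, g["calls"], g["cost"], g["latency"] / g["calls"])
        for name, g in groups.items()
    ]
    # Most expensive group first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical events, shaped the way this sketch assumes.
sample_events = [
    {"feature": "chatbot", "cost_usd": 0.5, "latency_ms": 100.0},
    {"feature": "chatbot", "cost_usd": 0.25, "latency_ms": 300.0},
    {"feature": "search", "cost_usd": 0.125, "latency_ms": 50.0},
]

for name, calls, cost, avg_ms in summarize(sample_events):
    print(f"{name:<25} {calls:>6} {cost:>10.4f} {avg_ms:>6.0f}ms")
```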
Local Dashboard
llmspend dashboard
Opens a local web dashboard at localhost:8888 — cost breakdown by model, feature, and time. Auto-refreshes. No account needed.
How It Works
- monitor.wrap() patches client.messages.create (Anthropic) or client.chat.completions.create (OpenAI)
- Every API call is intercepted: tokens, cost, latency, and your tags are recorded
- Events flush to a local SQLite database at ~/.llmspend/events.db every 5 seconds via a background thread
- Near-zero overhead on your API calls: logging happens asynchronously after the response returns
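The steps above can be sketched with the standard library alone. This is a minimal illustration of the pattern (patch a create method, queue metadata, flush to SQLite from a background thread), not LLMSpend's actual implementation. EventBuffer, patch_create, FakeMessages, and the three-column table are all hypothetical names and shapes chosen for the example.

```python
import contextlib
import os
import queue
import sqlite3
import tempfile
import threading
import time


class EventBuffer:
    """Queues events in memory and writes them to SQLite in batches."""

    def __init__(self, db_path, flush_interval=5.0):
        self.db_path = db_path
        self.flush_interval = flush_interval
        self._queue = queue.Queue()
        with self._connect() as conn, conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events "
                "(model TEXT, latency_ms REAL, feature TEXT)"
            )
        threading.Thread(target=self._run, daemon=True).start()

    def _connect(self):
        # A fresh connection per use avoids sharing one across threads.
        return contextlib.closing(sqlite3.connect(self.db_path))

    def record(self, event):
        self._queue.put(event)  # cheap for the caller; no disk I/O here

    def flush(self):
        """Drain the queue and write everything in one transaction."""
        rows = []
        while True:
            try:
                rows.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if rows:
            with self._connect() as conn, conn:
                conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        return len(rows)

    def _run(self):
        while True:
            time.sleep(self.flush_interval)
            try:
                self.flush()
            except Exception:
                pass  # a failed flush must not affect the application


def patch_create(resource, buffer):
    """Replace resource.create with a wrapper that times the call and records metadata."""
    original = resource.create

    def tracked_create(*args, llmspend=None, **kwargs):
        start = time.perf_counter()
        response = original(*args, **kwargs)  # the tags are not forwarded
        latency_ms = (time.perf_counter() - start) * 1000
        tags = llmspend or {}
        buffer.record((kwargs.get("model"), latency_ms, tags.get("feature")))
        return response

    resource.create = tracked_create


class FakeMessages:
    """Stand-in for an SDK resource so the sketch runs without network access."""

    def create(self, **kwargs):
        return {"echo": kwargs["model"]}


demo_db = os.path.join(tempfile.mkdtemp(), "events.db")
demo_buffer = EventBuffer(demo_db)
demo_messages = FakeMessages()
patch_create(demo_messages, demo_buffer)
demo_messages.create(model="demo-model", llmspend={"feature": "chatbot"})
demo_buffer.flush()  # normally left to the background thread
```

Because the wrapper only puts a small tuple on a queue, the disk write happens off the request path, which is the property the last bullet describes.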
What Gets Tracked
Per API call:
- Provider, model, timestamp
- Input/output tokens
- Cost in USD (calculated from published pricing)
- Latency in ms
- Your custom tags (feature, user_id, or any key-value pair)
What is never tracked:
- Prompt content
- Response content
- API keys
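One way to picture a single tracked call is as a record holding only the metadata fields listed above. The SpendEvent class below is a hypothetical shape for illustration; the real column names and types in events.db are not documented here and may differ.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpendEvent:
    """Metadata for one API call. There is deliberately no prompt or response field."""

    provider: str
    model: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    cost_usd: Optional[float] = None  # None when the model has no known pricing
    timestamp: float = field(default_factory=time.time)
    tags: dict = field(default_factory=dict)  # e.g. feature, user_id


example_event = SpendEvent(
    provider="anthropic",
    model="claude-sonnet-4-6",
    input_tokens=12,
    output_tokens=48,
    latency_ms=640.0,
    tags={"feature": "chatbot", "user_id": "u_123"},
)
print(asdict(example_event))
```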
Supported Models
Anthropic: Claude Opus 4, Sonnet 4, Haiku 4.5, and all dated variants
OpenAI: GPT-4o, GPT-4.1, o3, o4-mini, and all variants
Cost calculation uses prefix matching — claude-haiku-4-5-20251001 matches the claude-haiku-4 pricing tier. Unknown models are tracked with null cost (tokens and latency still recorded).
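A rough sketch of prefix-matched pricing follows. The prices are placeholders, not real published rates, and the example-model entries are invented. Picking the longest matching prefix is also an assumption made here so that a more specific tier wins over a shorter one; the library may resolve overlaps differently.

```python
# Placeholder prices in USD per million tokens: (input, output). Not real rates.
PRICING = {
    "claude-haiku-4": (1.00, 2.00),
    "example-model": (2.00, 8.00),
    "example-model-mini": (0.50, 2.00),
}


def find_tier(model):
    """Return the longest pricing key that the model name starts with, or None."""
    matches = [prefix for prefix in PRICING if model.startswith(prefix)]
    return max(matches, key=len) if matches else None


def estimate_cost(model, input_tokens, output_tokens):
    """Return the cost in USD, or None when no pricing tier matches."""
    tier = find_tier(model)
    if tier is None:
        return None  # unknown model: tokens and latency can still be recorded
    input_price, output_price = PRICING[tier]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


print(find_tier("claude-haiku-4-5-20251001"))
print(estimate_cost("some-unlisted-model", 1200, 300))
```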
Configuration
from llmspend import monitor
# Default: logs to ~/.llmspend/events.db
monitor.configure()
# Custom local path
monitor.configure(local_path="/var/log/my-app/llmspend.db")
# Planned: send to the hosted dashboard (the hosted version is not yet released)
monitor.configure(backend_url="https://llmspend.dev", api_key="ls_...")
Design Principles
- Never crash your app. All tracking runs in try/except. If LLMSpend fails, your API call still works.
- Never store prompts. Only metadata (tokens, cost, timing). Your data stays private.
- Zero dependencies. Pure Python stdlib. No requests, no aiohttp, no protobuf.
- Local-first. Works offline. No account required. Your data stays on your machine.
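The "never crash your app" principle is commonly implemented by wrapping every tracking step so that its exceptions are swallowed while provider errors still propagate. The sketch below shows that pattern under assumed names (never_raise, record_event, call_api_with_tracking); it is not taken from LLMSpend's source.

```python
import functools
import logging

logger = logging.getLogger("tracking_sketch")


def never_raise(func):
    """Run func, but log and swallow any exception it raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.debug("tracking failed; continuing", exc_info=True)
            return None

    return wrapper


@never_raise
def record_event(event):
    # Simulated failure in the tracking path, such as a full disk.
    raise OSError("disk full")


def call_api_with_tracking(api_call):
    response = api_call()        # provider errors propagate as usual
    record_event({"ok": True})   # tracking errors are swallowed
    return response


print(call_api_with_tracking(lambda: "model response"))
```

Only the tracking path is guarded. Wrapping the API call itself would hide real provider errors from the application.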
Hosted Version
Coming soon at llmspend.dev — team dashboards, alerts, budget caps, and cost forecasting.
License
MIT — use it however you want.