LLMSpend

Know where your AI money goes.

Track LLM API costs per feature, per model, per user. 2 lines of code. Zero dependencies. Local-first.

pip install llmspend

Quick Start

import anthropic
from llmspend import monitor

# Wrap your client — that's it
client = monitor.wrap(anthropic.Anthropic(), project="my-app")

# Use it exactly as before
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)
# Cost, tokens, and latency are now tracked automatically

Works with OpenAI too:

import openai
from llmspend import monitor

client = monitor.wrap(openai.OpenAI(), project="my-app")
# All chat.completions.create calls are now tracked

Tag by Feature

See exactly which part of your app is burning money:

# `client` is the wrapped client from Quick Start; `query` holds the user's input
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": query}],
    llmspend={"feature": "chatbot", "user_id": "u_123"}
)
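
The provider SDKs would not recognize an extra llmspend keyword, so a wrapper presumably has to remove it before forwarding the call. The sketch below shows one way that could work. `split_tags` is a hypothetical helper written for illustration, not part of LLMSpend's API.

```python
def split_tags(kwargs):
    """Separate the `llmspend` tag dict from the arguments meant for the SDK."""
    sdk_kwargs = dict(kwargs)  # copy so the caller's dict is left untouched
    tags = sdk_kwargs.pop("llmspend", None) or {}
    return sdk_kwargs, dict(tags)


call = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Hello"}],
    "llmspend": {"feature": "chatbot", "user_id": "u_123"},
}
sdk_kwargs, tags = split_tags(call)
# sdk_kwargs now holds only what the provider SDK expects;
# tags holds the feature and user_id to store with the event.
```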

CLI

# Cost summary (last 24h by default)
llmspend stats

# Last 7 days, grouped by feature
llmspend stats --last 7d --by feature

# Most expensive individual calls
llmspend top

# Export all events as JSON
llmspend export

Example output of llmspend stats --last 7d --by feature:

  LLMSpend — Last 7d
  ──────────────────────────────────────────────────────
  Total: $12.4320 across 2,847 calls

  Group                      Calls       Cost    Avg ms
  ───────────────────────── ────── ────────── ────────
  chatbot                     1204   $7.2100     1180ms
  search                       893   $3.8900      640ms
  summarizer                   750   $1.3320      380ms
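
The grouped summary is a plain aggregation: count the calls, sum the cost, and average the latency per group. A minimal sketch of that arithmetic follows, assuming events are available as dicts. The field names `feature`, `cost_usd`, and `latency_ms` are illustrative assumptions, not LLMSpend's actual schema.

```python
from collections import defaultdict


def summarize(events, by="feature"):
    """Return (group, calls, total_cost, avg_latency_ms) rows, costliest first."""
    groups = defaultdict(lambda: {"calls": 0, "cost": 0.0, "latency_ms": 0.0})
    for event in events:
        group = groups[event.get(by) or "(untagged)"]
        group["calls"] += 1
        group["cost"] += event.get("cost_usd") or 0.0  # null cost counts as 0
        group["latency_ms"] += event["latency_ms"]
    rows = [
        (name, g["calls"], g["cost"], g["latency_ms"] / g["calls"])
        for name, g in groups.items()
    ]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Made-up sample events, for illustration only.
events = [
    {"feature": "chatbot", "cost_usd": 0.006, "latency_ms": 1200},
    {"feature": "chatbot", "cost_usd": 0.004, "latency_ms": 1000},
    {"feature": "search", "cost_usd": 0.003, "latency_ms": 640},
    {"feature": "search", "cost_usd": None, "latency_ms": 600},
]
for name, calls, cost, avg_ms in summarize(events):
    cost_str = f"${cost:.4f}"
    print(f"{name:<25} {calls:>6} {cost_str:>10} {avg_ms:>6.0f}ms")
```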

Local Dashboard

llmspend dashboard

Opens a local web dashboard at localhost:8888 — cost breakdown by model, feature, and time. Auto-refreshes. No account needed.

How It Works

  1. monitor.wrap() patches client.messages.create (Anthropic) or client.chat.completions.create (OpenAI)
  2. Every API call is intercepted — tokens, cost, latency, and your tags are recorded
  3. Events flush to a local SQLite database at ~/.llmspend/events.db every 5 seconds via a background thread
  4. Negligible overhead on your API calls: logging happens asynchronously after the response returns, so no database write sits on the request path
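
The steps above can be sketched with the standard library alone. This is an illustration of the general pattern (wrap the call, queue an event, flush from a background thread), not LLMSpend's actual implementation. `Recorder`, `wrap_create`, `fake_create`, and the table layout are all hypothetical, and the fake response is a plain dict rather than a real SDK object.

```python
import os
import queue
import sqlite3
import tempfile
import threading
import time


class Recorder:
    """Buffers events in memory and writes them to SQLite in batches."""

    def __init__(self, db_path, interval=5.0):
        self.db_path = db_path
        self.interval = interval
        self.events = queue.Queue()
        # Daemon thread: it never blocks interpreter shutdown.
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            time.sleep(self.interval)
            try:
                self.flush()
            except Exception:
                pass  # a failed write must not kill the background thread

    def flush(self):
        rows = []
        while True:
            try:
                rows.append(self.events.get_nowait())
            except queue.Empty:
                break
        if not rows:
            return
        conn = sqlite3.connect(self.db_path)  # opened in the calling thread
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "CREATE TABLE IF NOT EXISTS events "
                    "(model TEXT, input_tokens INTEGER, "
                    "output_tokens INTEGER, latency_ms REAL)"
                )
                conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
        finally:
            conn.close()


def wrap_create(create, recorder):
    """Return a function that calls `create` and queues usage metadata."""
    def tracked(*args, **kwargs):
        start = time.perf_counter()
        response = create(*args, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        try:
            usage = response["usage"]
            recorder.events.put(
                (response["model"], usage["input_tokens"],
                 usage["output_tokens"], latency_ms)
            )
        except Exception:
            pass  # tracking problems never reach the caller
        return response
    return tracked


def fake_create(**kwargs):
    # Stand-in for a provider call; returns a dict shaped for this sketch only.
    return {"model": kwargs["model"],
            "usage": {"input_tokens": 12, "output_tokens": 34}}


db_path = os.path.join(tempfile.mkdtemp(), "events.db")
recorder = Recorder(db_path, interval=5.0)
create = wrap_create(fake_create, recorder)
response = create(model="claude-sonnet-4-6")
recorder.flush()  # force a write here instead of waiting for the timer
```

Putting the event on an in-memory queue keeps the request path free of disk I/O. The trade-off is that events still in the queue are lost if the process exits before the next flush.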

What Gets Tracked

Per API call:

  • Provider, model, timestamp
  • Input/output tokens
  • Cost in USD (calculated from published pricing)
  • Latency in ms
  • Your custom tags (feature, user_id, or any key-value pair)

What is never tracked:

  • Prompt content
  • Response content
  • API keys
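
As a rough picture of what one tracked call could look like, the record below carries the fields listed above and nothing else. The `Event` class and its field names are assumptions made for illustration; the actual storage schema may differ.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Event:
    """Metadata for one API call; no prompt, response, or API key fields."""
    provider: str
    model: str
    timestamp: float
    input_tokens: int
    output_tokens: int
    cost_usd: Optional[float]  # None when the model has no known price
    latency_ms: float
    tags: dict = field(default_factory=dict)


event = Event(
    provider="anthropic",
    model="claude-sonnet-4-6",
    timestamp=time.time(),
    input_tokens=12,
    output_tokens=34,
    cost_usd=None,
    latency_ms=640.0,
    tags={"feature": "search", "user_id": "u_123"},
)
record = asdict(event)  # plain dict, ready to serialize or insert into a table
```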

Supported Models

Anthropic: Claude Opus 4, Sonnet 4, Haiku 4.5, and all dated variants

OpenAI: GPT-4o, GPT-4.1, o3, o4-mini, and all variants

Cost calculation uses prefix matching — claude-haiku-4-5-20251001 matches the claude-haiku-4 pricing tier. Unknown models are tracked with null cost (tokens and latency still recorded).
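
Prefix matching of this kind can be sketched as follows. The prices are placeholders, not real pricing, and `PRICING` and `estimate_cost` are hypothetical names. Choosing the longest matching prefix is also an assumption: it keeps a more specific tier (for example a "mini" model) from being billed at a shorter prefix's rate.

```python
# Placeholder prices in USD per million tokens (input, output). NOT real pricing.
PRICING = {
    "claude-haiku-4": (1.00, 5.00),
    "gpt-4o": (2.00, 8.00),
    "gpt-4o-mini": (0.10, 0.40),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the cost in USD, or None when no pricing prefix matches."""
    matches = [prefix for prefix in PRICING if model.startswith(prefix)]
    if not matches:
        return None  # unknown model: tokens and latency can still be recorded
    input_price, output_price = PRICING[max(matches, key=len)]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


dated = estimate_cost("claude-haiku-4-5-20251001", 1_000_000, 1_000_000)
mini = estimate_cost("gpt-4o-mini-2024-07-18", 1_000_000, 0)
unknown = estimate_cost("some-new-model", 500, 500)
```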

Configuration

from llmspend import monitor

# Default: logs to ~/.llmspend/events.db
monitor.configure()

# Custom local path
monitor.configure(local_path="/var/log/my-app/llmspend.db")

# Planned, not yet available: send events to the hosted dashboard
monitor.configure(backend_url="https://llmspend.dev", api_key="ls_...")

Design Principles

  • Never crash your app. All tracking runs in try/except. If LLMSpend fails, your API call still works.
  • Never store prompts. Only metadata (tokens, cost, timing). Your data stays private.
  • Zero dependencies. Pure Python stdlib. No requests, no aiohttp, no protobuf.
  • Local-first. Works offline. No account required. Your data stays on your machine.
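
The "never crash your app" principle amounts to isolating tracking code from the caller. One common way to do that is a guard decorator, sketched below. `never_raise`, `record_event`, and `call_api` are hypothetical names, and the simulated failure exists only to show the effect.

```python
import functools


def never_raise(func):
    """Run `func`, swallowing any exception so callers are never affected."""
    @functools.wraps(func)
    def guarded(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            return None
    return guarded


@never_raise
def record_event(event):
    # Simulated tracking failure, e.g. the database file is not writable.
    raise RuntimeError("disk full")


def call_api():
    response = {"text": "ok"}        # stand-in for a real API response
    record_event({"model": "demo"})  # fails internally, but stays contained
    return response


result = call_api()
```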

Hosted Version

Coming soon at llmspend.dev — team dashboards, alerts, budget caps, and cost forecasting.

License

MIT — use it however you want.
