LLMSpend

Know where your AI money goes.

Track LLM API costs per feature, per model, per user. 2 lines of code. Zero dependencies. Local-first.

pip install llmspend

Quick Start

import anthropic
from llmspend import monitor

# Wrap your client — that's it
client = monitor.wrap(anthropic.Anthropic(), project="my-app")

# Use it exactly as before
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)
# Cost, tokens, and latency are now tracked automatically

Works with OpenAI too:

import openai
from llmspend import monitor

client = monitor.wrap(openai.OpenAI(), project="my-app")
# All chat.completions.create calls are now tracked

Tag by Feature

See exactly which part of your app is burning money:

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": query}],
    llmspend={"feature": "chatbot", "user_id": "u_123"}
)
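For the extra llmspend keyword to be safe, the wrapper has to remove it before the request reaches the provider SDK, which would otherwise reject an unknown argument. The sketch below shows one way that could work. FakeMessages and wrap_create are hypothetical names used for illustration only; they are not LLMSpend's actual internals.

```python
import functools


class FakeMessages:
    """Stand-in for an SDK resource; like a real SDK method, it accepts only known keywords."""

    def create(self, *, model, max_tokens, messages):
        return {"model": model, "echo": messages[-1]["content"]}


def wrap_create(create_fn, record):
    """Return a version of create_fn that strips the `llmspend` tags before forwarding."""

    @functools.wraps(create_fn)
    def wrapper(*args, **kwargs):
        tags = kwargs.pop("llmspend", None) or {}  # tags never reach the SDK
        response = create_fn(*args, **kwargs)
        record({"model": kwargs.get("model"), **tags})
        return response

    return wrapper


recorded = []
messages = FakeMessages()
messages.create = wrap_create(messages.create, recorded.append)

response = messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}],
    llmspend={"feature": "chatbot", "user_id": "u_123"},
)
```

Calls that omit the llmspend keyword pass through unchanged and are recorded with no tags.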

CLI

# Cost summary (last 24h by default)
llmspend stats

# Last 7 days, grouped by feature
llmspend stats --last 7d --by feature

# Most expensive individual calls
llmspend top

# Export all events as JSON
llmspend export

Example output for llmspend stats --last 7d --by feature:

  LLMSpend — Last 7d
  ──────────────────────────────────────────────────────
  Total: $12.4320 across 2,847 calls

  Group                      Calls       Cost    Avg ms
  ───────────────────────── ────── ────────── ────────
  chatbot                     1204   $7.2100     1180ms
  search                       893   $3.8900      640ms
  summarizer                   750   $1.3320      380ms
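A grouped summary like the one above amounts to a GROUP BY over the recorded events. The sketch below runs that aggregation against an in-memory SQLite table. The table name, column names, and sample rows are assumptions made for illustration; the real schema of events.db is not documented here.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (feature TEXT, cost_usd REAL, latency_ms REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("chatbot", 0.006, 1200.0),
        ("chatbot", 0.004, 1100.0),
        ("search", 0.002, 600.0),
    ],
)

# One row per feature: call count, total cost, mean latency, most expensive first.
rows = conn.execute(
    """
    SELECT feature, COUNT(*) AS calls, SUM(cost_usd) AS cost, AVG(latency_ms) AS avg_ms
    FROM events
    GROUP BY feature
    ORDER BY cost DESC
    """
).fetchall()

for feature, calls, cost, avg_ms in rows:
    print(f"{feature:<12} {calls:>6} ${cost:>9.4f} {avg_ms:>7.0f}ms")
```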

Local Dashboard

llmspend dashboard

Opens a local web dashboard at localhost:8888 — cost breakdown by model, feature, and time. Auto-refreshes. No account needed.

How It Works

  1. monitor.wrap() patches client.messages.create (Anthropic) or client.chat.completions.create (OpenAI)
  2. Every API call is intercepted — tokens, cost, latency, and your tags are recorded
  3. Events flush to a local SQLite database at ~/.llmspend/events.db every 5 seconds via a background thread
  4. Logging stays off the request path: each event is queued after the response returns and written by the background thread, so your API calls see negligible added latency
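The buffering described in steps 3 and 4 can be sketched as an in-memory queue drained by a background thread. This is a minimal illustration under stated assumptions, not LLMSpend's implementation: the EventBuffer class, its method names, and the table layout are hypothetical.

```python
import os
import queue
import sqlite3
import tempfile
import threading
import time


class EventBuffer:
    """Queue events in memory and write them to SQLite from a background thread."""

    def __init__(self, db_path, flush_interval=5.0):
        self._db_path = db_path
        self._interval = flush_interval
        self._queue = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def record(self, model, input_tokens, output_tokens, latency_ms):
        # Called on the request path: only an in-memory enqueue, no disk I/O.
        self._queue.put((time.time(), model, input_tokens, output_tokens, latency_ms))

    def _run(self):
        # The connection is created and used only on this thread.
        conn = sqlite3.connect(self._db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(ts REAL, model TEXT, input_tokens INTEGER, output_tokens INTEGER, latency_ms REAL)"
        )
        stopping = False
        while not stopping:
            # Sleep for one interval, or wake early when close() is called.
            stopping = self._stop.wait(self._interval)
            batch = []
            while True:
                try:
                    batch.append(self._queue.get_nowait())
                except queue.Empty:
                    break
            if batch:
                conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", batch)
                conn.commit()
        conn.close()

    def close(self):
        """Flush whatever is still queued and stop the thread."""
        self._stop.set()
        self._thread.join()


db_path = os.path.join(tempfile.mkdtemp(), "events.db")
buffer = EventBuffer(db_path, flush_interval=0.1)
buffer.record("claude-sonnet-4-6", 12, 48, 830.0)
buffer.close()

conn = sqlite3.connect(db_path)
stored = conn.execute("SELECT model, input_tokens, output_tokens FROM events").fetchall()
conn.close()
```

Only the enqueue happens on the caller's thread; the SQLite write waits for the next flush, so a slow disk delays logging rather than the API call.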

What Gets Tracked

Per API call:

  • Provider, model, timestamp
  • Input/output tokens
  • Cost in USD (calculated from published pricing)
  • Latency in ms
  • Your custom tags (feature, user_id, or any key-value pair)

What is never tracked:

  • Prompt content
  • Response content
  • API keys

Supported Models

Anthropic: Claude Opus 4, Sonnet 4, Haiku 4.5, and all dated variants

OpenAI: GPT-4o, GPT-4.1, o3, o4-mini, and all variants

Cost calculation uses prefix matching — claude-haiku-4-5-20251001 matches the claude-haiku-4 pricing tier. Unknown models are tracked with null cost (tokens and latency still recorded).
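A minimal sketch of prefix-based cost lookup follows. The price table holds placeholder numbers, not published pricing, and the function names are hypothetical. Choosing the longest matching prefix is an assumption made here so that a more specific entry wins over a shorter one; the text above only states that prefix matching is used.

```python
# Placeholder USD prices per million tokens: (input, output). Illustrative values only.
PRICING = {
    "claude-haiku-4": (1.0, 5.0),
    "claude-sonnet-4": (3.0, 15.0),
    "gpt-4o": (2.0, 8.0),
    "gpt-4o-mini": (0.2, 0.8),
}


def lookup_price(model):
    """Return the price pair for the longest matching prefix, or None if unknown."""
    matches = [prefix for prefix in PRICING if model.startswith(prefix)]
    if not matches:
        return None
    return PRICING[max(matches, key=len)]


def cost_usd(model, input_tokens, output_tokens):
    """Cost in USD, or None for models without a pricing entry."""
    price = lookup_price(model)
    if price is None:
        return None
    input_price, output_price = price
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


haiku_cost = cost_usd("claude-haiku-4-5-20251001", 1000, 500)
unknown_cost = cost_usd("some-unknown-model", 1000, 500)  # None: cost stays null
```

With the placeholder prices, 1,000 input tokens and 500 output tokens on the claude-haiku-4 tier work out to (1000 × 1.0 + 500 × 5.0) / 1,000,000 = 0.0035 USD.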

Configuration

from llmspend import monitor

# Default: logs to ~/.llmspend/events.db
monitor.configure()

# Custom local path
monitor.configure(local_path="/var/log/my-app/llmspend.db")

# Planned for the hosted version (coming soon): send events to the hosted dashboard
monitor.configure(backend_url="https://llmspend.dev", api_key="ls_...")

Design Principles

  • Never crash your app. All tracking runs in try/except. If LLMSpend fails, your API call still works.
  • Never store prompts. Only metadata (tokens, cost, timing). Your data stays private.
  • Zero dependencies. Pure Python stdlib. No requests, no aiohttp, no protobuf.
  • Local-first. Works offline. No account required. Your data stays on your machine.
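The first principle can be illustrated with a small guard around the tracking hook. This is a sketch of the general pattern, assuming hypothetical names (never_raise, tracked_call, broken_tracker); it is not taken from LLMSpend's source.

```python
import functools
import time


def never_raise(fn):
    """Run a tracking hook, swallowing any exception it raises."""

    @functools.wraps(fn)
    def safe(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            return None  # a tracking failure must not reach the caller

    return safe


@never_raise
def broken_tracker(event):
    # Simulates a tracking failure, e.g. a locked database.
    raise RuntimeError("database is locked")


def tracked_call(api_call, tracker):
    """Time api_call, report to tracker, and return the API result unchanged."""
    start = time.perf_counter()
    result = api_call()  # errors from the API itself still propagate
    tracker({"latency_ms": (time.perf_counter() - start) * 1000})
    return result


result = tracked_call(lambda: "response", broken_tracker)
```

Only the tracking hook is guarded. Exceptions raised by the API call itself are left alone, so the application sees the same errors it would without the wrapper.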

Hosted Version

Coming soon at llmspend.dev — team dashboards, alerts, budget caps, and cost forecasting.

License

MIT — use it however you want.

Download files

Source Distribution

llmspend-0.1.1.tar.gz (15.9 kB)

Uploaded Source

Built Distribution

llmspend-0.1.1-py3-none-any.whl (15.8 kB)

Uploaded Python 3

File details

Details for the file llmspend-0.1.1.tar.gz.

File metadata

  • Download URL: llmspend-0.1.1.tar.gz
  • Size: 15.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for llmspend-0.1.1.tar.gz
Algorithm Hash digest
SHA256 4c2c8c1725876eb817e3d6c77e3fc54faec8084a9dfc92113c0809224af605a6
MD5 67ac4e8d87e927b520c7d6b9fc4f916d
BLAKE2b-256 1c42daa009eb3c8d32c2f7c4a4b90b161f95239a8d0dbccc308cb327ebe7960a

File details

Details for the file llmspend-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: llmspend-0.1.1-py3-none-any.whl
  • Size: 15.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for llmspend-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 9410609f39c3653d487199251e9cc5c1431eefb31a748adfadd4c0776cf105dc
MD5 518ff5b4944d79986f2deb68cc74fe8c
BLAKE2b-256 a99ea157d06f72af5080ad21ccce52b1ec54b2e4a9e214147168066ed3c2dae6
