llmwatch
A lightweight Python library for LLM cost attribution — track, tag, and report LLM API costs by feature, user, and model.
Why llmwatch?
llmwatch is not an observability platform or proxy server. It's a lightweight Python library that integrates directly into your existing codebase.
Unlike Langfuse, LangSmith, or LiteLLM, llmwatch requires no external infrastructure, no API gateway, and no proxy setup. Run `pip install llmwatch` and add three lines of code to start tracking LLM costs.
Key differentiators:
- No proxy or gateway needed: unlike LiteLLM and Helicone, which sit between your code and LLM APIs
- No external platform: unlike Langfuse and LangSmith, which require cloud infrastructure
- Works with your existing SDK: patch your OpenAI, Anthropic, Google, Cohere, or VoyageAI clients with `instrument(client)`
- Feature-level cost attribution: tag LLM calls by feature, user, environment, and any custom dimension
- Minimal setup: three lines of code to get started
- 1000+ models: bundled pricing data covering OpenAI, Anthropic, Google, and more
Quick Start
Async
```python
from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
async def summarize(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

# Call from inside an async function (or a REPL started with `python -m asyncio`)
result = await summarize("Long document text...")
```
Sync
```python
from openai import OpenAI
from llmwatch.tracker import LLMWatch

client = OpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

result = summarize("Long document text...")
```
Features
- Automatic cost tracking: instrument SDK clients to capture token usage and calculate costs without modifying your LLM calls
- Flexible tagging: attach metadata to tracked calls with `@watcher.tracked(feature=..., user_id=..., environment=...)`
- Multi-provider support: OpenAI, Anthropic, Google, Cohere, VoyageAI (sync, async, and streaming)
- Reranker support: auto-instrument Cohere and VoyageAI reranker SDKs, or use `record_usage()` for any HTTP-based API
- Bundled pricing: 1000+ models with up-to-date pricing data synced from pydantic/genai-prices
- Multiple database backends: SQLite (default), PostgreSQL, MySQL, MongoDB (Beanie ODM), Oracle, MSSQL
- Budget alerts: set thresholds and trigger callbacks when spending exceeds limits
- Reporting and export: generate cost summaries by feature, user, model, or provider (CSV/JSON)
- CLI tools: view reports, manage data, sync pricing
- Web dashboard: optional interactive dashboard for cost visualization (`llmwatch dashboard`)
- Streaming support: track costs for streaming responses (SSE, async streams)
Supported Providers
| Provider | Sync | Async | Streaming | Models |
|---|---|---|---|---|
| OpenAI | O | O | O | GPT-5.4, o4-mini, o3, o1, GPT-4o, etc. |
| Anthropic | O | O | O | Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5, etc. |
| Google | O | O | O | Gemini 3.1, Gemini 2.5, Gemini 2.0, etc. |
| Cohere | O | O | - | Rerank v3.5, Rerank v4.0, etc. |
| VoyageAI | O | O | - | Rerank 2.5, Rerank 2, etc. |
Installation
```bash
pip install llmwatch
# or
uv add llmwatch
```
Optional Database Backends
```bash
# Quote the extras so shells such as zsh do not expand the brackets
pip install "llmwatch[pg]"         # PostgreSQL
pip install "llmwatch[mysql]"      # MySQL
pip install "llmwatch[mongo]"      # MongoDB (Beanie ODM)
pip install "llmwatch[dashboard]"  # Web dashboard (Starlette + Uvicorn)
```
Usage
Basic Tracking
```python
from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="chat", user_id="user123", environment="production")
async def chat_response(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Costs are tracked automatically.
# Call from inside an async function (or a REPL started with `python -m asyncio`)
result = await chat_response("Hello, how are you?")
```
Budget Alerts
```python
watcher = LLMWatch(client=client)

async def on_budget_exceeded(record):
    print(f"Budget exceeded: ${record.cost_usd:.4f} on feature={record.tags.feature}")

watcher.budget.add_rule(
    max_cost_usd=0.50,
    callback=on_budget_exceeded,
    feature="summarize",
)
```
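To make the threshold semantics concrete, here is a small library-independent sketch of how a per-feature budget rule could be evaluated. `BudgetRule` and `BudgetTracker` are hypothetical names, not llmwatch classes, and the cumulative-spend, fire-once behavior is an assumption: this README does not say whether the limit applies to a single call or to running totals.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class BudgetRule:
    """Hypothetical rule: call `callback` once cumulative spend passes the limit."""
    max_cost_usd: float
    callback: Callable[[str, float], None]
    feature: Optional[str] = None  # None matches every feature
    spent_usd: float = 0.0
    fired: bool = False


@dataclass
class BudgetTracker:
    rules: List[BudgetRule] = field(default_factory=list)

    def record(self, feature: str, cost_usd: float) -> None:
        for rule in self.rules:
            if rule.feature is not None and rule.feature != feature:
                continue  # rule is scoped to a different feature
            rule.spent_usd += cost_usd
            if rule.spent_usd > rule.max_cost_usd and not rule.fired:
                rule.fired = True
                rule.callback(feature, rule.spent_usd)


alerts = []
tracker = BudgetTracker()
tracker.rules.append(
    BudgetRule(
        max_cost_usd=0.50,
        callback=lambda feature, total: alerts.append((feature, total)),
        feature="summarize",
    )
)
tracker.record("summarize", 0.30)  # under the limit
tracker.record("chat", 5.00)       # ignored: different feature
tracker.record("summarize", 0.25)  # pushes the total past 0.50, fires once
```

Scoping the rule by `feature` keeps an expensive feature from tripping alerts that belong to a cheap one.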
Reporting
Programmatic
```python
# Call from inside an async function (or a REPL started with `python -m asyncio`)
summary = await watcher.report.by_feature(period="7d")
print(f"Total cost: ${summary.total_cost_usd:.4f}")
for b in summary.breakdowns:
    print(f"  {b.group_value}: ${b.total_cost_usd:.4f} ({b.total_requests} calls)")

# Also available: by_user_id(), by_model(), by_provider()
await watcher.report.export_csv("costs.csv", group_by="feature", period="30d")
await watcher.report.export_json("costs.json", group_by="model", period="7d")
```
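For intuition about what a grouped summary contains, the following sketch aggregates a few in-memory records by one tag. It does not use llmwatch: the record dictionaries and the `summarize_costs` helper are hypothetical, and only the output field names (`total_cost_usd`, `total_requests`) are borrowed from the example above.

```python
from collections import defaultdict

# Hypothetical usage records; the keys mirror the tags used in this README.
records = [
    {"feature": "summarize", "model": "gpt-4o", "cost_usd": 0.012},
    {"feature": "chat", "model": "gpt-4o", "cost_usd": 0.004},
    {"feature": "summarize", "model": "gpt-4o", "cost_usd": 0.020},
]


def summarize_costs(records, group_by):
    """Sum cost and count requests for each distinct value of `group_by`."""
    totals = defaultdict(lambda: {"total_cost_usd": 0.0, "total_requests": 0})
    for record in records:
        bucket = totals[record[group_by]]
        bucket["total_cost_usd"] += record["cost_usd"]
        bucket["total_requests"] += 1
    return dict(totals)


by_feature = summarize_costs(records, "feature")
for name, agg in sorted(by_feature.items()):
    print(f"{name}: ${agg['total_cost_usd']:.4f} ({agg['total_requests']} calls)")
```

Passing `"model"` instead of `"feature"` gives the per-model view of the same records.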
CLI
```bash
llmwatch report --group-by feature --period 7d
llmwatch export costs.csv --format csv
llmwatch pricing list --provider openai
llmwatch pricing sync
```
Web Dashboard
```bash
pip install "llmwatch[dashboard]"
llmwatch dashboard
# Opens at http://localhost:8000
```
Multiple Database Backends
By default, llmwatch uses SQLite (~/.llmwatch/usage.db). Switch to other backends by passing a storage instance:
```python
from llmwatch.tracker import LLMWatch
from llmwatch.databases.sqlalchemy import Storage

# PostgreSQL
watcher = LLMWatch(
    client=client,
    storage=Storage("postgresql+asyncpg://user:password@localhost/llmwatch"),
)

# MySQL
watcher = LLMWatch(
    client=client,
    storage=Storage("mysql+aiomysql://user:password@localhost/llmwatch"),
)
```
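Connection URLs like the ones above break when the password contains characters such as `@` or `/`, because those are URL delimiters. A minimal sketch of building the URL safely with the standard library follows; `build_db_url` is a hypothetical helper, not part of llmwatch, and the credentials are placeholders.

```python
from urllib.parse import quote_plus


def build_db_url(driver: str, user: str, password: str, host: str, database: str) -> str:
    """Percent-encode the credentials so special characters survive URL parsing."""
    return f"{driver}://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


url = build_db_url(
    driver="postgresql+asyncpg",
    user="llmwatch",
    password="p@ss/word",  # placeholder; read real secrets from the environment
    host="localhost",
    database="llmwatch",
)
print(url)
```

The resulting string can then be passed wherever a connection URL is expected.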
MongoDB
```python
from llmwatch.tracker import LLMWatch
from llmwatch.databases.mongo import MongoStorage

watcher = LLMWatch(
    client=client,
    storage=MongoStorage("mongodb://localhost:27017", database="llmwatch"),
)
```
Manual Recording (for HTTP-based APIs)
For providers without a Python SDK (e.g., Jina reranker via httpx), use record_usage():
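```python
import httpx

# JINA_API_KEY, query, and docs come from your application.
# Call from inside an async function (or a REPL started with `python -m asyncio`)
async with httpx.AsyncClient() as http:  # closes the connection pool when done
    response = await http.post(
        "https://api.jina.ai/v1/rerank",
        headers={"Authorization": f"Bearer {JINA_API_KEY}"},
        json={"model": "jina-reranker-v3", "query": query, "documents": docs},
    )
data = response.json()

await watcher.record_usage(
    model="jina-reranker-v3",
    provider="jina",
    input_tokens=data["usage"]["total_tokens"],
    feature="search",
)
```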
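Once token counts are recorded, turning them into a dollar figure is simple arithmetic on per-token prices. The sketch below shows that arithmetic under the common convention of prices quoted per million tokens. The `cost_usd` helper and the prices are placeholders for illustration, not llmwatch's `calculate_cost()` or real Jina pricing.

```python
def cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost in USD given token counts and prices per one million tokens."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


# A reranker call has no output tokens, so only the input price matters.
# 0.05 USD per million tokens is a made-up placeholder price.
rerank_cost = cost_usd(12_000, 0, input_price_per_mtok=0.05, output_price_per_mtok=0.0)
print(f"${rerank_cost:.6f}")
```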
Custom Provider Registration
Register your own provider extractor and instrumentor:
```python
from llmwatch.extractors.base import register_extractor
from llmwatch.instrument import register_instrumentor

register_extractor("my_llm", my_extract_fn, module_prefix="my_llm_sdk")
register_instrumentor("my_llm", my_instrumentor_fn)
```
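This README does not show what `my_extract_fn` should accept or return, so the following is only a guess at the general shape: a function that reads token counts off a provider response and returns them in a normalized form. The `ExtractedUsage` type, the attribute names on the response, and the return contract are all assumptions; check the llmwatch source for the real interface before relying on it.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ExtractedUsage:
    """Hypothetical normalized result; llmwatch's actual contract may differ."""
    model: str
    input_tokens: int
    output_tokens: int


def my_extract_fn(response: Any) -> Optional[ExtractedUsage]:
    """Pull token counts from a response, or return None if it carries no usage."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None
    return ExtractedUsage(
        model=getattr(response, "model", "unknown"),
        input_tokens=int(getattr(usage, "prompt_tokens", 0)),
        output_tokens=int(getattr(usage, "completion_tokens", 0)),
    )


# Stand-in for a response object from the hypothetical my_llm_sdk.
fake_response = SimpleNamespace(
    model="my-model-1",
    usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30),
)
extracted = my_extract_fn(fake_response)
```

Returning `None` for responses without usage lets the caller skip them instead of recording a zero-cost row.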
CLI Reference
| Command | Description |
|---|---|
| `llmwatch report` | Generate cost report (`--group-by`, `--period`) |
| `llmwatch export` | Export usage records to CSV or JSON |
| `llmwatch prune` | Delete old records by date |
| `llmwatch stats` | Show database statistics |
| `llmwatch pricing list` | List pricing data by provider |
| `llmwatch pricing sync` | Sync pricing data from upstream |
| `llmwatch dashboard` | Start interactive web dashboard |
How It Works
- Instrument: `LLMWatch(client=client)` patches the SDK client's methods
- Extract: on each LLM call, extractors normalize the response (handles OpenAI, Anthropic, Google, Cohere, VoyageAI, streaming)
- Calculate: `calculate_cost()` computes USD cost using bundled pricing data
- Store: `Storage.save()` persists the `UsageRecord` to your database
- Tag: `@watcher.tracked()` provides tag context (feature, user_id, environment)
- Alert: optional `BudgetAlert` callbacks trigger when thresholds are exceeded
- Report: `Reporter` generates cost summaries grouped by feature, user, model, or provider
```
LLM Call
    |  (via instrumented SDK client)
Extract Response -> Calculate Cost -> Save Record + Tags
    |
Database
    |  (queried by Reporter)
Reports, Dashboards, Exports
```
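The flow above can be imitated in a few dozen lines to show how a patched client method and a tagging decorator cooperate. This is a toy model under stated assumptions, not llmwatch's implementation: `instrument`, `tracked`, `FakeClient`, the flat price, and the in-memory `saved_records` list are all hypothetical, and using `contextvars` to carry tags is one plausible design rather than a confirmed one.

```python
import contextvars
import functools
from types import SimpleNamespace

# Tags set by the decorator and read by the patched client method.
_current_tags = contextvars.ContextVar("current_tags", default={})
saved_records = []  # stands in for the database


def tracked(**tags):
    """Decorator that makes `tags` visible to any client call made inside `fn`."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            token = _current_tags.set(tags)
            try:
                return fn(*args, **kwargs)
            finally:
                _current_tags.reset(token)  # restore the previous tag context
        return wrapper
    return decorator


def instrument(client, price_per_mtok):
    """Wrap `client.create` so each call is priced and recorded with current tags."""
    original = client.create

    @functools.wraps(original)
    def patched(*args, **kwargs):
        response = original(*args, **kwargs)
        tokens = response.usage.total_tokens                  # extract
        cost = tokens * price_per_mtok / 1_000_000            # calculate
        saved_records.append(                                 # store + tag
            {"model": response.model, "cost_usd": cost, **_current_tags.get()}
        )
        return response

    client.create = patched
    return client


class FakeClient:
    """Stand-in SDK client that always reports 2000 tokens."""
    def create(self, model, prompt):
        usage = SimpleNamespace(total_tokens=2000)
        return SimpleNamespace(model=model, usage=usage, text="ok")


client = instrument(FakeClient(), price_per_mtok=5.0)  # placeholder price


@tracked(feature="summarize", user_id="alice")
def summarize(text):
    return client.create(model="fake-model", prompt=text).text


summarize("Long document text...")
print(saved_records)
```

Because the tags travel through a context variable, the function body calls the client exactly as it would without tracking.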
Development
```bash
uv sync --group dev
uv run pytest tests/ -v
uv run ruff check src/ tests/
uv run mypy src/llmwatch/
```
License
MIT
Pricing data sourced from pydantic/genai-prices.