
Track, tag, and report LLM API costs by feature, user, and model.

llmwatch

A lightweight Python library for LLM cost attribution — track, tag, and report LLM API costs by feature, user, and model.

Python 3.11+ | License: MIT

Why llmwatch?

llmwatch is not an observability platform or proxy server. It's a lightweight Python library that integrates directly into your existing codebase.

Unlike Langfuse, LangSmith, or LiteLLM, llmwatch requires no external infrastructure, no API gateway, and no proxy setup. Run pip install llmwatch and add 3 lines of code to start tracking LLM costs.

Key differentiators:

  • No proxy or gateway needed — Unlike LiteLLM and Helicone, which sit between your code and LLM APIs
  • No external platform — Unlike Langfuse and LangSmith, which require cloud infrastructure
  • Works with your existing SDK — Patch your OpenAI, Anthropic, Google, Cohere, or VoyageAI clients with instrument(client)
  • Feature-level cost attribution — Tag LLM calls by feature, user, environment, and any custom dimension
  • Minimal setup — 3 lines of code to get started
  • 1000+ models — Bundled pricing data covering OpenAI, Anthropic, Google, and more

Quick Start

Async

import asyncio

from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
async def summarize(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

# await is only valid inside a coroutine, so drive the call with asyncio.run()
result = asyncio.run(summarize("Long document text..."))

Sync

from openai import OpenAI
from llmwatch.tracker import LLMWatch

client = OpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

result = summarize("Long document text...")

Features

  • Automatic cost tracking — Instrument SDK clients to capture token usage and calculate costs without modifying your LLM calls
  • Flexible tagging — Attach metadata to tracked calls with @watcher.tracked(feature=..., user_id=..., environment=...)
  • Multi-provider support — OpenAI, Anthropic, Google, Cohere, VoyageAI (sync and async for all; streaming for OpenAI, Anthropic, and Google)
  • Reranker support — Auto-instrument Cohere and VoyageAI reranker SDKs, or use record_usage() for any HTTP-based API
  • Bundled pricing — 1000+ models with up-to-date pricing data synced from pydantic/genai-prices
  • Multiple database backends — SQLite (default), PostgreSQL, MySQL, MongoDB (Beanie ODM), Oracle, MSSQL
  • Budget alerts — Set thresholds and trigger callbacks when spending exceeds limits
  • Reporting and export — Generate cost summaries by feature, user, model, or provider (CSV/JSON)
  • CLI tools — View reports, manage data, sync pricing
  • Web dashboard — Optional interactive dashboard for cost visualization (llmwatch dashboard)
  • Streaming support — Track costs for streaming responses (SSE, async streams)

Supported Providers

Provider    Sync  Async  Streaming  Models
OpenAI      O     O      O          GPT-5.4, o4-mini, o3, o1, GPT-4o, etc.
Anthropic   O     O      O          Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5, etc.
Google      O     O      O          Gemini 3.1, Gemini 2.5, Gemini 2.0, etc.
Cohere      O     O      -          Rerank v3.5, Rerank v4.0, etc.
VoyageAI    O     O      -          Rerank 2.5, Rerank 2, etc.

(O = supported, - = not supported)

Installation

pip install llmwatch
# or
uv add llmwatch

Optional Database Backends

# Quote the extras so shells such as zsh do not expand the brackets
pip install "llmwatch[pg]"         # PostgreSQL
pip install "llmwatch[mysql]"      # MySQL
pip install "llmwatch[mongo]"      # MongoDB (Beanie ODM)
pip install "llmwatch[dashboard]"  # Web dashboard (Starlette + Uvicorn)

Usage

Basic Tracking

import asyncio

from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="chat", user_id="user123", environment="production")
async def chat_response(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Costs are tracked automatically; asyncio.run() drives the coroutine from sync code
result = asyncio.run(chat_response("Hello, how are you?"))

Budget Alerts

watcher = LLMWatch(client=client)

async def on_budget_exceeded(record):
    print(f"Budget exceeded: ${record.cost_usd:.4f} on feature={record.tags.feature}")

watcher.budget.add_rule(
    max_cost_usd=0.50,
    callback=on_budget_exceeded,
    feature="summarize",
)
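
To make the rule semantics concrete, here is a minimal standalone sketch of how a per-feature threshold could be evaluated. It does not import llmwatch and is not its implementation: BudgetRule, BudgetTracker, the callback signature, and the fire-once behaviour are all hypothetical names and choices made for illustration. The section above does not say whether the limit applies to cumulative spend or to a single call; this sketch assumes cumulative spend.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BudgetRule:
    """Hypothetical rule: call `callback` once when spend passes `max_cost_usd`."""
    max_cost_usd: float
    callback: Callable[[str, float], None]
    feature: Optional[str] = None  # None applies the rule to any feature's running total
    fired: bool = False


@dataclass
class BudgetTracker:
    """Hypothetical tracker that keeps a running total per feature."""
    rules: list[BudgetRule] = field(default_factory=list)
    spent: dict[str, float] = field(default_factory=dict)

    def record(self, feature: str, cost_usd: float) -> None:
        # Update the running total, then check every rule that matches this feature.
        self.spent[feature] = self.spent.get(feature, 0.0) + cost_usd
        for rule in self.rules:
            if rule.fired or rule.feature not in (None, feature):
                continue
            if self.spent[feature] > rule.max_cost_usd:
                rule.fired = True
                rule.callback(feature, self.spent[feature])


alerts: list[tuple[str, float]] = []
tracker = BudgetTracker(
    rules=[
        BudgetRule(
            max_cost_usd=0.50,
            feature="summarize",
            callback=lambda name, total: alerts.append((name, total)),
        )
    ]
)
for cost in (0.20, 0.20, 0.20):
    tracker.record("summarize", cost)
tracker.record("chat", 5.00)  # different feature, so the "summarize" rule ignores it
```

In this sketch the callback fires on the third "summarize" call, when the running total first passes 0.50, and never again for that rule.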

Reporting

Programmatic

# Run these calls inside a coroutine (for example, one started with asyncio.run())
summary = await watcher.report.by_feature(period="7d")
print(f"Total cost: ${summary.total_cost_usd:.4f}")
for b in summary.breakdowns:
    print(f"  {b.group_value}: ${b.total_cost_usd:.4f} ({b.total_requests} calls)")

# Also available: by_user_id(), by_model(), by_provider()
await watcher.report.export_csv("costs.csv", group_by="feature", period="30d")
await watcher.report.export_json("costs.json", group_by="model", period="7d")
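
As a rough picture of what a grouped report computes, the sketch below filters in-memory records by a period window and sums cost and call counts per group. It is a standalone illustration, not llmwatch's Reporter: parse_period, summarize_costs, the record dictionaries, and the assumption that a period string is a number followed by h, d, or w are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def parse_period(period: str) -> timedelta:
    """Turn a string such as "7d" or "12h" into a timedelta (assumed format)."""
    units = {"h": "hours", "d": "days", "w": "weeks"}
    value, unit = int(period[:-1]), period[-1]
    return timedelta(**{units[unit]: value})


def summarize_costs(records, group_by, period, now):
    """Sum cost and count calls per `group_by` value inside the period window."""
    cutoff = now - parse_period(period)
    totals = defaultdict(lambda: {"total_cost_usd": 0.0, "total_requests": 0})
    for rec in records:
        if rec["timestamp"] < cutoff:
            continue  # record is older than the requested window
        bucket = totals[rec[group_by]]
        bucket["total_cost_usd"] += rec["cost_usd"]
        bucket["total_requests"] += 1
    return dict(totals)


# Made-up sample data; the fixed `now` keeps the example deterministic.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
records = [
    {"feature": "summarize", "cost_usd": 0.25, "timestamp": now - timedelta(days=1)},
    {"feature": "summarize", "cost_usd": 0.5, "timestamp": now - timedelta(days=3)},
    {"feature": "chat", "cost_usd": 0.125, "timestamp": now - timedelta(days=2)},
    {"feature": "chat", "cost_usd": 1.0, "timestamp": now - timedelta(days=30)},
]
summary = summarize_costs(records, group_by="feature", period="7d", now=now)
for name, b in sorted(summary.items()):
    print(f"{name}: ${b['total_cost_usd']:.4f} ({b['total_requests']} calls)")
```

The 30-day-old "chat" record falls outside the 7-day window, so only three of the four sample records contribute to the totals.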

CLI

llmwatch report --group-by feature --period 7d
llmwatch export costs.csv --format csv
llmwatch pricing list --provider openai
llmwatch pricing sync

Web Dashboard

pip install "llmwatch[dashboard]"
llmwatch dashboard
# Opens at http://localhost:8000

Multiple Database Backends

By default, llmwatch uses SQLite (~/.llmwatch/usage.db). Switch to other backends by passing a storage instance:

from llmwatch.tracker import LLMWatch
from llmwatch.databases.sqlalchemy import Storage

# PostgreSQL
watcher = LLMWatch(
    client=client,
    storage=Storage("postgresql+asyncpg://user:password@localhost/llmwatch"),
)

# MySQL
watcher = LLMWatch(
    client=client,
    storage=Storage("mysql+aiomysql://user:password@localhost/llmwatch"),
)

MongoDB

from llmwatch.tracker import LLMWatch
from llmwatch.databases.mongo import MongoStorage

watcher = LLMWatch(
    client=client,
    storage=MongoStorage("mongodb://localhost:27017", database="llmwatch"),
)

Manual Recording (for HTTP-based APIs)

For providers without a Python SDK (e.g., Jina reranker via httpx), use record_usage():

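import httpx

# JINA_API_KEY, query, and docs are assumed to be defined by your application.
# The context manager closes the HTTP client once the request completes.
async with httpx.AsyncClient() as http:
    response = await http.post(
        "https://api.jina.ai/v1/rerank",
        headers={"Authorization": f"Bearer {JINA_API_KEY}"},
        json={"model": "jina-reranker-v3", "query": query, "documents": docs},
    )
response.raise_for_status()  # fail early on HTTP errors instead of recording bad data
data = response.json()

await watcher.record_usage(
    model="jina-reranker-v3",
    provider="jina",
    input_tokens=data["usage"]["total_tokens"],
    feature="search",
)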

Custom Provider Registration

Register your own provider extractor and instrumentor:

from llmwatch.extractors.base import register_extractor
from llmwatch.instrument import register_instrumentor

register_extractor("my_llm", my_extract_fn, module_prefix="my_llm_sdk")
register_instrumentor("my_llm", my_instrumentor_fn)
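
The signature that register_extractor expects from my_extract_fn is not documented in this section, so the following is only a guess at the general shape of an extractor: a function that reads the model name and token counts off a response object. The return format (a plain dict with model, input_tokens, and output_tokens) and the attribute names prompt_tokens and completion_tokens are assumptions; check the llmwatch source before relying on them.

```python
from types import SimpleNamespace
from typing import Any


def my_extract_fn(response: Any) -> dict[str, Any]:
    """Hypothetical extractor: normalize a provider response into token counts."""
    usage = getattr(response, "usage", None)
    return {
        "model": getattr(response, "model", "unknown"),
        # Fall back to 0 when the provider omits usage or reports None.
        "input_tokens": getattr(usage, "prompt_tokens", 0) or 0,
        "output_tokens": getattr(usage, "completion_tokens", 0) or 0,
    }


# A stand-in for an SDK response object, used only to exercise the function.
fake_response = SimpleNamespace(
    model="my-model-v1",
    usage=SimpleNamespace(prompt_tokens=120, completion_tokens=45),
)
extracted = my_extract_fn(fake_response)
```

Defaulting missing fields to zero keeps a malformed response from raising inside the tracking path, at the cost of silently under-counting; a stricter extractor could raise instead.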

CLI Reference

Command                 Description
llmwatch report         Generate cost report (--group-by, --period)
llmwatch export         Export usage records to CSV or JSON
llmwatch prune          Delete old records by date
llmwatch stats          Show database statistics
llmwatch pricing list   List pricing data by provider
llmwatch pricing sync   Sync pricing data from upstream
llmwatch dashboard      Start interactive web dashboard

How It Works

  1. Instrument — LLMWatch(client=client) patches the SDK client's methods
  2. Extract — On each LLM call, extractors normalize the response (handles OpenAI, Anthropic, Google, Cohere, VoyageAI, streaming)
  3. Calculate — calculate_cost() computes USD cost using bundled pricing data
  4. Store — Storage.save() persists the UsageRecord to your database
  5. Tag — @watcher.tracked() provides tag context (feature, user_id, environment)
  6. Alert — Optional BudgetAlert callbacks trigger when thresholds are exceeded
  7. Report — Reporter generates cost summaries grouped by feature, user, model, or provider

LLM Call
  | (via instrumented SDK client)
Extract Response -> Calculate Cost -> Save Record + Tags
  |
Database
  | (queried by Reporter)
Reports, Dashboards, Exports
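
The extract, calculate, and store steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not llmwatch's code: the names calculate_cost and UsageRecord are reused here only for readability and their real signatures may differ, the track helper and the in-memory list are hypothetical, and the prices are invented rather than taken from the bundled pricing data. The sketch assumes prices are quoted in USD per one million tokens.

```python
from dataclasses import dataclass

# Invented prices in USD per one million tokens; not real pricing data.
PRICES_PER_MTOK = {
    "example-model": {"input": 2.00, "output": 8.00},
}


@dataclass(frozen=True)
class UsageRecord:
    """Local stand-in for a stored usage row."""
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: float
    feature: str


def calculate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost = tokens x price per token, summed over input and output."""
    price = PRICES_PER_MTOK[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def track(store: list, response: dict, feature: str) -> UsageRecord:
    """Extract usage from a response dict, price it, and append a record."""
    usage = response["usage"]
    record = UsageRecord(
        model=response["model"],
        input_tokens=usage["prompt_tokens"],
        output_tokens=usage["completion_tokens"],
        cost_usd=calculate_cost(
            response["model"], usage["prompt_tokens"], usage["completion_tokens"]
        ),
        feature=feature,
    )
    store.append(record)  # a list stands in for the database
    return record


store: list[UsageRecord] = []
fake_response = {
    "model": "example-model",
    "usage": {"prompt_tokens": 1500, "completion_tokens": 500},
}
record = track(store, fake_response, feature="summarize")
```

With these invented prices, 1,500 input tokens and 500 output tokens cost (1500 x 2.00 + 500 x 8.00) / 1,000,000 = 0.007 USD.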

Development

uv sync --group dev
uv run pytest tests/ -v
uv run ruff check src/ tests/
uv run mypy src/llmwatch/

License

MIT


Pricing data sourced from pydantic/genai-prices.
