Track LLM costs in LangChain applications with zero code changes

AgentCost SDK

Zero-friction LLM cost tracking for LangChain applications.

Installation

pip install agentcost

Or install from source:

cd agentcost-sdk
pip install -e .

Quick Start

from agentcost import track_costs

# Two lines to add cost tracking: the import above and this init call
track_costs.init(api_key="your_api_key", project_id="my-project")

# Your existing code works unchanged
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
response = llm.invoke("Hello!")  # Automatically tracked

Features

  • Zero Code Changes: Monkey-patches LangChain, so your existing LLM calls work as-is
  • Automatic Tracking: Captures all invoke(), ainvoke(), stream(), astream() calls
  • Accurate Tokens: Uses tiktoken for precise token counting
  • Real-Time Costs: Calculates costs using up-to-date model pricing
  • Batched Sending: Efficient network usage (size-based + time-based batching)
  • Rate Limiting: Built-in rate limiter to protect your backend
  • Local Mode: Test without a backend
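The monkey patching mentioned in the first feature can be illustrated with a minimal sketch. This is not AgentCost's actual implementation; `FakeLLM`, `patch_invoke`, and `recorded` are hypothetical names used only to show how wrapping a method lets existing call sites stay unchanged.

```python
import functools
import time


class FakeLLM:
    """Stand-in for a chat model class; hypothetical, not a LangChain class."""

    def invoke(self, prompt):
        return f"echo: {prompt}"


recorded = []  # hypothetical in-memory event sink


def patch_invoke(cls):
    """Replace cls.invoke with a wrapper that records latency and success."""
    original = cls.invoke

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        event = {"success": True, "error": None}
        try:
            return original(self, *args, **kwargs)
        except Exception as exc:
            event["success"] = False
            event["error"] = str(exc)
            raise
        finally:
            # Runs on both success and failure, so every call yields one event
            event["latency_ms"] = (time.perf_counter() - start) * 1000
            recorded.append(event)

    cls.invoke = wrapper


patch_invoke(FakeLLM)
response = FakeLLM().invoke("Hello!")  # call site is unchanged
```

Because the wrapper is installed on the class, instances created before or after patching both go through it, which is one plausible reason a single init call can cover existing code.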

Configuration

track_costs.init(
    # Required for cloud mode
    api_key="sk_...",
    project_id="my-project",

    # Optional settings
    base_url="https://api.agentcost.tech",   # Your backend URL
    batch_size=10,                          # Events before auto-flush
    flush_interval=5.0,                     # Seconds between flushes
    debug=True,                             # Enable debug logging
    default_agent_name="my-agent",          # Default agent tag
    local_mode=False,                       # Store locally (no backend)
    enabled=True,                           # Enable/disable tracking

    # Custom pricing (overrides defaults; USD per 1K tokens, judging by the sample event under Event Structure)
    custom_pricing={
        "my-custom-model": {"input": 0.001, "output": 0.002}
    },

    # Global metadata (attached to all events)
    global_metadata={
        "environment": "production",
        "version": "1.0.0"
    }
)
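To show how batch_size and flush_interval might interact, here is a minimal sketch of size-based plus time-based batching. `SimpleBatcher` is a hypothetical class, not the SDK's batcher, and it simplifies by checking the interval only when an event is added rather than from a background thread.

```python
import time


class SimpleBatcher:
    """Hypothetical batcher: flushes when full or when the interval elapses."""

    def __init__(self, send, batch_size=10, flush_interval=5.0, clock=time.monotonic):
        self.send = send
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.clock = clock  # injectable so the timing can be tested
        self.pending = []
        self.last_flush = clock()

    def add(self, event):
        self.pending.append(event)
        full = len(self.pending) >= self.batch_size
        stale = self.clock() - self.last_flush >= self.flush_interval
        if full or stale:
            self.flush()

    def flush(self):
        if self.pending:
            self.send(list(self.pending))
            self.pending.clear()
        self.last_flush = self.clock()


sent_batches = []
batcher = SimpleBatcher(sent_batches.append, batch_size=3, flush_interval=5.0)
for i in range(7):
    batcher.add({"n": i})
batcher.flush()  # send the remainder, as a shutdown hook would
```

In this sketch a smaller batch_size means more requests with lower delay, while flush_interval bounds how long a quiet application holds events before sending them.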

Agent Tagging

Tag LLM calls by agent for granular analytics:

# Option 1: Set default agent
track_costs.set_agent_name("router-agent")

# Option 2: Context manager (recommended)
with track_costs.agent("technical-agent"):
    llm.invoke("How do I fix this?")  # Tagged as "technical-agent"

with track_costs.agent("billing-agent"):
    llm.invoke("What's my balance?")  # Tagged as "billing-agent"
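One way such a context manager could work is sketched below with the standard library's contextvars module. The names `_current_agent`, `agent`, and `current_agent` are hypothetical and the SDK's internals may differ; the point is only that the previous tag is restored when the block exits.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical module-level state; the SDK's internals may differ.
_current_agent = contextvars.ContextVar("current_agent", default="default-agent")


@contextmanager
def agent(name):
    """Tag everything inside the with-block, then restore the previous tag."""
    token = _current_agent.set(name)
    try:
        yield
    finally:
        _current_agent.reset(token)


def current_agent():
    return _current_agent.get()


tags = []
with agent("technical-agent"):
    tags.append(current_agent())
    with agent("billing-agent"):  # nesting restores the outer tag on exit
        tags.append(current_agent())
    tags.append(current_agent())
tags.append(current_agent())
```

A scoped tag like this cannot leak into later calls the way a forgotten global default can, which may be why the context manager is the recommended option.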

Metadata

Attach custom metadata for filtering/grouping:

# Persistent metadata
track_costs.add_metadata("user_id", "user_123")
track_costs.add_metadata("tenant_id", "acme_corp")

# Temporary metadata (context manager)
with track_costs.metadata(conversation_id="conv_456", step="routing"):
    llm.invoke("Route this query")

Local Testing

Test without running a backend:

track_costs.init(local_mode=True, debug=True)

# Make LLM calls
llm.invoke("Hello!")
llm.invoke("World!")

# Retrieve captured events
events = track_costs.get_local_events()
for event in events:
    print(f"Model: {event['model']}")
    print(f"Tokens: {event['total_tokens']}")
    print(f"Cost: ${event['cost']:.6f}")
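Captured events can also be summarized in plain Python. The sketch below assumes local events are dicts with the fields listed under Event Structure; `summarize_by` is a hypothetical helper and `sample_events` is made-up data standing in for the return value of track_costs.get_local_events().

```python
from collections import defaultdict


def summarize_by(events, key):
    """Sum tokens and cost of event dicts, grouped by the given field."""
    totals = defaultdict(lambda: {"calls": 0, "total_tokens": 0, "cost": 0.0})
    for event in events:
        bucket = totals[event[key]]
        bucket["calls"] += 1
        bucket["total_tokens"] += event["total_tokens"]
        bucket["cost"] += event["cost"]
    return dict(totals)


# Made-up sample data standing in for track_costs.get_local_events()
sample_events = [
    {"agent_name": "router-agent", "model": "gpt-4", "total_tokens": 230, "cost": 0.0093},
    {"agent_name": "router-agent", "model": "gpt-4", "total_tokens": 100, "cost": 0.0040},
    {"agent_name": "billing-agent", "model": "gpt-4o-mini", "total_tokens": 500, "cost": 0.0002},
]

for name, row in summarize_by(sample_events, "agent_name").items():
    print(f"{name}: {row['calls']} calls, {row['total_tokens']} tokens, ${row['cost']:.6f}")
```

Passing "model" instead of "agent_name" groups the same events by model.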

Streaming Support

Streaming calls are automatically tracked:

# Sync streaming
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after stream completes

# Async streaming (async for must run inside a coroutine)
import asyncio

async def main():
    async for chunk in llm.astream("Tell me a story"):
        print(chunk.content, end="")
    # Event recorded after stream completes

asyncio.run(main())
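Recording an event only after the stream completes can be done by wrapping the iterator in a generator. The following is a sketch under that assumption, not the SDK's code; `track_stream` and the "chunks" field are hypothetical, and a plain iterator of strings stands in for llm.stream(...).

```python
import time


def track_stream(stream, on_complete):
    """Yield chunks unchanged; report once the stream ends or is closed."""
    start = time.perf_counter()
    chunks = 0
    try:
        for chunk in stream:
            chunks += 1
            yield chunk
    finally:
        # Reached when the stream is exhausted, raises, or is closed early
        on_complete({
            "streaming": True,
            "chunks": chunks,
            "latency_ms": (time.perf_counter() - start) * 1000,
        })


events = []
fake_stream = iter(["Once ", "upon ", "a ", "time"])  # stand-in for llm.stream(...)
text = "".join(track_stream(fake_stream, events.append))
```

In this sketch the measured latency covers the whole stream, including any time the caller spends between chunks, so it is not directly comparable to the latency of a non-streaming call.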

Event Structure

Each tracked event contains:

{
    "agent_name": "my-agent",
    "model": "gpt-4",
    "input_tokens": 150,
    "output_tokens": 80,
    "total_tokens": 230,
    "cost": 0.0093,            # USD, real-time calculated
    "latency_ms": 1234,        # Measured latency
    "timestamp": "2026-01-23T10:30:45.123Z",
    "success": True,
    "error": None,
    "streaming": False,
    "metadata": {"conversation_id": "conv_456"}
}
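The cost in the sample event can be reproduced by hand if prices are quoted per 1,000 tokens. That unit is an inference from the numbers rather than something this README states, and `estimate_cost` is a hypothetical helper, not part of the SDK. The gpt-4 input price of 0.03 appears in the pricing example under Dynamic Pricing; the output price of 0.06 is an assumption chosen because it matches the sample event.

```python
def estimate_cost(input_tokens, output_tokens, price):
    """Cost in USD, assuming prices are quoted per 1,000 tokens."""
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1000


# Input price taken from the pricing example in this document;
# output price is an assumption that matches the sample event.
gpt4_price = {"input": 0.03, "output": 0.06}

# 150 * 0.03 / 1000 = 0.0045 and 80 * 0.06 / 1000 = 0.0048
cost = estimate_cost(150, 80, gpt4_price)
print(f"${cost:.4f}")  # prints $0.0093
```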

Dynamic Pricing (Real-Time Updates)

The SDK automatically fetches the latest pricing from the backend. This means:

  • No code changes when model prices change
  • Pricing is cached for 24 hours, so the backend is not queried on every call
  • Falls back to built-in defaults if backend is unavailable
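The caching and fallback behavior described above could look roughly like this. It is a sketch only: `PricingCache`, `DEFAULT_PRICING`, and the injected `fetch` callable are hypothetical, and the default prices are illustrative values rather than the SDK's built-in table.

```python
import time

DEFAULT_PRICING = {"gpt-4": {"input": 0.03, "output": 0.06}}  # illustrative defaults
CACHE_TTL_SECONDS = 24 * 60 * 60


class PricingCache:
    """Hypothetical cache: refetch after the TTL, fall back when fetching fails."""

    def __init__(self, fetch, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self.fetch = fetch
        self.ttl = ttl
        self.clock = clock
        self.pricing = None
        self.fetched_at = None

    def get(self):
        expired = self.fetched_at is None or self.clock() - self.fetched_at >= self.ttl
        if expired:
            try:
                self.pricing = self.fetch()
                self.fetched_at = self.clock()
            except Exception:
                # Backend unavailable: keep the last good copy, else the defaults
                if self.pricing is None:
                    self.pricing = dict(DEFAULT_PRICING)
        return self.pricing


def unavailable_backend():
    raise ConnectionError("backend unreachable")


fallback = PricingCache(unavailable_backend).get()  # returns the defaults
```

One design choice in this sketch: a failed fetch does not reset the timer, so the next call tries the backend again instead of waiting a full 24 hours.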

How It Works

# SDK automatically fetches pricing from backend
track_costs.init(
    api_key="...",
    project_id="...",
    base_url="http://localhost:8000",  # Pricing fetched from here
)

# Prices are fetched once and cached
# GET http://localhost:8000/v1/pricing → {"pricing": {"gpt-4": {"input": 0.03, ...}}}

Manually Update Pricing

from agentcost.cost_calculator import refresh_pricing, update_pricing

# Force refresh from backend
refresh_pricing()

# Or manually set pricing (doesn't require backend)
update_pricing({
    "my-custom-model": {"input": 0.001, "output": 0.002}
})

Backend Pricing API

# Get all pricing
curl http://localhost:8000/v1/pricing

# Get specific model
curl http://localhost:8000/v1/pricing/gpt-4

# Update pricing (admin)
curl -X POST http://localhost:8000/v1/pricing \
  -H "Content-Type: application/json" \
  -d '{"gpt-4": {"input": 0.025, "output": 0.05}}'
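The same update request can be built from Python with only the standard library. This sketch mirrors the curl example above and assumes the same local backend URL; `build_pricing_update` is a hypothetical helper, and the request is constructed but not sent so that it runs without a backend.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same local backend as the curl examples


def build_pricing_update(pricing):
    """Build (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/pricing",
        data=json.dumps(pricing).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_pricing_update({"gpt-4": {"input": 0.025, "output": 0.05}})

# To send it against a running backend:
# with urllib.request.urlopen(request, timeout=5) as resp:
#     print(resp.status)
```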

Supported Models (30+)

Provider    Models
OpenAI      gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1, o1-mini
Anthropic   claude-3-opus/sonnet/haiku, claude-3.5-sonnet/haiku
Google      gemini-pro, gemini-1.5-pro/flash, gemini-2.0-flash
Groq        llama-3.1-8b/70b, llama-3.3-70b, mixtral-8x7b
DeepSeek    deepseek-chat, deepseek-coder, deepseek-reasoner
Cohere      command, command-r, command-r-plus
Mistral     mistral-small/medium/large

Statistics

stats = track_costs.get_stats()
print(f"Events sent: {stats['batcher']['events_sent']}")
print(f"Batches sent: {stats['batcher']['batches_sent']}")

Graceful Shutdown

track_costs.flush()     # Send pending events
track_costs.shutdown()  # Full shutdown

License

MIT License
