Track LLM costs across OpenAI, Anthropic, LangChain, and 1900+ models with zero code changes

AgentCost SDK

Zero-friction LLM cost tracking for LangChain applications.

Installation

pip install agentcost

Or install from source:

cd agentcost-sdk
pip install -e .

Quick Start

from agentcost import track_costs

# 2 lines to add cost tracking!
track_costs.init(api_key="your_api_key", project_id="my-project")

# Your existing code works unchanged
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
response = llm.invoke("Hello!")  # Automatically tracked

Features

  • Zero Code Changes: Monkey-patches LangChain, so your existing code works as-is
  • Automatic Tracking: Captures all invoke(), ainvoke(), stream(), astream() calls
  • Accurate Tokens: Uses tiktoken for precise token counting
  • Real-Time Costs: Calculates costs using up-to-date model pricing
  • Batched Sending: Efficient network usage (size-based + time-based batching)
  • Rate Limiting: Built-in rate limiter to protect your backend
  • Local Mode: Test without a backend
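
The "zero code changes" behaviour rests on monkey patching. The following is a minimal sketch of that general technique, not the SDK's actual implementation: `FakeLLM`, `patch_invoke`, and `captured_events` are hypothetical names, and a stand-in class replaces LangChain so the example runs on its own.

```python
import functools
import time

captured_events = []  # hypothetical in-memory event sink


class FakeLLM:
    """Stand-in for a chat model class; this is not a LangChain class."""

    def invoke(self, prompt):
        return f"echo: {prompt}"


def patch_invoke(cls):
    """Replace cls.invoke with a wrapper that records latency and outcome."""
    original = cls.invoke

    @functools.wraps(original)
    def tracked_invoke(self, *args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return original(self, *args, **kwargs)
        except Exception as exc:
            error = str(exc)
            raise
        finally:
            # Runs on both success and failure, so every call is recorded.
            captured_events.append({
                "latency_ms": (time.perf_counter() - start) * 1000,
                "success": error is None,
                "error": error,
            })

    cls.invoke = tracked_invoke


patch_invoke(FakeLLM)
response = FakeLLM().invoke("Hello!")  # caller code is unchanged
```

Because the wrapper is installed on the class, instances created before or after patching are both tracked.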

Configuration

track_costs.init(
    # Required for cloud mode
    api_key="sk_...",
    project_id="my-project",

    # Optional settings
    base_url="https://api.agentcost.tech",   # Your backend URL
    batch_size=10,                          # Events before auto-flush
    flush_interval=5.0,                     # Seconds between flushes
    debug=True,                             # Enable debug logging
    default_agent_name="my-agent",          # Default agent tag
    local_mode=False,                       # Store locally (no backend)
    enabled=True,                           # Enable/disable tracking

    # Custom pricing (overrides defaults)
    custom_pricing={
        "my-custom-model": {"input": 0.001, "output": 0.002}
    },

    # Global metadata (attached to all events)
    global_metadata={
        "environment": "production",
        "version": "1.0.0"
    }
)
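
To illustrate how `batch_size` and `flush_interval` might interact, here is a rough sketch of size-based plus time-based batching. `EventBatcher` is a hypothetical class written for this example, not the SDK's batcher, and it only checks elapsed time when an event is added, whereas a real implementation would likely use a background timer.

```python
import time


class EventBatcher:
    """Buffers events and flushes on size or elapsed time (hypothetical)."""

    def __init__(self, send, batch_size=10, flush_interval=5.0, clock=time.monotonic):
        self._send = send                    # callable that receives a list of events
        self._batch_size = batch_size
        self._flush_interval = flush_interval
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def add(self, event):
        self._buffer.append(event)
        too_big = len(self._buffer) >= self._batch_size
        too_old = self._clock() - self._last_flush >= self._flush_interval
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()
        self._last_flush = self._clock()


sent_batches = []
batcher = EventBatcher(sent_batches.append, batch_size=3, flush_interval=5.0)
for i in range(7):
    batcher.add({"id": i})
batcher.flush()  # drain the remainder, as a shutdown hook would
```

With `batch_size=3`, seven events go out as two full batches plus one final partial batch sent by the explicit flush.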

Agent Tagging

Tag LLM calls by agent for granular analytics:

# Option 1: Set default agent
track_costs.set_agent_name("router-agent")

# Option 2: Context manager (recommended)
with track_costs.agent("technical-agent"):
    llm.invoke("How do I fix this?")  # Tagged as "technical-agent"

with track_costs.agent("billing-agent"):
    llm.invoke("What's my balance?")  # Tagged as "billing-agent"
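
A scoped tag like this is commonly built on `contextvars`, which also keeps tags separate across threads and asyncio tasks. The sketch below shows that pattern under the assumption that the SDK does something similar; `agent`, `current_agent`, and the `"default-agent"` fallback are names chosen for this example only.

```python
import contextvars
from contextlib import contextmanager

_current_agent = contextvars.ContextVar("current_agent", default="default-agent")


@contextmanager
def agent(name):
    """Temporarily set the agent tag, restoring the previous one on exit."""
    token = _current_agent.set(name)
    try:
        yield
    finally:
        _current_agent.reset(token)


def current_agent():
    return _current_agent.get()


tags = []
with agent("technical-agent"):
    tags.append(current_agent())
    with agent("billing-agent"):      # nested blocks override the outer tag
        tags.append(current_agent())
    tags.append(current_agent())      # outer tag is restored
tags.append(current_agent())          # back to the default outside any block
```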

Metadata

Attach custom metadata for filtering/grouping:

# Persistent metadata
track_costs.add_metadata("user_id", "user_123")
track_costs.add_metadata("tenant_id", "acme_corp")

# Temporary metadata (context manager)
with track_costs.metadata(conversation_id="conv_456", step="routing"):
    llm.invoke("Route this query")
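
The three metadata sources (global, persistent, temporary) have to be merged into one dictionary per event. The sketch below assumes a precedence of global, then persistent, then temporary, with later layers winning; the source does not state the SDK's actual merge order, and all function names here are stand-ins.

```python
from contextlib import contextmanager

global_metadata = {"environment": "production", "version": "1.0.0"}
persistent_metadata = {}
_temporary_stack = []  # one dict per active `with metadata(...)` block


def add_metadata(key, value):
    persistent_metadata[key] = value


@contextmanager
def metadata(**kwargs):
    _temporary_stack.append(kwargs)
    try:
        yield
    finally:
        _temporary_stack.pop()


def effective_metadata():
    """Merge layers; later layers override earlier ones (assumed order)."""
    merged = {**global_metadata, **persistent_metadata}
    for layer in _temporary_stack:
        merged.update(layer)
    return merged


add_metadata("user_id", "user_123")
with metadata(conversation_id="conv_456", step="routing"):
    inside = effective_metadata()
outside = effective_metadata()
```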

Local Testing

Test without running a backend:

track_costs.init(local_mode=True, debug=True)

# Make LLM calls
llm.invoke("Hello!")
llm.invoke("World!")

# Retrieve captured events
events = track_costs.get_local_events()
for event in events:
    print(f"Model: {event['model']}")
    print(f"Tokens: {event['total_tokens']}")
    print(f"Cost: ${event['cost']:.6f}")
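
Once events are available as dictionaries, they can be summarized with plain Python. The following sketch groups events by model; it uses hand-written sample events with the same `model`, `total_tokens`, and `cost` keys as above instead of calling the SDK, and `summarize_by_model` is a hypothetical helper.

```python
from collections import defaultdict

# Sample events with the keys printed above; the values are illustrative only.
events = [
    {"model": "gpt-4", "total_tokens": 230, "cost": 0.0093},
    {"model": "gpt-4", "total_tokens": 100, "cost": 0.0040},
    {"model": "gpt-4o-mini", "total_tokens": 500, "cost": 0.0002},
]


def summarize_by_model(events):
    """Return per-model call counts, token totals, and cost totals."""
    totals = defaultdict(lambda: {"calls": 0, "total_tokens": 0, "cost": 0.0})
    for event in events:
        row = totals[event["model"]]
        row["calls"] += 1
        row["total_tokens"] += event["total_tokens"]
        row["cost"] += event["cost"]
    return dict(totals)


summary = summarize_by_model(events)
for model, row in summary.items():
    print(f"{model}: {row['calls']} calls, {row['total_tokens']} tokens, ${row['cost']:.6f}")
```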

Streaming Support

Streaming calls are automatically tracked:

# Sync streaming
for chunk in llm.stream("Tell me a story"):
    print(chunk.content, end="")
# Event recorded after stream completes

# Async streaming ("async for" must run inside a coroutine)
import asyncio

async def main():
    async for chunk in llm.astream("Tell me a story"):
        print(chunk.content, end="")
    # Event recorded after stream completes

asyncio.run(main())
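
Recording a single event only after the stream finishes can be done with a generator wrapper. This is a sketch of that idea under the assumption that the SDK works along similar lines; `tracked_stream` and `recorded` are hypothetical, and plain strings stand in for LangChain chunks.

```python
import time

recorded = []  # hypothetical event sink


def tracked_stream(chunks):
    """Yield chunks unchanged and record one event when the stream ends."""
    start = time.perf_counter()
    pieces = []
    try:
        for chunk in chunks:
            pieces.append(chunk)
            yield chunk
    finally:
        # Also runs if the consumer stops early and the generator is closed.
        recorded.append({
            "streaming": True,
            "output_text": "".join(pieces),
            "latency_ms": (time.perf_counter() - start) * 1000,
        })


events_seen_mid_stream = None
for i, chunk in enumerate(tracked_stream(["Once ", "upon ", "a time"])):
    if i == 0:
        events_seen_mid_stream = len(recorded)  # nothing recorded yet
```

Accumulating the chunks inside the wrapper is what makes it possible to count output tokens once the full text is known.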

Event Structure

Each tracked event contains:

{
    "agent_name": "my-agent",
    "model": "gpt-4",
    "input_tokens": 150,
    "output_tokens": 80,
    "total_tokens": 230,
    "cost": 0.0093,            # USD, real-time calculated
    "latency_ms": 1234,        # Measured latency
    "timestamp": "2026-01-23T10:30:45.123Z",
    "success": True,
    "error": None,
    "streaming": False,
    "metadata": {"conversation_id": "conv_456"}
}
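
The `cost` in the example event can be reproduced by hand. The sketch below assumes prices are expressed in USD per 1,000 tokens and uses an output price of 0.06 for gpt-4; neither is stated in this README, but together with the 0.03 input price they give the 0.0093 shown above. `estimate_cost` is a hypothetical helper, not an SDK function.

```python
def estimate_cost(input_tokens, output_tokens, price):
    """Cost in USD, assuming `price` holds USD per 1,000 tokens."""
    input_cost = (input_tokens / 1000) * price["input"]
    output_cost = (output_tokens / 1000) * price["output"]
    return input_cost + output_cost


gpt4_price = {"input": 0.03, "output": 0.06}  # assumed per-1K-token prices
cost = estimate_cost(150, 80, gpt4_price)     # 0.0045 + 0.0048
print(f"${cost:.4f}")                         # prints $0.0093
```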

Dynamic Pricing (Real-Time Updates)

The SDK automatically fetches the latest pricing from the backend. This means:

  • No code changes when model prices change
  • Pricing is cached for 24 hours, so the backend is not queried on every call
  • Falls back to built-in defaults if backend is unavailable
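
The caching and fallback behaviour listed above can be pictured as a small time-to-live cache. This is an illustrative sketch only: `PricingCache`, `DEFAULT_PRICING`, and the fetch callables are hypothetical, and serving stale data before falling back to defaults is an assumption about a reasonable policy, not documented SDK behaviour.

```python
import time

DEFAULT_PRICING = {"gpt-4": {"input": 0.03, "output": 0.06}}  # illustrative defaults
CACHE_TTL_SECONDS = 24 * 60 * 60


class PricingCache:
    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch      # callable returning a pricing dict
        self._clock = clock
        self._pricing = None
        self._fetched_at = None

    def get(self):
        fresh = (
            self._pricing is not None
            and self._clock() - self._fetched_at < CACHE_TTL_SECONDS
        )
        if fresh:
            return self._pricing
        try:
            self._pricing = self._fetch()
            self._fetched_at = self._clock()
            return self._pricing
        except Exception:
            # Backend unreachable: serve stale data if any, else the defaults.
            return self._pricing or DEFAULT_PRICING


fetch_calls = []


def fake_fetch():
    fetch_calls.append(1)
    return {"gpt-4": {"input": 0.025, "output": 0.05}}


def failing_fetch():
    raise ConnectionError("backend unavailable")


cache = PricingCache(fake_fetch)
first = cache.get()
second = cache.get()                          # served from cache, no second fetch
fallback = PricingCache(failing_fetch).get()  # falls back to defaults
```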

How It Works

# SDK automatically fetches pricing from backend
track_costs.init(
    api_key="...",
    project_id="...",
    base_url="http://localhost:8000",  # If running locally
)

# Prices are fetched once and cached
# GET http://localhost:8000/v1/pricing → {"pricing": {"gpt-4": {"input": 0.03, ...}}}

Manually Update Pricing

from agentcost.cost_calculator import refresh_pricing, update_pricing

# Force refresh from backend
refresh_pricing()

# Or manually set pricing (doesn't require backend)
update_pricing({
    "my-custom-model": {"input": 0.001, "output": 0.002}
})

Backend Pricing API

# Get all pricing
curl http://localhost:8000/v1/pricing

# Get specific model
curl http://localhost:8000/v1/pricing/gpt-4

# Update pricing (admin)
curl -X POST http://localhost:8000/v1/pricing \
  -H "Content-Type: application/json" \
  -d '{"gpt-4": {"input": 0.025, "output": 0.05}}'
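
For scripts that prefer Python over curl, the update request above can be built with the standard library. The sketch only constructs the request so it runs without a backend; `build_pricing_update` is a hypothetical helper, and the endpoint and payload are copied from the curl example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same local backend as the curl examples


def build_pricing_update(pricing):
    """Build (but do not send) the POST request for a pricing update."""
    body = json.dumps(pricing).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/pricing",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_pricing_update({"gpt-4": {"input": 0.025, "output": 0.05}})
# To send it against a running backend: urllib.request.urlopen(request)
```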

Supported Models (1900+)

The SDK supports 1900+ models across 45+ providers through dynamic pricing sync with the backend. The model list and prices update automatically when the backend's pricing data changes.

Provider          Models
OpenAI            gpt-4, gpt-4-turbo, gpt-4o, gpt-4o-mini, gpt-3.5-turbo, o1, o1-mini
Anthropic         claude-3-opus/sonnet/haiku, claude-3.5-sonnet/haiku, claude-4-opus
Google            gemini-pro, gemini-1.5-pro/flash, gemini-2.0-flash
Groq              llama-3.1-8b/70b, llama-3.2-3b, llama-3.3-70b, mixtral-8x7b
DeepSeek          deepseek-chat, deepseek-coder, deepseek-reasoner
Cohere            command, command-light, command-r, command-r-plus
Mistral           mistral-small/medium/large
Together AI       llama-3-70b/8b-chat, meta-llama models
Replicate         Various open-source models
OpenRouter        Aggregated models from multiple providers
Perplexity        pplx models
xAI               Grok models
Amazon            Amazon Nova, Titan models
Azure             Azure OpenAI models
AWS               Bedrock models (Claude, Llama, Mistral)
Anyscale          Anyscale endpoints
Cerebras          Cerebras models
Cloudflare        Workers AI models
Databricks        DBRX, Meta Llama models
DeepInfra         Various hosted models
Fireworks         Fireworks AI models
Hyperbolic        Hyperbolic models
Jina AI           Embedding models
Lambda            Lambda models
MiniMax           MiniMax models
Moonshot          Moonshot models
Sambanova         Samba models
Voyage            Embedding models
IBM               watsonx models
AI21              AI21 Labs models
Aleph Alpha       Aleph Alpha models
Novita            Novita hosted models
Gradient AI       Gradient endpoints
Dashscope         Dashscope models (Alibaba)
Friendliai        Friendliai models
GMI               GMI models
Llamagate         Llamagate models
Morph             Morph models
NLP Cloud         NLP Cloud endpoints
Nscale            Nscale models
Oracle            OCI generative models
OVHCloud          OVHCloud models
Vercel            Vercel AI Gateway, v0 models
Weights & Biases  Wandb models
Zai               Zai models

Note: The full list of 1900+ models is dynamically loaded from the backend. Run track_costs.init() with a valid API key to access all supported models.

Statistics

stats = track_costs.get_stats()
print(f"Events sent: {stats['batcher']['events_sent']}")
print(f"Batches sent: {stats['batcher']['batches_sent']}")

Graceful Shutdown

track_costs.flush()     # Send pending events
track_costs.shutdown()  # Full shutdown
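
One way to make sure pending events are sent even when the application raises is to put both calls in a `finally` block. The sketch below uses a stub in place of `track_costs` so it runs standalone; `StubTracker` and `run_with_tracking` are hypothetical, and whether `shutdown()` already flushes on its own is not stated here, so the explicit `flush()` is kept.

```python
class StubTracker:
    """Hypothetical stand-in exposing flush() and shutdown()."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def shutdown(self):
        self.calls.append("shutdown")


def run_with_tracking(tracker, work):
    """Run `work`, then flush and shut down even if `work` raises."""
    try:
        return work()
    finally:
        tracker.flush()
        tracker.shutdown()


def failing_work():
    raise RuntimeError("application error")


tracker = StubTracker()
try:
    run_with_tracking(tracker, failing_work)
except RuntimeError:
    pass  # the error still propagates; the tracker was cleaned up first
```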

License

MIT License
