LLM Cost Guard

Python 3.9+ · License: MIT

Real-time cost tracking, budget enforcement, and usage analytics for LLM applications. Supports OpenAI, Anthropic, AWS Bedrock, and more.

Features

  • Real-time Cost Tracking: Track costs as they happen, not when the bill arrives
  • Budget Enforcement: Set limits with configurable actions (warn, throttle, block)
  • Multi-Provider Support: OpenAI, Anthropic, AWS Bedrock, Google Vertex AI
  • LangChain Integration: Native callback support for LangChain applications
  • Rate Limiting: Control request rates per model, provider, or custom tags
  • Hierarchical Tracking: Group related LLM calls with spans
  • Flexible Storage: In-memory, SQLite, PostgreSQL, Redis, DynamoDB backends
  • Zero External Dependencies: Works offline with no external services required

Installation

pip install llm-cost-guard

With optional integrations:

# LangChain support
pip install llm-cost-guard[langchain]

# AWS Bedrock support
pip install llm-cost-guard[bedrock]

# All optional dependencies
pip install llm-cost-guard[all]

Quick Start

Basic Usage

import openai

from llm_cost_guard import CostTracker

tracker = CostTracker()

# Decorator-based tracking
@tracker.track
def my_llm_call():
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    return response

result = my_llm_call()

# Check costs
print(tracker.last_call().total_cost)  # e.g. 0.0015 (USD)

With Budget Enforcement

from llm_cost_guard import CostTracker, Budget, BudgetAction

tracker = CostTracker(
    budgets=[
        Budget(
            name="daily",
            limit=10.00,
            period="day",
            action=BudgetAction.WARN
        ),
        Budget(
            name="monthly",
            limit=500.00,
            period="month",
            action=BudgetAction.BLOCK
        ),
    ]
)

# Get notified when approaching limits
@tracker.on_budget_warning
def handle_warning(budget, current):
    print(f"Warning: Budget '{budget.name}' at {current/budget.limit*100:.0f}%")

@tracker.on_budget_exceeded
def handle_exceeded(budget):
    print(f"Budget '{budget.name}' exceeded!")
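To make the warn and block actions concrete, here is a minimal, self-contained sketch of how a budget check could map current spend to an outcome. It is not the library's implementation: `Action`, `SketchBudget`, `evaluate`, and the 80% warning threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical stand-in for the actions named in the feature list."""
    WARN = "warn"
    THROTTLE = "throttle"
    BLOCK = "block"


@dataclass
class SketchBudget:
    """Illustrative budget; not the library's Budget class."""
    name: str
    limit: float          # spending limit in USD for the period
    action: Action        # what to do once the limit is reached
    warn_at: float = 0.8  # assumed warning threshold (fraction of the limit)


def evaluate(budget: SketchBudget, spent: float) -> str:
    """Return 'ok', 'warning', or the budget's action value for the given spend."""
    if spent >= budget.limit:
        return budget.action.value
    if spent >= budget.limit * budget.warn_at:
        return "warning"
    return "ok"


daily = SketchBudget(name="daily", limit=10.00, action=Action.BLOCK)
print(evaluate(daily, 5.00))   # ok
print(evaluate(daily, 8.50))   # warning
print(evaluate(daily, 10.25))  # block
```

In a real tracker the "warning" and limit-reached outcomes would be the natural points to fire callbacks such as the two handlers registered above.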

Manual Recording

# For custom integrations
record = tracker.record(
    provider="openai",
    model="gpt-4o",
    input_tokens=1234,
    output_tokens=567,
    tags={"team": "search", "feature": "autocomplete"}
)

print(record.total_cost)  # e.g. 0.0208 (USD; the exact value depends on the pricing table in use)

Wrapped Clients

from llm_cost_guard import CostTracker
from llm_cost_guard.clients import TrackedOpenAI

tracker = CostTracker()
client = TrackedOpenAI(tracker=tracker)

# Automatic tracking - no decorators needed
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

LangChain Integration

from langchain_openai import ChatOpenAI

from llm_cost_guard import CostTracker
from llm_cost_guard.integrations.langchain import CostTrackingCallback

tracker = CostTracker()

llm = ChatOpenAI(
    model="gpt-4o",
    callbacks=[CostTrackingCallback(tracker)]
)

result = llm.invoke("Hello!")
print(tracker.last_call().total_cost)

Hierarchical Tracking (Spans)

# Track costs for complex operations like agents.
# Assumes `tracker`, `agent`, and `query` are already defined.
with tracker.span("customer_support_agent", tags={"user_id": "123"}) as span:
    result = agent.invoke(query)

    print(span.total_cost)      # e.g. 0.45 (sum of all calls in the span)
    print(span.call_count)      # e.g. 5
    print(span.models_used)     # e.g. ["gpt-4o", "gpt-3.5-turbo"]
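The idea behind a span is that every call made inside the block is attributed to it, and totals are derived from those calls. The following is a rough, standalone sketch of that aggregation. `SketchSpan`, `add_call`, and the free function `span` are hypothetical names, not the library's API, and the costs are made-up values.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class SketchSpan:
    """Illustrative span that accumulates the calls made inside it."""
    name: str
    costs: list = field(default_factory=list)
    models: list = field(default_factory=list)

    def add_call(self, model: str, cost: float) -> None:
        """Attribute one LLM call to this span."""
        self.costs.append(cost)
        if model not in self.models:
            self.models.append(model)

    @property
    def total_cost(self) -> float:
        return sum(self.costs)

    @property
    def call_count(self) -> int:
        return len(self.costs)


@contextmanager
def span(name: str):
    """Yield a fresh span; calls recorded on it are summed on demand."""
    yield SketchSpan(name)


with span("customer_support_agent") as s:
    s.add_call("gpt-4o", 0.25)
    s.add_call("gpt-3.5-turbo", 0.125)
    s.add_call("gpt-4o", 0.125)

print(s.total_cost)  # 0.5
print(s.call_count)  # 3
print(s.models)      # ['gpt-4o', 'gpt-3.5-turbo']
```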

Configuration

Storage Backends

# In-memory (default, development)
tracker = CostTracker(backend="memory")

# SQLite (single-machine persistence)
tracker = CostTracker(backend="sqlite:///costs.db")

# PostgreSQL (production)
tracker = CostTracker(backend="postgresql://user:pass@host/db")

# Redis (distributed, real-time)
tracker = CostTracker(backend="redis://localhost:6379/0")
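The backend strings above follow the usual URL shape (scheme, host, port, path), with the bare word "memory" as the exception. As a small illustration of how such strings can be taken apart with the standard library, assuming nothing about how the library itself parses them (`backend_kind` is a hypothetical helper):

```python
from urllib.parse import urlsplit


def backend_kind(backend: str) -> str:
    """Return the URL scheme, or the string itself when it has no scheme."""
    return urlsplit(backend).scheme or backend


for value in [
    "memory",
    "sqlite:///costs.db",
    "postgresql://user:pass@host/db",
    "redis://localhost:6379/0",
]:
    print(backend_kind(value))

# The Redis URL carries host, port, and database number as separate parts.
parts = urlsplit("redis://localhost:6379/0")
print(parts.hostname, parts.port, parts.path)  # localhost 6379 /0
```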

Rate Limiting

from llm_cost_guard import CostTracker, RateLimit

tracker = CostTracker(
    rate_limits=[
        RateLimit(
            name="requests-per-minute",
            limit=100,
            period="minute",
            scope="global"
        ),
        RateLimit(
            name="user-requests",
            limit=10,
            period="minute",
            scope="tag:user_id"
        )
    ]
)
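A scope such as "tag:user_id" implies one counter per tag value rather than a single global counter. One common way to implement that is a sliding window keyed by the tag value. The sketch below shows the technique in isolation. `SlidingWindowLimiter` is a hypothetical class, the clock is passed in explicitly to keep the example deterministic, and the library may use a different algorithm.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `period_s` seconds for each key."""

    def __init__(self, limit: int, period_s: float):
        self.limit = limit
        self.period_s = period_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.period_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Per-user scope: the key plays the role of the user_id tag value.
limiter = SlidingWindowLimiter(limit=2, period_s=60)
print(limiter.allow("user-123", now=0.0))   # True
print(limiter.allow("user-123", now=1.0))   # True
print(limiter.allow("user-123", now=2.0))   # False (third request inside the window)
print(limiter.allow("user-456", now=2.0))   # True (separate key)
print(limiter.allow("user-123", now=61.0))  # True (both earlier requests have expired)
```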

Fail-Safe Modes

tracker = CostTracker(
    # Block LLM calls if tracking fails (strict)
    on_tracking_failure="block",

    # Allow LLM calls but log a warning (favors availability)
    # on_tracking_failure="allow",

    # Temporarily fall back to in-memory tracking
    # on_tracking_failure="fallback",
)
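The three modes trade strictness against availability. The sketch below illustrates the difference with a storage backend that always fails. `BrokenStore` and `track` are hypothetical names, the logging done in "allow" mode is omitted, and this is only one plausible reading of the modes, not the library's code.

```python
class BrokenStore:
    """Hypothetical storage backend that always fails, used to exercise each mode."""

    def write(self, record):
        raise ConnectionError("storage unavailable")


def track(record, store, fallback, mode):
    """Try to persist `record`; return True if the LLM call may proceed."""
    if mode not in ("block", "allow", "fallback"):
        raise ValueError(f"unknown mode: {mode!r}")
    try:
        store.write(record)
        return True
    except Exception:
        if mode == "block":
            return False             # strict: refuse the call when tracking fails
        if mode == "fallback":
            fallback.append(record)  # keep the record in memory for now
        return True                  # "allow" and "fallback" let the call through


fallback = []
record = {"model": "gpt-4o", "total_cost": 0.0015}
print(track(record, BrokenStore(), fallback, "block"))     # False
print(track(record, BrokenStore(), fallback, "allow"))     # True
print(track(record, BrokenStore(), fallback, "fallback"))  # True
print(len(fallback))                                       # 1
```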

CLI

# View current costs
llm-cost-guard status

# Generate report
llm-cost-guard report --period day --group-by model

# Check health
llm-cost-guard health

# List supported models and pricing
llm-cost-guard models --provider openai

# Export data
llm-cost-guard export --format csv --output costs.csv

Supported Providers

Provider          Models
OpenAI            GPT-4o, GPT-4, GPT-3.5, o1, Embeddings, DALL-E
Anthropic         Claude 3.5, Claude 3, Claude 2
AWS Bedrock       Claude, Titan, Llama, Mistral, Cohere
Google Vertex AI  Gemini 1.5, Gemini 1.0, PaLM 2

Reporting

# Daily summary
tracker.daily_report()

# Cost by model
tracker.report_by_model(period="week")

# Query with filters
report = tracker.get_costs(
    start_date="2024-01-01",
    end_date="2024-01-31",
    tags={"team": "search"},
    group_by=["model", "feature"]
)

# Export to DataFrame
df = tracker.to_dataframe()
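Grouping by several fields, as in group_by=["model", "feature"], amounts to summing costs per unique combination of those fields. A minimal standalone sketch of that aggregation follows. The record layout and the `group_costs` helper are assumptions for illustration and say nothing about the shape of the library's report objects.

```python
from collections import defaultdict

# Hypothetical flat records; in the library, "feature" would come from tags.
records = [
    {"model": "gpt-4o", "feature": "autocomplete", "total_cost": 0.25},
    {"model": "gpt-4o", "feature": "autocomplete", "total_cost": 0.5},
    {"model": "gpt-4o", "feature": "ranking", "total_cost": 0.125},
    {"model": "gpt-3.5-turbo", "feature": "ranking", "total_cost": 0.0625},
]


def group_costs(records, group_by):
    """Sum total_cost per unique combination of the group_by fields."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[name] for name in group_by)
        totals[key] += record["total_cost"]
    return dict(totals)


by_model_feature = group_costs(records, ["model", "feature"])
print(by_model_feature[("gpt-4o", "autocomplete")])  # 0.75

by_model = group_costs(records, ["model"])
print(by_model[("gpt-4o",)])  # 0.875
```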

Security

  • No API key logging: Keys are never stored, logged, or transmitted
  • No prompt storage by default: Only metadata (tokens, cost) stored
  • PII redaction: Optional redaction for user IDs
  • Encryption support: For SQL/Redis backends

tracker = CostTracker(
    store_prompts=False,          # Default: never store prompts
    redact_user_ids=True,         # Hash user IDs in storage
)
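Hashing user IDs keeps per-user grouping possible without storing the raw identifier. The README does not say which scheme redact_user_ids uses; the sketch below shows one common approach, a keyed SHA-256 digest (HMAC) from the standard library. `redact_user_id` and the secret value are hypothetical.

```python
import hashlib
import hmac


def redact_user_id(user_id: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw ID never reaches storage."""
    return hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


secret = b"replace-with-a-real-secret"  # hypothetical key; load from config in practice
a = redact_user_id("user-123", secret)
b = redact_user_id("user-123", secret)
c = redact_user_id("user-456", secret)

print(a == b)  # True: the same ID always maps to the same digest
print(a == c)  # False: different IDs map to different digests
print(len(a))  # 64 (hex characters)
```

A keyed digest is used here instead of a plain hash so that someone with access to storage cannot confirm a guessed user ID without also knowing the key.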

Custom Pricing

For negotiated enterprise rates:

tracker = CostTracker(
    pricing_overrides={
        "openai/gpt-4": {
            "input_cost_per_1k": 0.02,    # Your negotiated rate
            "output_cost_per_1k": 0.04,
        }
    }
)
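Per-1k pricing means the cost of a call is tokens divided by 1,000, multiplied by the rate, summed over input and output. The sketch below works through that arithmetic with the negotiated rates from the override above. The `DEFAULT_PRICING` values, the token counts, and the helper names are illustrative assumptions, not the library's pricing table or API.

```python
# Illustrative default rates in USD per 1,000 tokens (assumed, not the library's table).
DEFAULT_PRICING = {
    "openai/gpt-4": {"input_cost_per_1k": 0.03, "output_cost_per_1k": 0.06},
}

# Negotiated rates, matching the override shown above.
OVERRIDES = {
    "openai/gpt-4": {"input_cost_per_1k": 0.02, "output_cost_per_1k": 0.04},
}


def rates_for(model: str, overrides: dict) -> dict:
    """Overrides win over defaults, field by field."""
    return {**DEFAULT_PRICING.get(model, {}), **overrides.get(model, {})}


def call_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Cost in USD for one call, given per-1,000-token rates."""
    return (
        input_tokens / 1000 * rates["input_cost_per_1k"]
        + output_tokens / 1000 * rates["output_cost_per_1k"]
    )


list_price = call_cost(1234, 567, rates_for("openai/gpt-4", {}))
negotiated = call_cost(1234, 567, rates_for("openai/gpt-4", OVERRIDES))
print(round(list_price, 5))  # 0.07104
print(round(negotiated, 5))  # 0.04736
```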

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

License

MIT License - see LICENSE for details.

Author

Prashant Dudami
