LLM Cost Guard

Real-time cost tracking, budget enforcement, and usage analytics for LLM applications. Supports OpenAI, Anthropic, AWS Bedrock, and more.

Features

Real-time Cost Tracking: Track costs as they happen, not when the bill arrives
Budget Enforcement: Set limits with configurable actions (warn, throttle, block)
Multi-Provider Support: OpenAI, Anthropic, AWS Bedrock, Google Vertex AI
LangChain Integration: Native callback support for LangChain applications
Rate Limiting: Control request rates per model, provider, or custom tags
Hierarchical Tracking: Group related LLM calls with spans
Flexible Storage: In-memory, SQLite, and Redis backends today; PostgreSQL and DynamoDB are planned (see Current Limitations)
Zero External Dependencies: Works offline with no external services required

Installation

pip install llm-cost-guard

With optional integrations:

# LangChain support
pip install llm-cost-guard[langchain]

# AWS Bedrock support
pip install llm-cost-guard[bedrock]

# All optional dependencies
pip install llm-cost-guard[all]

Quick Start

Basic Usage

import openai
from llm_cost_guard import CostTracker

tracker = CostTracker()

# Decorator-based tracking
@tracker.track
def my_llm_call():
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    return response

result = my_llm_call()

# Check costs
print(tracker.last_call().total_cost)  # $0.0015

With Budget Enforcement

from llm_cost_guard import CostTracker, Budget, BudgetAction

tracker = CostTracker(
    budgets=[
        Budget(
            name="daily",
            limit=10.00,
            period="day",
            action=BudgetAction.WARN
        ),
        Budget(
            name="monthly",
            limit=500.00,
            period="month",
            action=BudgetAction.BLOCK
        ),
    ]
)

# Get notified when approaching limits
@tracker.on_budget_warning
def handle_warning(budget, current):
    print(f"Warning: Budget '{budget.name}' at {current/budget.limit*100:.0f}%")

@tracker.on_budget_exceeded
def handle_exceeded(budget):
    print(f"Budget '{budget.name}' exceeded!")
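The difference between the warn and block actions can be pictured with a small standalone sketch. It does not use llm_cost_guard and is not the library's implementation; `SketchBudget`, `Action`, `check_budget`, and the 80% warning threshold are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    WARN = "warn"
    BLOCK = "block"


@dataclass
class SketchBudget:
    name: str
    limit: float          # spending limit in dollars for the period
    action: Action
    spent: float = 0.0    # running total for the current period


class BudgetExceeded(Exception):
    """Raised when a BLOCK budget would be exceeded."""


def check_budget(budget: SketchBudget, cost: float, warn_ratio: float = 0.8) -> str:
    """Apply `cost` to `budget` and return 'ok', 'warning', or 'exceeded'."""
    projected = budget.spent + cost
    if projected > budget.limit:
        if budget.action is Action.BLOCK:
            # Refuse the call; the running total is left unchanged.
            raise BudgetExceeded(f"Budget '{budget.name}' exceeded")
        budget.spent = projected
        return "exceeded"
    budget.spent = projected
    if projected >= warn_ratio * budget.limit:
        return "warning"
    return "ok"


daily = SketchBudget(name="daily", limit=10.00, action=Action.WARN)
print(check_budget(daily, 5.00))   # ok
print(check_budget(daily, 3.50))   # warning (8.50 of 10.00)
print(check_budget(daily, 2.00))   # exceeded, but allowed because the action is WARN
```

In this sketch a WARN budget keeps counting past its limit, while a BLOCK budget raises before the cost is added.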

Manual Recording

# For custom integrations
record = tracker.record(
    provider="openai",
    model="gpt-4o",
    input_tokens=1234,
    output_tokens=567,
    tags={"team": "search", "feature": "autocomplete"}
)

print(record.total_cost)  # $0.0208

Wrapped Clients

from llm_cost_guard import CostTracker
from llm_cost_guard.clients import TrackedOpenAI

tracker = CostTracker()
client = TrackedOpenAI(tracker=tracker)

# Automatic tracking - no decorators needed
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

LangChain Integration

from langchain_openai import ChatOpenAI
from llm_cost_guard import CostTracker
from llm_cost_guard.integrations.langchain import CostTrackingCallback

tracker = CostTracker()

llm = ChatOpenAI(
    model="gpt-4o",
    callbacks=[CostTrackingCallback(tracker)]
)

result = llm.invoke("Hello!")
print(tracker.last_call().total_cost)

Hierarchical Tracking (Spans)

# Track costs for complex operations like agents
with tracker.span("customer_support_agent", tags={"user_id": "123"}) as span:
    result = agent.invoke(query)
    
    print(span.total_cost)      # $0.45 (sum of all calls)
    print(span.call_count)      # 5
    print(span.models_used)     # ["gpt-4o", "gpt-3.5-turbo"]
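To make the aggregation behind `total_cost`, `call_count`, and `models_used` concrete, here is a minimal standalone sketch of a span built on a context manager. It is an assumption about how such a span could work, not the library's code; `SketchSpan`, `span`, and `record` are hypothetical names, and the costs are made-up values.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class SketchSpan:
    name: str
    costs: list = field(default_factory=list)    # cost of each recorded call
    models: list = field(default_factory=list)   # model of each recorded call

    def record(self, model: str, cost: float) -> None:
        self.costs.append(cost)
        self.models.append(model)

    @property
    def total_cost(self) -> float:
        return sum(self.costs)

    @property
    def call_count(self) -> int:
        return len(self.costs)

    @property
    def models_used(self) -> list:
        # Unique models in first-seen order.
        return list(dict.fromkeys(self.models))


@contextmanager
def span(name: str):
    """Yield a span that collects the calls recorded inside the with-block."""
    yield SketchSpan(name)


with span("customer_support_agent") as s:
    s.record("gpt-4o", 0.25)
    s.record("gpt-3.5-turbo", 0.125)
    s.record("gpt-4o", 0.125)

print(s.total_cost)    # 0.5
print(s.call_count)    # 3
print(s.models_used)   # ['gpt-4o', 'gpt-3.5-turbo']
```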

Configuration

Storage Backends

# In-memory (default, development)
tracker = CostTracker(backend="memory")

# SQLite (single-machine persistence)
tracker = CostTracker(backend="sqlite:///costs.db")

# PostgreSQL (planned, not yet available; see Current Limitations)
tracker = CostTracker(backend="postgresql://user:pass@host/db")

# Redis (distributed, real-time)
tracker = CostTracker(backend="redis://localhost:6379/0")

Rate Limiting

from llm_cost_guard import CostTracker, RateLimit

tracker = CostTracker(
    rate_limits=[
        RateLimit(
            name="requests-per-minute",
            limit=100,
            period="minute",
            scope="global"
        ),
        RateLimit(
            name="user-requests",
            limit=10,
            period="minute",
            scope="tag:user_id"
        )
    ]
)
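The per-scope limits above can be illustrated with a sliding-window limiter that keeps one window per key. This is a standalone sketch under the assumption that a scope such as `tag:user_id` maps to one counter per tag value; `SlidingWindowLimiter` and its parameters are hypothetical and do not describe the library's internals.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `period_seconds` for each key."""

    def __init__(self, limit: int, period_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.period = period_seconds
        self.clock = clock
        self.hits = defaultdict(deque)   # key -> timestamps of recent requests

    def allow(self, key: str = "global") -> bool:
        now = self.clock()
        window = self.hits[key]
        # Drop timestamps that have fallen out of the window.
        while window and now - window[0] >= self.period:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True


# A fake clock keeps the example deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, period_seconds=60, clock=lambda: fake_now[0])

print(limiter.allow("user_id=123"))   # True
print(limiter.allow("user_id=123"))   # True
print(limiter.allow("user_id=123"))   # False, third request inside the same minute
print(limiter.allow("user_id=456"))   # True, each key has its own window
fake_now[0] = 61.0
print(limiter.allow("user_id=123"))   # True, the earlier requests have expired
```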

Fail-Safe Modes

tracker = CostTracker(
    # Block LLM calls if tracking fails (strict)
    on_tracking_failure="block",
    
    # Allow LLM calls but log warning (available)
    # on_tracking_failure="allow",
    
    # Use in-memory fallback temporarily
    # on_tracking_failure="fallback",
)
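The three modes differ only in what happens when a cost record cannot be stored. The sketch below shows that decision in plain Python; it is an illustration under assumed semantics, and `record_with_failsafe`, `TrackingUnavailable`, and the store callables are hypothetical names rather than library APIs.

```python
import logging

logger = logging.getLogger("cost_tracking_sketch")


class TrackingUnavailable(Exception):
    """Raised in 'block' mode when the cost record cannot be stored."""


def record_with_failsafe(record, primary_store, fallback_store, mode="block"):
    """Try to store `record`; on failure behave according to `mode`.

    Returns where the record ended up: 'primary', 'fallback', or 'dropped'.
    """
    try:
        primary_store(record)
        return "primary"
    except Exception as exc:
        if mode == "block":
            raise TrackingUnavailable("tracking failed; refusing the LLM call") from exc
        if mode == "allow":
            logger.warning("tracking failed, call allowed untracked: %s", exc)
            return "dropped"
        if mode == "fallback":
            fallback_store(record)
            return "fallback"
        raise ValueError(f"unknown mode: {mode!r}")


def broken_backend(record):
    raise ConnectionError("backend unreachable")


memory = []
print(record_with_failsafe({"cost": 0.01}, broken_backend, memory.append, mode="fallback"))  # fallback
print(record_with_failsafe({"cost": 0.01}, broken_backend, memory.append, mode="allow"))     # dropped
print(len(memory))  # 1
```

The trade-off is between strictness and availability: "block" never lets an untracked call through, "allow" never blocks traffic, and "fallback" keeps tracking locally until the backend recovers.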

CLI

# View current costs
llm-cost-guard status

# Generate report
llm-cost-guard report --period day --group-by model

# Check health
llm-cost-guard health

# List supported models and pricing
llm-cost-guard models --provider openai

# Export data
llm-cost-guard export --format csv --output costs.csv

Supported Providers

Provider	Models
OpenAI	GPT-4o, GPT-4, GPT-3.5, o1, Embeddings, DALL-E
Anthropic	Claude 3.5, Claude 3, Claude 2
AWS Bedrock	Claude, Titan, Llama, Mistral, Cohere
Google Vertex AI	Gemini 1.5, Gemini 1.0, PaLM 2

Reporting

# Daily summary
tracker.daily_report()

# Cost by model
tracker.report_by_model(period="week")

# Query with filters
report = tracker.get_costs(
    start_date="2024-01-01",
    end_date="2024-01-31",
    tags={"team": "search"},
    group_by=["model", "feature"]
)

# Export to DataFrame
df = tracker.to_dataframe()
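As a rough picture of what a filtered, grouped query computes, the following standalone sketch sums costs per `(model, feature)` pair after filtering on a tag. The record layout, the sample values, and the `group_costs` helper are assumptions made for illustration; the actual return type of `get_costs` is not shown in this README.

```python
from collections import defaultdict

# Made-up records; field names are assumed for this sketch.
records = [
    {"model": "gpt-4o", "feature": "autocomplete", "team": "search", "cost": 0.25},
    {"model": "gpt-4o", "feature": "autocomplete", "team": "search", "cost": 0.5},
    {"model": "gpt-4o", "feature": "ranking", "team": "search", "cost": 1.0},
    {"model": "gpt-4", "feature": "chat", "team": "support", "cost": 2.0},
]


def group_costs(records, group_by, **filters):
    """Sum `cost` per combination of the `group_by` keys, after filtering."""
    totals = defaultdict(float)
    for rec in records:
        if all(rec.get(key) == value for key, value in filters.items()):
            totals[tuple(rec[key] for key in group_by)] += rec["cost"]
    return dict(totals)


print(group_costs(records, ["model", "feature"], team="search"))
# {('gpt-4o', 'autocomplete'): 0.75, ('gpt-4o', 'ranking'): 1.0}
```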

Security

No API key logging: Keys are never stored, logged, or transmitted
No prompt storage by default: Only metadata (tokens, cost) stored
PII redaction: Optional redaction for user IDs
Encryption support: For SQL/Redis backends

tracker = CostTracker(
    store_prompts=False,          # Default: never store prompts
    redact_user_ids=True,         # Hash user IDs in storage
)
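The README says user IDs are hashed in storage but does not say how. One common approach, shown here purely as an assumption, is a keyed hash (HMAC-SHA256) so that the same ID always maps to the same token without being reversible. `redact_user_id` and the secret are hypothetical.

```python
import hashlib
import hmac


def redact_user_id(user_id: str, secret: bytes) -> str:
    """Return a stable, non-reversible token for `user_id`."""
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]   # truncated for readability in reports


secret = b"example-secret"   # hypothetical; load from configuration in practice
a = redact_user_id("123", secret)
b = redact_user_id("123", secret)
c = redact_user_id("456", secret)
print(a == b)   # True, the same ID always maps to the same token
print(a == c)   # False
```

Using a secret key rather than a bare hash makes it harder to recover IDs by hashing guesses, which matters when IDs are short or sequential.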

Audit Logging (v0.2.0+)

Enterprise-ready audit trails for compliance:

# AuditEventType is used below; its import path is assumed here
from llm_cost_guard import CostTracker, FileAuditBackend, AuditEventType

# Enable audit logging
tracker = CostTracker(
    audit_enabled=True,
    audit_backend=FileAuditBackend("audit.log"),
)

# Query audit history
events = tracker.audit.query(
    event_type=AuditEventType.BUDGET_EXCEEDED,
    start_date="2024-01-01",
)

# Get budget-specific history
history = tracker.audit.get_budget_history("daily")

Audit events include:

Budget created/modified/deleted
Budget warnings and exceeded events
Rate limit exceeded events
Tracking failures and fallback activations

Observability Metrics (v0.2.0+)

Track health and degradation:

# Get tracker metrics
metrics = tracker.get_metrics()
print(metrics)
# {
#   "backend_failures": 0,
#   "fallback_activations": 0,
#   "budget_exceeded_count": 3,
#   "tracking_errors": 0,
#   "using_fallback": False,
# }

# Health check
health = tracker.health_check()
print(health.healthy)  # True/False
print(health.errors)   # List of issues

Custom Pricing

For negotiated enterprise rates:

tracker = CostTracker(
    pricing_overrides={
        "openai/gpt-4": {
            "input_cost_per_1k": 0.02,    # Your negotiated rate
            "output_cost_per_1k": 0.04,
        }
    }
)
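Per-1k pricing turns into a per-call cost with simple arithmetic. The sketch below applies the override rates above to the token counts from the Manual Recording example; `call_cost` is a hypothetical helper, not a library function.

```python
def call_cost(input_tokens, output_tokens, input_cost_per_1k, output_cost_per_1k):
    """Cost in dollars for one call priced per 1,000 tokens."""
    input_cost = (input_tokens / 1000) * input_cost_per_1k
    output_cost = (output_tokens / 1000) * output_cost_per_1k
    return input_cost + output_cost


# 1,234 input tokens at $0.02/1k plus 567 output tokens at $0.04/1k
cost = call_cost(1234, 567, input_cost_per_1k=0.02, output_cost_per_1k=0.04)
print(f"${cost:.4f}")   # $0.0474
```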

Current Limitations

Feature status, including what is not yet production-ready:

Feature	Status	Notes
Distributed budgets (Redis)	✅ v0.2.0	Atomic operations with Lua scripts
Audit logging	✅ v0.2.0	File and logging backends
Graceful degradation metrics	✅ v0.2.0	Track failures and fallbacks
PostgreSQL backend	🚧 Planned	Use SQLite or Redis for now
DynamoDB backend	🚧 Planned	Use SQLite or Redis for now
Encryption at rest	🚧 Planned	Use encrypted volumes as workaround
Multi-tenancy optimization	🚧 Planned	Use tag-scoped budgets for now
Streaming cost estimation	⚠️ Limited	Actual cost tracked on completion
Fine-tuning cost tracking	❌ Not supported

Recommended for Production

Deployment Size	Backend	Notes
Single instance	SQLite	Simple, no setup
Multiple instances	Redis	Distributed budget enforcement
High-volume (>1k req/s)	Redis	With sampling (coming soon)

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

License

MIT License - see LICENSE for details.

Author

Prashant Dudami - AI/ML Architect & LLM Infrastructure Expert

Website: prashantdudami.com
LinkedIn: linkedin.com/in/prashantdudami
GitHub: github.com/prashantdudami

Release history

0.3.1

Feb 4, 2026

This version

0.2.0

Feb 3, 2026

0.1.2

Feb 3, 2026

0.1.1

Feb 3, 2026

0.1.0

Feb 3, 2026

Download files

Source Distribution

llm_cost_guard-0.2.0.tar.gz (64.6 kB)

Uploaded Feb 3, 2026 Source

Built Distribution

llm_cost_guard-0.2.0-py3-none-any.whl (61.4 kB)

Uploaded Feb 3, 2026 Python 3

File details

Details for the file llm_cost_guard-0.2.0.tar.gz.

File metadata

Download URL: llm_cost_guard-0.2.0.tar.gz
Upload date: Feb 3, 2026
Size: 64.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for llm_cost_guard-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`c133f67ce1b00baebac3b127ed24b700eb60834394493df0459190c1ec73e742`
MD5	`a2b003f8e7c2a3de929931af442e94f4`
BLAKE2b-256	`3ffb40f7ce03f4da9e499df30de84422214df98a2efde775389b85bfd837512f`

See more details on using hashes here.
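A downloaded file can be checked against the published SHA256 digest with the standard library alone. The helper names below are hypothetical; the demonstration uses a temporary file so that it runs without the real archive.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Demonstration with a temporary file standing in for the download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
print(matches_published_hash(tmp.name, hashlib.sha256(b"hello").hexdigest()))  # True
os.remove(tmp.name)
```

For the real archive, pass the path of the downloaded `llm_cost_guard-0.2.0.tar.gz` and the SHA256 value from the table above.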

File details

Details for the file llm_cost_guard-0.2.0-py3-none-any.whl.

File metadata

Download URL: llm_cost_guard-0.2.0-py3-none-any.whl
Upload date: Feb 3, 2026
Size: 61.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.16

File hashes

Hashes for llm_cost_guard-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`48d1e9ffad0a0a12ca783672b2cf3a31a4da23ab6b910a570123a9de0aa33fd6`
MD5	`6c6c9405b011f248d561f7e0afcfc9ed`
BLAKE2b-256	`7188f4c7408f22c29e0436871241dd57d12ae2629118110d47f057d0a905a6b7`

See more details on using hashes here.
