
ai-cost-guard

Budget enforcement and cost tracking for LLM applications.

Stop runaway API spend from bugs, prompt injection, or retry loops — before it hits your credit card.

from ai_cost_guard import CostGuard
import anthropic

client = anthropic.Anthropic()
guard = CostGuard(weekly_budget_usd=5.00)

@guard.protect(model="anthropic/claude-haiku-4-5-20251001")
def call_claude(prompt: str):
    return client.messages.create(...)   # blocked with BudgetExceededError if budget exceeded

Why this exists

When you build with LLMs, three things will eventually go wrong:

  1. A bug creates an infinite retry loop — you wake up to a $300 bill.
  2. A prompt injection attack causes your app to make thousands of unexpected calls.
  3. A junior dev accidentally calls GPT-4o instead of GPT-4o-mini in a tight loop.

ai-cost-guard is a hard stop. It tracks every LLM call, accumulates cost, and raises BudgetExceededError before the next call goes through.

Zero runtime dependencies. Pure Python stdlib. Works with any LLM provider.
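The hard-stop idea is easy to picture. Below is a minimal, self-contained sketch of the mechanism only, not the library's actual implementation; the `SimpleGuard` class and its method names are hypothetical stand-ins:

```python
class BudgetExceededError(RuntimeError):
    """Raised before a call that would push spend past the budget."""

class SimpleGuard:
    # Hypothetical stand-in illustrating the hard-stop pattern.
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def check(self, estimated_cost_usd: float) -> None:
        # Refuse BEFORE the call goes out, not after the bill arrives.
        if self.spent_usd + estimated_cost_usd > self.budget_usd:
            raise BudgetExceededError(
                f"spent ${self.spent_usd:.4f} of ${self.budget_usd:.2f} budget"
            )

    def record(self, actual_cost_usd: float) -> None:
        self.spent_usd += actual_cost_usd

guard = SimpleGuard(budget_usd=0.01)
guard.check(0.004); guard.record(0.004)   # first call allowed
guard.check(0.004); guard.record(0.004)   # second call allowed
blocked = False
try:
    guard.check(0.004)   # would total $0.012, over the $0.01 cap
except BudgetExceededError:
    blocked = True       # third call refused before any API request
```

The key design point is that the check happens before the provider call, so a retry loop halts at the cap instead of billing past it.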


Install

pip install ai-cost-guard

Or from source:

git clone https://github.com/yourusername/ai-cost-guard
cd ai-cost-guard
pip install -e ".[dev]"

Quick Start

Decorator (simplest)

from ai_cost_guard import CostGuard
import anthropic

client = anthropic.Anthropic()
guard = CostGuard(weekly_budget_usd=5.00)

@guard.protect(model="anthropic/claude-haiku-4-5-20251001", purpose="summarizer")
def summarize(text: str):
    return client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=256,
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )

Manual check + record

# Raises BudgetExceededError if the estimated cost would exceed the remaining budget
guard.check_budget("openai/gpt-4o", estimated_input=500, estimated_output=200)

response = openai_client.chat.completions.create(...)

# Record actual usage so accumulated spend stays accurate
guard.record(
    model="openai/gpt-4o",
    input_tokens=response.usage.prompt_tokens,
    output_tokens=response.usage.completion_tokens,
)

Dry-run mode (test without real calls)

guard = CostGuard(weekly_budget_usd=5.00, dry_run=True)
# All calls raise BudgetExceededError("DRY RUN") — safe to use in CI
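Dry-run mode amounts to a flag that short-circuits every guarded call before it executes. A rough sketch of that pattern in plain Python (illustration only, not the package's code; `protect` and `call_model` here are local stand-ins):

```python
class BudgetExceededError(RuntimeError):
    pass

def protect(dry_run: bool):
    # Decorator factory sketching the dry-run short-circuit: when
    # dry_run is set, the wrapped function body never executes, so
    # CI can exercise call sites without spending real API money.
    def decorator(fn):
        def wrapper(*args, **kwargs):
            if dry_run:
                raise BudgetExceededError("DRY RUN")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@protect(dry_run=True)
def call_model(prompt: str):
    raise AssertionError("should never run in dry-run mode")

raised = False
try:
    call_model("hello")
except BudgetExceededError as e:
    raised = str(e) == "DRY RUN"
```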

CLI

# Show current spend vs budget
ai-cost-guard status

# List all calls this period
ai-cost-guard calls

# List all registered models with pricing
ai-cost-guard models

# Check if a model call would be allowed given a budget
ai-cost-guard check anthropic/claude-sonnet-4-6 5.00

# Reset the tracker
ai-cost-guard reset

Supported Providers

| Provider | Models |
| --- | --- |
| Anthropic | claude-haiku-4-5, claude-sonnet-4-6, claude-opus-4-6 |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-3.5-turbo |
| Google | gemini-1.5-flash, gemini-1.5-pro |
| Ollama (local) | qwen2.5:7b, llama3.2:3b, mistral:7b (always $0.00) |

Adding a new model:

from ai_cost_guard import PROVIDERS

PROVIDERS["myprovider/mymodel"] = {
    "input":  1.00 / 1_000_000,   # USD per input token ($1.00 per 1M tokens)
    "output": 4.00 / 1_000_000,   # USD per output token ($4.00 per 1M tokens)
}
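Cost accounting is plain per-token arithmetic. Using the hypothetical rates registered above, a call with 500 input and 200 output tokens would cost:

```python
# Rates from the example registration above: $1.00 / 1M input tokens,
# $4.00 / 1M output tokens (hypothetical "myprovider/mymodel" pricing).
INPUT_RATE = 1.00 / 1_000_000    # USD per input token
OUTPUT_RATE = 4.00 / 1_000_000   # USD per output token

input_tokens, output_tokens = 500, 200
cost_usd = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
# 500 * $0.000001 + 200 * $0.000004 = $0.0005 + $0.0008 = $0.0013
```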

Security properties

  • Hard budget cap — raises BudgetExceededError before the call, not after.
  • No network calls — all data stored locally in ~/.ai-cost-guard/cost_log.json.
  • Atomic writes — cost log uses temp-file + rename to prevent corruption.
  • Zero dependencies — nothing to supply-chain attack.
  • Audit trail — every call logged with timestamp, model, tokens, and purpose.

See SECURITY.md for full security policy.
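The temp-file + rename technique works because `os.replace` is an atomic swap on both POSIX and Windows: a concurrent reader sees either the old file or the new one, never a half-written log. A sketch of the technique in stdlib Python (illustrative; not the package's exact code):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data: dict) -> None:
    # Write to a temp file in the SAME directory as the target;
    # a rename across filesystems would not be atomic.
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())   # ensure bytes hit disk before the rename
        os.replace(tmp_path, path)  # atomic swap: old file or new, never partial
    except BaseException:
        os.unlink(tmp_path)
        raise

log_path = os.path.join(tempfile.gettempdir(), "cost_log_demo.json")
atomic_write_json(log_path, {"spent_usd": 0.0013, "calls": 1})
```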


How it compares

  • ai-cost-guard: hard budget stop, multi-provider, zero dependencies, local storage.
  • LangChain callbacks: observe only; they report cost but never block a call.
  • OpenAI usage limits: provider-side caps, OpenAI-only, so multi-provider coverage is not applicable.
  • Manual tracking: hard stops and coverage depend on your implementation.

Running tests

pip install -e ".[dev]"
pytest tests/ -v

Contributing

PRs welcome. Please:

  • Keep zero runtime dependencies.
  • Add tests for new providers.
  • Update pricing when providers change rates.

License

MIT — free to use, modify, and distribute.
