LLM token cost monitoring, budget enforcement, and optimization.

Project description

🛡️ TokenShield

Real-time token cost monitoring, budget enforcement, and optimization for LLM applications.

Python 3.11+ | License: MIT

Stop burning money on LLM API calls. TokenShield gives you per-request cost tracking, budget gates, and automatic optimization before the invoice arrives.


The Problem

Month 1:  $50    "This is cheap!"
Month 2:  $200   "Growth is normal"
Month 3:  $3,400 "WHAT HAPPENED?!"

LLM costs are invisible until the bill arrives. A single misconfigured loop, a verbose system prompt, or an unbounded tool list can multiply your spend tenfold overnight.

The Solution

from tokenshield import Shield, BudgetPolicy

shield = Shield(
    model="gpt-4o",
    policy=BudgetPolicy(
        max_cost_per_request=0.05,     # $0.05 per request
        max_cost_per_hour=2.00,        # $2/hour
        max_cost_per_day=20.00,        # $20/day
        alert_threshold_pct=80,        # Alert at 80% of any limit
    )
)

# Wrap any LLM call
result = shield.call(
    messages=[{"role": "user", "content": "Summarize this order"}],
    tools=tool_schemas,
)

print(shield.report())
# ┌─────────────────────────────────┐
# │ Requests today:     142         │
# │ Tokens (in/out):    89K / 12K   │
# │ Cost today:         $4.23       │
# │ Budget remaining:   $15.77      │
# │ Avg cost/request:   $0.030      │
# │ Most expensive:     search (48%)│
# └─────────────────────────────────┘

Features

Feature             Description
Cost Tracking       Per-request, per-hour, per-day cost accumulation with model-aware pricing
Budget Gates        Hard limits that reject calls before they execute (no surprise bills)
Alert Hooks         Webhook/callback when approaching budget thresholds
Token Estimation    Pre-flight token count estimation before calling the API
Model Pricing DB    Built-in pricing for GPT-4o, Claude, Gemini, Mistral, and custom models
Optimization Tips   Automatic suggestions: "Your system prompt is 4,200 tokens; consider trimming"
Dashboard Export    JSON/CSV export for cost dashboards and observability tools
Async Support       Full async/await support for high-throughput applications

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Your Application                     │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────┐   ┌──────────┐   ┌──────────────────────┐ │
│  │ shield   │──→│ estimator│──→│ budget_gate          │ │
│  │ .call()  │   │ (tokens) │   │ (allow / reject)     │ │
│  └──────────┘   └──────────┘   └──────────┬───────────┘ │
│       │                                   │             │
│       │         ┌──────────┐   ┌──────────▼───────────┐ │
│       │         │ tracker  │←──│ LLM API call         │ │
│       │         │ (costs)  │   │ (litellm / openai)   │ │
│       │         └────┬─────┘   └──────────────────────┘ │
│       │              │                                  │
│  ┌────▼──────────────▼─────┐   ┌──────────────────────┐ │
│  │ reporter                │   │ alert_hooks          │ │
│  │ (dashboard / export)    │   │ (webhook / callback) │ │
│  └─────────────────────────┘   └──────────────────────┘ │
│                                                         │
└─────────────────────────────────────────────────────────┘
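
Read top to bottom, the diagram describes a pipeline: estimate tokens, check the budget, make the call, then record the actual cost. Below is a toy sketch of that flow with every component stubbed out. `guarded_call`, this `BudgetExceeded` class, the rough 4-characters-per-token estimate, and `fake_llm` are all assumptions for illustration, not TokenShield internals.

```python
from dataclasses import dataclass, field
from typing import Callable

INPUT_RATE = 2.50    # USD per 1M input tokens (gpt-4o row of the pricing table)
OUTPUT_RATE = 10.00  # USD per 1M output tokens


class BudgetExceeded(Exception):
    """Raised by the gate when the pre-flight estimate is over the limit."""


@dataclass
class Tracker:
    costs: list[float] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(self.costs)


def estimate_tokens(messages: list[dict]) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4 + 1


def guarded_call(messages: list[dict], llm: Callable, tracker: Tracker, max_cost: float) -> str:
    # 1. estimator: pre-flight input cost
    estimated = estimate_tokens(messages) * INPUT_RATE / 1_000_000
    # 2. budget_gate: reject before any API call is made
    if estimated > max_cost:
        raise BudgetExceeded(f"estimated ${estimated:.6f} > limit ${max_cost:.6f}")
    # 3. LLM API call (stubbed here)
    text, tokens_in, tokens_out = llm(messages)
    # 4. tracker: record the actual cost from the reported usage
    tracker.costs.append((tokens_in * INPUT_RATE + tokens_out * OUTPUT_RATE) / 1_000_000)
    return text


def fake_llm(messages: list[dict]) -> tuple[str, int, int]:
    return "ok", 12, 40  # (text, input tokens, output tokens)


tracker = Tracker()
prompt = [{"role": "user", "content": "Summarize this order"}]
print(guarded_call(prompt, fake_llm, tracker, max_cost=0.05))  # ok
print(f"${tracker.total:.6f}")  # $0.000430
```

Note that the gate in this sketch can only bound the input side before the call; output tokens are unknown until the response arrives, so a real gate would need some assumption about maximum output length.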

Quick Start

pip install tokenshield-ai  # distribution name on PyPI; the import name is tokenshield

Basic Usage

from tokenshield import Shield

shield = Shield(model="gpt-4o")

# Track a call (wrap your existing LLM call)
result = shield.call(messages=[...])

# Check current spend
print(f"Today: ${shield.tracker.cost_today:.2f}")

Budget Enforcement

from tokenshield import Shield, BudgetPolicy

shield = Shield(
    model="gpt-4o",
    policy=BudgetPolicy(max_cost_per_request=0.10)
)

try:
    result = shield.call(messages=huge_prompt)
except shield.BudgetExceeded as e:
    print(f"Blocked! Estimated cost ${e.estimated_cost:.3f} exceeds limit")

Alert Hooks

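from tokenshield import Shield, BudgetPolicy

# `slack` is assumed to be an existing client object in your application.
shield = Shield(
    model="gpt-4o",
    policy=BudgetPolicy(max_cost_per_day=20.00, alert_threshold_pct=80),
    on_alert=lambda msg: slack.post(channel="#llm-costs", text=msg),
)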
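
One way such a hook could behave is to fire once when spend first crosses the threshold, rather than on every later call. This is a sketch of that idea only; `ThresholdAlert` is a hypothetical class, and the real `on_alert` semantics may differ.

```python
from typing import Callable


class ThresholdAlert:
    """Hypothetical: invokes `on_alert` once when spend reaches pct% of the limit."""

    def __init__(self, limit: float, threshold_pct: float, on_alert: Callable[[str], None]) -> None:
        self.limit = limit
        self.threshold = limit * threshold_pct / 100
        self.on_alert = on_alert
        self.spent = 0.0
        self._fired = False

    def record(self, cost: float) -> None:
        self.spent += cost
        if not self._fired and self.spent >= self.threshold:
            self._fired = True  # avoid repeating the alert on every later call
            pct = 100 * self.spent / self.limit
            self.on_alert(f"LLM spend at {pct:.0f}% of ${self.limit:.2f} daily limit")


sent: list[str] = []
alert = ThresholdAlert(limit=20.00, threshold_pct=80, on_alert=sent.append)
for cost in (10.0, 5.0, 2.0, 1.0):
    alert.record(cost)
print(sent)  # ['LLM spend at 85% of $20.00 daily limit']
```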

Optimization Suggestions

tips = shield.optimize(messages, tools)
# [
#   "System prompt is 3,800 tokens (63% of input). Consider compressing.",
#   "18 tools bound but only 3 used. Use dynamic tool binding to save ~2,250 tokens.",
#   "History has 45 messages. Consider windowing to last 20.",
# ]
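
The third tip, windowing, usually means keeping the system message and only the most recent turns. A minimal sketch, assuming OpenAI-style message dicts with a `role` key; `window_history` is an illustrative helper, not a TokenShield function.

```python
def window_history(messages: list[dict], keep_last: int = 20) -> list[dict]:
    """Keep all system messages (moved to the front) plus the last `keep_last` others."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if keep_last <= 0:
        return system
    return system + rest[-keep_last:]


history = [{"role": "system", "content": "You are a helpful assistant."}]
history += [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"turn {i}"}
    for i in range(45)
]

trimmed = window_history(history, keep_last=20)
print(len(history), "->", len(trimmed))  # 46 -> 21
print(trimmed[1]["content"])             # turn 25
```

A production version would also need to avoid cutting a tool call off from its matching tool result, which this sketch does not handle.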

Pricing Database

Built-in pricing (updated monthly):

Model               Input ($/1M tokens)   Output ($/1M tokens)   Context window
gpt-4o              $2.50                 $10.00                 128K
gpt-4o-mini         $0.15                 $0.60                  128K
claude-3.5-sonnet   $3.00                 $15.00                 200K
claude-3-haiku      $0.25                 $1.25                  200K
gemini-1.5-pro      $1.25                 $5.00                  1M
mistral-large       $2.00                 $6.00                  128K
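
To make the table concrete: the cost of a request is input tokens times the input rate plus output tokens times the output rate, divided by one million. The sketch below hard-codes two rows from the table; `PRICES` and `request_cost` are illustrative names, not TokenShield's API.

```python
# Illustrative only: PRICES and request_cost are hypothetical names.
# Rates are USD per 1M tokens, copied from the table above.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-1M-token rates."""
    rates = PRICES[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


# 4,000 input tokens and 500 output tokens on gpt-4o:
# 4,000 * 2.50 / 1M = $0.0100, plus 500 * 10.00 / 1M = $0.0050
print(f"${request_cost('gpt-4o', 4_000, 500):.4f}")  # $0.0150
```

At these rates, a `max_cost_per_request` of $0.05 would leave room for roughly 20,000 input tokens on gpt-4o before any output is counted.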

Add custom models:

shield.pricing.add("my-finetuned-model", input=5.00, output=15.00)

Documentation

License

MIT. See LICENSE.

Download files

Source Distribution

tokenshield_ai-2.0.0.tar.gz (12.5 kB)

Uploaded Source

File details

Details for the file tokenshield_ai-2.0.0.tar.gz.

File metadata

  • Download URL: tokenshield_ai-2.0.0.tar.gz
  • Upload date:
  • Size: 12.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for tokenshield_ai-2.0.0.tar.gz
Algorithm Hash digest
SHA256 7db181e421fd98d1682a8cc9c9e221138f364e8a792320941dd7ca522a1a80f4
MD5 3be56b705efac8b87e767ffcf9e22780
BLAKE2b-256 06d9b1d0db13be39e0e498679532bf7395c947c1e6ac8019a2662fd9e49ec6a9
