Lightweight token tracking, cost management, and budget enforcement for LLM API calls

TokenBudget

Stop bleeding money on LLM API calls.

A lightweight Python library for tracking tokens, managing costs, and enforcing budgets across all major LLM providers.

The Problem

You're building with LLMs and:

Costs spiral out of control with no visibility
No idea which API calls are eating your budget
Production bills that make you cry
Clunky observability platforms that require external services

There's no simple pip install library that just works.

The Solution

import openai
from tokenbudget import TokenTracker, budget

tracker = TokenTracker()
client = tracker.wrap_openai(openai.OpenAI())

# Every call is tracked automatically
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

print(tracker.usage)
# Usage(total_tokens=25, total_cost_usd=0.000375, calls=1)

That's it. No platforms, no external services, no configuration.

Features

Token Tracking - Automatic tracking for OpenAI, Anthropic, Google
Cost Calculation - Built-in pricing database (always up-to-date)
Budget Enforcement - Decorators to prevent overspending
Response Caching - Save money with zero-cost cached responses
Usage Reports - Beautiful tables + CSV/JSON exports
Multi-Provider - One tracker for all your LLM calls
Thread-Safe - Works seamlessly in concurrent applications
Async Support - Works with async clients out of the box
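To picture what "thread-safe" means for a usage tracker, here is a minimal standard-library sketch: a lock guards the running totals so concurrent updates are not lost. `SketchTracker` and its `track` method are hypothetical stand-ins for illustration, not TokenBudget's actual implementation.

```python
import threading


class SketchTracker:
    """Hypothetical stand-in for a tracker; not TokenBudget's real class."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_tokens = 0
        self.calls = 0

    def track(self, prompt_tokens, completion_tokens):
        # The lock makes the read-modify-write on the totals atomic.
        with self._lock:
            self.total_tokens += prompt_tokens + completion_tokens
            self.calls += 1


def worker(tracker, n):
    # Each worker records n calls of 15 tokens each.
    for _ in range(n):
        tracker.track(prompt_tokens=10, completion_tokens=5)


sketch = SketchTracker()
threads = [threading.Thread(target=worker, args=(sketch, 1000)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sketch.calls, sketch.total_tokens)
```

With 8 threads recording 1000 calls each, the totals come out the same on every run because every update happens under the lock.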

Installation

# Basic installation
pip install tokenbudget

# With OpenAI support
pip install tokenbudget[openai]

# With Anthropic support
pip install tokenbudget[anthropic]

# With everything
pip install tokenbudget[all]

Quick Examples

1. Basic Tracking

from tokenbudget import TokenTracker
import openai

tracker = TokenTracker()
client = tracker.wrap_openai(openai.OpenAI())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}]
)

print(f"Tokens: {tracker.usage.total_tokens}")
print(f"Cost: ${tracker.usage.total_cost_usd:.6f}")

2. Budget Enforcement

from tokenbudget import budget, BudgetExceeded

@budget(max_cost_usd=1.00, max_tokens=50000)
def my_llm_pipeline(data):
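    # All LLM calls inside are tracked.
    # Raises BudgetExceeded if the cost limit is hit
    # (TokenLimitReached if the token limit is hit).
    result = process_with_llm(data)  # placeholder for your own LLM-calling code
    return result

# Or as a context manager (client is a wrapped client, as in Basic Tracking)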
with budget(max_cost_usd=0.50) as ctx:
    response = client.chat.completions.create(...)
    print(f"Remaining: ${ctx.remaining_budget:.4f}")
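If you are curious how a budget context manager like this could work, the sketch below shows one plausible shape using only the standard library. Every name here (`sketch_budget`, `SketchBudget`, `SketchBudgetExceeded`, `record`) is hypothetical, and the check-after-recording behavior is an assumption, not a description of TokenBudget's internals.

```python
from contextlib import contextmanager


class SketchBudgetExceeded(Exception):
    """Hypothetical exception, named to avoid confusion with the library's."""


class SketchBudget:
    def __init__(self, max_cost_usd):
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    @property
    def remaining_budget(self):
        return self.max_cost_usd - self.spent_usd

    def record(self, cost_usd):
        # Add the cost of one call, then enforce the limit.
        self.spent_usd += cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise SketchBudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.max_cost_usd:.4f}"
            )


@contextmanager
def sketch_budget(max_cost_usd):
    # Hand the caller a fresh budget object for the duration of the block.
    yield SketchBudget(max_cost_usd)


exceeded = False
try:
    with sketch_budget(max_cost_usd=0.50) as ctx:
        ctx.record(0.30)  # within budget
        ctx.record(0.30)  # pushes the total past 0.50 and raises
except SketchBudgetExceeded:
    exceeded = True

print(exceeded)
```

Note that in this sketch the limit is only detected after the offending call has been recorded, so the final call's cost is already spent when the exception fires.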

3. Multi-Provider Tracking

import openai
import anthropic
from tokenbudget import TokenTracker

tracker = TokenTracker()

# Track OpenAI
openai_client = tracker.wrap_openai(openai.OpenAI())
openai_client.chat.completions.create(model="gpt-4o", ...)

# Track Anthropic
anthropic_client = tracker.wrap_anthropic(anthropic.Anthropic())
anthropic_client.messages.create(model="claude-sonnet-4-5", ...)

# Combined reporting
print(f"Total cost: ${tracker.total_cost_usd:.4f}")
print(tracker.usage_by_provider)

4. Response Caching

import openai
from tokenbudget import TokenTracker

tracker = TokenTracker(cache="memory")
client = tracker.wrap_openai(openai.OpenAI())

# First call - costs money
response1 = client.chat.completions.create(model="gpt-4o", ...)

# Identical call - FREE (cached)
response2 = client.chat.completions.create(model="gpt-4o", ...)

stats = tracker.cache_stats
print(f"Saved: ${stats.saved_cost_usd:.4f}")
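What counts as an "identical call" depends on how the cache key is built. The sketch below shows one common approach, hashing the request parameters, using only the standard library. The functions `cache_key`, `cached_call`, and `fake_api` are hypothetical, and TokenBudget's real keying scheme may differ.

```python
import hashlib
import json


def cache_key(model, messages):
    # Serialize with sorted keys so equal requests always hash the same.
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


cache = {}
sketch_stats = {"hits": 0, "misses": 0}


def cached_call(model, messages, call_api):
    key = cache_key(model, messages)
    if key in cache:
        sketch_stats["hits"] += 1
        return cache[key]
    sketch_stats["misses"] += 1
    cache[key] = call_api(model, messages)
    return cache[key]


def fake_api(model, messages):
    # Placeholder for a real provider call.
    return {"model": model, "content": "4"}


msgs = [{"role": "user", "content": "What is 2+2?"}]
first = cached_call("gpt-4o", msgs, fake_api)   # miss: calls fake_api
second = cached_call("gpt-4o", msgs, fake_api)  # hit: served from the dict

print(sketch_stats)
```

The second call never reaches `fake_api`, which is where the savings come from when the placeholder is a paid API request.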

5. Usage Reports

from tokenbudget import generate_table_report

print(generate_table_report(tracker))

# Output:
# ┌─────────────────────────────────────┐
# │ TokenBudget Usage Report            │
# ├─────────────────────────────────────┤
# │ Provider   │ Calls │ Tokens │ Cost  │
# │ openai     │   15  │ 12.3k  │ $0.24 │
# │ anthropic  │    8  │  8.1k  │ $0.18 │
# ├─────────────────────────────────────┤
# │ Total      │   23  │ 20.4k  │ $0.42 │
# └─────────────────────────────────────┘

# Export to CSV/JSON (function form, as documented under Reports)
from tokenbudget import export_csv, export_json

export_csv(tracker, "usage.csv")
export_json(tracker, "usage.json")
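For a rough idea of what a per-provider CSV export could contain, here is a standard-library sketch. The column names and the row values are illustrative assumptions made up for this example; the actual file layout written by TokenBudget is not documented here.

```python
import csv
import io

# Illustrative rows only; the numbers are made up for this sketch.
rows = [
    {"provider": "openai", "calls": 15, "tokens": 12300, "cost_usd": 0.24},
    {"provider": "anthropic", "calls": 8, "tokens": 8100, "cost_usd": 0.18},
]

# Write to an in-memory buffer instead of a file to keep the sketch self-contained.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["provider", "calls", "tokens", "cost_usd"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(csv_text)
```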

Supported Models

OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3-mini

Anthropic: claude-opus-4-5, claude-sonnet-4-5, claude-haiku-4-5, claude-3-5-sonnet, claude-3-opus

Google: gemini-2.0-flash, gemini-2.0-pro, gemini-1.5-pro, gemini-1.5-flash

Need a custom model? Easy:

from tokenbudget import register_model

register_model(
    "my-custom-model",
    input_per_1k=0.001,
    output_per_1k=0.002,
    provider="custom"
)

API Reference

TokenTracker

tracker = TokenTracker(cache=None)  # cache: "memory", "disk", or None

Methods:

wrap_openai(client) - Wrap OpenAI client
wrap_anthropic(client) - Wrap Anthropic client
track(model, prompt_tokens, completion_tokens, provider) - Manual tracking
reset() - Reset all statistics

Properties:

usage - Overall usage stats
usage_by_provider - Per-provider breakdown
total_cost_usd - Total cost across all calls
cache_stats - Cache hit/miss statistics

Budget Enforcement

@budget(max_cost_usd=None, max_tokens=None, tracker=None)
def my_function():
    ...

# Or as context manager
with budget(max_cost_usd=1.0) as ctx:
    ...
    print(ctx.remaining_budget)
    print(ctx.current_usage)

Exceptions:

BudgetExceeded - Cost limit exceeded
TokenLimitReached - Token limit exceeded

Pricing

from tokenbudget import get_price, register_model, calculate_cost

# Get model pricing
price = get_price("gpt-4o")
print(price.input_per_1k, price.output_per_1k)

# Calculate cost
cost = calculate_cost("gpt-4o", input_tokens=1000, output_tokens=500)

# Register custom model
register_model("my-model", input_per_1k=0.001, output_per_1k=0.002)
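Per-1k pricing suggests the usual arithmetic: cost = (input_tokens / 1000) × input_per_1k + (output_tokens / 1000) × output_per_1k. The sketch below works through that formula with the rates from the register_model example. `SketchPrice`, `PRICES`, and `sketch_cost` are hypothetical names, and the formula is an assumption about how calculate_cost behaves, not a confirmed description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchPrice:
    input_per_1k: float
    output_per_1k: float


# Hypothetical registry; the rates are the ones from the register_model example.
PRICES = {"my-model": SketchPrice(input_per_1k=0.001, output_per_1k=0.002)}


def sketch_cost(model, input_tokens, output_tokens):
    # Scale each token count to thousands, then apply the matching rate.
    price = PRICES[model]
    input_cost = (input_tokens / 1000) * price.input_per_1k
    output_cost = (output_tokens / 1000) * price.output_per_1k
    return input_cost + output_cost


cost = sketch_cost("my-model", input_tokens=1000, output_tokens=500)
print(f"${cost:.6f}")  # 1.0 * 0.001 + 0.5 * 0.002 -> $0.002000
```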

Reports

from tokenbudget import generate_table_report, export_csv, export_json

# Pretty table
print(generate_table_report(tracker))

# Export
export_csv(tracker, "usage.csv")
export_json(tracker, "usage.json")

Custom Providers

Support any LLM provider:

from tokenbudget.providers.custom import CustomProvider

custom = CustomProvider(
    tracker=tracker,
    provider_name="my-llm-service",
    extract_model=lambda r: r["model"],
    extract_prompt_tokens=lambda r: r["usage"]["input"],
    extract_completion_tokens=lambda r: r["usage"]["output"],
)

# Track your custom response
custom.track(api_response)
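The extractor lambdas are plain functions over your response object. This sketch applies the same three lambdas to a sample payload so you can see what each one pulls out. The payload shape and values are hypothetical, chosen only to match the field names used above, and the `record` dict is an illustration rather than what CustomProvider stores.

```python
# Hypothetical response payload; the field names match the lambdas above.
api_response = {
    "model": "my-custom-model",
    "usage": {"input": 120, "output": 45},
}

extract_model = lambda r: r["model"]
extract_prompt_tokens = lambda r: r["usage"]["input"]
extract_completion_tokens = lambda r: r["usage"]["output"]

# Collect what a tracker would need from one response.
record = {
    "model": extract_model(api_response),
    "prompt_tokens": extract_prompt_tokens(api_response),
    "completion_tokens": extract_completion_tokens(api_response),
}
record["total_tokens"] = record["prompt_tokens"] + record["completion_tokens"]

print(record)
```

If your provider returns objects instead of dicts, the lambdas would use attribute access (for example `r.usage.input`) in the same way.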

Contributing

Contributions welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

Author

Built by a developer tired of surprise LLM bills.

If this saved you money, consider starring the repo.

Release history

0.1.2 - Feb 17, 2026
0.1.1 - Feb 15, 2026 (this version)
0.1.0 - Feb 13, 2026

File details

Details for the file tokenbudget-0.1.1.tar.gz.

File metadata

Download URL: tokenbudget-0.1.1.tar.gz
Upload date: Feb 15, 2026
Size: 16.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for tokenbudget-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`3a0d1e6ca6a57330a679dcf3749ae05ae7ebde1b0efcf0a473e3d8c4f30d84f0`
MD5	`188abd3c73d6b540c6b28ef641bea6ae`
BLAKE2b-256	`cbac45b0d42a4a5357265e094efbe8983b43c6a987386f279562d2c277d0b7bf`

File details

Details for the file tokenbudget-0.1.1-py3-none-any.whl.

File metadata

Download URL: tokenbudget-0.1.1-py3-none-any.whl
Upload date: Feb 15, 2026
Size: 18.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for tokenbudget-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2d50a7a88e77f25854ea97432872b479bf1af6265fffe39d1bfdb9fe2c257e71`
MD5	`4039d18e5648e4d7cdac5ae24aaf21a7`
BLAKE2b-256	`7858789b6fd9a2bad1d98eb0e6e4697457620a74414f593a8d857eef47cedbed`
