# TokenFence

Cost circuit breaker for AI agents. Guard your LLM spend with automatic model downgrade and kill switch. Supports OpenAI, Anthropic Claude, Google Gemini, and DeepSeek.
## Install

```bash
pip install tokenfence[openai]
```
## Quick Start

```python
import openai
from tokenfence import guard

client = guard(
    openai.OpenAI(),
    budget='$0.50',
    fallback='gpt-4o-mini',
    on_limit='stop',
)

# Use exactly like a normal OpenAI client
response = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello'}],
)

# Check spend
print(client.tokenfence.spent)      # 0.0023
print(client.tokenfence.remaining)  # 0.4977
print(client.tokenfence.calls)      # 1
```
## Anthropic Claude

```python
import anthropic
from tokenfence import guard

client = guard(
    anthropic.Anthropic(),
    budget='$1.00',
    fallback='claude-3-haiku-20240307',
    on_limit='stop',
)

# Use exactly like a normal Anthropic client
response = client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello'}],
)

# Check spend
print(client.tokenfence.spent)      # 0.00105
print(client.tokenfence.remaining)  # 0.99895
```
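The two counters in the example output always partition the budget. A quick check with the figures above (illustrative arithmetic, not library code):

```python
# Figures from the example above: a $1.00 budget and one Claude call.
budget = 1.00
spent = 0.00105          # cost recorded for the single call
remaining = budget - spent

# spent + remaining always equals the configured budget.
assert abs((spent + remaining) - budget) < 1e-12
print(round(remaining, 5))  # 0.99895
```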
## Async Support

For async applications, common in production agent pipelines, use `async_guard`:

```python
import openai
from tokenfence import async_guard

client = async_guard(
    openai.AsyncOpenAI(),
    budget='$0.50',
    fallback='gpt-4o-mini',
    on_limit='stop',
)

# Use exactly like a normal async OpenAI client
response = await client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Hello'}],
)

print(client.tokenfence.spent)
```
Works with `anthropic.AsyncAnthropic` too:

```python
import anthropic
from tokenfence import async_guard

client = async_guard(
    anthropic.AsyncAnthropic(),
    budget='$1.00',
    fallback='claude-3-haiku-20240307',
    on_limit='raise',
)

response = await client.messages.create(
    model='claude-3-5-sonnet-20241022',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Hello'}],
)
```
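In an async pipeline, one guarded client is typically shared by many concurrent tasks, all drawing on the same budget. The accounting pattern can be sketched with a stub tracker (illustrative only, not TokenFence internals; the `SharedBudget` class and its prices are made up):

```python
import asyncio

class SharedBudget:
    """Stub: concurrent tasks charging costs against one shared budget."""

    def __init__(self, budget: float):
        self.budget = budget
        self.spent = 0.0
        self._lock = asyncio.Lock()

    async def charge(self, cost: float) -> bool:
        async with self._lock:  # serialize updates from concurrent tasks
            if self.spent + cost > self.budget:
                return False    # over budget: a guard would stop/warn/raise here
            self.spent += cost
            return True

async def main():
    tracker = SharedBudget(budget=0.50)
    # Ten concurrent "calls" at a hypothetical $0.06 each; only 8 fit in $0.50.
    results = await asyncio.gather(*(tracker.charge(0.06) for _ in range(10)))
    return results, tracker.spent

results, spent = asyncio.run(main())
print(sum(results), round(spent, 2))  # 8 0.48
```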
## How It Works

- **Track**: every `chat.completions.create()` call records token usage and calculates its cost.
- **Downgrade**: when cumulative spend hits the threshold (default 80% of the budget), the model is transparently swapped to your fallback.
- **Kill switch**: when the budget is fully consumed:
  - `on_limit='stop'` returns a synthetic response explaining that the budget was exceeded.
  - `on_limit='warn'` logs a warning but allows the call through.
  - `on_limit='raise'` raises `BudgetExceeded`.
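The three stages can be sketched in a few lines. The prices, budget, and the `Breaker` class below are illustrative placeholders, not TokenFence's implementation:

```python
# Hypothetical flat $/call prices, chosen only to make the trace easy to follow.
PRICES = {"gpt-4o": 0.125, "gpt-4o-mini": 0.0625}

class Breaker:
    def __init__(self, budget, fallback, threshold=0.8):
        self.budget = budget
        self.fallback = fallback
        self.threshold = threshold
        self.spent = 0.0

    def route(self, model):
        # Kill switch: the budget is fully consumed (on_limit='stop' behavior).
        if self.spent >= self.budget:
            return None
        # Downgrade: past the threshold, transparently swap in the fallback.
        if self.spent >= self.threshold * self.budget:
            model = self.fallback
        # Track: record this call's cost against the budget.
        self.spent += PRICES[model]
        return model

b = Breaker(budget=1.00, fallback="gpt-4o-mini")
models = [b.route("gpt-4o") for _ in range(10)]
# 7 full-price calls, 2 downgraded calls, then the kill switch trips (None).
print(models.count("gpt-4o"), models.count("gpt-4o-mini"), models[-1])  # 7 2 None
```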
## API

### `guard(client, *, budget, fallback=None, on_limit='stop', threshold=0.8)`

| Parameter | Type | Description |
|---|---|---|
| `client` | `openai.OpenAI \| anthropic.Anthropic` | An OpenAI or Anthropic client instance |
| `budget` | `str \| float` | Maximum spend: `'$0.50'` or `0.50` |
| `fallback` | `str \| None` | Model to downgrade to when the threshold is hit |
| `on_limit` | `str` | `'stop'`, `'warn'`, or `'raise'` |
| `threshold` | `float` | Fraction of the budget at which the downgrade kicks in (0.0–1.0) |
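Since `budget` accepts either form, normalization is straightforward; this sketch (the `parse_budget` helper is hypothetical, not part of the TokenFence API) also shows where the default threshold lands:

```python
def parse_budget(budget):
    """Accept '$0.50' or 0.50 and return a float number of dollars."""
    if isinstance(budget, str):
        return float(budget.lstrip("$"))
    return float(budget)

dollars = parse_budget("$0.50")
downgrade_at = 0.8 * dollars  # default threshold: downgrade at 80% of budget
print(dollars, round(downgrade_at, 2))  # 0.5 0.4
```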
### `async_guard(client, *, budget, fallback=None, on_limit='stop', threshold=0.8)`

Same parameters as `guard()`, but for async clients (`openai.AsyncOpenAI`, `anthropic.AsyncAnthropic`).
### `client.tokenfence`

| Attribute | Description |
|---|---|
| `.spent` | Total USD spent so far |
| `.remaining` | USD remaining in the budget |
| `.calls` | Number of tracked API calls |
| `.budget` | The configured budget |
| `.reset()` | Reset spend tracking to zero |
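The bookkeeping behind these attributes can be sketched as a small tracker. Attribute names match the table above; the `SpendTracker` class and its internals are illustrative, not TokenFence's own:

```python
from dataclasses import dataclass

@dataclass
class SpendTracker:
    budget: float
    spent: float = 0.0
    calls: int = 0

    @property
    def remaining(self) -> float:
        # USD left in the budget, derived from spend so far.
        return self.budget - self.spent

    def record(self, cost: float) -> None:
        # Called once per tracked API call.
        self.spent += cost
        self.calls += 1

    def reset(self) -> None:
        # Reset spend tracking to zero, as .reset() does.
        self.spent = 0.0
        self.calls = 0

t = SpendTracker(budget=0.50)
t.record(0.0023)
print(t.spent, t.calls)  # 0.0023 1
t.reset()
print(t.remaining)       # 0.5
```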
## Limits on Free Tier
The free Hobby tier includes 50K tracked requests/month. For production workloads:
| Tier | Requests | Price |
|---|---|---|
| Hobby | 50K/mo | Free |
| Pro | 500K/mo | $49/mo |
| Team | 2M/mo | $149/mo |
→ Upgrade to Pro at tokenfence.dev — 7-day free trial, no credit card required to start.
## License
MIT
## File details

Details for the file `tokenfence-0.3.2.tar.gz`.

File metadata:

- Download URL: tokenfence-0.3.2.tar.gz
- Upload date:
- Size: 14.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `820a4c7f115f7bd9a9e4da0fa0c150f5e0a49ce8b62ad3e1530d84be7414b7b6` |
| MD5 | `fc863ad2b75e3a050dce04ab7d2f066e` |
| BLAKE2b-256 | `dbbf525d26af96c831b74eeb7fa9a2516708107078ebea8502d994d219747c39` |
## File details

Details for the file `tokenfence-0.3.2-py3-none-any.whl`.

File metadata:

- Download URL: tokenfence-0.3.2-py3-none-any.whl
- Upload date:
- Size: 11.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a224ec87cf26d7a1738182df49fca0ed69cec013b49d2bb48a67e32ce4a1e6db` |
| MD5 | `f520787845c3428879fcc8017244c538` |
| BLAKE2b-256 | `ea2bb285d2aef6541a839e49fd6f6041673b6a643b0ae6ee9216756d068ef98f` |