Budget: pre-flight cost caps, spend attribution, and circuit-breakers for LLM calls.
cendor-tokenguard
Stop runaway LLM bills, and get per-feature / per-user cost attribution for free. One decorator, one context manager. No dashboard, no account, no infra.
Caught a $40 runaway loop before it ran away — and told you which feature spent the rest.
pip install cendor-tokenguard
from cendor.core import instrument
from cendor.tokenguard import budget, track, report

client = instrument(openai_client)  # wrap once; tokenguard subscribes, never patches

@budget(usd=0.50, on_exceed="downgrade", downgrade={"gpt-4o": "gpt-4o-mini"})
def answer(q: str) -> str:
    with track(feature="support_bot", user_id="alice"):  # ambient attribution, zero bookkeeping
        resp = client.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": q}])
        return resp.choices[0].message.content

for row in report(group_by=["feature", "user_id"]):  # where did the money go?
    print(row["tags"], row["usd"], row["calls"])
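To make the shape of the report rows concrete, here is a minimal standard-library sketch of grouping spend records by tag. It is not tokenguard's implementation: `summarize` and the sample records are hypothetical, and only the row keys (`tags`, `usd`, `calls`) are taken from the snippet above.

```python
from collections import defaultdict

def summarize(records, group_by):
    """Aggregate spend records by the given tag keys (illustrative only)."""
    totals = defaultdict(lambda: {"usd": 0.0, "calls": 0})
    for rec in records:
        # Build a grouping key from the requested tags; missing tags become None.
        key = tuple(rec["tags"].get(k) for k in group_by)
        totals[key]["usd"] += rec["usd"]
        totals[key]["calls"] += 1
    # Most expensive group first.
    ordered = sorted(totals.items(), key=lambda kv: -kv[1]["usd"])
    return [
        {"tags": dict(zip(group_by, key)), "usd": round(agg["usd"], 6), "calls": agg["calls"]}
        for key, agg in ordered
    ]

# Hypothetical records, as an attribution layer might collect them.
records = [
    {"tags": {"feature": "support_bot", "user_id": "alice"}, "usd": 0.012},
    {"tags": {"feature": "support_bot", "user_id": "alice"}, "usd": 0.008},
    {"tags": {"feature": "search", "user_id": "bob"}, "usd": 0.003},
]
rows = summarize(records, ["feature", "user_id"])
for row in rows:
    print(row["tags"], row["usd"], row["calls"])
```

Grouping on a tuple of tag values keeps the sketch independent of which tags exist, so the same function could group by `feature` alone.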
Highlights
- Pre-flight circuit breaker: `on_exceed="block"` raises before an over-budget call runs; `"downgrade"` reroutes to a cheaper model pre-flight; `"truncate"` degrades; `"raise"` stops a runaway loop; or call your own function.
- Reasoning models, handled: you can't predict a thinking model's hidden reasoning pre-flight, so `on_exceed="clamp"` injects the provider's own token ceiling (`max_completion_tokens` / `max_tokens`) sized to the remaining budget, so the call is capped server-side instead of overspending. `report()` breaks out `reasoning_tokens`, and the cumulative gate enforces on exact usage (which already includes reasoning). See `docs/tokenguard.md` → Reasoning models.
- Decorator and context manager: budgets nest (an inner downgrade never masks an outer hard cap); config is validated at creation (a typo'd `on_exceed` or a map-less `downgrade` is a `ValueError`, never a silent no-op).
- Cost attribution, free: `track(feature=…, user_id=…)` tags ambient spend via `contextvars` (sync + async); `report(group_by=[…])` shows where the money went, reasoning tokens included.
- Cost as a test assertion: `report().assert_under(usd=0.05, feature="search")`.
- Pre-flight projection: `estimate(model, messages)` prices a call without making it.
- Durable + bounded: pluggable `use_sink(tokenguard.sinks.SQLiteSink / OTelSink)`; FIFO-bounded in-memory buffer (`configure(max_records=…)`, `dropped()`).
- No silent USD blind spots: a call whose model isn't in the price table records `$0`, so a USD cap can't bite. tokenguard warns once per model (`UnpricedModelWarning`) and counts these in `unpriced_calls()` / `report()`'s `unpriced_calls`; `configure(on_unpriced="raise")` makes `on_exceed="block"` reject them. A token cap is unaffected: tokens are counted regardless of price.
- Thread-safe, with one caveat: the spend buffer and `SQLiteSink` are lock-guarded for concurrent emits, but budgets/tags are `ContextVar`-based: `asyncio` tasks inherit them, a plain `threading.Thread` does not (carry them with `contextvars.copy_context()`).
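The thread caveat can be illustrated with the standard library alone. The sketch below uses a hypothetical `feature` context variable as a stand-in for a tag (it does not touch tokenguard) and shows how running a thread target inside `contextvars.copy_context()` carries the value across.

```python
import contextvars
import threading

# Hypothetical stand-in for an ambient attribution tag.
feature = contextvars.ContextVar("feature", default=None)

def read_tag(out, key):
    # Record whatever value of the tag is visible in this thread.
    out[key] = feature.get()

feature.set("support_bot")
seen = {}

# On standard CPython builds a plain thread starts with a fresh context,
# so it typically sees the default rather than the value set above.
plain = threading.Thread(target=read_tag, args=(seen, "plain"))

# Running the target through a copied context carries the tag along.
ctx = contextvars.copy_context()
copied = threading.Thread(target=ctx.run, args=(read_tag, seen, "copied"))

for t in (plain, copied):
    t.start()
for t in (plain, copied):
    t.join()

print(seen)
```

Each worker thread needs its own copy, since a single `Context` object cannot be entered by two threads at once.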
Streaming timing: post-flight `raise`/`truncate` fire when a stream is consumed, not when it's launched (the call is accounted once the chunk iterator drains). A loop that launches many streams before draining them can overspend. Drain each stream before the next, or use a pre-flight mode (`block`/`downgrade`/`clamp`), which is unaffected.
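A toy model makes the streaming caveat easier to see. The sketch below is not tokenguard code: `Ledger` and `fake_stream` are hypothetical, and the only assumption carried over from the text is that a stream's cost is recorded once its iterator is exhausted.

```python
class Ledger:
    """Hypothetical cumulative spend tracker with a post-flight check."""
    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def over_budget(self):
        return self.spent_usd >= self.cap_usd

def fake_stream(ledger, cost_usd, chunks=3):
    """Yield chunks; the cost is recorded only after the last chunk."""
    for i in range(chunks):
        yield f"chunk-{i}"
    ledger.spent_usd += cost_usd

def launch_then_drain(ledger, n, cost_usd):
    # Launch every stream first: nothing is accounted yet, so the check never trips.
    launched = []
    for _ in range(n):
        if ledger.over_budget():
            break
        launched.append(fake_stream(ledger, cost_usd))
    for stream in launched:
        for _ in stream:
            pass
    return len(launched)

def drain_each(ledger, n, cost_usd):
    # Drain each stream before starting the next, so the check sees real spend.
    started = 0
    for _ in range(n):
        if ledger.over_budget():
            break
        started += 1
        for _ in fake_stream(ledger, cost_usd):
            pass
    return started

eager, careful = Ledger(cap_usd=0.10), Ledger(cap_usd=0.10)
eager_calls = launch_then_drain(eager, n=5, cost_usd=0.04)
careful_calls = drain_each(careful, n=5, cost_usd=0.04)
print(eager_calls, round(eager.spent_usd, 2))      # 5 0.2
print(careful_calls, round(careful.spent_usd, 2))  # 3 0.12
```

Even the careful loop overshoots by one call, which is inherent to any post-flight check; only a pre-flight mode avoids that.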
Wrap-around — it rides the call you already make. Offline and standalone — bundled prices, no account.
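The pre-flight projection and clamp ideas from the highlights reduce to arithmetic over a price table. The sketch below is an illustration under stated assumptions, not tokenguard's bundled prices or API: the `PRICES` table, the model name, and both helper functions are hypothetical.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1M tokens; not tokenguard's bundled table.
PRICES = {"example-model": {"input": Decimal("2.50"), "output": Decimal("10.00")}}
PER_MILLION = 1_000_000

def estimate_usd(model, input_tokens, output_tokens):
    """Price a call from token counts without making it."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / PER_MILLION

def clamp_output_tokens(model, input_tokens, remaining_usd):
    """Largest output-token ceiling that still fits the remaining budget."""
    p = PRICES[model]
    left = Decimal(str(remaining_usd)) - input_tokens * p["input"] / PER_MILLION
    if left <= 0:
        return 0  # the prompt alone already exhausts the budget
    return int(left * PER_MILLION // p["output"])

projected = estimate_usd("example-model", input_tokens=1000, output_tokens=500)
ceiling = clamp_output_tokens("example-model", input_tokens=1000, remaining_usd="0.01")
print(projected, ceiling)  # 0.0075 750
```

`Decimal` is used so the floor in the ceiling calculation is not thrown off by binary floating-point rounding. A ceiling computed this way could then be passed as the provider's output-token limit.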
See docs/tokenguard.md · CHANGELOG. Part of the Cendor stack — github.com/cendorhq/Cendor. Powered by PowerAI Labs. Apache-2.0; provided "as is", without warranty — use at your own risk (LICENSE §7–8).
File details
Details for the file cendor_tokenguard-1.0.0.tar.gz.
File metadata
- Download URL: cendor_tokenguard-1.0.0.tar.gz
- Upload date:
- Size: 25.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2da5baf17e8a1691d4598b936231658f35e5401e9baa57deb477b2e536691858 |
| MD5 | 133e76e594c10f71a49e2a96f10166c6 |
| BLAKE2b-256 | 24c068867dfa3833a67bf1f1a506e6e2e2d4c4e5371242be4fd07f77d0351c3a |
Provenance
The following attestation bundles were made for cendor_tokenguard-1.0.0.tar.gz:
Publisher: release.yml on cendorhq/Cendor
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cendor_tokenguard-1.0.0.tar.gz
- Subject digest: 2da5baf17e8a1691d4598b936231658f35e5401e9baa57deb477b2e536691858
- Sigstore transparency entry: 2063270971
- Sigstore integration time:
- Permalink: cendorhq/Cendor@1733d9d073230ac9448221f660fce4ab07a42c33
- Branch / Tag: refs/tags/tokenguard-v1.0.0
- Owner: https://github.com/cendorhq
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1733d9d073230ac9448221f660fce4ab07a42c33
- Trigger Event: push
File details
Details for the file cendor_tokenguard-1.0.0-py3-none-any.whl.
File metadata
- Download URL: cendor_tokenguard-1.0.0-py3-none-any.whl
- Upload date:
- Size: 19.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 455f10d963a8f14d116ff3f0463cce66f598f597f7aef2be1e72b38affd34d51 |
| MD5 | ebffd40baf801e95a5942edf9e53e4a4 |
| BLAKE2b-256 | f3ba98a7920a8f94c45154db782de4d5fc38344999fefec484aed9a127679e21 |
Provenance
The following attestation bundles were made for cendor_tokenguard-1.0.0-py3-none-any.whl:
Publisher: release.yml on cendorhq/Cendor
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cendor_tokenguard-1.0.0-py3-none-any.whl
- Subject digest: 455f10d963a8f14d116ff3f0463cce66f598f597f7aef2be1e72b38affd34d51
- Sigstore transparency entry: 2063271074
- Sigstore integration time:
- Permalink: cendorhq/Cendor@1733d9d073230ac9448221f660fce4ab07a42c33
- Branch / Tag: refs/tags/tokenguard-v1.0.0
- Owner: https://github.com/cendorhq
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@1733d9d073230ac9448221f660fce4ab07a42c33
- Trigger Event: push