Local budget guardrail for AI agents — hard-stops a runaway loop before its next LLM call crosses a spend ceiling. No account, no network.
floe-guard
A local budget guardrail for AI agents. It hard-stops your agent before its next LLM call when it would cross a spend ceiling — so a runaway loop dies at $0.10 instead of $4,000. No account, no signup, no network. Runs in your process.
pip install floe-guard
from floe_guard import BudgetGuard
guard = BudgetGuard(limit_usd=5.00) # your ceiling
guard.check() # before each LLM call — raises if it'd cross
response = call_your_llm(...) # your existing call
guard.record("gpt-4o", response.usage.prompt_tokens, response.usage.completion_tokens)
When the next call would cross the ceiling, the guard raises BudgetExceeded and
prints:
BUDGET EXCEEDED — call blocked
spent so far: $5.001250 | ceiling: $5.000000
The next call would cross your budget; floe-guard stopped your agent before it ran.
See it stop a loop (no API key needed)
python examples/runaway_loop.py
This rigs a loop against a stub LLM — no real API key, no account, no network.
It prices each fake gpt-4o call offline and the guard halts the loop after a few
iterations. This is the reproducible "stop the loop" demo.
How it works
The guard sits in the call path, not on an event bus. A passive listener is told about spend after the fact and can't halt anything — so enforcement has to be the thing standing in front of the next call:
- check() runs before each LLM call. It predicts the next call's cost from the last one and raises BudgetExceeded if that would cross your ceiling, so the call never runs. (A running-total check also catches an overshoot if an estimate came in low.)
- record(model, prompt_tokens, completion_tokens) runs after each response. It prices the tokens offline from a bundled LiteLLM cost map and adds the USD to a running total.
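The check-then-record flow described above can be sketched in a few lines. This is a minimal illustration of the predict-then-block idea, not floe-guard's actual implementation: the class `SketchGuard`, the exception `SketchBudgetExceeded`, the helper `run_runaway_loop`, and the flat per-token prices are all hypothetical stand-ins.

```python
class SketchBudgetExceeded(RuntimeError):
    """Raised when the next call is predicted to cross the ceiling (hypothetical)."""


class SketchGuard:
    """Toy stand-in for a call-path budget guard; not the real BudgetGuard."""

    def __init__(self, limit_usd, prompt_price, completion_price):
        self.limit_usd = limit_usd
        self.prompt_price = prompt_price          # assumed USD per prompt token
        self.completion_price = completion_price  # assumed USD per completion token
        self.spent_usd = 0.0
        self.last_cost_usd = 0.0  # reused as the estimate for the next call

    def check(self):
        # Prediction: assume the next call costs what the last one did.
        projected = self.spent_usd + self.last_cost_usd
        # The running-total test catches an overshoot from a low estimate.
        if self.spent_usd >= self.limit_usd or projected > self.limit_usd:
            raise SketchBudgetExceeded(
                f"spent ${self.spent_usd:.6f}, ceiling ${self.limit_usd:.6f}"
            )

    def record(self, prompt_tokens, completion_tokens):
        cost = (prompt_tokens * self.prompt_price
                + completion_tokens * self.completion_price)
        self.spent_usd += cost
        self.last_cost_usd = cost
        return cost


def run_runaway_loop(guard, max_iterations=1000):
    """Drive a fake agent loop until the guard blocks a call; return calls made."""
    calls = 0
    try:
        for _ in range(max_iterations):
            guard.check()            # stands in front of the (stubbed) LLM call
            guard.record(1000, 500)  # pretend the call used these tokens
            calls += 1
    except SketchBudgetExceeded:
        pass
    return calls
```

Because `check()` runs before the call rather than after it, the loop ends with the blocked call never made. A listener that only saw `record()` events would have no point at which to stop it.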
Unpriceable models fail closed
If a model isn't in the cost map and you didn't supply a price, the guard warns
loudly and refuses (UnpriceableModelError) rather than silently treat it as
free — you can't cap spend you can't measure. Give it a price to enforce it:
from floe_guard import BudgetGuard, ManualPrice
guard = BudgetGuard(
    limit_usd=5.00,
    price_overrides={"my-self-hosted-model": ManualPrice(1e-6, 2e-6)},  # USD/token
)
# or, set fail_closed=False to warn-and-skip for models you accept un-metered.
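To make the fail-closed behaviour concrete, here is a minimal sketch of a price lookup that refuses unknown models unless told otherwise. It is an assumption about the general shape of such a lookup, not floe-guard's code: `COST_MAP`, `price_call`, `SketchUnpriceableModel`, and the prices in the map are all hypothetical.

```python
import warnings


class SketchUnpriceableModel(LookupError):
    """Hypothetical stand-in for an 'unpriceable model' error."""


# Hypothetical snapshot: model -> (USD per prompt token, USD per completion token).
# These numbers are made up for illustration, not real vendor prices.
COST_MAP = {"example-model": (2e-6, 8e-6)}


def price_call(model, prompt_tokens, completion_tokens,
               overrides=None, fail_closed=True):
    """Return the USD cost of one call, or refuse if the model has no price."""
    prices = {**COST_MAP, **(overrides or {})}  # overrides win over the snapshot
    if model not in prices:
        warnings.warn(f"no price for model {model!r}")
        if fail_closed:
            raise SketchUnpriceableModel(model)
        return 0.0  # warn-and-skip: this call goes un-metered
    prompt_price, completion_price = prices[model]
    return prompt_tokens * prompt_price + completion_tokens * completion_price
```

Returning 0.0 for an unknown model is exactly the silent "free" pricing that the default avoids, which is why the sketch only does so when `fail_closed=False` is passed explicitly.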
Framework adapters (optional extras)
CrewAI
pip install floe-guard[crewai]
from crewai import Crew
from floe_guard import BudgetGuard
from floe_guard.integrations.crewai import guard_crew
guard = BudgetGuard(limit_usd=1.00)
guard_crew(guard) # one line — enforces across the whole crew
Crew(agents=[...], tasks=[...]).kickoff()
CrewAI runs on LiteLLM, so one callback caps every agent and task under a single budget.
LiteLLM
pip install floe-guard[litellm]
from floe_guard import BudgetGuard
from floe_guard.integrations.litellm import guarded_completion
guard = BudgetGuard(limit_usd=1.00)
response = guarded_completion(guard, model="gpt-4o", messages=[...])
Prefer the LiteLLM-native callback? Register budget_guard_callback(guard) on
litellm.callbacks.
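A wrapper in this style presumably amounts to check, call, record. The sketch below shows that ordering with stubs only, so it runs without LiteLLM or a network. `guarded_call`, `RecordingGuard`, `stub_llm`, and the dict-shaped response are hypothetical names and shapes chosen for illustration, not the library's API.

```python
class RecordingGuard:
    """Stub guard that only logs the order of calls (hypothetical)."""

    def __init__(self):
        self.events = []

    def check(self):
        self.events.append("check")

    def record(self, model, prompt_tokens, completion_tokens):
        self.events.append(("record", model, prompt_tokens, completion_tokens))


def stub_llm(model, messages):
    """Fake LLM: no network, fixed token usage."""
    return {"text": "ok", "usage": {"prompt_tokens": 12, "completion_tokens": 3}}


def guarded_call(guard, llm_call, model, **kwargs):
    """Check the budget, make the call, then record its usage (a sketch)."""
    guard.check()  # a real guard may raise here, before any money is spent
    response = llm_call(model=model, **kwargs)
    usage = response["usage"]  # assumed response shape for this sketch
    guard.record(model, usage["prompt_tokens"], usage["completion_tokens"])
    return response
```

Calling `guarded_call(RecordingGuard(), stub_llm, "gpt-4o", messages=[])` logs one `check` followed by one `record`, which is the ordering the guard relies on.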
Coming next
LangChain (callback) and the Vercel AI SDK (TypeScript middleware) are next. Open an issue if you want one sooner.
Honest about what this is
floe-guard is a local, estimate-based guardrail. It prices tokens from a vendored cost map inside your process:
- The cost map can drift as vendors change prices — refresh it like any snapshot.
- It only sees the vendors you instrument.
- A determined agent or a bug could route around an in-process check.
It's genuinely useful on its own, and it's honest about its limits. No inflated metrics, no "zero defaults" claims — it's a free local stop, not a vault.
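The cost-map drift limitation can be put in numbers. The function below is a rough back-of-the-envelope sketch, assuming a single model and that every token is mispriced by the same ratio; `real_spend_at_stop` is a hypothetical helper, not part of floe-guard.

```python
def real_spend_at_stop(ceiling_usd, map_price, actual_price):
    """Approximate real spend when the guard stops at its estimated ceiling.

    Assumes one model and a uniform ratio between the snapshot price
    (map_price) and the vendor's current price (actual_price).
    """
    return ceiling_usd * (actual_price / map_price)
```

Under those assumptions, with a $5.00 ceiling and a vendor price 20% above the snapshot, the estimate reaches $5.00 when the real bill is about $6.00.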
Upgrade to hosted Floe
When you need the ceiling to be un-bypassable and cross-vendor, hosted Floe moves enforcement server-side against a real credit line:
- Un-bypassable — enforced at the spend rail, not in your process.
- Cross-vendor — one budget over LLM tokens and paid (x402) tool calls.
- Team budgets + analytics — shared ceilings, per-agent isolation, spend history.
Set FLOE_API_KEY and floe-guard exposes a hook to delegate enforcement to
hosted Floe (see src/floe_guard/hosted.py). Wiring the live endpoint is still
in progress; the local guard is fully functional today.
→ dev-dashboard.floelabs.xyz · floelabs.xyz
Development
pip install -e ".[dev]"
pytest
ruff check .
License
MIT — see LICENSE.
File details
Details for the file floe_guard-0.1.0.tar.gz.
File metadata
- Download URL: floe_guard-0.1.0.tar.gz
- Upload date:
- Size: 17.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 628b20ff9350d621a9d0008c525d4a91030c8e4995d0579b2656e2b77c5b51dc |
| MD5 | f7b50ea4e035219ffc01b11b1dbc117b |
| BLAKE2b-256 | cd23951e4cffdb6c7ef57c4b0ac13deb819fc4da7ce6d755611f69f08dc6266f |
File details
Details for the file floe_guard-0.1.0-py3-none-any.whl.
File metadata
- Download URL: floe_guard-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a73dc38a9c4c1c45d9788782970b04855657fe3ae54cd59486f9ba5541ef0040 |
| MD5 | 943f3bfff97890a7ad6428c6f48f4b8a |
| BLAKE2b-256 | 45e31c52163dfaab3f4563920cb8a78efd64ca59c44ba518f632654623838b89 |