LLM budget enforcement and cost tracking. Zero config: with budget(max_usd=1.00): run_agent(). Or: shekel run agent.py --budget 1. Works with LangGraph, CrewAI, raw OpenAI/Anthropic/Gemini.
with budget(max_usd=5.00):
    run_my_agent()  # hard stop at $5. no SDK changes. no config. just works.

shekel run agent.py --budget 5  # or enforce without touching code at all
I woke up to a $47 AWS bill from a LangGraph agent that spent the night retrying a failed tool call. OpenAI was happy to keep charging. I built shekel so you don't have to learn that lesson yourself.
Install
pip install shekel[openai] # OpenAI
pip install shekel[anthropic] # Anthropic
pip install shekel[all] # OpenAI + Anthropic + LiteLLM + Gemini + HuggingFace
pip install shekel[cli] # shekel run - enforce budgets without touching code
Works with everything
If it calls OpenAI or Anthropic under the hood, shekel sees it - zero integration code needed.
| Providers | Frameworks | Status |
|---|---|---|
| OpenAI · Anthropic · Gemini | LangChain · LangGraph | Auto-patched |
| HuggingFace · LiteLLM · Groq | CrewAI · OpenAI Agents SDK | Auto-patched |
| MCP · AutoGen · LlamaIndex | Any custom wrapper | Auto-patched |
Every pattern you'll actually use
Hard cap - the one that saves you
from shekel import budget

with budget(max_usd=5.00):
    run_my_agent()
# raises BudgetExceededError the moment spend crosses $5
No wrapping your OpenAI client. No decorators. No SDK replacement. shekel monkey-patches the provider on context entry and restores it on exit. Your existing code runs unchanged.
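The patch-and-restore trick is plain Python. Here is a minimal sketch of the mechanism (a simplified illustration, not shekel's actual code; `_FakeClient` stands in for a provider SDK client):

```python
import contextlib

class _FakeClient:
    """Stand-in for a provider SDK client (e.g. the openai module)."""
    def create(self, **kwargs):
        return "real response"

client = _FakeClient()
stats = {"calls": 0}

@contextlib.contextmanager
def metered():
    """Swap client.create for a counting wrapper on entry; restore on exit."""
    original = client.create
    def wrapper(**kwargs):
        stats["calls"] += 1        # a real tracker would price the usage here
        return original(**kwargs)
    client.create = wrapper
    try:
        yield stats
    finally:
        client.create = original   # always restore, even if the body raised

with metered():
    client.create(model="gpt-4o")  # existing call sites run unchanged
```

Because the patch lives only inside the `with` block, any code outside it talks to the untouched client.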
Warn before the limit hits
with budget(max_usd=5.00, warn_at=0.8) as b:
    run_my_agent()
# logs a warning at $4.00, raises at $5.00
Track spend without enforcing
with budget() as b:
    run_my_agent()
print(f"that cost ${b.spent:.4f}")
Switch to a cheaper model instead of crashing
with budget(max_usd=1.00, fallback={"at_pct": 0.8, "model": "gpt-4o-mini"}) as b:
    run_my_agent()
# switches gpt-4o -> gpt-4o-mini at $0.80, hard stops at $1.00
Cap tool calls - stop the infinite search loop
from shekel import tool

@tool(price=0.01)  # charge $0.01 per call + count toward the cap
def web_search(query: str) -> str: ...

@tool  # free - just count calls
def read_file(path: str) -> str: ...

with budget(max_usd=5.00, max_tool_calls=20) as b:
    run_my_agent()
# ToolBudgetExceededError on call 21 - before the tool runs

print(b.summary())  # LLM spend + tool spend broken out by tool name
Auto-intercepted with zero config: LangChain, MCP, CrewAI, OpenAI Agents SDK.
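The raise-before-the-call behavior is easy to picture with a stand-in decorator (illustrative only; shekel's `@tool` additionally hooks into the active budget context):

```python
class ToolCallLimitError(RuntimeError):
    """Stand-in for ToolBudgetExceededError."""

def capped(max_calls):
    """Count calls to a tool and refuse to run it past the cap."""
    def decorate(fn):
        calls = {"n": 0}
        def wrapper(*args, **kwargs):
            if calls["n"] >= max_calls:
                # raise *before* the tool body executes
                raise ToolCallLimitError(
                    f"{fn.__name__}: call {calls['n'] + 1} over cap {max_calls}"
                )
            calls["n"] += 1
            return fn(*args, **kwargs)
        return wrapper
    return decorate

@capped(max_calls=2)
def web_search(query):
    return f"results for {query}"

web_search("a")
web_search("b")
try:
    web_search("c")            # third call: blocked before it runs
except ToolCallLimitError as e:
    blocked = str(e)
```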
Per-stage budget control
with budget(max_usd=10.00, name="pipeline") as pipeline:
    with budget(max_usd=2.00, name="research"):
        results = search_web(query)   # capped at $2
    with budget(max_usd=5.00, name="analysis"):
        report = analyze(results)     # capped at $5

print(pipeline.tree())
# pipeline: $4.80 / $10.00
#   research: $1.20 / $2.00
#   analysis: $3.60 / $5.00
Children auto-cap to the parent's remaining balance. Calling .tree() on the parent gives you a live visual breakdown.
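The auto-capping rule is just min(child limit, parent remaining); a minimal sketch of that arithmetic (the function name is ours, not shekel's API):

```python
def effective_cap(child_max, parent_max, parent_spent):
    """A child budget can never exceed what its parent has left."""
    remaining = max(parent_max - parent_spent, 0.0)
    return min(child_max, remaining)

# Parent has $10, already spent $8.50 -> only $1.50 left,
# so a child asking for $2.00 is effectively capped at $1.50.
print(effective_cap(2.00, 10.00, 8.50))  # 1.5
```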
LangGraph - per-node circuit breaking
with budget(max_usd=10.00, name="graph") as b:
    b.node("fetch_data", max_usd=0.50)  # NodeBudgetExceededError before node runs
    b.node("summarize", max_usd=1.00)
    app = graph.compile()
    app.invoke({"query": "..."})

print(b.tree())
# graph: $0.84 / $10.00
#   [node] fetch_data: $0.12 / $0.50 (24%)
#   [node] summarize: $0.72 / $1.00 (72%)
Shekel patches StateGraph.add_node() transparently - no graph changes needed.
LangChain - per-chain circuit breaking
with budget(max_usd=5.00, name="pipeline") as b:
    b.chain("retriever", max_usd=0.20)  # ChainBudgetExceededError before chain runs
    b.chain("summarizer", max_usd=1.00)
    retriever_chain.invoke({"query": "..."})
    summarizer_chain.invoke({"doc": "..."})
Shekel patches Runnable._call_with_config and RunnableSequence.invoke - zero changes to your chains.
CrewAI - per-agent and per-task circuit breaking
from shekel.exceptions import AgentBudgetExceededError, TaskBudgetExceededError

try:
    with budget(max_usd=5.00, name="crew") as b:
        b.agent(researcher.role, max_usd=2.00)    # use agent.role directly
        b.agent(writer.role, max_usd=1.00)
        b.task(research_task.name, max_usd=1.50)  # use task.name directly
        b.task(write_task.name, max_usd=0.80)
        crew.kickoff(inputs={"topic": "AI"})
except TaskBudgetExceededError as e:
    print(f"Task '{e.task_name}' over budget: ${e.spent:.4f} / ${e.limit:.2f}")
except AgentBudgetExceededError as e:
    print(f"Agent '{e.agent_name}' over budget")

print(b.tree())
# crew: $2.84 / $5.00
#   [agent] Senior Researcher: $1.92 / $2.00 (96.0%)
#   [agent] Content Writer: $0.92 / $1.00 (92.0%)
#   [task] research: $1.92 / $1.50 (128.0%)
#   [task] write: $0.92 / $0.80 (115.0%)
Shekel patches Agent.execute_task transparently. Gate order: task cap -> agent cap -> global (most specific first).
Distributed budgets - enforce across multiple processes
from shekel.backends.redis import RedisBackend

backend = RedisBackend()  # reads REDIS_URL from env; fail-closed by default

with budget("$5/hr + 100 calls/hr", name="api-tier", backend=backend) as b:
    response = client.chat.completions.create(...)
# Atomic Lua-script enforcement - one Redis round-trip per call
# BudgetConfigMismatchError if the same name is reused with different limits
Works with AsyncRedisBackend for async workflows. Circuit breaker built in - configurable threshold + cooldown. Fail-open or fail-closed.
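A threshold-plus-cooldown breaker in fail-open mode can be sketched like this (our own simplified model of the idea; shekel's breaker internals and names may differ):

```python
class Breaker:
    """After `threshold` consecutive backend errors, stop enforcing for
    `cooldown_s` seconds (fail-open: calls go through unmetered)."""
    def __init__(self, threshold=3, cooldown_s=30.0):
        self.threshold, self.cooldown_s = threshold, cooldown_s
        self.failures, self.opened_at = 0, None

    def allow_enforcement(self, now):
        # While open and inside the cooldown, skip enforcement entirely.
        if self.opened_at is not None:
            if now - self.opened_at < self.cooldown_s:
                return False
            # Cooldown elapsed: close the breaker and try the backend again.
            self.opened_at, self.failures = None, 0
        return True

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = now

brk = Breaker(threshold=2, cooldown_s=10.0)
brk.record_failure(now=0.0)
brk.record_failure(now=1.0)            # second failure trips the breaker
print(brk.allow_enforcement(now=5.0))   # False - open, fail-open mode
print(brk.allow_enforcement(now=12.0))  # True - cooldown elapsed
```

A fail-closed variant would raise instead of returning False when the backend is unreachable.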
Rolling-window rate limits
with budget("$5/hr", name="api-tier") as b:
    response = await client.chat.completions.create(...)
# BudgetExceededError carries retry_after so callers know when the window resets
Multi-cap: budget("$5/hr + 100 calls/hr") - USD and call-count windows are independent.
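One way to picture a rolling-window call cap, including where `retry_after` comes from (an illustrative sketch, not shekel's implementation):

```python
from collections import deque

class RollingWindowLimit:
    """Sliding-window counter: at most max_calls per window_s seconds."""
    def __init__(self, max_calls, window_s):
        self.max_calls, self.window_s = max_calls, window_s
        self.stamps = deque()  # timestamps of calls still inside the window

    def check(self, now):
        """Return (allowed, retry_after_seconds)."""
        # Drop events that have aged out of the window.
        while self.stamps and now - self.stamps[0] >= self.window_s:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            # Seconds until the oldest event expires -> retry_after.
            return False, self.window_s - (now - self.stamps[0])
        self.stamps.append(now)
        return True, 0.0

limit = RollingWindowLimit(max_calls=2, window_s=3600)
print(limit.check(now=0.0))  # (True, 0.0)
print(limit.check(now=1.0))  # (True, 0.0)
print(limit.check(now=2.0))  # (False, 3598.0) - third call inside the hour
```

A "$5/hr + 100 calls/hr" budget is then two such windows checked independently, one tracking dollars and one tracking calls.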
Accumulate across sessions
session = budget(max_usd=20.00, name="session")

with session: run_step_1()  # $3.20
with session: run_step_2()  # $8.10
with session: run_step_3()  # raises at $20

print(f"total: ${session.spent:.2f}")
Enforce from the CLI - zero code changes
Don't want to touch the code at all? Don't.
pip install shekel[cli]
shekel run agent.py --budget 5
# exit 0 = under budget | exit 1 = budget exceeded - CI-friendly
Drop it into any pipeline:
# Shell / cron / Docker
AGENT_BUDGET_USD=5 shekel run agent.py
# GitHub Actions
- run: shekel run agent.py --budget 5
# Docker โ operator sets budget at runtime, no rebuild needed
ENTRYPOINT ["shekel", "run", "agent.py"]
# docker run -e AGENT_BUDGET_USD=5 my-agent-image
Key flags:
--budget 5 # hard stop in USD
--warn-at 0.8 # log warning at 80%, hard stop at 100%
--max-llm-calls 20 # cap by call count instead of spend
--max-tool-calls 50 # cap agent tool calls
--warn-only # log but never exit 1 (soft guardrail)
--dry-run # track costs, no enforcement
--output json # machine-readable spend summary for log pipelines
--budget-file shekel.toml # load limits from config file
What the spend summary looks like
with budget(max_usd=5.00) as b:
    run_my_agent()

print(b.summary())
────────────────────────────────────────
shekel spend summary
────────────────────────────────────────
Total: $1.2450 / $5.00 (25%)

gpt-4o: $1.1320 (5 calls)
  Input:  45.2k tokens -> $0.1130
  Output: 11.3k tokens -> $1.1320

Tool spend: $0.1130 (9 tool calls)
  web_search  $0.090  (9 calls)  [langchain]
────────────────────────────────────────
Or machine-readable:
shekel run agent.py --budget 5 --output json
# {"spent": 1.245, "limit": 5.0, "calls": 5, "tool_calls": 9, "status": "ok", "model": "gpt-4o"}
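The JSON summary slots straight into a gating script; a small sketch (the field names follow the example line above, and the `gate` helper and 90% threshold are ours):

```python
import json

def gate(summary_line, max_pct=0.9):
    """Return a CI-style exit code: 0 if status is ok and spend stayed
    under max_pct of the limit, else 1."""
    s = json.loads(summary_line)
    pct = s["spent"] / s["limit"]
    return 0 if s["status"] == "ok" and pct <= max_pct else 1

line = ('{"spent": 1.245, "limit": 5.0, "calls": 5, '
        '"tool_calls": 9, "status": "ok", "model": "gpt-4o"}')
print(gate(line))  # 0
```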
The decorator
from shekel import with_budget

@with_budget(max_usd=0.10)
def summarize(text: str) -> str:
    return client.chat.completions.create(...).choices[0].message.content
# budget enforced independently on every call
How it works
shekel monkey-patches openai.chat.completions.create and anthropic.messages.create on __enter__ and restores the originals on __exit__. Spend is tracked in a ContextVar - concurrent agents in the same process never share state. Nested with budget() blocks form a tree; child spend rolls up automatically.
No background threads. No external services. No API keys. Nothing leaves your machine.
Observability
- Langfuse - cost streaming, circuit-break events, budget hierarchy in Langfuse spans
- OpenTelemetry - 9 instruments: shekel.llm.cost_usd, shekel.budget.utilization, shekel.budget.spend_rate, shekel.tool.calls_total, and more
from shekel.otel import ShekelMeter

meter = ShekelMeter()  # attaches to global MeterProvider; silent no-op if OTel absent
Supported models
Built-in pricing for GPT-4o, GPT-4o-mini, o1, o3, Claude 3.5/3/3.7 Sonnet, Claude 3 Haiku/Opus, Gemini 2.0/2.5 Flash/Pro, and more.
pip install shekel[all-models] # 400+ models via tokencost
shekel models # list all bundled models and pricing
shekel estimate --model gpt-4o --input-tokens 1000 --output-tokens 500
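What `shekel estimate` computes is simple arithmetic: tokens times per-token prices, with input and output priced separately. A sketch with hypothetical rates (the $2.50/$10.00 per 1M-token prices below are made-up examples; real rates come from shekel's bundled pricing tables):

```python
def estimate_usd(input_tokens, output_tokens, in_per_mtok, out_per_mtok):
    """Cost = tokens / 1e6 * price-per-million-tokens, summed over
    input and output streams."""
    return (input_tokens / 1e6 * in_per_mtok
            + output_tokens / 1e6 * out_per_mtok)

# 1000 input + 500 output tokens at hypothetical $2.50 / $10.00 per 1M tokens
print(round(estimate_usd(1000, 500, 2.50, 10.00), 6))  # 0.0075
```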
API quick reference
budget(
    max_usd=5.00,                      # hard USD cap
    warn_at=0.8,                       # warn at 80%
    max_llm_calls=50,                  # cap by call count
    max_tool_calls=100,                # cap tool dispatches
    tool_prices={"web_search": 0.01},  # charge per tool
    fallback={"at_pct": 0.8, "model": "gpt-4o-mini"},  # switch instead of crash
    name="my-agent",                   # required for nesting + temporal budgets
    backend=RedisBackend(),            # distributed enforcement across processes
)
budget("$5/hr + 100 calls/hr", name="api-tier") # multi-cap rolling-window
Component caps - all chainable, all raise before the component executes:
b.node("fetch_data", max_usd=0.50)   # LangGraph node -> NodeBudgetExceededError
b.chain("retriever", max_usd=0.20)   # LangChain chain -> ChainBudgetExceededError
b.agent("researcher", max_usd=1.00)  # CrewAI agent -> AgentBudgetExceededError
b.task("summarize", max_usd=0.50)    # CrewAI task -> TaskBudgetExceededError
Exceptions - all subclass BudgetExceededError, so one except catches everything:
| Exception | Raised when | Key fields |
|---|---|---|
| BudgetExceededError | Global cap hit | spent, limit, model, retry_after |
| NodeBudgetExceededError | LangGraph node cap hit | node_name, spent, limit |
| AgentBudgetExceededError | CrewAI agent cap hit | agent_name, spent, limit |
| TaskBudgetExceededError | CrewAI task cap hit | task_name, spent, limit |
| ChainBudgetExceededError | LangChain chain cap hit | chain_name, spent, limit |
| ToolBudgetExceededError | Tool call cap hit | tool_name, calls_used, calls_limit |
| BudgetConfigMismatchError | Redis name reused with different limits | - |
Security
Every PR and push to main runs CodeQL, Trivy, Bandit, and pip-audit. See the Security tab for results.
Documentation
- Quick Start
- CLI Reference
- Docker & Containers
- Nested Budgets
- Tool Budgets
- Temporal Budgets
- LangGraph Integration
- CrewAI Integration
- API Reference
Contributing
See CONTRIBUTING.md. PRs welcome - especially new framework adapters.
License
MIT