# shekel

LLM cost tracking and budget enforcement for Python. One line. Zero config.

```python
with budget(max_usd=1.00):
    run_my_agent()  # raises BudgetExceededError if spend exceeds $1.00
```
I spent $47 debugging a LangGraph retry loop. The agent kept failing, LangGraph kept retrying, and OpenAI kept charging — all while I slept. I built shekel so you don't have to learn that lesson yourself.
## ⚡️ What's New in v0.2.3: Nested Budgets
Control costs for multi-stage AI workflows with hierarchical budget tracking:
```python
with budget(max_usd=10.00, name="AI Research Assistant") as workflow:
    # Research phase: $2 budget
    with budget(max_usd=2.00, name="research"):
        sources = search_papers()        # $0.80
        summaries = summarize(sources)   # $1.10

    # Analysis phase: $5 budget
    with budget(max_usd=5.00, name="analysis"):
        insights = analyze(summaries)    # $3.50
        report = draft_report(insights)  # $1.20

    # Final polish (parent budget)
    final = polish_report(report)        # $0.60

print(f"Total cost: ${workflow.spent:.2f}")  # $7.20
print(workflow.tree())
# AI Research Assistant: $7.20 / $10.00 (direct: $0.60)
#   research: $1.90 / $2.00 (direct: $1.90)
#   analysis: $4.70 / $5.00 (direct: $4.70)
```
Why you'll love this:
- 🎯 Per-stage budgets — Cap each phase independently
- 🔒 Auto-capping — Child budgets can't exceed parent's remaining budget
- 📊 Cost attribution — See exactly where money was spent
- 🚫 Safety rails — Parent can't spend while child is active
- 🌳 Visual tree — Debug complex workflows instantly
## Install

```bash
pip install shekel[openai]      # OpenAI
pip install shekel[anthropic]   # Anthropic
pip install shekel[all]         # Both
pip install shekel[all-models]  # Both + tokencost (400+ model pricing)
pip install shekel[cli]         # CLI tools (shekel estimate, shekel models)
```
## Quick Start

### Simple Budget Enforcement
```python
from openai import OpenAI
from shekel import budget, BudgetExceededError

client = OpenAI()

# Enforce a hard cap
try:
    with budget(max_usd=1.00, warn_at=0.8) as b:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello!"}],
        )
    print(f"Spent ${b.spent:.4f}")
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.spent:.2f} > ${e.limit:.2f}")
```
### Track Without Limits

```python
# Track spend without enforcing a limit
with budget() as b:
    run_my_agent()
print(f"Cost: ${b.spent:.4f}")
```
### Fallback to Cheaper Model

```python
# Fall back to gpt-4o-mini instead of raising
with budget(max_usd=0.50, fallback="gpt-4o-mini") as b:
    response = client.chat.completions.create(
        model="gpt-4o",  # Will switch to gpt-4o-mini if needed
        messages=[{"role": "user", "content": prompt}],
    )

if b.model_switched:
    print(f"Switched to {b.fallback} at ${b.switched_at_usd:.4f}")
```
### Accumulating Sessions

```python
# Budget objects accumulate across multiple uses
session = budget(max_usd=5.00, name="session")

with session:
    run_step_1()  # Spends $1.50

with session:
    run_step_2()  # Spends $2.00

print(f"Session total: ${session.spent:.2f}")  # $3.50
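The accumulate-across-re-entry behavior can be modeled with a plain context manager that keeps its counter between uses. A minimal toy (illustration only, not shekel's implementation):

```python
class Session:
    """Toy re-enterable accumulator mirroring the session semantics above."""

    def __init__(self):
        self.spent = 0.0

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # don't swallow exceptions; keep the running total

    def charge(self, usd: float):
        self.spent += usd


session = Session()
with session:
    session.charge(1.50)  # step 1
with session:
    session.charge(2.00)  # step 2
print(f"{session.spent:.2f}")  # → 3.50
```

The key point is that `__exit__` leaves `self.spent` untouched, so a second `with` block continues from the previous total.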
## 🌳 Nested Budgets (v0.2.3)

Perfect for multi-stage agents, research workflows, and production AI pipelines.

### Real-World Example: AI Research Agent
```python
from shekel import budget

def research_agent(topic: str, max_budget: float = 10.0):
    """Research agent with per-stage budget control."""
    with budget(max_usd=max_budget, name="research_agent") as agent:
        # Phase 1: Web search ($2 budget)
        with budget(max_usd=2.00, name="web_search") as search:
            results = search_web(topic)
            if search.spent > 1.50:
                print("⚠️ Search phase used 75% of budget")

        # Phase 2: Content analysis ($5 budget)
        with budget(max_usd=5.00, name="analysis") as analysis:
            key_points = extract_insights(results)
            themes = identify_themes(key_points)

        # Phase 3: Report generation ($3 budget)
        with budget(max_usd=3.00, name="report_gen") as report:
            draft = generate_report(themes)
            final = refine_report(draft)

        # Print cost breakdown
        print(agent.tree())
        return final

# Run the agent
report = research_agent("AI safety alignment", max_budget=15.0)
```
### Auto-Capping: Smart Budget Management

```python
with budget(max_usd=10.00, name="workflow") as workflow:
    # Spend $7 on initial processing
    process_data()  # Spends $7.00

    # Child requests $5, but only $3 remains.
    # Shekel automatically caps the child at $3!
    with budget(max_usd=5.00, name="final_step") as step:
        print("Requested: $5.00")
        print(f"Actual limit: ${step.limit:.2f}")  # $3.00 (auto-capped!)
        generate_output()  # Won't exceed $3
```
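The capping rule itself reduces to clamping the requested limit to the parent's remaining budget. A one-function sketch of the rule as described (`effective_limit` is an illustrative helper, not shekel's API):

```python
def effective_limit(requested: float, parent_limit: float, parent_spent: float) -> float:
    """Auto-capping rule: a child's effective limit is its requested cap,
    clamped to whatever remains in the parent budget."""
    remaining = parent_limit - parent_spent
    return min(requested, remaining)


# The scenario above: $5 requested, $10 parent limit, $7 already spent
print(effective_limit(5.00, 10.00, 7.00))  # → 3.0
```

If the parent has more headroom than the child requests, the child's own cap wins: `effective_limit(2.00, 10.00, 7.00)` stays at `2.0`.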
### Hierarchical Cost Attribution

```python
with budget(max_usd=50.00, name="production_pipeline") as pipeline:
    with budget(max_usd=10.00, name="ingestion"):
        ingest_data()

    with budget(max_usd=20.00, name="processing"):
        with budget(max_usd=8.00, name="validation"):
            validate_data()
        with budget(max_usd=12.00, name="transformation"):
            transform_data()

    with budget(max_usd=15.00, name="output"):
        generate_report()

# Detailed breakdown
print(f"Total: ${pipeline.spent:.2f}")
print(f"Direct spend: ${pipeline.spent_direct:.2f}")
print(f"Child spend: ${pipeline.spent_by_children:.2f}")
print("\nFull tree:")
print(pipeline.tree())
```
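The breakdown relies on a simple invariant: a budget's total spend is its direct spend plus its children's totals, applied recursively. A toy sketch of that invariant (hypothetical `Node` class, not shekel's API):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """Toy budget node: total spend is direct spend plus child spend."""
    spent_direct: float = 0.0
    children: list["Node"] = field(default_factory=list)

    @property
    def spent_by_children(self) -> float:
        return sum(c.spent for c in self.children)

    @property
    def spent(self) -> float:
        return self.spent_direct + self.spent_by_children


pipeline = Node(spent_direct=0.5, children=[
    Node(spent_direct=1.0),
    Node(children=[Node(spent_direct=2.0), Node(spent_direct=3.0)]),
])
print(pipeline.spent)  # → 6.5 (0.5 direct + 1.0 + (2.0 + 3.0))
```

Because `spent` recurses through `children`, a grandchild's cost rolls up through every ancestor, which is what makes the tree totals consistent at every level.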
### Track-Only Children

```python
# Parent enforces a budget; children can be tracked without limits
with budget(max_usd=20.00, name="workflow") as workflow:
    # This child has no limit (max_usd=None)
    with budget(max_usd=None, name="exploration"):
        explore_options()  # Tracked but unlimited

    # This child is limited
    with budget(max_usd=5.00, name="finalization"):
        finalize()

print(f"Exploration cost: ${workflow.children[0].spent:.2f}")
print(f"Total cost: ${workflow.spent:.2f}")
```
## Advanced Features

### Async Support

```python
async with budget(max_usd=1.00) as b:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
```

**Note:** Async nesting is not yet supported in v0.2.3. Use sync nested budgets or a single-level async budget.
### Decorator Pattern

```python
from shekel import with_budget

@with_budget(max_usd=0.10)
def call_llm(prompt: str):
    return client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
```
### Custom Pricing

```python
# Override model pricing
with budget(
    max_usd=1.00,
    price_per_1k_tokens={"input": 0.001, "output": 0.003},
) as b:
    call_custom_model()
```
### Spend Summary

```python
with budget(max_usd=2.00) as b:
    run_my_agent()

print(b.summary())
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# shekel spend summary
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# Total: $1.2450 / $2.00 (62%)
#
# gpt-4o: $1.2450 (5 calls)
#   Input: 45.2k tokens → $0.1130
#   Output: 113.2k tokens → $1.1320
# ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
## CLI

```bash
# Estimate cost before running
shekel estimate --model gpt-4o --input-tokens 1000 --output-tokens 500
# Model: gpt-4o
# Input tokens: 1,000
# Output tokens: 500
# Estimated cost: $0.007500

# List all bundled models with pricing
shekel models
shekel models --provider openai
shekel models --provider anthropic
```
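The estimate is plain per-1k-token arithmetic against the bundled price table. A standalone sketch, hardcoding the gpt-4o prices from the Supported Models table below (`estimate_cost` is an illustrative helper, not shekel's API):

```python
# Prices per 1k tokens, taken from the Supported Models table
PRICES_PER_1K = {"gpt-4o": {"input": 0.00250, "output": 0.01000}}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """cost = tokens / 1000 * price-per-1k, summed over input and output."""
    p = PRICES_PER_1K[model]
    return input_tokens / 1000 * p["input"] + output_tokens / 1000 * p["output"]


# Same numbers as the CLI example: 1,000 input + 500 output tokens
print(f"${estimate_cost('gpt-4o', 1000, 500):.6f}")  # → $0.007500
```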
## API Reference

### `budget(...)`

| Parameter | Type | Default | Description |
|---|---|---|---|
| `max_usd` | `float \| None` | `None` | Hard spend cap in USD. `None` = track only. |
| `name` | `str \| None` | `None` | v0.2.3: Budget name. Required for nesting. |
| `warn_at` | `float \| None` | `None` | Fraction of limit (0.0–1.0) at which to warn. |
| `on_exceed` | `Callable \| None` | `None` | Callback at `warn_at` threshold. Receives `(spent, limit)`. |
| `fallback` | `str \| None` | `None` | Model to switch to when `max_usd` is hit. Same provider only. |
| `on_fallback` | `Callable \| None` | `None` | Callback on fallback switch. Receives `(spent, limit, fallback_model)`. |
| `hard_cap` | `float \| None` | `max_usd * 2` | Absolute ceiling when fallback is active. |
| `price_per_1k_tokens` | `dict \| None` | `None` | Override pricing: `{"input": 0.001, "output": 0.003}`. |
| `persistent` | `bool` | `False` | DEPRECATED v0.2.3: Budgets always accumulate now. |
### Properties

| Property | Type | Description |
|---|---|---|
| `spent` | `float` | Total USD spent (includes children). |
| `remaining` | `float \| None` | USD remaining (based on effective limit). |
| `limit` | `float \| None` | Effective limit (auto-capped if nested). |
| `name` | `str \| None` | Budget name. |
| `parent` | `Budget \| None` | v0.2.3: Parent budget, or `None` if root. |
| `children` | `list[Budget]` | v0.2.3: List of child budgets. |
| `active_child` | `Budget \| None` | v0.2.3: Currently active child. |
| `full_name` | `str` | v0.2.3: Hierarchical path (e.g., `"workflow.research"`). |
| `spent_direct` | `float` | v0.2.3: Direct spend (excluding children). |
| `spent_by_children` | `float` | v0.2.3: Sum of all child spend. |
| `model_switched` | `bool` | `True` if fallback was activated. |
| `switched_at_usd` | `float \| None` | Spend level when fallback triggered. |
| `fallback_spent` | `float` | Cost on the fallback model. |
### Methods

| Method | Returns | Description |
|---|---|---|
| `summary()` | `str` | Formatted spend summary with model breakdown. |
| `summary_data()` | `dict` | Structured spend data as dictionary. |
| `tree()` | `str` | v0.2.3: Visual hierarchy of budget tree. |
| `reset()` | `None` | Reset spend tracking (only outside context). |
### `BudgetExceededError`

| Attribute | Description |
|---|---|
| `spent` | Total spend when limit was hit. |
| `limit` | The configured `max_usd`. |
| `model` | Model that triggered the error. |
| `tokens` | `{"input": N, "output": N}` from the last call. |
## Supported Models
| Model | Input / 1k | Output / 1k |
|---|---|---|
| gpt-4o | $0.00250 | $0.01000 |
| gpt-4o-mini | $0.000150 | $0.000600 |
| o1 | $0.01500 | $0.06000 |
| o1-mini | $0.00300 | $0.01200 |
| gpt-3.5-turbo | $0.000500 | $0.001500 |
| claude-3-5-sonnet-20241022 | $0.00300 | $0.01500 |
| claude-3-haiku-20240307 | $0.000250 | $0.001250 |
| claude-3-opus-20240229 | $0.01500 | $0.07500 |
| gemini-1.5-flash | $0.0000750 | $0.000300 |
| gemini-1.5-pro | $0.00125 | $0.00500 |
Versioned model names resolve automatically — `gpt-4o-2024-08-06` maps to `gpt-4o`.

For unlisted models, pass `price_per_1k_tokens` or install `shekel[all-models]` for 400+ models via tokencost.
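One plausible way to implement that resolution (an assumption for illustration, not necessarily shekel's internals) is to strip a trailing date suffix and fall back to the base pricing entry:

```python
import re

# Hypothetical subset of bundled base names (see the table above)
BUNDLED = {"gpt-4o", "gpt-4o-mini", "o1", "o1-mini"}

def resolve(model: str) -> str:
    """Map a dated OpenAI-style name like 'gpt-4o-2024-08-06' to its base entry;
    leave everything else (including unknown names) unchanged."""
    base = re.sub(r"-\d{4}-\d{2}-\d{2}$", "", model)
    return base if base in BUNDLED else model


print(resolve("gpt-4o-2024-08-06"))  # → gpt-4o
print(resolve("gpt-4o"))             # → gpt-4o
```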
## Framework Integration

Works seamlessly with:
- LangGraph — Budget entire agent workflows
- CrewAI — Per-agent budget tracking
- AutoGen — Multi-agent cost control
- LlamaIndex — RAG pipeline budgets
- Haystack — Document processing budgets
Any framework that calls `openai` or `anthropic` under the hood works automatically. See `examples/` for demos.
## How It Works

- **Monkey-patching** — Wraps `openai.chat.completions.create()` and `anthropic.messages.create()` on context entry
- **ContextVar isolation** — Each `budget()` stores its counter in a `ContextVar`; concurrent agents never share state
- **Hierarchical tracking** — Parent/child relationships propagate spend automatically
- **Ref-counted patching** — Nested contexts patch only once
- **Zero config** — No API keys, no external services
## Migration Guide (v0.2.2 → v0.2.3)

### Breaking Changes

**Budgets now accumulate by default:**

```python
b = budget(max_usd=10.00)

with b:
    spend_1()  # Spends $2
with b:
    spend_2()  # Spends $2 more

# v0.2.2: b.spent == $2  (reset on each entry)
# v0.2.3: b.spent == $4  ⚠️ ACCUMULATES!
```

Migration:

- If you relied on reset behavior: create new `budget()` instances instead
- If you used `persistent=True`: remove it (now the default)

**Names are required for nesting:**

```python
# v0.2.3: Names required when nesting
with budget(max_usd=10, name="parent"):    # ✅ Required
    with budget(max_usd=5, name="child"):  # ✅ Required
        work()
```
### New Features

- ✅ Nested budgets with automatic propagation
- ✅ Auto-capping to parent's remaining budget
- ✅ `tree()` method for visual hierarchy
- ✅ `spent_direct` and `spent_by_children` properties
- ✅ `full_name` for hierarchical naming
- ✅ Max nesting depth of 5 levels
## Documentation

Full documentation: [arieradle.github.io/shekel](https://arieradle.github.io/shekel)

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

MIT