LLM cost reservation ledger — pre-flight reserve, commit, release.
Snipz
An LLM cost reservation ledger for Python. Cap your spend per user, per tenant, per feature — and never overshoot, even under concurrent load.
Status: v0.1.x, pre-1.0. The engine is feature-complete and the cap-correctness benchmark passes on real Postgres. The API may shift before v1.0; pin `snipz>=0.1,<0.2` to allow patches and forbid breaking changes.
async with await budget.reserve(Scope("user", "u_42"), Decimal("10")) as r:
    response = await call_anthropic(...)
    await r.observe(price(response))  # exit auto-commits at observed cost
    # on exception: auto-release; the cap is never overshot
Why
Every team building LLM features rebuilds cost guardrails from scratch. Existing libraries (LiteLLM BudgetManager, Shekel) follow an estimate-then-record pattern — they check the cap, run the call, then log the spend. Under concurrent load this lets two requests both pass a cap check at $4.95 of a $5.00 cap and both run, blowing the cap.
Snipz is a reservation ledger: every call holds budget inside a transaction with SELECT … FOR UPDATE (or BEGIN IMMEDIATE on SQLite) before the LLM runs, commits the actual cost on success, releases on failure. The cap-check and the ledger insert are a single atomic step. The cap is never overshot.
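To make the single atomic step concrete, here is a minimal sketch of the SQLite variant using only the standard library. It is not Snipz's implementation: the `limits` and `ledger` tables, the `CapExceeded` exception, and the `reserve` and `run_demo` helpers are all hypothetical names chosen for illustration, and amounts are stored as integer cents to keep the arithmetic exact.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


class CapExceeded(Exception):
    """Hypothetical stand-in for Snipz's BudgetExceededError."""


def init_db(path, scope, cap_cents):
    """Create the illustrative schema and set one cap (integer cents)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE limits (scope TEXT PRIMARY KEY, cap_cents INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE ledger (id INTEGER PRIMARY KEY, scope TEXT NOT NULL, "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO limits VALUES (?, ?)", (scope, cap_cents))
    conn.commit()
    conn.close()


def reserve(path, scope, amount_cents):
    """Check the cap and insert the ledger row inside one write transaction."""
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front, so no other writer
        # can slip in between the cap check and the insert.
        conn.execute("BEGIN IMMEDIATE")
        try:
            (cap,) = conn.execute(
                "SELECT cap_cents FROM limits WHERE scope = ?", (scope,)
            ).fetchone()
            (held,) = conn.execute(
                "SELECT COALESCE(SUM(amount_cents), 0) FROM ledger WHERE scope = ?",
                (scope,),
            ).fetchone()
            if held + amount_cents > cap:
                raise CapExceeded(f"{held} + {amount_cents} > {cap}")
            cur = conn.execute(
                "INSERT INTO ledger (scope, amount_cents) VALUES (?, ?)",
                (scope, amount_cents),
            )
            conn.execute("COMMIT")
            return cur.lastrowid
        except BaseException:
            conn.execute("ROLLBACK")
            raise
    finally:
        conn.close()


def run_demo(attempts=200, cap_cents=500, amount_cents=10, workers=8):
    """Fire concurrent reservations; return (successes, rejections)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ledger.db")
        init_db(path, "user:u_42", cap_cents)

        def attempt(_):
            try:
                reserve(path, "user:u_42", amount_cents)
                return True
            except CapExceeded:
                return False

        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(attempt, range(attempts)))
    return sum(results), attempts - sum(results)
```

Calling `run_demo()` fires 200 reservations of 10 cents against a 500-cent cap from 8 threads. Because the check and the insert share one write transaction, exactly 50 succeed regardless of interleaving. On Postgres the analogous sketch would lock the limit row with `SELECT … FOR UPDATE` instead.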
Cap-correctness benchmark
The proof: fire 1000 concurrent reservations of $0.10 each against a $5.00 cap on real Postgres.
Snipz Postgres pool near capacity (10/10). Consider increasing max_size for high-throughput environments.
Snipz cap-correctness benchmark
===============================
Concurrency: 1000
Cap: $5.00
Per-req: $0.10
Duration: 2.755s
Cap [########################################] $5.00
Spend [########################################] $5.00
Reservations attempted: 1000
Expected successes: 50
Actual successes: 50 ( 5%)
Rejected (cap): 950 ( 95%)
Lock timeouts: 0 ( 0%)
Other errors: 0 ( 0%)
CAP HELD: $5.00 <= $5.00 (no overshoot)
Exactly 50 reservations committed, 950 raised BudgetExceededError, final spend $5.00. The line above the chart is Snipz's own pool-utilization warning firing — observability is a feature (architecture.md Decision Log #21).
Reproduce in one command (needs Docker for the auto-spun Postgres container):
uv run python benchmarks/cap_correctness.py --testcontainers-postgres --concurrency 1000
No Docker? The default SQLite scenario, at 100 concurrent reservations, works without any setup:
uv run python benchmarks/cap_correctness.py
Quickstart
import asyncio
from decimal import Decimal
from snipz import Budget, Scope

async def main():
    budget = Budget("snipz.db")  # or "postgresql://..."
    await budget.migrate()
    await budget.set_limit(Scope("user", "u_42"), Decimal("500"))  # $5/month cap

    async with await budget.reserve(Scope("user", "u_42"), Decimal("10")) as r:
        response = await call_anthropic(...)
        await r.observe(price_from_usage(response.usage))
    # exit auto-commits at observed cost; auto-releases on exception

    await budget.close()

asyncio.run(main())
What you can rely on:
- Atomic cap-check. Two concurrent reserves at the cap → one wins, one raises `BudgetExceededError`. Verified by the benchmark above.
- Idempotent retries. Pass `request_id="..."` to `reserve()`; parallel retries with the same id converge on one ledger row.
- Streaming-aware. `r.observe(actual)` updates the in-flight cost mid-stream; the cap-check formula uses `MAX(actual, estimated)` so concurrent requests see the true running total.
- Late-commit safety. If your call takes longer than the TTL, the sweeper releases the row; a subsequent `commit()` still settles cleanly and fires `on_overrun`.
- Multi-scope. Reserve against `[user_scope, tenant_scope, feature_scope]` in one call: all caps checked atomically, atomic rollback on any failure.
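Two of these guarantees, the `MAX(actual, estimated)` hold and idempotent `request_id` handling, can be modeled in a few lines. The sketch below is an in-memory toy, not Snipz's engine: `ToyLedger`, `LedgerRow`, and their methods are hypothetical names, and the toy is not safe under concurrency (the real engine gets that from database transactions).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class LedgerRow:
    request_id: str
    estimated: Decimal
    actual: Decimal = Decimal("0")

    @property
    def held(self):
        # The amount counted against the cap: whichever is larger,
        # the pre-flight estimate or the cost observed so far.
        return max(self.actual, self.estimated)


class ToyLedger:
    def __init__(self, cap):
        self.cap = cap
        self.rows = {}  # request_id -> LedgerRow

    def total_held(self):
        return sum((row.held for row in self.rows.values()), Decimal("0"))

    def reserve(self, request_id, estimated):
        # Idempotency: a retry with a known request_id returns the
        # existing row instead of reserving a second time.
        if request_id in self.rows:
            return self.rows[request_id]
        if self.total_held() + estimated > self.cap:
            raise RuntimeError("cap exceeded")
        row = LedgerRow(request_id, estimated)
        self.rows[request_id] = row
        return row

    def observe(self, request_id, actual):
        # Mid-stream update of the running cost for an in-flight call.
        self.rows[request_id].actual = actual
```

With a cap of 100, reserving `"req-1"` at 40 twice yields the same row. After `observe("req-1", Decimal("70"))` the held total rises to 70, so a new reservation of 40 is rejected even though the original estimate alone would have left room for it.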
Install
pip install snipz               # core: SQLite, async
pip install "snipz[postgres]"   # + asyncpg for Postgres
pip install "snipz[openai]"     # + tiktoken for exact OpenAI token counts
What's in the box
| What | Where | What it's for |
|---|---|---|
| `Budget`, `Reservation`, `Scope` | `from snipz import ...` | The async engine: reserve / commit / release / observe |
| Sync wrapper (experimental) | `from snipz.sync import Budget` | For sync codebases: background event loop, raises if called from inside an async loop |
| `Pricing` | `from snipz import Pricing` | Vendored price book (`Pricing.default()`) + DB overrides (`Pricing.with_backend(...)`) |
| Estimators | `from snipz.estimators import AnthropicEstimator, OpenAIEstimator, FallbackEstimator` | Pre-flight token counters; OpenAI is exact via tiktoken |
| `@budget.guard` | `budget.guard(scope=..., estimate=..., actual=...)` | Decorator that wraps an async LLM call with the full reserve/observe/commit/release lifecycle |
| Hooks | `budget.on_reserved`, `on_committed`, `on_released`, `on_overrun` | Plug-in points for metrics, alerting, audit logs |
| Sweeper | `snipz sweep [--interval N]` CLI or `snipz.sweep.sweep_loop()` | Background job that releases expired reservations |
| `snipz update-pricing` | CLI | Refresh the vendored `pricing.toml` from LiteLLM upstream |
All async/sync surfaces share the same engine and correctness guarantees.
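As a rough illustration of how the sweeper and the `on_overrun` hook interact on a late commit, here is a small in-memory model. It is a sketch under assumptions, not the library's code: the `Reservation` dataclass, the `status` values, and the `sweep` and `commit` functions are hypothetical, and it assumes `on_overrun` fires whenever a commit arrives after the sweeper has already released the row.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Reservation:
    amount: Decimal
    expires_at: float  # seconds on whatever clock the caller passes to sweep()
    status: str = "reserved"


def sweep(rows, now):
    """Release every still-reserved row whose TTL has passed; return the count."""
    released = 0
    for row in rows:
        if row.status == "reserved" and row.expires_at <= now:
            row.status = "released"
            released += 1
    return released


def commit(row, actual, on_overrun=None):
    """Settle a row at its actual cost, even if the sweeper released it first."""
    was_released = row.status == "released"
    row.amount = actual
    row.status = "committed"
    # Assumption: a late commit is reported through the on_overrun hook.
    if was_released and on_overrun is not None:
        on_overrun(row)
    return row
```

Sweeping at `now=200.0` releases a row that expired at `100.0` and leaves one expiring at `300.0` untouched. A later `commit` on the released row still records the actual cost and passes the row to the `on_overrun` callback.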
Deep dives
- `snipz.md`: positioning, competitor analysis, build phases
- `architecture.md`: layered architecture, schema, full decision log
- `snipz-protocol.md`: wire protocol spec (DRAFT, comments open)
- `scenarios.md`: concurrency walkthroughs
Development
uv sync # install all deps + .venv
uv run pytest # 140 tests against SQLite
uv run pytest --postgres # + 15 Postgres integration tests (needs Docker)
uv run ruff check src/ tests/ benchmarks/
uv run mypy src/snipz benchmarks/
License
MIT