
LLM cost reservation ledger — pre-flight reserve, commit, release.


Snipz

An LLM cost reservation ledger for Python. Cap your spend per user, per tenant, per feature — and never overshoot, even under concurrent load.

Status: v0.1.x — pre-1.0. The engine is feature-complete and the cap-correctness benchmark passes on real Postgres. API may shift before v1.0; pin snipz>=0.1,<0.2 to allow patches and forbid breaks.

async with await budget.reserve(Scope("user", "u_42"), Decimal("10")) as r:
    response = await call_anthropic(...)
    await r.observe(price(response))   # exit auto-commits at observed cost
# on exception: auto-release; the cap is never overshot

Why

Every team building LLM features rebuilds cost guardrails from scratch. Existing libraries (LiteLLM BudgetManager, Shekel) follow an estimate-then-record pattern: they check the cap, run the call, then log the spend. Under concurrent load, two requests can both pass the cap check at $4.95 of a $5.00 cap, both run, and together overshoot the cap.
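The race described above can be reproduced in a few lines. This is a minimal sketch, not code from LiteLLM, Shekel, or Snipz: NaiveBudget is a hypothetical in-memory tracker, the $0.05 request cost is an assumed figure, and asyncio.sleep(0) stands in for the LLM call.

```python
import asyncio
from decimal import Decimal

CAP = Decimal("5.00")


class NaiveBudget:
    """Hypothetical estimate-then-record tracker, for illustration only."""

    def __init__(self, spent: Decimal) -> None:
        self.spent = spent

    async def run(self, cost: Decimal) -> bool:
        # Step 1: check the cap against the spend recorded so far.
        if self.spent + cost > CAP:
            return False
        # Step 2: the "LLM call". Control yields here, so other requests
        # can run their own cap check before this one records anything.
        await asyncio.sleep(0)
        # Step 3: record the spend only after the call returns.
        self.spent += cost
        return True


async def demo():
    budget = NaiveBudget(spent=Decimal("4.95"))
    results = await asyncio.gather(
        budget.run(Decimal("0.05")),
        budget.run(Decimal("0.05")),
    )
    return list(results), budget.spent
```

Both requests see $4.95 at check time, so both are admitted and the recorded spend ends above the cap.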

Snipz is a reservation ledger: every call holds budget inside a transaction with SELECT … FOR UPDATE (or BEGIN IMMEDIATE on SQLite) before the LLM runs, then commits the actual cost on success or releases the hold on failure. Because the cap check and the ledger insert are a single atomic step, the cap is never overshot.
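To make the "single atomic step" concrete, here is a minimal sketch of a check-and-insert under BEGIN IMMEDIATE using only the standard-library sqlite3 module. It is not Snipz's implementation: the ledger table, its columns, the integer-cents unit, and the function names are all assumptions made for illustration.

```python
import sqlite3


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None leaves transaction control to explicit
    # BEGIN / COMMIT / ROLLBACK statements.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ledger ("
        "scope TEXT NOT NULL, cents INTEGER NOT NULL)"
    )
    return conn


def try_reserve(conn: sqlite3.Connection, scope: str, cents: int, cap_cents: int) -> bool:
    # BEGIN IMMEDIATE takes the write lock up front, so no other writer
    # can slip in between the SUM below and the INSERT.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (held,) = conn.execute(
            "SELECT COALESCE(SUM(cents), 0) FROM ledger WHERE scope = ?",
            (scope,),
        ).fetchone()
        ok = held + cents <= cap_cents
        if ok:
            conn.execute(
                "INSERT INTO ledger (scope, cents) VALUES (?, ?)",
                (scope, cents),
            )
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    conn.execute("COMMIT")
    return ok
```

The read and the write share one transaction, so a second connection attempting the same reservation waits for the lock and then sees the first one's row in its own SUM.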


Cap-correctness benchmark

The proof: fire 1000 concurrent reservations of $0.10 each against a $5.00 cap on real Postgres.

Snipz Postgres pool near capacity (10/10). Consider increasing max_size for high-throughput environments.

Snipz cap-correctness benchmark
===============================
  Concurrency: 1000
  Cap:         $5.00
  Per-req:     $0.10
  Duration:    2.755s

  Cap     [########################################] $5.00
  Spend   [########################################] $5.00

  Reservations attempted: 1000
  Expected successes:     50
  Actual successes:         50  (  5%)
  Rejected (cap):          950  ( 95%)
  Lock timeouts:             0  (  0%)
  Other errors:              0  (  0%)

  CAP HELD: $5.00 <= $5.00 (no overshoot)

Exactly 50 reservations committed, 950 raised BudgetExceededError, and final spend was $5.00. The line printed before the report is Snipz's own pool-utilization warning firing; observability is a feature (architecture.md Decision Log #21).
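The expected count follows from the arithmetic: $5.00 / $0.10 = 50 reservations fit under the cap, and the remaining 950 must be rejected. The toy simulation below checks that arithmetic with an in-memory ledger. It is an illustrative sketch only; InMemoryLedger and run_simulation are hypothetical names, and this is not the benchmark script or Snipz's engine.

```python
import asyncio
from decimal import Decimal


class InMemoryLedger:
    """Toy ledger: the cap check and the hold happen under one lock."""

    def __init__(self, cap: Decimal) -> None:
        self.cap = cap
        self.held = Decimal("0")
        self._lock = asyncio.Lock()

    async def reserve(self, amount: Decimal) -> bool:
        async with self._lock:
            if self.held + amount > self.cap:
                return False
            self.held += amount
            return True


async def run_simulation(concurrency=1000, cap=Decimal("5.00"), per_req=Decimal("0.10")):
    ledger = InMemoryLedger(cap)
    results = await asyncio.gather(
        *(ledger.reserve(per_req) for _ in range(concurrency))
    )
    successes = sum(results)
    return successes, concurrency - successes, ledger.held
```

Running asyncio.run(run_simulation()) yields 50 successes, 950 rejections, and a held total equal to the cap, matching the figures in the report.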

Reproduce in one command (needs Docker for the auto-spun Postgres container):

uv run python benchmarks/cap_correctness.py --testcontainers-postgres --concurrency 1000

No Docker? The default SQLite scenario, at 100 concurrent reservations, runs without any setup:

uv run python benchmarks/cap_correctness.py

Quickstart

import asyncio
from decimal import Decimal
from snipz import Budget, Scope

async def main():
    budget = Budget("snipz.db")                                # or "postgresql://..."
    await budget.migrate()
    await budget.set_limit(Scope("user", "u_42"), Decimal("500"))   # $5/month cap

    async with await budget.reserve(Scope("user", "u_42"), Decimal("10")) as r:
        response = await call_anthropic(...)
        await r.observe(price_from_usage(response.usage))
        # exit auto-commits at observed cost; auto-releases on exception

    await budget.close()

asyncio.run(main())

What you can rely on:

  • Atomic cap-check. Two concurrent reserves at the cap → one wins, one raises BudgetExceededError. Verified by the benchmark above.
  • Idempotent retries. Pass request_id="..." to reserve(); parallel retries with the same id converge on one ledger row.
  • Streaming-aware. r.observe(actual) updates the in-flight cost mid-stream; the cap-check formula uses MAX(actual, estimated) so concurrent requests see the true running total.
  • Late-commit safety. If your call takes longer than the TTL, the sweeper releases the row; a subsequent commit() still settles cleanly and fires on_overrun.
  • Multi-scope. Reserve against [user_scope, tenant_scope, feature_scope] in one call — all caps checked atomically, atomic rollback on any failure.
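Two of the guarantees above, idempotent retries and the MAX(actual, estimated) cap-check formula, can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the library's code: ToyLedger, Row, and CapExceeded are hypothetical names (CapExceeded stands in for BudgetExceededError), and only in-flight rows are modelled.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional


class CapExceeded(Exception):
    """Stand-in for the library's BudgetExceededError."""


@dataclass
class Row:
    estimated: Decimal
    actual: Optional[Decimal] = None  # latest observed cost, if any


class ToyLedger:
    def __init__(self, cap: Decimal) -> None:
        self.cap = cap
        self.rows: Dict[str, Row] = {}

    def running_total(self) -> Decimal:
        # Each in-flight row counts as MAX(actual, estimated), so a stream
        # that has already exceeded its estimate is visible to other requests.
        return sum(
            (max(r.estimated, r.actual or Decimal("0")) for r in self.rows.values()),
            Decimal("0"),
        )

    def reserve(self, request_id: str, estimated: Decimal) -> Row:
        # A retry with a known request_id converges on the existing row
        # instead of creating a second hold.
        existing = self.rows.get(request_id)
        if existing is not None:
            return existing
        if self.running_total() + estimated > self.cap:
            raise CapExceeded(request_id)
        row = Row(estimated)
        self.rows[request_id] = row
        return row
```

With a $5.00 cap, a request estimated at $2.00 that has streamed $3.50 so far counts as $3.50, so a new $2.00 reservation is rejected even though the sum of estimates alone would have fit.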

Install

pip install snipz                  # core: SQLite, async
pip install snipz[postgres]        # + asyncpg for Postgres
pip install snipz[openai]          # + tiktoken for exact OpenAI token counts

What's in the box

| What | Where | What it's for |
| --- | --- | --- |
| Budget, Reservation, Scope | from snipz import ... | The async engine: reserve / commit / release / observe |
| Sync wrapper (experimental) | from snipz.sync import Budget | For sync codebases: background event loop, raises if called from inside an async loop |
| Pricing | from snipz import Pricing | Vendored price book (Pricing.default()) + DB overrides (Pricing.with_backend(...)) |
| Estimators | from snipz.estimators import AnthropicEstimator, OpenAIEstimator, FallbackEstimator | Pre-flight token counters; OpenAI is exact via tiktoken |
| @budget.guard | budget.guard(scope=..., estimate=..., actual=...) | Decorator that wraps an async LLM call with the full reserve/observe/commit/release lifecycle |
| Hooks | budget.on_reserved, on_committed, on_released, on_overrun | Plug-in points for metrics, alerting, audit logs |
| Sweeper | snipz sweep [--interval N] CLI or snipz.sweep.sweep_loop() | Background job that releases expired reservations |
| snipz update-pricing | CLI | Refresh the vendored pricing.toml from LiteLLM upstream |
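As a rough picture of what a sweeper pass does, the sketch below releases holds whose TTL has elapsed. It is a conceptual model only and does not reflect the internals of snipz sweep or snipz.sweep.sweep_loop(); Hold, its fields, and sweep_once are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Hold:
    hold_id: str
    expires_at: float          # absolute deadline, in seconds
    state: str = "reserved"    # "reserved", "committed", or "released"


def sweep_once(holds: Iterable[Hold], now: float) -> List[str]:
    """Release every still-reserved hold whose deadline has passed."""
    released = []
    for hold in holds:
        if hold.state == "reserved" and hold.expires_at <= now:
            hold.state = "released"
            released.append(hold.hold_id)
    return released
```

Committed holds are left alone, and a second pass at the same time releases nothing further, so running the sweep on an interval is safe to repeat.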

All async/sync surfaces share the same engine and correctness guarantees.


Deep dives


Development

uv sync                          # install all deps + .venv
uv run pytest                    # 140 tests against SQLite
uv run pytest --postgres         # + 15 Postgres integration tests (needs Docker)
uv run ruff check src/ tests/ benchmarks/
uv run mypy src/snipz benchmarks/

License

MIT
