Pure, fail-closed cost-gating for expensive remote/external work: a hard $ budget cap, backend routing (local → LAN → cloud → SDK), and opt-in audited API egress.

Project description

dispatch-kit

A tiny, pure, dependency-free library for gating expensive remote/external work — the same machinery for a cloud GPU job (a Cloud Run L4 reached over a tailnet) and a paid LLM/SDK API call (Gemini, Claude, Rowan). It answers three questions, fail-closed:

  • Can we afford it? — a hard, reserve-on-approval budget cap (per-run + per-month).
  • Where should it run? — a pure router: LOCAL → LAN → CLOUD → SDK, SDK opt-in only.
  • Is the external call safe? — opt-in, audited API egress with reference-only secrets.

It owns the policy (afford / route / approve / egress); your app keeps its job entity, persistence, and executor. Transport auth (who may talk) is a separate concern; pair this with tailnet-guard. Stdlib only, and every check is fail-closed: the default budget of 0 means paid work is off, SDK is never auto-selected, and a call with a missing key is refused.
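The fail-closed egress rule (a key held by reference, resolved at call time, https only, refusal when the key is missing) could be sketched roughly as follows. This is a standard-library illustration of the idea, not dispatch-kit's implementation; `EnvSecretRef`, `Endpoint`, and `MissingSecret` are hypothetical names chosen for the example.

```python
import os
from dataclasses import dataclass


class MissingSecret(RuntimeError):
    """Raised instead of making an unauthenticated call (hypothetical name)."""


@dataclass(frozen=True)
class EnvSecretRef:
    """Holds only the NAME of an environment variable, never its value."""
    env_var: str

    def resolve(self) -> str:
        value = os.environ.get(self.env_var, "")
        if not value:
            # Fail closed: an unset or empty key is a refusal, not a fallback.
            raise MissingSecret(f"secret {self.env_var!r} is not set")
        return value


@dataclass(frozen=True)
class Endpoint:
    name: str
    base_url: str
    secret: EnvSecretRef

    def __post_init__(self) -> None:
        # Reject plain http at construction time.
        if not self.base_url.startswith("https://"):
            raise ValueError(f"{self.name}: only https endpoints are allowed")

    def bearer(self) -> str:
        # Resolved at call time, so the value is never stored on the object
        # and cannot leak through its repr.
        return f"Bearer {self.secret.resolve()}"
```

Because the dataclass stores only the variable name, printing or logging an `Endpoint` shows which secret it references but never the secret itself.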

Use

from decimal import Decimal
from dispatch_kit import (
    BudgetCap, BudgetState, CostRates, admits, estimate_cost,   # the hard $ cap
    select_backend, BackendKind, ToolRequirements,              # the where
    SecretRef, ExternalEndpoint, log_egress,                    # opt-in API egress
    Approval,                                                   # the approval audit fact
)

# 1. Reserve-on-approval: refuse a job that would push past the cap (both windows).
#    run_state / month_state are the BudgetState values your app keeps for each window.
class OverBudget(Exception):
    """App-side error, defined here so the snippet is self-contained."""

rates = CostRates(gpu_usd_per_s=Decimal("0.0008"), vcpu_usd_per_s=Decimal("0.00001"),
                  gib_usd_per_s=Decimal("0.000002"), idle_tail_s=Decimal(600))
cost = estimate_cost(rates, max_runtime_s=3600, vcpus=8, memory_gib=32)   # an UPPER bound
decision = admits(cost, run_state, month_state, BudgetCap(run_usd=Decimal(50), month_usd=Decimal(500)))
if not decision.admitted:
    raise OverBudget(decision.reason)        # default cap is $0: paid work is off until you set one

# 2. Pick where it runs: LOCAL first, SDK only if explicitly allowed.
#    my_backends is your app's own collection of routable backends.
backend = select_backend(my_backends, ToolRequirements(tool_id="cofold", min_vram_gb=24.0))

# 3. An LLM/SDK key is a REFERENCE (env var name), resolved at call time, never logged.
gemini = ExternalEndpoint("gemini", "https://generativelanguage.googleapis.com",
                          SecretRef("GEMINI_API_KEY"))
log_egress(gemini, detail="summarize")       # audit that data left the boundary
headers = {"Authorization": gemini.bearer()} # raises if the key is unset (never an unauth call)
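To make the "upper bound" in step 1 concrete, here is the arithmetic under one plausible reading of the rates: bill the full runtime limit plus the idle tail at the combined per-second rate. The formula is an assumption made for illustration; `estimate_cost` may compute its bound differently, and `upper_bound_cost` is a hypothetical helper, not part of the library.

```python
from decimal import Decimal


def upper_bound_cost(gpu_usd_per_s: Decimal, vcpu_usd_per_s: Decimal,
                     gib_usd_per_s: Decimal, idle_tail_s: Decimal,
                     max_runtime_s: int, vcpus: int, memory_gib: int) -> Decimal:
    """Assumed formula: (runtime limit + idle tail) x combined per-second rate."""
    per_second = gpu_usd_per_s + vcpus * vcpu_usd_per_s + memory_gib * gib_usd_per_s
    billed_seconds = Decimal(max_runtime_s) + idle_tail_s
    return per_second * billed_seconds


# Same numbers as the usage snippet: 0.000944 USD/s over 4200 s.
cost = upper_bound_cost(Decimal("0.0008"), Decimal("0.00001"), Decimal("0.000002"),
                        Decimal(600), max_runtime_s=3600, vcpus=8, memory_gib=32)
print(cost)
```

With these inputs the bound comes to roughly $3.96, comfortably under the $50 per-run cap in the snippet. Using `Decimal` throughout keeps the sum exact, which matters when many small reservations accumulate against a hard cap.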

What's in the box

| Module | Purpose |
| --- | --- |
| budget | BudgetCap / BudgetState / CostRates / admits / estimate_cost: the hard, Decimal-exact, reserve-on-approval cap across a run + month window |
| estimate | CostEstimate / HostCapabilities / vram_fits: the one "no GPU ⇒ a GPU job is infeasible" rule, shared by the gate and the router |
| routing | BackendKind / BackendCapabilities / ToolRequirements / select_backend (generic over a Routable): the pure LOCAL→LAN→CLOUD→SDK policy; SDK opt-in |
| egress | SecretRef / ExternalEndpoint / log_egress: reference-only API keys, https-only, fail-closed on a missing key, audited egress (SDKs and LLM APIs) |
| approval | Approval / ApprovalOutcome: the who/when/why audit fact for a gated job |
| dispatch | JobStore / Transport / WorkerExecutor protocols + is_lease_stale / should_give_up / Lease: the run-it-once-recoverably contract (atomic claim, stale-reject, lease recovery); push vs pull is only the Transport adapter |
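The routing policy in the table (prefer LOCAL, then LAN, then CLOUD, with SDK only on explicit opt-in, and a VRAM fit check) can be illustrated with a small pure function. This is a sketch of the idea only: `Kind`, `Backend`, and `pick_backend` are hypothetical names, and skipping the VRAM check for SDK backends is an assumption, not documented behavior of `select_backend`.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Sequence


class Kind(IntEnum):
    # Lower value = preferred. Order mirrors LOCAL -> LAN -> CLOUD -> SDK.
    LOCAL = 0
    LAN = 1
    CLOUD = 2
    SDK = 3


@dataclass(frozen=True)
class Backend:
    name: str
    kind: Kind
    vram_gb: float  # 0.0 means no GPU


def pick_backend(backends: Sequence[Backend], min_vram_gb: float,
                 allow_sdk: bool = False) -> Optional[Backend]:
    """Return the most-preferred backend that fits, or None to refuse."""
    candidates = [
        b for b in backends
        # SDK is opt-in only; it is never chosen unless allow_sdk is set.
        if (b.kind is not Kind.SDK or allow_sdk)
        # Assumption: hosted SDK backends are exempt from the VRAM check.
        and (b.kind is Kind.SDK or b.vram_gb >= min_vram_gb)
    ]
    return min(candidates, key=lambda b: b.kind, default=None)
```

Returning `None` when nothing fits keeps the function fail-closed: the caller has to handle the refusal explicitly rather than falling through to a paid backend.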

Notes

  • The budget cap lives in your dispatch service, never the UI — an agent hitting the API directly is still gated. Default cap $0; if spend can't be computed, refuse.
  • Reserve on approval, reconcile on completion — approving reserves the estimate immediately so a burst counts against the cap; the worker's true runtime reconciles reserved → spent.
  • SDK / external egress is the one deliberate exception — never the default (allow_sdk / opt-in), always logged, the key sourced from a secret at call time and never written to a log.
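The reserve-then-reconcile behaviour described in these notes can be sketched as a small ledger. It is a minimal illustration under stated assumptions, not the library's `admits` logic: `Window`, `approve`, and `reconcile` are hypothetical names, and the all-or-nothing reservation across windows is one reasonable reading of "both windows".

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Window:
    """Spend tracked for one window (a run or a month); hypothetical shape."""
    cap_usd: Decimal = Decimal(0)       # default cap $0: paid work is off
    reserved_usd: Decimal = Decimal(0)  # approved but not yet finished
    spent_usd: Decimal = Decimal(0)     # reconciled true cost

    def headroom(self) -> Decimal:
        return self.cap_usd - self.reserved_usd - self.spent_usd


def approve(estimate_usd: Decimal, windows: list[Window]) -> bool:
    """Reserve the estimate in every window, or reserve nothing at all."""
    if estimate_usd < 0 or any(estimate_usd > w.headroom() for w in windows):
        return False
    for w in windows:
        w.reserved_usd += estimate_usd
    return True


def reconcile(estimate_usd: Decimal, actual_usd: Decimal, windows: list[Window]) -> None:
    """On completion, release the reservation and record the true cost."""
    for w in windows:
        w.reserved_usd -= estimate_usd
        w.spent_usd += actual_usd
```

Because `approve` counts reservations against headroom immediately, a burst of approvals cannot collectively exceed the cap even before any job has reported its real cost.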

Download files


Source Distribution

dispatch_kit-0.4.0.tar.gz (39.5 kB)

Uploaded Source

Built Distribution


dispatch_kit-0.4.0-py3-none-any.whl (33.2 kB)

Uploaded Python 3

File details

Details for the file dispatch_kit-0.4.0.tar.gz.

File metadata

  • Download URL: dispatch_kit-0.4.0.tar.gz
  • Size: 39.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dispatch_kit-0.4.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 520f0e0e51612edaeb4e09e339878dee0a878d2dad5c3e270d682f2693ca6233 |
| MD5 | 2dd596c357dc52639f1d0b391c500244 |
| BLAKE2b-256 | 2d82bb43c603b251ec8bbfcde667e5da9c82ff937dce871951969cff2df4285a |


Provenance

The following attestation bundles were made for dispatch_kit-0.4.0.tar.gz:

Publisher: release.yml on falahat/dispatch-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dispatch_kit-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: dispatch_kit-0.4.0-py3-none-any.whl
  • Size: 33.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dispatch_kit-0.4.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 1dcdf6c8c465c504250d856465c706f60b2856196ca5409adcd084c80c22337a |
| MD5 | 900204b4a03d096bdc8817f8d768f876 |
| BLAKE2b-256 | 79243c8b00a64d915bbf9765220c2b7253378b40d64bddde941539b6772156bf |


Provenance

The following attestation bundles were made for dispatch_kit-0.4.0-py3-none-any.whl:

Publisher: release.yml on falahat/dispatch-kit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
