Pure, fail-closed cost-gating for expensive remote/external work: a hard dollar budget cap, backend routing (local → LAN → cloud → SDK), and opt-in audited API egress.

dispatch-kit

A tiny, pure, dependency-free library for gating expensive remote/external work. The same machinery covers a cloud GPU job (a Cloud Run L4 reached over a tailnet) and a paid LLM/SDK API call (Gemini, Claude, Rowan). It answers three questions, fail-closed:

  • Can we afford it? — a hard, reserve-on-approval budget cap (per-run + per-month).
  • Where should it run? — a pure router: LOCAL → LAN → CLOUD → SDK, SDK opt-in only.
  • Is the external call safe? — opt-in, audited API egress with reference-only secrets.

It owns the policy (afford / route / approve / egress); your app keeps its job entity, persistence, and executor. The transport auth (who may talk) is a separate concern — pair this with tailnet-guard. Stdlib only; every check is fail-closed (default budget 0 = paid work off; SDK never auto-selected; a missing key refuses).
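As a rough illustration of the routing question, the sketch below ranks candidate backends LOCAL → LAN → CLOUD → SDK and skips SDK unless the caller opts in. It is not dispatch-kit's implementation: the `Kind` enum, the `Backend` dataclass, the `pick` function, and the shape of the `allow_sdk` parameter are assumptions made for this example. Only the ordering, the opt-in rule, and the "no GPU means a GPU job is infeasible" rule come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Kind(IntEnum):
    # Lower value = more preferred; mirrors LOCAL -> LAN -> CLOUD -> SDK.
    LOCAL = 0
    LAN = 1
    CLOUD = 2
    SDK = 3


@dataclass(frozen=True)
class Backend:  # hypothetical stand-in for a routable backend
    name: str
    kind: Kind
    vram_gb: float  # 0.0 means "no GPU"


def pick(backends: Iterable[Backend], min_vram_gb: float,
         allow_sdk: bool = False) -> Optional[Backend]:
    """Return the most-preferred feasible backend, or None (fail-closed)."""
    feasible = [
        b for b in backends
        if (allow_sdk or b.kind is not Kind.SDK)  # SDK is never auto-selected
        and b.vram_gb >= min_vram_gb              # missing or small GPU => infeasible
    ]
    # Pure function: no I/O, no state; the same inputs give the same answer.
    return min(feasible, key=lambda b: b.kind, default=None)


backends = [
    Backend("laptop", Kind.LOCAL, vram_gb=0.0),
    Backend("cloud-l4", Kind.CLOUD, vram_gb=24.0),
    Backend("vendor-sdk", Kind.SDK, vram_gb=80.0),
]
chosen = pick(backends, min_vram_gb=24.0)  # the laptop has no GPU, so CLOUD wins
```

Returning `None` rather than falling back to the SDK backend keeps the router fail-closed: the caller has to pass `allow_sdk=True` deliberately before external egress is even considered.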

Use

from decimal import Decimal
from dispatch_kit import (
    BudgetCap, BudgetState, CostRates, admits, estimate_cost,   # the hard $ cap
    select_backend, BackendKind, ToolRequirements,              # the where
    SecretRef, ExternalEndpoint, log_egress,                    # opt-in API egress
    Approval,                                                   # the approval audit fact
)

# 1. Reserve-on-approval: refuse a job that would push past the cap (both windows).
# run_state and month_state are BudgetState values supplied by your app's own persistence;
# OverBudget is your app's own exception (it is not in the import list above).
rates = CostRates(gpu_usd_per_s=Decimal("0.0008"), vcpu_usd_per_s=Decimal("0.00001"),
                  gib_usd_per_s=Decimal("0.000002"), idle_tail_s=Decimal(600))
cost = estimate_cost(rates, max_runtime_s=3600, vcpus=8, memory_gib=32)   # an UPPER bound
decision = admits(cost, run_state, month_state, BudgetCap(run_usd=Decimal(50), month_usd=Decimal(500)))
if not decision.admitted:
    raise OverBudget(decision.reason)        # default cap is $0: paid work is off until you set one

# 2. Pick where it runs: LOCAL first, SDK only if explicitly allowed.
# my_backends is your app's own collection of candidate backends.
backend = select_backend(my_backends, ToolRequirements(tool_id="cofold", min_vram_gb=24.0))

# 3. An LLM/SDK key is a REFERENCE (env var name), resolved at call time, never logged.
gemini = ExternalEndpoint("gemini", "https://generativelanguage.googleapis.com",
                          SecretRef("GEMINI_API_KEY"))
log_egress(gemini, detail="summarize")       # audit that data left the boundary
headers = {"Authorization": gemini.bearer()} # raises if the key is unset (never an unauth call)
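To make step 1 concrete, here is a hedged sketch of how an upper-bound estimate and a two-window admission check might be computed with the rates shown above. The formula (bill the maximum runtime plus the idle tail at the combined per-second rate) is an assumption made for illustration, as are the names `upper_bound_usd` and `admit`; dispatch-kit's real `estimate_cost` and `admits` may compute things differently.

```python
from decimal import Decimal
from typing import Optional, Tuple


def upper_bound_usd(gpu_usd_per_s: Decimal, vcpu_usd_per_s: Decimal,
                    gib_usd_per_s: Decimal, idle_tail_s: Decimal,
                    max_runtime_s: int, vcpus: int, memory_gib: int) -> Decimal:
    """Assumed formula: (max runtime + idle tail) x combined per-second rate."""
    per_second = gpu_usd_per_s + vcpus * vcpu_usd_per_s + memory_gib * gib_usd_per_s
    return (Decimal(max_runtime_s) + idle_tail_s) * per_second


def admit(cost: Optional[Decimal], run_committed: Decimal, month_committed: Decimal,
          run_cap: Decimal = Decimal(0),
          month_cap: Decimal = Decimal(0)) -> Tuple[bool, str]:
    """Admit only if the cost fits under BOTH caps; the default caps of 0 refuse paid work."""
    if cost is None or cost < 0:
        return False, "cost could not be computed"   # fail-closed
    if run_committed + cost > run_cap:
        return False, "would exceed the per-run cap"
    if month_committed + cost > month_cap:
        return False, "would exceed the per-month cap"
    return True, "ok"


cost = upper_bound_usd(Decimal("0.0008"), Decimal("0.00001"), Decimal("0.000002"),
                       idle_tail_s=Decimal(600), max_runtime_s=3600,
                       vcpus=8, memory_gib=32)
# 4200 s x 0.000944 USD/s, so cost == Decimal("3.9648")

refused = admit(cost, Decimal(0), Decimal(0))                       # default caps are $0
allowed = admit(cost, Decimal(0), Decimal(0), Decimal(50), Decimal(500))
```

Using `Decimal` throughout avoids binary floating-point drift, so a comparison against the cap is exact rather than approximately right.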

What's in the box

| Module | Purpose |
|---|---|
| budget | BudgetCap / BudgetState / CostRates / admits / estimate_cost: the hard, Decimal-exact, reserve-on-approval cap across a run + month window |
| estimate | CostEstimate / HostCapabilities / vram_fits: the one "no GPU ⇒ a GPU job is infeasible" rule, shared by the gate and the router |
| routing | BackendKind / BackendCapabilities / ToolRequirements / select_backend (generic over a Routable): the pure LOCAL→LAN→CLOUD→SDK policy; SDK opt-in |
| egress | SecretRef / ExternalEndpoint / log_egress: reference-only API keys, https-only, fail-closed on a missing key, audited egress (SDKs and LLM APIs) |
| approval | Approval / ApprovalOutcome: the who/when/why audit fact for a gated job |
| dispatch | JobStore / Transport / WorkerExecutor protocols + is_lease_stale / should_give_up / Lease: the run-it-once-recoverably contract (atomic claim, stale-reject, lease recovery); push vs pull is only the Transport adapter |
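The egress row can be illustrated with a small stdlib-only sketch: a key held as an environment-variable name and resolved only at call time, an https-only endpoint check, and an audit line that never contains the secret. The names below (`KeyRef`, `Endpoint`, `audit_egress`) are hypothetical and deliberately differ from dispatch-kit's `SecretRef`, `ExternalEndpoint`, and `log_egress`, whose actual signatures are not shown here.

```python
import logging
import os
from dataclasses import dataclass
from urllib.parse import urlsplit

log = logging.getLogger("egress-sketch")


@dataclass(frozen=True)
class KeyRef:
    env_var: str  # the NAME of the variable, never the secret itself

    def resolve(self) -> str:
        value = os.environ.get(self.env_var, "")
        if not value:
            # Fail-closed: a missing key refuses instead of sending an unauthenticated call.
            raise RuntimeError(f"secret {self.env_var} is unset; refusing the call")
        return value


@dataclass(frozen=True)
class Endpoint:
    name: str
    url: str
    key: KeyRef

    def __post_init__(self) -> None:
        if urlsplit(self.url).scheme != "https":
            raise ValueError(f"{self.name}: only https endpoints are allowed")

    def bearer(self) -> str:
        # Resolved at call time, so the secret never lives on the object.
        return f"Bearer {self.key.resolve()}"


def audit_egress(endpoint: Endpoint, detail: str) -> str:
    """Record that data left the boundary; logs the key's name, never its value."""
    line = (f"egress endpoint={endpoint.name} url={endpoint.url} "
            f"key_ref={endpoint.key.env_var} detail={detail}")
    log.info(line)
    return line


# DEMO_API_KEY and api.example.com are placeholders for this sketch.
demo = Endpoint("demo", "https://api.example.com", KeyRef("DEMO_API_KEY"))
```

Because the dataclass stores only the variable name, printing or logging `demo` cannot leak the key; the secret exists only in the return value of `bearer()`.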

Notes

  • The budget cap lives in your dispatch service, never the UI — an agent hitting the API directly is still gated. Default cap $0; if spend can't be computed, refuse.
  • Reserve on approval, reconcile on completion — approving reserves the estimate immediately so a burst counts against the cap; the worker's true runtime reconciles reserved → spent.
  • SDK / external egress is the one deliberate exception — never the default (allow_sdk / opt-in), always logged, the key sourced from a secret at call time and never written to a log.
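The reserve-then-reconcile note can be sketched as a small ledger: approval reserves the estimate right away so a burst of approvals counts against the cap, and completion swaps the reservation for the true cost. The `Ledger` class and its method names are hypothetical; this is a minimal single-window sketch under those assumptions, not dispatch-kit's `BudgetState`.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict


@dataclass
class Ledger:
    cap_usd: Decimal = Decimal(0)      # default $0: paid work is off
    spent_usd: Decimal = Decimal(0)
    reserved: Dict[str, Decimal] = field(default_factory=dict)

    def committed(self) -> Decimal:
        """Spent plus everything still reserved by approved, unfinished jobs."""
        return self.spent_usd + sum(self.reserved.values(), Decimal(0))

    def approve(self, job_id: str, estimate_usd: Decimal) -> bool:
        """Reserve the estimate now, or refuse if it would push past the cap."""
        if job_id in self.reserved:
            return False
        if self.committed() + estimate_usd > self.cap_usd:
            return False
        self.reserved[job_id] = estimate_usd
        return True

    def complete(self, job_id: str, actual_usd: Decimal) -> None:
        """Reconcile: drop the reservation and record the true cost."""
        del self.reserved[job_id]      # KeyError if the job was never approved
        self.spent_usd += actual_usd


ledger = Ledger(cap_usd=Decimal(10))
first = ledger.approve("a", Decimal(4))     # admitted, 4 committed
second = ledger.approve("b", Decimal(4))    # admitted, 8 committed
third = ledger.approve("c", Decimal(4))     # refused, 12 would exceed the cap of 10
ledger.complete("a", Decimal("1.50"))       # true cost replaces the estimate
fourth = ledger.approve("c", Decimal(4))    # admitted now, 9.50 committed
```

Without the reservation step, three jobs approved in quick succession would each see an empty ledger and all be admitted; reserving at approval time is what makes the cap hold under a burst.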

Release files

dispatch_kit-0.3.0.tar.gz (source distribution, 28.8 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing: yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12
  • SHA256: f07646c5a62c480c7d8774fa6a7ffbdb3d4df0c68c5786b7ebb2a5681590f333
  • MD5: 2d48a6f55d706b5e155a7c392068a883
  • BLAKE2b-256: 71dfd1f559a1a4c7366c3b109fd509039475d8aac99a97066ddcdf15d98a8f34
  • Provenance: attestation bundle published by release.yml on falahat/dispatch-kit

dispatch_kit-0.3.0-py3-none-any.whl (built distribution, 24.0 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing: yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12
  • SHA256: 693e31a9efc250fd533f527d83076fdb18f87b12d8e7ebf35310d18f3925c158
  • MD5: 433dc0b84a924f204ba01b136bcbe2e5
  • BLAKE2b-256: 15783d0a53baf8a03b43fb5add96245deb600bcc4bc62dc61a134ce0a0a36943
  • Provenance: attestation bundle published by release.yml on falahat/dispatch-kit