Pure, fail-closed cost-gating for expensive remote/external work: a hard $ budget cap, backend routing (local → LAN → cloud → SDK), and opt-in audited API egress.

dispatch-kit

A tiny, pure, dependency-free library for gating expensive remote/external work — the same machinery for a cloud GPU job (a Cloud Run L4 reached over a tailnet) and a paid LLM/SDK API call (Gemini, Claude, Rowan). It answers three questions, fail-closed:

  • Can we afford it? — a hard, reserve-on-approval budget cap (per-run + per-month).
  • Where should it run? — a pure router: LOCAL → LAN → CLOUD → SDK, SDK opt-in only.
  • Is the external call safe? — opt-in, audited API egress with reference-only secrets.

It owns the policy (afford / route / approve / egress); your app keeps its job entity, persistence, and executor. The transport auth (who may talk) is a separate concern — pair this with tailnet-guard. Stdlib only; every check is fail-closed (default budget 0 = paid work off; SDK never auto-selected; a missing key refuses).
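
To make the routing rule concrete, here is a minimal standalone sketch of a LOCAL → LAN → CLOUD → SDK preference with SDK opt-in. It does not use dispatch-kit and is not its implementation: `Kind`, `Backend`, and `pick_backend` are hypothetical names, and exempting SDK backends from the VRAM check is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, Optional


class Kind(IntEnum):
    # Lower value = tried first; mirrors LOCAL -> LAN -> CLOUD -> SDK.
    LOCAL = 0
    LAN = 1
    CLOUD = 2
    SDK = 3


@dataclass(frozen=True)
class Backend:
    name: str
    kind: Kind
    vram_gb: float  # 0.0 means "no GPU"


def pick_backend(backends: Iterable[Backend], min_vram_gb: float,
                 allow_sdk: bool = False) -> Optional[Backend]:
    """Return the most-preferred backend that fits, or None (fail-closed)."""
    candidates = [
        b for b in backends
        # SDK backends are considered only when the caller opts in.
        if (allow_sdk or b.kind is not Kind.SDK)
        # Assumption: a hosted SDK has no local VRAM constraint.
        and (b.kind is Kind.SDK or b.vram_gb >= min_vram_gb)
    ]
    # No candidate means no backend: the caller must refuse the job.
    return min(candidates, key=lambda b: b.kind, default=None)
```

Because the function is pure (no I/O, no clock, no globals), the same inputs always yield the same choice, which keeps the policy easy to test.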

Use

from decimal import Decimal
from dispatch_kit import (
    BudgetCap, BudgetState, CostRates, admits, estimate_cost,   # the hard $ cap
    select_backend, BackendKind, ToolRequirements,              # the where
    SecretRef, ExternalEndpoint, log_egress,                    # opt-in API egress
    Approval,                                                   # the approval audit fact
)

# 1. Reserve-on-approval: refuse a job that would push past the cap (both windows).
rates = CostRates(gpu_usd_per_s=Decimal("0.0008"), vcpu_usd_per_s=Decimal("0.00001"),
                  gib_usd_per_s=Decimal("0.000002"), idle_tail_s=Decimal(600))
cost = estimate_cost(rates, max_runtime_s=3600, vcpus=8, memory_gib=32)   # an UPPER bound
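# run_state and month_state (BudgetState values) and the OverBudget exception are not defined in this snippet; provide them from your app.
decision = admits(cost, run_state, month_state, BudgetCap(run_usd=Decimal(50), month_usd=Decimal(500)))
if not decision.admitted:
    raise OverBudget(decision.reason)        # default cap is $0: paid work is off until you set one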

# 2. Pick where it runs: LOCAL first, SDK only if explicitly allowed.
# my_backends is your app's collection of candidate backends; it is not defined in this snippet.
backend = select_backend(my_backends, ToolRequirements(tool_id="cofold", min_vram_gb=24.0))

# 3. An LLM/SDK key is a REFERENCE (env var name), resolved at call time, never logged.
gemini = ExternalEndpoint("gemini", "https://generativelanguage.googleapis.com",
                          SecretRef("GEMINI_API_KEY"))
log_egress(gemini, detail="summarize")       # audit that data left the boundary
headers = {"Authorization": gemini.bearer()} # raises if the key is unset (never an unauth call)
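
The reference-only secret idea in step 3 can be sketched without the library. The block below is an illustration under assumptions, not dispatch-kit's code: `EnvSecret`, `Endpoint`, and `MissingSecret` are hypothetical names, and the `Bearer` prefix is only one possible header format.

```python
import os
from dataclasses import dataclass


class MissingSecret(RuntimeError):
    """Raised instead of sending an unauthenticated request."""


@dataclass(frozen=True)
class EnvSecret:
    env_var: str  # the NAME of the variable, never its value

    def resolve(self) -> str:
        # Read the value at call time so it is never stored on the object.
        value = os.environ.get(self.env_var, "")
        if not value:
            raise MissingSecret(f"{self.env_var} is not set")
        return value


@dataclass(frozen=True)
class Endpoint:
    name: str
    base_url: str
    secret: EnvSecret

    def __post_init__(self) -> None:
        # Fail closed on plaintext transports.
        if not self.base_url.startswith("https://"):
            raise ValueError(f"{self.name}: only https endpoints are allowed")

    def bearer(self) -> str:
        return f"Bearer {self.secret.resolve()}"
```

Since the dataclass fields hold only the variable name, `repr()` of an endpoint (and anything that logs it) can show which key is referenced without exposing the key itself.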

What's in the box

| Module | Purpose |
| --- | --- |
| budget | BudgetCap / BudgetState / CostRates / admits / estimate_cost: the hard, Decimal-exact, reserve-on-approval cap across a run + month window |
| estimate | CostEstimate / HostCapabilities / vram_fits: the one "no GPU ⇒ a GPU job is infeasible" rule, shared by the gate and the router |
| routing | BackendKind / BackendCapabilities / ToolRequirements / select_backend (generic over a Routable): the pure LOCAL→LAN→CLOUD→SDK policy; SDK opt-in |
| egress | SecretRef / ExternalEndpoint / log_egress: reference-only API keys, https-only, fail-closed on a missing key, audited egress (SDKs and LLM APIs) |
| approval | Approval / ApprovalOutcome: the who/when/why audit fact for a gated job |
| dispatch | JobStore / Transport / WorkerExecutor protocols + is_lease_stale / should_give_up / Lease: the run-it-once-recoverably contract (atomic claim, stale-reject, lease recovery); push vs pull is only the Transport adapter |
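
As a rough illustration of the lease-recovery idea in the dispatch row, the sketch below shows one common way to decide that a lease is stale and when to stop retrying. The names, fields, and signatures (`LeaseSketch`, `lease_is_stale`, `give_up`) are hypothetical and may differ from the library's `Lease`, `is_lease_stale`, and `should_give_up`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LeaseSketch:
    job_id: str
    worker_id: str
    heartbeat_at: datetime  # last time the worker proved it was alive
    attempts: int           # how many times the job has been claimed


def lease_is_stale(lease: LeaseSketch, now: datetime, ttl: timedelta) -> bool:
    """A lease is stale once the worker has been silent for longer than ttl."""
    return now - lease.heartbeat_at > ttl


def give_up(lease: LeaseSketch, max_attempts: int) -> bool:
    """Stop re-queuing a job that has already been claimed max_attempts times."""
    return lease.attempts >= max_attempts


# Usage: the clock is passed in, so the checks stay pure and testable.
t0 = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
lease = LeaseSketch("job-1", "worker-a", heartbeat_at=t0, attempts=2)
stale = lease_is_stale(lease, now=t0 + timedelta(minutes=10), ttl=timedelta(minutes=5))
```

A stale lease can be reclaimed by another worker, and a late result from the original worker can be rejected by comparing its `worker_id` with the current lease holder.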

Notes

  • The budget cap lives in your dispatch service, never the UI — an agent hitting the API directly is still gated. Default cap $0; if spend can't be computed, refuse.
  • Reserve on approval, reconcile on completion — approving reserves the estimate immediately so a burst counts against the cap; the worker's true runtime reconciles reserved → spent.
  • SDK / external egress is the one deliberate exception — never the default (allow_sdk / opt-in), always logged, the key sourced from a secret at call time and never written to a log.
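
The reserve-then-reconcile bookkeeping described in these notes can be sketched as a small pure ledger over two windows. This is an illustration under assumptions, not the library's `admits` or `BudgetState`: `Window`, `reserve`, and `reconcile` are hypothetical names.

```python
from dataclasses import dataclass, replace
from decimal import Decimal
from typing import Optional, Tuple


@dataclass(frozen=True)
class Window:
    cap_usd: Decimal = Decimal(0)       # default 0: paid work is off
    reserved_usd: Decimal = Decimal(0)  # approved but not yet finished
    spent_usd: Decimal = Decimal(0)     # reconciled actual cost

    def headroom(self) -> Decimal:
        return self.cap_usd - self.reserved_usd - self.spent_usd


def reserve(estimate: Optional[Decimal], run: Window,
            month: Window) -> Optional[Tuple[Window, Window]]:
    """Return updated windows, or None to refuse (fail-closed)."""
    # An unknown or nonsensical estimate is refused, never guessed.
    if estimate is None or not estimate.is_finite() or estimate < 0:
        return None
    # Both windows must have room for the full estimate.
    if estimate > run.headroom() or estimate > month.headroom():
        return None
    return (replace(run, reserved_usd=run.reserved_usd + estimate),
            replace(month, reserved_usd=month.reserved_usd + estimate))


def reconcile(estimate: Decimal, actual: Decimal, run: Window,
              month: Window) -> Tuple[Window, Window]:
    """Release the reservation and record what the job really cost."""
    def settle(w: Window) -> Window:
        return replace(w, reserved_usd=w.reserved_usd - estimate,
                       spent_usd=w.spent_usd + actual)
    return settle(run), settle(month)
```

Because the reservation counts against headroom as soon as a job is approved, a burst of approvals cannot collectively exceed the cap even before any job has finished.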
