Pure, fail-closed cost-gating for expensive remote/external work: a hard $ budget cap, backend routing (LOCAL → LAN → CLOUD → SDK), and opt-in audited API egress.

Project description

dispatch-kit

A tiny, pure, dependency-free library for gating expensive remote/external work — the same machinery for a cloud GPU job (a Cloud Run L4 reached over a tailnet) and a paid LLM/SDK API call (Gemini, Claude, Rowan). It answers three questions, fail-closed:

  • Can we afford it? — a hard, reserve-on-approval budget cap (per-run + per-month).
  • Where should it run? — a pure router: LOCAL → LAN → CLOUD → SDK, SDK opt-in only.
  • Is the external call safe? — opt-in, audited API egress with reference-only secrets.

It owns the policy (afford / route / approve / egress); your app keeps its job entity, persistence, and executor. Transport auth (who may talk) is a separate concern; pair this library with tailnet-guard for that. It uses the standard library only, and every check is fail-closed: the default budget is 0, so paid work is off; the SDK backend is never auto-selected; and a missing key causes the call to be refused.
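
The routing rule described above (LOCAL → LAN → CLOUD → SDK, with SDK strictly opt-in) can be sketched in a few lines of standard-library Python. This is an illustration of the policy as stated, not dispatch-kit's implementation: the `Kind` enum, the `Backend` dataclass, and `pick_backend` with its `allow_sdk` flag are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Sequence


class Kind(IntEnum):
    """Hypothetical backend kinds; a lower value means more preferred."""
    LOCAL = 0
    LAN = 1
    CLOUD = 2
    SDK = 3


@dataclass(frozen=True)
class Backend:
    """Hypothetical backend record (not dispatch-kit's BackendCapabilities)."""
    name: str
    kind: Kind
    vram_gb: float  # 0.0 means the host has no GPU


def pick_backend(backends: Sequence[Backend], min_vram_gb: float,
                 allow_sdk: bool = False) -> Optional[Backend]:
    """Return the most-preferred backend that fits, or None (fail-closed)."""
    candidates = [
        b for b in backends
        if b.vram_gb >= min_vram_gb and (allow_sdk or b.kind != Kind.SDK)
    ]
    # Refuse explicitly when nothing fits rather than falling back to SDK.
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.kind)


backends = [
    Backend("vendor-api", Kind.SDK, vram_gb=80.0),
    Backend("cloud-l4", Kind.CLOUD, vram_gb=24.0),
    Backend("laptop", Kind.LOCAL, vram_gb=0.0),
]
chosen = pick_backend(backends, min_vram_gb=24.0)
```

In this sketch the laptop is skipped because it has no GPU, the cloud backend wins, and the SDK backend is only considered when the caller passes `allow_sdk=True`.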

Use

from decimal import Decimal
from dispatch_kit import (
    BudgetCap, BudgetState, CostRates, admits, estimate_cost,   # the hard $ cap
    select_backend, BackendKind, ToolRequirements,              # the where
    SecretRef, ExternalEndpoint, log_egress,                    # opt-in API egress
    Approval,                                                   # the approval audit fact
)

# 1. Reserve-on-approval: refuse a job that would push past the cap (both windows).
rates = CostRates(gpu_usd_per_s=Decimal("0.0008"), vcpu_usd_per_s=Decimal("0.00001"),
                  gib_usd_per_s=Decimal("0.000002"), idle_tail_s=Decimal(600))
cost = estimate_cost(rates, max_runtime_s=3600, vcpus=8, memory_gib=32)   # an UPPER bound
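# run_state and month_state are assumed to be the BudgetState values your app keeps for each window.
decision = admits(cost, run_state, month_state, BudgetCap(run_usd=Decimal(50), month_usd=Decimal(500)))
if not decision.admitted:
    # OverBudget is assumed to be an exception your app defines; it is not among the imports above.
    raise OverBudget(decision.reason)        # default cap is $0; paid work is off until you set one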

# 2. Pick where it runs: LOCAL first, SDK only if explicitly allowed.
# my_backends is the collection of backends your app has registered (not shown here).
backend = select_backend(my_backends, ToolRequirements(tool_id="cofold", min_vram_gb=24.0))

# 3. An LLM/SDK key is a REFERENCE (env var name), resolved at call time, never logged.
gemini = ExternalEndpoint("gemini", "https://generativelanguage.googleapis.com",
                          SecretRef("GEMINI_API_KEY"))
log_egress(gemini, detail="summarize")       # audit that data left the boundary
headers = {"Authorization": gemini.bearer()} # raises if the key is unset (never an unauthenticated call)
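
To make the reference-only secret idea in step 3 concrete, here is a minimal standard-library sketch. It does not reproduce dispatch-kit's `SecretRef` or `ExternalEndpoint`; the `EnvSecret`, `Endpoint`, and `MissingSecret` names, the `EXAMPLE_API_KEY` variable, and the example URL are all hypothetical. The assumption is that a secret is stored only as an environment variable name, resolved at call time, and that both a missing key and a non-https URL are refused.

```python
import os
from dataclasses import dataclass


class MissingSecret(RuntimeError):
    """Raised instead of making an unauthenticated call."""


@dataclass(frozen=True)
class EnvSecret:
    """Holds only the NAME of an environment variable, never its value."""
    env_var: str

    def resolve(self) -> str:
        value = os.environ.get(self.env_var, "")
        if not value:
            # Fail closed: an unset or empty key is an error, not an empty header.
            raise MissingSecret(f"{self.env_var} is not set")
        return value


@dataclass(frozen=True)
class Endpoint:
    name: str
    base_url: str
    secret: EnvSecret

    def __post_init__(self) -> None:
        if not self.base_url.startswith("https://"):
            raise ValueError(f"{self.name}: only https endpoints are allowed")

    def bearer(self) -> str:
        # Resolved on every call, so the key is never stored on the object
        # and cannot leak through repr() or a log of the endpoint.
        return f"Bearer {self.secret.resolve()}"


endpoint = Endpoint("example", "https://api.example.com",
                    EnvSecret("EXAMPLE_API_KEY"))
```

Because the dataclass stores only the variable name, printing or logging `endpoint` shows `EXAMPLE_API_KEY` and never the key itself.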

What's in the box

Module    Purpose
budget    BudgetCap / BudgetState / CostRates / admits / estimate_cost: the hard, Decimal-exact, reserve-on-approval cap across a run + month window
estimate  CostEstimate / HostCapabilities / vram_fits: the one "no GPU ⇒ a GPU job is infeasible" rule, shared by the gate and the router
routing   BackendKind / BackendCapabilities / ToolRequirements / select_backend (generic over a Routable): the pure LOCAL→LAN→CLOUD→SDK policy; SDK opt-in
egress    SecretRef / ExternalEndpoint / log_egress: reference-only API keys, https-only, fail-closed on a missing key, audited egress (SDKs and LLM APIs)
approval  Approval / ApprovalOutcome: the who/when/why audit fact for a gated job
dispatch  JobStore / Transport / WorkerExecutor protocols + is_lease_stale / should_give_up / Lease: the run-it-once-recoverably contract (atomic claim, stale-reject, lease recovery); push vs pull is only the Transport adapter
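
The lease-recovery idea in the dispatch row can be illustrated with two pure functions. This is a sketch under stated assumptions, not the library's `is_lease_stale` or `should_give_up`: the `LeaseInfo` fields, the TTL rule, and the attempt limit below are hypothetical choices made for the example. Passing `now` in as an argument keeps the functions free of clock reads.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class LeaseInfo:
    """Hypothetical lease record: who claimed the job and when they last renewed."""
    worker_id: str
    renewed_at: datetime
    attempts: int


def lease_is_stale(lease: LeaseInfo, now: datetime, ttl: timedelta) -> bool:
    """A lease is stale once its holder has not renewed within ttl."""
    return now - lease.renewed_at > ttl


def give_up(lease: LeaseInfo, max_attempts: int) -> bool:
    """Stop retrying once the job has been claimed max_attempts times."""
    return lease.attempts >= max_attempts


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
lease = LeaseInfo("worker-a", renewed_at=now - timedelta(minutes=10), attempts=2)

# Last renewal was 10 minutes ago and the TTL is 5 minutes, so another
# worker may reclaim the job.
stale = lease_is_stale(lease, now, ttl=timedelta(minutes=5))
```

A store built on this would reclaim a stale lease only while `give_up` is false, and would reject results that arrive from the worker whose lease was reclaimed.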

Notes

  • The budget cap lives in your dispatch service, never the UI — an agent hitting the API directly is still gated. Default cap $0; if spend can't be computed, refuse.
  • Reserve on approval, reconcile on completion — approving reserves the estimate immediately so a burst counts against the cap; the worker's true runtime reconciles reserved → spent.
  • SDK / external egress is the one deliberate exception — never the default (allow_sdk / opt-in), always logged, the key sourced from a secret at call time and never written to a log.
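
The reserve-on-approval and reconcile-on-completion notes above can be sketched as a small pure ledger. This is not dispatch-kit's `admits` or `BudgetState`; the `Window` dataclass and the `can_admit`, `reserve`, and `reconcile` functions are hypothetical, and the sketch covers a single window where the library checks both a run and a month window.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Window:
    """Hypothetical budget window: committed money is reserved + spent."""
    reserved: Decimal = Decimal(0)
    spent: Decimal = Decimal(0)


def can_admit(estimate: Decimal, window: Window, cap: Decimal) -> bool:
    """Fail closed: a zero cap or an uncomputable estimate refuses."""
    if not (estimate.is_finite() and cap.is_finite()):
        return False
    if estimate < 0 or cap <= 0:
        return False
    return window.reserved + window.spent + estimate <= cap


def reserve(window: Window, estimate: Decimal) -> Window:
    """On approval, count the full estimate against the cap immediately."""
    return Window(window.reserved + estimate, window.spent)


def reconcile(window: Window, estimate: Decimal, actual: Decimal) -> Window:
    """On completion, release the reservation and record the true cost."""
    return Window(window.reserved - estimate, window.spent + actual)


cap = Decimal("10")
w = reserve(Window(), Decimal("6"))             # first job approved
second_ok = can_admit(Decimal("6"), w, cap)     # 6 reserved + 6 exceeds 10
w = reconcile(w, Decimal("6"), Decimal("2.50")) # job actually cost 2.50
third_ok = can_admit(Decimal("6"), w, cap)      # 2.50 spent + 6 fits under 10
```

Reserving the upper-bound estimate at approval time is what makes a burst of approvals count against the cap before any job has finished; reconciliation then frees the unused headroom.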

Download files

Source Distribution

dispatch_kit-0.2.0.tar.gz (24.7 kB view details)

Uploaded Source

Built Distribution

dispatch_kit-0.2.0-py3-none-any.whl (21.0 kB view details)

Uploaded Python 3

File details

Details for the file dispatch_kit-0.2.0.tar.gz.

File metadata

  • Download URL: dispatch_kit-0.2.0.tar.gz
  • Size: 24.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dispatch_kit-0.2.0.tar.gz
Algorithm Hash digest
SHA256 eb007df1d86c0d0bed4d13f69699138fc2f7c63c57ce17586482d99bfc312c68
MD5 02a361894f6837dcfbededa0aee6282e
BLAKE2b-256 fb2b339f9239733ceb4b60f8a0912dcf5c68c97f85f72d6f0e2f55f7665ac387

Provenance

The following attestation bundles were made for dispatch_kit-0.2.0.tar.gz:

Publisher: release.yml on falahat/dispatch-kit

File details

Details for the file dispatch_kit-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: dispatch_kit-0.2.0-py3-none-any.whl
  • Size: 21.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dispatch_kit-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8dbd7de3fa1488b29bd346d3972b4c5eba9088b33415a0f2d2e984c2660fa961
MD5 d91e437f3f703dca3f00ef6b284ce7da
BLAKE2b-256 22b41c80aa1a471107fc916e462ffb83302c971f795e35062bc1e31b4802ab7b

Provenance

The following attestation bundles were made for dispatch_kit-0.2.0-py3-none-any.whl:

Publisher: release.yml on falahat/dispatch-kit
