The open trust layer for autonomous AI agents — deterministic gates, budget breakers, and tamper-evident action receipts.
Surety AI
Probabilistic agents. Verified actions.
The open execution boundary for autonomous AI agents. Agents may hallucinate; execution does not have to. Surety puts deterministic gates, hard exposure limits, and tamper-evident receipts between an agent's proposal and the real world.
Why · Quick start · Earned autonomy · Works with your stack · Evals · Production readiness · Examples · Strategy · Architecture · Spec · Roadmap
Why
In April 2026, a coding agent hit a credential mismatch in staging, found an API token in an unrelated file, and deleted a production database — and its backups — in nine seconds. Its post-incident summary: "I violated every principle I was given."
Principles stated in a prompt are not controls. The agent ecosystem has content guardrails (is this text safe?) and authorization (is this action permitted?), but almost nothing answers the questions that actually decide blast radius:
- How much autonomy has this agent earned? Static human-in-the-loop doesn't scale — reviewers drown, then reflexively approve, and "oversight" becomes theater. Approving everything and approving nothing both fail.
- What evidence does each decision leave? Post-incident forensics today rely on the agent's own logs — the defendant writes the police report.
- What stops a runaway loop at 3 a.m.? A hard ceiling, not a polite instruction.
Surety is that layer. Four invariants, enforced in code:
- Rules decide, LLMs propose. Only deterministic rules allow an action. The same action, policy, and state produce the same decision — there is no prompt to inject.
- Trust is earned per action-type. Agents graduate SUPERVISED → PROBATIONARY → TRUSTED → BONDED on track record, and one rejection demotes instantly. An email track record grants nothing for refunds.
- Hard limits are hard. Daily action/spend ceilings in integer minor units. Checked at the gate, committed only after execution.
- Every decision leaves a receipt. Hash-chained, payload-hashed (never payload-stored), verifiable by a third party. An open spec, not a log format.
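To make the receipt invariant concrete, here is a minimal sketch of a hash-chained, payload-hashed log. The `Receipt` fields, `appendReceipt`, and `verifyChain` are hypothetical names chosen for illustration; the real format is whatever the Action Receipt spec defines, and this is not Surety's implementation.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical receipt shape, for illustration only.
interface Receipt {
  seq: number
  action_type: string
  payload_hash: string // SHA-256 of the payload; the payload itself is never stored
  allowed: boolean
  prev_hash: string // hash of the previous receipt, or GENESIS for the first one
  hash: string // digest over every field above
}

const GENESIS = '0'.repeat(64)

const sha256 = (s: string): string => createHash('sha256').update(s).digest('hex')

// Hash the fields in a fixed order so the digest never depends on object key order.
function receiptHash(r: Omit<Receipt, 'hash'>): string {
  return sha256(JSON.stringify([r.seq, r.action_type, r.payload_hash, r.allowed, r.prev_hash]))
}

function appendReceipt(chain: Receipt[], actionType: string, payloadJson: string, allowed: boolean): Receipt {
  const prev = chain[chain.length - 1]
  const body = {
    seq: chain.length,
    action_type: actionType,
    payload_hash: sha256(payloadJson),
    allowed,
    prev_hash: prev ? prev.hash : GENESIS,
  }
  const receipt: Receipt = { ...body, hash: receiptHash(body) }
  chain.push(receipt)
  return receipt
}

// Needs only the receipts: no access to the agent, its logs, or the payloads.
function verifyChain(chain: readonly Receipt[]): boolean {
  let prevHash = GENESIS
  return chain.every((r, i) => {
    const { hash, ...body } = r
    const ok = r.seq === i && r.prev_hash === prevHash && hash === receiptHash(body)
    prevHash = hash
    return ok
  })
}

const demoChain: Receipt[] = []
appendReceipt(demoChain, 'payment.refund', JSON.stringify({ invoice: 'INV-1042', amount_minor: 9900 }), false)
appendReceipt(demoChain, 'email.send', JSON.stringify({ to: 'ops@example.com' }), true)
console.log(verifyChain(demoChain)) // true
```

Because each receipt commits to the hash of its predecessor, editing or dropping any earlier entry changes every later link, which is what lets a third party detect tampering without trusting the writer.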
The next phase closes the loop: require independent evidence for an action's assumptions, hold unresolved actions against an exposure budget, verify the real outcome, and grant more autonomy only from verified success. We call this outcome-bonded autonomy. See the product strategy and roadmap.
Surety's research direction adds calibrated foresight without weakening the deterministic boundary: forecasting and ML may require simulation, canaries, approval, or denial, but may never override failed invariants or hard limits. See reliability research.
Quick start

    npm install suretyai # Python: pip install suretyai

    import { BondLimits, createGuard } from 'suretyai'

    const limits = new BondLimits({ max_actions_per_day: 100, max_spend_per_day_minor: 10_000 })

    const guard = createGuard(
      [
        limits.rule(),
        {
          id: 'refund-ceiling',
          check: (a) => a.type !== 'payment.refund' || (a.payload.amount_minor as number) <= 5000,
          reason: 'Refunds above £50.00 require human approval',
        },
      ],
      { agent_id: 'billing-agent', chain: true }
    )

    const result = guard({ type: 'payment.refund', payload: { invoice: 'INV-1042', amount_minor: 9900 } })
    result.allowed // false, deterministically, every time
    result.reasons // ['Refunds above £50.00 require human approval']
    result.receipt // tamper-evident Action Receipt for your audit store

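
For intuition about why the same action always yields the same decision, here is a self-contained sketch of a deterministic rule evaluator. It borrows the `id` / `check` / `reason` rule shape from the snippet above, but `evaluate` and `passes` are hypothetical helpers, not Surety's `createGuard`. Treating a throwing rule as a failure (fail closed) and rejecting non-integer amounts are assumptions of this sketch.

```typescript
interface Action { type: string; payload: Record<string, unknown> }
interface Rule { id: string; check: (a: Action) => boolean; reason: string }
interface Decision { allowed: boolean; reasons: string[] }

// Assumption: a rule that throws counts as a failed rule (fail closed).
function passes(rule: Rule, action: Action): boolean {
  try {
    return rule.check(action) === true
  } catch {
    return false
  }
}

// Pure function of (rules, action): no clock, no randomness, no model call.
function evaluate(rules: readonly Rule[], action: Action): Decision {
  const reasons = rules.filter((r) => !passes(r, action)).map((r) => r.reason)
  return { allowed: reasons.length === 0, reasons }
}

const refundCeiling: Rule = {
  id: 'refund-ceiling',
  check: (a) => {
    if (a.type !== 'payment.refund') return true
    const amount = a.payload['amount_minor']
    // Validate the type at runtime instead of casting, so a string "100" cannot slip through.
    return typeof amount === 'number' && Number.isInteger(amount) && amount >= 0 && amount <= 5000
  },
  reason: 'Refunds above £50.00 require human approval',
}

const decision = evaluate([refundCeiling], {
  type: 'payment.refund',
  payload: { invoice: 'INV-1042', amount_minor: 9900 },
})
console.log(decision.allowed) // false
```

The runtime type check is a design choice worth copying into real rules: a TypeScript cast such as `as number` is erased at compile time and does not validate what the agent actually sent.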
Earned autonomy
The full pipeline adds the trust ledger, human approval gates, and oversight-health monitoring:

    import { ApprovalSignalHealth, TrustLedger, WebhookApprovalGate, createPipeline } from 'suretyai'

    // `limits` is the BondLimits instance from Quick start; `refundCeiling` is the
    // refund-ceiling rule object from Quick start, bound to a name.
    // `APPROVALS_URL`, `action`, and `execute` are supplied by your application.
    const pipeline = createPipeline({
      rules: [limits.rule(), refundCeiling],
      trust: new TrustLedger(), // graduated autonomy
      approval: new WebhookApprovalGate({ url: APPROVALS_URL }), // human-in-the-loop
      health: new ApprovalSignalHealth(), // rubber-stamp detection
      limits,
      agent_id: 'billing-agent',
      chain: true,
    })

    const result = await pipeline.run(action)
    // decision: 'auto_approved' | 'gate_approved' | 'gate_rejected' | 'gate_timeout' | 'policy_blocked'
    if (result.allowed) {
      await execute(action)
      limits.record(action) // commit the budget only after a successful execution
    }

What that buys you, measured (run it yourself):
| | Static HITL | No HITL | Surety graduated trust |
|---|---|---|---|
| Human decisions per 100 routine actions | 100 | 0 | 30, falling toward 0 |
| Rogue action stopped before execution | ✅ (if reviewer awake) | ❌ | ✅ rules + gate + limits |
| Reviewer fatigue → reflexive approval | guaranteed at scale | n/a | detected and flagged |
| Misbehavior consequence | none | none | instant demotion |
| Audit trail | app logs | agent's own logs | hash-chained receipts |
And trust is asymmetric, like it is with people: ~30 clean approvals to earn autonomy, one rejection to lose it.
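
The asymmetry can be sketched in a few lines. This is an illustrative model, not the `TrustLedger` implementation: the class name, the thresholds (10 / 30 / 100), and the choice to demote all the way back to SUPERVISED are assumptions; only the level names, the per-action-type scoping, and the "about 30 approvals up, one rejection down" shape come from the text above.

```typescript
type Level = 'SUPERVISED' | 'PROBATIONARY' | 'TRUSTED' | 'BONDED'

// Hypothetical thresholds: consecutive clean approvals needed, highest level first.
const THRESHOLDS: ReadonlyArray<readonly [Level, number]> = [
  ['BONDED', 100],
  ['TRUSTED', 30],
  ['PROBATIONARY', 10],
]

class SketchTrustLedger {
  // Streak of clean approvals per (agent, action-type) pair.
  private streaks = new Map<string, number>()

  private key(agentId: string, actionType: string): string {
    return JSON.stringify([agentId, actionType])
  }

  recordApproval(agentId: string, actionType: string): void {
    const k = this.key(agentId, actionType)
    this.streaks.set(k, (this.streaks.get(k) ?? 0) + 1)
  }

  // Assumption: a single rejection resets the streak, demoting to SUPERVISED.
  recordRejection(agentId: string, actionType: string): void {
    this.streaks.set(this.key(agentId, actionType), 0)
  }

  level(agentId: string, actionType: string): Level {
    const n = this.streaks.get(this.key(agentId, actionType)) ?? 0
    const hit = THRESHOLDS.find(([, min]) => n >= min)
    return hit ? hit[0] : 'SUPERVISED'
  }
}

const sketchLedger = new SketchTrustLedger()
for (let i = 0; i < 30; i++) sketchLedger.recordApproval('billing-agent', 'email.send')
console.log(sketchLedger.level('billing-agent', 'email.send')) // TRUSTED
console.log(sketchLedger.level('billing-agent', 'payment.refund')) // SUPERVISED
```

Keying the streak on the pair rather than on the agent alone is what keeps an email track record from granting anything for refunds.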
Works with your stack
One guard object; adapters for wherever your agents live. No framework lock-in, no rewrite:
// MCP — wrap any server's tool dispatcher
const safeTool = mcpGuard(guard, server.tool.bind(server))
// Claude Agent SDK — PreToolUse hook, sync, ~zero latency
const hook = claudePreToolUse(guard)
// OpenAI Agents SDK — input guardrail
new Agent({ inputGuardrails: [openaiGuardrail(guard)] })
Composes with — never replaces — the rest of the safety stack: content guardrails (LlamaFirewall, NeMo Guardrails) above, policy engines (OPA, Cedar) alongside. See where Surety sits.
Evals
Every claim above is backed by a reproducible eval — CI runs them on every push, so the README can't drift from the code. Reproduce with npm run eval:
| # | Claim | Result |
|---|---|---|
| E1 | Included adversarial corpus: case spoofing, string smuggling, type spoofing, credential laundering | 0/10 bypassed |
| E2 | Included canonicalization and collision cases (incl. the nested-key collision class — spec §3) | 6/6 correct |
| E3 | Graduated trust vs static HITL, 200 routine actions | 85% fewer human decisions |
| E4 | Oversight-health monitor: 4 rubber-stamp patterns + 1 healthy reviewer | 5/5 classified correctly |
Details and methodology: evals/.
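
As background for E2, the sketch below shows one common way to canonicalize a payload before hashing so that key order cannot change the digest. It is an assumption-laden illustration: `canonicalize` and `payloadHash` are hypothetical names, the spec's own rules (§3) are authoritative, and the flattened-key example is only one plausible way a nested-key collision could arise.

```typescript
import { createHash } from 'node:crypto'

type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Serialize with object keys sorted at every depth; nesting is preserved, never flattened.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value)
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(',')}]`
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
  return `{${entries.map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`).join(',')}}`
}

function payloadHash(payload: Json): string {
  return createHash('sha256').update(canonicalize(payload)).digest('hex')
}

// Same content, different key order: identical hash.
const sameA = payloadHash({ invoice: 'INV-1042', meta: { currency: 'GBP', amount_minor: 9900 } })
const sameB = payloadHash({ meta: { amount_minor: 9900, currency: 'GBP' }, invoice: 'INV-1042' })

// Different structure that a naive "flatten keys with a dot" scheme would conflate.
const nested = payloadHash({ a: { b: 1 } })
const flat = payloadHash({ 'a.b': 1 })

console.log(sameA === sameB) // true
console.log(nested === flat) // false
```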
The eval suite also includes a seeded comparative simulation that replays one labeled refund-action stream through unguarded execution, static HITL, and the real Surety guard. It reports both prevented loss and residual risk. Simulation validates mechanisms and hypotheses; field claims require independently labeled historical or shadow-mode traces from your own execution system.
Examples
Five runnable demos, no API keys needed — including the PocketOS incident replayed against Surety (3/3 destructive steps blocked, routine work unimpeded) and an agent earning its autonomy in 60 seconds. Index: examples/.
This repo dogfoods itself: the CI research agent runs under a Surety guard via a Claude Code PreToolUse hook — pushes to main blocked, workflow self-edits blocked, every tool call receipted (scripts/surety-hook.mjs).
Project status
v0.2 — core pipeline (guard, trust, gates, health, limits, receipts) shipped in TypeScript and Python with 80+ tests and the eval suite; MCP/Claude/OpenAI adapters shipped. Pre-1.0: API may still move; the Action Receipt spec is versioned independently. See the roadmap for what's next (receipt persistence, Slack gate, crewAI/LangGraph/pydantic-ai adapters, signed receipts).
Production readiness
Read this before putting Surety on a real money path — we'd rather you know the edges than discover them.
Ready now
- The deterministic guard, bond-limit checks, canonical hashing, and receipt chaining are pure and side-effect-free — safe to run inline on a hot path.
- TS and Python cores are at parity for guard / trust / limits / health; the full async pipeline and approval gates are TypeScript-only today.
Not ready yet — design around these
- State is in-memory and single-process. `TrustLedger`, `BondLimits`, `ApprovalSignalHealth`, and the receipt chain live in process memory. Across concurrent workers or replicas you get per-process trust and limits: two workers can each approve up to the "daily" ceiling, and trust earned on one isn't seen by another. For now, run one guard instance behind a queue, or persist/reload manually via `TrustLedger.export()` / `from()`. Durable Postgres/Redis backends with atomic limits are the headline of Phase 2.
- Budget commit is the caller's job. The pipeline checks limits but does not record them; call `limits.record(action)` after a successful execution or ceilings won't enforce (see examples/02).
- Approval is a prediction, not proof. Trust graduates on approvals; close the loop with `trust.recordOutcome(success)` so an approved-but-ineffective agent is demoted. Without outcome reporting, "trusted" means "approved," not "worked."
- The bundled evals are internal simulations, not field evidence. They prove the code behaves as specified on synthetic workloads (the simulation deliberately leaves an `ambiguous_intent` class unblocked). Validate against your own traces before trusting the numbers.
In short: today Surety is production-ready as a single-instance decision boundary with manual state persistence. Distributed, durable, atomic state is next.
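
The check-then-commit caveat above can be sketched as follows. `SketchDailyLimits` and `runWithinBudget` are hypothetical stand-ins, not `BondLimits`; the UTC-day rollover and the rejection of non-integer amounts are assumptions of this sketch.

```typescript
interface SpendAction { type: string; amount_minor: number }

class SketchDailyLimits {
  private day = ''
  private actions = 0
  private spendMinor = 0
  private readonly maxActions: number
  private readonly maxSpendMinor: number

  constructor(maxActions: number, maxSpendMinor: number) {
    this.maxActions = maxActions
    this.maxSpendMinor = maxSpendMinor
  }

  // Assumption: the "day" is the UTC calendar day.
  private rollover(now: Date): void {
    const today = now.toISOString().slice(0, 10)
    if (today !== this.day) {
      this.day = today
      this.actions = 0
      this.spendMinor = 0
    }
  }

  // Gate check: does not consume any budget.
  check(action: SpendAction, now: Date = new Date()): boolean {
    this.rollover(now)
    return (
      Number.isInteger(action.amount_minor) &&
      action.amount_minor >= 0 &&
      this.actions + 1 <= this.maxActions &&
      this.spendMinor + action.amount_minor <= this.maxSpendMinor
    )
  }

  // Commit: call only after the action really executed.
  record(action: SpendAction, now: Date = new Date()): void {
    this.rollover(now)
    this.actions += 1
    this.spendMinor += action.amount_minor
  }
}

async function runWithinBudget(
  limits: SketchDailyLimits,
  action: SpendAction,
  execute: (a: SpendAction) => Promise<void>,
  now: Date = new Date(),
): Promise<'executed' | 'blocked'> {
  if (!limits.check(action, now)) return 'blocked'
  await execute(action) // if this throws, nothing is recorded
  limits.record(action, now)
  return 'executed'
}
```

Note that two overlapping `runWithinBudget` calls could both pass `check` before either reaches `record`, even in a single process. That gap is the reason for serializing actions behind a queue until atomic, durable limits exist.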
Documentation
| Document | What it covers |
|---|---|
| Architecture & design decisions | The stack position, pipeline, and 11 design decisions with rationale |
| Reliability research | Deterministic assurance informed by calibrated forecasting and ML |
| Action Receipt spec v0.1 | The vendor-neutral receipt format (implement it without Surety) |
| Examples · Evals | Runnable demos and reproducible measurements |
| Roadmap | Phased plan with measurable exit criteria |
| Contributing · Security | How to help, how to report |
License
Apache-2.0 — including its explicit patent grant: everything here is freely usable, forever.
File details
Details for the file suretyai-0.2.0.tar.gz.
File metadata
- Download URL: suretyai-0.2.0.tar.gz
- Upload date:
- Size: 14.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f44183d25783b8c7447cac040d562c84ee601dedf43a791bd13a743a6e5081b1 |
| MD5 | b24e2cf8c771785fc7c1e9cd8300dd57 |
| BLAKE2b-256 | 8a9ab175e6fb0ef6074e02de67ad14aa8730dea44f6c1a9096d4ca8f90f45513 |
File details
Details for the file suretyai-0.2.0-py3-none-any.whl.
File metadata
- Download URL: suretyai-0.2.0-py3-none-any.whl
- Upload date:
- Size: 15.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 89f4ba280951a8db47a2c28220dfd79e6c18edc7df7db8db5af57491beef2445 |
| MD5 | 7872013c3e7544dc637d0f3dffbb3f66 |
| BLAKE2b-256 | f34c835dc6deca0d9efb1918afd5119337f05fac7c3695f226a470c71e406b69 |