The open trust layer for autonomous AI agents — deterministic gates, budget breakers, and tamper-evident action receipts.

Surety AI

Probabilistic agents. Verified actions.

The open execution boundary for autonomous AI agents. Agents may hallucinate; execution does not have to. Surety puts deterministic gates, hard exposure limits, and tamper-evident receipts between an agent's proposal and the real world.

Why · Quick start · Earned autonomy · Works with your stack · Evals · Production readiness · Examples · Strategy · Architecture · Spec · Roadmap

Why

In April 2026, a coding agent hit a credential mismatch in staging, found an API token in an unrelated file, and deleted a production database — and its backups — in nine seconds. Its post-incident summary: "I violated every principle I was given."

Principles stated in a prompt are not controls. The agent ecosystem has content guardrails (is this text safe?) and authorization (is this action permitted?), but almost nothing answers the questions that actually decide blast radius:

How much autonomy has this agent earned? Static human-in-the-loop doesn't scale — reviewers drown, then reflexively approve, and "oversight" becomes theater. Approving everything and approving nothing both fail.
What evidence does each decision leave? Post-incident forensics today rely on the agent's own logs — the defendant writes the police report.
What stops a runaway loop at 3 a.m.? A hard ceiling, not a polite instruction.

Surety is that layer. Four invariants, enforced in code:

Rules decide, LLMs propose. Only deterministic rules allow an action. The same action, policy, and state produce the same decision — there is no prompt to inject.
Trust is earned per action-type. Agents graduate SUPERVISED → PROBATIONARY → TRUSTED → BONDED on track record, and one rejection demotes instantly. An email track record grants nothing for refunds.
Hard limits are hard. Daily action/spend ceilings in integer minor units. Checked at the gate, committed only after execution.
Every decision leaves a receipt. Hash-chained, payload-hashed (never payload-stored), verifiable by a third party. An open spec, not a log format.
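To make the receipt invariant concrete, here is a minimal sketch of a hash chain that stores only a payload hash. The `Receipt` shape, `appendReceipt`, and `verifyChain` are hypothetical names for illustration; the Action Receipt spec defines the real fields, and payload canonicalization is skipped here by passing the payload as a string.

```typescript
import { createHash } from 'node:crypto'

// Hypothetical receipt shape, not the Action Receipt spec.
interface Receipt {
  seq: number
  action_type: string
  allowed: boolean
  payload_hash: string // SHA-256 of the payload; the payload itself is never stored
  prev_hash: string    // hash of the previous receipt, or GENESIS for the first one
  hash: string         // SHA-256 over the fields above
}

const GENESIS = '0'.repeat(64)

const sha256 = (s: string): string => createHash('sha256').update(s).digest('hex')

// Hash the body in a fixed field order so the digest is reproducible.
function receiptHash(r: Omit<Receipt, 'hash'>): string {
  return sha256(JSON.stringify([r.seq, r.action_type, r.allowed, r.payload_hash, r.prev_hash]))
}

function appendReceipt(chain: Receipt[], action_type: string, payload: string, allowed: boolean): Receipt {
  const prev = chain[chain.length - 1]
  const body = {
    seq: chain.length,
    action_type,
    allowed,
    payload_hash: sha256(payload),
    prev_hash: prev ? prev.hash : GENESIS,
  }
  const receipt: Receipt = { ...body, hash: receiptHash(body) }
  chain.push(receipt)
  return receipt
}

// A third party can re-derive every hash without ever seeing a payload.
function verifyChain(chain: Receipt[]): boolean {
  let prevHash = GENESIS
  let i = 0
  for (const r of chain) {
    const { hash, ...body } = r
    if (r.seq !== i || r.prev_hash !== prevHash || hash !== receiptHash(body)) return false
    prevHash = hash
    i++
  }
  return true
}

const chain: Receipt[] = []
const first = appendReceipt(chain, 'payment.refund', '{"invoice":"INV-1042","amount_minor":9900}', false)
appendReceipt(chain, 'email.send', '{"to":"customer@example.com"}', true)
console.log(verifyChain(chain)) // true

first.allowed = true            // tamper with an earlier decision
console.log(verifyChain(chain)) // false
```

Editing any earlier receipt changes its recomputed hash, so verification fails at the point of tampering.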

The next phase closes the loop: require independent evidence for an action's assumptions, hold unresolved actions against an exposure budget, verify the real outcome, and grant more autonomy only from verified success. We call this outcome-bonded autonomy. See the product strategy and roadmap.

Surety's research direction adds calibrated foresight without weakening the deterministic boundary: forecasting and ML may require simulation, canaries, approval, or denial, but may never override failed invariants or hard limits. See reliability research.

Quick start

npm install @suretyainpm/suretyai        # Python: pip install suretyai

import { BondLimits, createGuard } from '@suretyainpm/suretyai'

const limits = new BondLimits({ max_actions_per_day: 100, max_spend_per_day_minor: 10_000 })

const guard = createGuard(
  [
    limits.rule(),
    {
      id: 'refund-ceiling',
      check: (a) => a.type !== 'payment.refund' || (a.payload.amount_minor as number) <= 5000,
      reason: 'Refunds above £50.00 require human approval',
    },
  ],
  { agent_id: 'billing-agent', chain: true }
)

const result = guard({ type: 'payment.refund', payload: { invoice: 'INV-1042', amount_minor: 9900 } })

result.allowed   // false — deterministically, every time
result.reasons   // ['Refunds above £50.00 require human approval']
result.receipt   // tamper-evident Action Receipt for your audit store

Earned autonomy

The full pipeline adds the trust ledger, human approval gates, and oversight-health monitoring:

import { ApprovalSignalHealth, TrustLedger, WebhookApprovalGate, createPipeline } from '@suretyainpm/suretyai'

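// `limits` is the BondLimits instance from the quick start; `refundCeiling` is assumed
// to be the 'refund-ceiling' rule object from the quick start, bound to a variable.
const pipeline = createPipeline({
  rules: [limits.rule(), refundCeiling],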
  trust: new TrustLedger(),                                  // graduated autonomy
  approval: new WebhookApprovalGate({ url: APPROVALS_URL }), // human-in-the-loop
  health: new ApprovalSignalHealth(),                        // rubber-stamp detection
  limits,
  agent_id: 'billing-agent',
  chain: true,
})

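// `action` is the proposed action; `execute` stands in for your own side-effecting call.
const result = await pipeline.run(action)
// decision: 'auto_approved' | 'gate_approved' | 'gate_rejected' | 'gate_timeout' | 'policy_blocked'
// Commit the budget only after the action has actually executed.
if (result.allowed) { await execute(action); limits.record(action) }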

What that buys you, measured (run it yourself):

	Static HITL	No HITL	Surety graduated trust
Human decisions per 100 routine actions	100	0	30, falling toward 0
Rogue action stopped before execution	✅ (if reviewer awake)	❌	✅ rules + gate + limits
Reviewer fatigue → reflexive approval	guaranteed at scale	n/a	detected and flagged
Misbehavior consequence	none	none	instant demotion
Audit trail	app logs	agent's own logs	hash-chained receipts

And trust is asymmetric, like it is with people: ~30 clean approvals to earn autonomy, one rejection to lose it.
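The asymmetry can be sketched as a small per-action-type ladder. This is not `TrustLedger`: the class name, the thresholds (10 / 30 / 100), and the choice to demote all the way back to SUPERVISED are assumptions made for illustration.

```typescript
// Illustrative only: thresholds and the demotion target are assumptions, not Surety's values.
const LEVELS = ['SUPERVISED', 'PROBATIONARY', 'TRUSTED', 'BONDED'] as const
type Level = (typeof LEVELS)[number]

// Clean approvals needed to reach each level (assumed).
const THRESHOLDS: Record<Level, number> = { SUPERVISED: 0, PROBATIONARY: 10, TRUSTED: 30, BONDED: 100 }

class LadderSketch {
  // Track record is keyed by agent AND action type, so email history grants nothing for refunds.
  private approvals = new Map<string, number>()

  private key(agentId: string, actionType: string): string {
    return `${agentId}::${actionType}`
  }

  level(agentId: string, actionType: string): Level {
    const n = this.approvals.get(this.key(agentId, actionType)) ?? 0
    let current: Level = 'SUPERVISED'
    for (const candidate of LEVELS) {
      if (n >= THRESHOLDS[candidate]) current = candidate
    }
    return current
  }

  // In this sketch, only TRUSTED and BONDED skip the human gate.
  needsHuman(agentId: string, actionType: string): boolean {
    const l = this.level(agentId, actionType)
    return l === 'SUPERVISED' || l === 'PROBATIONARY'
  }

  recordApproval(agentId: string, actionType: string): void {
    const k = this.key(agentId, actionType)
    this.approvals.set(k, (this.approvals.get(k) ?? 0) + 1)
  }

  // One rejection wipes the track record for that action type (assumed demotion target).
  recordRejection(agentId: string, actionType: string): void {
    this.approvals.set(this.key(agentId, actionType), 0)
  }
}

const ladder = new LadderSketch()
let humanDecisions = 0
for (let i = 0; i < 200; i++) {
  if (ladder.needsHuman('billing-agent', 'payment.refund')) {
    humanDecisions++
    ladder.recordApproval('billing-agent', 'payment.refund')
  }
}
console.log(humanDecisions)                                  // 30
console.log(ladder.level('billing-agent', 'payment.refund')) // TRUSTED
console.log(ladder.level('billing-agent', 'email.send'))     // SUPERVISED

ladder.recordRejection('billing-agent', 'payment.refund')
console.log(ladder.level('billing-agent', 'payment.refund')) // SUPERVISED
```

Under these assumed thresholds, 200 routine actions cost 30 human decisions instead of 200, a reduction of (200 - 30) / 200 = 85%.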

Works with your stack

One guard object; adapters for wherever your agents live. No framework lock-in, no rewrite:

// MCP — wrap any server's tool dispatcher
const safeTool = mcpGuard(guard, server.tool.bind(server))

// Claude Agent SDK — PreToolUse hook, sync, ~zero latency
const hook = claudePreToolUse(guard)

// OpenAI Agents SDK — input guardrail
new Agent({ inputGuardrails: [openaiGuardrail(guard)] })

Composes with — never replaces — the rest of the safety stack: content guardrails (LlamaFirewall, NeMo Guardrails) above, policy engines (OPA, Cedar) alongside. See where Surety sits.
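The adapter idea, one guard in front of any dispatcher, can be sketched generically. The `Action`, `Decision`, `Guard`, and `guardDispatch` names below are hypothetical and are not the signatures of `mcpGuard`, `claudePreToolUse`, or `openaiGuardrail`.

```typescript
// Hypothetical shapes for illustration; the real adapter signatures may differ.
interface Action { type: string; payload: Record<string, unknown> }
interface Decision { allowed: boolean; reasons: string[] }
type Guard = (action: Action) => Decision
type Dispatch<R> = (tool: string, args: Record<string, unknown>) => R

// Wrap a dispatcher so every call passes the guard before anything executes.
function guardDispatch<R>(guard: Guard, dispatch: Dispatch<R>): Dispatch<R> {
  return (tool, args) => {
    const decision = guard({ type: tool, payload: args })
    if (!decision.allowed) {
      throw new Error(`Blocked ${tool}: ${decision.reasons.join('; ')}`)
    }
    return dispatch(tool, args)
  }
}

// Toy deterministic rule: destructive database tools never run.
const denyDestructive: Guard = (a) =>
  a.type.startsWith('db.drop')
    ? { allowed: false, reasons: ['Destructive database actions are blocked'] }
    : { allowed: true, reasons: [] }

const rawDispatch: Dispatch<string> = (tool, args) => `ran ${tool} with ${JSON.stringify(args)}`
const safeDispatch = guardDispatch(denyDestructive, rawDispatch)

console.log(safeDispatch('db.read', { table: 'invoices' }))
// ran db.read with {"table":"invoices"}

try {
  safeDispatch('db.drop_table', { table: 'invoices' })
} catch (err) {
  console.log((err as Error).message)
  // Blocked db.drop_table: Destructive database actions are blocked
}
```

The wrapped function keeps the dispatcher's signature, so callers do not need to change.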

Evals

Every claim above is backed by a reproducible eval — CI runs them on every push, so the README can't drift from the code. Reproduce with npm run eval:

#	Claim	Result
E1	Included adversarial corpus: case spoofing, string smuggling, type spoofing, credential laundering	0/10 bypassed
E2	Included canonicalization and collision cases (incl. the nested-key collision class — spec §3)	6/6 correct
E3	Graduated trust vs static HITL, 200 routine actions	85% fewer human decisions
E4	Oversight-health monitor: 4 rubber-stamp patterns + 1 healthy reviewer	5/5 classified correctly

Details and methodology: evals/.
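For readers unfamiliar with the canonicalization cases in E2, the sketch below shows the general technique: serialize with keys sorted at every depth, then hash. It is an illustration under assumptions, not Surety's canonical form; `canonicalize` and `payloadHash` are hypothetical names, and the flattened-key example is only one plausible reading of a nested-key collision (spec §3 is not reproduced here).

```typescript
import { createHash } from 'node:crypto'

type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Serialize with object keys sorted at every depth, so key order never changes the digest.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value)
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(',')}]`
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
  return `{${entries.map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`).join(',')}}`
}

function payloadHash(value: Json): string {
  return createHash('sha256').update(canonicalize(value)).digest('hex')
}

// Same content, different key order: identical hash.
console.log(payloadHash({ invoice: 'INV-1042', amount_minor: 9900 }) ===
            payloadHash({ amount_minor: 9900, invoice: 'INV-1042' })) // true

// Nested vs. flattened keys stay distinct because structure is preserved, not joined into paths.
console.log(canonicalize({ a: { b: 1 } })) // {"a":{"b":1}}
console.log(canonicalize({ 'a.b': 1 }))    // {"a.b":1}
console.log(payloadHash({ a: { b: 1 } }) === payloadHash({ 'a.b': 1 })) // false
```

A scheme that flattened nested keys into dotted paths before hashing would map both of the last two payloads to the same string.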

The eval suite also includes a seeded comparative simulation that replays one labeled refund-action stream through unguarded execution, static HITL, and the real Surety guard. It reports both prevented loss and residual risk. Simulation validates mechanisms and hypotheses; field claims require independently labeled historical or shadow-mode traces from your own execution system.
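A toy version of such a seeded replay might look like the sketch below. Everything in it is an assumption for illustration: the stream generator, the 10% harmful rate, the single refund-ceiling rule, and the report fields. It compares only unguarded execution against one rule and omits static HITL and the real guard.

```typescript
// Small seeded PRNG (mulberry32) so every run replays the identical stream.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0
  return () => {
    a = (a + 0x6d2b79f5) >>> 0
    let t = a
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

interface LabeledRefund { amount_minor: number; harmful: boolean }

// Synthetic labeled stream: amounts from 100 to 9999 minor units, roughly 10% labeled harmful.
function makeStream(seed: number, n: number): LabeledRefund[] {
  const rand = mulberry32(seed)
  const stream: LabeledRefund[] = []
  for (let i = 0; i < n; i++) {
    const amount_minor = 100 + Math.floor(rand() * 9_900)
    const harmful = rand() < 0.1
    stream.push({ amount_minor, harmful })
  }
  return stream
}

interface Report {
  unguarded_loss_minor: number // loss if every action executes
  prevented_loss_minor: number // harmful refunds the ceiling blocked
  residual_loss_minor: number  // harmful refunds that still got through
  legit_blocked: number        // harmless refunds blocked by the ceiling
}

function replay(stream: LabeledRefund[], ceilingMinor: number): Report {
  const report: Report = { unguarded_loss_minor: 0, prevented_loss_minor: 0, residual_loss_minor: 0, legit_blocked: 0 }
  for (const action of stream) {
    const blocked = action.amount_minor > ceilingMinor
    if (action.harmful) {
      report.unguarded_loss_minor += action.amount_minor
      if (blocked) report.prevented_loss_minor += action.amount_minor
      else report.residual_loss_minor += action.amount_minor
    } else if (blocked) {
      report.legit_blocked++
    }
  }
  return report
}

console.log(replay(makeStream(42, 1_000), 5_000))
```

Reporting residual loss and blocked legitimate work next to prevented loss keeps the comparison honest: a stricter ceiling prevents more loss but also blocks more routine refunds.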

Examples

Five runnable demos, no API keys needed — including the PocketOS incident replayed against Surety (3/3 destructive steps blocked, routine work unimpeded) and an agent earning its autonomy in 60 seconds. Index: examples/.

This repo dogfoods itself: the CI research agent runs under a Surety guard via a Claude Code PreToolUse hook — pushes to main blocked, workflow self-edits blocked, every tool call receipted (scripts/surety-hook.mjs).

Project status

v0.2 — core pipeline (guard, trust, gates, health, limits, receipts) shipped in TypeScript and Python with 80+ tests and the eval suite; MCP/Claude/OpenAI adapters shipped. Pre-1.0: API may still move; the Action Receipt spec is versioned independently. See the roadmap for what's next (receipt persistence, Slack gate, crewAI/LangGraph/pydantic-ai adapters, signed receipts).

Production readiness

Read this before putting Surety on a real money path — we'd rather you know the edges than discover them.

Ready now

The deterministic guard, bond-limit checks, canonical hashing, and receipt chaining are pure and side-effect-free — safe to run inline on a hot path.
TS and Python cores are at parity for guard / trust / limits / health; the full async pipeline and approval gates are TypeScript-only today.

Not ready yet — design around these

State is in-memory and single-process. TrustLedger, BondLimits, ApprovalSignalHealth, and the receipt chain live in process memory. Across concurrent workers or replicas you get per-process trust and limits — two workers can each approve up to the "daily" ceiling, and trust earned on one isn't seen by another. For now, run one guard instance behind a queue, or persist/reload manually via TrustLedger.export()/from(). Durable Postgres/Redis backends with atomic limits are the headline of Phase 2.
Budget commit is the caller's job. The pipeline checks limits but does not record them; call limits.record(action) after each successful execution, or the ceilings will never be enforced (see examples/02).
Approval is a prediction, not proof. Trust graduates on approvals; close the loop with trust.recordOutcome(success) so an approved-but-ineffective agent is demoted. Without outcome reporting, "trusted" means "approved," not "worked."
The bundled evals are internal simulations, not field evidence — they prove the code behaves as specified on synthetic workloads (the simulation deliberately leaves an ambiguous_intent class unblocked). Validate against your own traces before trusting the numbers.
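The check-then-commit split behind "budget commit is the caller's job" can be sketched as follows. `LimitsSketch` is a hypothetical stand-in, not `BondLimits`; its method names and constructor are assumptions, and daily reset is omitted.

```typescript
// Hypothetical stand-in for BondLimits, for illustration only.
class LimitsSketch {
  private actionsToday = 0
  private spentMinorToday = 0
  private readonly maxActions: number
  private readonly maxSpendMinor: number

  constructor(maxActions: number, maxSpendMinor: number) {
    this.maxActions = maxActions
    this.maxSpendMinor = maxSpendMinor
  }

  // Pure check at the gate: reads the counters, never mutates them.
  check(amountMinor: number): boolean {
    if (!Number.isInteger(amountMinor) || amountMinor < 0) return false
    return this.actionsToday + 1 <= this.maxActions &&
           this.spentMinorToday + amountMinor <= this.maxSpendMinor
  }

  // Commit only after the action has actually executed.
  record(amountMinor: number): void {
    this.actionsToday += 1
    this.spentMinorToday += amountMinor
  }
}

// Integer minor units avoid floating-point drift in money sums.
console.log(0.1 + 0.2 === 0.3) // false

const budget = new LimitsSketch(100, 10_000)
console.log(budget.check(6_000)) // true
budget.record(6_000)             // the caller commits after executing
console.log(budget.check(6_000)) // false: 12,000 would exceed the 10,000 ceiling
console.log(budget.check(4_000)) // true: exactly reaches the ceiling

// Without record(), nothing accumulates and every check keeps passing.
const forgetful = new LimitsSketch(100, 10_000)
console.log(forgetful.check(6_000) && forgetful.check(6_000)) // true
```

Keeping `check` free of side effects means a blocked or failed action never consumes budget, at the cost of requiring the caller to commit.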

In short: today Surety is production-ready as a single-instance decision boundary with manual state persistence. Distributed, durable, atomic state is next.

Documentation


Architecture & design decisions	The stack position, pipeline, and 11 design decisions with rationale
Reliability research	Deterministic assurance informed by calibrated forecasting and ML
Action Receipt spec v0.1	The vendor-neutral receipt format — implement it without Surety
Examples · Evals	Runnable demos and reproducible measurements
Roadmap	Phased plan with measurable exit criteria
Contributing · Security	How to help, how to report

License

Apache-2.0 — including its explicit patent grant: everything here is freely usable, forever.

Release history

0.2.1 (this version): Jun 14, 2026
0.2.0: Jun 13, 2026

Download files

Source distribution: suretyai-0.2.1.tar.gz (14.8 kB). Uploaded Jun 14, 2026 via twine/6.1.0 CPython/3.13.12. Tags: Source. Uploaded using Trusted Publishing: No.

Algorithm	Hash digest
SHA256	`61073a7ad2ea0f07d9532412b6adaae4517626ede2def80a1d9fe02ae94e20dd`
MD5	`096ca5ade0e510a5d544a8528a25720c`
BLAKE2b-256	`19cd993e15acebf15300d8d15086cfe978c96994c493e9bfad214833d4189328`

Built distribution: suretyai-0.2.1-py3-none-any.whl (15.2 kB). Uploaded Jun 14, 2026 via twine/6.1.0 CPython/3.13.12. Tags: Python 3. Uploaded using Trusted Publishing: No.

Algorithm	Hash digest
SHA256	`e4dbb5264851f74075d3f1f0820156ffcb571691a044d6aee8d42ec1262096ea`
MD5	`78f357e0ef4b2eafd4e85fd9cf4d4343`
BLAKE2b-256	`72be70c8fd5cdfabdbb8f3baccf024626ca0aa170e3ec6ab2e62de27999c231a`
