Runtime contract enforcement for LLM agent systems

Sponsio

⭐ Help us grow the Sponsio community for better shared Contract Library and policy enforcement. Star the repo!

Runtime enforcement for AI agents. Write policies in natural language; Sponsio compiles them into unbreakable, deterministic agent contracts. Checks run in under 0.01 ms (p50, single pre-warmed contract), add zero LLM cost at runtime, and cover all 10 OWASP Agentic risks.

An agent contract is a runtime check at every agent action, backed by formal methods — NOT a system prompt your agent can ignore or jailbreak.

Works with any stack. LangChain, Claude Agent, OpenAI Agents, Google ADK, CrewAI, Vercel AI, MCP, or any custom tool-calling loop. Python · TypeScript · Prompt · Agent Skills.

Demo video coming soon

SOTA Agent Safety Solutions

Sponsio architecture: Agent Flow + (Natural Language + Pattern Library) compile into Contracts (Assumption → Enforcement), enforced by a Fuzzy LTL Monitor (deterministic + stochastic) that decides Pass / Block · Warn · Escalate / Redirect for every function call, with full audit trail logs feeding back to the agent.
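To make the decision step in the architecture concrete, here is a minimal sketch of a monitor that returns one verdict per function call and appends every decision to an audit trail. The names `Verdict`, `AuditedMonitor`, and `no_prod_deploy` are hypothetical and are not part of the Sponsio API; the rule itself is a toy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Verdict(Enum):
    # The five outcomes named in the architecture description.
    PASS = "pass"
    BLOCK = "block"
    WARN = "warn"
    ESCALATE = "escalate"
    REDIRECT = "redirect"


@dataclass
class AuditedMonitor:
    """Hypothetical wrapper: runs one check per call and records the decision."""
    check: Callable[[str, dict], Verdict]
    audit_log: List[Dict] = field(default_factory=list)

    def decide(self, tool: str, args: dict) -> Verdict:
        verdict = self.check(tool, args)
        # Every decision is logged, whether or not the call is allowed.
        self.audit_log.append({"tool": tool, "args": args, "verdict": verdict.value})
        return verdict


def no_prod_deploy(tool: str, args: dict) -> Verdict:
    # Toy rule: block production deploys, warn on staging, pass everything else.
    if tool == "deploy" and args.get("env") == "production":
        return Verdict.BLOCK
    if tool == "deploy" and args.get("env") == "staging":
        return Verdict.WARN
    return Verdict.PASS


monitor = AuditedMonitor(check=no_prod_deploy)
decisions = [
    monitor.decide("run_tests", {}),
    monitor.decide("deploy", {"env": "staging"}),
    monitor.decide("deploy", {"env": "production"}),
]
```

The point of the sketch is the shape: the check is a pure function of the call, and the log is written on every path, so the trail is complete even for calls that pass.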

On ODCV-Bench — a third-party benchmark from McGill DMaS — 12 frontier LLMs × 80 trajectories (Claude-Opus-4.6 included), unguarded models cheat in 11.5%–66.7% of runs. With Sponsio, 84.5% of misalignment is blocked on average, while the next-best publicly announced runtime guardrail (Salus, YC W26, launch Feb 2026) reaches 52% on the same benchmark. On the Financial-Audit-Fraud-Finding scenario, frontier models commit fraud in 67% of trials (16/24); with Sponsio, 100% blocked.

Why Sponsio

Approach	When it works	Where it fails	How Sponsio solves it
Prompt-injection Filters	Pre-generation, on input text	Drifts on novel phrasings; sees text, not tool calls; no notion of action history	Enforces which tools may run, in what order, and with what arguments, before the function call executes, with full trace context
Output Validators	Post-generation, on response strings	The mistakes (e.g. refund, DB write, API call) may already have fired	Blocks the call before execution; reasons over the full action history, not just the latest string
LLM-as-Judge	Flexible, handles fuzzy properties; useful for offline eval	Stochastic verdicts, hundreds-of-ms latency, itself prompt-injectable; unsuitable as a synchronous gate	Deterministic checks at roughly 0.01 ms on a pre-warmed contract, zero LLM in the hot path; stochastic pipeline is opt-in for fuzzy properties
Sandboxing & Access Control Lists	Strong perimeter for identity- and resource-level isolation	Narrows agent capability. Gates by who and what resource, not by behavior sequence	Enforces temporal contracts over the action sequence, including ordering, history, and multi-step invariants, preserving agent capability

Compared to other deterministic enforcers, Sponsio's edge:

1. Temporal contracts over sequential actions, not stateless rule matching. Existing enforcers evaluate each action in isolation. Sponsio reasons over the full trajectory: "verify_recipient before send_email", "no external calls after PII access", "refund_payment ≤ 3 calls per session".

2. Machine-checkable, not heuristic. Contracts compile to LTL formulas, then to deterministic finite automata (DFAs). Every verdict is a single DFA transition, not a probabilistic confidence score, so the same trace always yields the same decision. The approach belongs to the same family of formal methods used in hardware and distributed-systems verification (Intel FPU correctness, AWS S3 with TLA+).

3. Zero to protected in minutes, no DSL learning curve. Existing tools require hand-written YAML / Rego / Cedar policies from scratch. Sponsio offers four paths in:

Auto-inferred — sponsio init (interactive wizard) reads your tool signatures and writes starter contracts
Contract library — include pre-built bundles by capability (sponsio:capability/shell, …/filesystem) or by incident (sponsio:incident/openclaw); each bundle is composed from Sponsio's 44 deterministic (det) patterns (stochastic, or sto, atoms ship in Sponsio Cloud)
Natural language — sponsio validate "..." compiles plain English to LTL
Policy doc — sponsio scan --policy security.md parses an existing compliance document

4. Framework-agnostic and low-dependency. Other tools ship as opinionated stacks — bundling identity, SRE, dashboards, orchestration. Sponsio is a single enforcement library that plugs in alongside whatever observability, IAM, and orchestration you already use.
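The temporal examples from points 1 and 2 can be sketched as small finite-state monitors. The code below is an illustration only, assuming a hand-written transition table rather than Sponsio's LTL compiler; `PrecedenceMonitor` and `CallCap` are hypothetical names. The first encodes "verify_recipient before send_email", the second "refund_payment ≤ 3 calls per session".

```python
from typing import Dict, Tuple

# States for the precedence property "verify_recipient before send_email",
# i.e. the LTL formula (not send_email) W verify_recipient.
WAITING, VERIFIED, VIOLATED = "waiting", "verified", "violated"

# Hand-written transition table (assumption: not produced by Sponsio).
PRECEDENCE_DFA: Dict[Tuple[str, str], str] = {
    (WAITING, "verify_recipient"): VERIFIED,
    (WAITING, "send_email"): VIOLATED,
}


def step(state: str, tool: str) -> str:
    """One deterministic transition; unlisted (state, tool) pairs self-loop."""
    return PRECEDENCE_DFA.get((state, tool), state)


class PrecedenceMonitor:
    def __init__(self) -> None:
        self.state = WAITING

    def allow(self, tool: str) -> bool:
        nxt = step(self.state, tool)
        if nxt == VIOLATED:
            return False  # block the call and keep the current state
        self.state = nxt
        return True


class CallCap:
    """Bounded counter: at most `limit` allowed calls to `tool` per session."""

    def __init__(self, tool: str, limit: int) -> None:
        self.tool, self.limit, self.count = tool, limit, 0

    def allow(self, tool: str) -> bool:
        if tool != self.tool:
            return True
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


order = PrecedenceMonitor()
order_verdicts = [order.allow(t) for t in ["send_email", "verify_recipient", "send_email"]]

cap = CallCap("refund_payment", limit=3)
cap_verdicts = [cap.allow("refund_payment") for _ in range(4)]
```

Both monitors depend on the history of the session, which is what separates them from stateless rule matching: the same `send_email` call is blocked before verification and allowed after it.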

Quick start

Pick your project's language. Instant onboarding with a single prompt or a 2-line CLI command.

Python

Paste the prompt template into Claude Code / Codex / Cursor; the agent then helps run the full onboarding process. Note: because of its own harness design, Cursor may not explicitly show in the conversation what Sponsio has blocked.

Or run the CLI yourself:

pip install sponsio
sponsio init .

init is an interactive wizard. It detects your framework (LangGraph / OpenAI / Claude Agent / Vercel AI / CrewAI / MCP / …), asks which IDE hosts to wire up (Claude Code / Codex / Cursor / OpenClaw, each at none / skill / full level), and asks whether to start in observe or enforce mode. It then writes sponsio.yaml and prints the 2-line patch:

from sponsio.langgraph import Sponsio
from langgraph.prebuilt import create_react_agent

guard = Sponsio(config="sponsio.yaml", agent_id="coding_agent")
agent = create_react_agent(model, guard.wrap(tools))
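The observe vs enforce choice the wizard asks about can be pictured with a small gate. This is a sketch under an assumption: that observe mode records a violation and lets the call proceed, while enforce mode blocks it. The class `ModeGate` and the `violates` callback are hypothetical and do not reflect Sponsio's internals.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ModeGate:
    """Hypothetical gate showing the assumed difference between the two modes."""
    violates: Callable[[str, dict], bool]
    mode: str = "observe"  # "observe" or "enforce"
    findings: List[str] = field(default_factory=list)

    def allow(self, tool: str, args: dict) -> bool:
        if not self.violates(tool, args):
            return True
        # A violation is always recorded; only enforce mode turns it into a block.
        self.findings.append(tool)
        return self.mode != "enforce"


def is_destructive(tool: str, args: dict) -> bool:
    # Toy rule standing in for a compiled contract.
    return tool == "drop_table"


observe_gate = ModeGate(violates=is_destructive, mode="observe")
enforce_gate = ModeGate(violates=is_destructive, mode="enforce")

observed = observe_gate.allow("drop_table", {"name": "users"})
enforced = enforce_gate.allow("drop_table", {"name": "users"})
```

Under that assumption, running in observe mode first gives you the list of calls that would have been blocked before you flip to enforce.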

TypeScript

Paste into Claude Code / Codex / Cursor:

Or run the CLI yourself:

npm install -D @sponsio/sdk
npx sponsio init .

Note — the TS wizard is currently single-axis (provider × mode × agent). For the full multi-axis flow that also installs IDE-host plugins (Claude Code / Codex / Cursor / OpenClaw), paste the Python prompt above into your IDE agent — it works on TS projects too (drives the Python sponsio CLI, writes a TS-compatible sponsio.yaml).

import { Sponsio } from "@sponsio/sdk";
import { wrapTools } from "@sponsio/sdk/langchain";
import { ToolNode } from "@langchain/langgraph/prebuilt";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "coding_agent" });
const toolNode = new ToolNode(wrapTools(tools, guard));

For OpenClaw Community

The Python prompt above is the install path for you too. Pick openclaw=full when the wizard asks about IDE hosts; Sponsio then gates every before_tool_call event in your OpenClaw runtime through the contract engine, with ClawHavoc + CVE-2026-25253 coverage via the bundled sponsio:incident/openclaw pack.

Watch live blocks in your terminal — every Sponsio decision against your OpenClaw runtime streams here:

sponsio host trace openclaw --follow

sponsio.yaml can also be hand-written, scanned from a policy doc (sponsio scan --policy policy.md), or mined from traces (sponsio refresh). Syntax: docs/concepts/contracts.md.

Full walkthrough: QUICKSTART.md — config reference, observe → enforce flip, sponsio refresh, CI wiring, troubleshooting.

Benchmarks & Performance

Sponsio is benchmarked on two public agent-safety suites covering two distinct failure modes: rational metric gaming under KPI pressure, and dangerous bash / python snippets. All results are offline replays against published trajectories rather than Sponsio-authored test sets, and all use the library only (no per-scenario LLM scan on the blocking path).

Benchmark	What it measures	Sponsio result
ODCV-Bench (12 LLMs × 80 trajectories)	Intent integrity under KPI pressure	84.5% blocked (next-best published baseline: 52%) · 0 new FPs on clean scenarios
RedCode-Exec (1,410 cases)	Dangerous bash / python snippet detection	95% bash · 90% python · 92% combined · 0% utility FP on 60-file clean-code audit

Both contract libraries ship as loadable contract packs. Semantic properties that deterministic (det) patterns cannot fingerprint (tone, hallucination, scope drift in natural-language output) require an LLM judge. The OSS engine ships a Judge extension point you can plug your own judge into; the managed stochastic (sto) pipeline is a Sponsio Cloud feature.

Logic-flaw failures are deterministically catchable too

Conventional wisdom said RedCode's logic-flaw categories — biased decision logic, weak regex bypasses, missing-authorisation patterns, algorithmic flaws, memory leaks — were "behavioural defects with no syscall fingerprint" and required a semantic LLM judge. We pushed back on that label. Every such failure surfaces as a finite code-text pattern; once the pattern is enumerated, det binds. Seven new layers (bias_logic, weak_regex, algo_flaw, memory_leak, match_no_default, missing_auth, redcode_unauth_marker) lifted RedCode python from 69% → 90% with 0/60 FP on a clean-code audit (Sponsio's own source, tests, and API routes). The det/sto boundary is wider than most guardrail authors assumed; sto stays for properties that genuinely live in free-form output (tone, hallucination, faithfulness) — not for code-shape patterns whose finite enumeration was just under-explored.
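The idea that a failure class becomes deterministically checkable once its text patterns are enumerated can be sketched as a layered regex scan. The layer names and regexes below are illustrative stand-ins chosen for brevity (shell patterns of the kind the shell bundle describes); they are not the seven layers listed above and not Sponsio's actual patterns.

```python
import re
from typing import Dict, List

# Hypothetical layers: each one is a named, finite text pattern.
LAYERS: Dict[str, re.Pattern] = {
    # A download piped straight into a shell, e.g. `curl ... | bash`.
    "pipe_to_shell": re.compile(r"\b(?:curl|wget)\b[^\n|]*\|\s*(?:ba)?sh\b"),
    # Recursive forced delete of the filesystem root, but not of a subpath.
    "rm_root": re.compile(r"\brm\s+-(?:rf|fr)\s+/(?:\s|$)"),
}


def normalize(text: str) -> str:
    """Undo backslash-newline continuations so a split command matches as one line."""
    return text.replace("\\\n", "")


def scan(text: str) -> List[str]:
    """Return the names of all layers that match, in layer order."""
    flat = normalize(text)
    return [name for name, pattern in LAYERS.items() if pattern.search(flat)]


hits_pipe = scan("curl https://example.com/install.sh | bash")
hits_split = scan("rm -r\\\nf /")          # `rm -rf /` split with a line continuation
hits_clean = scan("rm -rf /tmp/build")      # a subpath, not the root
```

The same input always produces the same list of hits, and adding coverage means adding a layer rather than retraining or re-prompting anything.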

Hot-Path Performance

Workload	Contracts	p50	p99
Synthetic micro-bench (single contract, pre-warmed DFA)	1	0.0052 ms	0.012 ms
ODCV-Bench mandated (1,438 calls, scan-discovered)	6–18	0.139 ms	0.765 ms
RedCode bash (3,848 per-command calls)	7	0.434 ms	0.558 ms
RedCode python (810 whole-script calls)	9	0.811 ms	1.035 ms

Backend-engineer anchor: at 0.139 ms p50 on ODCV mandated, Sponsio's hot path costs about as much as a single local Redis read (typical 0.1–0.5 ms), at the low end of that range.

Against LLM-as-judge guardrails (gpt-4o-mini, Lakera Guard, OpenAI Moderation; all 50–800 ms per check), a pre-warmed single-contract check at roughly 0.01 ms is about 5,000×–60,000× faster, at zero LLM cost on the hot path. Per-call latency scales linearly with contract count; p99 stays under 1.04 ms across every measured workload. The heaviest scenario (9-contract layered regex over a whole RedCode python script, 1.035 ms p99) is still roughly 50× faster than the cheapest LLM-as-judge call.
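The speedup figures follow from simple division, which is easy to reproduce. The sketch below uses only the p99 values from the hot-path table and the quoted 50–800 ms range for LLM-as-judge checks; the dictionary keys are shorthand labels chosen here.

```python
# p99 latencies in milliseconds, copied from the hot-path table.
P99_MS = {
    "micro_bench": 0.012,
    "odcv_mandated": 0.765,
    "redcode_bash": 0.558,
    "redcode_python": 1.035,
}

# Quoted per-check latency range for LLM-as-judge guardrails, in milliseconds.
JUDGE_MS = (50.0, 800.0)


def speedup_range(check_ms: float) -> tuple:
    """Ratio of judge latency to check latency at both ends of the judge range."""
    low, high = JUDGE_MS
    return (low / check_ms, high / check_ms)


micro = speedup_range(P99_MS["micro_bench"])      # single pre-warmed contract
heaviest = speedup_range(P99_MS["redcode_python"])  # 9 contracts, whole script

for name, p99 in P99_MS.items():
    lo, hi = speedup_range(p99)
    print(f"{name}: {lo:,.0f}x to {hi:,.0f}x")
```

On the micro-benchmark the ratio lands in the thousands to tens of thousands; on the heaviest workload the low end is a little under 50×, which is where the "roughly 50×" figure comes from.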

Full per-model breakdown, methodology, harness scripts: docs/reference/benchmarks.md.

Today's numbers are starting points, not ceilings

production traces ──→ sponsio scan ──→ proposed contracts
       ↑                                       │
       │                                       ▼
       └──────── enforcement ←──────── library (versioned)

Today's 84.5% / 92% are starting points, not ceilings. The library grows from your traces and ships back upstream — every new attack pattern, every newly observed unsafe call, feeds the next release.

Contract Library

Sixteen contract bundles ship out of the box, organized by tier (always-on / per-tool / per-incident). Each bundle is a YAML pack composed from Sponsio's 44 det patterns (sto atoms ship in Sponsio Cloud). Drop one into sponsio.yaml and your agent is guarded against a known failure class in one line, with no per-contract authoring. The seven highlighted below are the most commonly used.

Starter bundles

Bundle	Tier	Rules	Who it's for
`sponsio:core/universal`	Always-on	5 sto (Cloud)	Any LLM agent. Response-scoped checks: prompt injection, jailbreak, harm, toxic, semantic PII. Requires a configured judge — managed in Sponsio Cloud, or BYO judge via the OSS `Judge` extension point. Without one, these log-and-skip on OSS.
`sponsio:core/runaway`	Always-on	5 det	Any agent with token use, delegation, or tool loops. The "while(true) with a credit card" defense: token budgets, delegation depth, loop caps.
`sponsio:capability/shell`	Per-tool	11 det	Agents exposing `exec` / `bash`. Catches `rm -rf /`, fork bombs, `curl | bash`, reverse shells, line-continuation evasion. Inspired by Claude Code #10077 (rm -rf $HOME, Oct 2025), the Replit prod-DB wipe (Fortune coverage, Jul 2025), and the Ansible `rm -rf {foo}/{bar}` postmortem on 1,535 servers (Marsala, 2016).
`sponsio:capability/filesystem`	Per-tool	13 det	Agents exposing `read` / `write` / `edit` / `apply_patch`. Sensitive-path denies, workspace scoping, bootstrap-file gates (`CLAUDE.md`, `AGENTS.md`, `.cursorrules`). Inspired by the OpenClaw weather-skill `.env` exfil and the Cursor `.cursorignore` bypass (CVE-2025-64110 / GHSA-vhc2-fjv4-wqch).
`sponsio:incident/openclaw`	Incident	45 mixed	OpenClaw / ClawCode users. Covers CVE-2026-25253 (WebSocket 1-click RCE), ClawHavoc — 1,184 malicious skills on ClawHub (Koi Security disclosure, Feb 2026), the `--yolo` flag, and the weather-skill exfil. A worked example to fork rules from.
`sponsio:incident/cursor-railway-wipe`	Incident	mixed	Replays the PocketOS production-DB wipe (Apr 24, 2026) — Cursor + Claude Opus 4.6 deleted prod + backups in 9 seconds via an over-scoped Railway API token. (Tom's Hardware · Railway's own postmortem) Catches credential-scope abuse + destructive-API gates.
`sponsio:incident/claude-code-secret-bypass`	Incident	mixed	Replays CVE-2025-55284 (overly broad safe-command allowlist → file-read confirmation bypass) and the deny-rule cap bypass (50-subcommand padding silently disables deny rules). Catches secret reads + arg-padding evasion.

# sponsio.yaml — one-line bundle inclusion
agents:
  my_agent:
    workspace: "/srv/my-bot"
    include:
      - sponsio:core/runaway          # always-on
      - sponsio:core/universal        # always-on
      - sponsio:capability/shell      # if your agent runs commands
      - sponsio:capability/filesystem # if your agent touches files

sponsio init auto-selects tier-0 bundles based on your detected tool inventory. You can disable or retune individual rules without forking the pack: the customized: key targets rules by their desc, pack_source, or pattern field, and tool_rename: maps the canonical tool names (exec, read, edit) to your agent's own.

Full bundle reference is at docs/reference/contract-lib.md. The underlying primitives that bundles compose are catalogued separately: 44 det patterns in docs/reference/patterns.md. Sto atoms (LLM-judge evaluators for tone, hallucination, scope drift, etc.) are part of Sponsio Cloud — the OSS engine ships a Judge extension point for bring-your-own-judge use.
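As an illustration of the kind of check the always-on runaway tier describes (token budgets, delegation depth, loop caps), here is a minimal sketch. `RunawayGuard`, its fields, and its thresholds are hypothetical; the real det patterns are documented in docs/reference/patterns.md and may work differently.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RunawayGuard:
    """Hypothetical session-level limits: token budget, delegation depth, loop cap."""
    max_tokens: int
    max_depth: int
    max_repeats: int
    tokens_used: int = 0
    repeats: Counter = field(default_factory=Counter)

    def allow(self, tool: str, args: dict, tokens: int, depth: int) -> bool:
        if depth > self.max_depth:
            return False  # delegation chain is too deep
        if self.tokens_used + tokens > self.max_tokens:
            return False  # call would exceed the token budget
        # Identical tool + arguments repeated too often looks like a loop.
        key = (tool, repr(sorted(args.items())))
        if self.repeats[key] >= self.max_repeats:
            return False
        # Only allowed calls consume budget and count toward the loop cap.
        self.tokens_used += tokens
        self.repeats[key] += 1
        return True


guard = RunawayGuard(max_tokens=100, max_depth=2, max_repeats=2)
results = [
    guard.allow("search", {"q": "x"}, tokens=10, depth=0),   # first call
    guard.allow("search", {"q": "x"}, tokens=10, depth=0),   # second identical call
    guard.allow("search", {"q": "x"}, tokens=10, depth=0),   # third: loop cap
    guard.allow("search", {"q": "y"}, tokens=10, depth=3),   # too deep
    guard.allow("summarize", {}, tokens=90, depth=1),        # over budget (20 + 90)
    guard.allow("summarize", {}, tokens=80, depth=1),        # fits exactly (20 + 80)
]
```

All three limits are counters over the session, so they stay deterministic and cheap to evaluate on every call.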

Want a bundle for your agent type? This is currently the highest-leverage way to contribute. Open an issue with your incident, CVE, or pattern.

Integrations

Pick your framework; each section below gives a drop-in snippet. Python and TypeScript share the same engine and DSL.

No framework — custom tool-calling loop

from sponsio import Sponsio

guard = Sponsio(config="sponsio.yaml", agent_id="bank_bot")

for name, args in agent_calls:
    result = guard.guard_before(name, args)
    if result.blocked:
        continue
    output = tools[name](**args)
    guard.guard_after(name, output)

import { Sponsio } from "@sponsio/sdk";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "bank_bot" });

const result = guard.guardBefore(name, args);
if (!result.blocked) {
  const output = tools[name](args);
  guard.guardAfter(name, output);
}

LangGraph / LangChain.js — wrap tools

from sponsio.langgraph import Sponsio
from langgraph.prebuilt import create_react_agent

guard = Sponsio(config="sponsio.yaml", agent_id="hr_bot")
agent = create_react_agent(llm, guard.wrap(tools))

import { Sponsio } from "@sponsio/sdk";
import { wrapTools } from "@sponsio/sdk/langchain";
import { ToolNode } from "@langchain/langgraph/prebuilt";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "hr_bot" });
const toolNode = new ToolNode(wrapTools(tools, guard));

Claude Agent SDK — native hooks, zero tool wrapping

from sponsio.claude_agent import Sponsio
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions

guard = Sponsio(config="sponsio.yaml", agent_id="support_bot")
options = ClaudeAgentOptions(hooks=guard.hooks())

async with ClaudeSDKClient(options=options) as client:
    await client.query("Refund order #W456.")

import { Sponsio } from "@sponsio/sdk";
import { sponsioHooks } from "@sponsio/sdk/claude-agent";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "support_bot" });
const hooks = sponsioHooks(guard);
// Pass `hooks` to ClaudeSDKClient options.

OpenAI SDK — monkey-patch or explicit wrap

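from openai import OpenAI
from sponsio.openai import Sponsio

client = OpenAI()  # reads OPENAI_API_KEY from the environment
guard = Sponsio(config="sponsio.yaml", agent_id="db_admin")
resp = client.chat.completions.create(...)  # your usual request arguments
guard.check_response(resp)  # run the contract check on the response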

import OpenAI from "openai";
import { Sponsio } from "@sponsio/sdk";
import { wrapOpenAI } from "@sponsio/sdk/openai";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "db_admin" });
const client = wrapOpenAI(new OpenAI(), guard);

For a quick no-YAML wire-up (handy in scripts / notebooks): from sponsio.openai import patch_openai.

OpenAI Agents SDK — wrap Agent tools

from sponsio.agents import Sponsio
from agents import Agent, Runner

guard = Sponsio(config="sponsio.yaml", agent_id="deploy_bot")

agent = Agent(
    name="deploy_bot",
    instructions="Ship v2.1 to production.",
    tools=guard.wrap([run_tests, deploy_staging, deploy_production]),
)

result = Runner.run_sync(agent, "Deploy v2.1 now.")

TypeScript: not yet supported.

Google ADK — wrap Agent tools (Gemini)

from sponsio.google_adk import Sponsio
from google.adk.agents.llm_agent import Agent

guard = Sponsio(config="sponsio.yaml", agent_id="travel_agent")

root_agent = Agent(
    name="travel_agent",
    model="gemini-flash-latest",
    instruction="Search before booking. Charge only once.",
    tools=guard.wrap([search_flights, book_flight, charge_payment]),
)

import { Sponsio } from "@sponsio/sdk";
import { wrapGoogleAdkTools } from "@sponsio/sdk/google-adk";
import { LlmAgent } from "@google/adk";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "travel_agent" });
const tools = wrapGoogleAdkTools([searchFlights, bookFlight, chargePayment], guard);
export const rootAgent = new LlmAgent({ name: "travel_agent", tools, model: "gemini-flash-latest" });

Vercel AI SDK — middleware

from sponsio.vercel_ai import Sponsio

guard = Sponsio(config="sponsio.yaml", agent_id="publish_bot")

async for msg in agent.run(model, messages, middleware=[guard.wrap()]):
    ...

import { Sponsio } from "@sponsio/sdk";
import { sponsioMiddleware } from "@sponsio/sdk/vercel-ai";

const guard = new Sponsio({ config: "sponsio.yaml", agentId: "publish_bot" });
const middleware = sponsioMiddleware(guard);

CrewAI — Crew-level hooks

from sponsio.crewai import Sponsio
from crewai import Agent, Crew, Task

guard = Sponsio(config="sponsio.yaml", agent_id="moderator")

crew = Crew(
    agents=[agent],
    tasks=[task],
    before_tool_call=guard.on_tool_start,
    after_tool_call=guard.on_tool_end,
)
result = crew.kickoff()

TypeScript: not yet supported.

MCP — proxy the MCP client

from sponsio.mcp import MCPContractProxy

# Build a sponsio System from your contracts — see runnable example for full wire-up.
proxy = MCPContractProxy(mcp_client=your_mcp_client, system=system)

# Use `proxy` wherever you called the raw MCP client; contracts apply transparently.
result = await proxy.call_tool("write_external_api", {"data": "batch_1"})

TypeScript: not yet supported.

Note on the snippets above. All examples assume you've run sponsio init . first, which walks the wizard, generates a sponsio.yaml with a starter contract set inferred from your tool inventory, and prints the wrap snippet to paste. To populate the YAML differently — pattern-library bundle, hand-written rules, natural-language one-liners, or parsed from a policy doc (sponsio scan --policy security.md) — see Contract types and authoring and docs/concepts/contracts.md for full syntax.

Docs

AI agents reading this repo: llms.txt lists canonical doc paths; llms-full.txt is the concatenated full context dump.

Security

Sponsio enforces runtime contracts, so its own correctness matters. Found something? Report privately via GitHub's security advisory form rather than a public issue. See SECURITY.md for scope, timelines, and what counts as in-scope (enforce-mode bypasses, LTL-evaluator crashes, session-log leakage, judge-prompt injection, etc.).

Contributing

Patches, issue reports, and new pattern proposals are welcome. Start with CONTRIBUTING.md.

Important notes

Sponsio enforces runtime contracts that you define — it does not certify your application's compliance with any regulatory framework. If you operate in regulated domains (HIPAA, GDPR, SOX, EU AI Act, financial services, healthcare), Sponsio's controls and our OWASP Agentic Top 10 mapping are inputs to your compliance program. They are not substitutes for qualified security audit, legal review, or domain-specific regulatory analysis. Author your contracts with appropriate review and revisit them when your agent's tool surface changes.

Det contracts give you machine-checkable enforcement at the action boundary. They do not protect against vulnerabilities upstream of Sponsio (compromised LLM provider, malicious tools you've allowlisted, infrastructure-layer risks like transport encryption / SBOM provenance). See SECURITY.md for the full scope.

License & open source promise

Apache 2.0 — see LICENSE.

Sponsio Labs is a commercial company; Sponsio Cloud (pip install sponsio[cloud]) opens mid-May 2026 and adds the managed LLM-judge pipeline, cross-customer pattern intelligence, and a hosted multi-tenant dashboard. The OSS engine is complete and production-ready for self-hosted use — see OSS_PROMISE.md for what stays in OSS forever, what we sell, and what we promise about the boundary.

Sponsio™ is a trademark of Sponsio Labs — see BRAND.md.

Release history

0.1.1: May 23, 2026
0.1.0: May 6, 2026 (this version)
0.1.0a3 (pre-release): May 3, 2026
