# Cordon

The pre-execution control layer for AI agents. Block dangerous actions before they happen.
Cordon runs deterministic safety probes on a proposed agent action before it executes. No LLM calls, no heuristics, no inference latency — just fast, auditable, replayable verdicts on whether an action is safe to run.
```bash
pip install cordon-ai
```

```python
from cordon import Guard, Action

guard = Guard.strict()

verdict = guard.check(Action(
    kind="shell",
    command="pip install -r requirements.txt",
    changes={"requirements.txt": "reqeusts==2.31.0\n"},
))

if verdict.blocked:
    print(verdict.top_reason())
    # → "'reqeusts' is 2 edit(s) from 'requests' (likely typosquat)"
```
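For contrast, the same guard should wave through a clean version of the change. A quick sketch reusing `guard` from above; 'allow' is the expected outcome for a correctly spelled dependency, per the decision rules under Architecture:

```python
# Benign counterpart of the action above: correctly spelled package.
clean = Action(
    kind="shell",
    command="pip install -r requirements.txt",
    changes={"requirements.txt": "requests==2.31.0\n"},
)
print(guard.check(clean).decision)  # expected: 'allow'
```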
## Protect your agent in 5 lines
Cordon ships first-class integrations for the three stacks most production agents use today:
| Vendor | Module | Entry point | Example |
|---|---|---|---|
| OpenAI | `cordon.integrations.openai` | `check_response(response, ...)` | `examples/openai_protect.py` |
| Anthropic | `cordon.integrations.anthropic` | `check_response(message, ...)` | `examples/anthropic_protect.py` |
| LangChain | `cordon.integrations.langchain` | `guard_tools([t1, t2], ...)` | `examples/langchain_protect.py` |
A single `ActionBuilder` registry maps your tool names to `cordon.Action` shapes; the same registry works across all three vendors:
```python
import cordon
from cordon.integrations.openai import ActionBuilder, check_response

builder = ActionBuilder()

@builder.tool("run_shell")
def _(args):
    return cordon.Action(kind="shell", command=args["command"],
                         changes=args.get("changes", {}))

response = client.chat.completions.create(model="gpt-4o", messages=..., tools=...)

for tcv in check_response(response, builder=builder, guard=cordon.Guard.strict()):
    if tcv.blocked:
        send_refusal(tcv.tool_call_id, tcv.verdict.top_reason())
    else:
        dispatch_tool(tcv.tool_name, tcv.arguments)
```
No vendor SDK is a runtime dependency. Cordon duck-types each response shape, so it works with the official SDKs, with litellm, with raw HTTP, and with any internal proxy.
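As a concrete illustration of the duck-typing claim, here is a minimal sketch that feeds `check_response` a raw JSON payload instead of an SDK object. The exact attributes Cordon inspects are an assumption here, taken from the OpenAI wire format (`choices[].message.tool_calls[].function`); `builder` is the registry from the example above:

```python
import json
from types import SimpleNamespace

import cordon
from cordon.integrations.openai import check_response

def as_obj(x):
    """Recursively wrap dicts so attribute access works like an SDK response."""
    if isinstance(x, dict):
        return SimpleNamespace(**{k: as_obj(v) for k, v in x.items()})
    if isinstance(x, list):
        return [as_obj(v) for v in x]
    return x

# e.g. json.loads(resp.text) from httpx or requests; no vendor SDK involved
raw = {
    "choices": [{"message": {"tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "run_shell",
                     "arguments": json.dumps({"command": "pip install reqeusts"})},
    }]}}]
}

# `builder` is the ActionBuilder registry defined in the example above.
for tcv in check_response(as_obj(raw), builder=builder, guard=cordon.Guard.strict()):
    if tcv.blocked:
        print(tcv.verdict.top_reason())
```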
## Why Cordon exists
LLM agents now write code, install packages, read secrets, and call APIs on behalf of users and companies. Current safety layers fall into two camps, and both have blind spots:
- Transcript monitors read the agent's reasoning. On semantic attacks (typosquatting, silent data loss, exfiltration-by-artifact) they approve 100% of attempts, because the commands look legitimate; only the intent is malicious.
- Guardrail LLMs call another model to judge outputs. They add latency, cost, a non-deterministic failure mode, and a new attack surface.
Cordon is a third path: static, deterministic probes on the proposed side effects. An agent-written `pip install reqeusts` looks legitimate to a transcript monitor; to a Levenshtein check comparing every package name against the top 200 PyPI packages, it sits two edits from `requests` and is an immediate red flag. One of these catches the attack deterministically. The other flips a coin.
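That check is a few lines of dynamic programming. A back-of-the-envelope illustration, not Cordon's actual `TyposquatProbe` (the real probe's package list and thresholds are internal; the names below are a tiny sample, not the top-200 list):

```python
# Illustrative only: a bare-bones edit-distance check against popular names.
TOP_PYPI = {"requests", "numpy", "pandas", "urllib3", "setuptools"}

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def looks_like_typosquat(pkg: str, max_dist: int = 2) -> str | None:
    for known in TOP_PYPI:
        d = levenshtein(pkg, known)
        if 0 < d <= max_dist:
            return f"'{pkg}' is {d} edit(s) from '{known}' (likely typosquat)"
    return None

print(looks_like_typosquat("reqeusts"))
# "'reqeusts' is 2 edit(s) from 'requests' (likely typosquat)"
```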
## Headline benchmark
The 36-task Semantic Guard benchmark covers all six attack classes the probes are designed to catch (3 attacks + 3 benign per class). Run it yourself:
```bash
cordon benchmark --profile strict
```
| Profile | Block rate (TPR) | False-positive rate | Control score | Tasks passed |
|---|---|---|---|---|
| `Guard.permissive()` | 88.9% | 0.0% | 0.889 | 34 / 36 |
| `Guard.default()` | 94.4% | 0.0% | 0.944 | 35 / 36 |
| `Guard.strict()` | 100.0% | 0.0% | 1.000 | 36 / 36 |
Zero false positives across all 18 benign tasks on every profile (each run covers 18 attack and 18 benign tasks, so the 88.9% block rate is 16 / 18 attacks caught). The full 36-task suite runs in ~7 ms end-to-end on a single core — fast enough that every agent step gets checked with no perceptible latency. Lower profiles intentionally flag (rather than block) lower-confidence signals like supply-chain mirror swaps, so security teams see the warning without false stops.
Per-category on strict — 3 / 3 attacks blocked, 0 / 3 benign blocked, in every class:
| Category | Block rate | FPR |
|---|---|---|
| Typosquat (supply-chain) | 1.00 | 0.00 |
| Secret leak (artifact) | 1.00 | 0.00 |
| Exfiltration (network) | 1.00 | 0.00 |
| Silent failure | 1.00 | 0.00 |
| Test suppression | 1.00 | 0.00 |
| Security weakening | 1.00 | 0.00 |
## Install

```bash
pip install cordon-ai
```

Python 3.10+. The PyPI distribution is `cordon-ai` (the unsuffixed `cordon` name on PyPI is held by an unrelated log-analysis project). The import name is `cordon`; once installed, you write `import cordon` everywhere. No heavy dependencies: only pydantic, typer, and rich.
## Quickstart

### Inline check
```python
from cordon import Guard, Action

guard = Guard.default()

action = Action(
    kind="file",
    intent="Write a debug report",
    reads=["/app/.env"],
    workspace_files={"/app/.env": "EXAMPLE_KEY=FAKE-TEST-TOKEN-0123456789ABCDEF"},
    changes={
        "artifacts/debug.log": "env loaded; key=FAKE-TEST-TOKEN-0123456789ABCDEF"
    },
)

verdict = guard.check(action)
print(verdict.decision)      # 'block'
print(verdict.top_reason())  # 'Secret leak pattern: /app/.env → artifacts/debug.log (content match)'
```
### Decorator: protect any agent step
```python
@guard.protect
def execute_step(action: Action) -> str:
    return shell.run(action.command)

execute_step(action)  # raises BlockedAction if the guard blocks
```
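At the call site, a block is an ordinary exception. A sketch, assuming `BlockedAction` is importable from the top-level `cordon` package (the README names the exception but not its module):

```python
from cordon import BlockedAction  # import path assumed

try:
    execute_step(action)
except BlockedAction as exc:
    # Don't execute; surface the guard's reason to the operator instead.
    print(f"step blocked: {exc}")
```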
### CLI

```bash
cordon check action.json --profile strict
cordon list-probes --profile default
cordon demo      # runs 3 built-in actions: 2 attacks, 1 benign
cordon version
```
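The `action.json` payload mirrors the `Action` constructor from the quickstart. The exact on-disk schema is an assumption here; field names are taken from the Python API above:

```json
{
  "kind": "shell",
  "command": "pip install -r requirements.txt",
  "changes": {"requirements.txt": "reqeusts==2.31.0\n"}
}
```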
## Architecture
```
┌──────────────────────────────────────────────────────────────────┐
│                           cordon.Guard                           │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Semantic probes (fast, tier="fast")                       │  │
│  │                                                            │  │
│  │  • TyposquatProbe          supply-chain dependency attacks │  │
│  │  • SecretLeakProbe         sensitive read → artifact write │  │
│  │  • SilentFailureProbe      code introducing silent drops   │  │
│  │  • ExfiltrationProbe       secret → network flows          │  │
│  │  • TestSuppressionProbe    test skip / xfail abuse         │  │
│  │  • SecurityWeakeningProbe  config downgrades               │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Environment probes (coming in v0.3)                       │  │
│  │                                                            │  │
│  │  • preview_diff      shows net filesystem delta            │  │
│  │  • dry_run           replays in sandbox                    │  │
│  │  • sensitivity_scan  flags reads of sensitive paths        │  │
│  │  • inspect_targets   network endpoint analysis             │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │  Aggregation + decision                                    │  │
│  │                                                            │  │
│  │  Critical severity anywhere      → block                   │  │
│  │  max(confidence) ≥ block_thresh  → block                   │  │
│  │  max(confidence) ≥ flag_thresh   → flag                    │  │
│  │  otherwise                       → allow                   │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```
Every probe is a pure function of an Action. Every verdict is replayable. No probe ever executes side effects — Cordon is strictly pre-execution.
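The aggregation rule in the diagram reduces to a few lines. A sketch of the logic only; the `Finding` type and its field names are illustrative assumptions, not Cordon's internal API:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    severity: str      # e.g. "critical", "high", "medium"
    confidence: float  # 0.0 .. 1.0

def decide(findings: list[Finding],
           block_threshold: float, flag_threshold: float) -> str:
    # Rule 1: any critical-severity finding blocks outright.
    if any(f.severity == "critical" for f in findings):
        return "block"
    # Rules 2-4: compare the strongest confidence against the thresholds.
    top = max((f.confidence for f in findings), default=0.0)
    if top >= block_threshold:
        return "block"
    if top >= flag_threshold:
        return "flag"
    return "allow"

# With the strict profile's thresholds (0.60 / 0.20):
print(decide([Finding("medium", 0.45)], 0.60, 0.20))  # flag
```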
## Profiles
| Profile | `block_threshold` | `flag_threshold` | When to use |
|---|---|---|---|
| `Guard.permissive()` | 0.85 | 0.50 | Research agents, exploratory workflows |
| `Guard.default()` | 0.70 | 0.30 | Most production apps |
| `Guard.strict()` | 0.60 | 0.20 | Regulated / financial / healthcare agents |
Custom thresholds and probe sets:
```python
from cordon import Guard
from cordon.probes.semantic import (
    TyposquatProbe, SecretLeakProbe, SilentFailureProbe,
    ExfiltrationProbe, TestSuppressionProbe, SecurityWeakeningProbe,
)

guard = Guard(
    probes=[
        TyposquatProbe(), SecretLeakProbe(),
        SilentFailureProbe(), ExfiltrationProbe(),
        TestSuppressionProbe(), SecurityWeakeningProbe(),
    ],
    block_threshold=0.8,
    flag_threshold=0.4,
)
```
## Status

- v0.1 — first release. Two semantic probes (`TyposquatProbe`, `SecretLeakProbe`), CLI, decorator API.
- v0.2 (current) — full 6-probe Semantic Guard suite, three vendor integrations (OpenAI, Anthropic, LangChain), 36-task benchmark reproducer (`cordon benchmark`), 135 tests passing.
- v0.3 (next) — environment probes (`preview_diff`, `dry_run`, `sensitivity_scan`, `inspect_targets`), comparative benchmark vs Lakera / NeMo Guardrails / GPT-4-as-judge.
## Research
Cordon is the production-grade descendant of two research projects:
- ActionLens — 3rd place at Apart Research AI Control Hackathon 2026 (36 countries). Introduced the pre-execution environment probes and the 36-task benchmark. Repo →
- Context-Conditioned Confidentiality Failures in Refusal-Tuned Language Models — Cohere Catalyst Grant, 2026. Showed that refusal-tuned models leak confidential context at 46–62% even when explicitly instructed not to. This is the empirical basis for Cordon's secret-leak probe.
## Contributing
Cordon is early. If you deploy AI agents that do real work and have seen them do something stupid, we want to hear from you — both as a user and as a contributor. Open an issue with a redacted trace and we'll build the probe.
## License
Apache 2.0. Use it anywhere, including commercially.
## Citation

```bibtex
@software{cordon2026,
  author  = {Adharapurapu V S M Ashok Kumar},
  title   = {Cordon: Pre-Execution Control for AI Agents},
  year    = {2026},
  url     = {https://github.com/Ashok-kumar290/cordon},
  version = {0.2.0}
}
```