The deterministic merge gate for AI-generated agent capability changes. Agent release readiness for tool-using AI agents. CLI + GitHub Action. Scans MCP, OpenAPI, OpenAI Agents SDK, Anthropic, Google ADK, LangChain, CrewAI, OpenAI API, Codex config, Codex plugin, n8n.

Agents Shipgate

Your coding agent changed what your AI agent can do — Agents Shipgate tells you whether it can merge.

The deterministic merge gate for AI-generated agent capability changes.

Local-first and static by default — no agent execution, tool calls, LLM calls, or network access.

Important. Status: pre-1.0 (beta). The decision engine is deterministic and stable. First real-history accuracy numbers (small n, published in full in benchmark/miner/README.md): across 361 merged PRs mined from 9 real agent repos, 336 (93%) organically skip the trigger. Of the 10 PRs in the 2026-W26 toolkit corpus that the gate engaged on, it never wrongly passed an authority-bearing change (2/2 held for a human, zero benign escalations), but it also never cleanly passed a safe one: 4/8 safe PRs returned insufficient_evidence (the dynamic-toolkit gap, the active fix) and 4/8 hit a since-chipped scan crash. Labels are AI-adjudicated (disagreement 0/10), pending human spot-check. On heavily dynamic tool surfaces Shipgate deliberately returns insufficient_evidence rather than guess. Treat it as an advisory gate until this work closes; see ROADMAP.md.

60 seconds: watch it block two PRs

Claude Code adds stripe.create_refund to your support agent and opens a PR. The diff looks fine to a human skimming it. Should it merge?

uvx agents-shipgate fixture run ai_generated_refund_pr

→ merge_verdict: blocked — the new refund capability has no declared approval policy and no idempotency evidence. The verifier explains both blockers and routes the PR to a human.

Now the move every reviewer fears — the agent deletes the Shipgate CI gate to make its PR pass:

uvx agents-shipgate fixture run agent_weakens_gate

→ merge_verdict: blocked, can_merge_without_human: false. The gate-removal checks are suppression-immune: the cheapest reward-hack is also the most visible one.

…and here's the failure mode. These two cases are constructed fixtures with a clear-cut answer, chosen to show the gate working. Real PRs are messier: when a change builds its tool surface dynamically — a toolkit factory, a config-bound allowlist, tools assembled at runtime — static extraction often can't enumerate the result, and Shipgate returns insufficient_evidence and routes to a human rather than emit a confident wrong verdict. That is the intended failure mode, not a bug; reducing how often it fires on real dynamic code is active work (see ROADMAP.md).
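The gap between an enumerable tool surface and a dynamic one can be illustrated with a minimal sketch. This is not Shipgate's extractor: the `TOOLS` variable name, the `extract_tool_names` helper, and the `enumerated` label are assumptions made for illustration, and only `insufficient_evidence` comes from the text above.

```python
import ast


def extract_tool_names(source: str, variable: str = "TOOLS"):
    """Statically read a tool list without executing the code.

    Returns (names, status). A literal list of strings can be enumerated;
    anything built at runtime (factory call, comprehension, config lookup)
    yields "insufficient_evidence" instead of a guess.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == variable:
                value = node.value
                is_literal = isinstance(value, (ast.List, ast.Tuple)) and all(
                    isinstance(elt, ast.Constant) and isinstance(elt.value, str)
                    for elt in value.elts
                )
                if is_literal:
                    return [elt.value for elt in value.elts], "enumerated"
                # Hypothetical dynamic surface: refuse to guess.
                return [], "insufficient_evidence"
    # No declaration found at all: also not enough evidence.
    return [], "insufficient_evidence"
```

A literal assignment such as `TOOLS = ["stripe.create_refund"]` is enumerated, while `TOOLS = build_toolkit(config)` is reported as `insufficient_evidence` because its result only exists at runtime.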

One engine decides (report.json.release_decision.decision); everything else — merge_verdict, PR comments, Check Runs, Action outputs — is a deterministic projection of it. Five-minute version: docs/mental-model.md.

Agents Shipgate is an open-source CLI and GitHub Action for local-first, static Tool-Use Readiness review. It scans MCP, OpenAPI, OpenAI Agents SDK, Anthropic Messages API, Google ADK, LangChain/LangGraph, CrewAI, OpenAI API, Codex repo config, Codex plugin, and n8n artifacts, then writes a deterministic Tool-Use Readiness Report before your agent gets production-like permissions.

Within agent release readiness, Agents Shipgate's wedge is Tool-Use Readiness: the tool surface, schemas, scopes, approval policies, idempotency, and blast radius reviewed at PR time.

Website: threemoonslab.com — quickstart, glossary, check catalog, and design partners.

Static-by-default — no agent execution, no LLM calls, no MCP server connections, no scanner network calls, no scanner telemetry. Audited exceptions are pinned in tests/test_adapter_static_only.py::ALLOWED_EXCEPTIONS. Apache-2.0.

What your PR sees

When a PR changes what your agent can do, the GitHub Action posts the merge verdict as a PR comment. This is the comment for the first demo PR above — the coding-agent diff that adds stripe.create_refund to a support agent (abridged from the verbatim pr-comment.md artifact):

Agents Shipgate result: block

Decision: block · Risk: critical · Required reviewers: agent-platform, security

Impact	Change	Subject	Why
blocks release	action added	`stripe.create_refund`	Capability added.
blocks release	action broadened	`stripe.create_refund`	high-risk effect financial_action added
blocks release	scope broadened	`stripe.create_refund:stripe:*`	scope added

Required before merge — Actor: Human (human authority required — a coding agent must not self-resolve):

Declare an approval policy for stripe.create_refund or remove this tool from the release.

Declare approval.required, safeguards.audit_log, and safeguards.idempotency for this financial write action.

Replace wildcard/admin scopes with operation-specific scopes.

Then re-verify: agents-shipgate verify --base origin/main --head HEAD --json

The same uvx agents-shipgate fixture run ai_generated_refund_pr command above writes this comment verbatim to reports/pr-comment.md.

Verify-first quickstart

Install once:

pipx install agents-shipgate

Then start from one of three prominent flows.

Local Boundary Check

Coding agents run shipgate check before reporting an agent-capability change complete. Parse the stdout shipgate.codex_boundary_result/v1 object:

shipgate check --agent codex --workspace . --format codex-boundary-json
shipgate check --agent claude-code --workspace . --format codex-boundary-json
shipgate check --agent cursor --workspace . --format codex-boundary-json

Switch on decision, completion_allowed, must_stop, first_next_action, human_review, repair, policy, and verify_required; never infer a decision from prose. shipgate check is necessary but not sufficient for capability-expanding diffs: if a change adds dynamic, undeclared, or otherwise ambiguous tool capability, do not treat decision="allow" as merge readiness; run agents-shipgate verify and read release_decision.decision.
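As a rough sketch of "switch on fields, never on prose", a coding agent could gate its own completion report as follows. The field names come from the list above; treating `completion_allowed`, `must_stop`, and `verify_required` as booleans, and failing closed when a field is missing, are assumptions rather than documented contract. The `completion_permitted` helper is hypothetical.

```python
import json


def completion_permitted(stdout_text: str) -> bool:
    """Decide from the boundary JSON whether work may be reported complete."""
    result = json.loads(stdout_text)
    # Fail closed: a missing field is treated as the restrictive value.
    if result.get("must_stop", True):
        return False
    if not result.get("completion_allowed", False):
        return False
    if result.get("verify_required", True):
        # The full verifier still has to run before claiming merge readiness.
        return False
    return result.get("decision") == "allow"
```

When this returns False because `verify_required` is set, the next step is `agents-shipgate verify`, not another attempt to reinterpret the check output.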

PR And Local Verification

When a PR changes what your agent can do, run the deterministic verifier on the diff and read its merge verdict before you merge. For committed PR/CI refs, make the base ref available first because verify never fetches:

agents-shipgate verify --workspace . --config shipgate.yaml \
  --ci-mode advisory --format json --base origin/main --head HEAD

For local, uncommitted work, omit --base/--head so your working-tree edits are scanned instead:

agents-shipgate verify --workspace . --config shipgate.yaml \
  --ci-mode advisory --format json

If a repo is not configured yet, use the verify flow's preview entry point:

agents-shipgate verify --preview --json

The short shipgate verify alias remains invokable for compatibility, but agent-facing PR-gate guidance uses agents-shipgate verify.

Host-Grant Audit

Before changing local MCP servers, Codex/Claude/Cursor permission rules, hooks, workflow scopes, or other host grants, capture the host inventory:

shipgate audit --host --json --out agents-shipgate-reports/host-grants.json

The release gate is agents-shipgate-reports/report.json → release_decision.decision (blocked | review_required | insufficient_evidence | passed). The PR/controller surface is agents-shipgate-reports/verifier.json → merge_verdict (mergeable | human_review_required | insufficient_evidence | blocked | unknown), a deterministic projection of the release decision. Read agent-handoff.json first (gate.merge_verdict, then controller), then the authoritative controller substrate verifier.json for merge_verdict, applicability, agent_controller, can_merge_without_human, first_next_action, and fix_task. capability_review.top_changes is supporting/provisional reviewer context.

Zero-setup demos of both blocked verdicts are in the 60-second walkthrough above; uvx runs them with no persistent install. To upgrade the CLI, use pipx upgrade agents-shipgate; a plain pipx install is a no-op when an older build is already installed. Your agent project does not need Python 3.12; the CLI installs separately. To verify your own repo and write the standard agents-shipgate-reports/ directory, see Verify your repo below.

Sample Tool-Use Readiness Report showing 2 critical, 14 high, and 2 medium findings on the support_refund_agent fixture, including a missing approval policy on stripe.create_refund.

How to read your first result

For PR verification, read agent-handoff.json.gate.merge_verdict first:

Merge verdict	Meaning	Next step
`blocked`	Active, unaccepted blockers exist.	Fix blockers or remove the risky capability.
`insufficient_evidence`	Static evidence is too weak to gate release confidently.	Add better sources and rerun; do not auto-merge.
`human_review_required`	A person must review accepted debt, trust-root changes, or authority-bearing gaps.	Surface the required review; a coding agent must not self-approve it.
`mergeable`	No active blocker or review signal was found.	Keep verifier/report artifacts with the PR record.
`unknown`	Verify could not produce a reliable head scan or diff context.	Fix setup, fetch the base ref, or rerun with usable inputs.
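The table above translates directly into a lookup. This sketch assumes the handoff object nests the verdict at `gate.merge_verdict`, as the read path suggests; the `next_step` helper and the choice to treat a missing or unrecognized verdict as `unknown` are illustrative assumptions.

```python
NEXT_STEP = {
    "blocked": "Fix blockers or remove the risky capability.",
    "insufficient_evidence": "Add better sources and rerun; do not auto-merge.",
    "human_review_required": "Surface the required review; a coding agent must not self-approve it.",
    "mergeable": "Keep verifier/report artifacts with the PR record.",
    "unknown": "Fix setup, fetch the base ref, or rerun with usable inputs.",
}


def next_step(handoff: dict) -> str:
    """Map agent-handoff.json content to the documented next step."""
    verdict = handoff.get("gate", {}).get("merge_verdict", "unknown")
    # Anything outside the documented set is handled like "unknown".
    return NEXT_STEP.get(verdict, NEXT_STEP["unknown"])
```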

Then read report.json.release_decision.decision, the source-of-truth gate:

Decision	Meaning	Next step
`blocked`	Active, unaccepted blockers exist.	Fix the blockers or remove the risky tool surface.
`insufficient_evidence`	The scan cannot confidently gate release from the available static evidence. This does not prove the agent is unsafe.	Provide clearer sources such as an MCP export, OpenAPI spec, explicit local tool inventory, or broader OpenAI SDK source path, then rerun.
`review_required`	Human review is needed, often for accepted debt or evidence gaps below the blocked threshold.	Review the listed items before promotion.
`passed`	No active blocker or review signal was found.	Keep the report artifact with the PR/release record.

Common review signals include missing confirmation, missing idempotency evidence, broad-scope permissions, prohibited-action policy gaps, and trust-root changes such as weakened CI or manifest policy.

Not sure if Shipgate applies?

Run the zero-install detector from the repo you are reviewing. It is a stdlib-only first touch for engineers and coding agents that need a yes/no relevance signal before installing anything:

curl -sSL https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/tools/shipgate-detect.py \
  | python3 - --workspace . --json

Continue to Verify your repo when the output has is_agent_project: true, non-empty suggested_sources, non-empty codex_plugin_candidates, or the workspace already has shipgate.yaml.
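The continue condition can be written as a small predicate over the detector's JSON output. The key names are the ones listed above; the `should_continue` helper is hypothetical, and it assumes `suggested_sources` and `codex_plugin_candidates` are lists.

```python
from pathlib import Path


def should_continue(detect_output: dict, workspace: Path) -> bool:
    """Return True when any documented relevance signal is present."""
    if detect_output.get("is_agent_project") is True:
        return True
    if detect_output.get("suggested_sources"):
        return True
    if detect_output.get("codex_plugin_candidates"):
        return True
    # An existing manifest means the repo has already adopted Shipgate.
    return (workspace / "shipgate.yaml").is_file()
```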

Sample reports

Open a report first if you want to see the output shape before installing:

Sample	Markdown	JSON
`support_refund_agent`	`report.md`	`report.json`
`simple_openai_api_agent`	`report.md`	`report.json`
`simple_langchain_agent`	`report.md`	`report.json`

The support_refund_agent fixture also includes a reviewer-shaped Release Evidence Packet in packet.md, packet.json, and packet.html.

Copy this into your coding agent

Add a Tool-Use Readiness release gate for this tool-using AI agent with Agents Shipgate.
Use only the prominent Shipgate flows as first-look commands:
shipgate check --agent codex --workspace . --format codex-boundary-json
shipgate check --agent claude-code --workspace . --format codex-boundary-json
shipgate check --agent cursor --workspace . --format codex-boundary-json
agents-shipgate verify --workspace . --config shipgate.yaml --ci-mode advisory --format json
agents-shipgate verify --workspace . --config shipgate.yaml \
  --base origin/main --head HEAD --ci-mode advisory --format json
shipgate audit --host --json --out agents-shipgate-reports/host-grants.json
For local control, parse the `shipgate check` stdout JSON
(`shipgate.codex_boundary_result/v1`): switch on `decision`,
`completion_allowed`, `must_stop`, `first_next_action`, `human_review`,
`repair`, `policy`, and `verify_required`. For local uncommitted verify work,
omit `--base`/`--head`. For committed PR/CI refs,
make the base ref available first because `verify` never fetches. Read
`agents-shipgate-reports/agent-handoff.json` first and lead with
`gate.merge_verdict` and `controller`, then read the authoritative substrate
`agents-shipgate-reports/verifier.json` (`merge_verdict`, `applicability`,
`agent_controller`, `can_merge_without_human`, `first_next_action`,
`fix_task`), then supporting/provisional `capability_review.top_changes` and
`agents-shipgate-reports/report.json` for `release_decision.decision`. Do not
claim completion when `merge_verdict` is `blocked`, `insufficient_evidence`, or
`human_review_required` unless the user explicitly accepts human review. Do not auto-assert approval. Do not auto-assert confirmation, idempotency,
broad-scope safety, prohibited-action enforcement, runtime-trace proof,
suppressions, waivers, baselines, or policy weakening. Never remove Shipgate CI
or weaken agent instructions just to make the verifier pass.

Use with your coding agent

Claude Code — two commands wire the full surface:

pipx install agents-shipgate
agents-shipgate init --workspace . --write --claude-code

init --claude-code writes the CLAUDE.md managed block, the auto-discoverable .claude/skills/agents-shipgate/ skill, and the Claude Code hooks: a cheap trigger check after Edit|Write|MultiEdit and the full verifier at Stop, so capability changes are re-checked before the agent reports work complete — even on long sessions where instruction files lose attention. CI stays authoritative; the hooks are the local feedback loop. Inside Claude Code, agent mode auto-enables, so a zero-flag agents-shipgate verify prints the compact agent result. Slash command, skill internals, and manual paths: docs/agents/use-with-claude-code.md.

Prefer a plugin over a committed kit? This repo is also a Claude Code plugin marketplace — the skill-only symmetric counterpart of the Codex plugin below (workflows, not the scanner binary; install the CLI separately):

/plugin marketplace add ThreeMoonsLab/agents-shipgate
/plugin install agents-shipgate@agents-shipgate

The plugin ships the auto-triggering agents-shipgate skill and the /agents-shipgate:shipgate command (plugin commands are namespaced). It does not ship hooks — install those explicitly with agents-shipgate install-hooks --target claude-code --write, which requires the CLI on PATH.

Codex — install the skill-only plugin from this repo's marketplace, or write the repo-scoped kit directly:

codex plugin marketplace add ThreeMoonsLab/agents-shipgate   # plugin path
agents-shipgate init --workspace . --write --agent-instructions=agents-md,codex-skill  # committed path

Then invoke $agents-shipgate in a fresh thread. The plugin supplies workflows, not the scanner binary — install the CLI (pipx install agents-shipgate && pipx upgrade agents-shipgate) where Codex runs commands and require contract v9 or newer. Marketplace details, kit overrides, and the beta-migration steps: docs/agents/use-with-codex.md.

Cursor — init --agent-instructions=cursor writes the auto-attach rule; see docs/agents/use-with-cursor.md.

Who this is for

Agent builders — review MCP, OpenAPI, and SDK tool definitions before merging changes that expand the tool surface.
Platform teams — add release gates for approval, scope, idempotency, and baseline drift to PR review.
Security and GRC reviewers — get static release evidence without running agents or importing user code.

Use this when

Run Agents Shipgate when a PR adds or changes agent tool surfaces or the policy evidence around them:

MCP exports, OpenAPI specs, or local tool inventories.
OpenAI Agents SDK, Google ADK, LangChain/LangGraph, CrewAI, Anthropic Messages API, or OpenAI API artifact tool definitions.
Codex repo config such as .codex/config.toml or .codex/hooks.json.
Prompts, permission scopes, approval policies, confirmation policies, prohibited actions, or shipgate.yaml.
GitHub Actions or CI release gates for a tool-using AI agent.

Verify your repo

agents-shipgate verify --workspace . --config shipgate.yaml \
  --base origin/main --head HEAD --ci-mode advisory --format json

For local uncommitted work, omit --base/--head. For committed PR/CI refs, make the base ref available first because verify never fetches. Verify writes agents-shipgate-reports/agent-handoff.json, verifier.json, verify-run.json, pr-comment.md, the head capability lock, and the normal report.{md,json,sarif} / packet artifacts when a scan is required. If the base scan can be materialized, verify also writes base.capabilities.lock.json plus capability-lock-diff.{json,md}, and the PR comment includes a compact semantic capability diff summary. Lead with merge_verdict, applicability, agent_controller, can_merge_without_human, first_next_action, and fix_task; use release_decision.decision as the release gate. Capability diff summaries and capability_review.top_changes are supporting/provisional review context. Legacy agent_result_v1 / agent-result.json compatibility surfaces are supporting/provisional projections, not the CI gate or verifier read path.

Install alternatives (your agent project does not need Python 3.12 — install the CLI separately):

python -m pip install -U --pre agents-shipgate       # global pip
uv tool install --upgrade agents-shipgate            # via uv
agents-shipgate contract --json                      # require contract_version >= 9

Adopt in one turn (scan helper)

The verifier-first loop above is the product entry path. For a scan-oriented first adoption pass, agents-shipgate bootstrap runs all four steps in one command, or run them individually:

agents-shipgate detect --json                                          # 1. classify
agents-shipgate init --write --ci --json                               # 2. manifest + workflow
agents-shipgate scan -c shipgate.yaml --suggest-patches --format json  # 3. scan + suggest
agents-shipgate apply-patches --from agents-shipgate-reports/report.json \
    --confidence high --apply                                          # 4. apply safe trivial fixes

apply-patches is dry-run by default and refuses to mutate anything outside the manifest's directory. Agent-driven recipes: docs/agent-recipes.md; framework-by-framework minimal manifests: docs/minimal-real-configs.md.
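The "refuses to mutate anything outside the manifest's directory" rule is a path-containment check. The sketch below shows one common way to implement such a check; it is not taken from the Shipgate source, and the `is_within_manifest_dir` helper is hypothetical.

```python
from pathlib import Path


def is_within_manifest_dir(manifest_path: Path, target: Path) -> bool:
    """True when target resolves to a location inside the manifest's directory."""
    manifest_dir = manifest_path.resolve().parent
    # Relative targets are interpreted against the manifest directory;
    # resolve() collapses ".." segments so traversal cannot escape unnoticed.
    resolved = (manifest_dir / target).resolve()
    return resolved == manifest_dir or manifest_dir in resolved.parents
```

A patch tool built this way would skip, or refuse, any edit whose target fails the check, regardless of whether dry-run mode is on.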

Use in CI

The public Action is listed on the GitHub Action Marketplace. Drop this full advisory workflow into .github/workflows/agents-shipgate.yml — it runs on every PR, posts a summary comment, uploads artifacts, and never fails the job (same file as examples/github-actions/01-advisory-pr-comment.yml):

name: Agents Shipgate (advisory)

on:
  pull_request:

permissions:
  contents: read
  pull-requests: write

jobs:
  shipgate:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
        with:
          fetch-depth: 0
      - uses: ThreeMoonsLab/agents-shipgate@v0.15.0
        with:
          ci_mode: advisory
          diff_base: target
          check_annotations: 'true'
          pr_comment: 'true'

The PR comment has a fixed shape: a human summary plus an agent instruction block, with merge_verdict, the semantic capability diff when available, the required next action, and artifact links:

Preview of the optional Agents Shipgate PR comment showing merge verdict, capability changes, required next action, and report artifacts.

The action delegates to agents-shipgate verify and never fetches — keep fetch-depth: 0 on checkout. After adoption, choose an explicit merge policy: 07-block-on-blocked-verdict.yml blocks only when merge_verdict == blocked; 08-require-mergeable.yml requires can_merge_without_human == true; 11-fail-on-insufficient-evidence.yml fails only on insufficient_evidence. Strict / baseline / SARIF / Check Run / multi-config recipes live in examples/github-actions/; the full input and output catalog is in action.yml. Use the decision output for CI gating and merge_verdict / can_merge_without_human for PR-controller routing.
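The three merge policies differ only in which verifier field they test. The predicates below restate them in Python for comparison; the real recipes are workflow YAML files, and these function names are hypothetical. Each returns True when the job should fail.

```python
def fails_block_on_blocked(verifier: dict) -> bool:
    """Policy of 07-block-on-blocked-verdict.yml: fail only on a blocked verdict."""
    return verifier.get("merge_verdict") == "blocked"


def fails_require_mergeable(verifier: dict) -> bool:
    """Policy of 08-require-mergeable.yml: fail unless no human is needed."""
    # Assumes can_merge_without_human is a JSON boolean; a missing field fails.
    return verifier.get("can_merge_without_human") is not True


def fails_on_insufficient_evidence(verifier: dict) -> bool:
    """Policy of 11-fail-on-insufficient-evidence.yml."""
    return verifier.get("merge_verdict") == "insufficient_evidence"
```

For a `human_review_required` verdict, only the require-mergeable policy fails, which makes it the strictest of the three.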

CI is advisory by default. Strict mode exits 20 only on unsuppressed critical findings; for existing projects, save a baseline first so strict CI fails only on new findings:

agents-shipgate scan --config shipgate.yaml --ci-mode strict
agents-shipgate baseline save --config shipgate.yaml --out .agents-shipgate/baseline.json
agents-shipgate scan --config shipgate.yaml --baseline .agents-shipgate/baseline.json --ci-mode strict

Severity and failure thresholds are configurable in the manifest (checks.severity_overrides, ci.fail_on) — see docs/baseline.md and docs/integrations.md for GitLab, CircleCI, Jenkins, and pre-commit equivalents.
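To make the baseline behavior concrete, here is a minimal sketch of "strict fails only on new, unsuppressed critical findings" under the default threshold. The finding fields `id`, `severity`, and `suppressed` and the `strict_exit_code` helper are assumptions for illustration, not the report schema.

```python
def strict_exit_code(findings, baseline_ids=frozenset()):
    """Return 20 if any new, unsuppressed critical finding exists, else 0."""
    for finding in findings:
        is_new = finding["id"] not in baseline_ids
        is_active = not finding.get("suppressed", False)
        if finding["severity"] == "critical" and is_active and is_new:
            return 20
    return 0
```

With an empty baseline every critical finding counts as new, which is why saving a baseline first matters for existing projects.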

What it scans

Input	Status
Model Context Protocol (MCP) exports	Supported
OpenAPI 3.x specs	Supported
OpenAI Agents SDK Python files/directories	Supported
Anthropic Messages API artifacts	Supported
Google ADK Python and YAML config	Supported
LangChain/LangGraph static Python inputs	Supported
CrewAI static Python inputs	Supported
n8n workflow JSON and source-control stubs	Supported
OpenAI API artifacts	Supported
Codex repo config	Supported
Codex plugin packages and marketplaces	Supported

What it produces

When a PR changes what your agent can do, the verify loop writes these artifacts — in read order:

agents-shipgate-reports/agent-handoff.json — the first artifact a coding agent reads: the compact shipgate.agent_handoff/v1 object. Lead with gate.merge_verdict, then controller; it also projects blocked_by[], remediation_plan[], and verify-run reproducibility from existing artifacts, and it does not introduce a second verdict.
agents-shipgate-reports/verifier.json — the authoritative PR/controller evidence substrate. A coding agent reads merge_verdict (mergeable | human_review_required | insufficient_evidence | blocked | unknown), can_merge_without_human, agent_controller, first_next_action, and fix_task when producing reviewer evidence for an agent-capability PR. Local control comes from shipgate check --format codex-boundary-json and shipgate.codex_boundary_result/v1. See docs/agent-contract-current.md for the field contract.
agents-shipgate-reports/verify-run.json — the deterministic verify-run reproducibility artifact. It records stable subject/input hashes, policy-pack hashes, outcome, artifact paths, and run_id without wall-clock timestamps.
agents-shipgate-reports/attestation.json + agents-shipgate-reports/org-evidence-bundle.json — optional organization-governance projections over the same verifier/report artifacts. They are ledger inputs for platform teams, not release gates; report.json.release_decision.decision remains the decision engine.
agents-shipgate-reports/host-grants.json + agents-shipgate-reports/org-status.json — optional fleet-governance artifacts from audit --host --out and org status --json, useful for host-grant drift, policy-pack pin state, and exception hygiene.
agents-shipgate-reports/pr-comment.md — the human PR surface: the same verdict and semantic capability diff when available, shaped for a reviewer.
agents-shipgate-reports/capabilities.lock.json + agents-shipgate-reports/base.capabilities.lock.json + agents-shipgate-reports/capability-lock-diff.{json,md} — the capability review primitive. Verify always emits the head lock after a successful scan; it emits the base lock and diff when the base scan can be materialized, falling back to the reviewed committed lock at .agents-shipgate/capabilities.lock.json if needed.
Gate source of truth — report.json.release_decision.decision (passed | review_required | insufficient_evidence | blocked). merge_verdict is a deterministic projection of it; the report stays the one decision engine.
Tool-Use Readiness Report (supporting) — agents-shipgate-reports/report.{md,json,sarif}. Markdown for human release review, JSON for tools and coding agents, SARIF for GitHub code-scanning workflows. This is the underlying check domain the verdict summarizes.
Release Evidence Packet (supporting) — agents-shipgate-reports/packet.{md,json,html} (and packet.pdf with the [pdf] extras). Reviewer-shaped synthesis with fixed sections, including the compact evidence matrix plus tool-surface and action-surface diffs when available. Packet outputs are locally redacted by default; see STABILITY.md §Release Evidence Packet.
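The verify-run.json entry above describes a run_id that is reproducible because it avoids wall-clock timestamps. How Shipgate derives it is not stated here; the sketch below shows one generic way to get a stable identifier from inputs, using canonical JSON and SHA-256. The `stable_run_id` helper and the input keys are hypothetical.

```python
import hashlib
import json


def stable_run_id(inputs: dict) -> str:
    """Hash a canonical JSON encoding so equal inputs always give the same id."""
    # sort_keys and fixed separators remove ordering and whitespace variance.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because nothing time-dependent enters the hash, rerunning on the same subject and policy inputs yields the same identifier, which is what makes two runs comparable.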

Exit codes

Code	Meaning
`0`	Pass (advisory mode or strict-no-blockers)
`2`	Manifest config error
`3`	Input parse error (file missing, malformed, path traversal blocked)
`4`	Other Agents Shipgate error
`20`	Strict-mode gate failure
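A wrapper script can use the table above to separate a gate failure from a tooling error. The codes and meanings are the documented ones; the `classify_exit` helper and its category labels are illustrative assumptions.

```python
EXIT_MEANINGS = {
    0: "Pass (advisory mode or strict-no-blockers)",
    2: "Manifest config error",
    3: "Input parse error (file missing, malformed, path traversal blocked)",
    4: "Other Agents Shipgate error",
    20: "Strict-mode gate failure",
}


def classify_exit(code: int) -> str:
    """Group an exit code into pass, gate_failure, tool_error, or undocumented."""
    if code == 0:
        return "pass"
    if code == 20:
        return "gate_failure"
    if code in (2, 3, 4):
        return "tool_error"
    return "undocumented"
```

Only `gate_failure` says something about the agent's tool surface; a `tool_error` means the scan itself needs fixing before any verdict can be trusted.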

For coding agents

Human readers can skip this section; it exists so coding agents can find the repo's machine-readable contracts quickly.

Agents Shipgate is designed to be agent-friendly. If you're a coding agent (Claude Code, Codex, Cursor, Aider) reading this repo:

llms.txt — short index of every machine-readable surface, one fetch.
llms-full.txt — long-form concatenation of AGENTS.md + recipes + checks + concepts + autofix policy, in one document. Built by scripts/build-llms-full.py.
.well-known/agents-shipgate.json — discovery metadata (tagline, install commands, schema URLs, gating signal, exit codes, trigger-catalog URL).
docs/triggers.json — machine-readable mirror of the AGENTS.md trigger table. Apply the rules to a PR diff to decide whether to run agents-shipgate verify --preview --json or the full verifier. Schema is stable for 0.x.
tools/shipgate-detect.py — zero-install, stdlib-only detector. curl … | python3 - --workspace . --json returns the same structural verdict as agents-shipgate detect --json. Pinned to the canonical CLI by tests/test_zero_install_detector.py. See docs/zero-install.md.
agents-shipgate contract --json — verify the installed CLI's local contract before relying on hard-coded schema or gating assumptions; contract v9 names primary_commands, the verifier, verify-run, agent-handoff, Codex boundary, attestation, registry, org evidence bundle, host-grants inventory, and legacy local-agent schema versions plus the agent read order.
docs/agent-contract-current.md — single source of truth for the current schema versions and which JSON fields to read. Updated whenever the contract bumps; other agent-facing surfaces link here instead of restating the contract.
docs/agent-native-merge-contract.md — the agent-native protocol map: the eight contracts (trigger, capability change, merge verdict, repair, forbidden action, human authority, trust root, attestation) each mapped to the artifact that implements it.
docs/capability-standard.md — stable non-gating capability lock/diff standard for external integrations and research tooling.
docs/product-hardening-gap-closure.md — closure map for root dogfooding, the governance case catalog, policy-pack tests, trace evidence, and runtime-inventory boundaries.
benchmark/agent-pr-governance/ + docs/governance-benchmark.md — stable research benchmark for unsafe-merge prevention, authority routing, and verifier explanation quality.
AGENTS.md — canonical agent-facing instructions: install, run, common tasks, JSON-mode flags, error semantics
STABILITY.md — what won't break across 0.x versions
docs/target-repo-agent-snippets.md — copyable snippets for adding Shipgate trigger rules to downstream agent repos
docs/agent-adoption-harness.md — manual protocol for checking whether coding agents discover and use Shipgate
benchmark/ — frozen archetypes, prompts, setup variants, and a public leaderboard CSV. Closes the loop on adoption-readiness changes.
docs/zero-install.md — single-file detector, uvx, and GitHub Action paths for evaluating Shipgate without a local install.
prompts/ — reusable prompts for common workflows
skills/agents-shipgate/ + .claude/commands/shipgate.md — self-contained Claude Code skill (bundled prompts and CI recipe) and /shipgate slash command. See docs/agents/use-with-claude-code.md to install in your own project.
agents-shipgate install-hooks --target claude-code --write — deterministic Claude Code hooks: a PreToolUse trust-root guard, a cheap trigger check after Edit|Write|MultiEdit, and a full verify at Stop, so the gate runs even when instruction files lose attention on long sessions. See docs/agents/use-with-claude-code.md.
agents-shipgate mcp-serve ([mcp] extra) — read-only stdio MCP server exposing shipgate.check, shipgate.preflight, shipgate.explain, shipgate.capabilities, and shipgate.handoff for agents without comfortable shell access. It is static-only and not a general MCP permission broker. See docs/mcp-server.md.
docs/ai-search-summary.md — human-readable summary for AI search, answer engines, and coding agents
docs/manifest-v0.1.json + docs/report-schema.v0.28.json + docs/agent-handoff-schema.v1.json + docs/preflight-schema.v0.2.json — JSON Schemas for live editor validation and agent routing (current; emitted reports carry report_schema_version: "0.28", handoff emits schema_version: "shipgate.agent_handoff/v1", preflight emits preflight_schema_version: "0.2"). v0.28 moves policy-pack owner/reviewer/approval routing metadata to findings[].policy_routing so Finding.evidence stays deterministic match/gating evidence; v0.27 added policy-pack distribution metadata (loaded_policy_packs[].{source,sha256,sha256_status,owner}) over v0.26's structured evidence gaps. Gate behavior is unchanged. Read release_decision.decision for release gating, agent-handoff.json.gate / controller for the compact agent step, and reviewer_summary.first_recommended_surface for the human-review entry point. reviewer_summary, verifier_summary, runtime trace/evidence fields, Release Evidence Packet outputs, legacy agent_result_v1 surfaces, and capability diff projections are supporting/provisional review or compatibility context, not additional gates. The per-version additive history lives in docs/agent-contract-current.md and STABILITY.md.
docs/capability-lock-schema.v0.2.json + docs/capability-lock-diff-schema.v0.3.json — stable schemas for the static capability envelope and semantic diff emitted by agents-shipgate capability and, in PR workflows, by agents-shipgate verify; non-gating and separate from report.json.
docs/attestation-schema.v0.4.json + docs/org-governance-schema.v0.1.json + docs/org-evidence-bundle-schema.v1.json + docs/registry-schema.v0.3.json + docs/host-grants-inventory-schema.v0.1.json — deterministic local attestation, organization governance, org evidence bundle, append-only registry, and host-grant inventory schemas for multi-repo governance.
docs/governance-benchmark-catalog-schema.v0.2.json + docs/governance-benchmark-result-schema.v0.2.json — stable schemas for the research benchmark catalog and deterministic result artifact.
docs/checks.json — machine-readable check catalog

Every command has a --json form. Errors emit a structured next_action line on stderr when agent mode is active — set AGENTS_SHIPGATE_AGENT_MODE=1, or rely on auto-detection inside a coding-agent harness (Claude Code exports CLAUDECODE=1, Cursor CURSOR_TRACE_ID). AGENTS_SHIPGATE_AGENT_MODE=0 forces it off.
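The precedence described above (explicit setting first, then harness auto-detection) can be sketched as follows. This is a reading of the paragraph, not the CLI's code: the `agent_mode_active` helper is hypothetical, and treating any other value of AGENTS_SHIPGATE_AGENT_MODE as unset is an assumption.

```python
from typing import Mapping


def agent_mode_active(env: Mapping[str, str]) -> bool:
    """Resolve agent mode from an environment mapping such as os.environ."""
    explicit = env.get("AGENTS_SHIPGATE_AGENT_MODE")
    if explicit == "1":
        return True
    if explicit == "0":
        # Forced off, even inside a coding-agent harness.
        return False
    # Auto-detection: Claude Code exports CLAUDECODE=1, Cursor sets CURSOR_TRACE_ID.
    return env.get("CLAUDECODE") == "1" or bool(env.get("CURSOR_TRACE_ID"))
```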

Why this exists

Once an AI agent can refund, email, cancel, deploy, or modify a record, every tool change becomes a release event. Code review catches code; eval suites catch behavior; observability catches runtime. None of them answer the release question: given the tool surface declared in this PR, do we have explicit approval policies, scope coverage, idempotency evidence, and review readiness for every action?

Agents Shipgate produces a deterministic answer to that question, before promotion.

The current product promise is deliberately narrow: a deterministic, local-first, static merge gate for AI-generated agent capability changes — the Tool-Use Readiness review run at PR time. Broader lifecycle ideas are future roadmap work, not claims this scanner makes today.

Findings Gallery

The bundled support-refund fixture demonstrates the kind of release risks Agents Shipgate is designed to surface:

## Release Decision

Decision: blocked
Reason: 2 active findings block release.
Blockers: 2
Review items: 16
Fail policy: would_fail_ci=false (exit 0)

Top findings:
1. stripe.create_refund lacks a declared approval policy
2. stripe.create_refund lacks idempotency evidence
3. Manifest declares broad permission scopes

stripe.create_refund lacks a declared approval policy, so a financial action could ship without an explicit human review gate.
stripe.create_refund.amount lacks a maximum bound, weakening blast-radius control.
stripe.create_refund lacks idempotency evidence while retry behavior is known, risking duplicate refunds.
wildcard_mcp_tools.* exposes a wildcard tool surface, making review incomplete.
gmail.send_customer_email overlaps a prohibited external-communication action without a matching confirmation policy.

See it block a PR

The fastest way to understand what changes for a reviewer: walk through a Golden PR. Each one ships a sample manifest, the resulting report, the release decision, and the recommended PR-comment summary an agent should post.

openai-agents-sdk-refund-agent — refund agent adds stripe.create_refund. Shipgate decides blocked because approval policy and idempotency evidence are missing. Includes the recommended Markdown PR-comment template.
golden-pr-from-coding-agent.md — the artifact a coding agent should produce after running the verify-first flow: PR comment, merge_verdict, capability_review, and human/coding-agent next action.
mcp-only-tool-server — MCP server with no Python framework imports; demonstrates the MCP-only adoption path.
openapi-support-agent — OpenAPI-described tool surface; shows scope-coverage findings.

Why Not Just...

Alternative	Gap Agents Shipgate Covers
Unit tests	Tests usually validate code paths, not the released tool surface and declared policies.
Code review	Reviewers miss generated specs, MCP exports, broad scopes, and missing approval policies.
Runtime traces	Useful later, but they arrive after behavior exists. Agents Shipgate runs before promotion.
Nothing	Tool-surface drift becomes a production surprise.

For named comparisons against specific evaluators and platforms, see the marketing-site versus pages: vs evals, vs promptfoo, vs Braintrust, vs LangSmith, and vs observability platforms.

Framework notes

Framework adapters (Google ADK, LangChain/LangGraph, CrewAI, OpenAI Agents SDK) parse Python AST only — they never import framework packages or user modules. Dynamic or prebuilt toolsets produce warnings or insufficient_evidence findings unless you provide explicit MCP, OpenAPI, or local tool-inventory inputs. Framework-by-framework minimal manifests, with runnable sample repos for each adapter, live in docs/minimal-real-configs.md.
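
To make the "AST only" idea concrete, here is a minimal sketch of static tool discovery with Python's standard ast module. It is an illustration of the general technique, not the adapters' actual code: the helper names are hypothetical, and the sample source assumes the OpenAI Agents SDK's function_tool decorator as the marker to look for. The source is parsed as text, so nothing is imported or executed.

```python
from __future__ import annotations

import ast

# Sample module text; it is never imported, only parsed.
SOURCE = '''
from agents import function_tool

@function_tool
def create_refund(charge_id: str, amount: int) -> str:
    """Refund a charge."""
    return "ok"

def helper():
    pass
'''


def decorator_name(node: ast.expr) -> str | None:
    """Return the bare name for @name, @module.name, or their called forms."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None


def find_tools(source: str, marker: str = "function_tool") -> list[str]:
    """List functions decorated with `marker`, using the syntax tree only."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if any(decorator_name(d) == marker for d in node.decorator_list):
                found.append(node.name)
    return found


print(find_tools(SOURCE))  # ['create_refund']
```

A tool list assembled at runtime (for example, in a loop or from a factory call) never appears as a decorated function in the tree, which is why dynamic surfaces need explicit MCP, OpenAPI, or tool-inventory inputs.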

Organization-specific release rules ship as local declarative YAML policy packs (checks.policy_packs in the manifest, or --policy-pack on the CLI) — static data, no code import.

Limitations

Agents Shipgate is a static, manifest-first scanner. It is intentionally narrow:

It does not run agents, call tools, invoke LLMs, or verify model availability by default (static-by-default; see Trust Model and ALLOWED_EXCEPTIONS).
It does not verify runtime behavior, latency, prompt quality, or routing decisions.
It does not replace dynamic security testing or human security review of the underlying systems.
It only inspects what is declared in shipgate.yaml, local OpenAPI specs, MCP exports, Anthropic/OpenAI API artifacts, optional SDK AST metadata, static Google ADK/LangChain/CrewAI/n8n inputs, Codex repo config, and static Codex plugin package metadata; tools that are not declared or statically discoverable are not scanned.
The manifest remains version: "0.1" so existing configs keep working. Current reports carry report_schema_version: "0.28" (policy-pack routing metadata in findings[].policy_routing, separate from deterministic Finding.evidence) while preserving the stable payload contract documented in the report schema.

See ROADMAP.md for what is planned next.

Trust Model

Agents Shipgate does not import user code, run agents, call tools, call LLMs, connect to MCP servers, make network calls, or collect telemetry by default.

See Trust model and Security policy for the default local-only guarantees and disclosure process.

Pricing And Open Source Stance

Agents Shipgate is and will remain free OSS for individuals and teams running it on their own infrastructure. The core manifest-first scanner, built-in checks, Markdown report, and JSON report are intended to remain open source. We do not collect telemetry and do not require an account.

If hosted dashboards, SSO, org-wide baselines, approval workflows, or trace-based evidence emerge, they should live in a separate optional product rather than moving core OSS functionality behind a paywall.

Teams shipping production-like tool-using agents can apply to the Three Moons Lab design partner program — the marketing page mirrors docs/design-partners.md in the repo and includes a prefilled email CTA for review criteria and contact. The current pilot runbook is docs/design-partner-verifier-pilot.md: bring one AI-generated agent PR, run the verifier loop, and export redacted feedback (agents-shipgate feedback export --from agents-shipgate-reports/verifier.json --redact --out shipgate-feedback.json — never raw report evidence).

Docs

The marketing site at threemoonslab.com carries the same canonical concepts in human-readable, search-optimised form: quickstart, check catalog, glossary, MCP security review, AI agent least privilege, blog, and design partners. The in-repo docs below are the canonical contract; the marketing pages are sized for first-time readers and AI search ingest.

Release history

0.15.0 (this version): Jul 8, 2026
0.14.0: Jul 2, 2026
0.13.0: Jun 12, 2026
0.12.0: Jun 9, 2026
0.11.0: Jun 1, 2026
0.8.0: May 5, 2026
0.7.0: May 5, 2026
0.5.1: Apr 30, 2026
0.5.0: Apr 29, 2026
0.4.0: Apr 27, 2026
0.3.0: Apr 26, 2026
0.2.0: Apr 26, 2026

Download files

Source Distribution

agents_shipgate-0.15.0.tar.gz (2.5 MB)

Uploaded Jul 8, 2026 Source

Built Distribution

agents_shipgate-0.15.0-py3-none-any.whl (1.1 MB)

Uploaded Jul 8, 2026 Python 3

File details

Details for the file agents_shipgate-0.15.0.tar.gz.

File metadata

Download URL: agents_shipgate-0.15.0.tar.gz
Upload date: Jul 8, 2026
Size: 2.5 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for agents_shipgate-0.15.0.tar.gz
Algorithm	Hash digest
SHA256	`5fb20f3afe64a50ae3e36076e0d5f7ecc82ae256cc2818edc1b44b13cf8b4982`
MD5	`7a1821946062c9ee232bc9751b24c8d0`
BLAKE2b-256	`327120107acd8aa905af8527abeff7854c141270f03b87e39bc1bb184cd4b1a4`

File details

Details for the file agents_shipgate-0.15.0-py3-none-any.whl.

File metadata

Download URL: agents_shipgate-0.15.0-py3-none-any.whl
Upload date: Jul 8, 2026
Size: 1.1 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for agents_shipgate-0.15.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`070130f07e6355c4d15ea8c1fbab9fe25ba93cbf38d776fac3b34fefa3e391a1`
MD5	`df98a8ef02fd63a34a4024feaf3cedf2`
BLAKE2b-256	`0e984952db5ddbdb2ae33cfcc17084aee4066faa94123073571f84374a561070`

