The Python Governance Platform for AI Agents — 60 policy rules, per-tool interception, HITL, FinOps, A2A.

These details have not been verified by PyPI

Project links

Project description

AgentMesh

The governance layer between your AI agents and everything they can touch.
Enforces policies on every action before it executes — deterministic, with no LLM in the loop.

Rules Performance

What It Does

AI agents call tools, spend money, delete files, and make decisions — autonomously, at speed, without asking.

AgentMesh is the governance layer between your agent and everything it can touch. It scans your codebase to find governance gaps before production does. And it enforces policies on live agent traffic — blocking actions before they execute, not logging them after.

pip install useagentmesh && agentmesh scan .

The scan is free, offline, and needs no account. The runtime platform connects in one line and protects agents in production.

No LLM in the evaluation loop. Every rule is deterministic. Same code, same result, every time.

Quick Start

pip install useagentmesh
agentmesh scan .

┌─ AgentMesh Scan ─────────────────────────────────────────┐
│ my-project  │  crewai 0.86.0  │  0.4s                    │
└──────────────────────────────────────────────────────────┘

  Agent BOM: 3 agents │ 12 tools │ 2 models │ 4 prompts

┌──────────────────────────────────────────────────────────┐
│ GOVERNANCE SCORE: 42/100 [D] ████████░░░░░░░░░░░░  42%  │
└──────────────────────────────────────────────────────────┘

  Risk Level: CRITICAL — API keys are exposed in source code.

  CRITICAL  3  │  HIGH  5  │  MEDIUM  4  │  LOW  2

  Top Issues
  • SEC-001  API key hardcoded in source code        (src/main.py)
  • SEC-005  Arbitrary code execution in tool         (tools/runner.py)
  • GOV-006  Agent can modify its own system prompt   (agents/writer.py)

👉 agentmesh scan --details    Full findings with code snippets
👉 agentmesh fix --dry-run     Preview auto-fixes
👉 agentmesh scan --share      Share your score on social media

Common flags:

agentmesh scan --format sarif       # SARIF 2.1.0 for GitHub Code Scanning
agentmesh scan --threshold 70       # Exit 1 if score < 70
agentmesh scan --fail-on critical   # Exit 1 on any critical finding
agentmesh scan --diff HEAD~1        # Only scan changed files
agentmesh scan --details            # Full report with code snippets and fixes
agentmesh history          # view policy snapshot history
agentmesh diff v2 v3       # compare policy versions
agentmesh rollback v2      # restore previous policy
agentmesh scan . --share   # scan + generate shareable score card

Configure Everything From One File

AgentMesh generates .agentmesh.yaml pre-populated with your real agents and tools. Edit it, push it, enforce it.

agentmesh scan .      # detects agents, tools, models
agentmesh init        # generates .agentmesh.yaml with your real tools
agentmesh push        # syncs policies to the enforcement engine

# .agentmesh.yaml — generated with YOUR tools pre-filled
agents:
  researcher:
    source: agents/researcher.py
  writer:
    source: agents/writer.py

tools:
  web_search:
    source: tools/search.py
    type: read
  code_runner:
    source: tools/runner.py
    type: execute    # ⚠ flagged CRITICAL by scan

policies:
  odd:
    researcher:
      permitted_tools: [web_search, file_reader]
      forbidden_tools: [code_runner, db_write]
      max_actions_per_minute: 30
      max_cost_per_session_usd: 5.00

  magnitude:
    max_spend_per_action_usd: 10.00
    max_tokens_per_call: 4000

  dlp:
    mode: enforce

  circuit_breaker:
    agent_level:
      failure_threshold: 5
    per_tool:
      code_runner:
        failure_threshold: 3
        fallback: skip

  hitl:
    triggers:
      tool_types: [write, execute, payment]
      trust_score_below: 60
      spend_above_usd: 100.00
    notification:
      webhook_url: "https://hooks.slack.com/services/xxx"

  intent_verification:
    required_for:
      tool_types: [payment, write]
    anti_replay: true

  hooks:
    pre_action:
      - name: block-dangerous-sql
        condition: "tool_name == 'execute_sql' and 'DROP' in tool_args"
        action: deny

  finops:
    routing:
      rules:
        - condition: "estimated_tokens < 500"
          model: gpt-5-mini
    cache:
      enabled: true
    budgets:
      daily_usd: 50.00

  fallback:
    tools:
      transfer_funds:
        fallback_action: escalate_human
        triggers: [circuit_breaker_open, trust_below: 30]

  intel:
    enabled: true
    contribute: true
    block_severity: 7

  alerts:
    rules:
      - name: injection-spike
        condition: "injection_events_1h > 10"
        severity: critical
        channel: slack

Then one line in your code:

from agentmesh import govern
crew = govern(crew)    # every tool call passes through enforcement

Every tool call is evaluated before executing. If a tool is forbidden, carries PII, exceeds spend caps, matches a known threat pattern, or requires human approval — blocked before it runs.

What the Scan Detects

60 deterministic rules across 13 categories. Full rule reference →

Category	Rules	What it catches
Security	SEC-001 → SEC-011	Hardcoded keys, secrets in prompts, arbitrary code execution, prompt injection vulnerability, unrestricted filesystem/network access, no input sanitization, unvalidated external data
Governance	GOV-001 → GOV-011	No audit logging, no human-in-the-loop, self-modifying prompts, no per-tool error handling, no fallback for critical tools, destructive autonomous actions, action replay vulnerability
Compliance	COM-001 → COM-006	EU AI Act Art. 12 logging, Art. 14 human oversight, Art. 9 risk management, Art. 11 documentation, no Agent BOM, no HITL for high-risk actions
Operational Boundaries	ODD-001 → ODD-004	No boundary definition, unrestricted tool access, no spend cap, no time constraints
Magnitude	MAG-001 → MAG-003	No spend cap, no rate limit, sensitive data without clearance
Identity	ID-001 → ID-003	Static credentials, no identity definition, shared credentials
Multi-Agent	MULTI-001 → MULTI-004	No topology monitoring, circular dependency, no conflict protection, no chaos testing
Hooks	HOOK-001 → HOOK-003	No pre-action validation, no session-end gate, no hook timeout
Versioning	CV-001 → CV-002	No policy versioning, audit logs without policy reference
FinOps	FIN-001 → FIN-003	No cost tracking, single model for all tasks, no caching
Resilience	RES-001 → RES-002	No fallback for critical ops, no state preservation
A2A	A2A-001 → A2A-003	No A2A authentication, unvalidated inter-agent input, no channel isolation
Best Practices	BP-001 → BP-005	Outdated framework, no tests, no retry/fallback, no timeout, too many tools

Scoring: Start at 100, deduct per finding (capped per category).
  CRITICAL: -15 (cap -60) │ HIGH: -8 (cap -40) │ MEDIUM: -3 (cap -20) │ LOW: -1 (cap -10)
  Grades: A (90-100) │ B (75-89) │ C (60-74) │ D (40-59) │ F (0-39)

Agent BOM (Bill of Materials)

Walks your Python AST to inventory every component. No config files, no runtime agent, no network calls.

Agents     3 (researcher, writer, reviewer)
Tools      12 (web_search, file_reader, code_runner, ...)
Models     2 (gpt-5, claude-4.6-sonnet)
MCP        1 server
Prompts    4 system prompts detected
Perms      filesystem, network, code_execution
Framework  crewai 0.86.0

Runtime Platform

The scan tells you what's wrong. The platform fixes it and keeps it fixed.

Enforcement Pipeline

Every tool call passes through this chain. Each step can block, modify, or escalate:

Agent decides to act
  │
  ├─ Pre-action Hooks ─── custom validation scripts
  ├─ Identity Check ───── is this agent who it claims to be?
  ├─ ODD Check ────────── is this tool permitted for this agent?
  ├─ Magnitude Check ──── does this exceed spend/volume/scope limits?
  ├─ HITL Check ───────── does this need human approval?
  ├─ Intent Gate 1 ────── fingerprint the decision (SHA-256 + Ed25519)
  ├─ DLP Scan ─────────── does the payload contain PII/PCI?
  ├─ Injection Scan ───── does the input contain prompt injection?
  ├─ Trust Check ──────── is this agent's reputation sufficient?
  ├─ IOC Check ────────── does this match a known threat pattern?
  ├─ Circuit Breaker ──── is this tool/agent healthy enough?
  ├─ Intent Gate 2 ────── verify decision wasn't altered since Gate 1
  │
  ▼
  Execute (or block with reason)
  │
  ├─ Post-action Hooks ── validate/modify result
  ├─ Topology Tracker ─── log interaction for multi-agent graph
  ├─ Cost Tracker ─────── record tokens, cost, model used
  └─ Audit Trail ──────── SHA-256 hash chain + policy snapshot ID

Capabilities

Capability	What it does
DLP	Presidio-based PII/PCI detection on every tool call payload. CRITICAL PII → action rejected before it reaches downstream APIs.
Prompt Injection Detection	Bidirectional: scans inputs reaching the agent for injection attempts in external data (documents, API responses, tool results). 5 pattern categories, deterministic, no LLM. Complements DLP which scans outputs.
Trust Score	Per-agent EigenTrust reputation (0–100), updated on every interaction, time-decayed. Agents earn or lose trust based on behavior.
Circuit Breaker	Per-agent AND per-tool. If one tool fails repeatedly, that tool auto-suspends — the agent continues with everything else. Hierarchy: tool CB → agent CB → fleet halt.
ODD Enforcement	Operational Design Domain: declare which tools, APIs, data sources, and time windows each agent can operate in. Allowlisting, not denylisting. Modes: audit, enforce, escalate.
Magnitude Limits	Pre-action: spend caps per action/session/day, data volume limits, blast radius constraints, compute guardrails. Blocks before execution.
Human-in-the-Loop	Agent pauses on high-risk actions and escalates to a human supervisor. Configurable triggers: tool type, trust threshold, spend amount, first-time actions. Webhook notifications (Slack, Teams, email). EU AI Act Art. 14 compliance.
Intent Fingerprinting	Two-gate cryptographic verification (SHA-256 + Ed25519). Gate 1 fingerprints the decision. Gate 2 verifies it wasn't altered before execution. If an hallucination or injection changes the action between decision and execution — blocked. SOC 2 audit proof.
Agent Identity	Managed credential lifecycle per agent: dynamic provisioning, automatic rotation with grace periods, instant revocation. DID-based. No more shared static API keys across agents.
Programmable Hooks	Python scripts or YAML conditions that run at pre_action, post_action, on_error, on_session_end. Stop hooks can block session completion until checks pass.
Context Versioning	Every config push creates an immutable SHA-256 snapshot. Audit logs reference the exact policy version active at the time. Diff, rollback, full change history.
FinOps	Cost-per-outcome tracking, smart model routing (route simple tasks to cheaper models), semantic caching (skip LLM calls for repeated queries), budget alerts at 50/80/95%. Dashboard shows: "AgentMesh saved you $X this month."
Deterministic Fallback	When CB trips, operations don't die — they failover to deterministic code, a simpler agent, a human operator, or a retry queue. State is preserved. Business continuity, not just error handling.
Secure A2A	Agent-to-agent communication routed through a governance gateway. Mutual authentication (DID exchange), channel policies (who talks to whom), prompt worm prevention (injection scan on inter-agent messages), propagation depth limits.
Multi-Agent Topology	Live directed graph of agent interactions. Detects resource contention, contradictory actions, cascade amplification, circular dependencies. Fleet Health Score (0–100).
Chaos Engineering	Controlled fault injection: deny tools, inject latency, expire credentials, exhaust budgets, disconnect peers. Governance Grade (A–F) based on how the system responds. Safety-gated via HITL approval.
Collective Intelligence	When one agent detects a threat, every agent benefits. Anonymous IOC (Indicator of Compromise) sharing across tenants. 6 AI-native IOC types. EigenTrust scoring for IOC quality. Sub-5s propagation. One detection in São Paulo protects a deployment in Berlin.
Observability	Session traces with span trees, latency breakdowns (P50/P95/P99), violation heatmaps, drift detection from intent fingerprint mismatches, loop detection, quality scoring, A/B testing between policy versions.
Alerting	Configurable rules in YAML. Slack, email, PagerDuty. "Alert if drift rate > 5%", "Alert if daily spend > $100", "Alert if injection events > 10/hour".
OTEL & SIEM Export	Pipe traces to Datadog, Grafana, New Relic via OpenTelemetry. Export security events to Splunk, ELK via STIX 2.1 or CEF.
Audit Trail	SHA-256 hash chain with Ed25519 signatures. Every action logged with cryptographic integrity, policy snapshot reference, and intent proof. Tamper-evident, exportable, regulator-ready.
Compliance Reports	EU AI Act Articles 9, 11, 12, 14. Generated from real scan data and runtime telemetry. Exportable for auditors and regulators.

Supported Frameworks

LangGraph, CrewAI, AutoGen — AST-based discovery
LangChain, LlamaIndex, PydanticAI — import/pattern detection

Framework detection is automatic.

CI/CD Integration

# .github/workflows/agentmesh.yml
name: AgentMesh Governance
on: [push, pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - run: pip install useagentmesh
      - run: agentmesh scan . --format sarif > results.sarif
      - run: agentmesh scan . --fail-on critical --threshold 70
      - uses: github/codeql-action/upload-sarif@v3
        with: { sarif_file: results.sarif }
        if: always()

How It Works

Discovery — Reads pyproject.toml / requirements.txt, identifies framework via import analysis
AST Parsing — Parses every .py file. Extracts agents, tools, models, prompts, MCP servers, permissions
Policy Evaluation — Runs 60 rules against the BOM. Each rule is a Python class with an evaluate() method
Scoring — Deducts points per severity with caps. Range: 0–100
Output — Terminal, JSON, or SARIF 2.1.0
Runtime — govern() wraps every tool call through the enforcement pipeline. Each call evaluated before execution.

Policy evaluation is deterministic — no LLM, no heuristics, no probabilistic scoring.

Benchmarks

10,000 iterations, time.perf_counter_ns(), after 1,000 warmup:

Scenario	P50	P99
Single rule	0.031ms	0.08ms
Full scan (60 rules)	1.84ms	3.2ms
Batch (100 tool calls)	1.79ms	2.8ms

Governance overhead: <0.2% of a typical LLM call (~800ms).

EU AI Act

High-risk system rules take effect August 2, 2026.

Article	Requirement	How AgentMesh covers it
Art. 9	Risk management	Scan rules (COM-004), ODD enforcement, magnitude limits
Art. 11	Technical documentation	Agent BOM, compliance reports, context versioning
Art. 12	Record-keeping / logging	Cryptographic audit trail with policy snapshot references
Art. 14	Human oversight	HITL checkpoints, programmable hooks, escalation policies

Runtime platform generates exportable compliance reports for these articles.

Roadmap

AgentMesh today covers the full governance stack: static analysis, runtime enforcement, cryptographic audit, human oversight, cost control, collective threat intelligence, and multi-agent security.

What comes next:

Beyond Python — governance for agents built in TypeScript, Go, and via REST. The enforcement pipeline, not the language, is the product. If it makes a tool call, AgentMesh governs it.

The compliance layer enterprises need — SOC 2 Type II audit packages, ISO 42001 certification support, FedRAMP groundwork, and packaged compliance templates for financial services, healthcare, and critical infrastructure. Governance that survives a regulator asking questions at 9am.

Governance-aware model routing — real-time decision engine that selects models based on task risk profile, not just cost. High-risk financial action → routes to the most reliable model with full intent verification. Low-risk summarization → routes to the cheapest model with audit-only. The governance policy decides the model, not the developer.

Autonomous remediation — when AgentMesh detects a governance gap, it doesn't just report it. It proposes a fix, runs it through the policy engine, and applies it with human approval. From "here's what's wrong" to "here's what we fixed" in one pipeline.

Agent insurance scoring — a quantified risk profile that maps directly to AI liability insurance underwriting. Your AgentMesh governance score, audit trail completeness, HITL coverage, and incident history become a standardized package that insurers can evaluate. The companies that can prove governance get coverage. The ones that can't, don't.

The governance mesh — federated governance across organizational boundaries. When Company A's agent talks to Company B's agent, both governance policies apply. Cross-org audit trails, mutual trust verification, and regulatory reporting that spans supply chains. The agent economy needs governance infrastructure, not just governance tools.

AgentMesh is actively developed and moving fast. If you're deploying AI agents to production, watch the repo — or better, run the scan and see what it finds.

License

BUSL-1.1. Free to use in production. Cannot offer governance capabilities as a competing hosted service. Converts to Apache 2.0 four years after release. See LICENSE.

See It in Action

git clone https://github.com/angelnicolasc/agentmesh.git
cd agentmesh/examples/demo-crewai
pip install useagentmesh
agentmesh scan .

The demo project has intentional governance gaps and scores ~35 (Grade F). See what AgentMesh finds.

Editor Support

Add to .vscode/settings.json for autocomplete and validation on .agentmesh.yaml:

{
  "yaml.schemas": {
    "https://raw.githubusercontent.com/angelnicolasc/agentmesh/main/schema/agentmesh-config.schema.json": ".agentmesh.yaml"
  }
}

Contributing

See CONTRIBUTING.md for development setup, testing, and how to add new policy rules.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

2.1.0

Mar 18, 2026

This version

2.0.1

Mar 17, 2026

2.0.0

Mar 17, 2026

0.7.3

Mar 18, 2026

0.7.2

Mar 18, 2026

0.7.1

Mar 18, 2026

0.7.0

Mar 18, 2026

0.6.0

Mar 15, 2026

0.5.0

Mar 15, 2026

0.4.0

Mar 15, 2026

0.3.0

Mar 14, 2026

0.2.2

Mar 14, 2026

0.2.1

Mar 13, 2026

0.2.0

Mar 12, 2026

0.1.10

Mar 11, 2026

0.1.9

Mar 10, 2026

0.1.8

Mar 7, 2026

0.1.7

Mar 6, 2026

0.1.6

Mar 6, 2026

0.1.5

Mar 4, 2026

0.1.4

Mar 3, 2026

0.1.3

Mar 3, 2026

0.1.2

Feb 28, 2026

0.1.1

Feb 27, 2026

0.1.0

Feb 27, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

useagentmesh-2.0.1.tar.gz (137.5 kB view details)

Uploaded Mar 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

useagentmesh-2.0.1-py3-none-any.whl (120.3 kB view details)

Uploaded Mar 17, 2026 Python 3

File details

Details for the file useagentmesh-2.0.1.tar.gz.

File metadata

Download URL: useagentmesh-2.0.1.tar.gz
Upload date: Mar 17, 2026
Size: 137.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for useagentmesh-2.0.1.tar.gz
Algorithm	Hash digest
SHA256	`7d29cb70b7c6a88213436a3bd701c6d07759a69d67eb37490edb03f02579c3ac`
MD5	`1a84a9e1da1f54ca0d13841a35668564`
BLAKE2b-256	`eebf5b431d9d1688b061182f15ce7ffddf3d9f934949b98aedcf69e81aac10b4`

See more details on using hashes here.

File details

Details for the file useagentmesh-2.0.1-py3-none-any.whl.

File metadata

Download URL: useagentmesh-2.0.1-py3-none-any.whl
Upload date: Mar 17, 2026
Size: 120.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.25

File hashes

Hashes for useagentmesh-2.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c601bc2aa69ce04bb4490e7f6688b70235f10c6c0f87a7cab81684bba5d9c095`
MD5	`6b72d1acfaa58125464710c10593a19c`
BLAKE2b-256	`922c03521b18dc1166c791f08360e7d36092e3cb97b9e02dc05b2f04d0ce55cf`

See more details on using hashes here.

useagentmesh 2.0.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

AgentMesh

What It Does

Quick Start

Configure Everything From One File

What the Scan Detects

Agent BOM (Bill of Materials)

Runtime Platform

Enforcement Pipeline

Capabilities

Supported Frameworks

CI/CD Integration

How It Works

Benchmarks

EU AI Act

Roadmap

License

See It in Action

Editor Support

Contributing

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes