440 security tests for AI agent systems - MCP, A2A, L402, x402 wire-protocol testing, decision governance, AIUC-1 compliance, NIST AI 800-2 aligned

Agent Security Harness

Even if an agent is properly authenticated and authorized, can it still be manipulated into unsafe or policy-violating behavior?

440 executable security tests across 31 modules. MCP + A2A + L402 + x402 wire-protocol testing. Decision-layer attack scenarios. One pip install away.

$ agent-security test mcp --url http://localhost:8080/mcp
Running MCP Protocol Security Tests v3.10...
 MCP-001: Tool List Integrity Check [PASS] (0.234s)
 MCP-002: Tool Registration via Call Injection [PASS] (0.412s)
 MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)
...
Results: 8/10 passed (80% pass rate) - see report.json
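
The sample output above has a regular shape: test ID, test name, bracketed verdict, and duration. The following is a minimal sketch, assuming that line format holds, of how a CI script might tally verdicts from captured console output. The `summarize` function and the regular expression are illustrative and inferred from the sample, not taken from the harness. The report.json file mentioned above is probably the better machine-readable source, but its schema is not shown here.

```python
import re
from collections import Counter

# Matches lines such as:
#   " MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)"
# The pattern is inferred from the sample output, not from the harness source.
LINE_RE = re.compile(
    r"^\s*(?P<id>[A-Z0-9]+-\d+):\s+(?P<name>.+?)\s+"
    r"\[(?P<verdict>[A-Z]+)\]\s+\((?P<secs>[\d.]+)s\)\s*$"
)


def summarize(output: str) -> dict:
    """Count verdicts and collect the IDs of tests that did not pass."""
    counts = Counter()
    failed_ids = []
    for line in output.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # banner, ellipsis, or summary line
        counts[match["verdict"]] += 1
        if match["verdict"] != "PASS":
            failed_ids.append(match["id"])
    return {
        "total": sum(counts.values()),
        "passed": counts["PASS"],
        "failed_ids": failed_ids,
    }


sample = """Running MCP Protocol Security Tests v3.10...
 MCP-001: Tool List Integrity Check [PASS] (0.234s)
 MCP-002: Tool Registration via Call Injection [PASS] (0.412s)
 MCP-003: Capability Escalation via Initialize [FAIL] (0.156s)
"""
print(summarize(sample))
```

A CI job could fail the build whenever `failed_ids` is non-empty.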

Quick Start

pip install agent-security-harness

# Test an MCP server
agent-security test mcp --url http://localhost:8080/mcp

# Test an x402 payment endpoint
agent-security test x402 --url https://your-x402-endpoint.com

See docs/QUICKSTART.md for mock server setup, rate limiting, MCP server mode, and CI/CD integration.

Three Layers of Agent Decision Security

Layer	What it covers	Example focus
Protocol Integrity	Prevent spoofing, replay, downgrade, diversion, and malformed protocol behavior	MCP, A2A, L402, x402 wire-level tests
Operational Governance	Validate session state, capability boundaries, platform actions, trust chains, and execution context	capability escalation, facilitator trust, provenance, session security
Decision Governance	Test whether an agent should act at all under its authority, confidence, scope, and policy constraints	autonomy scoring, scope creep, return-channel poisoning, normalization-of-deviance
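
To make the Protocol Integrity layer more concrete, the sketch below builds a JSON-RPC 2.0 `tools/list` request (a method defined by the MCP specification) and fingerprints the tool list in a response so that two snapshots can be compared. The helper names and the sample responses are hypothetical. This illustrates the general idea only and is not the harness's own MCP-001 implementation.

```python
import hashlib
import json


def tools_list_request(request_id: int) -> str:
    """Serialize a JSON-RPC 2.0 request for MCP's tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def fingerprint_tools(response: dict) -> str:
    """Hash tool names and descriptions in a stable, order-independent way."""
    tools = response.get("result", {}).get("tools", [])
    canonical = sorted(
        (tool.get("name", ""), tool.get("description", "")) for tool in tools
    )
    blob = json.dumps(canonical, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Hypothetical responses captured at two points in a session.
before = {"jsonrpc": "2.0", "id": 1, "result": {"tools": [
    {"name": "read_file", "description": "Read a file"},
]}}
after = {"jsonrpc": "2.0", "id": 2, "result": {"tools": [
    {"name": "read_file", "description": "Read a file"},
    {"name": "exec", "description": "Run a shell command"},
]}}

changed = fingerprint_tools(before) != fingerprint_tools(after)
print("tool list changed between snapshots:", changed)
```

A fingerprint mismatch only shows that the advertised tools changed. Deciding whether the change was legitimate is a separate policy question.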

How This Differs From Other Projects

Capability	Invariant MCP-Scan (2K stars)	Cisco MCP Scanner (865 stars)	Snyk Agent Scan (2K stars)	NVIDIA Garak (7K stars)	This framework
What it does	Scans installed MCP configs for tool poisoning	YARA + LLM-as-judge for malicious tools	Scans agent configs for MCP/skill security	LLM model vulnerability testing	Active protocol exploitation + decision governance
Approach	Static analysis	Static + LLM classification	Config scanning	Model-layer probing	Wire-protocol adversarial testing
MCP coverage	Tool descriptions, config files	Tool descriptions, YARA rules	Config files	-	14 tests: real JSON-RPC 2.0 attacks
A2A coverage	-	-	-	-	13 tests
L402/x402 coverage	-	-	-	-	85 tests
Enterprise platforms	-	-	-	-	25 cloud + 20 enterprise
APT simulation	-	-	-	-	GTG-1002 (17 tests)
Jailbreak/over-refusal	-	-	-	Yes	50 tests (25 + 25 FPR)
AIUC-1 certification	-	-	-	-	Maps to all 24 requirements
Research backing	-	Cisco blog	-	Papers	5 DOIs + 3 NIST submissions
MCP server mode	-	-	-	-	Yes - invoke from any AI agent
Statistical testing	-	-	-	-	Wilson CIs, multi-trial
Total tests	Pattern matching	YARA rules	Config checks	Model probes	440 active tests

Use both. Scan with Invariant MCP-Scan or Cisco MCP Scanner for static analysis. Test with this framework for active exploitation. They're complementary layers.
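
The comparison table lists Wilson confidence intervals over multi-trial runs. As a worked illustration of that statistic, here is a minimal sketch of the standard Wilson score interval for a pass rate. The function name and the default `z = 1.96` (roughly a 95% interval) are assumptions for this example; how the harness computes or reports its intervals is not shown in this document.

```python
import math


def wilson_interval(passes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion passes/trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= passes <= trials:
        raise ValueError("passes must be between 0 and trials")
    p = passes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


low, high = wilson_interval(8, 10)
print(f"8/10 passed: {low:.3f} to {high:.3f}")
```

For 8 passes in 10 trials the interval runs from roughly 0.49 to 0.94, which shows how little a single 80% pass rate says when the trial count is small.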

Research

Five preprints with Zenodo DOIs and three NIST submissions underpin the methodology:

Publication	DOI
Constitutional Self-Governance for Autonomous AI Agents — 12 governance mechanisms, 77 days production data, 56 agents	10.5281/zenodo.19162104
Detecting Normalization of Deviance in Multi-Agent Systems — First empirical demonstration that automated harnesses detect behavioral drift	10.5281/zenodo.19195516
Decision Load Index (DLI): A Quantitative Framework for Agent Autonomy Risk — Measuring cognitive burden of AI agent oversight	10.5281/zenodo.18217577
Normalization of Deviance in Autonomous Agent Systems — Foundational research on behavioral drift patterns	10.5281/zenodo.15105866
Cognitive Style Governance for Multi-Agent Deployments — Governance mechanisms for managing cognitive style across multi-agent systems	10.5281/zenodo.15106553

Documentation

Resource	Link
Expanded Quick Start	docs/QUICKSTART.md
Full Test Inventory (439 tests)	docs/TEST-INVENTORY.md
AIUC-1 Crosswalk	docs/AIUC1-CROSSWALK.md
Advanced Capabilities	docs/ADVANCED.md
MCP Server	docs/mcp-server.md
CI/CD GitHub Action	docs/github-action.md
Payment Attack Taxonomy	docs/PAYMENT-ATTACK-TAXONOMY.md
Comparison (detailed)	docs/COMPARISON.md
Privacy & Telemetry	docs/PRIVACY.md

Roadmap

- v3.10 -- Prove It to Auditors ✅ Shipped.
- v4.1 -- Compliance Evidence ✅ Shipped. 439 tests, 31 modules, AUROC metrics, EU AI Act + ISO 42001 crosswalks, FRIA evidence, kill-switch compliance, watermark adversarial tests, HTML compliance report generator.
- v4.0 -- Lock the Category (H2 2026): benchmark corpus, schema standardization, longitudinal registry.

Full details in ROADMAP.md.

Used By

Who	Use Case
FransDevelopment / Open Agent Trust Registry	OATR SDK v1.2.0 test fixtures (X4-021 through X4-030) -- Ed25519 attestation verification

Using the harness? Open a PR to add yourself, or tag us in your project.

Contributing

See CONTRIBUTING.md for guidelines, SECURITY_POLICY.md for security policy, and CONTRIBUTION_REVIEW_CHECKLIST.md for the PR checklist.

License

Apache License 2.0 -- see LICENSE.

Release history

Version	Released
4.1.1	Apr 12, 2026
4.1.0 (this version)	Apr 10, 2026
3.10.1	Apr 11, 2026
3.10.0	Apr 9, 2026
3.9.0	Apr 2, 2026
3.8.1	Mar 29, 2026
3.8.0	Mar 28, 2026
3.7.0	Mar 25, 2026
3.6.0	Mar 24, 2026
3.5.0	Mar 24, 2026
3.4.0	Mar 24, 2026
3.3.0	Mar 24, 2026
3.2.0	Mar 23, 2026
3.1.0	Mar 22, 2026

Download files

Source Distribution

agent_security_harness-4.1.0.tar.gz (337.9 kB)

Uploaded Apr 10, 2026 Source

Built Distribution

agent_security_harness-4.1.0-py3-none-any.whl (378.1 kB)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file agent_security_harness-4.1.0.tar.gz.

File metadata

Download URL: agent_security_harness-4.1.0.tar.gz
Upload date: Apr 10, 2026
Size: 337.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agent_security_harness-4.1.0.tar.gz
Algorithm	Hash digest
SHA256	`8d030f3e49ae07c402a4b5e16a9f4e069dc52ea4ab720dc83902c24439b7151d`
MD5	`8de7c154ba73fb5cd13d0e3e0fa57ae5`
BLAKE2b-256	`9a43b5011e256b9cab13a592f8be1bc7b3407fb469c268f31d836d9bcf930263`
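
The published digests can be checked locally before installing from a downloaded file. The sketch below is one way to do that with Python's standard library, using the SHA256 value from the table above. The function names are illustrative, and the file is assumed to be in the current directory under the name shown on this page.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest listed above for agent_security_harness-4.1.0.tar.gz
EXPECTED_SHA256 = "8d030f3e49ae07c402a4b5e16a9f4e069dc52ea4ab720dc83902c24439b7151d"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

Calling `matches(Path("agent_security_harness-4.1.0.tar.gz"), EXPECTED_SHA256)` should return `True` for an unmodified download. pip's `--require-hashes` mode performs an equivalent check from a requirements file.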

Provenance

The following attestation bundles were made for agent_security_harness-4.1.0.tar.gz:

Publisher: publish-pypi.yml on msaleme/red-team-blue-team-agent-fabric

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_security_harness-4.1.0.tar.gz
- Subject digest: 8d030f3e49ae07c402a4b5e16a9f4e069dc52ea4ab720dc83902c24439b7151d
- Sigstore transparency entry: 1271727112
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: msaleme/red-team-blue-team-agent-fabric@2a8348cbb020c6799c0fb12d59827c5adfdbcb34
- Branch / Tag: refs/tags/v4.1.0
- Owner: https://github.com/msaleme
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@2a8348cbb020c6799c0fb12d59827c5adfdbcb34
- Trigger Event: release

File details

Details for the file agent_security_harness-4.1.0-py3-none-any.whl.

File metadata

Download URL: agent_security_harness-4.1.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 378.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agent_security_harness-4.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e865014c7023c1880642cd0396127d99d7228b47d7a42c0a1487df3ce2bbd906`
MD5	`8c6c3f68d5fb443814fa60311d77b206`
BLAKE2b-256	`b219b0a7f64b7085b8051c08f25e971f1594a53599db1b8c25ac84bb8fa37e93`

Provenance

The following attestation bundles were made for agent_security_harness-4.1.0-py3-none-any.whl:

Publisher: publish-pypi.yml on msaleme/red-team-blue-team-agent-fabric

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agent_security_harness-4.1.0-py3-none-any.whl
- Subject digest: e865014c7023c1880642cd0396127d99d7228b47d7a42c0a1487df3ce2bbd906
- Sigstore transparency entry: 1271727114
- Sigstore integration time: Apr 10, 2026
Source repository:
- Permalink: msaleme/red-team-blue-team-agent-fabric@2a8348cbb020c6799c0fb12d59827c5adfdbcb34
- Branch / Tag: refs/tags/v4.1.0
- Owner: https://github.com/msaleme
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@2a8348cbb020c6799c0fb12d59827c5adfdbcb34
- Trigger Event: release
