Skip to main content

Security testing toolkit for MCP servers and AI agents.

Project description

mcp-witness

tests ruff license: Apache 2.0 python: 3.11+ findings: 12 CVE-track classes: 2

Security scanner for Model Context Protocol servers and AI agents.

Found and disclosed a real SSRF in Anthropic's reference mcp-server-fetch

Demonstrated on EC2 with live AWS IAM credentials retrieved. Static rule MCP-S-009 flagged the missing scheme/host validation on mcp-server-fetch's fetch tool; the dynamic harness (MCP-D-003) drove the live probe against an EC2 t3.micro with IMDSv2 set to Optional and got back a real AccessKeyId / SecretAccessKey / Token triplet for the attached IAM role. Coordinated disclosure filed as modelcontextprotocol/servers#4143 on 2026-05-12; fix PR #4226 by @kgarg2468 shipped 2026-05-22 with scheme allowlist + RFC-reserved-range denylist + per-redirect validation. The same demo that previously returned IAM credentials now returns "Fetching private or non-public IP addresses is not allowed"independently verified 2026-05-22.

That's one of two CVE-track vulnerability classes mcp-witness's detectors have surfaced in real PyPI-published MCP servers. The other is DNS rebinding on HTTP-transport MCP servers (4 packages flagged, all 4 now under coordinated disclosure with the same 2026-08-10 embargo). Six coordinated disclosures total. Together they cover both ends of the MCP transport boundary — server reaching out (SSRF), browser reaching in (DNS rebind).

This isn't theoretical scanning. The detectors that found these vulnerabilities are static rule MCP-S-009 (SSRF) + dynamic scenario MCP-D-003 (SSRF) + static rule MCP-S-014 v0.3 (DNS rebinding, four-patch W1-W4 series developed from the survey work itself).

Real findings to date

Twelve documented audit observations against eleven PyPI-published servers, captured in findings/, plus a DNS-rebinding survey that frames four of them as one class:

Date Target Test Outcome
2026-05-11 mcp-server-fetch D-003 (direct SSRF probe) Vulnerability — env-dependent SSRF, demonstrated on EC2 2026-05-12 (real IAM credentials retrieved); disclosure #4143; fix PR #4226 independently verified 2026-05-22
2026-05-11 mcp-server-http-request D-003 (direct SSRF probe) Vulnerability — second instance of same SSRF class; disclosure email-filed 2026-05-12 (embargo 2026-08-10)
2026-05-12 mcp-streamablehttp-proxy v0.2.0 S-014 (static, DNS rebinding) Vulnerability — 127.0.0.1 + no Origin/Host check; universal escalation against whatever stdio MCP it proxies; disclosure filed 2026-05-12 (embargo 2026-08-10)
2026-05-12 mcp-fetch-streamablehttp-server v0.2.0 S-014 (static, DNS rebinding + 0.0.0.0 + recursive SSRF) Vulnerability — 0.0.0.0 + wildcard CORS + inherited fetch SSRF; co-disclosed with proxy finding (embargo 2026-08-10)
2026-05-12 fastmcp-http v0.1.4 S-014 (static, DNS rebinding) Vulnerability — Flask dev server on 0.0.0.0, no middleware anywhere; public-issue disclosure filed 2026-06-02 (embargo 2026-08-10)
2026-06-02 mcp-server-fetch-sse v0.1.1 S-014 (static, W1+W3 — aiohttp TCPSite) Vulnerability — DNS rebind + inherited pre-PR-#4226 SSRF; maintainer + Anthropic Security notified 2026-06-02 (embargo 2026-08-10)
2026-05-11 mcp-server-fetch D-001 against Claude Opus 4.7 Defense
2026-05-11 mcp-server-fetch D-006 against Claude Opus 4.7 Defense
2026-05-11 mcp-server-time S-003 static, 3 hits Info (pattern present; benign in this deployment)
2026-05-11 mcp-server-git D-002 (direct path-traversal) Defense (defense-in-depth example)
2026-05-11 mcp-server-aidd D-002 (direct path-traversal) Defense (allowed-directory containment working)
2026-05-11 mcp-server-aidd S-001 + S-002 + S-005 (static, multi-hit) Info (3 simultaneous rules — pattern stress test)

Each entry includes reproduction commands, the raw trace, an interpretation, caveats, and a disclosure recommendation. Two CVE-track classes are now in flight: SSRF in stdio fetch servers (2 packages, disclosed; 1 fix verified) and DNS rebinding in HTTP-transport servers (4 packages, all disclosed).

Quickstart — audit any pip-installable MCP server in one command

git clone https://github.com/desledishant10/mcp-witness && cd mcp-witness
pip install -e ".[dev]"
mcp-witness-audit mcp-server-fetch

That single command pip-installs mcp-server-fetch (or any other PyPI MCP server), captures its tools/list over stdio, runs the static analyzer rules, runs the capability classifier, and prints a human-readable report. Sample output against the official Anthropic reference server:

=== mcp-server-fetch ===
Launched: /usr/bin/python3 -m mcp_server_fetch
Tools:    1
  fetch
Capability tags: net_egress

2 analyzer findings:
  [HIGH    ] MCP-S-001  fetch
      Tool description contains instruction-like phrasing directed at the model.
  [HIGH    ] MCP-S-009  fetch
      Tool has URL parameter(s) ['url'] with no schema-level constraint
      and no validation keywords in the description.

Capture saved to: calibration/reports/captured-mcp-server-fetch.json

Two real findings on the official Anthropic reference server, surfaced from one mcp-witness-audit command — one at the description level (S-001 catches agent-directed instructions in the tool's docstring), one at the schema level (S-009 catches the missing SSRF allowlist). The SSRF one was demonstrated end-to-end on EC2 with real IAM credentials retrieved, disclosed via #4143, and fixed by PR #4226 (verified).

What's inside

CLIs

Command Purpose
mcp-witness-audit One-shot: install + capture + analyze + classify + report against any pip-installable MCP server
mcp-witness-capture Connect to any stdio MCP server, dump tools/list as JSON
mcp-witness-scaffold-gt Generate a calibration ground-truth skeleton from a capture
mcp-witness-analyze Static analysis — Python source or captured JSON
mcp-witness-classify Run the capability classifier on a tool definition
mcp-witness-eval-calibration Compare classifier predictions to hand-labeled ground truth
mcp-witness-lint-scenarios YAML lint for scenario files (catches null-byte smuggling, parse errors, schema violations)
mcp-witness-test Run a dynamic scenario against a real MCP server, optionally with a real LLM agent

Static analyzer rules (14 of 14 v0.1 rules implemented)

ID What it catches Mode
MCP-S-001 Imperative phrasing in tool description (you-must, now-you-can, grants-you, were-advised) per-tool, heuristic
MCP-S-002 One tool's description references another tool by name (poisoning vector) server-level, heuristic
MCP-S-003 Hidden instructions in schema sub-fields (parameter descriptions, titles, $comment) per-tool, heuristic
MCP-S-004 Tool annotation declares read-only / non-destructive but name/description indicates the opposite per-tool, heuristic
MCP-S-005 Overbroad capability surface (e.g. fs_read + net_egress = exfil pair) server-level, classifier-driven
MCP-S-006 Path traversal in file-handling tool per-tool, AST + taint
MCP-S-007 Shell command injection (subprocess(shell=True), os.system, os.popen) per-tool, AST
MCP-S-008 Database-query tool with no apparent input constraint (no parameterized-query mention, no schema pattern) per-tool, heuristic on tools/list
MCP-S-009 URL-fetching tool with no scheme/host allowlist (catches the SSRF class flagged dynamically by D-003) per-tool, heuristic on tools/list
MCP-S-010 Hardcoded API keys / tokens / PEM private keys / JWTs / .env files committed in source repo-level, regex + filename
MCP-S-011 Tool handler logs parameters, request data, headers, or env-derived secrets to stderr/stdout (debug-gated calls suppressed) per-tool, AST
MCP-S-012 RootsCapability referenced but list_roots() never called — declared containment guarantee not enforced repo-level, AST
MCP-S-013 Prompt template interpolates handler parameters into system/assistant-role messages without sanitization repo-level, AST + light taint
MCP-S-014 HTTP transport binds to loopback / 0.0.0.0 without Origin/Host validation (DNS rebinding); CORS allow_origins=['*'] + allow_credentials=True antipattern. v0.3 adds W1 (host-variable resolution), W2 (AST-based Origin validation check), W3 (aiohttp.web bind shapes), W4 (os.getenv default resolution). repo-level, AST

Every rule's lexicon decisions are commented with the calibration evidence that drove them. Spec for all 14 rules: docs/static-rules.md.

Dynamic scenarios (7 in the seed set)

Scenario YAML format: docs/scenario-schema.md. The seed set lives in scenarios/ and is described in scenarios/README.md. Highlights:

  • MCP-D-002 — path traversal in filesystem-read tools (direct probe, no LLM needed)
  • MCP-D-003 — SSRF in URL-fetching tools (direct probe; produces the cloud-metadata finding)
  • MCP-D-001 / MCP-D-006 — tool-description injection at two phrasing tiers (obvious vs subtle)
  • MCP-D-004 — tool-definition rug pull (mutate after first approval)
  • MCP-D-005 — invisible Unicode-tag-character injection via tool output
  • MCP-D-007 — cloud metadata-service exfiltration via URL-fetcher (strict oracle — only fires on actual metadata response content, designed for EC2/GCP/Azure audit verification)

Capability classifier

Layer 1 (lexical) classifier shared by analyzer rules and harness scenario filtering. 8 capability tags, 8 parameter roles, three-tier confidence (high / medium / low). Per the spec, promotion to "stable" requires ≥10-target corpus + ≥90% precision on high-confidence outputs.

Current calibration corpus state: 10 labeled targets, 81 tools, 100% precision and 100% recall on all four exercised capability tags (exec, fs_read, fs_write, net_egress). Verified by direct capture against 8 of 10 targets. Hit the spec's ≥10-target / ≥0.9-precision / ≥0.75-recall "stable" threshold. See calibration/README.md.

Dynamic harness

Two modes auto-selected per scenario:

  • Direct mode — agent-less, harness as MCP client. Used for server-side validation tests (path traversal, SSRF).
  • Proxy mode — harness mediates between an agent under test and the real target, applying mutations to tool descriptions and outputs in flight. Used for agent-side scenarios (description injection, rug pull, output injection).

Two agent driver implementations: stub (deterministic, plumbing tests) and anthropic (real Claude tool-use loop). See harness/README.md.

Architecture

mcp-witness/
  analyzer/      Static analysis: Python AST + captured-JSON modes, 14 rules
  classifier/    Capability tagger: 8 tags, 8 param roles, Layer 1 (lexical)
  harness/       Dynamic test runner: direct + proxy modes, stub + Claude agents
  scenarios/     Attack-scenario YAML library (7 scenarios)
  calibration/   Ground-truth corpus + eval driver + capture/scaffold tools
  findings/      Audit-trail record (12 entries + 1 class survey; append-only)
  disclosures/   Append-only log of outgoing coordinated-disclosure communications
  docs/          Specs: rules, scenarios, classifier, threat model

Roadmap and current state

Phase Planned window Status
1 — Static analyzer weeks 1–6 Complete. All 14 v0.1 rules implemented (S-001..S-014); v0.3 patches W1–W4 closed the DNS-rebind detector gap surfaced by the survey. Python AST + captured-JSON modes + repo-level scanning; CLI with severity filtering and CI-friendly exit codes.
2 — Dynamic harness weeks 7–14 Substantially complete: direct + proxy modes, two agent drivers, 7 scenarios runnable end-to-end against real servers.
3 — Real-world audit weeks 15–20 Substantially complete. 12 documented findings against 11 PyPI-published servers, plus a DNS-rebinding class survey. Two CVE-track classes: SSRF in fetch-family servers (2 packages disclosed, 1 fix shipped + independently verified) and DNS rebinding in HTTP-transport servers (4 packages, all under coordinated disclosure with 2026-08-10 embargo).
4 — Polish + publish weeks 21–26 In flight. Embargo-day blog draft in drafts/ (excluded from Pages indexing pre-embargo); EC2 audit runbook in docs/. PyPI release + conference submission queued.

Scope and non-goals

In scope for v1:

  • Static and dynamic vuln discovery in MCP server implementations
  • Agent-level attack scenarios against spec-conformant servers
  • Python servers (analyzer); any-language servers via captured tools/list JSON
  • stdio transport (harness)
  • MCP spec version 2025-06-18

Not yet implemented (planned):

  • TypeScript analyzer support (tree-sitter; would unlock 5 queued TS calibration targets)
  • SSE / Streamable HTTP transports in the harness
  • DNS / filesystem canaries (HTTP only today)

Out of scope for v1 (intentional — these are good follow-ups, not features):

  • Runtime "agent firewall" / behavioral detection / EDR-for-agents
  • Non-MCP agent frameworks (LangChain, AutoGPT, custom orchestrators)
  • Production WAF-style protection
  • Host-side hardening recommendations beyond what the scanner emits

Project metrics

Tests passing 164 / 164
Analyzer rules 14 of 14 (S-001..S-014) — v0.1 spec complete + v0.3 W1–W4 patches
Dynamic scenarios 7 (5 from v0.1 seed set + D-006 subtle-injection + D-007 cloud-metadata-exfil)
Calibration corpus 10 labeled targets, 81 tools, 100/100 precision-recall (8 verified by direct capture) — hit the spec's "stable" threshold
Real-world finding entries 12 + 1 class survey (6 vulnerabilities across 2 disclosure-track classes — SSRF + DNS rebinding; 4 defense; 2 informational)
Coordinated disclosures filed 6 (1 fix shipped + independently verified; 5 awaiting maintainer response with 2026-08-10 embargo)
Packages 5 (analyzer, classifier, harness, calibration + scenarios as YAML)
Console scripts 8

Running the test suite

pip install -e ".[dev]"
pytest

With coverage:

pytest --cov=analyzer --cov=classifier --cov=harness --cov=calibration --cov-report=term-missing

Lint + format:

ruff check .
ruff format --check .

Optional pre-commit hooks (one-time setup):

pip install pre-commit
pre-commit install

After install, every git commit runs ruff + a fast tests subset on push.

For dynamic scenarios that exercise a real LLM, install the optional Anthropic agent and set an API key:

pip install "mcp-witness[anthropic]"
export ANTHROPIC_API_KEY=sk-ant-...
mcp-witness-test scenarios/MCP-D-001-tool-desc-injection-fetch.yaml \
    --server-cmd python --server-arg=-m --server-arg=mcp_server_fetch \
    --agent anthropic

Methodology and threat model

Attack surface enumerated by MCP primitive (tools, resources, prompts, sampling, roots) plus transport and cross-cutting agent-level attacks. Every detection rule and every scenario carries a category field mapped back to this taxonomy. The categories and tag/role vocabularies live in:

Responsible disclosure

Findings against third-party servers follow coordinated disclosure: maintainers receive 90 days from notification before public release, extended if a fix is in active development. Reporters using mcp-witness are expected to follow the same practice. The disclosures/ directory documents every outgoing coordinated-disclosure communication, including channel-decision audit trails (e.g. when a public issue was the channel of last resort, when an intake automation deflected, when a follow-up was sent).

Policy + contact: SECURITY.md.

Contributing

Issue discussion and ruleset/scenario proposals are welcome. Highest-leverage places to contribute right now:

  1. Label more calibration targets. The corpus needs more labeled servers for the classifier to grow beyond the current stable threshold. The workflow is press-button (capturescaffold-gt → hand-label → eval-calibration). See calibration/README.md.
  2. Propose new analyzer rules following docs/static-rules.md §Rule lifecycle. Real-world evidence (from a captured server) is the bar for inclusion.
  3. Propose new dynamic scenarios following docs/scenario-schema.md. Same evidence bar.

License

Apache 2.0 — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mcp_witness-0.2.0.tar.gz (226.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mcp_witness-0.2.0-py3-none-any.whl (136.2 kB view details)

Uploaded Python 3

File details

Details for the file mcp_witness-0.2.0.tar.gz.

File metadata

  • Download URL: mcp_witness-0.2.0.tar.gz
  • Upload date:
  • Size: 226.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for mcp_witness-0.2.0.tar.gz
Algorithm Hash digest
SHA256 073c27011d7cab13bbd1851476649d5f0ac3f858d751af3ae05fb91632f47261
MD5 835ab05d008c5330988a4ab15855e6da
BLAKE2b-256 5195359465306372b6da1edb65da907db304792069efd532f685cd63c3954636

See more details on using hashes here.

File details

Details for the file mcp_witness-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: mcp_witness-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 136.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for mcp_witness-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 319583085ed8cd3517e351808e36fee97d026c6291b35263b916fa66ce9cff95
MD5 1b57edcf2970a7cc310e466380bc6b70
BLAKE2b-256 3bbc6505182eb006e27ef3660a720bc7437a14d3a40af3d6d2bec352abe2bc69

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page