redteam-mcp

MCP server security auditor — deterministic engine + AI-native behavioral analysis

Maintainers

m0rvayne

Project description

It doesn't tell you where your walls are thin. It walks through them.

I build MCP connectors and AI automation for businesses. 70+ connectors deployed across client projects. Some of them started acting up — dropping connections, config conflicts, servers I forgot to remove still sitting in config eating resources.

Went looking for something to audit this. Found mcp-scan — only reads tool descriptions, doesn't touch source code. Cisco's scanner — 78% false positives. Nothing that actually reads the server code and says "line 42, you have exec() with unsanitized input."

Built my own. Ran it on 106 public MCP servers. 7 had remote code execution. One of them had 25K GitHub stars.

Open-sourced because if my connectors had these problems, so do yours.

Two modes of operation:

Claude Code plugin — reads source code, probes tools, detects behavioral mismatches, maps cross-server attack chains. Interactive HTML report.
Standalone CLI — deterministic scan. 25 Semgrep rules, config health checks, SARIF output. Works in CI/CD without Claude.

What works today

Feature	Status	How
Config health scanner	Working	Dead servers, scope conflicts, credential exposure, supply chain, CVE checks
Semgrep code analysis	Working	25 rules (Python + JS/TS): injection, traversal, SSRF, eval, secrets, stdout
SARIF output	Working	GitHub Security tab integration
JSON output	Working	Machine-readable for CI/CD
Terminal output	Working	Rich colored tables with risk scores
CI exit codes	Working	`--fail-on critical` returns exit 1
LLM behavioral analysis	Working	Anthropic SDK, behavioral mismatch detection (optional)
Audit history	Working	JSONL log, cross-run comparison (new/confirmed/fixed)
Self-security audit	Working	10 vulnerabilities found and fixed in own code
Claude Code plugin	Working	AI-driven deep audit with HTML report
75+ tests	Passing	Unit, security, stress, edge cases, Hypothesis fuzzing
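The SARIF output listed in the table can be post-processed with a few lines of Python, for example to print a per-severity summary in a CI log. This is a minimal sketch that relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`); the function name `summarize_sarif`, the rule IDs, and the sample document are illustrative assumptions, not output captured from mcp-redteam.

```python
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count SARIF results by severity level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is treated here as "warning",
            # which is the usual SARIF 2.1.0 default.
            counts[result.get("level", "warning")] += 1
    return counts


# Hypothetical, hand-written SARIF document (rule IDs are made up for illustration).
example = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "mcp-redteam"}},
            "results": [
                {"ruleId": "shell-injection", "level": "error",
                 "message": {"text": "example finding"}},
                {"ruleId": "no-timeout-http",
                 "message": {"text": "example finding"}},
            ],
        }
    ],
}

print(dict(summarize_sarif(example)))
```

In a real pipeline the dictionary would come from `json.load()` on whatever file the scan wrote.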

What doesn't work yet

Cross-server chain detection in CLI (exists in Claude Code plugin only)
Auto-fix in CLI (exists in Claude Code plugin only)
HTML report generation in CLI
MCPTox benchmark validation
Community rule contributions

Install

Claude Code plugin (deep AI-native audit):

claude plugin marketplace add m0rvayne/mcp-redteam
claude plugin install mcp-redteam
/mcp-redteam

Standalone CLI (deterministic, CI/CD ready):

pip install redteam-mcp
mcp-redteam scan ./your-mcp-server --no-llm

Remote MCP server (via URL, OAuth or token):

mcp-redteam scan-remote https://your-server.com/mcp --token <bearer>

Requires Python 3.10+. Semgrep must be installed separately for code analysis: `pip install semgrep`.
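For pipelines driven from Python rather than a shell step, the standalone scan can be wrapped in a small gate. The sketch below only uses the `scan`, `--no-llm`, and `--fail-on critical` options mentioned in this README; combining them in one invocation is an assumption, and `build_scan_command` and `run_gate` are hypothetical helper names.

```python
import subprocess
from typing import Callable


def build_scan_command(target: str, fail_on: str = "critical") -> list[str]:
    """Build the CLI invocation; assumes --fail-on is accepted by `scan`."""
    return ["mcp-redteam", "scan", target, "--no-llm", "--fail-on", fail_on]


def run_gate(target: str, runner: Callable = subprocess.run) -> bool:
    """Return True when the scan exits with 0, i.e. the pipeline may continue."""
    completed = runner(build_scan_command(target), check=False)
    return completed.returncode == 0


# Usage (requires mcp-redteam on PATH):
#   import sys
#   sys.exit(0 if run_gate("./your-mcp-server") else 1)
```

The `runner` parameter exists only so the gate can be exercised in tests without the CLI installed.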

What it checks

Config Health (deterministic)

Dead/disconnected servers, scope conflicts (same server in multiple scopes), credentials in git-tracked config files (CVE-2025-59536), unpinned npx/uvx packages (supply chain), enableAllProjectMcpServers bypass (CVE-2026-21852), orphaned MCP processes.
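Two of the checks above, unpinned `npx`/`uvx` packages and the same server appearing in multiple scopes, are simple enough to illustrate. This is a rough sketch of the idea, not the tool's implementation: it assumes the common `mcpServers` config layout with `command` and `args`, treats only `name@x.y.z` as pinned, and the function names are hypothetical.

```python
import re

# Matches "name@1.2.3" or "@scope/name@1.2.3"; anything else counts as unpinned.
PINNED = re.compile(r"^(@[^/@\s]+/)?[^@\s]+@\d+\.\d+\.\d+\S*$")


def unpinned_packages(config: dict) -> list[str]:
    """Flag npx/uvx servers whose package spec has no exact version."""
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        if server.get("command") not in {"npx", "uvx"}:
            continue
        # Assumption: the first non-flag argument is the package spec.
        package = next(
            (a for a in server.get("args", []) if not a.startswith("-")), None
        )
        if package and not PINNED.match(package):
            findings.append(f"{name}: {package} is not pinned to an exact version")
    return findings


def scope_conflicts(scopes: dict[str, dict]) -> dict[str, list[str]]:
    """Return servers that are defined in more than one config scope."""
    seen: dict[str, list[str]] = {}
    for scope_name, config in scopes.items():
        for server in config.get("mcpServers", {}):
            seen.setdefault(server, []).append(scope_name)
    return {server: names for server, names in seen.items() if len(names) > 1}
```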

Code Security (Semgrep, 25 rules)

Rule	What it detects	Languages
Shell injection	subprocess + shell=True with user input	Python
Path traversal	open()/Path() without realpath check	Python, JS/TS
SSRF	HTTP requests with user-controlled URL	Python, JS/TS
Eval injection	eval()/exec()/new Function() with user input	Python, JS/TS
Hardcoded secrets	API keys, tokens, passwords in source	Python, JS/TS
Stdout pollution	print()/console.log() in stdio handlers	Python, JS/TS
Missing error handling	Tool functions without try/catch	Python, JS/TS
Credential in response	API keys/tokens in tool return values	Python, JS/TS
Missing signal handler	Server without SIGTERM/SIGINT	Python
Blocking sync calls	requests.get() inside async functions	Python
OAuth over-privilege	Excessive OAuth scopes (gmail.modify, admin)	Python
No timeout on HTTP	httpx/requests/fetch without timeout	Python, JS/TS
No timeout on subprocess	subprocess/spawn without timeout	Python, JS/TS
Dangerous parameter names	Tool params named cmd, exec, eval, code	JS/TS
Env secrets without rotation	API keys from os.getenv used directly	Python

Based on 48+ CVEs, OWASP MCP Top 10, and research from Invariant Labs, Trail of Bits, Palo Alto Unit 42, OX Security, and Snyk.
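To make the path traversal row concrete, here is one common way to add the kind of resolved-path check that rule looks for. It is a minimal sketch: `resolve_inside` is a hypothetical helper, and whether the Semgrep rule recognizes this exact pattern as a sanitizer is not something this README states.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Resolve user_path under base_dir and reject anything that escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # available since Python 3.9
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate


# Vulnerable pattern the rule targets: open(user_path) with no check at all.
# Safer pattern: open(resolve_inside("/srv/data", user_path))
```

An absolute `user_path` replaces the base when joined, so it fails the same check and is rejected as well.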

LLM Behavioral Analysis (optional, requires API key)

Behavioral mismatch: tool description claims X, code does Y
Hidden operations: undeclared network requests, file writes, subprocess calls
Credential mishandling: secrets logged, leaked in errors, stored insecurely

How it compares

	mcp-scan (Snyk)	Cisco MCP Scanner	mcp-redteam
Approach	Static description scan	YARA + LLM-as-judge	Semgrep taint + LLM behavioral
Reads source code	No	Python only	Yes — Python + JS/TS
Config validation	No	Config discovery	Yes — 6 checks, CVE detection
Behavioral mismatch	No	No	Yes (LLM layer)
SARIF output	No	No	Yes
CI exit codes	Yes	No	Yes
Self-tested	Unknown	Unknown	75+ tests, self-security audit
Cloud dependency	Snyk API required	Cisco API (optional)	No — fully local

Why not just use mcp-scan?

mcp-scan reads what a server says about itself — tool descriptions. mcp-redteam checks what a server actually does — source code analysis + behavioral analysis.

A server with clean descriptions but leaky code: mcp-scan passes it. We catch it.

Real findings mcp-scan cannot detect (they live in code, not descriptions):

Trello API keys in .env committed to git
Instagram session cookies stored in plaintext
AppleScript injection via unescaped clipboard input
Google OAuth token files with mode 644 (readable by every local user)
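The file-permission finding in the list above is easy to reproduce by hand. A minimal sketch, assuming a POSIX system; `is_group_or_world_accessible` is a hypothetical helper, not part of the CLI.

```python
import os
import stat


def is_group_or_world_accessible(path: str) -> bool:
    """True if anyone other than the owner has any permission on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other permission bits; 0o600 leaves them all clear.
    return bool(mode & 0o077)
```

A token file at mode 644 returns `True` here; after `chmod 600` it returns `False`.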

Audit History

Each audit saves a compact JSONL log to ~/Desktop/redteam-results/. On the next run, mcp-redteam reads previous results and compares:

confirmed — found again, higher confidence
new — first time seeing this
fixed — was in previous audit, now gone

This turns LLM non-determinism into an advantage: each run is a new perspective, and findings that recur across runs are the ones to trust most.
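The three labels reduce to set operations over a stable key per finding. The sketch below assumes a JSONL record with `server`, `rule`, and `location` fields; those field names are hypothetical, since the actual log schema is not documented here.

```python
import json


def load_findings(jsonl_text: str) -> set[tuple[str, str, str]]:
    """Parse one JSON object per line into a set of finding keys."""
    keys = set()
    for line in jsonl_text.splitlines():
        if line.strip():
            record = json.loads(line)
            keys.add((record["server"], record["rule"], record["location"]))
    return keys


def compare_runs(previous: set, current: set) -> dict[str, set]:
    """Classify findings as confirmed, new, or fixed between two audits."""
    return {
        "confirmed": previous & current,  # found again
        "new": current - previous,        # first time seen
        "fixed": previous - current,      # present before, gone now
    }
```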

Architecture

 /mcp-redteam
      |
 +-----------------+
 | Phase 0: Config |
 +-----------------+
      |
 +-----------+
 | Discovery |
 +-----------+
      |
      |   1 server = 1 agent
      |
 +----------+ +----------+ +----------+ +----------+
 | Agent-01 | | Agent-02 | | Agent-03 | | Agent-N  |
 | youtube  | | trello   | | instagram| | server-N |
 | health   | | health   | | health   | | health   |
 | arch     | | arch     | | arch     | | arch     |
 | complete | | complete | | complete | | complete |
 | security | | security | | security | | security |
 +----+-----+ +----+-----+ +----+-----+ +----+-----+
      |            |            |            |
      +------+-----+-----+------+
             |
 +-------------------------+
 | Chain analysis + report |
 +-------------------------+
             |
    +----------------+
    | HTML + Fix     |
    +----------------+
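The "1 server = 1 agent" fan-out followed by a single chain-analysis step maps naturally onto `asyncio.gather`. This is only a structural sketch: the real plugin dispatches Claude Code agents, and `audit_server` and `run_audit` are hypothetical stand-ins that do no actual auditing.

```python
import asyncio

# Phase labels taken from the diagram above.
PHASES = ["health", "arch", "complete", "security"]


async def audit_server(name: str) -> dict:
    """Stand-in for one per-server agent running all four phases."""
    await asyncio.sleep(0)  # placeholder for real work
    return {"server": name, "phases": list(PHASES)}


async def run_audit(servers: list[str]) -> dict:
    """Fan out one task per server, then combine the reports in one step."""
    reports = await asyncio.gather(*(audit_server(s) for s in servers))
    # Cross-server chain analysis would consume all reports here.
    return {"servers": list(reports), "chains": []}
```

`asyncio.gather` returns results in the order the tasks were passed in, so the combined report keeps the discovery order.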

Tests

75+ tests across 6 test files:

test_semgrep.py — each vulnerable fixture detected, each benign fixture clean
test_self_security.py — 21 tests: our own code audited for vulnerabilities
test_stress.py — 1000/10000 findings, concurrent scans, unicode
test_fuzzing.py — Hypothesis property-based: any input, no crash
test_edge_cases.py — corrupt JSON, missing files, null bytes, timeouts
test_models.py + test_formatters.py — unit tests for core logic

Current Limitations

Plugin requires Claude Code with connected MCP servers
CLI requires semgrep for code analysis (graceful skip if not installed)
LLM analysis requires ANTHROPIC_API_KEY
Destructive tests intentionally skipped — read-only probing only
Source code analysis works for local servers; for servers installed as pip/npm packages, access to the source may be limited
Plugin report quality scales with model capability (Opus > Sonnet > Haiku)
False positive rate not yet measured on production MCP servers

Known False Positive Patterns

SSRF rule triggers on httpx.get() when the URL is built from config rather than user input
Path traversal rule triggers on open() when the path is validated but the validation is not recognized as a sanitizer
Stdout pollution rule flags print() in a __main__ block (safe, because it is not inside an MCP handler)
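For the stdout pattern, the usual way to avoid both the real problem and the false positive is to send diagnostics to stderr, since a stdio-transport server uses stdout for protocol messages. A minimal sketch using only the standard library; `make_stderr_logger` and the logger name are illustrative.

```python
import logging
import sys


def make_stderr_logger(name: str = "mcp-server") -> logging.Logger:
    """Return a logger that writes to stderr and never touches stdout."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


# Inside a tool handler, instead of print("fetching", url):
#   log = make_stderr_logger()
#   log.info("fetching %s", url)
```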

Docs

The docs/ folder is useful independently:

attack-playbook.md — 18 attack categories, 48+ CVEs, payloads and detection methods
best-practices.md — MCP server security checklist
reference-server.md — secure server templates (Python + Node.js)

License

MIT

Release history

This version

0.3.0

Jun 13, 2026

0.2.0

Jun 11, 2026

Download files

Source Distribution

redteam_mcp-0.3.0.tar.gz (33.9 kB view details)

Uploaded Jun 13, 2026 Source

Built Distribution

redteam_mcp-0.3.0-py3-none-any.whl (52.6 kB view details)

Uploaded Jun 13, 2026 Python 3

File details

Details for the file redteam_mcp-0.3.0.tar.gz.

File metadata

Download URL: redteam_mcp-0.3.0.tar.gz
Upload date: Jun 13, 2026
Size: 33.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for redteam_mcp-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`25fa1001a6b7de9b8ab5a9b59459d722b696aa9059d8c8a3b6a747d550a0189b`
MD5	`a388b1c1833b0335ca4e667f3d69f1d8`
BLAKE2b-256	`d4da1b5bdb36adcb4ab13278b1ed6e07fa11417434f9674920649b5cac1f73cd`

Provenance

The following attestation bundles were made for redteam_mcp-0.3.0.tar.gz:

Publisher: publish.yml on m0rvayne/mcp-redteam

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: redteam_mcp-0.3.0.tar.gz
- Subject digest: 25fa1001a6b7de9b8ab5a9b59459d722b696aa9059d8c8a3b6a747d550a0189b
- Sigstore transparency entry: 1808002823
- Sigstore integration time: Jun 13, 2026
Source repository:
- Permalink: m0rvayne/mcp-redteam@dc7d373cec6554f8ebbc00b9a24af6ed6851593a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/m0rvayne
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@dc7d373cec6554f8ebbc00b9a24af6ed6851593a
- Trigger Event: push

File details

Details for the file redteam_mcp-0.3.0-py3-none-any.whl.

File metadata

Download URL: redteam_mcp-0.3.0-py3-none-any.whl
Upload date: Jun 13, 2026
Size: 52.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for redteam_mcp-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b3757eb604046adcfa282e81800cd0d589bb5451fbf9da246df87b0cfef4b553`
MD5	`b30ee0e17a218d82a3e19f45e3203fc9`
BLAKE2b-256	`760639aef7bc0fb99dbd0b7dfe8f6ad3e93e82c3064fb6b2bb54315faec4b46e`

Provenance

The following attestation bundles were made for redteam_mcp-0.3.0-py3-none-any.whl:

Publisher: publish.yml on m0rvayne/mcp-redteam

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: redteam_mcp-0.3.0-py3-none-any.whl
- Subject digest: b3757eb604046adcfa282e81800cd0d589bb5451fbf9da246df87b0cfef4b553
- Sigstore transparency entry: 1808002879
- Sigstore integration time: Jun 13, 2026
Source repository:
- Permalink: m0rvayne/mcp-redteam@dc7d373cec6554f8ebbc00b9a24af6ed6851593a
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/m0rvayne
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@dc7d373cec6554f8ebbc00b9a24af6ed6851593a
- Trigger Event: push
