MCP proxy server with behavioral profiling, security scanning, risk gating, and safe execution
Project description
MCP safety warden is a proxy server that wraps any MCP server and adds behavioral profiling, security scanning, risk gating, and safe execution to its tools.
Overview
Most MCP servers expose tools with no information about what those tools actually do at runtime: whether they write data, call external services, delete things, or produce outputs that contain adversarial content.
Instead of calling a wrapped server's tools directly, you route calls through this wrapper. It classifies each tool, builds a behavior profile from observed runs, checks for injection attacks, and blocks or gates risky tools before they execute.
Behavioral profiling
- Static classification of effect class (read_only, additive_write, mutating_write, external_action, destructive), retry safety, and destructiveness.
- LLM-assisted classification via Anthropic, OpenAI, Gemini, or Ollama - LLM and rule-based signals are combined via weighted voting, producing higher confidence across all tools.
- Observed stats updated after every proxied call: p50/p95 latency, failure rate, output size, schema stability.
Security scanning
- mcpsafety+ five-stage pipeline: Recon, Planner, Hacker (live probing), Auditor (CVE/Arxiv research), Supervisor (final report). Enhanced over mcpsafetyscanner (Radosevich & Halloran, arxiv 2504.03767).
- LLM provider choice for mcpsafety+: Anthropic, OpenAI, Gemini, or Ollama (local, no API key).
- Multi-server scan: run the full pipeline against every registered server in one call via
scan_all_servers. - Cisco AI Defense: AST and taint analysis, YARA rules, optional cloud ML engine.
- Snyk: prompt injection, tool shadowing, toxic data flows, hardcoded secrets.
- Kali MCP integration: if a Kali Linux MCP server is registered,
quick_scan,vulnerability_scan, andtracerouterun against the target host at the start of the pipeline. The results are embedded in the Recon output so the Planner can ground its attack hypotheses in real port and service data rather than guessing from tool schemas alone. - Burp Suite MCP integration: if a Burp Suite MCP server is registered, the Hacker stage sends raw HTTP/1.1 probes directly to the MCP endpoint (malformed JSON, missing headers, oversized payloads), triggers Collaborator out-of-band payloads to detect blind SSRF (Pro edition), and pulls automated scanner findings (Pro edition). Proxy history feeds the Auditor as raw evidence. Community edition tools run automatically; Pro-only tools are tried and silently skipped if unavailable.
- All findings stored and surfaced automatically in subsequent preflight assessments.
Safe execution
- Argument scanning on every tool call: 20+ attack categories (SSRF, SQL/NoSQL/LDAP/XPath injection, command injection, path traversal, XXE, template injection, prompt injection, deserialization payloads, base64-encoded variants, Windows-specific paths). When an LLM key is set, flagged args get a second-pass LLM verification to clear false positives.
- Two-layer injection scanning on every tool output: 40+ regex patterns then LLM deep scan.
- Injection-flagged output is quarantined and never returned to the caller.
- Risk gating with per-tool permanent policies (allow/block) or per-call approval flow.
- Alternatives suggestion: when a tool is blocked, the LLM ranks safer substitutes by risk reduction and functional coverage.
CLI
- 16 subcommands covering all 17 MCP tools (
listcovers bothlist_serversandlist_server_tools). - Interactive risk menu for
call: pick an alternative, approve the original, or abort. scan-allruns the full pentest pipeline across all registered servers in one command.--jsonflag on every command for scripting and pipelines.--yes/-yflag on confirmation prompts for CI use.
Transport
- stdio (default), SSE, and streamable_http.
- Bearer token auth middleware for HTTP transports.
Use it when you need to audit what third-party or internal MCP tools actually do before trusting them in an agent workflow.
Architecture
MCP Client (Claude Desktop, agent, mcpsafetywarden CLI)
|
v
mcpsafetywarden/server.py (FastMCP, 17 tools, rate limiting, bearer auth)
|
+---> mcpsafetywarden/client_manager.py (connects to wrapped servers, records telemetry, injection scan)
|
+---> mcpsafetywarden/database.py (SQLite: servers, tools, runs, profiles, scans, policies)
|
+---> mcpsafetywarden/classifier.py (rule-based + LLM tool classification)
|
+---> mcpsafetywarden/profiler.py (computes behavior profiles from run history)
|
+---> mcpsafetywarden/scanner.py (LLM, Cisco, Snyk scan orchestration)
|
+---> mcpsafetywarden/mcpsafety_scanner.py (five-stage pentest pipeline)
|
+---> mcpsafetywarden/security_utils.py (redaction, normalisation, injection detection helpers)
mcpsafetywarden/cli.py imports from mcpsafetywarden/server.py and mcpsafetywarden/database.py directly. It does not use the MCP protocol; it calls the same Python functions that the MCP tools call, which means no network hop for CLI usage.
Request flow for safe_tool_call:
- Lookup tool record and behavior profile in SQLite.
- Check permanent policy (allow/block).
- Run
_preflight_assessment: compute risk level from profile and latest security scan findings. - If low or medium-low risk: scan args for threats -> forward call to wrapped server via
client_manager-> scan output -> record telemetry -> return result. - If medium/high risk and not approved: fetch LLM-ranked alternatives, return blocked response with numbered menu.
- If approved or alternative selected: scan args for threats -> execute -> scan output -> record telemetry -> return result.
Prerequisites
- Python 3.10 or later.
pipfor dependency installation.- At least one wrapped MCP server to proxy (stdio subprocess, SSE endpoint, or streamable_http endpoint).
- Recommended: an API key for at least one LLM provider (Anthropic, OpenAI, Gemini, or a local Ollama instance).
Why an LLM key matters:
The wrapper has two operating modes depending on whether an LLM is available:
| Capability | Without LLM key | With LLM key |
|---|---|---|
| Tool classification | Rule-based heuristics only - low confidence on ambiguous tool names | LLM resolves ambiguous cases; higher confidence across the board |
| Injection scanning | Regex patterns only (40+ rules) | Regex + LLM deep scan - catches obfuscated and novel injections |
| Risk gate alternatives | None - gate shows "More options" only | LLM ranks safer substitute tools by risk reduction and functional coverage |
| Security scanning | Snyk and Cisco only (metadata/static analysis, no LLM needed) | Full 5-stage pentest: Recon, Planner, Hacker, Auditor, Supervisor |
Set at minimum ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY before starting the server. For a fully local setup with no API keys, run Ollama and set OLLAMA_MODEL - then pass --provider ollama (or scan_provider="ollama") explicitly on every command, as Ollama is not auto-detected from environment variables.
Installation
git clone <YOUR_REPO_URL>
cd mcpsafetywarden
pip install .
With all optional LLM providers and scanners:
pip install ".[all]"
Or pick specific extras:
pip install ".[anthropic,snyk]"
Verify the install:
mcpsafetywarden --help
mcpsafetywarden-server --help
The SQLite database is created automatically on first run in the platform user data directory (e.g. ~/.local/share/mcpsafetywarden/ on Linux, ~/Library/Application Support/mcpsafetywarden/ on macOS, %APPDATA%\mcpsafetywarden\ on Windows). Set MCP_DB_PATH to override the location.
Optional: at-rest encryption for stored credentials
The wrapper stores server env vars and HTTP headers in the database. To encrypt them at rest:
pip install cryptography
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
Set the printed key as MCP_DB_ENCRYPTION_KEY before starting the server. Keep this key safe; losing it makes stored credentials unrecoverable.
Configuration
All configuration is via environment variables. No config file is required.
| Variable | Default | Purpose |
|---|---|---|
MCP_TRANSPORT |
stdio |
Transport mode: stdio, sse, or streamable_http |
MCP_HOST |
127.0.0.1 |
Bind address for HTTP transports |
MCP_PORT |
8000 |
Bind port for HTTP transports |
MCP_AUTH_TOKEN |
(unset) | Bearer token for HTTP transport auth. Unset means no auth (log warning is emitted). |
MCP_DB_ENCRYPTION_KEY |
(unset) | Fernet key to encrypt env_json and headers_json at rest |
ANTHROPIC_API_KEY |
(unset) | Enables Anthropic as LLM provider for classification and scanning |
OPENAI_API_KEY |
(unset) | Enables OpenAI as LLM provider |
GEMINI_API_KEY |
(unset) | Enables Gemini as LLM provider |
GOOGLE_API_KEY |
(unset) | Legacy alias for GEMINI_API_KEY |
OLLAMA_MODEL |
(unset) | Model name for Ollama provider (e.g. llama3.1, mistral) |
OLLAMA_BASE_URL |
http://localhost:11434/v1 |
Ollama API base URL (OpenAI-compatible) |
SNYK_TOKEN |
(unset) | Enables Snyk E001 prompt-injection detection |
MCP_SCANNER_API_KEY |
(unset) | Cisco AI Defense API key for cloud ML engine |
MCP_SCANNER_LLM_API_KEY |
(unset) | LLM key for Cisco internal AST analysis (falls back to OPENAI_API_KEY) |
MCP_DB_PATH |
(unset) | Override the SQLite database file path |
Example .env for local development:
MCP_TRANSPORT=stdio
ANTHROPIC_API_KEY=sk-ant-...
MCP_DB_ENCRYPTION_KEY=<generated_fernet_key>
Security note: Never commit API keys or the encryption key to version control. Pass them via environment variables or a secrets manager. The wrapper strips its own secrets (MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, and all LLM/scanner API keys) from the child process environment before spawning stdio servers. Other variables present in the parent environment are passed through.
Auxiliary Security Tool Integrations
The wrapper detects Kali and Burp by looking for registered servers whose server_id contains "kali" or "burp" (case-insensitive). Registration is the only setup step - once registered, the tools activate automatically on every scan, ping, and replay test.
Kali Linux MCP (ccq1/awsome_kali_MCPServers)
Docker-based, Apache 2.0, no auth. Adds real network reconnaissance to the Recon stage and network data to ping_server.
What it contributes:
| Pipeline stage / tool | Kali tools called | What it adds |
|---|---|---|
| Recon (before Planner) | quick_scan(target), vulnerability_scan(target), traceroute(target) |
Open ports, running services, OS fingerprint, network path - Planner uses this to craft targeted hypotheses |
ping_server |
quick_scan(target), traceroute(target) |
Network reachability detail beyond the MCP protocol ping (sse/streamable_http only — no network target for stdio) |
Setup:
# 1. Install Docker Desktop (if not already installed)
# Windows: winget install Docker.DockerDesktop
# macOS: brew install --cask docker
# Linux: https://docs.docker.com/engine/install/
# 2. Clone and build the image
git clone https://github.com/ccq1/awsome_kali_MCPServers
cd awsome_kali_MCPServers
docker build -t kali-mcps:latest .
# 3. Register with the wrapper (server_id must contain "kali")
mcpsafetywarden register kali-mcp \
--transport stdio \
--command docker \
--args '["run", "-i", "kali-mcps:latest"]'
Note: vulnerability_scan runs nmap vuln scripts which can take 60-90 seconds per target. On scan-all across many servers this adds up. Register only when you want network recon in your scans.
Burp Suite MCP (PortSwigger/mcp-server)
Kotlin, GPL-3.0, no auth, runs as an SSE server on port 9876. Community edition tools run always; Pro-only tools (Collaborator, scanner) are tried and silently skipped on failure.
What it contributes:
| Pipeline stage / tool | Burp tools called | Edition | What it adds |
|---|---|---|---|
| Hacker (after LLM probing) | SendHttp1Request x3 |
Community | Raw HTTP probes: malformed JSON body, missing Content-Type, oversized method field |
| Hacker | GenerateCollaboratorPayload, GetCollaboratorInteractions |
Pro | Out-of-band DNS/HTTP callbacks - detects blind SSRF and blind injection |
| Hacker | GetScannerIssues |
Pro | Automated active scanner findings against the MCP endpoint |
| Auditor | GetProxyHttpHistoryRegex |
Community | Raw HTTP traffic evidence for every finding the Auditor validates |
run_replay_test |
GetProxyHttpHistoryRegex |
Community | HTTP traffic captured during both tool calls, appended to the replay result |
Setup:
# 1. Install Burp Suite (Community or Professional)
# Download from https://portswigger.net/burp/releases
# 2. Build the MCP extension JAR
git clone https://github.com/PortSwigger/mcp-server.git
cd mcp-server
./gradlew embedProxyJar
# produces build/libs/burp-mcp-all.jar
# 3. Load into Burp
# Burp -> Extensions -> Add -> Java type -> select burp-mcp-all.jar
# Then go to the "MCP" tab in Burp and enable the server.
# SSE endpoint starts at http://127.0.0.1:9876/sse
# 4. Register with the wrapper (server_id must contain "burp")
mcpsafetywarden register burp-mcp \
--transport sse \
--url http://127.0.0.1:9876/sse
Snyk (snyk-agent-scan)
Python, Apache 2.0, requires a free Snyk account token. Connects to the target MCP server, lists its tools, and runs static analysis on the tool metadata (names, descriptions, schemas). It does not call any tools - it only reads what the server advertises.
What it checks:
| Code | Severity | Check |
|---|---|---|
| E001 | HIGH | Prompt injection strings in tool descriptions or schemas |
| E002 | HIGH | Tool shadowing (a tool impersonates another) |
| E004 | HIGH | Prompt injection embedded in skill definitions |
| E005 | HIGH | Suspicious download URLs in tool metadata |
| E006 | HIGH | Malicious code patterns in descriptions |
| W007 | HIGH | Insecure credential handling patterns |
| W008 | HIGH | Hardcoded secrets in tool metadata |
| W009 | MEDIUM | Direct financial execution capabilities |
| W011 | MEDIUM | Untrusted third-party content references |
| W012 | HIGH | Unverifiable external dependencies |
| W013 | MEDIUM | System service modification capabilities |
| W015 | MEDIUM | Untrusted content flows |
| W017 | MEDIUM | Sensitive data exposure patterns |
| W019 | MEDIUM | Destructive capabilities |
| W001 | LOW | Suspicious words |
| W014 | LOW | Missing skill documentation |
| W016 | LOW | Potential untrusted content |
| W018 | LOW | Workspace data exposure |
| W020 | LOW | Local destructive capabilities |
E001 (prompt injection) requires a Snyk token for Snyk's AI-based detection. All other checks run with the token present but also degrade gracefully if the token is invalid - structural and pattern-based checks are fully offline.
How it runs:
Snyk is invoked as a subprocess (snyk-agent-scan) with a temporary config JSON pointing at the target server. The binary opens its own live MCP connection, fetches the tool list, analyzes the metadata, and returns JSON findings. The wrapper normalizes these into its common findings format and stores them in the database, where they are automatically included in future preflight_tool_call responses.
Setup:
pip install snyk-agent-scan
Get a free token at app.snyk.io/account. Set it as an environment variable:
export SNYK_TOKEN=snyk_uat.<your_token>
Or pass it directly on the scan command:
mcpsafetywarden scan my-server --provider snyk --api-key snyk_uat.<your_token> --yes
Unlike Kali and Burp, Snyk is not auto-activated on every scan - it only runs when explicitly chosen as the provider via --provider snyk or provider="snyk".
CLI Reference
Global flags
All commands support --json for machine-readable output. Commands with confirmation prompts support --yes / -y to skip them.
Typical workflow
# Register, inspect, and scan a local stdio server in one step
mcpsafetywarden onboard my-server \
--transport stdio \
--command python \
--args '["my_mcp_server.py"]' \
--scan-provider anthropic
# Check what tools were discovered
mcpsafetywarden list my-server
# Execute a tool safely
mcpsafetywarden call my-server read_file --args '{"path": "/tmp/data.txt"}'
# Execute a risky tool (interactive menu appears if blocked)
mcpsafetywarden call my-server delete_file --args '{"path": "/tmp/old.txt"}'
call interactive flow when a tool is blocked:
⚠ Blocked risk: HIGH
1. list_files -- reduction: HIGH coverage: partial
2. More options
Pick: 2
B. Proceed with original tool despite risk
C. Abort
Pick [B/b/C/c]: B
✓ 142ms [explicit_approval]
To bypass the menu in scripts, pass --approved:
mcpsafetywarden call my-server delete_file \
--args '{"path": "/tmp/old.txt"}' \
--approved
Commands
list [server_id]
List all registered servers. Pass server_id to list tools on a specific server.
mcpsafetywarden list
mcpsafetywarden list my-server
mcpsafetywarden list my-server --json
onboard <server_id>
Register + inspect + security scan in one call. Prompts for authorization before scanning unless --yes is passed.
mcpsafetywarden onboard my-server --transport stdio --command python --args '["server.py"]'
mcpsafetywarden onboard my-server --transport streamable_http --url https://mcp.example.com/mcp \
--headers '{"Authorization": "Bearer TOKEN"}' \
--scan-provider anthropic --scan-model claude-opus-4-7 --scan-api-key sk-ant-... --yes
register <server_id>
Register only, without scanning.
mcpsafetywarden register my-server --transport stdio --command python --args '["server.py"]'
mcpsafetywarden register my-server --transport stdio --command python --no-inspect
mcpsafetywarden register my-server --transport stdio --command python --args '["server.py"]' --provider anthropic
inspect <server_id>
Reconnect to a registered server, refresh tools, re-classify.
mcpsafetywarden inspect my-server --provider anthropic
mcpsafetywarden inspect my-server --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
scan <server_id>
Run a security scan against a single server. Prompts for authorization before probing.
anthropic,openai,gemini,ollama- mcpsafety+ 5-stage pipeline (Recon -> Planner -> Hacker -> Auditor -> Supervisor)cisco- Cisco AI Defense: AST taint analysis, YARA rules, optional cloud ML enginesnyk- Snyk: prompt injection, tool shadowing, toxic data flows, hardcoded secrets
For Ollama set OLLAMA_MODEL before running. Web research (DuckDuckGo/HackerNews/Arxiv CVE lookup in the Auditor stage) is skipped by default to avoid leaking findings externally; pass --web-research to enable it.
If a Kali MCP server is registered, nmap and traceroute results are shown after the findings table and included in --json output under network_scan. If a Burp Suite MCP server is registered, the number of HTTP-layer findings Burp contributed is shown as a summary line; use --json for the full evidence.
mcpsafetywarden scan my-server --provider anthropic
mcpsafetywarden scan my-server --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
mcpsafetywarden scan my-server --provider ollama # local model, no API key
mcpsafetywarden scan my-server --provider cisco
mcpsafetywarden scan my-server --provider anthropic --web-research --destructive --timeout 600 --yes
scan-all
Run the full 5-stage mcpsafety+ pipeline against every registered server (or a comma-separated subset via --servers). Results are stored per server and displayed as a combined risk table. Only mcpsafety+ providers are supported (not cisco or snyk). Web research is skipped by default; pass --web-research to enable.
mcpsafetywarden scan-all --provider anthropic
mcpsafetywarden scan-all --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
mcpsafetywarden scan-all --provider ollama --servers my-server,other-server --yes
mcpsafetywarden scan-all --provider openai --web-research --timeout 600 --json
call <server_id> <tool_name>
Execute a tool through the risk gate. Interactive menu appears if the tool is blocked.
Every argument value is scanned for 20+ attack categories (SSRF, SQL/NoSQL/LDAP/XPath injection, command injection, path traversal, XXE, prompt injection, deserialization payloads, base64-encoded variants, and more) before the call is forwarded. If an LLM key is set, a second-pass LLM verification runs on flagged args to clear false positives. Without an LLM key, the CLI prompts you to confirm before proceeding.
mcpsafetywarden call my-server search_web --args '{"query": "site:example.com"}'
mcpsafetywarden call my-server delete_file --args '{"path": "/tmp/x"}' --approved
mcpsafetywarden call my-server run_query --args '{"sql": "SELECT id FROM users"}' --args-scan-override
| Flag | Effect |
|---|---|
--approved |
Bypass the risk gate for a high-risk tool you have reviewed |
--args-scan-override |
Skip argument safety scanning (use only when you trust the args) |
--provider |
LLM provider for alternatives and arg verification (anthropic|openai|gemini|ollama) |
preflight <server_id> <tool_name>
Assess risk without executing.
mcpsafetywarden preflight my-server delete_file
mcpsafetywarden preflight my-server delete_file --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
profile <server_id> <tool_name>
Print the full behavior profile.
mcpsafetywarden profile my-server read_file --json
retry-policy <server_id> <tool_name>
Print retry and timeout recommendations.
mcpsafetywarden retry-policy my-server call_api
mcpsafetywarden retry-policy my-server call_api --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
alternatives <server_id> <tool_name>
List safer alternatives to a tool.
mcpsafetywarden alternatives my-server delete_file --provider anthropic
replay <server_id> <tool_name>
Run the tool twice and compare outputs. Prompts for confirmation. If a Burp Suite MCP server is registered, Burp proxy traffic captured during both calls is appended to the result - useful for spotting network-level differences even when output text is identical.
mcpsafetywarden replay my-server get_status --args '{"id": "123"}' --yes
policy <server_id> <tool_name>
Read or set a permanent execution policy. Without --set, prints the current policy.
By default no policy is set and safe_tool_call decides at runtime based on the behavior profile: low or medium-low risk tools run immediately, medium/high-risk tools trigger the approval gate. Setting a policy overrides that completely - allow bypasses the risk gate (argument scanning still runs unless --args-scan-override is also passed), block rejects unconditionally.
mcpsafetywarden policy my-server read_file # read current policy
mcpsafetywarden policy my-server read_file --set allow # always execute without preflight
mcpsafetywarden policy my-server drop_table --set block # never execute
mcpsafetywarden policy my-server read_file --set clear # remove policy, resume normal flow
history <server_id> <tool_name>
Show recent execution history.
mcpsafetywarden history my-server delete_file --limit 50
ping <server_id>
Check if a server is reachable. If a Kali MCP server is registered and the pinged server uses the sse or streamable_http transport, also runs quick_scan and traceroute against the target host and displays the output in labeled panels. Stdio servers have no network address to scan so Kali recon is skipped.
mcpsafetywarden ping my-server
get-scan <server_id>
Print the latest stored security scan report.
mcpsafetywarden get-scan my-server --json
Exit codes:
0: success1: error (tool not found, blocked by policy, unreachable server, invalid input)
MCP Integration
Connecting with Claude Desktop
Add the wrapper to claude_desktop_config.json:
{
"mcpServers": {
"mcpsafetywarden": {
"command": "mcpsafetywarden-server",
"args": [],
"env": {
"ANTHROPIC_API_KEY": "sk-ant-...",
"MCP_DB_ENCRYPTION_KEY": "<generated_fernet_key>"
}
},
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"]
},
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
}
}
}
}
The wrapper and the servers it proxies are registered separately in Claude Desktop. Claude sees all of them - but you route calls through mcpsafetywarden (using safe_tool_call, preflight_tool_call, etc.) instead of calling filesystem or github directly. First register each server with the wrapper:
mcpsafetywarden register filesystem --transport stdio \
--command npx \
--args '["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"]'
mcpsafetywarden register github --transport stdio \
--command npx \
--args '["-y", "@modelcontextprotocol/server-github"]' \
--env '{"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}'
Using the wrapper as a mandatory gateway for all tool calls
Instead of adding every MCP server to
claude_desktop_config.json, you can add only the wrapper and register all other servers inside it. Claude then has no direct path to any underlying server - every tool call must go throughsafe_tool_call, making the wrapper a mandatory enforcement point for risk gating, arg scanning, and output inspection across your entire MCP setup.
claude_desktop_config.json- wrapper only:{ "mcpServers": { "mcpsafetywarden": { "command": "mcpsafetywarden-server", "args": [], "env": { "ANTHROPIC_API_KEY": "sk-ant-..." } } } }Register your servers once via CLI before starting Claude Desktop:
mcpsafetywarden register github --transport stdio \ --command npx \ --args '["-y", "@modelcontextprotocol/server-github"]' \ --env '{"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}' mcpsafetywarden register slack --transport stdio \ --command npx \ --args '["-y", "@modelcontextprotocol/server-slack"]' \ --env '{"SLACK_BOT_TOKEN": "xoxb-..."}'Claude sees only the wrapper's 17 tools. To use github or slack it must call
safe_tool_call(server_id="github", ...)- there is no other route. Registration is enforced becausesafe_tool_callrejects anyserver_idthat is not registered.
Field notes:
| Field | Required | Notes |
|---|---|---|
command |
Yes | mcpsafetywarden-server after pip install. |
ANTHROPIC_API_KEY |
Strongly recommended | Enables LLM classification, deep injection scanning, risk gate alternatives, and the full mcpsafety+ pentest pipeline. Use OPENAI_API_KEY or GEMINI_API_KEY instead if preferred. Without any key the wrapper operates in rule-based-only mode - see Prerequisites. |
MCP_DB_ENCRYPTION_KEY |
Recommended | Encrypts stored server credentials (env vars, headers) at rest. Generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())" |
MCP_TRANSPORT |
No | Defaults to stdio. Leave as-is for Claude Desktop. |
MCP_AUTH_TOKEN |
No | Not needed for stdio; only relevant for HTTP deployments. Omit or leave empty. |
Restart Claude Desktop. All 17 wrapper tools appear in Claude's tool list.
Connecting with an HTTP client
MCP_TRANSPORT=streamable_http MCP_AUTH_TOKEN=mytoken mcpsafetywarden-server
Configure your MCP client to connect to http://127.0.0.1:8000/mcp with header Authorization: Bearer mytoken.
Available MCP tools
| Tool | What it does |
|---|---|
onboard_server |
Register + inspect + security scan in one call |
register_server |
Register a server; optionally auto-inspect |
inspect_server |
Refresh tool list and profiles |
list_servers |
List all registered servers |
list_server_tools |
List tools on a server with summary profiles |
preflight_tool_call |
Risk assessment without execution |
safe_tool_call |
Execute with risk gating and interactive alternatives |
get_tool_profile |
Full behavior profile with observed stats |
get_retry_policy |
Retry and timeout recommendations |
suggest_safer_alternative |
LLM-ranked safer substitutes |
run_replay_test |
Idempotency test (runs tool twice); appends Burp proxy traffic if Burp is registered |
security_scan_server |
Live security audit (mcpsafety+, Cisco, Snyk); Kali nmap enriches Recon, Burp adds HTTP-layer probes to Hacker and evidence to Auditor |
scan_all_servers |
Run mcpsafety+ pipeline across all registered servers |
get_security_scan |
Latest stored scan report |
set_tool_policy |
Permanent allow/block policy for a tool |
get_run_history |
Recent execution history |
ping_server |
Reachability check with latency; adds Kali nmap + traceroute if Kali is registered |
Project Structure
mcpsafetywarden/
├── mcpsafetywarden/
│ ├── server.py # FastMCP server, all MCP tools, rate limiting, bearer auth
│ ├── cli.py # CLI entry point (typer + rich)
│ ├── client_manager.py # Connects to wrapped servers, injection scanning, telemetry
│ ├── database.py # SQLite persistence (servers, tools, runs, profiles, scans, policies)
│ ├── classifier.py # Static rule-based + LLM tool classification
│ ├── profiler.py # Builds behavior profiles from run history
│ ├── scanner.py # LLM, Cisco AI Defense, Snyk scan orchestration
│ ├── mcpsafety_scanner.py # Five-stage pentest pipeline (Recon, Planner, Hacker, Auditor, Supervisor)
│ └── security_utils.py # Text normalisation, redaction, credential detection
├── tests/
│ └── test_suite.py
├── docs/
│ └── COMPARISON.md
├── assets/
│ └── logo.png
└── pyproject.toml
The database (behavior_profiles.db) is stored in the platform user data directory, not in the project root. Override with MCP_DB_PATH.
Development
Install in editable mode with all extras:
pip install -e ".[all]"
Run the server in stdio mode and observe logs:
mcpsafetywarden-server 2>server.log
Run the CLI against a test server:
mcpsafetywarden onboard test-server --transport stdio --command python --args '["<YOUR_TEST_SERVER>.py"]'
mcpsafetywarden list test-server
mcpsafetywarden call test-server <tool_name>
Adding a new MCP tool:
- Define an async (or sync) function in
mcpsafetywarden/server.pydecorated with@mcp.tool(). - Use
db.*for persistence,cm.call_tool_with_telemetryfor proxied execution. - Add a corresponding CLI command in
mcpsafetywarden/cli.pywith@app.command(). - Follow the existing pattern: validate input, check rate limit if it is a management operation, return
json.dumps(...).
Logging:
Every module uses logging.getLogger(__name__). The server does not call logging.basicConfig itself - configure logging in your entry point or launcher script before importing the server. Example: logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(name)s %(levelname)s %(message)s").
Testing
A test suite is available at tests/test_suite.py. Run it with:
python tests/test_suite.py
Set ANTHROPIC_API_KEY (or another provider key) before running if you want LLM-assisted classification and scanning tests to execute. To validate behavior manually:
Verify tool classification:
mcpsafetywarden onboard test-server --transport stdio --command python --args '["<YOUR_MCP_SERVER>.py"]'
mcpsafetywarden list test-server --json
Check that effect_class values match what you expect for each tool.
Verify injection scanning:
Call a tool that returns text content. Inject a test pattern such as "Ignore all previous instructions" into the tool output (by modifying the wrapped server temporarily) and confirm the wrapper returns a quarantined response.
Verify risk gating:
mcpsafetywarden preflight test-server <high_risk_tool>
mcpsafetywarden call test-server <high_risk_tool>
# Should block and show alternatives menu
mcpsafetywarden call test-server <high_risk_tool> --approved
# Should execute
Verify policy enforcement:
mcpsafetywarden policy test-server <tool_name> --set block
mcpsafetywarden call test-server <tool_name>
# Should return policy_blocked immediately
mcpsafetywarden policy test-server <tool_name> --set clear
Deployment
Starting the server
stdio (default):
mcpsafetywarden-server
The server reads from stdin and writes to stdout. This is the mode used by Claude Desktop and other MCP clients that manage the subprocess.
HTTP (streamable_http):
MCP_TRANSPORT=streamable_http MCP_PORT=8000 mcpsafetywarden-server
Set MCP_AUTH_TOKEN to require bearer auth on all requests:
MCP_TRANSPORT=streamable_http MCP_AUTH_TOKEN=mysecrettoken mcpsafetywarden-server
SSE:
MCP_TRANSPORT=sse MCP_PORT=8000 mcpsafetywarden-server
Local (stdio with Claude Desktop)
Set up claude_desktop_config.json as shown in the MCP Integration section. No additional setup is needed.
Local HTTP server
MCP_TRANSPORT=streamable_http \
MCP_HOST=127.0.0.1 \
MCP_PORT=8000 \
MCP_AUTH_TOKEN=<your_secret_token> \
ANTHROPIC_API_KEY=<your_key> \
mcpsafetywarden-server
Container
A Dockerfile is not included. A minimal setup:
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
ENV MCP_TRANSPORT=streamable_http
ENV MCP_HOST=0.0.0.0
ENV MCP_PORT=8000
EXPOSE 8000
CMD ["mcpsafetywarden-server"]
Pass MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, and API keys as container environment variables. Do not bake them into the image.
Production considerations
- Rate limiting is in-process and resets on restart. For multi-replica deployments, replace the deque-based limiter with a shared store such as Redis.
- Database is a local SQLite file. For shared deployments, consider replacing with a networked database.
- Bearer auth covers the HTTP transport layer. For multi-tenant deployments, place an API gateway (nginx, Caddy, AWS API Gateway) in front and leave
MCP_AUTH_TOKENunset. - Logging goes to stderr by default via Python's
loggingmodule. Redirect and aggregate as needed for your observability stack. - Database permissions are set to owner-only (0o600) on POSIX systems. On Windows this is a no-op; use filesystem ACLs.
Troubleshooting
Tool '<name>' not found on server '<id>'.
Run mcpsafetywarden inspect <server_id> to refresh the tool list from the live server.
Server '<id>' not registered.
Run mcpsafetywarden register or mcpsafetywarden onboard first.
Rate limit exceeded.
There are two separate rate limits:
- Management operations (register, inspect, scan, replay, etc.): 10 calls per 60 seconds per server and 100 globally. Limits are in
mcpsafetywarden/server.py(_MGMT_RATE_LIMIT_MAX,_GLOBAL_RATE_LIMIT_MAX). - Tool calls via
safe_tool_call: 20 calls per 60 seconds per tool. Limit is inmcpsafetywarden/client_manager.py(_RATE_LIMIT_MAX_CALLS).
Wait for the window to expire. For heavy automation, batch operations or increase the relevant limit constants.
URL targets a private or restricted address.
The SSRF filter blocked a private IP, localhost, or cloud metadata endpoint. This is intentional. If you are proxying a legitimate internal server over stdio instead, use the stdio transport.
Registering a shell interpreter with an eval flag is not permitted.
You tried to register bash -c or similar. Use a dedicated MCP server script as the command instead of a shell one-liner.
LLM classification shows confidence: 0% for all tools.
No LLM API key was found. Set ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY. Classification falls back to rule-based when no key is available, which gives lower confidence on ambiguous tool names.
Scan fails immediately with confirm_authorized must be True.
The mcpsafety+ scanner requires explicit authorization before sending live probes. Pass --yes on the CLI or confirm_authorized=True on the MCP tool.
snyk-agent-scan not available.
Install with pip install snyk-agent-scan. If the binary is installed but not on PATH, the wrapper falls back to the Python module invocation automatically. If both fail, check that the install completed without errors and that pip show snyk-agent-scan shows the package.
SNYK_TOKEN is required for snyk-agent-scan.
Set SNYK_TOKEN=snyk_uat.<your_token> in your environment or pass --api-key snyk_uat.<your_token> on the CLI. Get a free token at app.snyk.io/account.
Snyk scan returns 0 findings on a server that has obvious issues.
Snyk analyzes tool metadata only - it does not call tools or inspect server-side logic. If the malicious content is not present in tool names, descriptions, or schemas as advertised by the server, Snyk will not detect it. Use --provider anthropic (or another LLM) with --yes for active probing.
MCP_DB_ENCRYPTION_KEY is set but Fernet init failed.
The key is malformed. Regenerate it with:
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
Decryption failure logged at ERROR level. The encryption key changed after data was written (key rotation). The affected server's env and headers fields will read as empty until the data is re-written with the new key by re-registering the server.
Security
Secrets in arguments The wrapper redacts credential-shaped values (JWTs, API keys, PEM blocks, long hex and base64 blobs) from tool arguments before storing them. If a secret is detected in an argument, a warning is included in the telemetry response. Prefer setting secrets as environment variables on the wrapped server rather than passing them as tool arguments.
Child process isolation
When spawning stdio servers, the wrapper strips its own secrets (MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, SNYK_TOKEN, MCP_SCANNER_API_KEY, MCP_SCANNER_LLM_API_KEY) from the child process environment. Supply needed env vars explicitly via the env parameter in register_server.
Input validation All server IDs, URLs, commands, and argument values are length-checked before storage. URLs are checked against the SSRF blocklist. Shell interpreters with eval flags are rejected at registration time.
HTTP auth
Set MCP_AUTH_TOKEN for any HTTP deployment. The token is compared with hmac.compare_digest to prevent timing attacks. Without a token, the server logs a warning and accepts all connections.
Database
Enable at-rest encryption with MCP_DB_ENCRYPTION_KEY to protect stored server credentials. The database file is set to 0o600 on POSIX systems.
Argument scanning
Every tool call argument is scanned for 20+ attack categories before the call is forwarded to the wrapped server. If an LLM key is available, flagged values are sent for a second-pass LLM verification to clear false positives. Blocked calls return a structured response showing exactly which argument triggered which category. Pass args_scan_override=True (or --args-scan-override on the CLI) to bypass after manual review.
Injection quarantine Tool output flagged as a prompt injection attempt is stored in the database under the run ID but is never returned to the calling agent. The response contains a quarantine notice and the run ID for forensic review.
Contributing
- Fork the repository and create a branch from
main. - Make your changes. Keep functions focused. Follow the existing pattern: validation first, then logic, then return
json.dumps(...)for MCP tools. - Test manually using the CLI against a real or mock MCP server.
- Open a pull request with a clear description of what changed and why.
Code standards:
- No inline comments unless the reason is non-obvious.
- No docstring blocks beyond the existing MCP tool docstrings (which are user-facing).
- Match the surrounding code style:
Optional[str]type hints,_log.warning/errorfor operator-visible events,_log.debugfor internal traces.
License
Apache License 2.0. See LICENSE for details.
Roadmap
- Automated test suite (unit tests for classifier, profiler, and security_utils; integration tests with a mock MCP server).
- Redis-backed rate limiting for multi-replica deployments.
- Schema drift detection: alert when a wrapped tool's input or output schema changes between runs.
- Web dashboard for server health, tool risk overview, and run history.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file mcpsafetywarden-0.1.0.tar.gz.
File metadata
- Download URL: mcpsafetywarden-0.1.0.tar.gz
- Upload date:
- Size: 147.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
03605d488d3c83b40060b005efa07d4ce6c76b5f431fccde9759755d0ed8a8a3
|
|
| MD5 |
f83635ec50a11061ff278b64cc243d6b
|
|
| BLAKE2b-256 |
d7bfb2a6108fdfadd326f2d54fc5a1383b09edae59b5824edeea7cc7c15fe28e
|
Provenance
The following attestation bundles were made for mcpsafetywarden-0.1.0.tar.gz:
Publisher:
publish.yml on gautamvarmadatla/mcpsafetywarden
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
mcpsafetywarden-0.1.0.tar.gz -
Subject digest:
03605d488d3c83b40060b005efa07d4ce6c76b5f431fccde9759755d0ed8a8a3 - Sigstore transparency entry: 1372628608
- Sigstore integration time:
-
Permalink:
gautamvarmadatla/mcpsafetywarden@4bc0bc14a11a0c0e6ea912743a92192f3149037b -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/gautamvarmadatla
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@4bc0bc14a11a0c0e6ea912743a92192f3149037b -
Trigger Event:
push
-
Statement type:
File details
Details for the file mcpsafetywarden-0.1.0-py3-none-any.whl.
File metadata
- Download URL: mcpsafetywarden-0.1.0-py3-none-any.whl
- Upload date:
- Size: 120.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
dfe48e167c139daf9b548382987a2d81bd1fd23efc14c168b81efa48ea24787e
|
|
| MD5 |
78125d12e5560e8dc6377597e916e074
|
|
| BLAKE2b-256 |
f9d97420e3fee6bc04a9c095f3d3be78a04233d9bb63cba437fcc84d80699d51
|
Provenance
The following attestation bundles were made for mcpsafetywarden-0.1.0-py3-none-any.whl:
Publisher:
publish.yml on gautamvarmadatla/mcpsafetywarden
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
mcpsafetywarden-0.1.0-py3-none-any.whl -
Subject digest:
dfe48e167c139daf9b548382987a2d81bd1fd23efc14c168b81efa48ea24787e - Sigstore transparency entry: 1372628740
- Sigstore integration time:
-
Permalink:
gautamvarmadatla/mcpsafetywarden@4bc0bc14a11a0c0e6ea912743a92192f3149037b -
Branch / Tag:
refs/tags/v0.1.0 - Owner: https://github.com/gautamvarmadatla
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@4bc0bc14a11a0c0e6ea912743a92192f3149037b -
Trigger Event:
push
-
Statement type: