
MCP Safety Warden

MCP Safety Warden is a proxy server that wraps any MCP server and adds behavioral profiling, security scanning, risk gating, and safe execution to its tools.

Overview

Most MCP servers expose tools with no information about what those tools actually do at runtime: whether they write data, call external services, delete things, or produce outputs that contain adversarial content.

Instead of calling a wrapped server's tools directly, you route calls through this wrapper. It classifies each tool, builds a behavior profile from observed runs, checks for injection attacks, and blocks or gates risky tools before they execute.

Behavioral profiling

  • Static classification of effect class (read_only, additive_write, mutating_write, external_action, destructive), retry safety, and destructiveness.
  • LLM-assisted classification via Anthropic, OpenAI, Gemini, or Ollama - LLM and rule-based signals are combined via weighted voting, producing higher confidence across all tools.
  • Observed stats updated after every proxied call: p50/p95 latency, failure rate, output size, schema stability.
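The latency side of those observed stats can be sketched as a simple percentile computation over recorded run latencies. The helper name and the nearest-rank method below are illustrative, not the package's actual implementation:

```python
import math

def latency_percentiles(latencies_ms: list[float]) -> dict:
    """Nearest-rank p50/p95 over recorded call latencies (illustrative helper)."""
    if not latencies_ms:
        return {"p50_ms": None, "p95_ms": None}
    ordered = sorted(latencies_ms)

    def pct(p: float) -> float:
        # nearest-rank: smallest value with at least p% of samples at or below it
        k = max(0, math.ceil(p / 100 * len(ordered)) - 1)
        return ordered[k]

    return {"p50_ms": pct(50), "p95_ms": pct(95)}

profile = latency_percentiles([120, 95, 110, 310, 105, 98, 102])
# p95 surfaces the slow outlier (310 ms) that the median alone would hide
```

This is why the profile tracks both values: a tool can look fast at p50 while its p95 reveals intermittent stalls.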

Security scanning

  • mcpsafety+ five-stage pipeline: Recon, Planner, Hacker (live probing), Auditor (CVE/arXiv research), Supervisor (final report). Enhanced over mcpsafetyscanner (Radosevich & Halloran, arXiv:2504.03767).
  • LLM provider choice for mcpsafety+: Anthropic, OpenAI, Gemini, or Ollama (local, no API key).
  • Multi-server scan: run the full pipeline against every registered server in one call via scan_all_servers.
  • Cisco AI Defense: AST and taint analysis, YARA rules, optional cloud ML engine.
  • Snyk: prompt injection, tool shadowing, toxic data flows, hardcoded secrets.
  • Kali MCP integration: if a Kali Linux MCP server is registered, quick_scan, vulnerability_scan, and traceroute run against the target host at the start of the pipeline. The results are embedded in the Recon output so the Planner can ground its attack hypotheses in real port and service data rather than guessing from tool schemas alone.
  • Burp Suite MCP integration: if a Burp Suite MCP server is registered, the Hacker stage sends raw HTTP/1.1 probes directly to the MCP endpoint (malformed JSON, missing headers, oversized payloads), triggers Collaborator out-of-band payloads to detect blind SSRF (Pro edition), and pulls automated scanner findings (Pro edition). Proxy history feeds the Auditor as raw evidence. Community edition tools run automatically; Pro-only tools are tried and silently skipped if unavailable.
  • All findings stored and surfaced automatically in subsequent preflight assessments.

Safe execution

  • Argument scanning on every tool call: 20+ attack categories (SSRF, SQL/NoSQL/LDAP/XPath injection, command injection, path traversal, XXE, template injection, prompt injection, deserialization payloads, base64-encoded variants, Windows-specific paths). When an LLM key is set, flagged args get a second-pass LLM verification to clear false positives.
  • Two-layer injection scanning on every tool output: 40+ regex patterns then LLM deep scan.
  • Injection-flagged output is quarantined and never returned to the caller.
  • Risk gating with per-tool permanent policies (allow/block) or per-call approval flow.
  • Alternatives suggestion: when a tool is blocked, the LLM ranks safer substitutes by risk reduction and functional coverage.
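The regex layer of the argument scan can be sketched with a few deliberately simplified patterns; the real scanner covers 20+ categories with far more exhaustive rules:

```python
import re

# Simplified stand-ins for a handful of the 20+ categories the scanner covers.
PATTERNS = {
    "path_traversal": re.compile(r"\.\./|\.\.\\"),
    "sql_injection": re.compile(r"(?i)\b(union\s+select|drop\s+table)\b|or\s+1\s*=\s*1"),
    "ssrf": re.compile(r"(?i)169\.254\.169\.254|127\.0\.0\.1|localhost"),
    "prompt_injection": re.compile(r"(?i)ignore\s+(all\s+)?previous\s+instructions"),
}

def scan_args(args: dict) -> list[tuple[str, str]]:
    """Return (argument_name, category) for every flagged value."""
    findings = []
    for name, value in args.items():
        for category, pattern in PATTERNS.items():
            if pattern.search(str(value)):
                findings.append((name, category))
    return findings
```

A benign query like `{"sql": "SELECT id FROM users"}` passes cleanly, while `{"path": "../../etc/passwd"}` is flagged as path traversal; the LLM second pass then decides whether a flag was a false positive.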

CLI

  • 16 subcommands covering all 17 MCP tools (list covers both list_servers and list_server_tools).
  • Interactive risk menu for call: pick an alternative, approve the original, or abort.
  • scan-all runs the full pentest pipeline across all registered servers in one command.
  • --json flag on every command for scripting and pipelines.
  • --yes / -y flag on confirmation prompts for CI use.

Transport

  • stdio (default), SSE, and streamable_http.
  • Bearer token auth middleware for HTTP transports.

Use it when you need to audit what third-party or internal MCP tools actually do before trusting them in an agent workflow.


Architecture

MCP Client (Claude Desktop, agent, mcpsafetywarden CLI)
        |
        v
  mcpsafetywarden/server.py  (FastMCP, 17 tools, rate limiting, bearer auth)
        |
        +---> mcpsafetywarden/client_manager.py  (connects to wrapped servers, records telemetry, injection scan)
        |
        +---> mcpsafetywarden/database.py        (SQLite: servers, tools, runs, profiles, scans, policies)
        |
        +---> mcpsafetywarden/classifier.py      (rule-based + LLM tool classification)
        |
        +---> mcpsafetywarden/profiler.py        (computes behavior profiles from run history)
        |
        +---> mcpsafetywarden/scanner.py         (LLM, Cisco, Snyk scan orchestration)
        |
        +---> mcpsafetywarden/mcpsafety_scanner.py (five-stage pentest pipeline)
        |
        +---> mcpsafetywarden/security_utils.py  (redaction, normalisation, injection detection helpers)

mcpsafetywarden/cli.py imports from mcpsafetywarden/server.py and mcpsafetywarden/database.py directly. It does not use the MCP protocol; it calls the same Python functions that the MCP tools call, which means no network hop for CLI usage.

Request flow for safe_tool_call:

  1. Lookup tool record and behavior profile in SQLite.
  2. Check permanent policy (allow/block).
  3. Run _preflight_assessment: compute risk level from profile and latest security scan findings.
  4. If low or medium-low risk: scan args for threats -> forward call to wrapped server via client_manager -> scan output -> record telemetry -> return result.
  5. If medium/high risk and not approved: fetch LLM-ranked alternatives, return blocked response with numbered menu.
  6. If approved or alternative selected: scan args for threats -> execute -> scan output -> record telemetry -> return result.
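The six steps above can be condensed into a runnable sketch. The in-memory dictionaries stand in for the SQLite lookups, profiler, and scanners; every name here is illustrative, not the actual implementation:

```python
# In-memory stand-ins for the SQLite lookups and scanners (illustrative only).
POLICIES = {("srv", "drop_table"): "block"}             # step 2: permanent policies
PROFILES = {"read_file": "low", "delete_file": "high"}  # step 3: risk from profile

def scan_args(args: dict) -> list[str]:                 # steps 4/6: arg threat scan
    return [k for k, v in args.items() if "../" in str(v)]

def safe_tool_call(server_id: str, tool: str, args: dict,
                   approved: bool = False) -> dict:
    if tool not in PROFILES and (server_id, tool) not in POLICIES:
        return {"status": "not_found"}                  # step 1: lookup
    if POLICIES.get((server_id, tool)) == "block":
        return {"status": "policy_blocked"}             # step 2: policy check
    risk = PROFILES.get(tool, "medium")                 # step 3: preflight
    if risk in ("medium", "high") and not approved:
        return {"status": "blocked", "risk": risk}      # step 5: gate + alternatives
    flagged = scan_args(args)
    if flagged:
        return {"status": "args_flagged", "args": flagged}
    # forward to wrapped server, scan output, record telemetry (elided)
    return {"status": "ok"}
```

Low-risk calls fall straight through to execution; high-risk calls return a blocked response until `approved=True`, and even approved calls still pass through argument scanning.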

Prerequisites

  • Python 3.10 or later.
  • pip for dependency installation.
  • At least one wrapped MCP server to proxy (stdio subprocess, SSE endpoint, or streamable_http endpoint).
  • Recommended: an API key for at least one LLM provider (Anthropic, OpenAI, Gemini, or a local Ollama instance).

Why an LLM key matters:

The wrapper has two operating modes depending on whether an LLM is available:

  • Tool classification - without an LLM key: rule-based heuristics only, with low confidence on ambiguous tool names. With a key: the LLM resolves ambiguous cases, giving higher confidence across the board.
  • Injection scanning - without: regex patterns only (40+ rules). With: regex plus an LLM deep scan that catches obfuscated and novel injections.
  • Risk gate alternatives - without: none; the gate shows "More options" only. With: the LLM ranks safer substitute tools by risk reduction and functional coverage.
  • Security scanning - without: Snyk and Cisco only (metadata/static analysis, no LLM needed). With: the full 5-stage pentest - Recon, Planner, Hacker, Auditor, Supervisor.

Set at least one of ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY before starting the server. For a fully local setup with no API keys, run Ollama and set OLLAMA_MODEL - then pass --provider ollama (or scan_provider="ollama") explicitly on every command, as Ollama is not auto-detected from environment variables.


Installation

git clone <YOUR_REPO_URL>
cd mcpsafetywarden
pip install .

With all optional LLM providers and scanners:

pip install ".[all]"

Or pick specific extras:

pip install ".[anthropic,snyk]"

Verify the install:

mcpsafetywarden --help
mcpsafetywarden-server --help

The SQLite database is created automatically on first run in the platform user data directory (e.g. ~/.local/share/mcpsafetywarden/ on Linux, ~/Library/Application Support/mcpsafetywarden/ on macOS, %APPDATA%\mcpsafetywarden\ on Windows). Set MCP_DB_PATH to override the location.
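The resolution can be approximated as follows. The package may well use a helper library such as platformdirs internally; this sketch just mirrors the paths listed above:

```python
import os
import sys
from pathlib import Path

def default_db_dir(app: str = "mcpsafetywarden") -> Path:
    """Approximate per-platform data directory resolution (illustrative)."""
    override = os.environ.get("MCP_DB_PATH")
    if override:
        return Path(override).parent              # explicit file path wins
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / app  # %APPDATA%\mcpsafetywarden
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Application Support" / app
    return Path.home() / ".local" / "share" / app # Linux / other POSIX
```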

Optional: at-rest encryption for stored credentials

The wrapper stores server env vars and HTTP headers in the database. To encrypt them at rest:

pip install cryptography
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"

Set the printed key as MCP_DB_ENCRYPTION_KEY before starting the server. Keep this key safe; losing it makes stored credentials unrecoverable.


Configuration

All configuration is via environment variables. No config file is required.

  • MCP_TRANSPORT (default stdio) - transport mode: stdio, sse, or streamable_http
  • MCP_HOST (default 127.0.0.1) - bind address for HTTP transports
  • MCP_PORT (default 8000) - bind port for HTTP transports
  • MCP_AUTH_TOKEN (unset) - bearer token for HTTP transport auth; unset means no auth (a log warning is emitted)
  • MCP_DB_ENCRYPTION_KEY (unset) - Fernet key to encrypt env_json and headers_json at rest
  • ANTHROPIC_API_KEY (unset) - enables Anthropic as LLM provider for classification and scanning
  • OPENAI_API_KEY (unset) - enables OpenAI as LLM provider
  • GEMINI_API_KEY (unset) - enables Gemini as LLM provider
  • GOOGLE_API_KEY (unset) - legacy alias for GEMINI_API_KEY
  • OLLAMA_MODEL (unset) - model name for the Ollama provider (e.g. llama3.1, mistral)
  • OLLAMA_BASE_URL (default http://localhost:11434/v1) - Ollama API base URL (OpenAI-compatible)
  • SNYK_TOKEN (unset) - enables Snyk E001 prompt-injection detection
  • MCP_SCANNER_API_KEY (unset) - Cisco AI Defense API key for the cloud ML engine
  • MCP_SCANNER_LLM_API_KEY (unset) - LLM key for Cisco internal AST analysis (falls back to OPENAI_API_KEY)
  • MCP_DB_PATH (unset) - override the SQLite database file path

Example .env for local development:

MCP_TRANSPORT=stdio
ANTHROPIC_API_KEY=sk-ant-...
MCP_DB_ENCRYPTION_KEY=<generated_fernet_key>

Security note: Never commit API keys or the encryption key to version control. Pass them via environment variables or a secrets manager. The wrapper strips its own secrets (MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, and all LLM/scanner API keys) from the child process environment before spawning stdio servers. Other variables present in the parent environment are passed through.


Auxiliary Security Tool Integrations

The wrapper detects Kali and Burp by looking for registered servers whose server_id contains "kali" or "burp" (case-insensitive). Registration is the only setup step - once registered, the tools activate automatically on every scan, ping, and replay test.
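The detection rule is a plain case-insensitive substring match, which can be sketched as:

```python
def detect_aux_servers(server_ids: list[str]) -> dict:
    """First registered server whose id contains 'kali' or 'burp' (any case)."""
    return {
        "kali": next((s for s in server_ids if "kali" in s.lower()), None),
        "burp": next((s for s in server_ids if "burp" in s.lower()), None),
    }
```

This is why the setup steps below insist the server_id contain "kali" or "burp" - an id like `recon-tools` would never activate the integration.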

Kali Linux MCP (ccq1/awsome_kali_MCPServers)

Docker-based, Apache 2.0, no auth. Adds real network reconnaissance to the Recon stage and network data to ping_server.

What it contributes:

  • Recon (before the Planner) - calls quick_scan(target), vulnerability_scan(target), and traceroute(target). Adds open ports, running services, OS fingerprint, and network path; the Planner uses this to craft targeted hypotheses.
  • ping_server - calls quick_scan(target) and traceroute(target). Adds network reachability detail beyond the MCP protocol ping (sse/streamable_http only - no network target for stdio).

Setup:

# 1. Install Docker Desktop (if not already installed)
#    Windows: winget install Docker.DockerDesktop
#    macOS:   brew install --cask docker
#    Linux:   https://docs.docker.com/engine/install/

# 2. Clone and build the image
git clone https://github.com/ccq1/awsome_kali_MCPServers
cd awsome_kali_MCPServers
docker build -t kali-mcps:latest .

# 3. Register with the wrapper (server_id must contain "kali")
mcpsafetywarden register kali-mcp \
  --transport stdio \
  --command docker \
  --args '["run", "-i", "kali-mcps:latest"]'

Note: vulnerability_scan runs nmap vuln scripts which can take 60-90 seconds per target. On scan-all across many servers this adds up. Register only when you want network recon in your scans.

Burp Suite MCP (PortSwigger/mcp-server)

Kotlin, GPL-3.0, no auth, runs as an SSE server on port 9876. Community edition tools always run; Pro-only tools (Collaborator, scanner) are tried and silently skipped on failure.

What it contributes:

  • Hacker (after LLM probing) - SendHttp1Request x3 (Community): raw HTTP probes - malformed JSON body, missing Content-Type, oversized method field.
  • Hacker - GenerateCollaboratorPayload, GetCollaboratorInteractions (Pro): out-of-band DNS/HTTP callbacks; detects blind SSRF and blind injection.
  • Hacker - GetScannerIssues (Pro): automated active scanner findings against the MCP endpoint.
  • Auditor - GetProxyHttpHistoryRegex (Community): raw HTTP traffic evidence for every finding the Auditor validates.
  • run_replay_test - GetProxyHttpHistoryRegex (Community): HTTP traffic captured during both tool calls, appended to the replay result.

Setup:

# 1. Install Burp Suite (Community or Professional)
#    Download from https://portswigger.net/burp/releases

# 2. Build the MCP extension JAR
git clone https://github.com/PortSwigger/mcp-server.git
cd mcp-server
./gradlew embedProxyJar
# produces build/libs/burp-mcp-all.jar

# 3. Load into Burp
#    Burp -> Extensions -> Add -> Java type -> select burp-mcp-all.jar
#    Then go to the "MCP" tab in Burp and enable the server.
#    SSE endpoint starts at http://127.0.0.1:9876/sse

# 4. Register with the wrapper (server_id must contain "burp")
mcpsafetywarden register burp-mcp \
  --transport sse \
  --url http://127.0.0.1:9876/sse

Snyk (snyk-agent-scan)

Python, Apache 2.0, requires a free Snyk account token. Connects to the target MCP server, lists its tools, and runs static analysis on the tool metadata (names, descriptions, schemas). It does not call any tools - it only reads what the server advertises.

What it checks:

Code Severity Check
E001 HIGH Prompt injection strings in tool descriptions or schemas
E002 HIGH Tool shadowing (a tool impersonates another)
E004 HIGH Prompt injection embedded in skill definitions
E005 HIGH Suspicious download URLs in tool metadata
E006 HIGH Malicious code patterns in descriptions
W007 HIGH Insecure credential handling patterns
W008 HIGH Hardcoded secrets in tool metadata
W009 MEDIUM Direct financial execution capabilities
W011 MEDIUM Untrusted third-party content references
W012 HIGH Unverifiable external dependencies
W013 MEDIUM System service modification capabilities
W015 MEDIUM Untrusted content flows
W017 MEDIUM Sensitive data exposure patterns
W019 MEDIUM Destructive capabilities
W001 LOW Suspicious words
W014 LOW Missing skill documentation
W016 LOW Potential untrusted content
W018 LOW Workspace data exposure
W020 LOW Local destructive capabilities

E001 (prompt injection) requires a Snyk token for Snyk's AI-based detection. All other checks are structural or pattern-based, run fully offline, and degrade gracefully if the token is missing or invalid.

How it runs:

Snyk is invoked as a subprocess (snyk-agent-scan) with a temporary config JSON pointing at the target server. The binary opens its own live MCP connection, fetches the tool list, analyzes the metadata, and returns JSON findings. The wrapper normalizes these into its common findings format and stores them in the database, where they are automatically included in future preflight_tool_call responses.
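The normalization step can be sketched as a small mapping function. The raw input keys used here ("code", "severity", "check", "tool") are assumptions for illustration, not snyk-agent-scan's documented output schema; only the common output shape matters for later preflight lookups:

```python
def normalize_finding(raw: dict) -> dict:
    """Map a raw scanner finding into a hypothetical common findings shape.

    Input field names are illustrative assumptions, not the real schema.
    """
    return {
        "source": "snyk",
        "code": raw.get("code"),                       # e.g. "E001", "W008"
        "severity": (raw.get("severity") or "").upper(),
        "title": raw.get("check", ""),
        "tool_name": raw.get("tool"),                  # None for server-level findings
    }
```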

Setup:

pip install snyk-agent-scan

Get a free token at app.snyk.io/account. Set it as an environment variable:

export SNYK_TOKEN=snyk_uat.<your_token>

Or pass it directly on the scan command:

mcpsafetywarden scan my-server --provider snyk --api-key snyk_uat.<your_token> --yes

Unlike Kali and Burp, Snyk is not auto-activated on every scan - it only runs when explicitly chosen as the provider via --provider snyk or provider="snyk".


CLI Reference

Global flags

All commands support --json for machine-readable output. Commands with confirmation prompts support --yes / -y to skip them.

Typical workflow

# Register, inspect, and scan a local stdio server in one step
mcpsafetywarden onboard my-server \
  --transport stdio \
  --command python \
  --args '["my_mcp_server.py"]' \
  --scan-provider anthropic

# Check what tools were discovered
mcpsafetywarden list my-server

# Execute a tool safely
mcpsafetywarden call my-server read_file --args '{"path": "/tmp/data.txt"}'

# Execute a risky tool (interactive menu appears if blocked)
mcpsafetywarden call my-server delete_file --args '{"path": "/tmp/old.txt"}'

call interactive flow when a tool is blocked:

⚠ Blocked  risk: HIGH
  1.  list_files  -- reduction: HIGH  coverage: partial
  2.  More options

Pick: 2

  B.  Proceed with original tool despite risk
  C.  Abort

Pick [B/b/C/c]: B

✓  142ms  [explicit_approval]

To bypass the menu in scripts, pass --approved:

mcpsafetywarden call my-server delete_file \
  --args '{"path": "/tmp/old.txt"}' \
  --approved

Commands

list [server_id] List all registered servers. Pass server_id to list tools on a specific server.

mcpsafetywarden list
mcpsafetywarden list my-server
mcpsafetywarden list my-server --json

onboard <server_id> Register + inspect + security scan in one call. Prompts for authorization before scanning unless --yes is passed.

mcpsafetywarden onboard my-server --transport stdio --command python --args '["server.py"]'
mcpsafetywarden onboard my-server --transport streamable_http --url https://mcp.example.com/mcp \
  --headers '{"Authorization": "Bearer TOKEN"}' \
  --scan-provider anthropic --scan-model claude-opus-4-7 --scan-api-key sk-ant-... --yes

register <server_id> Register only, without scanning.

mcpsafetywarden register my-server --transport stdio --command python --args '["server.py"]'
mcpsafetywarden register my-server --transport stdio --command python --no-inspect
mcpsafetywarden register my-server --transport stdio --command python --args '["server.py"]' --provider anthropic

inspect <server_id> Reconnect to a registered server, refresh tools, re-classify.

mcpsafetywarden inspect my-server --provider anthropic
mcpsafetywarden inspect my-server --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...

scan <server_id> Run a security scan against a single server. Prompts for authorization before probing. Supported --provider values:

  • anthropic, openai, gemini, ollama - mcpsafety+ 5-stage pipeline (Recon -> Planner -> Hacker -> Auditor -> Supervisor)
  • cisco - Cisco AI Defense: AST taint analysis, YARA rules, optional cloud ML engine
  • snyk - Snyk: prompt injection, tool shadowing, toxic data flows, hardcoded secrets

For Ollama set OLLAMA_MODEL before running. Web research (DuckDuckGo/HackerNews/arXiv CVE lookup in the Auditor stage) is skipped by default to avoid leaking findings externally; pass --web-research to enable it.

If a Kali MCP server is registered, nmap and traceroute results are shown after the findings table and included in --json output under network_scan. If a Burp Suite MCP server is registered, the number of HTTP-layer findings Burp contributed is shown as a summary line; use --json for the full evidence.

mcpsafetywarden scan my-server --provider anthropic
mcpsafetywarden scan my-server --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
mcpsafetywarden scan my-server --provider ollama              # local model, no API key
mcpsafetywarden scan my-server --provider cisco
mcpsafetywarden scan my-server --provider anthropic --web-research --destructive --timeout 600 --yes

scan-all Run the full 5-stage mcpsafety+ pipeline against every registered server (or a comma-separated subset via --servers). Results are stored per server and displayed as a combined risk table. Only mcpsafety+ providers are supported (not cisco or snyk). Web research is skipped by default; pass --web-research to enable.

mcpsafetywarden scan-all --provider anthropic
mcpsafetywarden scan-all --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...
mcpsafetywarden scan-all --provider ollama --servers my-server,other-server --yes
mcpsafetywarden scan-all --provider openai --web-research --timeout 600 --json

call <server_id> <tool_name> Execute a tool through the risk gate. Interactive menu appears if the tool is blocked.

Every argument value is scanned for 20+ attack categories (SSRF, SQL/NoSQL/LDAP/XPath injection, command injection, path traversal, XXE, prompt injection, deserialization payloads, base64-encoded variants, and more) before the call is forwarded. If an LLM key is set, a second-pass LLM verification runs on flagged args to clear false positives. Without an LLM key, the CLI prompts you to confirm before proceeding.

mcpsafetywarden call my-server search_web --args '{"query": "site:example.com"}'
mcpsafetywarden call my-server delete_file --args '{"path": "/tmp/x"}' --approved
mcpsafetywarden call my-server run_query --args '{"sql": "SELECT id FROM users"}' --args-scan-override

  • --approved - bypass the risk gate for a high-risk tool you have reviewed
  • --args-scan-override - skip argument safety scanning (use only when you trust the args)
  • --provider - LLM provider for alternatives and arg verification (anthropic|openai|gemini|ollama)

preflight <server_id> <tool_name> Assess risk without executing.

mcpsafetywarden preflight my-server delete_file
mcpsafetywarden preflight my-server delete_file --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...

profile <server_id> <tool_name> Print the full behavior profile.

mcpsafetywarden profile my-server read_file --json

retry-policy <server_id> <tool_name> Print retry and timeout recommendations.

mcpsafetywarden retry-policy my-server call_api
mcpsafetywarden retry-policy my-server call_api --provider anthropic --model claude-opus-4-7 --api-key sk-ant-...

alternatives <server_id> <tool_name> List safer alternatives to a tool.

mcpsafetywarden alternatives my-server delete_file --provider anthropic

replay <server_id> <tool_name> Run the tool twice and compare outputs. Prompts for confirmation. If a Burp Suite MCP server is registered, Burp proxy traffic captured during both calls is appended to the result - useful for spotting network-level differences even when output text is identical.

mcpsafetywarden replay my-server get_status --args '{"id": "123"}' --yes

policy <server_id> <tool_name> Read or set a permanent execution policy. Without --set, prints the current policy.

By default no policy is set and safe_tool_call decides at runtime based on the behavior profile: low or medium-low risk tools run immediately, medium/high-risk tools trigger the approval gate. Setting a policy overrides that completely - allow bypasses the risk gate (argument scanning still runs unless --args-scan-override is also passed), block rejects unconditionally.

mcpsafetywarden policy my-server read_file             # read current policy
mcpsafetywarden policy my-server read_file --set allow  # always execute without preflight
mcpsafetywarden policy my-server drop_table --set block # never execute
mcpsafetywarden policy my-server read_file --set clear  # remove policy, resume normal flow

history <server_id> <tool_name> Show recent execution history.

mcpsafetywarden history my-server delete_file --limit 50

ping <server_id> Check if a server is reachable. If a Kali MCP server is registered and the pinged server uses the sse or streamable_http transport, also runs quick_scan and traceroute against the target host and displays the output in labeled panels. Stdio servers have no network address to scan so Kali recon is skipped.

mcpsafetywarden ping my-server

get-scan <server_id> Print the latest stored security scan report.

mcpsafetywarden get-scan my-server --json

Exit codes:

  • 0: success
  • 1: error (tool not found, blocked by policy, unreachable server, invalid input)

MCP Integration

Connecting with Claude Desktop

Add the wrapper to claude_desktop_config.json:

{
  "mcpServers": {
    "mcpsafetywarden": {
      "command": "mcpsafetywarden-server",
      "args": [],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "MCP_DB_ENCRYPTION_KEY": "<generated_fernet_key>"
      }
    },

    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"]
    },

    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."
      }
    }
  }
}

The wrapper and the servers it proxies are registered separately in Claude Desktop. Claude sees all of them - but you route calls through mcpsafetywarden (using safe_tool_call, preflight_tool_call, etc.) instead of calling filesystem or github directly. First register each server with the wrapper:

mcpsafetywarden register filesystem --transport stdio \
  --command npx \
  --args '["-y", "@modelcontextprotocol/server-filesystem", "/Users/yourname/Documents"]'

mcpsafetywarden register github --transport stdio \
  --command npx \
  --args '["-y", "@modelcontextprotocol/server-github"]' \
  --env '{"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}'

Using the wrapper as a mandatory gateway for all tool calls

Instead of adding every MCP server to claude_desktop_config.json, you can add only the wrapper and register all other servers inside it. Claude then has no direct path to any underlying server - every tool call must go through safe_tool_call, making the wrapper a mandatory enforcement point for risk gating, arg scanning, and output inspection across your entire MCP setup.

claude_desktop_config.json - wrapper only:

{
  "mcpServers": {
    "mcpsafetywarden": {
      "command": "mcpsafetywarden-server",
      "args": [],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Register your servers once via CLI before starting Claude Desktop:

mcpsafetywarden register github --transport stdio \
  --command npx \
  --args '["-y", "@modelcontextprotocol/server-github"]' \
  --env '{"GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..."}'

mcpsafetywarden register slack --transport stdio \
  --command npx \
  --args '["-y", "@modelcontextprotocol/server-slack"]' \
  --env '{"SLACK_BOT_TOKEN": "xoxb-..."}'

Claude sees only the wrapper's 17 tools. To use github or slack it must call safe_tool_call(server_id="github", ...) - there is no other route. Registration is enforced because safe_tool_call rejects any server_id that is not registered.

Field notes:

  • command (required) - mcpsafetywarden-server after pip install.
  • ANTHROPIC_API_KEY (strongly recommended) - enables LLM classification, deep injection scanning, risk gate alternatives, and the full mcpsafety+ pentest pipeline. Use OPENAI_API_KEY or GEMINI_API_KEY instead if preferred. Without any key the wrapper operates in rule-based-only mode - see Prerequisites.
  • MCP_DB_ENCRYPTION_KEY (recommended) - encrypts stored server credentials (env vars, headers) at rest. Generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
  • MCP_TRANSPORT (optional) - defaults to stdio. Leave as-is for Claude Desktop.
  • MCP_AUTH_TOKEN (optional) - not needed for stdio; only relevant for HTTP deployments. Omit or leave empty.

Restart Claude Desktop. All 17 wrapper tools appear in Claude's tool list.

Connecting with an HTTP client

MCP_TRANSPORT=streamable_http MCP_AUTH_TOKEN=mytoken mcpsafetywarden-server

Configure your MCP client to connect to http://127.0.0.1:8000/mcp with header Authorization: Bearer mytoken.

Available MCP tools

Tool What it does
onboard_server Register + inspect + security scan in one call
register_server Register a server; optionally auto-inspect
inspect_server Refresh tool list and profiles
list_servers List all registered servers
list_server_tools List tools on a server with summary profiles
preflight_tool_call Risk assessment without execution
safe_tool_call Execute with risk gating and interactive alternatives
get_tool_profile Full behavior profile with observed stats
get_retry_policy Retry and timeout recommendations
suggest_safer_alternative LLM-ranked safer substitutes
run_replay_test Idempotency test (runs tool twice); appends Burp proxy traffic if Burp is registered
security_scan_server Live security audit (mcpsafety+, Cisco, Snyk); Kali nmap enriches Recon, Burp adds HTTP-layer probes to Hacker and evidence to Auditor
scan_all_servers Run mcpsafety+ pipeline across all registered servers
get_security_scan Latest stored scan report
set_tool_policy Permanent allow/block policy for a tool
get_run_history Recent execution history
ping_server Reachability check with latency; adds Kali nmap + traceroute if Kali is registered

Project Structure

mcpsafetywarden/
├── mcpsafetywarden/
│   ├── server.py               # FastMCP server, all MCP tools, rate limiting, bearer auth
│   ├── cli.py                  # CLI entry point (typer + rich)
│   ├── client_manager.py       # Connects to wrapped servers, injection scanning, telemetry
│   ├── database.py             # SQLite persistence (servers, tools, runs, profiles, scans, policies)
│   ├── classifier.py           # Static rule-based + LLM tool classification
│   ├── profiler.py             # Builds behavior profiles from run history
│   ├── scanner.py              # LLM, Cisco AI Defense, Snyk scan orchestration
│   ├── mcpsafety_scanner.py    # Five-stage pentest pipeline (Recon, Planner, Hacker, Auditor, Supervisor)
│   └── security_utils.py       # Text normalisation, redaction, credential detection
├── tests/
│   └── test_suite.py
├── docs/
│   └── COMPARISON.md
├── assets/
│   └── logo.png
└── pyproject.toml

The database (behavior_profiles.db) is stored in the platform user data directory, not in the project root. Override with MCP_DB_PATH.


Development

Install in editable mode with all extras:

pip install -e ".[all]"

Run the server in stdio mode and observe logs:

mcpsafetywarden-server 2>server.log

Run the CLI against a test server:

mcpsafetywarden onboard test-server --transport stdio --command python --args '["<YOUR_TEST_SERVER>.py"]'
mcpsafetywarden list test-server
mcpsafetywarden call test-server <tool_name>

Adding a new MCP tool:

  1. Define an async (or sync) function in mcpsafetywarden/server.py decorated with @mcp.tool().
  2. Use db.* for persistence, cm.call_tool_with_telemetry for proxied execution.
  3. Add a corresponding CLI command in mcpsafetywarden/cli.py with @app.command().
  4. Follow the existing pattern: validate input, check rate limit if it is a management operation, return json.dumps(...).

Logging:

Every module uses logging.getLogger(__name__). The server does not call logging.basicConfig itself - configure logging in your entry point or launcher script before importing the server. Example: logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(name)s %(levelname)s %(message)s").


Testing

A test suite is available at tests/test_suite.py. Run it with:

python tests/test_suite.py

Set ANTHROPIC_API_KEY (or another provider key) before running if you want LLM-assisted classification and scanning tests to execute. To validate behavior manually:

Verify tool classification:

mcpsafetywarden onboard test-server --transport stdio --command python --args '["<YOUR_MCP_SERVER>.py"]'
mcpsafetywarden list test-server --json

Check that effect_class values match what you expect for each tool.

Verify injection scanning:

Call a tool that returns text content. Inject a test pattern such as "Ignore all previous instructions" into the tool output (by modifying the wrapped server temporarily) and confirm the wrapper returns a quarantined response.

Verify risk gating:

mcpsafetywarden preflight test-server <high_risk_tool>
mcpsafetywarden call test-server <high_risk_tool>
# Should block and show alternatives menu
mcpsafetywarden call test-server <high_risk_tool> --approved
# Should execute

Verify policy enforcement:

mcpsafetywarden policy test-server <tool_name> --set block
mcpsafetywarden call test-server <tool_name>
# Should return policy_blocked immediately
mcpsafetywarden policy test-server <tool_name> --set clear

Deployment

Starting the server

stdio (default):

mcpsafetywarden-server

The server reads from stdin and writes to stdout. This is the mode used by Claude Desktop and other MCP clients that manage the subprocess.

HTTP (streamable_http):

MCP_TRANSPORT=streamable_http MCP_PORT=8000 mcpsafetywarden-server

Set MCP_AUTH_TOKEN to require bearer auth on all requests:

MCP_TRANSPORT=streamable_http MCP_AUTH_TOKEN=mysecrettoken mcpsafetywarden-server

SSE:

MCP_TRANSPORT=sse MCP_PORT=8000 mcpsafetywarden-server

Local (stdio with Claude Desktop)

Set up claude_desktop_config.json as shown in the MCP Integration section. No additional setup is needed.

Local HTTP server

MCP_TRANSPORT=streamable_http \
MCP_HOST=127.0.0.1 \
MCP_PORT=8000 \
MCP_AUTH_TOKEN=<your_secret_token> \
ANTHROPIC_API_KEY=<your_key> \
mcpsafetywarden-server

Container

A Dockerfile is not included. A minimal setup:

FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
ENV MCP_TRANSPORT=streamable_http
ENV MCP_HOST=0.0.0.0
ENV MCP_PORT=8000
EXPOSE 8000
CMD ["mcpsafetywarden-server"]

Pass MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, and API keys as container environment variables. Do not bake them into the image.

Production considerations

  • Rate limiting is in-process and resets on restart. For multi-replica deployments, replace the deque-based limiter with a shared store such as Redis.
  • Database is a local SQLite file. For shared deployments, consider replacing with a networked database.
  • Bearer auth covers the HTTP transport layer. For multi-tenant deployments, place an API gateway (nginx, Caddy, AWS API Gateway) in front and leave MCP_AUTH_TOKEN unset.
  • Logging goes to stderr by default via Python's logging module. Redirect and aggregate as needed for your observability stack.
  • Database permissions are set to owner-only (0o600) on POSIX systems. On Windows this is a no-op; use filesystem ACLs.
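The in-process deque-based limiter mentioned in the first point works roughly like this (a sketch of the technique, not the wrapper's exact code):

```python
import time
from collections import deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most max_calls per window_seconds; state is in-process only,
    so it resets on restart and is not shared across replicas."""

    def __init__(self, max_calls: int = 10, window_seconds: float = 60.0):
        self.max_calls = max_calls
        self.window = window_seconds
        self.calls: deque = deque()  # timestamps of recent allowed calls

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True
```

A Redis-backed replacement would keep the same interface but store the timestamps in a shared sorted set instead of a local deque.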

Troubleshooting

Tool '<name>' not found on server '<id>'. Run mcpsafetywarden inspect <server_id> to refresh the tool list from the live server.

Server '<id>' not registered. Run mcpsafetywarden register or mcpsafetywarden onboard first.

Rate limit exceeded. There are two separate rate limits:

  • Management operations (register, inspect, scan, replay, etc.): 10 calls per 60 seconds per server and 100 globally. Limits are in mcpsafetywarden/server.py (_MGMT_RATE_LIMIT_MAX, _GLOBAL_RATE_LIMIT_MAX).
  • Tool calls via safe_tool_call: 20 calls per 60 seconds per tool. Limit is in mcpsafetywarden/client_manager.py (_RATE_LIMIT_MAX_CALLS).

Wait for the window to expire. For heavy automation, batch operations or increase the relevant limit constants.

URL targets a private or restricted address. The SSRF filter blocked a private IP, localhost, or cloud metadata endpoint. This is intentional. If you are proxying a legitimate internal server, use the stdio transport instead.
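A sketch of the kind of check such a filter performs, using only the standard library (the hostname list here is illustrative, and a real filter would also resolve DNS names before checking):

```python
import ipaddress
from urllib.parse import urlparse

# Illustrative blocklist; the wrapper's actual list may differ.
BLOCKED_HOSTNAMES = {"localhost", "metadata.google.internal"}

def is_restricted_url(url: str) -> bool:
    host = (urlparse(url).hostname or "").lower()
    if host in BLOCKED_HOSTNAMES:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not a literal IP; a full filter would resolve it first
    # Private, loopback, and link-local ranges (the AWS/GCP metadata
    # endpoint lives at link-local 169.254.169.254).
    return ip.is_private or ip.is_loopback or ip.is_link_local
```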

Registering a shell interpreter with an eval flag is not permitted. You tried to register bash -c or similar. Use a dedicated MCP server script as the command instead of a shell one-liner.

LLM classification shows confidence: 0% for all tools. No LLM API key was found. Set ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY. Classification falls back to rule-based when no key is available, which gives lower confidence on ambiguous tool names.

Scan fails immediately with confirm_authorized must be True. The mcpsafety+ scanner requires explicit authorization before sending live probes. Pass --yes on the CLI or confirm_authorized=True on the MCP tool.

snyk-agent-scan not available. Install with pip install snyk-agent-scan. If the binary is installed but not on PATH, the wrapper falls back to the Python module invocation automatically. If both fail, check that the install completed without errors and that pip show snyk-agent-scan shows the package.

SNYK_TOKEN is required for snyk-agent-scan. Set SNYK_TOKEN=snyk_uat.<your_token> in your environment or pass --api-key snyk_uat.<your_token> on the CLI. Get a free token at app.snyk.io/account.

Snyk scan returns 0 findings on a server that has obvious issues. Snyk analyzes tool metadata only - it does not call tools or inspect server-side logic. If the malicious content is not present in tool names, descriptions, or schemas as advertised by the server, Snyk will not detect it. Use --provider anthropic (or another LLM) with --yes for active probing.

MCP_DB_ENCRYPTION_KEY is set but Fernet init failed. The key is malformed. Regenerate it with:

python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
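A valid Fernet key decodes to exactly 32 bytes of URL-safe base64 (44 characters with padding). A quick standard-library shape check for diagnosing a malformed key (a diagnostic sketch, not part of the wrapper):

```python
import base64

def looks_like_fernet_key(key: str) -> bool:
    """True if the string has the shape of a Fernet key:
    URL-safe base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(key.encode())
    except ValueError:
        return False
    return len(raw) == 32
```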

Decryption failure logged at ERROR level. The encryption key changed after data was written (key rotation). The affected server's env and headers fields will read as empty until you re-register the server, which re-writes the data under the new key.


Security

Secrets in arguments The wrapper redacts credential-shaped values (JWTs, API keys, PEM blocks, long hex and base64 blobs) from tool arguments before storing them. If a secret is detected in an argument, a warning is included in the telemetry response. Prefer setting secrets as environment variables on the wrapped server rather than passing them as tool arguments.
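A sketch of how credential-shaped redaction can work (the patterns and the redaction marker here are illustrative; the wrapper's real rule set is broader):

```python
import re

# Illustrative credential-shaped patterns; not the wrapper's actual set.
SECRET_PATTERNS = [
    (re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"), "JWT"),
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"), "PEM"),
    (re.compile(r"\b[0-9a-fA-F]{40,}\b"), "hex-blob"),
]

def redact(value: str) -> str:
    """Replace credential-shaped substrings before the value is stored."""
    for pattern, label in SECRET_PATTERNS:
        value = pattern.sub(f"<redacted:{label}>", value)
    return value
```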

Child process isolation When spawning stdio servers, the wrapper strips its own secrets (MCP_AUTH_TOKEN, MCP_DB_ENCRYPTION_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, SNYK_TOKEN, MCP_SCANNER_API_KEY, MCP_SCANNER_LLM_API_KEY) from the child process environment. Supply needed env vars explicitly via the env parameter in register_server.
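The stripping step can be sketched like this (the key list comes from the text above; the function itself is illustrative):

```python
import os
from typing import Dict, Optional

# The wrapper's own secrets, stripped before spawning a stdio child.
SENSITIVE_KEYS = {
    "MCP_AUTH_TOKEN", "MCP_DB_ENCRYPTION_KEY", "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY", "SNYK_TOKEN",
    "MCP_SCANNER_API_KEY", "MCP_SCANNER_LLM_API_KEY",
}

def child_environment(extra_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Parent env minus the wrapper's secrets, plus explicitly supplied vars."""
    env = {k: v for k, v in os.environ.items() if k not in SENSITIVE_KEYS}
    env.update(extra_env or {})
    return env
```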

Input validation All server IDs, URLs, commands, and argument values are length-checked before storage. URLs are checked against the SSRF blocklist. Shell interpreters with eval flags are rejected at registration time.

HTTP auth Set MCP_AUTH_TOKEN for any HTTP deployment. The token is compared with hmac.compare_digest to prevent timing attacks. Without a token, the server logs a warning and accepts all connections.
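The comparison reduces to a one-liner; a plain `==` short-circuits on the first differing byte, which can leak how much of the token an attacker has guessed:

```python
import hmac

def token_ok(presented: str, expected: str) -> bool:
    # Constant-time comparison: runtime does not depend on where
    # the two values first differ.
    return hmac.compare_digest(presented.encode(), expected.encode())
```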

Database Enable at-rest encryption with MCP_DB_ENCRYPTION_KEY to protect stored server credentials. The database file is set to 0o600 on POSIX systems.

Argument scanning Every tool call argument is scanned for 20+ attack categories before the call is forwarded to the wrapped server. If an LLM key is available, flagged values are sent for a second-pass LLM verification to clear false positives. Blocked calls return a structured response showing exactly which argument triggered which category. Pass args_scan_override=True (or --args-scan-override on the CLI) to bypass after manual review.

Injection quarantine Tool output flagged as a prompt injection attempt is stored in the database under the run ID but is never returned to the calling agent. The response contains a quarantine notice and the run ID for forensic review.


Contributing

  1. Fork the repository and create a branch from main.
  2. Make your changes. Keep functions focused. Follow the existing pattern: validation first, then logic, then return json.dumps(...) for MCP tools.
  3. Test manually using the CLI against a real or mock MCP server.
  4. Open a pull request with a clear description of what changed and why.

Code standards:

  • No inline comments unless the reason is non-obvious.
  • No docstring blocks beyond the existing MCP tool docstrings (which are user-facing).
  • Match the surrounding code style: Optional[str] type hints, _log.warning/error for operator-visible events, _log.debug for internal traces.

License

Apache License 2.0. See LICENSE for details.


Roadmap

  • Automated test suite (unit tests for classifier, profiler, and security_utils; integration tests with a mock MCP server).
  • Redis-backed rate limiting for multi-replica deployments.
  • Schema drift detection: alert when a wrapped tool's input or output schema changes between runs.
  • Web dashboard for server health, tool risk overview, and run history.
