
Runtime behavioral auditor for MCP servers — strace-based scope-violation detection


mcp-behavioral-probe (Phase 0 spike)

A throwaway-quality spike that answers one question: can we get accurate behavioral ground truth out of a sandboxed MCP server? If yes, the real tool (behavioral auditing of MCP servers: "watch what it does, not what it says") is worth building. If getting that ground truth turns out to be miserable, it is not.

This is intentionally ~200 lines. It is not the product. It is the go/no-go gate.

The idea in one contrast

targets/leaky_server.py and targets/honest_server.py expose a tool with the identical name, description, and schema:

format_note — "Formats a markdown note. Purely local text formatting."

A static scanner that reads tool descriptions sees two identical, harmless tools. Run them under this probe and the difference is obvious:

| Target | Network egress | Sensitive file read | Findings |
| --- | --- | --- | --- |
| honest_server | none | none | 0 |
| leaky_server | 93.184.216.34:80 | ~/.ssh/id_rsa (a canary) | 2 HIGH |

The honest server producing zero findings matters as much as the leaky one tripping two — false positives are what would kill credibility.
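The contrast in the table can be sketched as two rules applied to an observed profile. This is a minimal illustration, not the code in probe/report.py: the `Profile` class, the `diff` function, the `SENSITIVE_MARKERS` list, and the `/sandbox_home` path are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

# Hypothetical path fragments treated as sensitive. The spike's real list
# is not shown in this README.
SENSITIVE_MARKERS = (".ssh/", ".env")


@dataclass
class Profile:
    """Observed behavior of one server run (illustrative shape only)."""
    files_opened: list[str] = field(default_factory=list)
    connects: list[str] = field(default_factory=list)  # "IP:port" strings


def diff(declared_local: bool, profile: Profile) -> list[str]:
    """Apply the two rules: undeclared egress and sensitive reads."""
    findings = []
    # Rule 1: any network connect is a finding if the tool claims to be local.
    if declared_local:
        for dest in profile.connects:
            findings.append(f"HIGH: connects to {dest}, undeclared")
    # Rule 2: any opened path that looks sensitive is a finding.
    for path in profile.files_opened:
        if any(marker in path for marker in SENSITIVE_MARKERS):
            findings.append(f"HIGH: reads {path}, undeclared")
    return findings


honest = Profile()
leaky = Profile(
    files_opened=["/sandbox_home/.ssh/id_rsa"],
    connects=["93.184.216.34:80"],
)

for name, profile in (("honest_server", honest), ("leaky_server", leaky)):
    print(name, diff(declared_local=True, profile=profile))
```

Findings are worded as observations ("does X, undeclared"), and an empty profile yields an empty list, which is the zero-false-positive behavior the control target is meant to demonstrate.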

How it works

Three steps, one syscall tracer:

  1. observe (probe/probe.py) — launches the MCP server wrapped in strace -f, does the MCP handshake over stdio, lists tools, and calls each with synthesized inputs. strace records openat / connect / execve / sendto to a log while passing stdio through transparently.
  2. profile (probe/analyze.py) — parses the trace into a structured behavioral profile (files opened, network connects, subprocesses), filtering out library/runtime noise. Pure observation, no judgement.
  3. diff (probe/report.py) — a deliberately crude declared-vs-observed comparison (a teaser of the real Phase 2 engine). Two rules only: network egress when a tool claims to be local, and reads of sensitive paths. Findings are framed as observations ("does X, undeclared"), never accusations.
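As a rough sketch of the profile step, the snippet below turns strace-style lines into a small structured profile. It is not the contents of probe/analyze.py: the regular expressions, the `NOISE_PREFIXES` list, the `build_profile` name, and the sample trace are assumptions made for illustration, and the sketch ignores IPv6 connects and the `<unfinished ...>` lines that `strace -f` can emit.

```python
import re

# Patterns for the common strace output shapes of three syscalls.
OPENAT = re.compile(r'openat\([^,]+, "([^"]+)"')
CONNECT = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([^"]+)"\)'
)
EXECVE = re.compile(r'execve\("([^"]+)"')

# Hypothetical noise filter: paths the dynamic loader and runtime touch.
NOISE_PREFIXES = ("/usr/lib", "/lib", "/etc/ld.so", "/proc/")


def build_profile(lines):
    """Collect files opened, IPv4 connects, and executed binaries."""
    profile = {"files": [], "connects": [], "execs": []}
    for line in lines:
        if m := CONNECT.search(line):
            port, ip = m.groups()
            profile["connects"].append(f"{ip}:{port}")
        elif m := EXECVE.search(line):
            profile["execs"].append(m.group(1))
        elif m := OPENAT.search(line):
            path = m.group(1)
            if not path.startswith(NOISE_PREFIXES):
                profile["files"].append(path)
    return profile


# Hand-written sample in the style of `strace -f -o` output (not a real trace).
SAMPLE = '''\
101 execve("/usr/bin/python", ["python", "targets/leaky_server.py"], 0x7ffd0 /* 12 vars */) = 0
101 openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
101 openat(AT_FDCWD, "/sandbox_home/.ssh/id_rsa", O_RDONLY|O_CLOEXEC) = 4
101 connect(5, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("93.184.216.34")}, 16) = 0
'''

profile = build_profile(SAMPLE.splitlines())
print(profile)
```

The function only records what was seen. Deciding whether an entry is a problem is left to the diff step, which keeps observation and judgement separate.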

Canaries (a fake ~/.ssh/id_rsa and ~/.env) are planted in sandbox_home/, which is exposed to the server as $HOME, so a server that reaches for secrets reveals itself.
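A minimal sketch of the canary setup and the traced launch follows. The function names, the canary file contents, and the exact strace flag set are assumptions, since the README does not show how run.sh or probe/probe.py do this. The sketch only builds the command and environment and does not start strace.

```python
import os
import tempfile
from pathlib import Path


def plant_canaries(home: Path) -> list[Path]:
    """Create fake secrets under a throwaway home directory."""
    canaries = {
        home / ".ssh" / "id_rsa": "CANARY-NOT-A-REAL-KEY\n",
        home / ".env": "API_KEY=CANARY-NOT-A-REAL-SECRET\n",
    }
    for path, content in canaries.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
    return list(canaries)


def sandboxed_launch(server_cmd, home: Path, trace_log: Path):
    """Return (argv, env) for running server_cmd under strace.

    -f follows child processes, -e trace=... limits the recorded syscalls,
    and -o writes the trace to a file so the server's stdio stays clean.
    """
    argv = [
        "strace", "-f",
        "-e", "trace=openat,connect,execve,sendto",
        "-o", str(trace_log),
        *server_cmd,
    ]
    # Point $HOME at the sandbox so "~" resolves to the canary directory.
    env = {**os.environ, "HOME": str(home)}
    return argv, env


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "sandbox_home"
    planted = plant_canaries(home)
    argv, env = sandboxed_launch(
        ["python", "targets/leaky_server.py"], home, Path(tmp) / "trace.log"
    )
    print([str(p.relative_to(home)) for p in planted])
    print(" ".join(argv))
```

The returned pair could be handed to `subprocess.Popen(argv, env=env, stdin=PIPE, stdout=PIPE)` on a Linux host with strace installed.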

Run it

Docker (works on macOS too — strace is Linux-only):

docker build -t mcp-probe .
docker run --rm mcp-probe                          # default: the leaky target
docker run --rm mcp-probe python targets/honest_server.py   # the control

Locally on Linux:

python -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
./run.sh                              # leaky target (default)
./run.sh python targets/honest_server.py

Point it at a real server (anything that speaks MCP over stdio), e.g.:

./run.sh python -m mcp_server_fetch
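For context on what "speaks MCP over stdio" involves, here is a sketch of the newline-delimited JSON-RPC messages a probe would write to the server's stdin. The helper names and the client name are made up for the example, and the protocol version string is one published MCP revision that a given server may or may not accept. A real client also reads each response before continuing and takes tool names from the tools/list result, which this sketch skips.

```python
import json


def rpc(method, params=None, msg_id=None):
    """Serialize one JSON-RPC 2.0 message as a single line.

    Requests carry an id; notifications omit it.
    """
    msg = {"jsonrpc": "2.0", "method": method}
    if msg_id is not None:
        msg["id"] = msg_id
    if params is not None:
        msg["params"] = params
    return json.dumps(msg) + "\n"


def probe_messages(tool_name, arguments):
    """Messages for handshake, tool listing, and a single tool call."""
    return [
        rpc(
            "initialize",
            {
                "protocolVersion": "2024-11-05",  # assumed revision
                "capabilities": {},
                "clientInfo": {"name": "probe-sketch", "version": "0"},
            },
            msg_id=1,
        ),
        rpc("notifications/initialized"),
        rpc("tools/list", msg_id=2),
        rpc("tools/call", {"name": tool_name, "arguments": arguments}, msg_id=3),
    ]


# "text" is a hypothetical argument name for the format_note tool.
messages = probe_messages("format_note", {"text": "canary-input"})
for line in messages:
    print(line, end="")
```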

Known limits (deliberately out of scope for Phase 0)

  • Linux-only ground truth via strace. eBPF/seccomp is the Phase 1+ upgrade.
  • No DNS resolution — connects are reported as IP:port, not domains.
  • stdio transport only. HTTP/SSE servers come in Phase 1.
  • Input synthesis is dumb (one canary value per field). Phase 1 swaps in hypothesis-jsonschema for real coverage.
  • The diff is a toy. The real declared-scope model (allowlists, taxonomy, rug-pull manifest hashing) is Phase 2.
  • A server that only misbehaves on specific inputs, or after N calls, may not be triggered by a single synthesized call. Exercising state is later work.
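To make the "one canary value per field" limit concrete, here is a sketch of that style of input synthesis over a tool's JSON schema. The `CANARY_BY_TYPE` table, the `synthesize` name, and the example schema fields are assumptions, not the spike's code.

```python
# One fixed placeholder per JSON Schema type (illustrative values).
CANARY_BY_TYPE = {
    "string": "canary-input",
    "integer": 1,
    "number": 1.0,
    "boolean": True,
    "array": [],
    "object": {},
}


def synthesize(schema: dict) -> dict:
    """Build one argument dict with a single canned value per property."""
    args = {}
    for name, spec in schema.get("properties", {}).items():
        type_name = spec.get("type")
        # JSON Schema allows a list of types; take the first one if present.
        if isinstance(type_name, list):
            type_name = type_name[0] if type_name else None
        # Unknown or missing types fall back to the string canary.
        args[name] = CANARY_BY_TYPE.get(type_name, "canary-input")
    return args


# Hypothetical input schema for a tool like format_note.
schema = {
    "type": "object",
    "properties": {
        "text": {"type": "string"},
        "width": {"type": "integer"},
    },
}
print(synthesize(schema))
```

Because every call uses the same values, a tool that branches on input content or on call count can stay quiet under this scheme, which is why the last limit above matters.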

If the gate passed

Next is Phase 1: generalize analyze.py into a reusable profiler, add the HTTP transport, and swap in schema-based input synthesis — then run it against ~5 real servers and confirm the profiles are accurate.
