Runtime behavioral auditor for MCP servers — strace-based scope-violation detection
mcp-behavioral-probe (Phase 0 spike)
A throwaway-quality spike that answers one question: can we get accurate behavioral ground truth out of a sandboxed MCP server? If yes, the real tool (behavioral auditing of MCP servers: "watch what it does, not what it says") is worth building. If getting that ground truth turns out to be miserable, it isn't.
This is intentionally ~200 lines. It is not the product. It is the go/no-go gate.
The idea in one contrast
`targets/leaky_server.py` and `targets/honest_server.py` expose a tool with the identical name, description, and schema:
`format_note`: "Formats a markdown note. Purely local text formatting."
A static scanner that reads tool descriptions sees two identical, harmless tools. Run them under this probe and the difference is obvious:
| Target | Network egress | Sensitive file read | Findings |
|---|---|---|---|
| honest_server | none | none | 0 |
| leaky_server | 93.184.216.34:80 | ~/.ssh/id_rsa (a canary) | 2 HIGH |
The honest server producing zero findings matters as much as the leaky one tripping two — false positives are what would kill credibility.
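To make the contrast concrete, here is a minimal sketch of a two-rule declared-vs-observed check that would produce the findings column above. It is not the code in `probe/report.py`; the `Profile` class, the `diff` function, the "claims to be local" keyword test, and the sensitive-path test are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical observed behavior for one server run."""
    connects: list[str] = field(default_factory=list)  # "ip:port" strings
    files: list[str] = field(default_factory=list)     # absolute paths opened


def is_sensitive(path: str) -> bool:
    # Assumed rule: anything under .ssh/ or a file named exactly .env.
    return "/.ssh/" in path or path.endswith("/.env")


def diff(description: str, profile: Profile) -> list[str]:
    """Compare what the tool says it does with what was observed."""
    findings = []
    # Rule 1: network egress while the description claims to be local.
    if "local" in description.lower():
        for addr in profile.connects:
            findings.append(f"HIGH: connects to {addr}, undeclared")
    # Rule 2: reads of sensitive paths, regardless of the description.
    for path in profile.files:
        if is_sensitive(path):
            findings.append(f"HIGH: reads {path}, undeclared")
    return findings


DESC = "Formats a markdown note. Purely local text formatting."
honest = Profile()
leaky = Profile(
    connects=["93.184.216.34:80"],
    files=["/sandbox_home/.ssh/id_rsa"],
)

print(len(diff(DESC, honest)), len(diff(DESC, leaky)))
```

Both profiles share the same description, so only the observed behavior separates them: the honest profile yields no findings and the leaky one yields two.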
How it works
Three steps, one syscall tracer:
- observe (`probe/probe.py`): launches the MCP server wrapped in `strace -f`, does the MCP handshake over stdio, lists tools, and calls each with synthesized inputs. `strace` records `openat`/`connect`/`execve`/`sendto` to a log while passing stdio through transparently.
- profile (`probe/analyze.py`): parses the trace into a structured behavioral profile (files opened, network connects, subprocesses), filtering out library/runtime noise. Pure observation, no judgement.
- diff (`probe/report.py`): a deliberately crude declared-vs-observed comparison (a teaser of the real Phase 2 engine). Two rules only: network egress when a tool claims to be local, and reads of sensitive paths. Findings are framed as observations ("does X, undeclared"), never accusations.
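The profile step can be sketched roughly as follows. This is not the actual `probe/analyze.py`: the regular expressions, the `NOISE_PREFIXES` list, and `build_profile` are assumptions, and the sample lines are hand-written in the usual `strace -f -o` layout rather than captured from a real run. The sketch handles IPv4 connects only and does not check syscall return values, so a failed `openat` would still be recorded.

```python
import re

# Patterns for the three syscalls the profile cares about (illustrative only).
OPENAT = re.compile(r'openat\([^,]+, "([^"]+)"')
CONNECT = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, '
    r'sin_port=htons\((\d+)\), sin_addr=inet_addr\("([^"]+)"\)'
)
EXECVE = re.compile(r'execve\("([^"]+)"')

# Assumed noise filter: loader and runtime paths nobody wants in a report.
NOISE_PREFIXES = ("/usr/lib/", "/lib/", "/etc/ld.so", "/proc/")


def build_profile(lines):
    """Turn raw trace lines into files / connects / subprocesses lists."""
    profile = {"files": [], "connects": [], "subprocesses": []}
    for line in lines:
        if m := OPENAT.search(line):
            path = m.group(1)
            if not path.startswith(NOISE_PREFIXES):
                profile["files"].append(path)
        elif m := CONNECT.search(line):
            port, ip = m.group(1), m.group(2)
            profile["connects"].append(f"{ip}:{port}")
        elif m := EXECVE.search(line):
            profile["subprocesses"].append(m.group(1))
    return profile


# Hand-written sample lines, not real output from the probe.
SAMPLE = [
    '101 execve("/usr/bin/python", ["python", "targets/leaky_server.py"], '
    '0x7ffd /* 12 vars */) = 0',
    '101 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    '101 openat(AT_FDCWD, "/sandbox_home/.ssh/id_rsa", O_RDONLY) = 4',
    '101 connect(5, {sa_family=AF_INET, sin_port=htons(80), '
    'sin_addr=inet_addr("93.184.216.34")}, 16) = 0',
]

print(build_profile(SAMPLE))
```

The profile stays free of judgement: it only records what happened, and deciding whether a read or connect is a problem is left to the diff step.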
Canaries (a fake ~/.ssh/id_rsa and ~/.env) are planted in sandbox_home/
and exposed as $HOME, so a server that reaches for secrets reveals itself.
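One way the canary setup could look, as a sketch under assumptions: the helper names `plant_canaries` and `sandbox_env` and the canary contents are hypothetical, and a temporary directory stands in for the repository's `sandbox_home/`.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical canary contents: recognizable, and obviously not real secrets.
CANARIES = {
    ".ssh/id_rsa": "CANARY-SSH-KEY-not-a-real-key\n",
    ".env": "CANARY_TOKEN=not-a-real-secret\n",
}


def plant_canaries(home: Path) -> list[Path]:
    """Write each canary file under the sandbox home directory."""
    planted = []
    for rel, content in CANARIES.items():
        target = home / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        planted.append(target)
    return planted


def sandbox_env(home: Path) -> dict[str, str]:
    """Copy the current environment, pointing HOME at the sandbox."""
    env = dict(os.environ)
    env["HOME"] = str(home)
    return env


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "sandbox_home"
    planted = plant_canaries(home)
    env = sandbox_env(home)
    print(env["HOME"], [p.name for p in planted])
```

The returned environment would be passed to whatever launches the traced server, so that `~` expands into the sandbox and any read of a canary path shows up in the trace.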
Run it
Docker (works on macOS too — strace is Linux-only):
docker build -t mcp-probe .
docker run --rm mcp-probe # default: the leaky target
docker run --rm mcp-probe python targets/honest_server.py # the control
Locally on Linux:
python -m venv .venv && . .venv/bin/activate
pip install -r requirements.txt
./run.sh # leaky target (default)
./run.sh python targets/honest_server.py
Point it at a real server (anything that speaks MCP over stdio), e.g.:
./run.sh python -m mcp_server_fetch
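For a sense of what "speaks MCP over stdio" means in practice, the sketch below builds the newline-delimited JSON-RPC 2.0 messages a client sends during a minimal session. The method names come from the MCP specification; the protocol version string, the client name, and the `text` argument are assumptions, and this is not the probe's own client code.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique within this process


def request(method, params=None):
    """Serialize a JSON-RPC request as one newline-terminated line."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg) + "\n"


def notification(method):
    """Notifications carry no id and expect no response."""
    return json.dumps({"jsonrpc": "2.0", "method": method}) + "\n"


def session(tool_name, arguments):
    """Messages for: handshake, list tools, call one tool."""
    return [
        request("initialize", {
            "protocolVersion": "2024-11-05",  # assumed version string
            "capabilities": {},
            "clientInfo": {"name": "probe-sketch", "version": "0.0.0"},
        }),
        notification("notifications/initialized"),
        request("tools/list"),
        request("tools/call", {"name": tool_name, "arguments": arguments}),
    ]


for line in session("format_note", {"text": "canary-input"}):
    print(line, end="")
```

A real client writes each line to the server's stdin and reads the matching response from stdout before sending the next request; it would also take tool names and schemas from the `tools/list` response rather than hard-coding them.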
Known limits (deliberately out of scope for Phase 0)
- Linux-only ground truth via `strace`. eBPF/seccomp is the Phase 1+ upgrade.
- No DNS resolution: connects are reported as IP:port, not domains.
- stdio transport only. HTTP/SSE servers come in Phase 1.
- Input synthesis is dumb (one canary value per field). Phase 1 swaps in `hypothesis-jsonschema` for real coverage.
- The diff is a toy. The real declared-scope model (allowlists, taxonomy, rug-pull manifest hashing) is Phase 2.
- A server that only misbehaves on specific inputs, or after N calls, may not be triggered by a single synthesized call. Exercising state is later work.
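The "one canary value per field" synthesis mentioned above might look something like this sketch. The `CANARY_BY_TYPE` table and `synthesize` function are hypothetical; the sketch ignores `required`, nested schemas, enums, and formats, which is exactly the coverage gap that schema-based generation is meant to close.

```python
# Assumed mapping from JSON Schema type to a single fixed value.
CANARY_BY_TYPE = {
    "string": "canary-input",
    "integer": 1,
    "number": 1.0,
    "boolean": True,
    "array": [],
    "object": {},
}


def synthesize(schema: dict) -> dict:
    """Build one argument dict from a tool's input schema."""
    args = {}
    for name, spec in schema.get("properties", {}).items():
        type_ = spec.get("type")
        if isinstance(type_, list):  # e.g. ["string", "null"]: take the first
            type_ = type_[0] if type_ else None
        # Unknown or missing types fall back to the string canary.
        args[name] = CANARY_BY_TYPE.get(type_, "canary-input")
    return args


schema = {
    "type": "object",
    "properties": {
        "text": {"type": "string"},
        "width": {"type": "integer"},
    },
}
print(synthesize(schema))
```

A single fixed value per field exercises the tool's default path once, which is why input-dependent or stateful misbehavior can slip past this approach.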
If the gate passed
Next is Phase 1: generalize analyze.py into a reusable profiler, add the HTTP
transport, and swap in schema-based input synthesis — then run it against ~5 real
servers and confirm the profiles are accurate.