MCP server for autonomous low-level security research with pwndbg
Project description
pwno-mcp
stateful system for autonomous pwn and binary research, designed for LLMs agents.
Overview
PwnoMCP is designed to run in scalable containerized environments (k8s) and expose stateful interaction, debugging capabilities through the Model Context Protocol (MCP). Each container runs an isolated instance with its own research self-sufficient environment.
We does it came from?: This version was a result of literation of redesigning, thinking and researching for around 6 month on this problem of "making a Pwn MCP for LLMs":
- I first tried writing a
gdbpluging and executing via gdb python api (autogdb.io), integrating MCP backend on a single backend, authorization by rewriting bit low-level implementation of early MCP's SSE (this was around March, see this post): Didn't work out well, since first of capturing program stdio was a problem forgdbapis (we did tried delimeters but another story regarding timing we will mention later), while stopping multi-thread binary is bit problematic (makes the entire part of actual executing pretty much unusable) although this version was pretty scalable with only one command backend was enough. (autogdb was only solving the problem of connecting from you're debugging (research) machine to agent client for research), it sounds easy but it was mixed with jumping between frontend, auth and specific compatibilization problem. - After realizing the scability problem of autogdb.io, I started this idea of bring even the entire research environment on cloud with scable pre-configured environments. Tons of time learning and making mistakes in k8s specifically gke, pretty much starting learning everything fron thin air. We got a working MVP on around 2 weeks diving into this (back then I still have my AP exams). Anyway backend it's still a major problem of "how to start a environment for everyone, and how to let everyone access their own environment?" We still sticked with the original centralized MCP backend approach, but this time we assign a k8s stream channel for each users, and use these io channels on one hand to natively interact with gdb (with delimiters), this was still intended to solve the problem of program IO capturing, it's a trick problem, I then thought about you should also let users see their gdb session on cloud, so I came up with the approach of duplicating a stdio channel back into frontend via k8s's stream and websockets, with around 2 months of development, we got our pwno.io up-and-running, but still tons of problem that spent incredible amount of time that i didnt mentioned, from gke integration to network issues.
- pwno.io was working I can't say well, but at a working level, there's still asynchroization problems and gke native problems but we managed to solve the most pain-in-the-ass scability, interactive IO problem that we spent around by far 3 months on. This is when I started working on pwnuous our cooperation with GGML, which will need a new thing like the previous version of pwno-mcp but for more stable support. Since for previous version, we're plugged into GDB via direct IO stream, asynchroization problem as I mentioned was another huge pain-in-the-ass, some IO slipped away and it just not stable enough for use. This is when I started thinking rewriting everything, and throw away some part just for usability for LLMs and it's full agentic compatabilization. I was working on my black hat talk back then so thought a little about statefulness, learnt about this wonderful thing that just seem to be born for us GDB/MI (Debugging with GDB), I spent few days rewriting the entire thing by reading docs. I definited did spent less time conceptualizing backend architecture for pwno.io for this version of
pwno-mcp(around 2 days mainly on gke gateway things), it's definite not a very elaborate or sophisticated framework by all mean, but it did came from a shit tons of experience of trial-and-erroring my self while thinking about the question of making something that's can scale (multi-agent, researcher using it), so I will say it's by far the best conceptualizations and work to best serve for the purpose of LLMs using it stabiliy and scability. And I do think it's the best time or the now-or-never time to open-source it, or this project or Pwno will die from lack of feedback loop, despitepwno-mcpis a little part of what we're doing.
Can I use it?:
- non-profit: yes, feel free to
- commercial: oss@pwno.io
Features
pwndbgmachine-interface, stateful execution with debugging (designed deliberately toward LLMs)- Execute Tool: Run arbitrary GDB/pwndbg commands
- Launch Tool: Load and run binaries with full control
- Step Control: Support for all stepping commands (run, c, n, s, ni, si)
- Context Retrieval: Get registers, stack, disassembly, code, and backtrace
- Breakpoint Management: Set conditional breakpoints
- Memory Operations: Read memory in various formats
- Session State: Track debugging session state and history
- Subprocess Tools:
- Compile binaries with sanitizers (ASAN, MSAN, etc.)
- Spawn and manage background processes
- Track process status and resource usage
- RetDec: RetDec decompiler service
- note: to use pwno's retdec server, you need to
export BINARY_URL=<your-binary-url>
- note: to use pwno's retdec server, you need to
- Auth: Authentication via a nonce
pwno-mcp
Installation
Using Docker
Prebuilt image (recommended)
# Pull the image
docker pull ghcr.io/pwno-io/pwno-mcp:latest
# Run with required capabilities (HTTP mode by default)
docker run -p 5500:5500 \
--cap-add=SYS_PTRACE \
--cap-add=SYS_ADMIN \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
-v $(pwd)/workspace:/workspace \
ghcr.io/pwno-io/pwno-mcp:latest
# Open a shell instead of starting the server
docker run --entrypoint /bin/bash -it \
--cap-add=SYS_PTRACE \
--cap-add=SYS_ADMIN \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
-v $(pwd)/workspace:/workspace \
ghcr.io/pwno-io/pwno-mcp:latest
The image is built for linux/amd64; on Apple Silicon, add --platform linux/amd64 to docker pull and docker run.
If :latest is not available, use :edge to track the main branch.
Tags:
latest: stable defaultedge: tracksmainsha-<short>: pin to a commitX.Y.Z,X.Y,X: release tags fromvX.Y.Z
Build locally
docker build -t pwno-mcp:latest . --platform linux/amd64
If you build locally, replace ghcr.io/pwno-io/pwno-mcp:latest with pwno-mcp:latest in the commands above.
Or use Docker Compose (builds locally by default):
# Build and run
docker-compose up -d
# View logs
docker-compose logs -f
# Execute commands in the container
docker-compose exec pwno-mcp bash
The Docker image includes:
- Ubuntu 24.04 LTS base
- GDB with pwndbg pre-installed
- Build tools (gcc, g++, clang, make, cmake)
- Address sanitizer libraries
- Python with uv package manager
- All required dependencies
Prerequisites
- GDB with Python support
- pwndbg installed and configured in
~/.gdbinit - Python 3.12+
Usage
Running the Server
The server supports two modes:
HTTP Mode (Default - Streamable HTTP)
Running the module without extra flags starts the Streamable HTTP transport:
python -m pwnomcp
# or with Docker:
docker run -p 5500:5500 \
--cap-add=SYS_PTRACE \
--cap-add=SYS_ADMIN \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
ghcr.io/pwno-io/pwno-mcp:latest
The MCP server is hosted directly via FastMCP.run() using the Streamable HTTP
transport. The default endpoint is http://0.0.0.0:5500/debug, so tools such as
Claude Desktop should point at that base URL (e.g. /debug/messages/). The
attach helper API continues to be served on http://127.0.0.1:5501/attach.
stdio Mode (for MCP Clients)
To use stdio transport (e.g., for Claude Desktop's local integration), pass the
--stdio flag explicitly:
python -m pwnomcp --stdio
# or with Docker:
docker run -i \
--cap-add=SYS_PTRACE \
--cap-add=SYS_ADMIN \
--security-opt seccomp=unconfined \
--security-opt apparmor=unconfined \
ghcr.io/pwno-io/pwno-mcp:latest --stdio
Using with Claude Desktop
To use pwno-mcp with Claude Desktop, add the following to your claude_desktop_config.json:
On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Linux: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"pwndbg-mcp": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"--cap-add=SYS_PTRACE",
"--cap-add=SYS_ADMIN",
"--security-opt",
"seccomp=unconfined",
"--security-opt",
"apparmor=unconfined",
"-v",
"/path/to/your/workspace:/workspace",
"ghcr.io/pwno-io/pwno-mcp:latest",
"--stdio"
]
}
}
}
Replace /path/to/your/workspace with your actual workspace directory path.
MCP Tools
1. Execute
Execute any GDB/pwndbg command:
{
"tool": "execute",
"arguments": {
"command": "info registers"
}
}
2. Set File
Load a binary file for debugging:
{
"tool": "set_file",
"arguments": {
"binary_path": "/path/to/binary"
}
}
3. Run
Run the loaded binary (requires at least one enabled breakpoint set beforehand):
{
"tool": "run",
"arguments": {
"args": "arg1 arg2"
}
}
Note: You must set at least one enabled breakpoint before running.
4. Step Control
Control program execution:
{
"tool": "step_control",
"arguments": {
"command": "n"
}
}
5. Get Context
Retrieve debugging context:
{
"tool": "get_context",
"arguments": {
"context_type": "all"
}
}
6. Set Breakpoint
Set breakpoints with optional conditions:
{
"tool": "set_breakpoint",
"arguments": {
"location": "main",
"condition": "$rax == 0"
}
}
7. Get Memory
Read memory at specific addresses:
{
"tool": "get_memory",
"arguments": {
"address": "0x400000",
"size": 128,
"format": "hex"
}
}
8. Get Session Info
Get current debugging session information:
{
"tool": "get_session_info",
"arguments": {}
}
9. Run Command
Execute system commands (primarily for compilation):
{
"tool": "run_command",
"arguments": {
"command": "gcc -g -fsanitize=address vuln.c -o vuln",
"cwd": "/path/to/src",
"timeout": 30.0
}
}
10. Spawn Process
Start a background process and get its PID:
{
"tool": "spawn_process",
"arguments": {
"command": "python3 -m http.server 8080",
"cwd": "/path/to/serve"
}
}
11. Get Process Status
Check status of a spawned process:
{
"tool": "get_process_status",
"arguments": {
"pid": 12345
}
}
12. Kill Process
Terminate a process:
{
"tool": "kill_process",
"arguments": {
"pid": 12345,
"signal": 15
}
}
13. List Processes
List all tracked background processes:
{
"tool": "list_processes",
"arguments": {}
}
Typical Workflow
-
Load a binary:
{"tool": "set_file", "arguments": {"binary_path": "/path/to/binary"}}
-
Prepare breakpoints, then run:
{"tool": "set_breakpoint", "arguments": {"location": "main"}} {"tool": "run", "arguments": {"args": ""}}
-
Use stepping commands and examine state:
{"tool": "step_control", "arguments": {"command": "n"}} {"tool": "get_context", "arguments": {"context_type": "all"}}
Compilation Workflow Example
-
Compile with AddressSanitizer:
{"tool": "run_command", "arguments": {"command": "gcc -g -fsanitize=address -fno-omit-frame-pointer vuln.c -o vuln"}}
-
Load and debug the compiled binary:
{"tool": "set_file", "arguments": {"binary_path": "./vuln"}} {"tool": "set_breakpoint", "arguments": {"location": "main"}} {"tool": "run", "arguments": {"args": ""}}
-
If running a server for exploitation:
{"tool": "spawn_process", "arguments": {"command": "./vulnerable_server 8080"}}
Then check its status:
{"tool": "get_process_status", "arguments": {"pid": 12345}}
Development
Project Structure
pwnomcp/
├── __init__.py
├── __main__.py
├── server.py # FastMCP server implementation
├── gdb/
│ ├── __init__.py
│ └── controller.py # GDB/pygdbmi interface
├── state/
│ ├── __init__.py
│ └── session.py # Session state management
└── tools/
├── __init__.py
└── pwndbg.py # MCP tool implementations
Key Design Decisions
-
Synchronous Tool Execution: Unlike pwndbg-gui, each MCP tool invocation returns complete results immediately, suitable for LLM interaction.
-
State Management: The server maintains session state including binary info, breakpoints, watches, and command history.
-
GDB/MI Native Commands: Leverages GDB Machine Interface commands for structured output, as recommended in the pygdbmi documentation:
-file-exec-and-symbolsinstead offilefor loading binaries-break-insertinstead ofbreakfor structured breakpoint data-exec-run,-exec-continue,-exec-next, etc. for execution control-data-evaluate-expressionfor expression evaluation-break-list,-break-delete,-break-enable/disablefor breakpoint management
This provides structured JSON-like responses instead of parsing text output, making the server more reliable and efficient.
-
Per-Container Isolation: Each container runs its own GDB instance, ensuring complete isolation between debugging sessions.
Future Enhancements
- WebSocket endpoint for streaming I/O
- Advanced memory analysis tools
- Heap exploitation helpers
- ROP chain generation
- Symbolic execution integration
License
This project is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).
- Non-commercial use only
- No derivatives or modifications may be distributed
- Attribution required
See the LICENSE file for the full legal text or visit the license page: https://creativecommons.org/licenses/by-nc-nd/4.0/
Contributing
Contributions are welcome! Please submit pull requests or open issues on GitHub.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file iflow_mcp_pwno_io_pwno_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: iflow_mcp_pwno_io_pwno_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 44.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
41961baa7a0f259b98599fb5475d8536857009e5733042b5cf276f5fd05838eb
|
|
| MD5 |
49fb649dd376493455e0f51f8da8d787
|
|
| BLAKE2b-256 |
b940ee44d2a1fd62a4827afee6b9801aeb6813bbc5624cec50b9bd6f00407cc8
|