PeekabooWin MCP

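A Windows desktop automation MCP server that lets AI agents see and operate Windows desktop applications.

Built on Windows UI Automation (UIA), SendInput, and PaddleOCR, it covers the full range of desktop interaction: element discovery, input simulation, screenshots, window management, clipboard access, and OCR.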


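System requirements

  • Operating system: Windows 10 or Windows 11
  • Tooling: uv (Python package manager) is recommended
  • Permissions: running as administrator unlocks all features (sending input to elevated windows)
  • OCR: optional; requires PaddleOCR (about 1 GB)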

Installation

Option 1: one-line install (recommended)

irm https://raw.githubusercontent.com/wangneal/PeekabooWin/main/install.ps1 | iex

The script installs uv and PeekabooWin automatically, then prints the MCP configuration.

Option 2: run directly with uvx (no installation needed)

uvx peekaboowin

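The first run downloads the package from PyPI and caches it; later launches start within seconds.

Option 3: install locally

# Requires uv: https://docs.astral.sh/uv/#installation

# Install from PyPI (recommended)
uv tool install peekaboowin

# Optional: OCR extra (about 1 GB)
uv tool install "peekaboowin[ocr]"

# Or install from source
git clone https://github.com/wangneal/PeekabooWin.git
cd PeekabooWin
uv pip install -e ".[ocr]"

After installation, run the peekaboowin command to start the server directly.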


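MCP client configuration

After installing, add the configuration below for the client you use.

OpenCode

Config file: C:\Users\<username>\.config\opencode\opencode.json

With uvx (recommended, no installation needed):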

{
  "mcp": {
    "peekaboowin": {
      "type": "local",
      "command": ["uvx", "peekaboowin", "-y"],
      "enabled": true
    }
  }
}

With the locally installed command:

{
  "mcp": {
    "peekaboowin": {
      "type": "local",
      "command": ["peekaboowin"],
      "enabled": true
    }
  }
}

Pick either option. With uvx, the package is downloaded from PyPI and cached automatically, so no manual installation is needed.

Claude Desktop

Config file: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "peekaboowin": {
      "command": "uvx",
      "args": ["peekaboowin", "-y"]
    }
  }
}

Cursor

Settings path: Settings → Features → MCP Servers → Add custom MCP

{
  "mcpServers": {
    "peekaboowin": {
      "command": "uvx",
      "args": ["peekaboowin", "-y"]
    }
  }
}

Windsurf

Config file: ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "peekaboowin": {
      "command": "uvx",
      "args": ["peekaboowin", "-y"]
    }
  }
}

GitHub Copilot (VS Code)

Config file: .vscode/mcp.json

{
  "servers": {
    "peekaboowin": {
      "command": "uvx",
      "args": ["peekaboowin", "-y"]
    }
  }
}

All client examples above use uvx, which pulls the package from PyPI automatically, so no manual pip install is needed.
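When a client already has other MCP servers configured, the peekaboowin entry has to be merged in rather than pasted over the file. The following is a minimal Python sketch of that merge. The helper name add_peekaboowin is hypothetical and not part of PeekabooWin; the entry and the root keys ("mcpServers", "servers") are taken from the snippets above.

```python
import json

# Entry copied from the client snippets above.
PEEKABOOWIN_ENTRY = {"command": "uvx", "args": ["peekaboowin", "-y"]}


def add_peekaboowin(config: dict, root_key: str = "mcpServers") -> dict:
    """Return a copy of `config` with the peekaboowin server registered.

    `root_key` is "mcpServers" for Claude Desktop, Cursor, and Windsurf,
    and "servers" for .vscode/mcp.json, matching the snippets above.
    Hypothetical helper, for illustration only.
    """
    merged = dict(config)
    servers = dict(merged.get(root_key, {}))   # keep servers already present
    servers["peekaboowin"] = dict(PEEKABOOWIN_ENTRY)
    merged[root_key] = servers
    return merged


existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_peekaboowin(existing), indent=2))
```

The function returns a new dictionary instead of mutating its input, so the result can be inspected before it is written back to the client's config file.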


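Tool reference

32 tools in total, grouped by function:

Screenshots and screen

| Tool | Parameters | Description |
| --- | --- | --- |
| screenshot | monitor?, region?, image_format?, quality? | Capture the screen or a region |
| capture_window | hwnd, image_format?, quality? | Capture a specific window |
| list_monitors | (none) | List all monitors |
| health_check | (none) | System diagnostics: DPI, permissions, OCR status |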

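UI element discovery

| Tool | Parameters | Description |
| --- | --- | --- |
| find_element | name?, class_name?, automation_id?, scope? | Find elements by name, class name, or ID; falls back to OCR automatically when UIA returns nothing |
| get_element_info | element_ref | Get detailed properties of an element |
| get_children | element_ref, depth? | Get the tree of child elements |
| get_desktop | depth? | Get the desktop element tree |
| harvest_ui | target?, depth? | One-stop UI discovery; supplements with OCR automatically when the UIA tree is sparse |

Element reference formats: hwnd:12345 / point:100,200 / desktop

harvest_ui target: desktop (all windows) / foreground (the foreground window) / 12345 (a specific HWND)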
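To make the three element reference formats concrete, here is a small Python sketch that parses them into a typed value. Both ElementRef and parse_element_ref are hypothetical names for illustration; the server's actual parsing code is not shown in this document and may behave differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ElementRef:
    kind: str                                  # "hwnd", "point", or "desktop"
    hwnd: Optional[int] = None
    point: Optional[Tuple[int, int]] = None


def parse_element_ref(ref: str) -> ElementRef:
    """Parse 'hwnd:12345', 'point:100,200', or 'desktop' (hypothetical helper)."""
    ref = ref.strip()
    if ref == "desktop":
        return ElementRef("desktop")
    kind, sep, value = ref.partition(":")
    if sep and kind == "hwnd":
        return ElementRef("hwnd", hwnd=int(value))
    if sep and kind == "point":
        x, y = value.split(",")                # wrong arity raises ValueError
        return ElementRef("point", point=(int(x), int(y)))
    raise ValueError(f"unrecognized element reference: {ref!r}")


print(parse_element_ref("hwnd:12345"))
print(parse_element_ref("point:100,200"))
```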

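Input simulation

| Tool | Parameters | Description |
| --- | --- | --- |
| click | x, y, button?, double? | Mouse click (auto-focuses the target window) |
| move_mouse | x, y | Move the mouse (auto-focuses the target window) |
| drag | x1, y1, x2, y2, duration? | Drag (auto-focuses the window at the start point) |
| scroll | x, y, delta? | Scroll (auto-focuses the target window) |
| type_text | text | Type Unicode text (sent blind; activate the window first) |
| press_keys | keys | Key combination: ctrl+s, alt+tab, win+r |
| click_element | element_ref | Semantic click (auto-focuses the target window) |
| type_into_element | element_ref, text | Semantic text input (auto-focuses the target window) |

Key names supported by press_keys: modifiers (ctrl/alt/shift/win), navigation (enter/tab/escape/arrow keys/delete/backspace), function keys (f1-f24), editing (home/end/pageup/pagedown/insert), numeric keypad (num0-num9 and the numpad operators), and single-character symbols (= + - [ ] \ ; ' , . / `), which are resolved automatically via VkKeyScanW.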
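The combination syntax (modifiers joined to a final key with +) can be illustrated with a short parser. This is a sketch under assumptions: split_combo is a hypothetical helper, and treating a trailing + as the + key itself is a choice made here, not documented behavior of press_keys.

```python
MODIFIERS = ("ctrl", "alt", "shift", "win")


def split_combo(combo: str):
    """Split 'ctrl+s' into (modifiers, key). Hypothetical helper.

    A trailing '+' is treated as the '+' key itself, so 'ctrl++' works.
    """
    combo = combo.strip().lower()
    if combo == "+":
        parts = ["+"]
    elif combo.endswith("++"):
        parts = combo[:-2].split("+") + ["+"]
    else:
        parts = combo.split("+")
    if any(not p for p in parts):
        raise ValueError(f"malformed key combination: {combo!r}")
    *mods, key = parts
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    return mods, key


print(split_combo("ctrl+s"))
print(split_combo("win+r"))
```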

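Window management

| Tool | Description |
| --- | --- |
| list_windows | List all visible windows |
| get_foreground_window | Get information about the foreground window |
| activate_window hwnd | Bring the window to the foreground |
| move_window / resize_window | Move or resize a window |
| minimize_window / maximize_window / restore_window | Minimize, maximize, or restore a window |
| close_window | Close a window by sending WM_CLOSE |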

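Waiting and polling

| Tool | Parameters | Description |
| --- | --- | --- |
| wait_for_element | name?, automation_id?, class_name?, control_type?, timeout?, interval?, visible? | Wait for an element to appear or disappear |
| wait_for_window | title?, class_name?, timeout?, interval?, appear? | Wait for a window to appear or disappear |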
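The timeout and interval parameters suggest a standard polling loop. The sketch below shows that general pattern in Python; wait_until is a hypothetical function, its default values are placeholders, and the server's real implementation may differ.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.2) -> bool:
    """Poll `condition` every `interval` seconds until it holds or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "does the window exist yet?": true from the third check on.
checks = []


def window_exists() -> bool:
    checks.append(1)
    return len(checks) >= 3


print(wait_until(window_exists, timeout=1.0, interval=0.01))
```

Waiting for something to disappear (visible or appear set to false) is the same loop with the condition inverted, for example wait_until(lambda: not window_exists()).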

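Clipboard

| Tool | Parameters | Description |
| --- | --- | --- |
| read_clipboard | (none) | Read text from the clipboard |
| write_clipboard | text | Write text to the clipboard |
| paste_text | text | Write text to the clipboard, then send Ctrl+V |

OCR

| Tool | Parameters | Description |
| --- | --- | --- |
| ocr_read | source?, hwnd?, region?, lang? | OCR text recognition on the screen, a window, or a region |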

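Configuration

Configure the server through environment variables prefixed with PEEKABOOWIN_.

| Variable | Default | Description |
| --- | --- | --- |
| PEEKABOOWIN_LOG_LEVEL | info | Log level |
| PEEKABOOWIN_LOG_FORMAT | json | Log format |
| PEEKABOOWIN_SCREENSHOT_FORMAT | png | Screenshot format |
| PEEKABOOWIN_SCREENSHOT_QUALITY | 85 | JPEG quality |
| PEEKABOOWIN_SCREENSHOT_MAX_WIDTH | 1920 | Maximum screenshot width |
| PEEKABOOWIN_OCR_ENABLED | false | OCR switch (enabled automatically when paddleocr is installed) |
| PEEKABOOWIN_OCR_LANGUAGE | ch | OCR language |
| PEEKABOOWIN_UIA_TIMEOUT | 2.0 | UIA operation timeout (seconds) |
| PEEKABOOWIN_UIA_MAX_DEPTH | 10 | Maximum UIA tree traversal depth |
| PEEKABOOWIN_INPUT_CLICK_DELAY | 0.05 | Delay after a click (seconds) |
| PEEKABOOWIN_INPUT_TYPE_DELAY | 0.02 | Delay between keystrokes (seconds) |

Example: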

{
  "mcpServers": {
    "peekaboowin": {
      "command": "peekaboowin",
      "env": {
        "PEEKABOOWIN_LOG_LEVEL": "debug",
        "PEEKABOOWIN_SCREENSHOT_MAX_WIDTH": "3840"
      }
    }
  }
}
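For readers wondering how prefixed variables like these typically map to typed settings, here is a minimal Python sketch. The defaults are copied from the table above; load_settings and the lower-case setting names are hypothetical, and PEEKABOOWIN_OCR_ENABLED is left out because its auto-detection logic is not described here.

```python
import os
from typing import Mapping, Optional

PREFIX = "PEEKABOOWIN_"

# name -> (default, type), copied from the table above.
DEFAULTS = {
    "LOG_LEVEL": ("info", str),
    "LOG_FORMAT": ("json", str),
    "SCREENSHOT_FORMAT": ("png", str),
    "SCREENSHOT_QUALITY": (85, int),
    "SCREENSHOT_MAX_WIDTH": (1920, int),
    "OCR_LANGUAGE": ("ch", str),
    "UIA_TIMEOUT": (2.0, float),
    "UIA_MAX_DEPTH": (10, int),
    "INPUT_CLICK_DELAY": (0.05, float),
    "INPUT_TYPE_DELAY": (0.02, float),
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read PEEKABOOWIN_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    settings = {}
    for name, (default, cast) in DEFAULTS.items():
        raw = env.get(PREFIX + name)
        settings[name.lower()] = default if raw is None else cast(raw)
    return settings


# Same overrides as the JSON example above.
demo = load_settings({
    "PEEKABOOWIN_LOG_LEVEL": "debug",
    "PEEKABOOWIN_SCREENSHOT_MAX_WIDTH": "3840",
})
print(demo["log_level"], demo["screenshot_max_width"], demo["uia_timeout"])
```

Environment values always arrive as strings, which is why the JSON example quotes "3840" and why the sketch casts each value to the type of its default.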

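AI agent prompting guide

Core rules

  1. Wait for confirmation after every step: after press_keys("win+r"), call wait_for_window(title="运行") (the Run dialog) before the next action; do not fire operations back to back
  2. Observe before acting: use screenshot / harvest_ui to understand the UI state before executing anything
  3. Prefer semantic operations: click_element / type_into_element are preferable to coordinate-based operations
  4. Handle UWP apps specially: when the UIA tree is sparse, find_element falls back to OCR automatically, and its results carry a "source": "ocr_fallback" marker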
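Rules 3 and 4 interact: semantic clicks are preferred, but an OCR fallback result has no hwnd or automation_id to click on. A rough Python sketch of that decision follows. The "source": "ocr_fallback" marker comes from the text above; the result fields bbox and element_ref, and the helper plan_click, are assumptions made for illustration.

```python
def plan_click(result: dict) -> tuple:
    """Choose a click tool for a find_element result (field names are assumed)."""
    if result.get("source") == "ocr_fallback":
        # OCR hits carry no hwnd/automation_id, so fall back to coordinates.
        left, top, right, bottom = result["bbox"]          # assumed field
        return ("click", {"x": (left + right) // 2, "y": (top + bottom) // 2})
    # UIA hits can be clicked semantically.
    return ("click_element", {"element_ref": result["element_ref"]})  # assumed field


print(plan_click({"source": "ocr_fallback", "bbox": [10, 20, 30, 40]}))
print(plan_click({"element_ref": "hwnd:12345"}))
```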

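Example: open Notepad

# The window titles below are those of a Chinese-language Windows;
# on an English-language system the titles contain "Run" and "Notepad" instead.
press_keys("win+r")
wait_for_window(title="运行", timeout=3)    # wait for the Run dialog
type_text("notepad")
press_keys("enter")
wait_for_window(title="记事本", timeout=5)  # wait for the Notepad window
type_text("Hello from AI!")
screenshot()                                # confirm the result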
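The same sequence can be driven from a script through any MCP client. The sketch below keeps the client abstract: call_tool is a hypothetical callable taking a tool name and keyword arguments, standing in for whatever client library is used, and the parameter names follow the tool reference above.

```python
from typing import Any, Callable

ToolCaller = Callable[..., Any]   # call_tool(name, **arguments), hypothetical


def open_notepad(call_tool: ToolCaller, run_title: str = "运行",
                 notepad_title: str = "记事本") -> None:
    """Replay the example above, waiting for a window after each launch step."""
    call_tool("press_keys", keys="win+r")
    call_tool("wait_for_window", title=run_title, timeout=3)
    call_tool("type_text", text="notepad")
    call_tool("press_keys", keys="enter")
    call_tool("wait_for_window", title=notepad_title, timeout=5)
    call_tool("type_text", text="Hello from AI!")
    call_tool("screenshot")


# Dry run: record the calls instead of talking to a real MCP session.
log = []
open_notepad(lambda name, **args: log.append((name, args)))
for name, args in log:
    print(name, args)
```

Passing the window titles as parameters keeps the flow usable on systems with a different display language.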

Auto-focus notes

  • click / click_element / type_into_element / drag / scroll / move_mouse activate the target window automatically via AttachThreadInput + SetForegroundWindow
  • type_text / press_keys are blind operations that do not focus anything; call activate_window before using them

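Architecture

Tool Layer      - 32 @mcp.tool() functions sharing one @tool_error_handler decorator
Service Layer   - singleton services that orchestrate the business logic
Platform Layer  - comtypes (UIA) / ctypes (SendInput) / mss (screenshots) / PaddleOCR

Each layer has a clear responsibility: the Tool layer validates parameters, the Service layer orchestrates logic, and the Platform layer wraps the Win32 APIs.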
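To show how the layers might fit together, here is a simplified, synchronous Python sketch. Only the decorator name tool_error_handler and the tool name activate_window come from this document; the decorator body, the error result shape, and the WindowService class are assumptions, and the @mcp.tool() registration is omitted.

```python
import functools


def tool_error_handler(func):
    """Turn exceptions into a structured error result (assumed shape)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:   # tool boundary: report instead of crashing
            return {"ok": False, "error": type(exc).__name__, "message": str(exc)}
    return wrapper


class WindowService:
    """Service layer stand-in; a real one would call into the Platform layer."""

    def activate(self, hwnd: int) -> dict:
        return {"ok": True, "hwnd": hwnd}


_service = WindowService()   # one shared instance, mirroring the singleton idea


@tool_error_handler
def activate_window(hwnd: int) -> dict:
    if hwnd <= 0:                        # Tool layer: parameter validation
        raise ValueError("hwnd must be a positive integer")
    return _service.activate(hwnd)       # Service layer: orchestration


print(activate_window(42))
print(activate_window(0))
```

Centralizing error handling in one decorator means every tool returns failures in the same shape, which is easier for an agent to interpret than a raw traceback.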


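Limitations

  • Windows 10/11 only: depends on the Windows UI Automation API
  • Sparse UIA trees in UWP apps: the server falls back to OCR automatically, but OCR elements lack automation_id / hwnd and cannot be used with click_element
  • Window screenshots are not true window content: capture_window crops the screen via BitBlt, so UWP/DirectX windows come out black
  • Limited background clicking: UIA InvokePattern could enable background operation, but it is not implemented yet
  • UAC isolation: a non-administrator process cannot send input to elevated windows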

License

MIT
