Hyprland-first computer-use foundation for MCP clients (observe/capture/input/tasks).

Wisp Hand

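A computer-use MCP runtime for Hyprland/Wayland. It does no agent planning or decision-making; it only exposes capabilities that external AI agents and clients can call: observation, screen capture, diffing, input, batching, optional local vision, and task-augmented execution for long-running calls.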

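Good fits:

  • Letting an AI inspect the state of a desktop application (for example the Godot editor) and perform a small amount of input
  • Using capture + diff to verify GUI behavior (did anything change between before and after an action?)
  • Auditable input replay and troubleshooting, with session/scope as the safety boundary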

Quick start

Self-check before connecting:

uvx wisp-hand doctor --json | jq .

Start the server (stdio transport by default):

uvx wisp-hand mcp --config ~/.config/wisp-hand/config.toml

Development/debugging (equivalent entry points):

uv run wisp-hand mcp --config ./config.toml
python -m wisp_hand mcp --config ./config.toml

Verify with MCP Inspector:

just inspector

Serve the documentation site (MkDocs):

just docs-serve

Tool overview (MCP tools)

The tool namespace is fixed as wisp_hand.*

  • Discovery and basics: wisp_hand.capabilities, wisp_hand.session.open, wisp_hand.session.close
  • Read-only observation: wisp_hand.desktop.get_active_window, wisp_hand.desktop.get_monitors, wisp_hand.desktop.list_windows, wisp_hand.desktop.get_topology, wisp_hand.cursor.get_position
  • Capture and diff: wisp_hand.capture.screen, wisp_hand.capture.diff (capture artifact store: png + json metadata, served through MCP resources by default)
  • Batching and waiting: wisp_hand.batch.run, wisp_hand.wait
  • Optional local vision: wisp_hand.vision.describe / wisp_hand.vision.locate (Ollama; can be disabled)
  • Scoped input: wisp_hand.pointer.*, wisp_hand.keyboard.*

Token-efficient by default:

  • The content of a tool result carries only a very short summary (ok on success; code: message on failure); the full result is available only in structuredContent
  • wisp_hand.capture.screen does not put png/base64/path into the tool result by default; it returns image_uri/metadata_uri so the client can fetch them on demand with resources/read
  • wisp_hand.desktop.get_topology supports detail=summary|full|raw and defaults to summary (no windows list). Switch to full/raw when troubleshooting
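
As a rough sketch of how a client might consume these token-efficient results, the function below takes a tool result that has already been decoded into a dict, reads the short summary from content, takes the full payload from structuredContent, and collects any URIs to fetch later with resources/read. The names content, structuredContent, image_uri and metadata_uri come from the list above; the assumption that the URIs sit at the top level of structuredContent, the helper name summarize_tool_result, and the example URIs are illustrative only.

```python
from typing import Any


def summarize_tool_result(result: dict[str, Any]) -> dict[str, Any]:
    """Split a decoded tool result into summary, payload, and URIs to fetch."""
    # The short human-readable summary lives in the text items of `content`.
    texts = [
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    ]
    # The full machine-readable payload lives in `structuredContent`.
    payload = result.get("structuredContent") or {}
    # Assumption: capture results expose the URIs as top-level keys.
    uris = [payload[key] for key in ("image_uri", "metadata_uri") if key in payload]
    return {"summary": " ".join(texts), "payload": payload, "resource_uris": uris}


example = {
    "content": [{"type": "text", "text": "ok"}],
    "structuredContent": {
        # Hypothetical URIs, for illustration only.
        "image_uri": "wisp-hand://captures/example/image",
        "metadata_uri": "wisp-hand://captures/example/metadata",
    },
}
info = summarize_tool_result(example)
print(info["summary"], info["resource_uris"])
```

Keeping the URIs separate from the summary lets a client decide per call whether the image is worth the extra round trip.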

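Safe defaults:

  • New sessions default to armed=false, so input tools are rejected
  • Dangerous keyboard shortcuts are rejected by policy (and recorded in audit/log)
  • Redaction by default: keyboard.type text and similar content is not written to logs or the audit trail in plain text (configurable)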

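Core concepts: Session + Scope

Wisp Hand input must run inside a session, and each session is bound to an explicit scope. Capture and input share the same scope-relative coordinate system, which closes the loop and simplifies safety control and auditing.

Common scopes:

  • window: bound to a single window (recommended for applications such as Godot, IDEs, and browsers)
  • region: bound to an explicit rectangle (suited to local verification and safe input)

Recommended input flow:

  1. First open a session with armed=true, dry_run=true to validate coordinates (especially with multiple monitors and scaling).
  2. Then use armed=true, dry_run=false for real input.
  3. Prefer capture.screen + capture.diff to judge whether an action took effect, rather than polling desktop.get_topology at high frequency.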
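
The first two steps of the flow above could be sketched as two JSON-RPC tools/call requests that differ only in dry_run. This is a minimal illustration, not the project's client code: the helpers tool_call and open_session are hypothetical, the argument names are taken from this README, and passing armed and dry_run as arguments of wisp_hand.session.open is an assumption drawn from the wording of the steps.

```python
import json
from itertools import count
from typing import Any

_ids = count(1)  # simple monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def open_session(selector: str, dry_run: bool) -> dict[str, Any]:
    """Request an armed, window-scoped session; dry_run=True is the rehearsal."""
    return tool_call(
        "wisp_hand.session.open",
        {
            "scope_type": "window",
            "scope_target": selector,
            "armed": True,       # assumption: accepted by session.open
            "dry_run": dry_run,  # assumption: accepted by session.open
        },
    )


# Step 1: rehearse with dry_run=true; step 2: repeat with dry_run=false.
rehearsal = open_session("<selector>", dry_run=True)
real = open_session("<selector>", dry_run=False)
print(json.dumps(rehearsal, indent=2))
```

Building both requests from one helper keeps the rehearsal and the real run identical apart from the single flag being tested.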

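Scenario: let an AI observe and operate Godot

A minimal closed loop can be split into:

  1. Get the Godot window selector (bring Godot to the foreground, then read desktop.get_active_window)
  2. session.open(scope_type="window", scope_target="<selector>")
  3. capture.screen to take a screenshot (use vision.locate to help find Run/Play if needed)
  4. pointer.click to click Run, then wait for the UI to change
  5. capture.screen again and verify with capture.diff

A more complete flow is documented in docs/mkdocs/scenarios/godot.md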

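Coordinates and scaling (important)

  • External input coordinates always use Hyprland layout/logical px (scope-relative)
  • Screenshot dimensions are in image px (which may differ from layout px because of scale)
  • The runtime attaches coordinate-mapping information to the topology and handles mixed scaling and multi-monitor setups through an adaptive coordinate backend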
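
To make the image px versus layout px distinction concrete, here is a small sketch of the arithmetic for a single uniformly scaled scope. It assumes the only difference between the two spaces is one scale factor (image px per layout px). The ScopeMapping type and both functions are hypothetical and do not reflect the mapping schema the runtime actually reports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeMapping:
    """Hypothetical mapping for one scope; not the runtime's actual schema."""

    scale: float  # image px per layout px, e.g. 2.0 on a HiDPI monitor

    def __post_init__(self) -> None:
        if self.scale <= 0:
            raise ValueError("scale must be positive")


def image_to_layout(mapping: ScopeMapping, x: float, y: float) -> tuple[float, float]:
    """Convert a point found in a screenshot to scope-relative layout px."""
    return (x / mapping.scale, y / mapping.scale)


def layout_to_image(mapping: ScopeMapping, x: float, y: float) -> tuple[float, float]:
    """Convert a scope-relative layout point to screenshot pixel coordinates."""
    return (x * mapping.scale, y * mapping.scale)


hidpi = ScopeMapping(scale=2.0)
# A button located at (640, 360) in the screenshot is clicked at (320, 180).
click_at = image_to_layout(hidpi, 640, 360)
print(click_at)
```

A point located in a screenshot (for example by a vision model) has to go through a conversion like image_to_layout before it is used as an input coordinate.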

Coordinate diagnostics script:

uv run python examples/attempts/diagnose_coordinates.py --capture-check

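Task-Augmented Execution (long-running calls)

When a client includes task metadata in the tools/call params, the server immediately returns a CreateTaskResult and runs the call in the background. The client can poll with tasks/get, fetch the final CallToolResult with tasks/result, or cancel with tasks/cancel.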
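
The create, poll, fetch cycle described above might look roughly like the sketch below. It is transport-agnostic: send is any callable that performs one JSON-RPC request and returns the decoded result. The method names come from the paragraph above; the taskId field, the status values, the ttl entry in the task metadata, and the helper run_as_task are assumptions based on the MCP tasks specification rather than anything confirmed by this README, so check them against the server before relying on them.

```python
import time
from typing import Any, Callable

Send = Callable[[str, dict[str, Any]], dict[str, Any]]
# Assumption: these statuses mean the task will not change any further.
TERMINAL = {"completed", "failed", "cancelled"}


def run_as_task(
    send: Send,
    tool: str,
    arguments: dict[str, Any],
    poll_seconds: float = 0.5,
    timeout_seconds: float = 60.0,
) -> dict[str, Any]:
    """Start a tool call as a task, poll until it settles, return its result."""
    created = send(
        "tools/call",
        {"name": tool, "arguments": arguments, "task": {"ttl": 60_000}},
    )
    task_id = created["task"]["taskId"]  # assumed CreateTaskResult shape
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = send("tasks/get", {"taskId": task_id})["status"]
        if status in TERMINAL:
            break
        if time.monotonic() >= deadline:
            # Give up: ask the server to cancel, then report the timeout.
            send("tasks/cancel", {"taskId": task_id})
            raise TimeoutError(f"task {task_id} still {status!r} at deadline")
        time.sleep(poll_seconds)
    return send("tasks/result", {"taskId": task_id})


def fake_transport() -> Send:
    """In-memory stand-in for a server, used only to exercise the loop."""
    statuses = iter(["working", "completed"])

    def send(method: str, params: dict[str, Any]) -> dict[str, Any]:
        if method == "tools/call":
            return {"task": {"taskId": "t-1", "status": "working"}}
        if method == "tasks/get":
            return {"taskId": params["taskId"], "status": next(statuses)}
        if method == "tasks/result":
            return {"content": [{"type": "text", "text": "ok"}]}
        return {}

    return send


result = run_as_task(fake_transport(), "wisp_hand.wait", {}, poll_seconds=0.0)
print(result["content"][0]["text"])
```

The tool arguments are left empty here because the fake transport ignores them; a real call would pass whatever the chosen tool requires.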

Smoke scripts for reference:

  • uv run python examples/attempts/smoke_mcp_transports.py --transport stdio
  • uv run python examples/attempts/smoke_mcp_transports.py --transport sse

Configuration example (config.toml)

Default config path: ~/.config/wisp-hand/config.toml (it can also be set with the WISP_HAND_CONFIG environment variable or the CLI --config option)

[server]
transport = "stdio"          # stdio | sse | streamable-http
host = "127.0.0.1"
port = 8000

[paths]
state_dir = "~/.local/state/wisp-hand"
audit_file = "~/.local/state/wisp-hand/audit.jsonl"
runtime_log_file = "~/.local/state/wisp-hand/runtime.jsonl"
# capture_dir defaults to state_dir/captures when not set explicitly

[logging]
level = "INFO"
allow_sensitive = false

[logging.console]
enabled = true
format = "rich"              # rich | plain | json

[logging.file]
enabled = true
format = "json"              # json | plain | rich (automatically downgraded)

[retention.captures]
max_age_seconds = 604800      # 7d
max_total_bytes = 268435456   # 256MB

[retention.audit]
max_bytes = 10485760
backup_count = 5

[retention.runtime_log]
max_bytes = 10485760
backup_count = 5

[vision]
mode = "disabled"            # disabled | assist
base_url = "http://127.0.0.1:11434"
model = "qwen3.5:0.8b"
timeout_seconds = 30

[coordinates]
mode = "auto"                # auto | hyprctl-infer | grim-probe | active-pointer-probe
cache_enabled = true
probe_region_size = 120
min_confidence = 0.75

Documentation

This repository ships a MkDocs documentation site (content under docs/mkdocs/):

uv run mkdocs serve
uv run mkdocs build --strict

Troubleshooting

docs/TROUBLESHOOTING.md
