MCP server for Windows UI Automation — control any Windows app from AI agents

WinUI MCP Server

Chinese documentation (中文文档)

An MCP (Model Context Protocol) server that enables AI agents to control any Windows desktop application through UI Automation. No screen coordinates needed — controls are located by their class hierarchy and names.
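
To make "located by class hierarchy and name" concrete, here is a minimal sketch on a toy tree. The Node type and the find helper are hypothetical stand-ins invented for illustration; they are not the project's API and do not touch real UI Automation.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Node:
    """Toy stand-in for a UI Automation element (hypothetical, not the project's type)."""
    control_class: str
    name: str = ""
    children: list["Node"] = field(default_factory=list)


def walk(node: Node, path: tuple[str, ...] = ()) -> Iterator[tuple[tuple[str, ...], Node]]:
    """Yield (class_path, node) pairs in depth-first order."""
    here = path + (node.control_class,)
    yield here, node
    for child in node.children:
        yield from walk(child, here)


def find(root: Node, name: Optional[str] = None,
         control_class: Optional[str] = None) -> list[tuple[str, ...]]:
    """Return the class path of every node matching the given name and/or class."""
    return [
        path
        for path, node in walk(root)
        if (name is None or node.name == name)
        and (control_class is None or node.control_class == control_class)
    ]


# A made-up window layout; no screen coordinates are involved anywhere.
window = Node("Window", "Notepad", [
    Node("MenuBar", "", [Node("MenuItem", "File"), Node("MenuItem", "Edit")]),
    Node("Pane", "", [Node("Button", "Save"), Node("Button", "Cancel")]),
])

save_paths = find(window, name="Save", control_class="Button")
```

The match is identified by its path through the class hierarchy plus its name, so it stays valid when the window moves or is resized.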

What It Does

This server exposes 26 tools that let your AI agent:

  • Discover — explore the UIA tree of any window to find controls
  • Click / Double-click / Right-click / Hover — interact with controls by name or class
  • Type / Send keys / Hotkeys — keyboard input to any focused or targeted control
  • Scroll — scroll up/down on a control or window
  • Read / Set values — get or set text fields, spinboxes, checkboxes, comboboxes
  • Wait — wait for a control to appear or disappear (async UI)
  • Window management — list windows, get window state, focus, restore minimized windows

Every tool returns structured JSON: {"success": bool, "message": str, "data": dict}.
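
A client can check that envelope before acting on it. The sketch below assumes only the three documented keys; the sample payload and the contents of data are invented for illustration.

```python
import json
from typing import Any, TypedDict


class ToolResult(TypedDict):
    success: bool
    message: str
    data: dict[str, Any]


def parse_tool_result(raw: str) -> ToolResult:
    """Parse a tool's JSON reply and check it has the documented shape."""
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        raise ValueError("tool result must be a JSON object")
    if not isinstance(obj.get("success"), bool):
        raise ValueError("'success' must be a bool")
    if not isinstance(obj.get("message"), str):
        raise ValueError("'message' must be a str")
    if not isinstance(obj.get("data"), dict):
        raise ValueError("'data' must be a dict")
    return ToolResult(success=obj["success"], message=obj["message"], data=obj["data"])


# Sample payload invented for illustration; real 'data' contents depend on the tool.
sample = '{"success": true, "message": "Clicked", "data": {"name": "Save"}}'
result = parse_tool_result(sample)
```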

Prerequisites

  • Windows 10/11
  • Python 3.10+
  • uv — fast Python package manager (pip install uv)

Setup

Install uv (if not already installed)

pip install uv

Install the package

git clone https://github.com/your-username/winui_mcp_server.git
cd winui_mcp_server

uv sync

Or install directly from PyPI:

uv tool install winui-mcp-server

Install as MCP Server

The server runs over stdio transport. Configuration for common agent clients follows.


Claude Code

Add to .mcp.json in your project root (project scope) or to ~/.claude.json (user scope):

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Restart Claude Code. The winui tools will appear in your tool list.


Codex (OpenAI)

Add to ~/.codex/config.toml:

[mcp_servers.winui]
command = "uvx"
args = ["winui-mcp-server"]

Restart Codex to pick up the new server.


Cline (VS Code Extension)

  1. Open VS Code with the Cline extension installed.
  2. Open Cline's MCP settings (gear icon in the Cline panel).
  3. Add a new MCP server with the following configuration:
{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"],
      "disabled": false
    }
  }
}

Alternatively, edit the Cline MCP settings file directly at:

  • Windows: %APPDATA%\Code\User\globalStorage\saoudrizwan.claude-dev\settings\cline_mcp_settings.json
  • macOS: ~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json

OpenCode

Edit ~/.opencode/config.json:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Trae (ByteDance)

  1. Open Trae IDE.
  2. Go to Settings > MCP Servers.
  3. Add a new server with:
{
  "winui": {
    "command": "uvx",
    "args": ["winui-mcp-server"]
  }
}

Or edit the Trae MCP config file directly (location varies by OS, typically under the Trae user data directory).


Antigravity

Edit your Antigravity MCP configuration file:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

GitHub Copilot (VS Code)

GitHub Copilot supports MCP in agent mode. Add to your VS Code settings.json:

{
  "github.copilot.chat.mcp.servers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Or use the VS Code settings UI: search for github.copilot.chat.mcp.servers and add the server entry.

Note: MCP support in GitHub Copilot requires VS Code 1.99+ and Copilot Chat in agent mode.


Qoder

Edit your Qoder MCP settings:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

CodeBuddy

Edit your CodeBuddy MCP configuration:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Cursor

Edit .cursor/mcp.json in your project or ~/.cursor/mcp.json globally:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Windsurf

Edit ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "winui": {
      "command": "uvx",
      "args": ["winui-mcp-server"]
    }
  }
}

Any MCP Client (Generic)

This server uses stdio transport and follows the MCP specification. Any client that supports MCP can launch it with these settings:

Property     Value
Transport    stdio
Command      uvx
Args         ["winui-mcp-server"]
Server name  winui
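
For clients that keep their servers in a JSON file, the entry can be merged in programmatically. This is a sketch only: the add_winui_server helper is hypothetical, and the top-level key differs between clients (the client sections above show "mcpServers" and other variants), so it is a parameter here.

```python
import json
from typing import Any


def add_winui_server(config: dict[str, Any], key: str = "mcpServers") -> dict[str, Any]:
    """Return a copy of `config` with the winui stdio server registered under `key`."""
    merged = dict(config)
    servers = dict(merged.get(key, {}))
    servers["winui"] = {"command": "uvx", "args": ["winui-mcp-server"]}
    merged[key] = servers
    return merged


# An existing config with one unrelated server (made up for the example).
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_winui_server(existing)
print(json.dumps(updated, indent=2))
```

The helper copies rather than mutates, so existing entries survive and the original dict can still be written back unchanged if the merge is abandoned.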

Available Tools (26)

Category   Tool              Description
Window     list_windows      List all top-level windows on the desktop
           get_window_state  Get window position, size, minimized state
           focus_window      Bring a window to the foreground
Discovery  discover          Explore the UIA tree (summary view, default depth 2)
           describe          List direct children with class, name, patterns
           dump_tree         Full UIA tree dump with detailed info (default depth 4)
           get_control_rect  Get bounding rectangle of a control
           find_control      Find controls by name/class without clicking (read-only)
Mouse      click             Click a control by name or class
           double_click      Double-click a control
           right_click       Right-click a control
           hover             Move mouse to a control
Scroll     scroll_up         Scroll up on a control or window
           scroll_down       Scroll down on a control or window
Keyboard   send_key          Send a single key press
           send_hotkey       Send a key combination (e.g. Ctrl+c)
           long_press_key    Hold a key for a duration
           type_text         Type text into a control
Value      get_value         Read value of an edit/spinbox (ValuePattern)
           get_text          Read Name text of a control (labels, buttons)
           set_value         Set value of an edit/spinbox
           toggle            Toggle a checkbox/switch (force on/off or flip)
           combo_select      Select an item from a combobox
Wait       wait_for          Wait for a control to appear or disappear
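
On the wire, an MCP client invokes any of these tools with a JSON-RPC 2.0 request whose method is tools/call. The sketch below builds such a message; the build_tool_call helper is hypothetical, and the argument names ("window", "name") are assumptions, since the README does not list each tool's parameters.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names ("window", "name") are assumptions, not confirmed by the README.
request = build_tool_call(1, "click", {"window": "Notepad", "name": "Save"})
```

In practice the agent client builds and sends this message itself; the sketch only shows what a tool name from the table turns into.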

Tool Notes

  • get_value vs get_text: get_value reads from input fields / spinboxes that support ValuePattern. get_text reads the Name property of any control (labels, buttons, headers). Use get_text for static text, get_value for editable fields.

  • discover vs dump_tree: discover shows a summary (class, name, type) at shallow depth — good for quick exploration. dump_tree goes deeper and includes rect, patterns, and visibility info — use when you need the full picture.

  • toggle: Pass enable=true to force checked, enable=false to force unchecked. Omit enable to flip the current state.

  • find_control vs click: find_control searches and returns info without clicking. Use it to check if a control exists or inspect multiple matches before deciding which to interact with.

  • wait_for: Polls until a control appears or disappears. Useful for loading screens, async dialogs, or waiting for a spinner to go away. Default timeout is 10 seconds.

  • type_text: If you specify a name or control_class, it types into that control. If you omit both, it types into whatever is currently focused.
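
The wait_for and toggle notes describe small pieces of logic that can be sketched in a few lines. The function names, the poll interval, and the is_present callback are assumptions made for illustration; only the 10-second default timeout and the enable semantics come from the notes above.

```python
import time
from typing import Callable, Optional


def wait_until(
    is_present: Callable[[], bool],
    disappear: bool = False,
    timeout: float = 10.0,
    interval: float = 0.2,
) -> bool:
    """Poll until the control is present (or absent if disappear=True); False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if is_present() != disappear:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def resolve_toggle(current: bool, enable: Optional[bool] = None) -> bool:
    """Target state for toggle: forced when enable is given, flipped when omitted."""
    return (not current) if enable is None else enable


# Fake "spinner" that is visible for the first two polls only.
polls = iter([True, True, False])
spinner_gone = wait_until(lambda: next(polls), disappear=True, timeout=1.0, interval=0.01)
```

Using a monotonic clock for the deadline keeps the timeout correct even if the system clock is adjusted while waiting.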

Usage Examples

With an AI Agent (natural language)

After installing the MCP server, just ask your agent:

  • "List all open windows on my desktop"
  • "Click the Save button in Notepad"
  • "Type 'Hello World' into the search box in my app"
  • "Toggle the Dark Mode checkbox in Settings"
  • "Wait for the loading spinner to disappear"
  • "Read the text of the status label"

CLI (direct usage)

# List all windows
uv run python cli_gateway.py list-windows

# Explore Notepad's UI
uv run python cli_gateway.py --window "Notepad" describe
uv run python cli_gateway.py --window "Notepad" dump-tree --depth 3

# Find controls without clicking
uv run python cli_gateway.py --window "Notepad" find --name "Save"

# Interact
uv run python cli_gateway.py --window "Notepad" click --name "Save"
uv run python cli_gateway.py --window "Notepad" type --text "Hello World"
uv run python cli_gateway.py --window "Notepad" hotkey --keys "Ctrl+s"

# Read text from a label
uv run python cli_gateway.py --window "Notepad" get-text --name "Status"

# Wait for a control
uv run python cli_gateway.py --window "MyApp" wait-for --name "Loading" --timeout 15 --disappear

# Focus a window
uv run python cli_gateway.py --window "Notepad" focus

# Launch Accessibility Insights for visual inspection
uv run python cli_gateway.py inspect
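
When driving the CLI from another script, the argument list can be assembled instead of hand-written. This is a sketch under the assumption that every option follows the "--key value" pattern seen above; build_cli_command is a hypothetical helper, not part of the project.

```python
from typing import Optional


def build_cli_command(
    subcommand: str,
    window: Optional[str] = None,
    flags: tuple[str, ...] = (),
    **options: object,
) -> list[str]:
    """Build the argv for one cli_gateway.py call, mirroring the invocations above."""
    argv = ["uv", "run", "python", "cli_gateway.py"]
    if window is not None:
        argv += ["--window", window]
    argv.append(subcommand)
    for key, value in options.items():
        argv += [f"--{key}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


cmd = build_cli_command("wait-for", window="MyApp", flags=("disappear",),
                        name="Loading", timeout=15)
# Pass `cmd` to subprocess.run(cmd, check=True) on a Windows machine
# with the project checked out.
```

Building a list rather than a shell string avoids quoting problems with window titles that contain spaces.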

Python Script (multi-step workflows)

from driver import AppDriver
import skills_library as sk

driver = AppDriver(window_title="Notepad")

# Discover controls
tree = sk.discover_ui(driver, max_depth=3)

# Find without clicking
result = sk.find_control(driver, name="Save")

# Click and type
sk.click_by_name(driver, "Edit")
sk.type_in(driver, "Hello from Python!")
sk.send_hotkey(driver, "Ctrl+s")

# Wait for confirmation dialog
sk.wait_for_control(driver, name="Save As", timeout=5)
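
Longer workflows usually need to stop when one step fails. The sketch below assumes each skill call returns the same {"success", "message", "data"} envelope as the MCP tools, which the README does not confirm for skills_library; run_steps and the stand-in actions are hypothetical.

```python
from typing import Any, Callable

Result = dict[str, Any]
Step = tuple[str, Callable[[], Result]]


def run_steps(steps: list[Step]) -> list[tuple[str, Result]]:
    """Run steps in order and stop after the first one that reports failure."""
    log: list[tuple[str, Result]] = []
    for label, action in steps:
        result = action()
        log.append((label, result))
        if not result.get("success", False):
            break
    return log


# Stand-in actions; in a real script each lambda would wrap an sk.* call.
def ok(message: str) -> Result:
    return {"success": True, "message": message, "data": {}}


def fail(message: str) -> Result:
    return {"success": False, "message": message, "data": {}}


log = run_steps([
    ("click Edit", lambda: ok("clicked")),
    ("wait for Save As", lambda: fail("timed out")),
    ("type text", lambda: ok("typed")),
])
```

Here the third step never runs, and the last log entry carries the failure message for reporting.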

Architecture

mcp_server.py          MCP server — exposes all skills as MCP tools (with driver cache)
cli_gateway.py         CLI interface — single-command automation
driver.py              Core UIA engine — window binding, element resolution, driver cache
skills_library.py      Skill primitives — click, type, scroll, toggle, wait, find, etc.
config.py              Project constants
pyproject.toml         Package metadata (pip/uv install support)
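
The "driver cache" mentioned above is not described further. One plausible shape, offered purely as an assumption, is a dictionary keyed by window title so repeated tool calls reuse one bound driver. DriverCache and FakeDriver are hypothetical names; FakeDriver stands in for AppDriver so the sketch runs without Windows.

```python
from typing import Callable, Generic, TypeVar

D = TypeVar("D")


class DriverCache(Generic[D]):
    """Reuse one driver per window title instead of re-binding on every call."""

    def __init__(self, factory: Callable[[str], D]) -> None:
        self._factory = factory
        self._drivers: dict[str, D] = {}

    def get(self, window_title: str) -> D:
        """Return the cached driver for this title, creating it on first use."""
        if window_title not in self._drivers:
            self._drivers[window_title] = self._factory(window_title)
        return self._drivers[window_title]

    def invalidate(self, window_title: str) -> None:
        """Drop a cached driver, e.g. after its window has closed."""
        self._drivers.pop(window_title, None)


class FakeDriver:
    """Stand-in for AppDriver so the sketch runs without Windows."""
    created = 0

    def __init__(self, window_title: str) -> None:
        FakeDriver.created += 1
        self.window_title = window_title


cache = DriverCache(FakeDriver)
first = cache.get("Notepad")
second = cache.get("Notepad")
```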

License

MIT
