FastMCP 2.13+ server for Windows UI automation using PyWinAuto - Portmanteau Edition

PyWinAuto MCP - Portmanteau Edition

Version 0.3.1 | 8 Comprehensive Portmanteau Tools | FastMCP 2.13.1 | SOTA 2026 Compliant

A FastMCP 2.13.1-compliant server for Windows UI automation built on PyWinAuto. It provides 8 portmanteau tools that consolidate 60+ operations, along with face recognition security and professional packaging.

🚀 What's New in v0.3.0 - Portmanteau Edition

Tool Consolidation

Previous versions had 60+ individual tools scattered across multiple files with duplicates. The Portmanteau Edition consolidates everything into 8 comprehensive tools:

| Tool | Operations | Description |
|------|------------|-------------|
| automation_windows | 11 | Window management (list, find, maximize, minimize, etc.) |
| automation_elements | 14 | UI element interaction (click, hover, text, etc.) |
| automation_mouse | 9 | Mouse control (move, click, scroll, drag) |
| automation_keyboard | 4 | Keyboard input (type, press, hotkey) |
| automation_visual | 4 | Visual operations (screenshot, OCR, find image) |
| automation_face | 5 | Face recognition (add, recognize, list, delete) |
| automation_system | 7 | System utilities (health, help, clipboard, processes) |
| get_desktop_state | 1 | Comprehensive desktop UI element discovery |

Benefits

  • Reduced tool explosion: 60+ tools → 8 tools
  • No duplicates: Each operation defined once
  • Better discoverability: Related operations grouped together
  • FastMCP 2.13.1 compliant: Latest features and security fixes
  • SOTA 2026 Standard: 100% docstring compliance (Ruff D-rules) and industrial technical documentation

🏆 Features

🔍 Window Management (automation_windows)

# List all windows
automation_windows("list")

# Find window by title
automation_windows("find", title="Notepad", partial=True)

# Maximize, minimize, restore
automation_windows("maximize", handle=12345)
automation_windows("minimize", handle=12345)
automation_windows("restore", handle=12345)

# Position and size
automation_windows("position", handle=12345, x=100, y=100, width=800, height=600)
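The `partial=True` flag on `find` suggests substring matching on window titles. The sketch below illustrates that idea on a hypothetical list of `(handle, title)` pairs. The helper name `find_windows` and the case-insensitive comparison are assumptions for illustration, not the server's documented behavior.

```python
from typing import Iterable


def find_windows(
    windows: Iterable[tuple[int, str]],
    title: str,
    partial: bool = False,
) -> list[int]:
    """Return the handles of windows whose title matches `title`.

    Assumption: matching ignores case. With partial=True a substring match
    is enough; otherwise the whole title must match.
    """
    wanted = title.casefold()
    handles = []
    for handle, window_title in windows:
        candidate = window_title.casefold()
        if candidate == wanted or (partial and wanted in candidate):
            handles.append(handle)
    return handles


# Hypothetical window list, standing in for the result of a "list" operation.
open_windows = [(12345, "Untitled - Notepad"), (67890, "Calculator")]

print(find_windows(open_windows, "Notepad", partial=True))
print(find_windows(open_windows, "Notepad"))
```

With `partial=True` the Notepad window is found by a fragment of its title, while the exact-match call returns an empty list because the full title is "Untitled - Notepad".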

🎯 Element Interaction (automation_elements)

# Click elements
automation_elements("click", window_handle=12345, control_id="btnOK")
automation_elements("double_click", window_handle=12345, control_id="listItem")
automation_elements("right_click", window_handle=12345, x=100, y=200)

# Get/set text
automation_elements("text", window_handle=12345, control_id="Edit1")
automation_elements("set_text", window_handle=12345, control_id="Edit1", text="Hello!")

# Wait and verify
automation_elements("wait", window_handle=12345, control_id="loading", timeout=10.0)
automation_elements("verify_text", window_handle=12345, control_id="status", expected_text="Ready")
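A `wait` operation with a `timeout` is commonly built as a polling loop. The following is a minimal sketch of that pattern, assuming a caller-supplied condition function. `wait_until`, `control_ready`, and the polling interval are hypothetical names and values, not part of the server's API.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for "does the control exist yet?": it becomes true on the third poll.
polls = {"count": 0}


def control_ready() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(control_ready, timeout=2.0, interval=0.01))
```

Using `time.monotonic()` rather than wall-clock time keeps the deadline stable if the system clock is adjusted while waiting.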

🖱️ Mouse Control (automation_mouse)

# Position and movement
automation_mouse("position")
automation_mouse("move", x=500, y=300)
automation_mouse("move_relative", x=10, y=-5)

# Clicking
automation_mouse("click", x=500, y=300)
automation_mouse("double_click", x=500, y=300)
automation_mouse("right_click")

# Scrolling and dragging
automation_mouse("scroll", amount=3)
automation_mouse("drag", x=100, y=100, target_x=500, target_y=300)

⌨️ Keyboard Input (automation_keyboard)

# Type text
automation_keyboard("type", text="Hello World!")

# Press keys
automation_keyboard("press", key="enter")
automation_keyboard("hotkey", keys=["ctrl", "c"])
automation_keyboard("hotkey", keys=["ctrl", "shift", "s"])
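PyWinAuto's `send_keys` function takes a compact key string in which `^` stands for Ctrl, `+` for Shift, and `%` for Alt. The sketch below shows how a `keys` list like the ones above could be translated into that notation. Whether the server does this internally is an assumption; `hotkey_to_send_keys` is a hypothetical helper covering only a handful of keys.

```python
# Modifier prefixes used by pywinauto's send_keys notation.
MODIFIERS = {"ctrl": "^", "shift": "+", "alt": "%"}
# A few named keys; send_keys writes these in braces.
NAMED_KEYS = {"enter": "{ENTER}", "tab": "{TAB}", "esc": "{ESC}"}
# Characters with special meaning in send_keys must be wrapped in braces.
SPECIAL_CHARS = set("+^%~(){}")


def hotkey_to_send_keys(keys: list[str]) -> str:
    """Translate e.g. ["ctrl", "shift", "s"] into the string "^+s"."""
    parts = []
    for key in keys:
        name = key.lower()
        if name in MODIFIERS:
            parts.append(MODIFIERS[name])
        elif name in NAMED_KEYS:
            parts.append(NAMED_KEYS[name])
        elif len(name) == 1:
            parts.append("{" + name + "}" if name in SPECIAL_CHARS else name)
        else:
            raise ValueError(f"unsupported key in this sketch: {key!r}")
    return "".join(parts)


print(hotkey_to_send_keys(["ctrl", "c"]))
print(hotkey_to_send_keys(["ctrl", "shift", "s"]))
```

On Windows, the resulting string could be passed to `pywinauto.keyboard.send_keys`. The translation itself is plain string handling and runs on any platform.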

📸 Visual Intelligence (automation_visual)

# Screenshots
automation_visual("screenshot")
automation_visual("screenshot", window_handle=12345, return_base64=True)

# OCR text extraction
automation_visual("extract_text", image_path="screen.png")

# Find image on screen
automation_visual("find_image", template_path="button.png", threshold=0.8)

🔒 Face Recognition (automation_face)

# Add and recognize faces
automation_face("add", name="John Doe", image_path="john.jpg")
automation_face("recognize", image_path="unknown.jpg")

# List and manage
automation_face("list")
automation_face("delete", name="John Doe")

# Webcam capture
automation_face("capture", camera_index=0)
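Face recognition is often implemented by turning each face into a numeric encoding and accepting a match when the distance between two encodings falls below a tolerance. The sketch below illustrates that idea, assuming Euclidean distance and a tolerance of 0.6. The three-element vectors, the name "Jane Roe", and the `best_match` helper are made up for illustration; real encodings are much longer, and the server's actual matching logic may differ.

```python
import math
from typing import Optional


def best_match(
    known: dict[str, list[float]],
    candidate: list[float],
    tolerance: float = 0.6,
) -> Optional[str]:
    """Return the name of the closest known face within `tolerance`, else None."""
    best_name, best_distance = None, math.inf
    for name, encoding in known.items():
        distance = math.dist(encoding, candidate)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= tolerance else None


# Made-up encodings standing in for faces registered with the "add" operation.
known_faces = {"John Doe": [0.1, 0.2, 0.3], "Jane Roe": [0.9, 0.8, 0.7]}

print(best_match(known_faces, [0.12, 0.21, 0.33]))
print(best_match(known_faces, [5.0, 5.0, 5.0]))
```

A lower tolerance makes matching stricter (fewer false accepts, more false rejects), and a higher one makes it more permissive.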

⚙️ System Utilities (automation_system)

# Health and help
automation_system("health")
automation_system("help")

# Wait operations
automation_system("wait", seconds=2.5)
automation_system("wait_for_window", title="Notepad", timeout=10.0)

# Clipboard
automation_system("clipboard_get")
automation_system("clipboard_set", text="Copied!")

# Process list
automation_system("process_list")

📊 Desktop State Capture

# Basic UI discovery
get_desktop_state()

# With visual annotations
get_desktop_state(use_vision=True)

# With OCR text extraction
get_desktop_state(use_ocr=True)

# Full analysis
get_desktop_state(use_vision=True, use_ocr=True, max_depth=15)
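The `max_depth` parameter presumably limits how far the UI element tree is walked. A minimal sketch of depth-limited traversal follows, using a hypothetical nested-dictionary tree. `collect_elements` and the tree layout are assumptions for illustration and do not reflect the structure returned by `get_desktop_state`.

```python
from typing import Any


def collect_elements(
    node: dict[str, Any], max_depth: int, depth: int = 0
) -> list[dict[str, Any]]:
    """Flatten a UI tree, descending at most `max_depth` levels below the root."""
    found = [{"name": node["name"], "type": node["type"], "depth": depth}]
    if depth < max_depth:
        for child in node.get("children", []):
            found.extend(collect_elements(child, max_depth, depth + 1))
    return found


# Hypothetical tree; a real one would come from the UI Automation backend.
desktop = {
    "name": "Notepad",
    "type": "Window",
    "children": [
        {
            "name": "File",
            "type": "MenuItem",
            "children": [{"name": "Save", "type": "MenuItem"}],
        },
        {"name": "Text Editor", "type": "Edit"},
    ],
}

print([element["name"] for element in collect_elements(desktop, max_depth=1)])
print(len(collect_elements(desktop, max_depth=15)))
```

With `max_depth=1` the nested "Save" item is skipped. A larger limit visits all four elements at the cost of a longer walk on deep trees.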

🛠 Installation

Prerequisites

  • Windows 10/11
  • Python 3.10+
  • Microsoft UI Automation (UIA) support

Install from source

# Clone the repository
git clone https://github.com/sandraschi/pywinauto-mcp.git
cd pywinauto-mcp

# Create and activate a virtual environment
python -m venv venv
.\venv\Scripts\Activate.ps1

# Install core package
pip install -e .

# Install with face recognition
pip install -e ".[face]"

# Install with all dependencies (including dev tools)
pip install -e ".[all]"

Install Tesseract OCR (for OCR features)

Download and install Tesseract using the Windows installer provided by UB Mannheim.

🚀 Quick Start

Start the MCP Server

# Direct run
python -m pywinauto_mcp

# Or using the entry point
pywinauto-mcp

Claude Desktop Configuration

Add the following to your Claude Desktop configuration file, claude_desktop_config.json:

{
  "mcpServers": {
    "pywinauto": {
      "command": "python",
      "args": ["-m", "pywinauto_mcp"],
      "cwd": "D:\\Dev\\repos\\pywinauto-mcp"
    }
  }
}
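If the configuration file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that with the standard `json` module. `add_server` and the pre-existing "other" entry are hypothetical, and the `cwd` value is the example path from above, which should point at your own clone.

```python
import json


def add_server(config: dict, name: str, command: str, args: list[str], cwd: str) -> dict:
    """Return a copy of `config` with one entry added under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": args, "cwd": cwd}
    return {**config, "mcpServers": servers}


# Hypothetical existing config that already registers another server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}

updated = add_server(
    existing,
    name="pywinauto",
    command="python",
    args=["-m", "pywinauto_mcp"],
    cwd=r"D:\Dev\repos\pywinauto-mcp",
)

print(json.dumps(updated, indent=2))
```

`json.dumps` writes each backslash in the Windows path as `\\`, which matches the escaping used in the hand-written example.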

🔧 Configuration

Create a .env file in the project root:

# Server Configuration
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=INFO

# PyWinAuto Settings
TIMEOUT=10.0
RETRY_ATTEMPTS=3
RETRY_DELAY=1.0

# Face Recognition Settings
FACE_RECOGNITION_TOLERANCE=0.6
FACE_RECOGNITION_MODEL=hog

# Screenshot Settings
SCREENSHOT_DIR=./screenshots
SCREENSHOT_FORMAT=png
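Values in a `.env` file are plain strings, so numeric settings such as `PORT` and `TIMEOUT` need converting before use. The following sketch parses `KEY=VALUE` lines and builds a typed settings object. `parse_env`, `Settings`, and `load_settings` are hypothetical names, and the project's real loader may work differently.

```python
from dataclasses import dataclass


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the example values shown above.
    host: str = "0.0.0.0"
    port: int = 8000
    timeout: float = 10.0
    retry_attempts: int = 3
    face_tolerance: float = 0.6


def load_settings(env: dict[str, str]) -> Settings:
    """Convert raw strings to typed settings, falling back to the defaults."""
    defaults = Settings()
    return Settings(
        host=env.get("HOST", defaults.host),
        port=int(env.get("PORT", defaults.port)),
        timeout=float(env.get("TIMEOUT", defaults.timeout)),
        retry_attempts=int(env.get("RETRY_ATTEMPTS", defaults.retry_attempts)),
        face_tolerance=float(env.get("FACE_RECOGNITION_TOLERANCE", defaults.face_tolerance)),
    )


sample = """\
# Server Configuration
HOST=0.0.0.0
PORT=8000

# PyWinAuto Settings
TIMEOUT=10.0
RETRY_ATTEMPTS=3
"""

settings = load_settings(parse_env(sample))
print(settings)
```

Converting at load time means a malformed value such as `PORT=abc` fails immediately with a `ValueError` instead of surfacing later during automation.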

📚 Architecture

Portmanteau Pattern

The Portmanteau Edition follows FastMCP 2.13+ best practices, with one module per portmanteau tool:

pywinauto_mcp/
├── app.py                    # FastMCP app instance
├── main.py                   # Entry point
└── tools/
    ├── __init__.py           # Tool registration
    ├── portmanteau_windows.py    # Window management
    ├── portmanteau_elements.py   # UI elements
    ├── portmanteau_mouse.py      # Mouse control
    ├── portmanteau_keyboard.py   # Keyboard input
    ├── portmanteau_visual.py     # Visual/OCR
    ├── portmanteau_face.py       # Face recognition
    ├── portmanteau_system.py     # System utilities
    ├── desktop_state.py          # Desktop state (standalone)
    └── archived/                 # Legacy tools (preserved)

Why Portmanteau?

  1. Prevents tool explosion: Instead of 60+ tools, 8 comprehensive tools
  2. Better discoverability: Related operations grouped logically
  3. Reduced cognitive load: Fewer tools to remember
  4. Consistent interface: Each tool follows the same pattern
  5. Easier maintenance: Changes in one place affect all operations
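The pattern described above can be sketched as a single entry point that routes an operation name to a registered handler. This is an illustration only: the registry, the `operation` decorator, the placeholder handlers, and the `status`/`error` response shape are assumptions, and the function below is a stand-in rather than the project's `automation_windows` implementation.

```python
from typing import Any, Callable

# Registry mapping operation names to handlers for one portmanteau tool.
_WINDOW_OPS: dict[str, Callable[..., dict[str, Any]]] = {}


def operation(name: str):
    """Register the decorated function as the handler for `name`."""
    def decorator(func):
        _WINDOW_OPS[name] = func
        return func
    return decorator


@operation("list")
def _list_windows() -> dict[str, Any]:
    # Placeholder data; a real handler would query the desktop.
    return {"windows": [{"handle": 12345, "title": "Untitled - Notepad"}]}


@operation("maximize")
def _maximize(handle: int) -> dict[str, Any]:
    # Placeholder; a real handler would maximize the window.
    return {"handle": handle, "state": "maximized"}


def automation_windows(op: str, **kwargs: Any) -> dict[str, Any]:
    """Stand-in entry point that routes `op` to its registered handler."""
    handler = _WINDOW_OPS.get(op)
    if handler is None:
        return {
            "status": "error",
            "error": f"unknown operation {op!r}",
            "available": sorted(_WINDOW_OPS),
        }
    return {"status": "success", "operation": op, **handler(**kwargs)}


print(automation_windows("maximize", handle=12345))
print(automation_windows("fly"))
```

Because every operation passes through one function, shared concerns such as validation, logging, and error formatting can live in a single place, and an unknown operation can answer with the list of valid ones.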

🤝 Contributing

See CONTRIBUTING.md for development workflow and guidelines.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments
