Skip to main content

MCP server for Linux desktop GUI automation on KDE Plasma 6 Wayland via isolated KWin virtual sessions

Project description

kwin-mcp

PyPI version Python 3.12+ License: MIT CI

A Model Context Protocol (MCP) server for Linux desktop GUI automation on KDE Plasma 6 Wayland. It lets AI agents like Claude Code launch, interact with, and observe any Wayland application (Qt, GTK, Electron) in a fully isolated virtual KWin session — without touching the user's desktop.

Why kwin-mcp?

  • Isolated sessions — Each session runs in its own dbus-run-session + kwin_wayland --virtual sandbox. Your host desktop is never affected.
  • No screenshots required for interaction — The AT-SPI2 accessibility tree gives the AI agent structured widget data (roles, names, coordinates, states, available actions), so it can interact with UI elements without relying solely on vision.
  • Zero authorization prompts — Uses KWin's private EIS (Emulated Input Server) D-Bus interface directly, bypassing the XDG RemoteDesktop portal. No user confirmation dialogs.
  • Works with any Wayland app — Anything that runs on KDE Plasma 6 Wayland works: Qt, GTK, Electron, and more. Input is injected via the standard libei protocol.

Quick Start

Requires KDE Plasma 6 on Wayland. See System Requirements for details.

1. Install

# Using uv (recommended)
uv tool install kwin-mcp

# Or using pip
pip install kwin-mcp

2. Configure Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}

3. Use it

Ask Claude Code to launch and interact with any GUI application:

Start a KWin session, launch kcalc, and press the buttons to calculate 2 + 3.

Claude Code will autonomously start an isolated session, launch the app, read the accessibility tree to find buttons, click them, and take a screenshot to verify the result.

Features

  • Session management — Start and stop isolated KWin Wayland sessions with configurable screen resolution
  • Screenshot capture — Capture the virtual display as PNG via KWin's ScreenShot2 D-Bus interface
  • Accessibility tree — Read the full AT-SPI2 widget tree with roles, names, states, coordinates, and available actions
  • Element search — Find UI elements by name, role, or description (case-insensitive)
  • Mouse input — Click (left/right/middle, single/double), move, scroll (vertical/horizontal), and drag with smooth interpolation
  • Keyboard input — Type text (full US QWERTY layout) and press key combinations with modifier support (Ctrl, Alt, Shift, Super)

System Requirements

Requirement Details
OS Linux with KDE Plasma 6 (Wayland session)
Python 3.12 or later
KWin kwin_wayland with --virtual flag support (KDE Plasma 6.x)
libei Usually bundled with KWin 6.x (EIS input emulation)
spectacle KDE screenshot tool (CLI mode)
AT-SPI2 at-spi2-core for accessibility tree support
PyGObject GObject introspection Python bindings
D-Bus dbus-python bindings

Installing System Dependencies

Arch Linux / Manjaro
sudo pacman -S kwin spectacle at-spi2-core python-gobject dbus-python-common
Fedora (KDE Spin)
sudo dnf install kwin-wayland spectacle at-spi2-core python3-gobject dbus-python
openSUSE (KDE)
sudo zypper install kwin6 spectacle at-spi2-core python3-gobject python3-dbus-python
Kubuntu / KDE Neon
sudo apt install kwin-wayland spectacle at-spi2-core python3-gi gir1.2-atspi-2.0 python3-dbus

Installation

Using uv (recommended)

uv tool install kwin-mcp

Using pip

pip install kwin-mcp

From source

git clone https://github.com/isac322/kwin-mcp.git
cd kwin-mcp
uv sync
uv run kwin-mcp

Configuration

Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}

Or if installed globally:

{
  "mcpServers": {
    "kwin-mcp": {
      "command": "kwin-mcp"
    }
  }
}

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}

Running Directly

# As an installed script
kwin-mcp

# As a Python module
python -m kwin_mcp

Available Tools

Session Management

Tool Parameters Description
session_start app_command?, screen_width?, screen_height? Start an isolated KWin Wayland session, optionally launching an app
session_stop (none) Stop the session and clean up all processes

Observation

Tool Parameters Description
screenshot include_cursor? Capture a screenshot of the virtual display (saved as PNG, returns file path)
accessibility_tree app_name?, max_depth? Get the AT-SPI2 widget tree with roles, names, states, and coordinates
find_ui_elements query, app_name? Search for UI elements by name, role, or description (case-insensitive)

Mouse Input

Tool Parameters Description
mouse_click x, y, button?, double? Click at coordinates (left/right/middle, single/double)
mouse_move x, y Move the cursor to coordinates without clicking
mouse_scroll x, y, delta, horizontal? Scroll at coordinates (positive = down/right, negative = up/left)
mouse_drag from_x, from_y, to_x, to_y Drag from one point to another with smooth interpolation

Keyboard Input

Tool Parameters Description
keyboard_type text Type a string of text character by character (US QWERTY layout)
keyboard_key key Press a key or key combination (e.g., Return, ctrl+c, alt+F4, shift+Tab)

How It Works

Claude Code / AI Agent
  │
  │  MCP (stdio)
  ▼
kwin-mcp server
  │
  ├── session_start ─────────► dbus-run-session
  │                               ├── at-spi-bus-launcher
  │                               └── kwin_wayland --virtual
  │                                      └── [your app]
  │
  ├── screenshot ────────────► spectacle (via D-Bus)
  │
  ├── accessibility_tree ────► AT-SPI2 (via PyGObject)
  ├── find_ui_elements ──────► AT-SPI2 (via PyGObject)
  │
  └── mouse_* / keyboard_* ─► KWin EIS D-Bus ──► libei

Triple Isolation

kwin-mcp provides three layers of isolation from the host desktop:

  1. D-Bus isolationdbus-run-session creates a private session bus. The isolated session's services (KWin, AT-SPI2, portals) are invisible to the host.
  2. Display isolationkwin_wayland --virtual creates its own Wayland compositor with a virtual framebuffer. No windows appear on the host display.
  3. Input isolation — Input events are injected through KWin's EIS interface into the isolated compositor only. The host desktop receives no input from kwin-mcp.

Input Injection

Mouse and keyboard events are injected through KWin's private org.kde.KWin.EIS.RemoteDesktop D-Bus interface. This returns a libei file descriptor that allows low-level input emulation without requiring the XDG RemoteDesktop portal (which would show a user authorization dialog). The connection uses:

  • Absolute pointer positioning for precise coordinate-based interaction
  • evdev keycodes with full US QWERTY mapping for keyboard input
  • Smooth drag interpolation (10+ intermediate steps) for realistic drag operations

Screenshot Capture

Screenshots are captured using spectacle --dbus --background --nonotify connected to the isolated session's D-Bus and Wayland socket. The KWin org.kde.KWin.ScreenShot2 interface provides the framebuffer data.

Accessibility Tree

The AT-SPI2 accessibility bus within the isolated session is queried via PyGObject (gi.repository.Atspi). This provides a structured tree of all UI widgets with their roles (button, text field, menu item, etc.), names, states (focused, enabled, visible, etc.), screen coordinates, and available actions (click, toggle, etc.).

Limitations

  • US QWERTY keyboard layout only — Other keyboard layouts are not yet supported for text typing.
  • KDE Plasma 6+ required — Older KDE versions or other Wayland compositors (GNOME, Sway) are not supported.
  • AT-SPI2 availability varies — Some applications may not fully expose their widget tree via AT-SPI2.

Contributing

Contributions are welcome! Please open an issue or pull request on GitHub.

git clone https://github.com/isac322/kwin-mcp.git
cd kwin-mcp
uv sync
uv run ruff check src/
uv run ruff format --check src/
uv run ty check src/

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kwin_mcp-0.1.0.tar.gz (16.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kwin_mcp-0.1.0-py3-none-any.whl (19.8 kB view details)

Uploaded Python 3

File details

Details for the file kwin_mcp-0.1.0.tar.gz.

File metadata

  • Download URL: kwin_mcp-0.1.0.tar.gz
  • Upload date:
  • Size: 16.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kwin_mcp-0.1.0.tar.gz
Algorithm Hash digest
SHA256 1b4f87926d0dc863b0c77bbdc4e4de7f4dde6ac1420b3e0d9361b4e77d801c10
MD5 b2e131262b1523212bdec9a89f2202ab
BLAKE2b-256 6f7fbdecc762f8db26b2aa03bb64b2298568931492f4fed2cc8d3a9c51f785b7

See more details on using hashes here.

Provenance

The following attestation bundles were made for kwin_mcp-0.1.0.tar.gz:

Publisher: ci.yml on isac322/kwin-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file kwin_mcp-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: kwin_mcp-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 19.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for kwin_mcp-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 17d2ed22ed79190867b5210e35b1f34eb4226af7699f603bb0ab9469aa95ba58
MD5 13bc755a1894fcfa9a72effa44be7baf
BLAKE2b-256 75867af3080a1445a48cbdd845fc2b6387b211d500337912477b7e839ed108a1

See more details on using hashes here.

Provenance

The following attestation bundles were made for kwin_mcp-0.1.0-py3-none-any.whl:

Publisher: ci.yml on isac322/kwin-mcp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page