MCP server for Linux desktop automation using AT-SPI2
Project description
Linux Desktop MCP Server
Built with Claude Code - This entire MCP server was developed using Claude Code, Anthropic's AI-powered coding assistant. We're proud to showcase what's possible with AI-assisted development!
An MCP server that provides Chrome-extension-level semantic element targeting for native Linux desktop applications using AT-SPI2 (Assistive Technology Service Provider Interface).
Features
- Semantic Element References: Just like Chrome extension's
ref_1,ref_2system - Role Detection: Identifies buttons, text fields, links, menus, etc.
- State Detection: Tracks focused, enabled, checked, editable states
- Natural Language Search: Find elements by description ("save button", "search field")
- Cross-Platform Input: Works on X11, Wayland, and XWayland
- GTK/Qt/Electron Support: Works with any application that exposes accessibility
Installation
System Dependencies
# Ubuntu/Debian
sudo apt install python3-pyatspi gir1.2-atspi-2.0 at-spi2-core
# For X11 input simulation
sudo apt install xdotool
# For Wayland input simulation (recommended)
# Install ydotool from source or your package manager
# Then start the daemon:
sudo ydotoold &
Python Package
# From PyPI
pip install linux-desktop-mcp
# Or from source
git clone https://github.com/yourusername/linux-desktop-mcp.git
cd linux-desktop-mcp
pip install -e .
Enable Accessibility
Ensure accessibility is enabled in your desktop environment:
- GNOME: Settings → Accessibility → Enable accessibility features
- KDE: System Settings → Accessibility
- Most modern desktops have this enabled by default
Configuration
Add to ~/.claude/settings.json:
{
"mcpServers": {
"linux-desktop": {
"command": "linux-desktop-mcp"
}
}
}
Or if installed from source:
{
"mcpServers": {
"linux-desktop": {
"command": "python",
"args": ["-m", "linux_desktop_mcp"]
}
}
}
Available Tools
desktop_snapshot
Capture the accessibility tree with semantic element references.
Parameters:
app_name: str (optional) - Filter to specific application
max_depth: int (default: 15) - Tree traversal depth
Returns:
Tree of elements with ref_ids:
- ref_1: [application] Firefox
- ref_2: [frame] "GitHub - Mozilla Firefox"
- ref_3: [button] "Back" (clickable)
- ref_4: [entry] "Search or enter address" (editable, focused)
desktop_find
Find elements by natural language query.
Parameters:
query: str - "save button", "search field", "menu containing File"
app_name: str (optional)
Returns:
Matching elements with refs, states, and actions
desktop_click
Click an element by reference or coordinates.
Parameters:
ref: str - Element reference (e.g., "ref_5")
element: str - Human description for logging
coordinate: [x, y] - Fallback if no ref
button: left|right|middle
click_type: single|double
modifiers: [ctrl, shift, alt, super]
desktop_type
Type text into an element.
Parameters:
text: str - Text to type
ref: str - Element to focus first (optional)
element: str - Human description
clear_first: bool - Ctrl+A, Delete before typing
submit: bool - Press Enter after
desktop_key
Press keyboard keys/shortcuts.
Parameters:
key: str - Key name (Return, Tab, Escape, a, etc.)
modifiers: [ctrl, shift, alt, super]
desktop_capabilities
Check available automation capabilities.
Example Usage
# 1. Take a snapshot to see available elements
desktop_snapshot(app_name="Firefox")
# 2. Find a specific element
desktop_find(query="search bar")
# 3. Click the element
desktop_click(ref="ref_5", element="URL bar")
# 4. Type into it
desktop_type(text="https://github.com", ref="ref_5", submit=True)
# 5. Use keyboard shortcuts
desktop_key(key="t", modifiers=["ctrl"]) # New tab
Platform Support
| Feature | X11 | Wayland | XWayland |
|---|---|---|---|
| AT-SPI discovery | Full | Full | Full |
| Click by ref | Full | Full | Full |
| Type text | Full | Full | Full |
| ydotool input | Full | Full | Full |
| xdotool input | Full | No | Yes |
Troubleshooting
"AT-SPI2 not available"
sudo apt install python3-pyatspi gir1.2-atspi-2.0 at-spi2-core
"AT-SPI2 registry not running"
Ensure accessibility is enabled in your desktop settings. You may need to log out and back in.
"No input backend available" (Wayland)
# Install and start ydotool daemon
sudo ydotoold &
Elements not showing up
Some applications may not expose accessibility information. Modern GTK3/4, Qt5/6, and Electron apps generally work well.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ MCP Protocol Layer │
│ (JSON-RPC over stdio, tool defs) │
└─────────────────────────────────────────────────────────────┘
│
┌─────────────────────────────────────────────────────────────┐
│ Reference Manager │
│ (ref_1, ref_2 mapping, lifecycle, GC) │
└─────────────────────────────────────────────────────────────┘
│
┌───────────────────┴───────────────────┐
│ │
┌─────────────────────┐ ┌─────────────────────────┐
│ AT-SPI2 Backend │ │ Input Backends │
│ (pyatspi) │ │ (ydotool/xdotool/wtype) │
└─────────────────────┘ └─────────────────────────┘
Contributing
This project was created with Claude Code and we warmly welcome contributions! Whether you want to:
- Report bugs or request features
- Submit pull requests
- Fork and build your own version
- Improve documentation
We're very open to help and collaboration. See CONTRIBUTING.md for guidelines.
License
MIT - See LICENSE for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file linux_desktop_mcp-0.1.0.tar.gz.
File metadata
- Download URL: linux_desktop_mcp-0.1.0.tar.gz
- Upload date:
- Size: 31.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a3cb3c7ade8cc07a6cd06371f3db709c9249fd882134ca06beb8d3c2ce296881
|
|
| MD5 |
0cf359104992cb8f9e2977d5e7bc673c
|
|
| BLAKE2b-256 |
a3695df6a02eb39fb30ee1da05cb62450e63a3edb989a821b38b7d4ce402fe83
|
File details
Details for the file linux_desktop_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: linux_desktop_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 36.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d5e1c94f4375880588aba5b3c88595a6c5177c7605c140c7872d703bd60606a2
|
|
| MD5 |
10efeaacf4af328f3c1bc22b27fc29e4
|
|
| BLAKE2b-256 |
0fd65d340869f50a2cda04a0b60d6a8147128d1bd089e9c125645ddd1a066c51
|