ScreenPilot MCP server for screen automation
Screen Pilot MCP
A Model Context Protocol server that provides screen automation capabilities. This server enables LLMs to control and interact with the screen, keyboard, and mouse, allowing AI to navigate and manipulate graphical user interfaces.
The server provides a consistent interface regardless of the actual screen resolution, with coordinates automatically scaled between the target resolution (1366x768) and the actual screen size.
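The scaling described above can be sketched as a pair of linear mappings. This is only an illustration: it assumes each axis is scaled independently and rounded to the nearest pixel, which the project does not document. The names `to_actual` and `to_target` are hypothetical and are not part of the server's API.

```python
# Target resolution named in the description above.
TARGET_WIDTH, TARGET_HEIGHT = 1366, 768


def to_actual(x: int, y: int, actual_width: int, actual_height: int) -> tuple[int, int]:
    """Map a point from the 1366x768 target space to real screen pixels.

    Assumption: independent linear scaling per axis, rounded to the nearest pixel.
    """
    return (
        round(x * actual_width / TARGET_WIDTH),
        round(y * actual_height / TARGET_HEIGHT),
    )


def to_target(x: int, y: int, actual_width: int, actual_height: int) -> tuple[int, int]:
    """Map a real screen pixel back into the 1366x768 target space."""
    return (
        round(x * TARGET_WIDTH / actual_width),
        round(y * TARGET_HEIGHT / actual_height),
    )
```

Under these assumptions, a click requested at the centre of the target space, (683, 384), would land at (1366, 768) on a 2732x1536 display.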
Available Tools
- screen_capture - Captures screenshots and provides screen information
  - see_screen(format: str = "PNG"): Takes a screenshot of the current screen
  - get_screen_info(): Returns screen resolution and current mouse position
- mouse - Controls mouse actions
  - mouse_click(x: int, y: int, button: str = "left", clicks: int = 1, take_screenshot: bool = True, format: str = "PNG"): Moves the mouse to specified coordinates and performs a click
- keyboard - Controls keyboard inputs
  - keyboard_action(action_type: str, value: str, take_screenshot: bool = True, format: str = "PNG"): Performs keyboard actions (type, press, hotkey)
- scroll - Controls screen scrolling
  - scroll(direction: str = "down", amount: int = 300, take_screenshot: bool = True, format: str = "PNG"): Scrolls the screen in specified direction
  - scroll_to_position(percent: float = 50, take_screenshot: bool = True, format: str = "PNG"): Scrolls to an approximate position in document
- element - Detects and waits for screen elements
  - element_exists(image_path: str, confidence: float = 0.9): Checks if an element exists on screen
  - wait_for_element(image_path: str, max_wait_seconds: int = 10, confidence: float = 0.9): Waits for an element to appear
- action_sequence - Performs sequences of actions
  - perform_actions(actions: List[Dict], take_screenshots: bool = True, format: str = "PNG"): Executes a sequence of mouse and keyboard actions
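A tool such as wait_for_element pairs naturally with a poll-until-timeout loop around a check like element_exists. The sketch below shows that general pattern only. It is not the server's implementation, and the `wait_until` helper, its `check` callable, and the `poll_interval` parameter are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    max_wait_seconds: float = 10,
    poll_interval: float = 0.5,
) -> bool:
    """Call `check` repeatedly until it returns True or the time budget runs out.

    `check` stands in for something like an element_exists lookup (hypothetical wiring).
    Returns True if the check succeeded, False on timeout.
    """
    deadline = time.monotonic() + max_wait_seconds
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

Using a monotonic clock keeps the timeout unaffected by system clock adjustments, and checking once before sleeping means an element that is already visible returns immediately.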
Prompts
- use_my_device
  - Provides guidance on proper device interaction sequence
Installation
Using uv (recommended)
uvx screen-pilot-mcp
Using pip
pip install screen-pilot-mcp
After installation, you can run it as a script using:
python -m screen_pilot_mcp
Configuration
Configure for Claude Desktop
Add to your Claude Desktop config file claude_desktop_config.json:
Using uvx
{
  "mcpServers": {
    "screen-pilot": {
      "command": "uvx",
      "args": ["screen-pilot-mcp"]
    }
  }
}
Using pip installation
{
  "mcpServers": {
    "screen-pilot": {
      "command": "python",
      "args": ["-m", "screen_pilot_mcp"]
    }
  }
}
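If the config file already lists other servers, the entry can be merged in rather than pasted over the file. The following is a minimal sketch using only the standard library. The helper name `add_screen_pilot` is hypothetical, the config file location varies by platform and is passed in by the caller, and the defaults mirror the pip variant shown above.

```python
import json
from pathlib import Path


def add_screen_pilot(config_path, command="python", args=("-m", "screen_pilot_mcp")):
    """Add or overwrite the "screen-pilot" entry, keeping any other servers intact."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    # Treat a missing or empty file as an empty config.
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["screen-pilot"] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Restart Claude Desktop after editing the file so the new server entry is picked up.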
Example Prompts
Use the screen capture tool to take a screenshot of the current screen. Then analyze what's visible, and help me click the login button on the page.
Take a screenshot, find the search box, type "weather forecast", and press Enter.
Notes
- Requires Python 3.10 or higher
- First run may request screen access permissions
- Do not run multiple instances simultaneously
Contributing
We encourage contributions to help expand and improve screen-pilot-mcp. Whether you want to add new tools, enhance existing functionality, or improve documentation, your input is valuable.
For examples of other MCP servers and implementation patterns, see: https://github.com/modelcontextprotocol/servers
Pull requests are welcome! Feel free to contribute new ideas, bug fixes, or enhancements to make screen-pilot-mcp even more powerful and useful.
License
screen-pilot-mcp is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.
File details
Details for the file screen_pilot_mcp-0.1.1.tar.gz.
File metadata
- Download URL: screen_pilot_mcp-0.1.1.tar.gz
- Upload date:
- Size: 15.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64f3e4fa80587d8740785d51d6f5b721085253f5ae5a316ba21e879a7cde22b6 |
| MD5 | 8bedef9aa4007974092d064fc0c83b20 |
| BLAKE2b-256 | d4e3462e00d7643943af963d543366757c1914aaf0d4ce76a337045155d69913 |
File details
Details for the file screen_pilot_mcp-0.1.1-py3-none-any.whl.
File metadata
- Download URL: screen_pilot_mcp-0.1.1-py3-none-any.whl
- Upload date:
- Size: 18.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf2a0d9704cb1e77dc3ae60e5328e81ac11c1a430bb4b115c69d155c6828a7b5 |
| MD5 | 66da910dd8ced59b17de3a11deba72ee |
| BLAKE2b-256 | 29f9af1c6d84b739f3cb16037cbb3b54a18e40d85a794f96eef7a86117b7ebbf |
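A downloaded file can be checked against the SHA256 digests listed above before installing it. The sketch below uses only the standard library. The helper names `sha256_of_file` and `matches_digest` are hypothetical, and the local path of the download is whatever the caller supplies.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_digest` would be called with the local path of the downloaded wheel and the SHA256 value from its table.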