Local-first MCP server for desktop automation (accessibility + vision)
vadgr-computer-use
Local MCP server for desktop automation — 13 tools for capture, mouse, keyboard, and platform introspection.
The LLM takes screenshots, reasons over them, and drives mouse and keyboard through the server. The model picks coordinates directly from the pixels.
Install
pip install vadgr-computer-use
Run as MCP server
vadgr-cua # stdio (default) -- what MCP clients use
vadgr-cua --transport sse --port 8000 # SSE variant
Wire it into any MCP client (Claude Desktop, Cursor, Cline, custom agents).
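As a rough illustration of the wiring, the sketch below builds a client config entry that launches the server over stdio. It assumes a client that reads a JSON file with an `mcpServers` map (the layout Claude Desktop uses); other clients may expect a different file or key. The entry name `vadgr-computer-use` is only a label chosen for this example.

```python
import json


def build_client_config(command: str = "vadgr-cua", args=None) -> dict:
    """Return a config dict with one stdio server entry.

    Assumption: the client understands an "mcpServers" map of
    name -> {"command", "args"}. The entry name is an arbitrary label.
    """
    return {
        "mcpServers": {
            "vadgr-computer-use": {
                "command": command,
                "args": list(args or []),  # empty: stdio is the default transport
            }
        }
    }


# Paste the printed JSON into the client's MCP configuration file.
config_text = json.dumps(build_client_config(), indent=2)
print(config_text)
```

Clients that connect over SSE would instead point at the URL served by `vadgr-cua --transport sse --port 8000`; the exact config shape for that depends on the client.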
How it works
The loop is intentionally simple:
- Agent calls screenshot(): the server returns a downscaled PNG.
- Agent reasons over the image and picks coordinates.
- Agent calls click(x, y) / type_text(...) / key_press(...).
- Agent calls screenshot() again to verify the effect.
That's it. The LLM owns the "where to click" decision; the server owns "how to click it precisely". No other abstraction in between.
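The loop above can be sketched in a few lines. This is a minimal, self-contained illustration, not the server's code: `FakeDesktop` stands in for the MCP tools and `pick_target` stands in for the LLM's reasoning step, and both names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class FakeDesktop:
    """Hypothetical stand-in for the MCP tools, so the loop runs without a screen."""

    clicked: List[Point] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real server returns a downscaled PNG; here the bytes just encode state.
        return b"done" if self.clicked else b"button-at-500-300"

    def click(self, x: int, y: int) -> None:
        self.clicked.append((x, y))


def pick_target(image: bytes) -> Optional[Point]:
    """Placeholder for the LLM: return coordinates, or None when the goal is met."""
    return None if image == b"done" else (500, 300)


def run_loop(desktop: FakeDesktop,
             decide: Callable[[bytes], Optional[Point]],
             max_steps: int = 10) -> int:
    """Observe, decide, act, then observe again to verify. Returns actions taken."""
    actions = 0
    for _ in range(max_steps):
        image = desktop.screenshot()   # observe (also verifies the previous action)
        target = decide(image)         # reason over the pixels
        if target is None:             # nothing left to do
            break
        desktop.click(*target)         # act
        actions += 1
    return actions


desktop = FakeDesktop()
steps_taken = run_loop(desktop, pick_target)
```

The `max_steps` cap is a design choice in this sketch: because the agent verifies by looking again, a bound keeps a confused model from clicking forever.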
Platform support
| Platform | Screenshots | Mouse / keyboard | Status |
|---|---|---|---|
| WSL2 → Windows host | TCP bridge daemon (mss on Windows) | TCP bridge daemon (Win32 SendInput) | primary, well-tested |
| Linux / X11 | mss | xdotool | works |
| Windows native | Win32 GDI | SendInput | should work; not part of the v0.1.0 test matrix |
| macOS | screencapture | osascript / cliclick | WIP, not functional yet |
macOS is a work-in-progress: the backend imports but actions and screenshots do not round-trip correctly yet. Fixes welcome.
On WSL2 the bridge daemon is launched automatically on first use and persists across MCP sessions; if it can't be started (e.g. no Windows Python available), the server silently falls back to a slower PowerShell path. See Daemon management below.
MCP tools (13)
Capture (2)
- screenshot(): full screen, downscaled to CU_MAX_WIDTH (auto-picks 1024 / 1280 / 1366).
- screenshot_region(x, y, w, h): cropped region.
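Because screenshots are downscaled, a coordinate picked on the image has to be mapped back to the real screen at some point. The README does not say how the server does this, so the following is only a sketch assuming a plain proportional mapping; `to_screen_coords` is a hypothetical helper, not part of the package.

```python
from typing import Tuple


def to_screen_coords(x: int, y: int,
                     shot_size: Tuple[int, int],
                     screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Map a point on a downscaled screenshot to physical screen pixels.

    Assumption: the screenshot is a uniform resize of the full screen.
    """
    shot_w, shot_h = shot_size
    screen_w, screen_h = screen_size
    sx = round(x * screen_w / shot_w)
    sy = round(y * screen_h / shot_h)
    # Clamp so a point on the image edge still lands on a valid pixel.
    return (min(max(sx, 0), screen_w - 1), min(max(sy, 0), screen_h - 1))


# A point at the centre of a 1280x720 screenshot of a 2560x1440 screen.
centre = to_screen_coords(640, 360, (1280, 720), (2560, 1440))
```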
Input (8)
- click(x, y) / double_click(x, y) / right_click(x, y)
- move_mouse(x, y) / drag(start_x, start_y, end_x, end_y, duration=0.5)
- scroll(x, y, amount): positive = up, negative = down
- type_text(text) / key_press(keys): keys like ctrl+s, alt+tab, enter
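To make the key_press string format concrete, here is one way a combination such as ctrl+s could be split into modifiers and a main key. The server's actual parsing rules are not documented here, so `parse_keys` is a hypothetical helper that assumes `+` separates keys and the last part is the key being pressed.

```python
from typing import List, Tuple


def parse_keys(keys: str) -> Tuple[List[str], str]:
    """Split "ctrl+shift+t" into (["ctrl", "shift"], "t").

    Assumption: "+" is the separator and names are case-insensitive.
    A literal plus key is not handled in this sketch.
    """
    parts = [part.strip().lower() for part in keys.split("+") if part.strip()]
    if not parts:
        raise ValueError("empty key combination")
    return parts[:-1], parts[-1]


save_combo = parse_keys("ctrl+s")
single_key = parse_keys("enter")
```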
Platform info (3)
- get_platform() / get_platform_info() / get_screen_size()
Daemon management (WSL2)
On WSL2 the server reaches Windows through a small background daemon that launches on first use and survives across MCP sessions — most users never need to touch it. For when you do:
vadgr-cua doctor # JSON: platform, Windows Python, daemon state, port, hash
vadgr-cua install-daemon # Eager deploy + launch (useful in provisioning scripts)
vadgr-cua stop-daemon # Kill the running daemon
vadgr-cua restart-daemon # Stop then start
The daemon file is deployed to %USERPROFILE%\vadgr\daemon.py and listens on TCP 127.0.0.1:19542. After pip install -U vadgr-computer-use, the next MCP session detects the version-hash drift via a ping handshake and redeploys the daemon automatically — no manual restart required.
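When debugging, it can help to check whether anything is accepting connections on the bridge port. The sketch below uses only the standard library and the documented defaults (127.0.0.1, port 19542, overridable with CUE_BRIDGE_PORT). It is not the ping handshake the server performs: it only tests that a TCP connection can be opened, and it has to run where the loopback address actually reaches the daemon. The function names are hypothetical.

```python
import os
import socket
from typing import Mapping, Optional

DEFAULT_BRIDGE_PORT = 19542  # documented default


def bridge_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the bridge port, honouring CUE_BRIDGE_PORT when it is set."""
    env = os.environ if env is None else env
    return int(env.get("CUE_BRIDGE_PORT", DEFAULT_BRIDGE_PORT))


def daemon_is_listening(port: int, host: str = "127.0.0.1",
                        timeout: float = 0.5) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False


if __name__ == "__main__":
    port = bridge_port()
    print(f"port {port} listening: {daemon_is_listening(port)}")
```

For the full picture (Windows Python, daemon state, hash), `vadgr-cua doctor` remains the authoritative check.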
Library usage
Direct (agent picks what to do):
from computer_use import ComputerUseEngine
engine = ComputerUseEngine()
shot = engine.screenshot()
engine.click(500, 300)
engine.type_text("hello")
Autonomous (engine drives itself via an LLM provider — optional):
from computer_use import ComputerUseEngine  # needed when run separately from the snippet above
engine = ComputerUseEngine(provider="anthropic") # reads ANTHROPIC_API_KEY
results = engine.run_task("Open Notepad and type hello", max_steps=50)
Environment
| Variable | Purpose |
|---|---|
| CU_MAX_WIDTH | Override screenshot downscale target (default: auto 1024/1280/1366) |
| CUE_BRIDGE_PORT | Override WSL2 bridge daemon TCP port (default: 19542) |
| VADGR_DATA | Override data directory for debug screenshots |
| VADGR_DEBUG | Set to 1 to dump screenshots to $VADGR_DATA/screenshots/ |
| ANTHROPIC_API_KEY | Only for autonomous mode (provider="anthropic") |
| OPENAI_API_KEY | Only for autonomous mode (provider="openai") |
Tests
pip install -e ".[dev]"
pytest computer_use/tests
License
Apache 2.0. See LICENSE.
Part of Vadgr
- vadgr — workflow engine (brain)
- vadgr-computer-use — desktop automation MCP (eyes)
- vadgr-agent-os — containerized agent runtime