Local-first MCP server for desktop automation (accessibility + vision)

vadgr-computer-use

Local MCP server for desktop automation — 13 tools for capture, mouse, keyboard, and platform introspection.

The LLM takes screenshots, reasons over them, and drives mouse and keyboard through the server. The model picks coordinates directly from the pixels.

Install

pip install vadgr-computer-use

Run as MCP server

vadgr-cua                             # stdio (default) -- what MCP clients use
vadgr-cua --transport sse --port 8000 # SSE variant

To use it, register vadgr-cua as a server command in any MCP client (Claude Desktop, Cursor, Cline, or a custom agent).
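As a rough illustration of the client side, the sketch below prints an mcpServers entry in the shape Claude Desktop reads from its JSON config file. The server label "vadgr-computer-use" is an arbitrary name chosen here, and other clients may use a different file or key layout, so treat this as a starting point rather than the project's documented config.

```python
import json


def mcp_server_entry(label: str = "vadgr-computer-use") -> dict:
    """Build an `mcpServers` entry that launches vadgr-cua over stdio.

    `label` is a hypothetical display name; only the command comes from
    this README.
    """
    return {
        "mcpServers": {
            label: {
                "command": "vadgr-cua",  # stdio is the default transport
                "args": [],
            }
        }
    }


# Paste the printed JSON into the client's MCP configuration file.
print(json.dumps(mcp_server_entry(), indent=2))
```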

How it works

The loop is intentionally simple:

  1. Agent calls screenshot() — server returns a downscaled PNG.
  2. Agent reasons over the image and picks coordinates.
  3. Agent calls click(x, y) / type_text(...) / key_press(...).
  4. Agent calls screenshot() again to verify the effect.

That is the whole loop. The LLM decides where to click; the server handles how to click precisely. There is no other abstraction in between.
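The four steps above can be sketched as one function. This is a minimal illustration, not the server's code: DesktopTools, FakeDesktop, pick_point, and looks_changed are hypothetical names introduced here, and the fake stands in for a real MCP connection so the example runs on its own.

```python
from typing import Callable, List, Protocol, Tuple


class DesktopTools(Protocol):
    """Hypothetical minimal view of the tools used by one loop iteration."""

    def screenshot(self) -> bytes: ...
    def click(self, x: int, y: int) -> None: ...


def act_and_verify(
    tools: DesktopTools,
    pick_point: Callable[[bytes], Tuple[int, int]],
    looks_changed: Callable[[bytes, bytes], bool],
) -> bool:
    before = tools.screenshot()          # 1. capture
    x, y = pick_point(before)            # 2. the model picks coordinates
    tools.click(x, y)                    # 3. act
    after = tools.screenshot()           # 4. capture again to verify
    return looks_changed(before, after)


class FakeDesktop:
    """Stand-in for the server: each click changes the next 'screenshot'."""

    def __init__(self) -> None:
        self.clicks: List[Tuple[int, int]] = []

    def screenshot(self) -> bytes:
        return f"frame-{len(self.clicks)}".encode()

    def click(self, x: int, y: int) -> None:
        self.clicks.append((x, y))


desk = FakeDesktop()
changed = act_and_verify(desk, lambda img: (500, 300), lambda a, b: a != b)
```

In a real agent, pick_point would be the LLM call and looks_changed would be the model's own judgment of the second screenshot.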

Platform support

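Platform            | Screenshots                        | Mouse / keyboard                    | Status
--------------------|------------------------------------|-------------------------------------|-------------------------------------------------
WSL2 → Windows host | TCP bridge daemon (mss on Windows) | TCP bridge daemon (Win32 SendInput) | primary, well-tested
Linux / X11         | mss                                | xdotool                             | works
Windows native      | Win32 GDI                          | SendInput                           | should work; not part of the v0.1.0 test matrix
macOS               | screencapture                      | osascript / cliclick                | WIP, not functional yet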

macOS is a work-in-progress: the backend imports but actions and screenshots do not round-trip correctly yet. Fixes welcome.

On WSL2 the bridge daemon is launched automatically on first use and persists across MCP sessions; if it can't be started (e.g. no Windows Python available), the server silently falls back to a slower PowerShell path. See Daemon management below.
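For readers wondering how a process can tell it is running under WSL at all, one common heuristic is to look for "microsoft" in /proc/version. The sketch below shows that heuristic only; it is an assumption for illustration, and this README does not say how the server itself detects WSL2.

```python
from pathlib import Path


def looks_like_wsl(proc_version: str) -> bool:
    """Heuristic: WSL kernels identify themselves as Microsoft builds."""
    return "microsoft" in proc_version.lower()


def running_under_wsl(path: str = "/proc/version") -> bool:
    """Return False when the file is missing (e.g. macOS or native Windows)."""
    try:
        return looks_like_wsl(Path(path).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("WSL detected:", running_under_wsl())
```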

MCP tools (13)

Capture (2)

  • screenshot() — full screen, downscaled to CU_MAX_WIDTH pixels wide (by default auto-picked from 1024 / 1280 / 1366).
  • screenshot_region(x, y, w, h) — cropped region.
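Because screenshots are downscaled, a point in the image and a point on the physical screen differ by a scale factor. The sketch below shows only that arithmetic. Both function names are hypothetical, and whether the server performs this translation internally is not stated here, so do not assume a caller must apply it.

```python
from typing import Tuple


def downscale_size(screen_w: int, screen_h: int, max_width: int) -> Tuple[int, int]:
    """Image size after shrinking to at most max_width, keeping aspect ratio."""
    if screen_w <= max_width:
        return screen_w, screen_h
    scale = max_width / screen_w
    return max_width, round(screen_h * scale)


def to_screen_coords(x: int, y: int, shot_w: int, screen_w: int) -> Tuple[int, int]:
    """Map a point picked on the downscaled image back to screen pixels."""
    scale = screen_w / shot_w
    return round(x * scale), round(y * scale)


# A 2560x1440 display captured at a 1280-pixel target width.
shot_w, shot_h = downscale_size(2560, 1440, 1280)
center_on_screen = to_screen_coords(shot_w // 2, shot_h // 2, shot_w, 2560)
```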

Input (8)

  • click(x, y) / double_click(x, y) / right_click(x, y)
  • move_mouse(x, y) / drag(start_x, start_y, end_x, end_y, duration=0.5)
  • scroll(x, y, amount) — positive = up, negative = down
  • type_text(text) / key_press(keys) — keys like ctrl+s, alt+tab, enter
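The key strings accepted by key_press (ctrl+s, alt+tab, enter) are plus-separated combos. A minimal way to split such a string into modifiers and a final key is sketched below. The KNOWN_MODIFIERS set and parse_keys are assumptions made for illustration; the server's actual parser and its supported key names may differ, and a literal "+" key is not handled here.

```python
from typing import Tuple

# Assumed modifier names; the server's real list is not documented here.
KNOWN_MODIFIERS = {"ctrl", "alt", "shift", "win"}


def parse_keys(keys: str) -> Tuple[Tuple[str, ...], str]:
    """Split 'ctrl+shift+s' into (('ctrl', 'shift'), 's')."""
    parts = [p.strip().lower() for p in keys.split("+")]
    if not all(parts):
        raise ValueError(f"empty key name in {keys!r}")
    *modifiers, key = parts  # everything before the last part is a modifier
    unknown = [m for m in modifiers if m not in KNOWN_MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    return tuple(modifiers), key


combo = parse_keys("ctrl+s")
single = parse_keys("enter")
```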

Platform info (3)

  • get_platform() / get_platform_info() / get_screen_size()

Daemon management (WSL2)

On WSL2 the server reaches Windows through a small background daemon that launches on first use and survives across MCP sessions, so most users never need to touch it. When you do, these commands are available:

vadgr-cua doctor           # JSON: platform, Windows Python, daemon state, port, hash
vadgr-cua install-daemon   # Eager deploy + launch (useful in provisioning scripts)
vadgr-cua stop-daemon      # Kill the running daemon
vadgr-cua restart-daemon   # Stop then start
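Since doctor reports JSON, a provisioning script can read it programmatically. The sketch below assumes the command prints a single JSON document to stdout and nothing else; run_doctor is a hypothetical helper, and no field names are assumed because they are not listed here.

```python
import json
import subprocess
from typing import Sequence


def run_doctor(cmd: Sequence[str] = ("vadgr-cua", "doctor")) -> dict:
    """Run the doctor command and parse its stdout as JSON.

    Raises CalledProcessError on a non-zero exit and JSONDecodeError if
    the output is not valid JSON.
    """
    proc = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


if __name__ == "__main__":
    # Print whatever keys the report contains, without assuming their names.
    for key, value in run_doctor().items():
        print(f"{key}: {value}")
```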

The daemon file is deployed to %USERPROFILE%\vadgr\daemon.py and listens on TCP 127.0.0.1:19542. After pip install -U vadgr-computer-use, the next MCP session detects the version-hash drift via a ping handshake and redeploys the daemon automatically — no manual restart required.
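When debugging, a quick way to see whether anything is listening on the daemon's address is a plain TCP connect. The sketch below checks only that the port accepts connections; it does not speak the daemon's ping handshake, whose wire format is not described here. daemon_port_open is a hypothetical helper name.

```python
import socket


def daemon_port_open(host: str = "127.0.0.1", port: int = 19542,
                     timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


if __name__ == "__main__":
    print("bridge port reachable:", daemon_port_open())
```

An open port only shows that some process is listening; vadgr-cua doctor remains the authoritative check of daemon state.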

Library usage

Direct (agent picks what to do):

from computer_use import ComputerUseEngine

engine = ComputerUseEngine()
shot = engine.screenshot()
engine.click(500, 300)
engine.type_text("hello")

Autonomous (engine drives itself via an LLM provider — optional):

from computer_use import ComputerUseEngine

engine = ComputerUseEngine(provider="anthropic")  # reads ANTHROPIC_API_KEY
results = engine.run_task("Open Notepad and type hello", max_steps=50)

Environment

Variable          | Purpose
------------------|---------------------------------------------------------------------
CU_MAX_WIDTH      | Override screenshot downscale target (default: auto 1024/1280/1366)
CUE_BRIDGE_PORT   | Override WSL2 bridge daemon TCP port (default: 19542)
VADGR_DATA        | Override data directory for debug screenshots
VADGR_DEBUG       | Set to 1 to dump screenshots to $VADGR_DATA/screenshots/
ANTHROPIC_API_KEY | Only for autonomous mode (provider="anthropic")
OPENAI_API_KEY    | Only for autonomous mode (provider="openai")
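To make the defaults in the table concrete, here is a small sketch of reading these variables in a wrapper script. Settings and load_settings are hypothetical names, not part of the package, and None for max_width simply means "let the server auto-pick".

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    max_width: Optional[int]   # None: leave the auto-pick in place
    bridge_port: int
    debug: bool
    data_dir: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    raw_width = env.get("CU_MAX_WIDTH")
    return Settings(
        max_width=int(raw_width) if raw_width else None,
        bridge_port=int(env.get("CUE_BRIDGE_PORT", "19542")),
        debug=env.get("VADGR_DEBUG") == "1",
        data_dir=env.get("VADGR_DATA"),
    )


defaults = load_settings({})
custom = load_settings({"CU_MAX_WIDTH": "1280", "VADGR_DEBUG": "1"})
```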

Tests

pip install -e ".[dev]"
pytest computer_use/tests

License

Apache 2.0. See LICENSE.

Part of Vadgr
