Local Windows desktop control for AI agents — Python library and CLI.

Project description

agent-aid

A small Python library and CLI that lets AI agents control the local Windows desktop. Open Interpreter, GPT, Claude, or your own agent can call agent-aid to read the screen, drive the mouse and keyboard, and manage windows. Single dependency: mss.

pip install agent-aid

With uv (recommended — installs Python automatically if needed):

uv tool install agent-aid --python 3.11

Quick look

agent-aid health
agent-aid state
agent-aid screenshot active_window=true save_path=captures/active.png include_base64=false
agent-aid click x=500 y=300
agent-aid type text="hello"
agent-aid press keys=ctrl+s
agent-aid focus_window title_fragment=Chrome
agent-aid open target=https://example.com
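
The commands above can also be driven from a script. The following is a minimal sketch, assuming agent-aid is on PATH; build_argv and run are hypothetical helper names, not part of agent-aid, and the sketch makes no assumption about what the CLI prints.

```python
import subprocess


def build_argv(route, **params):
    """Build an argv list such as ['agent-aid', 'click', 'x=500', 'y=300']."""
    argv = ["agent-aid", route]
    for key, value in params.items():
        # Booleans are written lowercase to match the README examples
        # (active_window=true, include_base64=false).
        if isinstance(value, bool):
            value = "true" if value else "false"
        argv.append(f"{key}={value}")
    return argv


def run(route, **params):
    """Run the CLI and return its stdout as text.

    check=True raises CalledProcessError if the process exits non-zero.
    """
    result = subprocess.run(
        build_argv(route, **params),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (requires agent-aid installed on a Windows machine):
# print(run("click", x=500, y=300))
```

Passing the arguments as a list avoids shell quoting entirely, which matters once values contain spaces.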

List every route:

agent-aid --list

Print the full AI-oriented usage spec to stdout (markdown):

agent-aid --readme

Capabilities

  • Screenshots: full desktop, single monitor, active window, specific hwnd, or rectangular region
  • PNG sha256 hash returned for every screenshot (use with wait_screen_change)
  • Mouse: click, double-click, right-click, drag, move, scroll (vertical/horizontal), hold buttons
  • Keyboard: short text, hotkeys (ctrl+shift+a), modifier hold/release
  • Clipboard: for long pastes, set the clipboard with clipboard text=... and then send press keys=ctrl+v
  • Windows: find, focus, minimize/maximize/restore/close, move/resize, hide/show
  • System: list processes, open a file, URL, or shell: target, read pixel color, query state
  • Verify: wait_screen_change, wait_pixel, wait_window
  • batch for atomic multi-step flows in a single CLI call
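
To illustrate how a screenshot hash can serve as a change detector, here is a small sketch that hashes saved PNG files locally. It assumes the hash agent-aid reports is the SHA-256 of the PNG bytes written to save_path, which the README does not state explicitly; png_sha256 and screen_changed are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def png_sha256(path):
    """Return the hex SHA-256 digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def screen_changed(before_path, after_path):
    """True if two saved screenshots differ at the byte level."""
    return png_sha256(before_path) != png_sha256(after_path)
```

A byte-level comparison flags any pixel difference, including a blinking cursor or a clock tick, so it suits "did anything happen" checks better than "did the right thing happen" checks.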

Targeting windows / coordinates

By default, coordinates are physical screen pixels. To work relative to a specific window, pass relative_to or hwnd:

agent-aid click x=120 y=80 relative_to=active_window
agent-aid click x=120 y=80 hwnd=123456
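
The arithmetic behind a window-relative click can be sketched as follows. This assumes relative coordinates are measured from the window's top-left corner; whether agent-aid uses the outer window rectangle or the client area is not stated here. WindowRect and to_screen are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowRect:
    """A window's position and size in physical screen pixels."""
    left: int
    top: int
    width: int
    height: int


def to_screen(rect, x, y):
    """Translate window-relative (x, y) to absolute screen coordinates."""
    if not (0 <= x < rect.width and 0 <= y < rect.height):
        raise ValueError(f"({x}, {y}) lies outside a {rect.width}x{rect.height} window")
    return rect.left + x, rect.top + y
```

For a window whose top-left corner sits at (200, 100), the relative point (120, 80) maps to the screen point (320, 180).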

Use as a Python library

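from agent_aid import core

# Opt in to DPI awareness first so coordinates refer to physical pixels.
core.set_dpi_aware()
# Show what the library reports for the current foreground window.
print(core.active_window())
# Click at screen position x=800, y=500.
core.click(800, 500)
# Type into whichever window currently has focus.
core.type_text("hello")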

Same capabilities as the CLI — just call the core module directly from your Python code.

Argument formats

# key=value (shortest)
agent-aid click x=500 y=300 button=left

# JSON (for nested fields)
agent-aid screenshot '{"region":{"left":0,"top":0,"width":800,"height":600},"save_path":"r.png"}'

# Pretty-print output
agent-aid --pretty state
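
When calling the CLI from a script, the JSON form can be generated rather than hand-quoted. The sketch below builds the same region payload as the JSON example above; screenshot_region_argv is a hypothetical helper, and it assumes the CLI accepts the JSON object as a single positional argument, as that example shows.

```python
import json


def screenshot_region_argv(left, top, width, height, save_path):
    """Return an argv list whose last element is the JSON payload."""
    payload = {
        "region": {"left": left, "top": top, "width": width, "height": height},
        "save_path": save_path,
    }
    # Compact separators reproduce the spacing used in the README example.
    return ["agent-aid", "screenshot", json.dumps(payload, separators=(",", ":"))]


# Example (requires agent-aid installed on a Windows machine):
# import subprocess
# subprocess.run(screenshot_region_argv(0, 0, 800, 600, "r.png"), check=True)
```

Handing the list to subprocess.run without shell=True sidesteps quoting differences between cmd.exe and PowerShell.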

Practical AI agent flow

  1. agent-aid state — see what you're looking at
  2. agent-aid screenshot active_window=true save_path=captures/now.png include_base64=false
  3. Inspect the image, choose target coordinates
  4. Act: click / type / press / clipboard
  5. Verify: another screenshot or wait_screen_change
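
The five steps above can be sketched as one function. This is an illustration under stated assumptions, not agent-aid's own API: cli and act_and_verify are hypothetical helpers, choose_target stands in for whatever model or heuristic picks the coordinates, and the sketch verifies with a second screenshot because the parameters of wait_screen_change are not documented in this section.

```python
import subprocess


def cli(*args):
    """Run agent-aid with the given arguments and return stdout as text."""
    result = subprocess.run(
        ["agent-aid", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def act_and_verify(choose_target, run=cli,
                   before_path="captures/now.png",
                   after_path="captures/after.png"):
    """Observe, pick a target, click it, then capture a fresh screenshot."""
    # 1. See what is on screen.
    state = run("state")
    # 2. Capture the active window to a file.
    run("screenshot", "active_window=true",
        f"save_path={before_path}", "include_base64=false")
    # 3. Let the caller inspect the state and image and choose coordinates.
    x, y = choose_target(state, before_path)
    # 4. Act.
    run("click", f"x={x}", f"y={y}")
    # 5. Verify with a fresh screenshot; the caller compares the two files.
    run("screenshot", "active_window=true",
        f"save_path={after_path}", "include_base64=false")
    return before_path, after_path
```

The run parameter makes the flow testable with a fake runner, so the sequence can be checked without sending real input to the desktop.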

Safety notes

  • Sends real mouse and keyboard input — types into the focused window.
  • clipboard overwrites the user's clipboard.
  • open launches a Windows target (same effect as a user double-click).
  • window/manage close posts WM_CLOSE — apps with unsaved data may prompt.
  • After any action, prefer to verify with wait_* or a fresh screenshot.
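
One way to act on these notes is to gate side-effecting routes behind an approval callback before any command is built. The sketch below is an assumption-laden illustration: RISKY_ROUTES is a hand-picked set of route names taken from this README, and guarded_argv and approve are hypothetical, not part of agent-aid.

```python
# Routes from this README that send input or change user-visible state.
RISKY_ROUTES = frozenset({"click", "type", "press", "clipboard", "open"})


def guarded_argv(route, params, approve):
    """Build the argv for a route, asking approve() first if it is risky.

    approve(route, params) should return True to allow the call.
    """
    if route in RISKY_ROUTES and not approve(route, params):
        raise PermissionError(f"route {route!r} was not approved")
    return ["agent-aid", route, *(f"{key}={value}" for key, value in params.items())]
```

Read-only routes such as state pass through untouched, while anything that types, clicks, or overwrites the clipboard needs an explicit yes from the caller's policy.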

License

MIT.
