Computer Use SDK for Python — build screen-controlling AI agents in a few lines of code

screenagent

Control any macOS app with Claude — Python SDK + CLI.

Browser Use and Skyvern only work inside the browser. screenagent drives native input through the macOS Accessibility API and CGEvent, so it works with System Settings, Finder, Notes, Calculator, and any other macOS app.

Install

pip install screenagent-ai

Requires macOS and Python 3.11+.

Usage 1: CLI

Run directly from the terminal. Works with Claude Code out of the box.

# Control native apps (finds and launches via Spotlight automatically)
screenagent run "Open Calculator and compute 42 * 17"
screenagent run "Open System Settings and switch to Dark Mode"

# Browser automation
screenagent run "Open Chrome, go to youtube.com, search for ycombinator"
screenagent run --app "Google Chrome" "Go to google.com and search for AI news"

# Individual actions (no API key needed)
screenagent screenshot --file screen.png
screenagent ax-tree "Google Chrome"
screenagent click 640 400
screenagent type "hello world"
screenagent key return --modifiers command
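
To drive the CLI from a script instead of importing the SDK, the commands above can be wrapped with `subprocess`. This is a minimal sketch, not part of screenagent: `build_run_command` and `run_task_via_cli` are hypothetical helper names, and only the `run` subcommand and `--app` flag shown above are used. It assumes `screenagent` is on your `PATH`.

```python
import subprocess
from typing import List, Optional


def build_run_command(task: str, app: Optional[str] = None) -> List[str]:
    """Build the argv list for `screenagent run`, optionally targeting one app."""
    cmd = ["screenagent", "run"]
    if app is not None:
        cmd += ["--app", app]
    cmd.append(task)
    return cmd


def run_task_via_cli(task: str, app: Optional[str] = None) -> int:
    """Run the task through the CLI and return the process exit code."""
    # Passing a list (not a shell string) avoids quoting problems in the task text.
    return subprocess.run(build_run_command(task, app)).returncode
```

What the exit code means for a failed task is not documented here, so treat it as informational until you have checked it yourself.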

Usage 2: Python SDK

3-line agent

from screenagent import Agent

agent = Agent()
result = agent.run("Open System Settings and switch to Dark Mode")
print(result.summary)
print(result.success)
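
When chaining several tasks, it can help to stop as soon as one fails. The sketch below is an illustration only: `run_tasks` is a hypothetical helper, and it assumes nothing beyond what the example above shows, namely that `agent.run(task)` returns an object with `summary` and `success` attributes (with `success` treated as truthy or falsy).

```python
from typing import Any, List, Tuple


def run_tasks(agent: Any, tasks: List[str]) -> List[Tuple[str, bool, str]]:
    """Run tasks in order and stop after the first one reported as failed."""
    outcomes = []
    for task in tasks:
        result = agent.run(task)
        outcomes.append((task, bool(result.success), str(result.summary)))
        if not result.success:
            break  # later tasks usually depend on earlier ones
    return outcomes


# Usage on macOS with ANTHROPIC_API_KEY set:
#   from screenagent import Agent
#   for task, ok, summary in run_tasks(Agent(), ["Open Calculator and compute 42 * 17"]):
#       print(ok, summary)
```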

Component functions (no API key needed)

from screenagent import screenshot, click, type_text, key_press, get_ui_tree

png_bytes = screenshot()
click(640, 400)
type_text("hello world")
key_press("return")

tree = get_ui_tree("Google Chrome")
print(tree.to_text())
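
Because `screenshot()` returns the image as bytes, saving it to disk takes one extra step. The helper below is a hedged sketch: `save_png` is a hypothetical name, and it assumes the returned bytes are PNG-encoded, as the variable name `png_bytes` above suggests. It takes the capture function as an argument, so it can be tried without screenagent installed.

```python
from pathlib import Path
from typing import Callable

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def save_png(capture: Callable[[], bytes], path: str) -> Path:
    """Call capture(), check that the result looks like a PNG, and write it to path."""
    data = capture()
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("capture() did not return PNG data")
    out = Path(path)
    out.write_bytes(data)
    return out


# Usage on macOS:
#   from screenagent import screenshot
#   save_png(screenshot, "screen.png")
```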

Configuration

| Environment Variable | Default | Description |
| --- | --- | --- |
| ANTHROPIC_API_KEY | (none) | Claude API key (required for agent) |
| AGENT_MODEL | claude-sonnet-4-6 | Model to use |
| AGENT_MAX_STEPS | 20 | Maximum agent loop iterations |
| AGENT_COMPUTER_USE | true | Use Claude computer-use tool |
| CDP_PORT | 9222 | Chrome DevTools Protocol port |

Also supports .env files.
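
To see how these variables and defaults fit together, here is a minimal sketch of reading them into one settings object. It is not screenagent's actual loader: `AgentSettings` and `load_settings` are hypothetical names, the set of strings accepted as "true" is an assumption, and `.env` parsing is left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AgentSettings:
    api_key: Optional[str]
    model: str
    max_steps: int
    computer_use: bool
    cdp_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> AgentSettings:
    """Read the documented variables, falling back to the documented defaults."""
    truthy = {"1", "true", "yes", "on"}  # assumption: accepted spellings of "true"
    return AgentSettings(
        api_key=env.get("ANTHROPIC_API_KEY"),  # no default; needed only for the agent
        model=env.get("AGENT_MODEL", "claude-sonnet-4-6"),
        max_steps=int(env.get("AGENT_MAX_STEPS", "20")),
        computer_use=env.get("AGENT_COMPUTER_USE", "true").strip().lower() in truthy,
        cdp_port=int(env.get("CDP_PORT", "9222")),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check overrides, for example `load_settings({"AGENT_MAX_STEPS": "5"})`.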

Requirements

  • macOS (Quartz CGEvent, Accessibility API)
  • Python 3.11+
  • Accessibility permission granted to your terminal/IDE
  • Chrome with --remote-debugging-port=9222 for CDP features (optional)
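
Some of the requirements above can be checked before the first run. The sketch below uses only the Python standard library; `check_platform` and `cdp_port_open` are hypothetical helpers and not part of screenagent. It does not check the Accessibility permission, which has to be granted in System Settings.

```python
import socket
from typing import List, Tuple


def check_platform(system: str, version: Tuple[int, int]) -> List[str]:
    """Return a list of unmet requirements (empty when everything is fine)."""
    problems = []
    if system != "Darwin":  # platform.system() returns "Darwin" on macOS
        problems.append(f"macOS is required, found {system}")
    if version < (3, 11):
        problems.append(f"Python 3.11+ is required, found {version[0]}.{version[1]}")
    return problems


def cdp_port_open(port: int = 9222, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on the CDP port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage:
#   import platform, sys
#   print(check_platform(platform.system(), sys.version_info[:2]))
#   print(cdp_port_open())
```

An open port only shows that some process is listening there; it does not prove that the process is Chrome.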

License

MIT
