Computer Use SDK for Python — build screen-controlling AI agents in a few lines of code

screenagent

Control any macOS app with Claude — Python SDK + CLI.

Browser Use and Skyvern work only inside the browser. screenagent drives native input through the macOS Accessibility API and CGEvent, so it also works with System Settings, Finder, Notes, Calculator, and any other macOS app.

Install

pip install screenagent-ai

This installs both the Python SDK (from screenagent import ...) and the screenagent CLI command.

Requires macOS and Python 3.11+.

Setup

1. Anthropic API Key

Get your key at console.anthropic.com and set it:

export ANTHROPIC_API_KEY="sk-ant-..."
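
If you start the agent from a Python script, it can help to fail early with a clear message when the key is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper, not part of screenagent.

```python
import os


def require_api_key(env=None):
    """Return the Anthropic API key from the environment or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running the agent."
        )
    return key
```

Call `require_api_key()` once at startup, before constructing the agent.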

2. Accessibility Permission

macOS requires you to grant accessibility access to your terminal app:

System Settings → Privacy & Security → Accessibility → add your terminal (Terminal.app, iTerm2, VS Code, etc.)

Without this, screenagent cannot read UI elements or send keyboard/mouse events.
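
One way to confirm the permission before running anything is to ask macOS directly. The sketch below assumes the optional PyObjC package `pyobjc-framework-ApplicationServices` is installed; whether screenagent itself depends on PyObjC or performs this check is not stated here. `accessibility_trusted` and `describe` are hypothetical helper names.

```python
def accessibility_trusted():
    """Return True/False if the check can run, or None if it cannot.

    AXIsProcessTrusted is the macOS call that reports whether the current
    process has been granted Accessibility access.
    """
    try:
        # Assumption: PyObjC's ApplicationServices bindings are installed.
        from ApplicationServices import AXIsProcessTrusted
    except ImportError:
        return None
    return bool(AXIsProcessTrusted())


def describe(status):
    """Turn the tri-state result into a short human-readable message."""
    if status is None:
        return "Could not check (PyObjC not installed or not on macOS)."
    if status:
        return "Accessibility access is granted."
    return "Accessibility access is missing; add your terminal in System Settings."


print(describe(accessibility_trusted()))
```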

Usage 1: CLI

Run directly from the terminal. Works with Claude Code out of the box.

# Control native apps (finds and launches via Spotlight automatically)
screenagent run "Open Calculator and compute 42 * 17"
screenagent run "Open System Settings and switch to Dark Mode"

# Browser automation
screenagent run "Open Chrome, go to youtube.com, search for ycombinator"
screenagent run --app "Google Chrome" "Go to google.com and search for AI news"

# Individual actions (no API key needed)
screenagent screenshot --file screen.png
screenagent ax-tree "Google Chrome"
screenagent click 640 400
screenagent type "hello world"
screenagent key return --modifiers command
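
To call the CLI from a script, building the argument list explicitly avoids shell-quoting problems with multi-word tasks and app names. This is a minimal sketch that uses only the `run` subcommand and `--app` flag shown above; `build_run_command` and `run_task` are hypothetical helpers, and the meaning of the CLI's exit code is an assumption.

```python
import subprocess


def build_run_command(task, app=None):
    """Build the argv list for `screenagent run`, optionally targeting one app."""
    argv = ["screenagent", "run"]
    if app is not None:
        argv += ["--app", app]
    argv.append(task)
    return argv


def run_task(task, app=None):
    """Run the CLI and return its exit code.

    Requires the screenagent command on PATH. Assumption: a non-zero
    exit code signals failure; the README does not document this.
    """
    return subprocess.run(build_run_command(task, app), check=False).returncode
```

For example, `run_task("Go to google.com and search for AI news", app="Google Chrome")` mirrors the second browser command above.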

Usage 2: Python SDK

3-line agent

from screenagent import Agent

agent = Agent()
result = agent.run("Open System Settings and switch to Dark Mode")
print(result.summary)
print(result.success)
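
In a script you will usually want to act on `result.success` rather than just print it. The sketch below wraps a run in a hypothetical helper, `run_and_report`, that returns a process-style exit code. It relies only on the `run()`, `summary`, and `success` names shown above; which exceptions `run()` may raise is not documented here, so none are caught.

```python
def run_and_report(agent, task):
    """Run one task, print its summary, and return 0 on success or 1 on failure.

    `agent` is any object with a run(task) method whose result exposes
    .summary and .success, as in the example above.
    """
    result = agent.run(task)
    print(result.summary)
    return 0 if result.success else 1


# Usage (requires screenagent, macOS, and an API key):
# from screenagent import Agent
# raise SystemExit(run_and_report(Agent(), "Open Calculator and compute 42 * 17"))
```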

Component functions (no API key needed)

from screenagent import screenshot, click, type_text, key_press, get_ui_tree

png_bytes = screenshot()
click(640, 400)
type_text("hello world")
key_press("return")

tree = get_ui_tree("Google Chrome")
print(tree.to_text())
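
The variable name `png_bytes` suggests that `screenshot()` returns PNG-encoded bytes; assuming that is the case, the result can be written straight to disk. `save_png` below is a hypothetical helper that checks the standard 8-byte PNG signature first, so a wrong return type is caught immediately.

```python
from pathlib import Path

# Every valid PNG file starts with these 8 bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_png(data, path):
    """Write PNG bytes to `path` after verifying the PNG signature."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not start with the PNG signature")
    target = Path(path)
    target.write_bytes(data)
    return target


# Usage (requires screenagent and macOS):
# from screenagent import screenshot
# save_png(screenshot(), "screen.png")
```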

Configuration

Environment Variable   Default             Description
ANTHROPIC_API_KEY      (none)              Claude API key (required for agent)
AGENT_MODEL            claude-sonnet-4-6   Model to use
AGENT_MAX_STEPS        20                  Maximum agent loop iterations
AGENT_COMPUTER_USE     true                Use Claude computer-use tool
CDP_PORT               9222                Chrome DevTools Protocol port

These variables can also be set in a .env file.
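
To illustrate how these variables and their defaults fit together, here is a minimal sketch that reads them into a typed object. It is not screenagent's actual loader: `Settings` and `load_settings` are hypothetical names, and the set of strings accepted as "true" for `AGENT_COMPUTER_USE` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    model: str
    max_steps: int
    computer_use: bool
    cdp_port: int


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: these spellings count as true; anything else is false.
    truthy = {"1", "true", "yes"}
    return Settings(
        api_key=env.get("ANTHROPIC_API_KEY"),
        model=env.get("AGENT_MODEL", "claude-sonnet-4-6"),
        max_steps=int(env.get("AGENT_MAX_STEPS", "20")),
        computer_use=env.get("AGENT_COMPUTER_USE", "true").strip().lower() in truthy,
        cdp_port=int(env.get("CDP_PORT", "9222")),
    )
```

Passing a plain dict, for example `load_settings({"AGENT_MAX_STEPS": "5"})`, makes it easy to try overrides without touching the real environment.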

License

MIT
