Computer Use SDK for Python — build screen-controlling AI agents in a few lines of code

screenagent

Control any macOS app with Claude — Python SDK + CLI.

Browser Use and Skyvern only work inside the browser. screenagent uses the macOS Accessibility API and CGEvent for native input, so it also works with System Settings, Finder, Notes, Calculator, and any other app.

Install

pip install screenagent-ai

Requires macOS and Python 3.11+.

Setup

1. Anthropic API Key

Get your key at console.anthropic.com and set it:

export ANTHROPIC_API_KEY="sk-ant-..."
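
A script that starts an agent can fail fast when the key is missing instead of erroring partway through a run. The following is a minimal sketch using only the standard library; the helper name require_api_key is hypothetical and not part of screenagent.

```python
import os


def require_api_key(env=None):
    """Return the Anthropic API key, or raise if it is not set.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running the agent"
        )
    return key
```

Calling require_api_key() at the top of a script gives a clear error message before any screen interaction begins.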

2. Accessibility Permission

macOS requires you to grant accessibility access to your terminal app:

System Settings → Privacy & Security → Accessibility → add your terminal (Terminal.app, iTerm2, VS Code, etc.)

Without this, screenagent cannot read UI elements or send keyboard/mouse events.
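
To check whether the current process already has this permission, one option is to call the macOS function AXIsProcessTrusted directly. The sketch below does so through ctypes; it is an illustration of the system API, not necessarily how screenagent performs the check, and the helper name accessibility_trusted is hypothetical. On platforms other than macOS it returns None.

```python
import ctypes
import ctypes.util
import sys


def accessibility_trusted():
    """Return True/False on macOS, or None when the check is unavailable."""
    if sys.platform != "darwin":
        return None
    path = ctypes.util.find_library("ApplicationServices")
    if path is None:
        return None
    lib = ctypes.cdll.LoadLibrary(path)
    # AXIsProcessTrusted takes no arguments and returns a Boolean.
    lib.AXIsProcessTrusted.restype = ctypes.c_bool
    lib.AXIsProcessTrusted.argtypes = []
    return bool(lib.AXIsProcessTrusted())
```

The result reflects the process that hosts Python, which is why the permission is granted to the terminal app rather than to screenagent itself.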

Usage 1: CLI

Run directly from the terminal. Works with Claude Code out of the box.

# Control native apps (finds and launches via Spotlight automatically)
screenagent run "Open Calculator and compute 42 * 17"
screenagent run "Open System Settings and switch to Dark Mode"

# Browser automation
screenagent run "Open Chrome, go to youtube.com, search for ycombinator"
screenagent run --app "Google Chrome" "Go to google.com and search for AI news"

# Individual actions (no API key needed)
screenagent screenshot --file screen.png
screenagent ax-tree "Google Chrome"
screenagent click 640 400
screenagent type "hello world"
screenagent key return --modifiers command
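
The same commands can be assembled from Python when a task string comes from user input or another program. This sketch only builds the argument list for the run subcommand and the --app flag shown above; the helper name build_run_command is hypothetical, and nothing is executed.

```python
import shlex


def build_run_command(task, app=None):
    """Build the argv list for `screenagent run`, optionally targeting an app."""
    argv = ["screenagent", "run"]
    if app is not None:
        argv += ["--app", app]
    argv.append(task)
    return argv


argv = build_run_command("Go to google.com and search for AI news", app="Google Chrome")
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv) avoids shell quoting problems with task strings that contain spaces or quotes.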

Usage 2: Python SDK

3-line agent

from screenagent import Agent

agent = Agent()
result = agent.run("Open System Settings and switch to Dark Mode")
print(result.summary)
print(result.success)
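
In a script, result.success can be mapped to a process exit code so that shell pipelines and CI jobs notice a failed task. A minimal sketch, assuming only the run method and the summary and success attributes shown above; the helper name run_task is hypothetical.

```python
import sys


def run_task(agent, task, out=sys.stdout):
    """Run one task and return a shell-style exit code (0 on success)."""
    result = agent.run(task)
    # Print the agent's summary so the outcome is visible in logs.
    print(result.summary, file=out)
    return 0 if result.success else 1
```

With the SDK installed, this would be used as sys.exit(run_task(Agent(), "Open Calculator and compute 42 * 17")). Taking the agent as a parameter keeps the function testable with a stand-in object.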

Component functions (no API key needed)

from screenagent import screenshot, click, type_text, key_press, get_ui_tree

png_bytes = screenshot()
click(640, 400)
type_text("hello world")
key_press("return")

tree = get_ui_tree("Google Chrome")
print(tree.to_text())
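
The bytes returned by screenshot() can be written to disk for later inspection. The sketch below assumes, as the variable name png_bytes suggests, that the return value is raw PNG data; the helper name save_png is hypothetical.

```python
from pathlib import Path

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_png(data, path):
    """Write PNG bytes to `path` after checking the PNG signature."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not start with the PNG signature")
    target = Path(path)
    target.write_bytes(data)
    return target
```

For example, save_png(screenshot(), "screen.png") would store the current screen, and it raises ValueError if the data is not a PNG.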

Configuration

Environment Variable    Default              Description
ANTHROPIC_API_KEY       (none)               Claude API key (required for agent)
AGENT_MODEL             claude-sonnet-4-6    Model to use
AGENT_MAX_STEPS         20                   Maximum agent loop iterations
AGENT_COMPUTER_USE      true                 Use Claude computer-use tool
CDP_PORT                9222                 Chrome DevTools Protocol port

These variables can also be set in a .env file.
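
To see how these variables and defaults fit together, the sketch below reads them from the environment into a typed object. It mirrors the documented names and defaults only; the Settings class, the load_settings helper, and the set of strings treated as true are assumptions for illustration, not screenagent's actual implementation.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    model: str
    max_steps: int
    computer_use: bool
    cdp_port: int


def load_settings(env=None):
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: "1", "true" and "yes" (any case) count as true.
    flag = env.get("AGENT_COMPUTER_USE", "true").strip().lower()
    return Settings(
        api_key=env.get("ANTHROPIC_API_KEY"),
        model=env.get("AGENT_MODEL", "claude-sonnet-4-6"),
        max_steps=int(env.get("AGENT_MAX_STEPS", "20")),
        computer_use=flag in {"1", "true", "yes"},
        cdp_port=int(env.get("CDP_PORT", "9222")),
    )
```

Calling load_settings({"AGENT_MAX_STEPS": "5"}) overrides one value and leaves the rest at their defaults.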

License

MIT
