🖥️ Desktop-Use

AI-powered desktop automation SDK — like Browser-Use but for your entire desktop.

from desktop_use import Agent

agent = Agent(task="Open Calculator and calculate 2+2")
result = agent.run_sync()

✨ Features

  • 🤖 AI-Powered - Uses Claude's vision to understand and interact with any desktop application
  • 🖱️ Full Control - Mouse, keyboard, screenshots - everything you need
  • 🍎 macOS First - Native performance with cliclick (Windows/Linux coming soon)
  • 📐 Ultra-wide Ready - Automatic resolution scaling for any display size
  • 🧠 Memory Integration - Optional Remembra integration for persistent context
  • ⚡ Simple API - One-liner to automate complex tasks

🚀 Quick Start

Installation

# Install cliclick (macOS only)
brew install cliclick

# Install Desktop-Use (published on PyPI as ai-desktop-use; imported as desktop_use)
pip install ai-desktop-use

Grant Permissions

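Go to System Settings → Privacy & Security → Accessibility (System Preferences → Security & Privacy → Privacy on macOS 12) and enable your terminal app. Because the agent takes screenshots, macOS will typically also ask for the Screen Recording permission under Privacy & Security.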

Run Your First Task

# CLI
desktop-use "Open Spotlight and search for Notes"

# Python
python -c "from desktop_use import run; run('Open Calculator')"

📖 Usage

Python SDK

from desktop_use import Agent, AgentConfig

# Simple usage
agent = Agent(task="Open FaceTime")
result = agent.run_sync()

# With configuration
config = AgentConfig(
    model="claude-sonnet-4-20250514",
    max_steps=20,
    verbose=True
)
agent = Agent(task="Create a new folder called 'Test' on Desktop", config=config)
result = agent.run_sync()

if result.success:
    print(f"Done: {result.final_message}")
else:
    print(f"Failed: {result.error}")

Low-Level Control

from desktop_use import Desktop

desktop = Desktop()

# Direct control
desktop.click(100, 200)
desktop.type("Hello World")
desktop.press("return")
desktop.hotkey("cmd", "space")  # Open Spotlight

# Screenshot
img = desktop.screenshot()
img.save("screen.png")

CLI

# Basic usage
desktop-use "Open Safari and go to google.com"

# With options
desktop-use --max-steps 25 "Fill out the form on the current page"
desktop-use --model claude-opus-4-20250514 "Complex multi-step task"
desktop-use --quiet "Background task"
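
The same CLI options can be driven from a Python script. The sketch below is illustrative only: `build_command` and `run_task` are hypothetical helper names, not part of Desktop-Use, and it assumes the `desktop-use` executable is on your PATH. It uses only the flags shown above, and what the exit code means is not documented here, so treat it as informational.

```python
import subprocess


def build_command(task, max_steps=None, model=None, quiet=False):
    """Build the argv list for the desktop-use CLI (options first, task last)."""
    cmd = ["desktop-use"]
    if max_steps is not None:
        cmd += ["--max-steps", str(max_steps)]
    if model:
        cmd += ["--model", model]
    if quiet:
        cmd.append("--quiet")
    cmd.append(task)  # passed as one argument, so no shell quoting is needed
    return cmd


def run_task(task, **options):
    """Run the CLI and return its exit code (hypothetical helper)."""
    return subprocess.run(build_command(task, **options), check=False).returncode


# Inspect the command without running it.
print(build_command("Fill out the form on the current page", max_steps=25))
```

Passing the arguments as a list rather than a single shell string keeps task descriptions containing quotes or spaces intact.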

🖥️ Platform Support

| Platform | Status  | Mouse/Keyboard | Screenshots |
|----------|---------|----------------|-------------|
| macOS    | ✅ Ready | cliclick       | native/mss  |
| Windows  | 🔜 Soon  | pyautogui      | mss         |
| Linux    | 🔜 Soon  | pyautogui      | mss         |

🔧 Requirements

  • Python 3.10+
  • macOS 12+ (for now)
  • cliclick (brew install cliclick)
  • Anthropic API key
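
A quick preflight script can confirm these requirements before a first run. This is a sketch, not something Desktop-Use ships: the function names are hypothetical, and it assumes the API key is supplied through the `ANTHROPIC_API_KEY` environment variable, which the list above does not confirm.

```python
import os
import platform
import shutil
import sys


def check_requirements(python_version, system, macos_major, has_cliclick, has_api_key):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    if tuple(python_version) < (3, 10):
        problems.append("Python 3.10+ is required")
    if system != "Darwin":
        problems.append("only macOS is supported for now")
    elif macos_major < 12:
        problems.append("macOS 12+ is required")
    if not has_cliclick:
        problems.append("cliclick not found (brew install cliclick)")
    if not has_api_key:
        # Assumption: the key is read from this environment variable.
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


def current_environment():
    """Collect the values check_requirements() needs from this machine."""
    mac_release = platform.mac_ver()[0]  # empty string on non-macOS systems
    return {
        "python_version": sys.version_info[:2],
        "system": platform.system(),
        "macos_major": int(mac_release.split(".")[0]) if mac_release else 0,
        "has_cliclick": shutil.which("cliclick") is not None,
        "has_api_key": bool(os.environ.get("ANTHROPIC_API_KEY")),
    }


if __name__ == "__main__":
    for problem in check_requirements(**current_environment()):
        print(f"- {problem}")
```

Keeping the checks separate from the environment lookup makes the logic easy to exercise with made-up values.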

📐 How It Works

  1. Capture - Takes a screenshot of your desktop
  2. Scale - Resizes to fit Claude's vision constraints (max 1568px edge)
  3. Analyze - Claude sees the screenshot and decides what to do
  4. Execute - Mouse/keyboard commands are executed via cliclick
  5. Repeat - Loop until task is complete or max steps reached
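
The five steps above amount to a bounded observe-decide-act loop. The sketch below shows only that control flow, under stated assumptions: `Action`, `run_loop`, and the `capture`, `decide`, and `execute` callables are hypothetical stand-ins, not Desktop-Use's internals, and the model call is replaced by a scripted stub so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """One decision from the model; done=True means the task is finished."""
    kind: str                       # e.g. "click" or "type" (hypothetical labels)
    payload: Optional[dict] = None
    done: bool = False


def run_loop(
    capture: Callable[[], bytes],
    decide: Callable[[bytes], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> tuple[bool, int]:
    """Capture -> decide -> execute until done or max_steps is reached.

    Returns (success, steps_taken).
    """
    for step in range(1, max_steps + 1):
        screenshot = capture()        # steps 1-2: capture (and scale)
        action = decide(screenshot)   # step 3: the model picks an action
        if action.done:
            return True, step
        execute(action)               # step 4: perform the action
    return False, max_steps           # step 5: stopped at the step limit


# Stubbed run: one click, then the "model" reports completion.
script = iter([Action("click", {"x": 10, "y": 20}), Action("done", done=True)])
performed = []
ok, steps = run_loop(lambda: b"", lambda _img: next(script), performed.append, max_steps=5)
print(ok, steps, len(performed))
```

The hard step limit is what keeps a confused agent from clicking indefinitely, which is why `max_steps` appears in both the config and the CLI.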

The coordinate scaling is automatic - Claude works in scaled coordinates, and Desktop-Use maps them back to your actual resolution.
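
The mapping can be illustrated with a few lines of arithmetic. This is a minimal sketch rather than Desktop-Use's actual code: `scale_factor` and `to_screen` are hypothetical names, the display size is a made-up example, and it assumes the screenshot is downscaled uniformly so its longer edge is at most 1568 px.

```python
MAX_EDGE = 1568  # longest screenshot edge sent to the model, per the steps above


def scale_factor(width: int, height: int, max_edge: int = MAX_EDGE) -> float:
    """Return the factor (at most 1.0) that fits the longer edge within max_edge."""
    return min(1.0, max_edge / max(width, height))


def to_screen(x: int, y: int, factor: float) -> tuple[int, int]:
    """Map a point in scaled-screenshot coordinates back to real pixels."""
    return round(x / factor), round(y / factor)


# Example: a 5120x1440 ultra-wide display (hypothetical numbers).
factor = scale_factor(5120, 1440)
scaled_size = (round(5120 * factor), round(1440 * factor))
click = to_screen(784, 220, factor)  # centre of the scaled image

print(scaled_size)  # (1568, 441)
print(click)        # (2560, 718)
```

Displays whose longer edge is already within the limit get a factor of 1.0, so their coordinates pass through unchanged.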

🧠 Memory Integration (Optional)

Connect to Remembra for persistent memory:

from desktop_use import Agent, AgentConfig

config = AgentConfig(
    memory_url="http://localhost:8787",
    memory_project_id="my-automation"
)
agent = Agent(task="Do what I asked yesterday", config=config)
result = agent.run_sync()

🛣️ Roadmap

  • macOS support with cliclick
  • Ultra-wide display support
  • CLI interface
  • Windows support (pyautogui)
  • Linux support (pyautogui/xdotool)
  • MCP server for Claude Desktop
  • Async operations
  • Action recording/playback
  • Visual debugging mode

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md.

📄 License

MIT License - see LICENSE.


Built with ❤️ by DolphyTech
