Open Computer Use Agent — framework for desktop and browser automation

opendesk — Python SDK

Give any AI agent eyes and hands on your desktop.

macOS · Linux · Windows


Install

pip install 'opendesk[core,mcp]'
opendesk install

opendesk install registers the MCP server with Claude Code globally.
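
After installing, it can be useful to confirm that the package and its command-line entry points are visible from the current environment. The sketch below uses only the Python standard library. The distribution name `opendesk` and the commands `opendesk` and `opendesk-mcp` come from this README; the helper name `check_install` is made up for illustration and is not part of opendesk.

```python
import shutil
from importlib import metadata


def check_install(dist="opendesk", commands=("opendesk", "opendesk-mcp")):
    """Report the installed version (or None) and where each CLI command resolves."""
    try:
        version = metadata.version(dist)
    except metadata.PackageNotFoundError:
        version = None
    # shutil.which returns None when a command is not on PATH
    paths = {cmd: shutil.which(cmd) for cmd in commands}
    return {"version": version, "commands": paths}


if __name__ == "__main__":
    print(check_install())
```

A `None` version or a `None` command path suggests the install went into a different environment than the one running the check.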


Quick start

import asyncio
from opendesk import create_registry, allow_all_context

async def main():
    registry = create_registry()
    ctx = allow_all_context()

    # Screenshot with Set-of-Marks
    shot = registry.get("screenshot")
    result = await shot.execute(ctx, shot.Params(marks=True))
    print(result.output)

    # Click a button by name — no coordinates needed
    ui = registry.get("ui")
    await ui.execute(ctx, ui.Params(action="click", app="TextEdit", title="File"))

asyncio.run(main())

Installation options

pip install opendesk                              # core framework only
pip install 'opendesk[core,mcp]'                  # + screen capture + MCP server (recommended)
pip install 'opendesk[core,mcp,learn]'            # + task recording and replay
pip install 'opendesk[core,mcp,learn,schedule]'   # + scheduled tasks
pip install 'opendesk[core,mcp,remote]'           # + remote machine control
pip install 'opendesk[all]'                       # everything

Tools

Tool        What it does
----------  -------------------------------------------------------------
screenshot  Capture screen with Set-of-Marks on every interactive element
ui          Click and type by element name (no coordinates needed)
mouse       Pixel-level mouse control for anything ui can't reach
keyboard    Type text, press keys, send hotkeys
app         Open, close, and focus applications
clipboard   Read and write the system clipboard
ocr         Extract text from any region of the screen
learn       Record a workflow once, replay it anytime
schedule    Run any task on a timer
audit       Show the session audit log in any MCP session

Full reference: docs/tools.md
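
Each tool listed above is reached the same way the Quick start reaches `screenshot` and `ui`: look it up in the registry, build its `Params`, and await `execute`. A minimal sketch of a helper that wraps this pattern follows. `run_tool` and `demo` are hypothetical names, not part of opendesk. Only `marks=True` for `screenshot` and the `ui` click arguments are taken from this README; parameter names for the other tools would have to come from docs/tools.md.

```python
import asyncio


async def run_tool(registry, ctx, name, **params):
    """Look up a tool by name and run it with keyword parameters.

    Mirrors the Quick start pattern: registry.get(name), tool.Params(...),
    then await tool.execute(ctx, params).
    """
    tool = registry.get(name)
    return await tool.execute(ctx, tool.Params(**params))


async def demo():
    # Imported lazily so run_tool can be reused or tested without opendesk installed.
    from opendesk import create_registry, allow_all_context

    registry = create_registry()
    ctx = allow_all_context()
    shot = await run_tool(registry, ctx, "screenshot", marks=True)
    print(shot.output)
    await run_tool(registry, ctx, "ui", action="click", app="TextEdit", title="File")
```

On a machine with opendesk installed, `asyncio.run(demo())` would repeat the two Quick start calls through the helper.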


Remote machine control

opendesk supports controlling remote machines over an encrypted WebSocket connection with mDNS peer discovery.

# On the machine to be controlled:
pip install 'opendesk[core,mcp,remote]'
opendesk pair            # prints a pairing code

# On the controlling machine:
opendesk pair-with <host> <code>
opendesk serve           # start the server

See docs/remote.md and docs/protocol.md for full details.


MCP integrations

Claude Code

opendesk install        # register globally
opendesk uninstall      # remove

Claude Desktop

{
  "mcpServers": {
    "opendesk": { "command": "opendesk-mcp" }
  }
}
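
If the Claude Desktop config file already lists other servers, the entry above needs to be merged in rather than pasted over them. Here is a sketch of that merge, assuming you pass the path to your own config file (its location varies by platform and is not specified in this README). `add_opendesk_server` is an illustrative name, not an opendesk command.

```python
import json
from pathlib import Path


def add_opendesk_server(config_path, command="opendesk-mcp"):
    """Add or update the "opendesk" entry under "mcpServers", keeping other keys."""
    path = Path(config_path)
    # Start from the existing config if there is one, else from an empty object
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["opendesk"] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

The function rewrites the file in place, so keeping a backup copy of the original config before running it is a reasonable precaution.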

Cursor / Continue

{
  "mcpServers": [{ "name": "opendesk", "command": "opendesk-mcp", "transport": "stdio" }]
}

Agent integrations

Anthropic SDK

import asyncio

import anthropic
from opendesk.integrations.claude_code import ClaudeCodeAdapter
from opendesk.registry import create_registry

async def main():
    client = anthropic.Anthropic()
    adapter = ClaudeCodeAdapter(create_registry())

    # run_loop is awaited, so the call has to live inside an async function
    result = await adapter.run_loop(
        client=client,
        model="claude-opus-4-6",
        messages=[{"role": "user", "content": "Open TextEdit and type Hello."}],
        system="Use the ui tool first. Mouse is a last resort.",
    )
    print(result)

asyncio.run(main())

OpenAI / on-device models (Ollama, vLLM, llama.cpp)

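import asyncio
from openai import OpenAI
from opendesk.integrations.openai_compat import OpenAIAdapter

async def main():
    # base_url points at a local Ollama server; swap in any OpenAI-compatible endpoint
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    adapter = OpenAIAdapter()
    messages = [{"role": "user", "content": "Open TextEdit and type Hello."}]
    result = await adapter.run_loop(client, model="qwen2.5:72b", messages=messages)

asyncio.run(main())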
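
Before starting an agent loop against a local server, it can help to confirm that the endpoint is reachable and that the model is actually served. The sketch below queries the `/models` route that OpenAI-compatible servers generally expose under their base URL. `list_models` is a hypothetical helper, not part of opendesk or the OpenAI SDK, and it uses only the standard library.

```python
import json
import urllib.error
import urllib.request


def list_models(base_url, timeout=2.0):
    """Return the model ids served at base_url, or None if the server is unreachable."""
    url = base_url.rstrip("/") + "/models"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("id") for m in payload.get("data", [])]


if __name__ == "__main__":
    print(list_models("http://localhost:11434/v1"))
```

If the result is `None` or does not include the model name passed to `run_loop`, the server is probably not running or the model has not been pulled yet.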

LangChain

from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from opendesk.integrations.langchain_compat import as_langchain_tools
from opendesk.registry import create_registry

tools = as_langchain_tools(create_registry())
agent = create_react_agent(ChatAnthropic(model="claude-opus-4-6"), tools)

Build from source

cd python
pip install -e '.[core,mcp]'

Docs

License

MIT
