Anthropic Computer Use with Modal Sandboxes

Anthropic Computer Use on Modal

Anthropic's recent release of Computer Use is fantastic, but spinning up ad hoc Docker images didn't fit into our production workflow. Luckily, we use Modal, which has all the primitives we need to implement the Computer Use API.

If you're curious about why this library exists, check out this blog post.

This library provides an out-of-the-box implementation that can be deployed into your Modal environment. Everything runs in a Sandbox, with tool calls being translated into Modal API calls. It may or may not spectacularly explode. It's also quite slow at the moment. Caveat emptor!

Features

  • Deploys into its own app that can be called from your existing apps
  • Sandboxes scale to zero and are resumable
  • VNC tunnel to each sandbox for debugging
  • One NFS per sandbox, available for inspection
  • Image processing outside the sandbox, greatly speeding up screenshot generation
  • Fuzzy matching for the Edit tool, since the model often misses a newline or two
  • Hardware-accelerated browsing in the sandbox
  • Pre-warming of the sandbox for faster startup times
  • Tools for the LLM to work faster, such as apt-fast
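
To illustrate the fuzzy-matching idea from the list above, here is a minimal sketch using only the standard library's difflib. It is not this library's actual implementation: the function name fuzzy_replace, the threshold and slack parameters, and the line-window strategy are all assumptions made for illustration. It tries an exact match first, then falls back to the most similar run of lines, allowing the window to be a couple of lines longer or shorter than the requested text.

```python
from difflib import SequenceMatcher


def fuzzy_replace(text: str, old: str, new: str,
                  threshold: float = 0.9, slack: int = 2) -> str:
    """Replace `old` with `new` in `text`, tolerating small mismatches.

    Hypothetical helper for illustration; not part of computer-use-modal.
    """
    # Fast path: exact match, replace only the first occurrence.
    if old in text:
        return text.replace(old, new, 1)

    lines = text.splitlines(keepends=True)
    target = old.strip("\n")
    size = max(1, len(target.splitlines()))

    # Slide windows of nearby sizes over the file and keep the most similar.
    best_ratio, best_start, best_width = 0.0, -1, 0
    for width in range(max(1, size - slack), size + slack + 1):
        for start in range(len(lines) - width + 1):
            window = "".join(lines[start:start + width]).strip("\n")
            ratio = SequenceMatcher(None, window, target).ratio()
            if ratio > best_ratio:
                best_ratio, best_start, best_width = ratio, start, width

    if best_ratio < threshold:
        raise ValueError(
            f"no match above {threshold:.2f} (best was {best_ratio:.2f})"
        )

    # Keep the trailing newline of the replaced window, if it had one.
    end = best_start + best_width
    tail = "\n" if lines[end - 1].endswith("\n") else ""
    return "".join(lines[:best_start]) + new.strip("\n") + tail + "".join(lines[end:])
```

For example, if the file contains a blank line between two statements and the model's edit request omits it, the exact match fails but the three-line window still scores well above the threshold, so the edit is applied. A lower threshold accepts sloppier matches at the risk of editing the wrong span.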

Installation

You can install this library without cloning the repo by running:

pip install computer-use-modal

To use it in your own project, simply deploy it once:

modal deploy computer_use_modal

Then you can use it in your app like this:

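import asyncio
from uuid import uuid4

from modal import Cls


async def main():
    server = Cls.lookup("anthropic-computer-use-modal", "ComputerUseServer")
    # .remote.aio() returns a coroutine, so it has to be awaited
    response = await server.messages_create.remote.aio(
        request_id=uuid4().hex,
        user_messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    )
    print(response)


asyncio.run(main())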
{
    "role": "assistant",
    "content": [
        BetaTextBlock(
            text="According to the National Weather Service, the current weather in San Francisco is:\n\nTemperature: 65°F (18°C)\nHumidity: 53%\nDewpoint: 48°F (9°C)\nLast update: October 23, 2:43 PM PDT\n\nThe website shows the forecast details as well. Would you like me to provide the extended forecast for the coming days?",
            type="text",
        )
    ]
}

You can also watch the progress with a VNC tunnel:

manager = Cls.lookup("anthropic-computer-use-modal", "SandboxManager")
urls = manager.debug_urls.remote()
print(urls["vnc"])
"https://x2xzanmu4yg.r9.modal.host"

If you want to stream the responses, you can use ComputerUseServer.messages_create_gen.
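
A rough sketch of consuming the stream might look like the following. The collect helper is hypothetical, and the sketch assumes that messages_create_gen accepts the same request_id and user_messages arguments as messages_create and that it is invoked through Modal's remote_gen.aio generator interface; neither is confirmed by this README, so check the source before relying on it.

```python
import asyncio
from typing import Any, AsyncIterator
from uuid import uuid4


async def collect(stream: AsyncIterator[Any]) -> list:
    """Print each streamed item as it arrives and return them all."""
    items = []
    async for item in stream:
        print(item)
        items.append(item)
    return items


async def stream_weather() -> list:
    # Imported lazily so collect() can be reused without Modal installed.
    from modal import Cls

    server = Cls.lookup("anthropic-computer-use-modal", "ComputerUseServer")
    # Assumption: same keyword arguments as messages_create.
    stream = server.messages_create_gen.remote_gen.aio(
        request_id=uuid4().hex,
        user_messages=[
            {"role": "user", "content": "What is the weather in San Francisco?"}
        ],
    )
    return await collect(stream)


# Requires the app to be deployed first (modal deploy computer_use_modal):
# asyncio.run(stream_weather())
```

Keeping the consumer separate from the Modal call makes it easy to swap in another sink, such as a websocket or a Streamlit placeholder, without touching the request code.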

Demo

You can clone this repo and run two demos locally.

CLI Demo

This demo will deploy an ephemeral Modal app, and ask the LLM to browse the web to fetch the weather in San Francisco. Screenshots will be shown in your terminal so you can follow along!

git clone https://github.com/yasyf/anthropic-tool-use-modal
cd anthropic-tool-use-modal
uv sync
modal run computer_use_modal.demo

Streamlit Demo

This demo deploys the app persistently to its own namespace in your Modal account, then starts a Streamlit app that can interact with it.

git clone https://github.com/yasyf/anthropic-tool-use-modal
cd anthropic-tool-use-modal
uv sync --dev
modal deploy computer_use_modal
python -m streamlit run computer_use_modal/streamlit.py

Thanks

Thanks to the Anthropic team for the awesome starting point!
