Anthropic Computer Use with Modal Sandboxes

Project description

Anthropic Computer Use on Modal

Anthropic's recent release of Computer Use is fantastic, but spinning up random Docker images didn't fit into our production workflow. Luckily, we use Modal, which has all the primitives we need to implement the Computer Use API.

This library provides an out-of-the-box implementation that can be deployed into your Modal environment. Everything runs in a Sandbox, with tool calls being translated into Modal API calls. It may or may not spectacularly explode. It's also quite slow at the moment. Caveat emptor!

Features

  • Deploys into its own app that can be called from your existing apps
  • Sandboxes scale to zero and are resumable
  • VNC tunnel to each sandbox for debugging
  • One NFS per sandbox, available for inspection
  • Image processing outside the sandbox, greatly speeding up screenshot generation
  • Fuzzy matching for the Edit tool, since the model often misses a newline or two
  • Hardware-accelerated browsing in the sandbox
  • Pre-warming of the sandbox for faster startup times
  • Tools for the LLM to work faster, such as apt-fast
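The fuzzy-matching feature listed above can be illustrated with a small sketch. This is not the library's actual implementation, which is not shown here. It assumes a line-window search scored with the standard library's `difflib.SequenceMatcher`; the `fuzzy_replace` name and the `0.9` threshold are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_replace(text: str, old: str, new: str, threshold: float = 0.9) -> str:
    """Replace `old` with `new` in `text`, tolerating small mismatches.

    Hypothetical helper for illustration, not the library's own code.
    """
    # Fast path: a single exact match needs no fuzzy logic.
    if text.count(old) == 1:
        return text.replace(old, new)

    lines = text.splitlines(keepends=True)
    n = max(1, len(old.splitlines()))
    best_ratio, best_span = 0.0, None

    # Also try windows one line shorter or longer than `old`, since the
    # model may have dropped or added a newline.
    for size in (n - 1, n, n + 1):
        if size < 1:
            continue
        for start in range(0, len(lines) - size + 1):
            candidate = "".join(lines[start:start + size])
            ratio = SequenceMatcher(None, candidate.strip(), old.strip()).ratio()
            if ratio > best_ratio:
                best_ratio, best_span = ratio, (start, start + size)

    if best_span is None or best_ratio < threshold:
        raise ValueError(f"no match above threshold (best ratio {best_ratio:.2f})")

    start, end = best_span
    # Whole lines are replaced, so keep the trailing newline intact.
    replacement = new if new.endswith("\n") else new + "\n"
    return "".join(lines[:start]) + replacement + "".join(lines[end:])
```

Because the search works on whole lines, an `old` string that omits a blank line can still match the intended region, while unrelated text raises a `ValueError` instead of silently editing the wrong place.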

Installation

You can install this library without cloning the repo by running:

pip install computer-use-modal

To use it in your own project, simply deploy it once:

modal deploy computer_use_modal

Then you can use it in your app like this:

from uuid import uuid4

from modal import Cls

server = Cls.lookup("anthropic-computer-use-modal", "ComputerUseServer")
# .remote.aio() returns a coroutine, so await it from inside an async function
response = await server.messages_create.remote.aio(
    request_id=uuid4().hex,
    user_messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
)
print(response)
{
    "role": "assistant",
    "content": [
        BetaTextBlock(
            text="According to the National Weather Service, the current weather in San Francisco is:\n\nTemperature: 65°F (18°C)\nHumidity: 53%\nDewpoint: 48°F (9°C)\nLast update: October 23, 2:43 PM PDT\n\nThe website shows the forecast details as well. Would you like me to provide the extended forecast for the coming days?",
            type="text",
        )
    ]
}

You can also watch the progress with a VNC tunnel:

manager = Cls.lookup("anthropic-computer-use-modal", "SandboxManager")
urls = manager.debug_urls.remote()
print(urls["vnc"])
"https://x2xzanmu4yg.r9.modal.host"

If you want to stream the responses, you can use ComputerUseServer.messages_create_gen.
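A rough sketch of consuming the stream follows. It rests on two assumptions that this README does not confirm: that `messages_create_gen` accepts the same arguments as `messages_create`, and that each streamed item has the same shape as the non-streaming response shown earlier. The `iter_text` and `stream_weather` helpers are hypothetical names. `remote_gen` is Modal's standard way to call a generator function.

```python
from typing import Any, Iterable, Iterator


def iter_text(messages: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield the text of every text block in a stream of messages.

    Assumes each item looks like {"role": ..., "content": [blocks]},
    where text blocks expose `.type == "text"` and `.text`.
    """
    for message in messages:
        for block in message.get("content", []):
            if getattr(block, "type", None) == "text":
                yield block.text


def stream_weather() -> None:
    # Imported lazily so iter_text can be used without Modal installed.
    from uuid import uuid4

    from modal import Cls

    server = Cls.lookup("anthropic-computer-use-modal", "ComputerUseServer")
    # Assumption: same keyword arguments as messages_create.
    stream = server.messages_create_gen.remote_gen(
        request_id=uuid4().hex,
        user_messages=[{"role": "user", "content": "What is the weather in San Francisco?"}],
    )
    for text in iter_text(stream):
        print(text)
```

Keeping the text extraction separate from the Modal call makes it easy to check against recorded messages before pointing it at a live sandbox.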

Demo

You can clone this repo and run two demos locally.

CLI Demo

This demo will deploy an ephemeral Modal app, and ask the LLM to browse the web to fetch the weather in San Francisco. Screenshots will be shown in your terminal so you can follow along!

git clone https://github.com/yasyf/anthropic-tool-use-modal
cd anthropic-tool-use-modal
uv sync
modal run computer_use_modal.demo

Streamlit Demo

This demo deploys the app persistently to its own namespace in your Modal account, then starts a Streamlit app that can interact with it.

git clone https://github.com/yasyf/anthropic-tool-use-modal
cd anthropic-tool-use-modal
uv sync --dev
modal deploy computer_use_modal
python -m streamlit run computer_use_modal/streamlit.py

Thanks

Thanks to the Anthropic team for the awesome starting point!


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

computer_use_modal-0.1.1.tar.gz (21.6 kB)

Uploaded Source

Built Distribution

computer_use_modal-0.1.1-py3-none-any.whl (27.6 kB)

Uploaded Python 3

File details

Details for the file computer_use_modal-0.1.1.tar.gz.

File hashes

Hashes for computer_use_modal-0.1.1.tar.gz

Algorithm     Hash digest
SHA256        13e7574e95870615c01e5660e7b66be842bc17651cb343eb4b5783086325b53a
MD5           0c1a633d4a59220afe974637c7b2612a
BLAKE2b-256   d97c526b708b8cc3227356d3b472ff087e1058e5145630297e55fa967f124419

File details

Details for the file computer_use_modal-0.1.1-py3-none-any.whl.

File hashes

Hashes for computer_use_modal-0.1.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        7cef6cfcfa45877eb57bf40fe3404fe1ce5fa79b348b2dd3fc9c202b80b9fe71
MD5           09060ba33f3abec39b5c8c9a5ccafc41
BLAKE2b-256   27041521d0bd0965852f48595e8fd0efa3f10f8489672a9b96361fa6617f0cde
