Skip to main content

An implementation of a computer use agent (CUA) using LangGraph

Project description

🤖 LangGraph Computer Use Agent (CUA)

[!WARNING] THIS REPO IS A WORK IN PROGRESS AND NOT INTENDED FOR USE YET

A Python library for creating computer use agent (CUA) systems using LangGraph. A CUA is a type of agent which has the ability to interact with a computer to preform tasks.

Short demo video:

[!TIP] This demo used the following prompt:

I want to contribute to the LangGraph.js project. Please find the GitHub repository, and inspect the read me,
along with some of the issues and open pull requests. Then, report back with a plan of action to contribute.

This library is built on top of LangGraph, a powerful framework for building agent applications, and comes with out-of-box support for streaming, short-term and long-term memory and human-in-the-loop.

Installation

pip install langgraph-cua

Quickstart

This project by default uses Scrapybara for accessing a virtual machine to run the agent. To use LangGraph CUA, you'll need both OpenAI and Scrapybara API keys.

export OPENAI_API_KEY=<your_api_key>
export SCRAPYBARA_API_KEY=<your_api_key>

Then, create the graph by importing the create_cua function from the langgraph_cua module.

from langgraph_cua import create_cua
from dotenv import load_dotenv
from langchain_core.messages import HumanMessage, SystemMessage

# Load environment variables from .env file
load_dotenv()


cua_graph = create_cua()

# Define the input messages
messages = [
    SystemMessage(
        content=(
            "You're an advanced AI computer use assistant. The browser you are using "
            "is already initialized, and visiting google.com."
        )
    ),
    HumanMessage(
        content=(
            "I want to contribute to the LangGraph.js project. Please find the GitHub repository, and inspect the read me, "
            "along with some of the issues and open pull requests. Then, report back with a plan of action to contribute."
        )
    ),
]

async def main():
    # Stream the graph execution
    stream = cua_graph.astream(
        {"messages": messages},
        stream_mode="updates"
    )

    # Process the stream updates
    async for update in stream:
        if "create_vm_instance" in update:
            print("VM instance created")
            stream_url = update.get("create_vm_instance", {}).get("stream_url")
            # Open this URL in your browser to view the CUA stream
            print(f"Stream URL: {stream_url}")

    print("Done")

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

The above example will invoke the graph, passing in a request for it to do some research into LangGraph.js from the standpoint of a new contributor. The code will log the stream URL, which you can open in your browser to view the CUA stream.

You can find more examples inside the examples directory.

How to customize

The create_cua function accepts a few configuration parameters. These are the same configuration parameters that the graph accepts, along with recursion_limit.

You can either pass these parameters when calling create_cua, or at runtime when invoking the graph by passing them to the config object.

Configuration Parameters

  • scrapybara_api_key: The API key to use for Scrapybara. If not provided, it defaults to reading the SCRAPYBARA_API_KEY environment variable.
  • timeout_hours: The number of hours to keep the virtual machine running before it times out.
  • zdr_enabled: Whether or not Zero Data Retention is enabled in the user's OpenAI account. If True, the agent will not pass the previous_response_id to the model, and will always pass it the full message history for each request. If False, the agent will pass the previous_response_id to the model, and only the latest message in the history will be passed. Default False.
  • recursion_limit: The maximum number of recursive calls the agent can make. Default is 100. This is greater than the standard default of 25 in LangGraph, because computer use agents are expected to take more iterations.
  • auth_state_id: The ID of the authentication state. If defined, it will be used to authenticate with Scrapybara. Only applies if 'environment' is set to 'web'.
  • environment: The environment to use. Default is web. Options are web, ubuntu, and windows.

Auth States

LangGraph CUA integrates with Scrapybara's auth states API to persist browser authentication sessions. This allows you to authenticate once (e.g., logging into Amazon) and reuse that session in future runs.

Using Auth States

Pass an auth_state_id when creating your CUA graph:

from langgraph_cua import create_cua

cua_graph = create_cua(auth_state_id="<your_auth_state_id>")

The graph stores this ID in the authenticated_id state field. If you change the auth_state_id in future runs, the graph will automatically reauthenticate.

Managing Auth States with Scrapybara SDK

Save an Auth State

from scrapybara import Scrapybara

client = Scrapybara(api_key="<api_key>")
instance = client.get("<instance_id>")
auth_state_id = instance.save_auth(name="example_site").auth_state_id

Modify an Auth State

client = Scrapybara(api_key="<api_key>")
instance = client.get("<instance_id>")
instance.modify_auth(auth_state_id="your_existing_auth_state_id", name="renamed_auth_state")

[!NOTE] To apply changes to an auth state in an existing run, set the authenticated_id state field to None to trigger re-authentication.

Zero Data Retention (ZDR)

LangGraph CUA supports Zero Data Retention (ZDR) via the zdr_enabled configuration parameter. When set to true, the graph will not assume it can use the previous_message_id, and all AI & tool messages will be passed to the OpenAI on each request.

Development

To get started with development, first clone the repository:

git clone https://github.com/langchain-ai/langgraph-cua.git

Create a virtual environment:

uv venv

Activate it:

source .venv/bin/activate

Then, install dependencies:

uv sync --all-groups

Next, set the required environment variables:

cp .env.example .env

Finally, you can then run the integration tests:

pytest -xvs tests/integration/test_cua.py

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

langgraph_cua-0.0.0.tar.gz (13.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

langgraph_cua-0.0.0-py3-none-any.whl (14.0 kB view details)

Uploaded Python 3

File details

Details for the file langgraph_cua-0.0.0.tar.gz.

File metadata

  • Download URL: langgraph_cua-0.0.0.tar.gz
  • Upload date:
  • Size: 13.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langgraph_cua-0.0.0.tar.gz
Algorithm Hash digest
SHA256 f94e7fe0c4071e8b741c82e25cfa9d6d6edd05619fa886e959314251182a8fce
MD5 f3be76ba16c9fa7fc248d0a62a1c9c31
BLAKE2b-256 d161393fac450e445d8674816fb6ee9f9dc7fc5e1a5e91d23ce25c33db342be8

See more details on using hashes here.

File details

Details for the file langgraph_cua-0.0.0-py3-none-any.whl.

File metadata

  • Download URL: langgraph_cua-0.0.0-py3-none-any.whl
  • Upload date:
  • Size: 14.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langgraph_cua-0.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ae4dd04a8bebd38003ca284cd1e5981cb458e15328c3b439c38ad046bbabd0f3
MD5 8513b5e277561f14c374af2a8c3362b2
BLAKE2b-256 b420bbce769247723131591d274d979b389e5f6b1effa16aaf6bd917e40a6632

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page