Python SDK for Stagehand

An AI web browsing framework focused on simplicity and extensibility.

NOTE: This is the Python SDK for Stagehand. The original implementation is written in TypeScript.

Stagehand is the easiest way to build browser automations with AI-powered interactions.

  • act: Instruct the AI to perform an action (e.g. click a button or scroll).
    await stagehand.page.act("click on the 'Quickstart' button")
  • extract: Extract and validate data from a page using a JSON schema (written by hand or generated from a Pydantic model).
    await stagehand.page.extract("the summary of the first paragraph")
  • observe: Ask in natural language for elements on the page, for example to identify selectors or elements in the DOM.
    await stagehand.page.observe("find the search bar")
  • agent: Execute autonomous multi-step tasks with provider-specific agents (OpenAI, Anthropic, etc.).
    await stagehand.agent.execute("book a reservation for 2 people for a trip to the Maldives")

Installation

Install the Python package via pip:

pip install stagehand-py

Requirements

  • Python 3.7+
  • httpx (for async client)
  • requests (for sync client)
  • asyncio (part of the Python standard library; used by the async client)
  • pydantic
  • python-dotenv (optional, for .env support)
  • playwright
  • rich (for examples/ terminal support)

To install these dependencies from requirements.txt, run:

pip install -r requirements.txt

Environment Variables

Before running your script, set the following environment variables:

export BROWSERBASE_API_KEY="your-api-key"
export BROWSERBASE_PROJECT_ID="your-project-id"
export MODEL_API_KEY="your-openai-api-key"  # or your preferred model's API key
export STAGEHAND_SERVER_URL="url-of-stagehand-server"

You can also make a copy of .env.example and add these to your .env file.
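
A script started without these variables tends to fail later with a less obvious error, so it can help to check them up front. The following is a minimal sketch using only the standard library. The helper name `missing_env_vars` is hypothetical and not part of the SDK; the variable names are the ones listed above.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = (
    "BROWSERBASE_API_KEY",
    "BROWSERBASE_PROJECT_ID",
    "MODEL_API_KEY",
    "STAGEHAND_SERVER_URL",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Demonstration with a sample mapping instead of the real environment.
sample = {"BROWSERBASE_API_KEY": "key", "MODEL_API_KEY": ""}
print(missing_env_vars(sample))
# ['BROWSERBASE_PROJECT_ID', 'MODEL_API_KEY', 'STAGEHAND_SERVER_URL']
```

Calling `missing_env_vars()` with no argument checks the real process environment, for example right after `load_dotenv()`.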

Quickstart

Stagehand supports both synchronous and asynchronous usage. Here are examples for both approaches:

Sync Client

import os
from stagehand.sync import Stagehand, StagehandConfig
from dotenv import load_dotenv

load_dotenv()

def main():
    # Configure Stagehand
    config = StagehandConfig(
        env="BROWSERBASE",
        api_key=os.getenv("BROWSERBASE_API_KEY"),
        project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
        model_name="gpt-4o",
        model_client_options={"apiKey": os.getenv("MODEL_API_KEY")}
    )

    # Initialize Stagehand
    stagehand = Stagehand(config=config, server_url=os.getenv("STAGEHAND_SERVER_URL"))
    stagehand.init()
    print(f"Session created: {stagehand.session_id}")

    # Navigate to a page
    stagehand.page.goto("https://google.com/")

    # Use Stagehand AI primitives
    stagehand.page.act("search for openai")

    # Combine with Playwright
    stagehand.page.keyboard.press("Enter")

    # Observe elements on the page
    observed = stagehand.page.observe("find the news button")
    if observed:
        stagehand.page.act(observed[0])  # Act on the first observed element

    # Extract data from the page
    data = stagehand.page.extract("extract the first result from the search")
    print(f"Extracted data: {data}")

    # Close the session
    stagehand.close()

if __name__ == "__main__":
    main()

Async Client

import os
import asyncio
from stagehand import Stagehand, StagehandConfig
from dotenv import load_dotenv

load_dotenv()

async def main():
    # Configure Stagehand
    config = StagehandConfig(
        env="BROWSERBASE",
        api_key=os.getenv("BROWSERBASE_API_KEY"),
        project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
        model_name="gpt-4o",
        model_client_options={"apiKey": os.getenv("MODEL_API_KEY")}
    )

    # Initialize Stagehand
    stagehand = Stagehand(config=config, server_url=os.getenv("STAGEHAND_SERVER_URL"))
    await stagehand.init()
    print(f"Session created: {stagehand.session_id}")
    
    # Get page reference
    page = stagehand.page

    # Navigate to a page
    await page.goto("https://google.com/")

    # Use Stagehand AI primitives
    await page.act("search for openai")

    # Combine with Playwright
    await page.keyboard.press("Enter")

    # Observe elements on the page
    observed = await page.observe("find the news button")
    if observed:
        await page.act(observed[0])  # Act on the first observed element

    # Extract data from the page
    data = await page.extract("extract the first result from the search")
    print(f"Extracted data: {data}")

    # Close the session
    await stagehand.close()

if __name__ == "__main__":
    asyncio.run(main())
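
In the example above, `close()` is only reached if every earlier step succeeds. One way to guarantee cleanup is to wrap the `init()`/`close()` pair in an async context manager. This is a sketch under assumptions: `managed_session` is a hypothetical helper, and `FakeClient` is a stand-in that only records calls so the snippet runs without a Stagehand server. Only the method names `init` and `close` come from the example above.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def managed_session(client):
    """Call client.init() on entry and client.close() on exit, even on error."""
    await client.init()
    try:
        yield client
    finally:
        await client.close()


class FakeClient:
    """Stand-in for a Stagehand client (hypothetical); it only records calls."""

    def __init__(self):
        self.calls = []

    async def init(self):
        self.calls.append("init")

    async def close(self):
        self.calls.append("close")


async def demo():
    client = FakeClient()
    try:
        async with managed_session(client):
            raise RuntimeError("simulated failure mid-automation")
    except RuntimeError:
        pass  # The error is swallowed here only to inspect the recorded calls.
    return client.calls


calls = asyncio.run(demo())
print(calls)  # ['init', 'close']
```

With a real client, the body of the `async with` block would hold the `goto`, `act`, `observe`, and `extract` calls.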

Agent Example

import os
from stagehand.sync import Stagehand, StagehandConfig
from stagehand.schemas import AgentConfig, AgentExecuteOptions, AgentProvider
from dotenv import load_dotenv

load_dotenv()

def main():
    # Configure Stagehand
    config = StagehandConfig(
        env="BROWSERBASE",
        api_key=os.getenv("BROWSERBASE_API_KEY"),
        project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
        model_name="gpt-4o",
        model_client_options={"apiKey": os.getenv("MODEL_API_KEY")}
    )

    # Initialize Stagehand
    stagehand = Stagehand(config=config, server_url=os.getenv("STAGEHAND_SERVER_URL"))
    stagehand.init()
    print(f"Session created: {stagehand.session_id}")
    
    # Navigate to Google
    stagehand.page.goto("https://google.com/")
    
    # Configure the agent
    agent_config = AgentConfig(
        provider=AgentProvider.OPENAI,
        model="computer-use-preview",
        instructions="You are a helpful web navigation assistant. You are currently on google.com.",
        options={"apiKey": os.getenv("MODEL_API_KEY")}
    )
    
    # Define execution options
    execute_options = AgentExecuteOptions(
        instruction="Search for 'latest AI news' and extract the titles of the first 3 results",
        max_steps=10,
        auto_screenshot=True
    )
    
    # Execute the agent task
    agent_result = stagehand.agent.execute(agent_config, execute_options)
    
    print(f"Agent execution result: {agent_result}")
    
    # Close the session
    stagehand.close()

if __name__ == "__main__":
    main()

Pydantic Schemas

  • ActOptions

    The ActOptions model takes an action field that tells the AI what to do on the page, plus optional fields such as useVision and variables:

    from stagehand.schemas import ActOptions
    
    # Example:
    await page.act(ActOptions(action="click on the 'Quickstart' button"))
    
  • ObserveOptions

    The ObserveOptions model lets you find elements on the page using natural language. The onlyVisible option helps limit the results:

    from stagehand.schemas import ObserveOptions
    
    # Example:
    await page.observe(ObserveOptions(instruction="find the button labeled 'News'", onlyVisible=True))
    
  • ExtractOptions

    The ExtractOptions model extracts structured data from the page. Pass your instructions and a schema defining your expected data format. Note: If you are using a Pydantic model for the schema, call its .model_json_schema() method to ensure JSON serializability.

    from stagehand.schemas import ExtractOptions
    from pydantic import BaseModel
    
    class DescriptionSchema(BaseModel):
        description: str
    
    # Example:
    data = await page.extract(
        ExtractOptions(
            instruction="extract the description of the page",
            schemaDefinition=DescriptionSchema.model_json_schema()
        )
    )
    description = data.get("description") if isinstance(data, dict) else data.description
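
The last line of the example above handles a result that may be either a dict or an object with attributes. If that check recurs across a script, a small helper keeps it in one place. This is a minimal sketch: `get_field` is a hypothetical name, not an SDK function, and it makes no assumption about which of the two shapes the SDK returns.

```python
from types import SimpleNamespace


def get_field(result, name, default=None):
    """Read `name` from a dict result or from an attribute-style result."""
    if isinstance(result, dict):
        return result.get(name, default)
    return getattr(result, name, default)


# Both result shapes yield the same value.
print(get_field({"description": "A docs page"}, "description"))
print(get_field(SimpleNamespace(description="A docs page"), "description"))
```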
    

Why?

Stagehand adds determinism to otherwise unpredictable agents.

While there's no limit to what you could instruct Stagehand to do, our primitives allow you to control how much you want to leave to an AI. It works best when your code is a sequence of atomic actions. Instead of writing a single script for a single website, Stagehand allows you to write durable, self-healing, and repeatable web automation workflows that actually work.

NOTE: Stagehand is currently available as an early release, and we're actively seeking feedback from the community. Please join our Slack community to stay updated on the latest developments and provide feedback.

Configuration

Stagehand can be configured via environment variables or through a StagehandConfig object. Available configuration options include:

  • stagehand_server_url: URL of the Stagehand API server.
  • browserbase_api_key: Your Browserbase API key (BROWSERBASE_API_KEY).
  • browserbase_project_id: Your Browserbase project ID (BROWSERBASE_PROJECT_ID).
  • model_api_key: Your model API key (e.g. OpenAI, Anthropic, etc.) (MODEL_API_KEY).
  • verbose: Verbosity level (default: 1).
    • Level 0: Error logs
    • Level 1: Basic info logs (minimal, maps to INFO level)
    • Level 2: Medium logs including warnings (maps to WARNING level)
    • Level 3: Detailed debug information (maps to DEBUG level)
  • model_name: Optional model name for the AI (e.g. "gpt-4o").
  • dom_settle_timeout_ms: Additional time (in ms) to wait for the DOM to settle.
  • debug_dom: Enable debug mode for DOM operations.
  • stream_response: Whether to stream responses from the server (default: True).
  • timeout_settings: Custom timeout settings for HTTP requests.
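
The verbosity levels in the list above can be mirrored in application code with the standard `logging` module. The sketch below copies the documented mapping as written; it is not taken from the SDK source. `log_level_for` is a hypothetical helper, mapping level 0 to `logging.ERROR` is an assumption based on the "Error logs" description, and clamping out-of-range values is an added choice.

```python
import logging

# Mapping copied from the documented verbosity levels (illustrative only).
VERBOSE_TO_LOG_LEVEL = {
    0: logging.ERROR,    # Level 0: error logs (assumed to mean logging.ERROR)
    1: logging.INFO,     # Level 1: basic info logs
    2: logging.WARNING,  # Level 2: medium logs including warnings
    3: logging.DEBUG,    # Level 3: detailed debug information
}


def log_level_for(verbose=1):
    """Translate a verbosity setting into a logging level, clamping to 0..3."""
    clamped = max(0, min(3, int(verbose)))
    return VERBOSE_TO_LOG_LEVEL[clamped]


print(logging.getLevelName(log_level_for(1)))  # INFO
print(logging.getLevelName(log_level_for(3)))  # DEBUG
```

In the standard library, `WARNING` is a higher threshold than `INFO`, so a logger set to `WARNING` emits less than one set to `INFO`. The table reproduces the documented mapping without trying to reconcile that.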

Example using a unified configuration:

from stagehand.config import StagehandConfig
import os

config = StagehandConfig(
    env="BROWSERBASE" if os.getenv("BROWSERBASE_API_KEY") and os.getenv("BROWSERBASE_PROJECT_ID") else "LOCAL",
    api_key=os.getenv("BROWSERBASE_API_KEY"),
    project_id=os.getenv("BROWSERBASE_PROJECT_ID"),
    debug_dom=True,
    headless=False,
    dom_settle_timeout_ms=3000,
    model_name="gpt-4o-mini",
    model_client_options={"apiKey": os.getenv("MODEL_API_KEY")},
    verbose=3  # Set verbosity level: 1=minimal, 2=medium, 3=detailed logs
)
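
The `env` argument above falls back to "LOCAL" unless both Browserbase credentials are present. Pulling that condition into a function makes it easier to read and to test. This is a sketch: `choose_env` is a hypothetical helper, not part of the SDK, and it reproduces only the conditional shown in the example.

```python
import os


def choose_env(environ=None):
    """Return "BROWSERBASE" when both credentials are set, otherwise "LOCAL"."""
    env = os.environ if environ is None else environ
    has_credentials = env.get("BROWSERBASE_API_KEY") and env.get("BROWSERBASE_PROJECT_ID")
    return "BROWSERBASE" if has_credentials else "LOCAL"


print(choose_env({"BROWSERBASE_API_KEY": "key", "BROWSERBASE_PROJECT_ID": "project"}))  # BROWSERBASE
print(choose_env({"BROWSERBASE_API_KEY": "key"}))  # LOCAL
```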

License

MIT License (c) 2025 Browserbase, Inc.
