browser-agent-protocol

Python SDK for the Browser Agent Protocol (BAP) - control browsers with AI agents.

Installation

pip install browser-agent-protocol

Quick Start

Async API (recommended)

import asyncio
from browseragentprotocol import BAPClient, role, label

async def main():
    async with BAPClient("ws://localhost:9222") as client:
        # Launch browser
        await client.launch(browser="chromium", headless=True)

        # Create page and navigate
        await client.create_page(url="https://example.com")

        # Click using semantic selectors
        await client.click(role("button", "Submit"))

        # Fill form fields
        await client.fill(label("Email"), "user@example.com")

        # Take screenshot
        screenshot = await client.screenshot()
        print(f"Screenshot: {len(screenshot.data)} bytes")

        # Get accessibility tree (ideal for AI agents)
        tree = await client.accessibility()
        print(f"Found {len(tree.tree)} nodes")

asyncio.run(main())

High-Level Session Helper

import asyncio
from browseragentprotocol.context import bap_session, role

async def main():
    # `async with` is only valid inside a coroutine
    async with bap_session(
        "ws://localhost:9222",
        start_url="https://example.com",
    ) as client:
        await client.click(role("button", "Accept"))
        content = await client.content()

asyncio.run(main())

Sync API (for scripts and notebooks)

from browseragentprotocol import BAPClientSync, role

with BAPClientSync("ws://localhost:9222") as client:
    client.launch(browser="chromium", headless=True)
    client.create_page(url="https://example.com")

    client.click(role("button", "Submit"))
    screenshot = client.screenshot()

CLI

# Test connection to a BAP server
bap connect ws://localhost:9222

# Get server info (with JSON output)
bap info ws://localhost:9222 --json

Semantic Selectors

BAP favors semantic selectors over brittle CSS selectors, which remain available as a fallback:

from browseragentprotocol import role, text, label, css, xpath, test_id, ref

# Recommended: Semantic selectors
role("button", "Submit")           # ARIA role + accessible name
text("Sign in")                    # Visible text content
label("Email address")             # Associated label

# Developer-controlled identifiers
test_id("submit-button")           # data-testid attribute

# Stable element references
ref("@submitBtn")                  # Element ref from agent/observe

# Fallback: CSS/XPath
css(".btn-primary")
xpath("//button[@type='submit']")
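
To illustrate why role-and-name selectors tend to survive markup changes better than CSS classes, the sketch below matches both kinds of selector against a toy accessibility tree. The `Node` dataclass and the `find_by_role` and `find_by_class` helpers are hypothetical and use only the standard library. They are not part of the SDK and make no claim about how BAP resolves selectors internally.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy accessibility-tree node (hypothetical, not an SDK type)."""
    role: str
    name: str = ""
    css_class: str = ""
    children: list["Node"] = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_by_role(root, role, name):
    """Return the nodes whose role and accessible name both match."""
    return [n for n in walk(root) if n.role == role and n.name == name]


def find_by_class(root, css_class):
    """Return the nodes carrying the given CSS class."""
    return [n for n in walk(root) if n.css_class == css_class]


# The same form before and after a restyle that renames the CSS class.
before = Node("form", children=[Node("button", "Submit", "btn-primary")])
after = Node("form", children=[Node("button", "Submit", "button--cta")])

print(len(find_by_role(before, "button", "Submit")))  # 1
print(len(find_by_role(after, "button", "Submit")))   # 1
print(len(find_by_class(after, "btn-primary")))       # 0
```

The role-based lookup finds the button in both versions of the form, while the class-based lookup stops matching once the class is renamed.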

AI Agent Methods

BAP provides three composite methods optimized for AI agents:

agent/observe - Get AI-optimized page snapshots

observation = await client.observe(
    include_accessibility=True,
    include_interactive_elements=True,
    include_screenshot=True,
    max_elements=50,
    annotate_screenshot=True,  # Set-of-Marks style annotation
)

# Interactive elements with stable refs
for element in observation.interactive_elements:
    print(f"{element.ref}: {element.role} - {element.name}")
    # @e1: button - Submit
    # @e2: textbox - Email

# Screenshot with numbered badges linking to elements
if observation.annotation_map:
    for annotation in observation.annotation_map:
        print(f"[{annotation.label}] -> {annotation.ref}")
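
An agent loop usually needs the observation as text it can place in a model prompt. The sketch below shows one way to do that, assuming elements expose the `ref`, `role`, and `name` fields used in the loop above. The `Element` dataclass is a stand-in so the example runs without a server, and `format_elements` is a hypothetical helper, not an SDK function.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Stand-in for an interactive element; field names follow the example above."""
    ref: str
    role: str
    name: str


def format_elements(elements, limit=50):
    """Render one line per element so a model can refer to elements by ref."""
    lines = [f"{el.ref}: {el.role} - {el.name}" for el in elements[:limit]]
    return "\n".join(lines)


elements = [
    Element("@e1", "button", "Submit"),
    Element("@e2", "textbox", "Email"),
]
print(format_elements(elements))
# @e1: button - Submit
# @e2: textbox - Email
```

A ref chosen by the model could then be turned back into a selector with `ref(...)`, as listed under Semantic Selectors.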

agent/act - Execute multi-step sequences atomically

from browseragentprotocol import BAPClient, label, role

result = await client.act([
    BAPClient.step("action/fill", {"selector": label("Email"), "value": "user@example.com"}),
    BAPClient.step("action/fill", {"selector": label("Password"), "value": "secret123"}),
    BAPClient.step("action/click", {"selector": role("button", "Sign In")}),
])

print(f"Completed {result.completed}/{result.total} steps")
print(f"Success: {result.success}")

agent/extract - Extract structured data

data = await client.extract(
    instruction="Extract all product names and prices",
    schema={
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "number"},
            },
        },
    },
)

if data.success:
    for product in data.data:
        print(f"{product['name']}: ${product['price']}")
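
Extracted data is derived from page content, so it may be worth checking its shape before using it. The sketch below is a hand-written, standard-library check for the product schema above. `validate_products` is a hypothetical helper, not an SDK function, and it is not a general JSON Schema validator.

```python
from numbers import Real


def validate_products(items):
    """Keep only entries shaped like {"name": str, "price": number}."""
    if not isinstance(items, list):
        raise TypeError("expected a list of products")
    valid = []
    for item in items:
        if not isinstance(item, dict):
            continue
        name, price = item.get("name"), item.get("price")
        # bool is a subclass of int, so exclude it explicitly.
        is_number = isinstance(price, Real) and not isinstance(price, bool)
        if isinstance(name, str) and is_number:
            valid.append({"name": name, "price": price})
    return valid


raw = [
    {"name": "Widget", "price": 9.99},
    {"name": "Gadget", "price": "n/a"},  # wrong price type, dropped
    {"price": 5},                        # missing name, dropped
]
print(validate_products(raw))  # [{'name': 'Widget', 'price': 9.99}]
```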

Multi-Context Support

Create isolated browser contexts with separate cookies/storage:

# Create isolated context
context = await client.create_context(
    context_id="user-session",
    options={
        "viewport": {"width": 1920, "height": 1080},
        "locale": "en-US",
    },
)

# Create page in specific context
page = await client.create_page(
    url="https://example.com",
    context_id=context.context_id,
)

# Clean up
await client.destroy_context(context.context_id)
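
If the code between `create_context` and `destroy_context` raises, the context is never cleaned up. One way to guard against that is an async context manager, sketched below. `isolated_context` is a hypothetical helper, not an SDK function, and `FakeClient` is a stand-in that mimics only the two calls shown above so the example runs without a server.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace


@asynccontextmanager
async def isolated_context(client, context_id, options=None):
    """Create a context and destroy it on exit, even after an error."""
    context = await client.create_context(
        context_id=context_id, options=options or {}
    )
    try:
        yield context
    finally:
        await client.destroy_context(context.context_id)


class FakeClient:
    """Stand-in that tracks live contexts; not the real client."""

    def __init__(self):
        self.live = set()

    async def create_context(self, context_id, options):
        self.live.add(context_id)
        return SimpleNamespace(context_id=context_id)

    async def destroy_context(self, context_id):
        self.live.discard(context_id)


async def demo():
    client = FakeClient()
    try:
        async with isolated_context(client, "user-session"):
            raise RuntimeError("page interaction failed")
    except RuntimeError:
        pass
    return client.live


leaked = asyncio.run(demo())
print(leaked)  # set()
```

The context is destroyed even though the body raised, so no contexts remain live.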

Frame Support

Navigate iframes and cross-origin frames:

# List frames
frames = await client.list_frames()
for frame in frames.frames:
    print(f"{frame.frame_id}: {frame.url}")

# Switch to iframe
await client.switch_frame(selector=css("iframe#payment"))

# Interact within frame
await client.fill(label("Card number"), "4242424242424242")

# Return to main frame
await client.main_frame()

Human-in-the-Loop Approval

Handle approval requests for sensitive actions:

def handle_approval(params):
    print(f"Approval needed: {params.rule}")
    print(f"Action: {params.original_request}")
    # In a real app, show UI to user
    return "approve"

client.on_approval_required(handle_approval)

# Respond to approval request
await client.respond_to_approval(
    request_id="...",
    decision="approve",  # or "deny", "approve-session"
    reason="User approved the action",
)
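
In practice a handler rarely approves everything. The sketch below shows a small policy function that could sit inside such a handler, assuming requests carry the `rule` attribute used above and that the decision strings are the ones listed in the comment. The rule names, the `decide` helper, and the `ask_user` callback are all hypothetical.

```python
from types import SimpleNamespace

# Hypothetical rule names; real names depend on the server's policy.
AUTO_APPROVE = {"navigate-internal"}
ALWAYS_DENY = {"submit-payment"}


def decide(params, ask_user):
    """Return "approve" or "deny" for an approval request.

    Known-dangerous rules are denied, known-safe ones approved, and
    anything else is passed to ask_user, a callable returning True or False.
    """
    if params.rule in ALWAYS_DENY:
        return "deny"
    if params.rule in AUTO_APPROVE:
        return "approve"
    return "approve" if ask_user(params) else "deny"


# A stand-in request object; the original_request payload is illustrative.
request = SimpleNamespace(
    rule="submit-payment",
    original_request={"method": "action/click"},
)
print(decide(request, ask_user=lambda p: True))  # deny
```

Checking the deny list first means a rule that is accidentally placed in both sets is still refused.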

Error Handling

from browseragentprotocol import (
    BAPError,
    BAPTimeoutError,
    BAPElementNotFoundError,
    BAPApprovalDeniedError,
    role,
)

try:
    await client.click(role("button", "Missing"))
except BAPTimeoutError as e:
    print(f"Timeout: {e.message}")
    if e.retryable:
        # Retry the operation
        pass
except BAPElementNotFoundError as e:
    print(f"Element not found: {e.details}")
except BAPApprovalDeniedError as e:
    print(f"Action denied: {e.message}")
except BAPError as e:
    print(f"Error {e.code}: {e.message}")
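
The `retryable` flag lends itself to a retry loop. The sketch below retries an awaitable operation with exponential backoff whenever the raised error has a truthy `retryable` attribute. `retry_async` is a hypothetical helper, not an SDK function, and `FakeTimeout` is a stand-in exception so the example runs without the SDK installed.

```python
import asyncio


class FakeTimeout(Exception):
    """Stand-in for an SDK error; only the retryable attribute is modelled."""
    retryable = True


async def retry_async(operation, attempts=3, base_delay=0.01):
    """Await operation(), retrying errors flagged retryable with backoff."""
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception as exc:
            last_try = attempt == attempts - 1
            if last_try or not getattr(exc, "retryable", False):
                raise
            # Double the delay after each failed attempt.
            await asyncio.sleep(base_delay * 2 ** attempt)


calls = 0


async def flaky_click():
    """Fail twice with a retryable error, then succeed."""
    global calls
    calls += 1
    if calls < 3:
        raise FakeTimeout("element not ready")
    return "clicked"


result = asyncio.run(retry_async(flaky_click))
print(result, calls)  # clicked 3
```

Errors without the flag, and the error from the final attempt, propagate unchanged, so the `except` clauses above still apply around the retried call.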

Requirements

  • Python 3.10+
  • aiohttp >= 3.9.0
  • pydantic >= 2.0.0
  • anyio >= 4.0.0
  • httpx >= 0.27.0
  • httpx-sse >= 0.4.0

License

Apache-2.0
