Skip to main content

Python SDK for the Browser Agent Protocol (BAP) - control browsers with AI agents

Project description

browser-agent-protocol

Python SDK for the Browser Agent Protocol (BAP) - control browsers with AI agents.

Installation

pip install browser-agent-protocol

Quick Start

Async API (recommended)

import asyncio
from browseragentprotocol import BAPClient, role, text, label

async def main():
    async with BAPClient("ws://localhost:9222") as client:
        # Launch browser
        await client.launch(browser="chromium", headless=True)

        # Create page and navigate
        await client.create_page(url="https://example.com")

        # Click using semantic selectors
        await client.click(role("button", "Submit"))

        # Fill form fields
        await client.fill(label("Email"), "user@example.com")

        # Take screenshot
        screenshot = await client.screenshot()
        print(f"Screenshot: {len(screenshot.data)} bytes")

        # Get accessibility tree (ideal for AI agents)
        tree = await client.accessibility()
        print(f"Found {len(tree.tree)} nodes")

asyncio.run(main())

High-Level Session Helper

from browseragentprotocol.context import bap_session, role

async with bap_session(
    "ws://localhost:9222",
    start_url="https://example.com"
) as client:
    await client.click(role("button", "Accept"))
    content = await client.content()

Sync API (for scripts and notebooks)

from browseragentprotocol import BAPClientSync, role

with BAPClientSync("ws://localhost:9222") as client:
    client.launch(browser="chromium", headless=True)
    client.create_page(url="https://example.com")

    client.click(role("button", "Submit"))
    screenshot = client.screenshot()

CLI

# Test connection to a BAP server
bap connect ws://localhost:9222

# Get server info (with JSON output)
bap info ws://localhost:9222 --json

Semantic Selectors

BAP uses semantic selectors instead of brittle CSS selectors:

from browseragentprotocol import role, text, label, css, xpath, test_id, ref

# Recommended: Semantic selectors
role("button", "Submit")           # ARIA role + accessible name
text("Sign in")                    # Visible text content
label("Email address")             # Associated label

# Developer-controlled identifiers
test_id("submit-button")           # data-testid attribute

# Stable element references
ref("@submitBtn")                  # Element ref from agent/observe

# Fallback: CSS/XPath
css(".btn-primary")
xpath("//button[@type='submit']")

AI Agent Methods

BAP provides three composite methods optimized for AI agents:

agent/observe - Get AI-optimized page snapshots

observation = await client.observe(
    include_accessibility=True,
    include_interactive_elements=True,
    include_screenshot=True,
    max_elements=50,
    annotate_screenshot=True,  # Set-of-Marks style annotation
)

# Interactive elements with stable refs
for element in observation.interactive_elements:
    print(f"{element.ref}: {element.role} - {element.name}")
    # @e1: button - Submit
    # @e2: textbox - Email

# Screenshot with numbered badges linking to elements
if observation.annotation_map:
    for annotation in observation.annotation_map:
        print(f"[{annotation.label}] -> {annotation.ref}")

agent/act - Execute multi-step sequences atomically

from browseragentprotocol import BAPClient

result = await client.act([
    BAPClient.step("action/fill", {"selector": label("Email"), "value": "user@example.com"}),
    BAPClient.step("action/fill", {"selector": label("Password"), "value": "secret123"}),
    BAPClient.step("action/click", {"selector": role("button", "Sign In")}),
])

print(f"Completed {result.completed}/{result.total} steps")
print(f"Success: {result.success}")

agent/extract - Extract structured data

data = await client.extract(
    instruction="Extract all product names and prices",
    schema={
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "number"},
            },
        },
    },
)

if data.success:
    for product in data.data:
        print(f"{product['name']}: ${product['price']}")

Multi-Context Support

Create isolated browser contexts with separate cookies/storage:

# Create isolated context
context = await client.create_context(
    context_id="user-session",
    options={
        "viewport": {"width": 1920, "height": 1080},
        "locale": "en-US",
    },
)

# Create page in specific context
page = await client.create_page(
    url="https://example.com",
    context_id=context.context_id,
)

# Clean up
await client.destroy_context(context.context_id)

Frame Support

Navigate iframes and cross-origin frames:

# List frames
frames = await client.list_frames()
for frame in frames.frames:
    print(f"{frame.frame_id}: {frame.url}")

# Switch to iframe
await client.switch_frame(selector=css("iframe#payment"))

# Interact within frame
await client.fill(label("Card number"), "4242424242424242")

# Return to main frame
await client.main_frame()

Human-in-the-Loop Approval

Handle approval requests for sensitive actions:

def handle_approval(params):
    print(f"Approval needed: {params.rule}")
    print(f"Action: {params.original_request}")
    # In a real app, show UI to user
    return "approve"

client.on_approval_required(handle_approval)

# Respond to approval request
await client.respond_to_approval(
    request_id="...",
    decision="approve",  # or "deny", "approve-session"
    reason="User approved the action",
)

Error Handling

from browseragentprotocol import (
    BAPError,
    BAPTimeoutError,
    BAPElementNotFoundError,
    BAPApprovalDeniedError,
)

try:
    await client.click(role("button", "Missing"))
except BAPTimeoutError as e:
    print(f"Timeout: {e.message}")
    if e.retryable:
        # Retry the operation
        pass
except BAPElementNotFoundError as e:
    print(f"Element not found: {e.details}")
except BAPApprovalDeniedError as e:
    print(f"Action denied: {e.message}")
except BAPError as e:
    print(f"Error {e.code}: {e.message}")

Requirements

  • Python 3.10+
  • aiohttp >= 3.9.0
  • pydantic >= 2.0.0
  • anyio >= 4.0.0
  • httpx >= 0.27.0
  • httpx-sse >= 0.4.0

License

Apache-2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

browser_agent_protocol-0.7.0.tar.gz (34.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

browser_agent_protocol-0.7.0-py3-none-any.whl (43.2 kB view details)

Uploaded Python 3

File details

Details for the file browser_agent_protocol-0.7.0.tar.gz.

File metadata

  • Download URL: browser_agent_protocol-0.7.0.tar.gz
  • Upload date:
  • Size: 34.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for browser_agent_protocol-0.7.0.tar.gz
Algorithm Hash digest
SHA256 1d2e008b8913589e70d8614eabfe2ceee28dbab205fa4dd907ae54a7d780bdf4
MD5 44c196bea9996fddcb8ebbfdc3fa17be
BLAKE2b-256 082268aa6c76723d5397572a2f95cf64ebb47272a2420540d035e766dab6044d

See more details on using hashes here.

Provenance

The following attestation bundles were made for browser_agent_protocol-0.7.0.tar.gz:

Publisher: release.yml on browseragentprotocol/bap

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file browser_agent_protocol-0.7.0-py3-none-any.whl.

File metadata

File hashes

Hashes for browser_agent_protocol-0.7.0-py3-none-any.whl
Algorithm Hash digest
SHA256 50043b55e4e11aba41a2e22e458f481d08f710befc67fab1e5a4422272cd012e
MD5 199cf3830dc1a7e22c57d9b8ad1c8f7f
BLAKE2b-256 8156d58ab4a8f73c8eb0baa7ebbc06a3ee4c56971bcbadd57d134a0f638d2bfd

See more details on using hashes here.

Provenance

The following attestation bundles were made for browser_agent_protocol-0.7.0-py3-none-any.whl:

Publisher: release.yml on browseragentprotocol/bap

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page