Python SDK for the Browser Agent Protocol (BAP) - control browsers with AI agents

browser-agent-protocol

Installation

pip install browser-agent-protocol

Quick Start

Async API (recommended)

import asyncio
from browseragentprotocol import BAPClient, role, text, label

async def main():
    async with BAPClient("ws://localhost:9222") as client:
        # Launch browser
        await client.launch(browser="chromium", headless=True)

        # Create page and navigate
        await client.create_page(url="https://example.com")

        # Click using semantic selectors
        await client.click(role("button", "Submit"))

        # Fill form fields
        await client.fill(label("Email"), "user@example.com")

        # Take screenshot
        screenshot = await client.screenshot()
        print(f"Screenshot: {len(screenshot.data)} bytes")

        # Get accessibility tree (ideal for AI agents)
        tree = await client.accessibility()
        print(f"Found {len(tree.tree)} nodes")

asyncio.run(main())

High-Level Session Helper

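import asyncio
from browseragentprotocol.context import bap_session, role

async def main():
    # `async with` is only valid inside a coroutine, so wrap the session in main()
    async with bap_session(
        "ws://localhost:9222",
        start_url="https://example.com"
    ) as client:
        await client.click(role("button", "Accept"))
        content = await client.content()

asyncio.run(main())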

Sync API (for scripts and notebooks)

from browseragentprotocol import BAPClientSync, role

with BAPClientSync("ws://localhost:9222") as client:
    client.launch(browser="chromium", headless=True)
    client.create_page(url="https://example.com")

    client.click(role("button", "Submit"))
    screenshot = client.screenshot()

CLI

# Test connection to a BAP server
bap connect ws://localhost:9222

# Get server info (with JSON output)
bap info ws://localhost:9222 --json

Semantic Selectors

BAP favors semantic selectors over brittle CSS selectors; CSS and XPath remain available as a fallback:

from browseragentprotocol import role, text, label, css, xpath, test_id, ref

# Recommended: Semantic selectors
role("button", "Submit")           # ARIA role + accessible name
text("Sign in")                    # Visible text content
label("Email address")             # Associated label

# Developer-controlled identifiers
test_id("submit-button")           # data-testid attribute

# Stable element references
ref("@submitBtn")                  # Element ref from agent/observe

# Fallback: CSS/XPath
css(".btn-primary")
xpath("//button[@type='submit']")
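
One way to apply this priority order is to try a list of selectors from most to least semantic and stop at the first one that works. The sketch below is a hypothetical helper, not part of the SDK: `first_match`, `action`, and `not_found` are names invented here, and it assumes that a failed lookup raises an exception (for this SDK that is likely `BAPElementNotFoundError`, but the type is passed in rather than assumed).

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Sequence, Type


async def first_match(
    action: Callable[[Any], Awaitable[Any]],
    selectors: Sequence[Any],
    not_found: Type[BaseException] = Exception,
) -> Any:
    """Apply `action` to each selector in order; return the first selector that works.

    Hypothetical helper: `action` would typically be `client.click`, and
    `not_found` the exception raised when no element matches.
    """
    if not selectors:
        raise ValueError("at least one selector is required")
    last_error: Optional[BaseException] = None
    for selector in selectors:
        try:
            await action(selector)
            return selector  # report which selector succeeded
        except not_found as exc:
            last_error = exc  # remember the failure and try the next selector
    assert last_error is not None
    raise last_error
```

A call might look like `await first_match(client.click, [role("button", "Submit"), test_id("submit-button"), css(".btn-primary")], BAPElementNotFoundError)`. Returning the winning selector makes it easy to log how often the fallbacks are actually needed.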

AI Agent Methods

BAP provides three composite methods optimized for AI agents:

agent/observe - Get AI-optimized page snapshots

observation = await client.observe(
    include_accessibility=True,
    include_interactive_elements=True,
    include_screenshot=True,
    max_elements=50,
    annotate_screenshot=True,  # Set-of-Marks style annotation
)

# Interactive elements with stable refs
for element in observation.interactive_elements:
    print(f"{element.ref}: {element.role} - {element.name}")
    # @e1: button - Submit
    # @e2: textbox - Email

# Screenshot with numbered badges linking to elements
if observation.annotation_map:
    for annotation in observation.annotation_map:
        print(f"[{annotation.label}] -> {annotation.ref}")
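
An agent usually needs these elements as text it can place in a model prompt. The following is a minimal sketch of that step; `describe_elements` is a hypothetical helper, and it assumes only the `ref`, `role`, and `name` attributes used in the loop above.

```python
from typing import Any, Iterable


def describe_elements(elements: Iterable[Any], limit: int = 50) -> str:
    """Render interactive elements one per line, e.g. '@e1 [button] Submit'."""
    lines = []
    for index, element in enumerate(elements):
        if index >= limit:
            break  # keep the prompt bounded
        name = element.name or "(unnamed)"  # some elements have no accessible name
        lines.append(f"{element.ref} [{element.role}] {name}")
    return "\n".join(lines)
```

For example, `describe_elements(observation.interactive_elements)` yields a compact listing whose refs can later be passed back through `ref(...)`.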

agent/act - Execute multi-step sequences atomically

from browseragentprotocol import BAPClient, label, role

result = await client.act([
    BAPClient.step("action/fill", {"selector": label("Email"), "value": "user@example.com"}),
    BAPClient.step("action/fill", {"selector": label("Password"), "value": "secret123"}),
    BAPClient.step("action/click", {"selector": role("button", "Sign In")}),
])

print(f"Completed {result.completed}/{result.total} steps")
print(f"Success: {result.success}")

agent/extract - Extract structured data

data = await client.extract(
    instruction="Extract all product names and prices",
    schema={
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "number"},
            },
        },
    },
)

if data.success:
    for product in data.data:
        print(f"{product['name']}: ${product['price']}")
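
The source does not say whether extracted rows are checked against the schema, so a caller may want a quick sanity check before using them. This is a minimal sketch using only the standard library; `invalid_rows` is a hypothetical helper that covers just the `string` and `number` types from the schema above and, unlike JSON Schema, treats every listed property as required.

```python
from typing import Any, Dict, List

# Maps the JSON Schema type names used above to Python types.
_JSON_TYPES = {"string": str, "number": (int, float)}


def invalid_rows(rows: List[Dict[str, Any]], item_schema: Dict[str, Any]) -> List[int]:
    """Return indexes of rows that lack a property or hold a wrongly typed value."""
    properties = item_schema.get("properties", {})
    bad = []
    for index, row in enumerate(rows):
        for key, spec in properties.items():
            expected = _JSON_TYPES.get(spec.get("type"))
            if expected is None:
                continue  # type not covered by this sketch
            value = row.get(key)
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                bad.append(index)
                break
    return bad
```

With the schema above, `invalid_rows(data.data, schema["items"])` returns an empty list when every product has a string `name` and a numeric `price`.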

Multi-Context Support

Create isolated browser contexts with separate cookies/storage:

# Create isolated context
context = await client.create_context(
    context_id="user-session",
    options={
        "viewport": {"width": 1920, "height": 1080},
        "locale": "en-US",
    },
)

# Create page in specific context
page = await client.create_page(
    url="https://example.com",
    context_id=context.context_id,
)

# Clean up
await client.destroy_context(context.context_id)
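
If the code between creation and cleanup raises, the context above is never destroyed. One possible way to guarantee cleanup is a small async context manager. `isolated_context` is a hypothetical wrapper, not an SDK function; it relies only on the `create_context` and `destroy_context` calls shown above.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator, Optional


@asynccontextmanager
async def isolated_context(
    client: Any, context_id: str, options: Optional[dict] = None
) -> AsyncIterator[Any]:
    """Create a browser context and destroy it on exit, even if the body raises."""
    context = await client.create_context(context_id=context_id, options=options or {})
    try:
        yield context
    finally:
        await client.destroy_context(context.context_id)
```

Inside `async with isolated_context(client, "user-session") as context:`, pages can be created with `context_id=context.context_id` exactly as in the example above.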

Frame Support

Navigate iframes and cross-origin frames:

# List frames
frames = await client.list_frames()
for frame in frames.frames:
    print(f"{frame.frame_id}: {frame.url}")

# Switch to iframe
await client.switch_frame(selector=css("iframe#payment"))

# Interact within frame
await client.fill(label("Card number"), "4242424242424242")

# Return to main frame
await client.main_frame()
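
Forgetting to return to the main frame leaves later selectors resolving inside the iframe. A sketch of one way to scope the switch follows; `in_frame` is a hypothetical helper built only on the `switch_frame` and `main_frame` calls shown above.

```python
from contextlib import asynccontextmanager
from typing import Any, AsyncIterator


@asynccontextmanager
async def in_frame(client: Any, selector: Any) -> AsyncIterator[Any]:
    """Switch into the frame matched by `selector`; always return to the main frame."""
    await client.switch_frame(selector=selector)
    try:
        yield client
    finally:
        await client.main_frame()
```

The payment example then becomes `async with in_frame(client, css("iframe#payment")):` followed by the `fill` call in the body.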

Human-in-the-Loop Approval

Handle approval requests for sensitive actions:

def handle_approval(params):
    print(f"Approval needed: {params.rule}")
    print(f"Action: {params.original_request}")
    # In a real app, show UI to user
    return "approve"

client.on_approval_required(handle_approval)

# Respond to approval request
await client.respond_to_approval(
    request_id="...",
    decision="approve",  # or "deny", "approve-session"
    reason="User approved the action",
)
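
A handler that always returns "approve" defeats the purpose of the check. The sketch below builds a handler from an allow-list instead. It is illustrative only: `make_approval_handler` is a hypothetical name, and it assumes that `params.rule` is a string and that a handler may return "deny" as well as "approve".

```python
from typing import Any, Callable, Iterable


def make_approval_handler(allowed_rules: Iterable[str]) -> Callable[[Any], str]:
    """Build a handler that approves only requests triggered by an allow-listed rule."""
    allowed = frozenset(allowed_rules)

    def handle_approval(params: Any) -> str:
        # `params.rule` names the rule that triggered the request (assumed to be a string).
        return "approve" if params.rule in allowed else "deny"

    return handle_approval
```

It would be registered the same way as above, for example `client.on_approval_required(make_approval_handler({"form-submit"}))`, where the rule name is a placeholder.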

Error Handling

from browseragentprotocol import (
    BAPError,
    BAPTimeoutError,
    BAPElementNotFoundError,
    BAPApprovalDeniedError,
    role,
)

try:
    await client.click(role("button", "Missing"))
except BAPTimeoutError as e:
    print(f"Timeout: {e.message}")
    if e.retryable:
        # Retry the operation
        pass
except BAPElementNotFoundError as e:
    print(f"Element not found: {e.details}")
except BAPApprovalDeniedError as e:
    print(f"Action denied: {e.message}")
except BAPError as e:
    print(f"Error {e.code}: {e.message}")
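
The `retryable` flag above suggests a retry loop. Here is a minimal sketch with exponential backoff; `retry_retryable` is a hypothetical helper, and it assumes only that retryable errors expose a truthy `retryable` attribute, as `BAPTimeoutError` does in the example. Errors without that attribute are re-raised immediately.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def retry_retryable(
    operation: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> Any:
    """Run `operation`, retrying with exponential backoff while errors are retryable."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception as exc:
            is_last = attempt == attempts - 1
            # Only errors that carry a truthy `retryable` flag are retried.
            if is_last or not getattr(exc, "retryable", False):
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)
```

Usage might look like `await retry_retryable(lambda: client.click(role("button", "Submit")))`. The lambda matters: each attempt needs a fresh coroutine, since a coroutine object cannot be awaited twice.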

Requirements

  • Python 3.10+
  • aiohttp >= 3.9.0
  • pydantic >= 2.0.0
  • anyio >= 4.0.0
  • httpx >= 0.27.0
  • httpx-sse >= 0.4.0

License

Apache-2.0
