Browser Hybrid

Chrome DevTools with accessibility-tree semantics for browser automation.

Use your authenticated Chrome sessions with agent-browser-like element targeting.

Requires Python 3.10+. License: MIT.

Why?

| Tool | Your Sessions | Accessibility Tree | Performance |
|------|---------------|--------------------|-------------|
| agent-browser | ❌ Fresh browser | ✅ Built-in refs | Fast |
| Playwright | ❌ Fresh browser | ❌ CSS selectors | Medium |
| Chrome DevTools | ✅ Your Chrome | ❌ Raw DOM | Fast |
| Browser Hybrid | ✅ Your Chrome | ✅ Accessibility refs | Fast |

Installation

pip install browser-hybrid

Or from source:

git clone https://github.com/your-repo/browser-hybrid
cd browser-hybrid
pip install -e .

Automatic Reconnection

Browser Hybrid automatically reconnects when the Chrome WebSocket connection drops (network issues, Chrome restart, etc.).

Configuration

from browser_hybrid import Browser

# Default: auto-reconnect enabled
browser = Browser()

# Disable auto-reconnect
browser = Browser(reconnect=False)

# Custom retry settings
browser = Browser(
    reconnect_max_retries=5,
    reconnect_backoff=1.0  # seconds
)

# Monitor reconnection events
def on_reconnect(status):
    print(f"Reconnect: {status}")

browser = Browser(reconnect_callback=on_reconnect)
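One way to picture how `reconnect_max_retries` and `reconnect_backoff` could interact is a plain retry loop. This is a minimal sketch, not Browser Hybrid's implementation: `connect_with_retries` and its `connect` and `sleep` parameters are hypothetical names, and doubling the delay after each failure is an assumption, since the README does not say whether the backoff is fixed or growing.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    max_retries: int = 5,
    backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or max_retries retries are used up.

    Assumption: the delay doubles after every failed attempt
    (backoff, 2 * backoff, 4 * backoff, ...).
    """
    attempt = 0
    while True:
        try:
            return connect()
        except OSError:
            # The built-in ConnectionError is a subclass of OSError.
            if attempt >= max_retries:
                raise
            sleep(backoff * (2 ** attempt))
            attempt += 1
```

Passing `sleep` as a parameter keeps the loop easy to test, because a test can record the delays instead of waiting for them.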

Handling Errors

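If reconnection still fails after the maximum number of retries, Browser Hybrid raises ConnectionError. Catch it to handle the failure gracefully:

# Note: this import shadows Python's built-in ConnectionError in this module
from browser_hybrid import Browser, ConnectionError

browser = Browser()

try:
    browser.click("axnode@123")
except ConnectionError:
    # Raised only after every reconnection attempt has failed
    print("Connection lost. Restart Chrome?")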

Prerequisites

Chrome must be running with remote debugging enabled (port 9222 in the examples below):

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/Library/Application Support/Google/Chrome-Debug"

# Or use the provided script
~/scripts/chrome-debug.sh
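Before creating a `Browser`, it can help to confirm that the debugging port is actually reachable. The sketch below queries `/json/version`, one of the HTTP endpoints Chrome exposes when remote debugging is on, using only the standard library. The helper name `chrome_debug_info` is hypothetical and not part of Browser Hybrid; port 9222 matches the launch command above.

```python
import http.client
import json
import urllib.request
from typing import Optional


def chrome_debug_info(
    host: str = "localhost", port: int = 9222, timeout: float = 2.0
) -> Optional[dict]:
    """Return Chrome's /json/version payload, or None if it is unreachable."""
    url = f"http://{host}:{port}/json/version"
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections, timeouts, and HTTP errors;
        # ValueError covers a non-JSON reply from some other service.
        return None
```

A return value of `None` usually means Chrome is not running with `--remote-debugging-port`, or is listening on a different port.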

Async Considerations

Core interactions (click, type_text, evaluate) run over an asynchronous WebSocket connection, while tab management (new_tab, list_tabs, close_tab) uses synchronous HTTP calls to the Chrome DevTools API.

In heavily asynchronous programs (for example, event-driven spiders), each of these HTTP calls blocks the event loop for roughly 10-50 ms. That is negligible for most use cases; for latency-sensitive workloads, consider running Browser in a dedicated thread or process.
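If those blocking calls matter in an asyncio program, one option is to hand them to a worker thread with `asyncio.to_thread`. The following sketch uses a stand-in function, `blocking_list_tabs`, in place of a real `browser.list_tabs()` call so that it runs without Chrome. Whether a `Browser` instance can safely be shared across threads is not stated in this README, so treat that as something to verify.

```python
import asyncio
import time


def blocking_list_tabs() -> list[str]:
    """Stand-in for a synchronous HTTP call such as browser.list_tabs()."""
    time.sleep(0.05)  # simulate about 50 ms of blocking I/O
    return ["tab-1", "tab-2"]


async def heartbeat(ticks: list[float], count: int = 5) -> None:
    """Record timestamps to show that the event loop keeps running."""
    for _ in range(count):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.01)


async def demo() -> tuple[list[str], int]:
    ticks: list[float] = []
    # to_thread runs the blocking call in a worker thread, so the
    # heartbeat coroutine is still scheduled while the call is in flight.
    tabs, _ = await asyncio.gather(
        asyncio.to_thread(blocking_list_tabs),
        heartbeat(ticks),
    )
    return tabs, len(ticks)
```

Running `asyncio.run(demo())` returns the tab list together with the number of heartbeat ticks recorded alongside it.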

Quick Start

Python API

from browser_hybrid import Browser

# Connect to your Chrome
browser = Browser()

# Tab management
tab = browser.new_tab("https://gmail.com")  # Your logged-in session!

# Navigation
browser.navigate("https://github.com")

# Accessibility tree (like agent-browser)
tree = browser.accessibility_tree()
print(tree.to_tree_str())
# - heading "Welcome" [ref=1]
# - link "Sign in" [ref=2]
# - textbox "Email" [ref=3]

# Click by accessibility ref
browser.click("2")

# Or click by text
browser.click_by_text("Sign in")

# Fill form
browser.fill_form({
    "Email": "user@example.com",
    "Password": "secret"
})

# Screenshot
browser.screenshot("/tmp/page.png")

# Execute JavaScript
title = browser.evaluate("document.title")

CLI

# List tabs
browser-hybrid list

# Open new page
browser-hybrid new https://example.com

# Get accessibility snapshot (like agent-browser)
browser-hybrid snapshot
# - heading "Example Domain" [ref=1]
# - link "Learn more" [ref=2]

# Click by text
browser-hybrid click "Learn more"

# Take screenshot
browser-hybrid screenshot /tmp/page.png

# Execute JavaScript
browser-hybrid eval "document.title"

# Close tabs
browser-hybrid close
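Because the snapshot is line-oriented text, it is easy to post-process in a script. The sketch below parses lines of the shape shown above into tuples. `parse_snapshot` and `SnapshotEntry` are hypothetical names, and the regular expression assumes every node line looks like `- role "name" [ref=N]`; real output may contain other line shapes, which this sketch simply skips.

```python
import re
from typing import NamedTuple


class SnapshotEntry(NamedTuple):
    role: str
    name: str
    ref: str


# Matches lines such as: - link "Learn more" [ref=2]
_NODE_LINE = re.compile(
    r'^\s*-\s+(?P<role>\S+)\s+"(?P<name>.*)"\s+\[ref=(?P<ref>[^\]]+)\]\s*$'
)


def parse_snapshot(text: str) -> list[SnapshotEntry]:
    """Parse snapshot text; lines that do not match the pattern are skipped."""
    entries = []
    for line in text.splitlines():
        match = _NODE_LINE.match(line)
        if match:
            entries.append(
                SnapshotEntry(match["role"], match["name"], match["ref"])
            )
    return entries
```

The resulting refs can then be filtered by role or name before being passed to a click command.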

Features

| Feature | Description |
|---------|-------------|
| Your sessions | Use Gmail, banking, SSO without re-auth |
| Accessibility refs | Target elements like agent-browser |
| Click by text | Find elements by visible text |
| Form filling | Fill multiple fields at once |
| Visual regression | Screenshot comparison (pixel-by-pixel) with tolerance and regions |
| History recording | Record sessions to JSON and play them back |
| Tab isolation | temp_tab() context manager for isolated workflows |
| PDF generation | Generate PDFs with headers, footers, and paper size control |
| Minimal dependencies | Only the Python stdlib plus websockets |

API Reference

Tab Management

tabs = browser.list_tabs()           # List all open tabs
tab = browser.new_tab(url)           # Open new tab
browser.activate_tab(tab.id)         # Focus tab
browser.close_tab(tab.id)            # Close tab
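The Features table mentions a `temp_tab()` context manager, but its signature is not shown in this README. As an illustration of the pattern only, the sketch below builds an equivalent from `new_tab` and `close_tab`. `temporary_tab` is a hypothetical helper, and the library's own `temp_tab()` may behave differently.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def temporary_tab(browser: Any, url: str) -> Iterator[Any]:
    """Open a tab, yield it, and close it even if the body raises."""
    tab = browser.new_tab(url)
    try:
        yield tab
    finally:
        # Runs on normal exit and on exceptions, so no tab is left behind.
        browser.close_tab(tab.id)
```

With a real `Browser`, this would be used as `with temporary_tab(browser, "https://example.com") as tab: ...`.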

Navigation

browser.navigate(url)                # Navigate tab
browser.wait_for(text="Sign in")     # Wait for text
browser.wait_for(selector="button")  # Wait for element

Accessibility Tree

tree = browser.accessibility_tree()           # Get full tree
links = tree.find_by_role("link")             # Find by role
headings = tree.find_by_role("heading")        # Find headings
matching = tree.find_by_name("Submit")         # Find by name
node = tree.find("ref_id")                     # Find by ref

print(tree.to_tree_str())              # agent-browser format
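To make the search methods concrete, here is a small self-contained model of a tree with role and name lookups. It is not Browser Hybrid's implementation: `AXNode` and its fields are hypothetical, and exact string matching in `find_by_name` is an assumption (the real method may match substrings or ignore case).

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class AXNode:
    role: str
    name: str
    ref: str
    children: list["AXNode"] = field(default_factory=list)

    def walk(self) -> Iterator["AXNode"]:
        """Yield this node and all descendants, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()

    def find_by_role(self, role: str) -> list["AXNode"]:
        return [node for node in self.walk() if node.role == role]

    def find_by_name(self, name: str) -> list["AXNode"]:
        # Assumption: exact, case-sensitive match.
        return [node for node in self.walk() if node.name == name]
```

Both lookups are linear scans over the tree, which is usually fine for a single page's accessibility tree.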

Interactions

browser.click(ref)                    # Click by accessibility ref
browser.click_by_text("Sign in")       # Click by visible text
browser.type_text(ref, "hello")        # Type into input
browser.type_slowly(ref, "hello", delay=0.05)  # Character-by-character typing
browser.fill_form({"name": "John"})    # Fill form fields
browser.hover(ref)                     # Hover over element

# Click with modifiers
browser.click(ref, double=True)         # Double-click
browser.click(ref, button="right")      # Right-click
browser.press_key("Enter")              # Press key
browser.press_key("c", modifiers=["Control"])  # Ctrl+C
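At the protocol level, Chrome's `Input.dispatchKeyEvent` command takes modifiers as a bit field (Alt=1, Ctrl=2, Meta=4, Shift=8). The sketch below shows how a list such as `["Control"]` could be folded into that field. `modifier_bitmask` is a hypothetical helper, the spellings other than `"Control"` are assumed, and the README does not confirm that Browser Hybrid performs the conversion this way.

```python
# Bit values defined by the Chrome DevTools Protocol Input domain.
CDP_MODIFIER_BITS = {"Alt": 1, "Control": 2, "Meta": 4, "Shift": 8}


def modifier_bitmask(modifiers: list[str]) -> int:
    """Combine modifier names into the CDP `modifiers` bit field."""
    mask = 0
    for name in modifiers:
        try:
            mask |= CDP_MODIFIER_BITS[name]
        except KeyError:
            raise ValueError(f"Unknown modifier: {name!r}") from None
    return mask
```

Rejecting unknown names early gives a clearer error than silently sending a key press with no modifier.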

Wait Conditions

# Wait for element to appear
browser.wait_for(selector="button.submit")
browser.wait_for(text="Welcome")

# Wait for navigation
browser.wait_for(url="https://example.com/success")
browser.wait_for_load_state("complete")

# Custom condition
browser.wait_until(lambda: browser.evaluate("document.readyState") == "complete")
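A custom condition like the one above boils down to a polling loop with a deadline. The sketch below shows that pattern in isolation. `poll_until`, its `timeout` and `interval` parameters, and the use of `TimeoutError` are assumptions for illustration and are not taken from Browser Hybrid's `wait_until` signature.

```python
import time
from typing import Callable


def poll_until(
    condition: Callable[[], bool],
    timeout: float = 10.0,
    interval: float = 0.1,
) -> None:
    """Block until condition() is truthy; raise TimeoutError at the deadline."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)
```

Using `time.monotonic()` rather than `time.time()` keeps the deadline correct even if the system clock is adjusted while waiting.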

Cookies

# Get all cookies
cookies = browser.get_cookies()

# Set cookies
browser.set_cookies([{"name": "session", "value": "abc123", "domain": ".example.com"}])

# Delete cookies
browser.delete_cookies("session")
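If you already have cookies as a `Cookie` header string (for example, copied from the DevTools Network panel), the standard library can split it into the dictionary shape used by `set_cookies` above. `cookies_from_header` is a hypothetical helper, and it assumes every cookie belongs to the single domain you pass in.

```python
from http.cookies import SimpleCookie


def cookies_from_header(header: str, domain: str) -> list[dict]:
    """Turn "a=1; b=2" into a list of name/value/domain dictionaries."""
    jar = SimpleCookie()
    jar.load(header)
    return [
        {"name": name, "value": morsel.value, "domain": domain}
        for name, morsel in jar.items()
    ]
```

The returned list has the same keys as the `set_cookies` example, so it could be passed to that call directly.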

Network Interception

# Intercept requests
browser.on_request(lambda req: print(f"Request: {req['url']}"))
browser.on_response(lambda res: print(f"Response: {res['status']}"))
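Printing each event is handy for debugging, but callbacks can also accumulate data for later checks. The sketch below assumes, as in the lambdas above, that request events carry a `url` key and response events a `status` key. `TrafficLog` is a hypothetical helper, not part of Browser Hybrid.

```python
from collections import Counter


class TrafficLog:
    """Collects events from request/response callbacks for later inspection."""

    def __init__(self) -> None:
        self.urls: list[str] = []
        self.statuses: Counter = Counter()

    def record_request(self, req: dict) -> None:
        self.urls.append(req["url"])

    def record_response(self, res: dict) -> None:
        self.statuses[res["status"]] += 1

    def error_count(self) -> int:
        """Number of responses with an HTTP status of 400 or above."""
        return sum(n for status, n in self.statuses.items() if status >= 400)
```

With a live browser, the bound methods could be registered as `browser.on_request(log.record_request)` and `browser.on_response(log.record_response)`.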

### Content

```python
html = browser.get_html()              # Get page HTML
title = browser.evaluate("document.title")  # Execute JS
browser.screenshot("/tmp/page.png")    # Take screenshot
```

### Visual Regression

```python
# Compare screenshots with tolerance
result = browser.compare_screenshots("baseline.png", "current.png", threshold=0.1)

# Or assert in tests
browser.assert_visual_match("baseline.png", threshold=0.01)
```
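To show what a pixel-by-pixel comparison with a tolerance involves, the sketch below compares two images held as nested lists of RGB tuples. It assumes `threshold` means the maximum fraction of differing pixels (0 to 1); the README does not define the parameter, so Browser Hybrid's actual metric may differ. `diff_ratio` and `images_match` are hypothetical names.

```python
Pixel = tuple[int, int, int]
Image = list[list[Pixel]]  # rows of (R, G, B) pixels


def diff_ratio(baseline: Image, current: Image) -> float:
    """Return the fraction of pixels that differ between same-sized images."""
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_a, row_b in zip(baseline, current)
        for pixel_a, pixel_b in zip(row_a, row_b)
        if pixel_a != pixel_b
    )
    return changed / total


def images_match(baseline: Image, current: Image, threshold: float = 0.01) -> bool:
    """True when at most `threshold` (a 0-1 fraction) of the pixels differ."""
    return diff_ratio(baseline, current) <= threshold
```

Under this interpretation, a single changed pixel in a 10 x 10 image gives a ratio of 0.01, which passes a threshold of 0.01 but fails a stricter one.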

## Comparison

| Feature | Selenium | Playwright | agent-browser | Browser Hybrid |
|---------|----------|------------|---------------|----------------|
| Uses installed Chrome | ❌ | ❌ | ❌ | ✅ |
| Authenticated sessions | ❌ | ❌ | ❌ | ✅ |
| Accessibility tree | ❌ | ✅ | ✅ | ✅ |
| Ref-based targeting | ❌ | ❌ | ✅ | ✅ |
| Python API | ✅ | ✅ | ❌ | ✅ |
| CLI | ❌ | ✅ | ✅ | ✅ |
| Dependencies | Heavy | Heavy | Rust | Minimal |

## Architecture

┌──────────────────────────────────────────┐
│              Browser Hybrid              │
├──────────────────────────────────────────┤
│  Python API        │  CLI                │
│  browser.py        │  cli.py             │
├──────────────────────────────────────────┤
│                CDP Client                │
│  ┌─────────────┐    ┌───────────────┐    │
│  │ HTTP API    │    │ WebSocket API │    │
│  │ /json/*     │    │ CDP Commands  │    │
│  └─────────────┘    └───────────────┘    │
└──────────────────────────────────────────┘
                     │
                     ▼
           ┌───────────────────┐
           │  Chrome DevTools  │
           │  localhost:9222   │
           │  (Your Chrome)    │
           └───────────────────┘
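Commands sent over the WebSocket layer are JSON messages with an `id`, a `method`, and `params`, and Chrome echoes the `id` in its reply. The sketch below builds and matches such messages with the standard library. `CommandEncoder` and `is_reply_to` are hypothetical names, not Browser Hybrid internals.

```python
import itertools
import json
from typing import Any, Optional


class CommandEncoder:
    """Builds JSON text frames in the CDP request format: id, method, params."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def encode(self, method: str, params: Optional[dict[str, Any]] = None) -> str:
        message = {"id": next(self._ids), "method": method, "params": params or {}}
        return json.dumps(message)


def is_reply_to(raw: str, command_id: int) -> bool:
    """True if a received frame is the response to the given command id.

    Events pushed by Chrome carry a "method" but no "id", so they never match.
    """
    return json.loads(raw).get("id") == command_id
```

Incrementing ids let a client pair each reply with the command that caused it, even when events arrive in between.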


## Development

```bash
# Clone and install dev dependencies
git clone https://github.com/your-repo/browser-hybrid
cd browser-hybrid
pip install -e ".[dev]"

# Run tests
pytest

# Type check
mypy src/browser_hybrid

# Format
ruff format src/
```

## License

MIT
