Browser Bridge server and CLI for controlling a Chrome extension over WebSocket

Browser-Agent Bridge - Ultra-Fast Browser Control for Agents

WebSocket-only, HTML-first browser bridge for remotely controlling a local Chrome extension, built as a fast alternative to vision-based browser control systems.

Why This Exists

Traditional browser relays often rely on LLM vision to understand web pages at each step. In practice, that approach is:

  1. Expensive: it consumes many tokens to repeatedly analyze visual page state.
  2. Slow: repeated visual analysis adds latency at every interaction step.
  3. Error-prone: visual perception includes noise that is less relevant than structured HTML for deterministic control.

This project exists as an HTML-first relay: the browser-side extension exposes structured observations and preprocessed HTML, so remote agents can interact with websites with lower cost, lower latency, and more reliable control.

Architecture (WS-only)

Operator CLI (remote/local)
    |
    |  ws(s)://.../ws/operator   (auth)
    v
Bridge Server
    ^
    |  ws(s)://.../ws/client     (auth)
    |
Chrome Extension (local browser)
    |
    +-- content script commands: observe/click/type/get_html/ping_tab/etc.

The extension connects outbound to the server. The operator sends commands through the server to a specific (instance_id, client_id) pair.

Protocol

Client -> Server

  • auth: {kind, instance_id, client_id, token}
  • result: {kind, command_id, ok, result|error}
  • ping

Server -> Client

  • auth_ok / auth_error
  • command: {kind, command_id, type, payload, request_id, sent_at}
  • pong

Operator -> Server

  • auth: {kind, token}
  • list_clients
  • connect_status: {kind, instance_id, client_id}
  • send_command: {kind, instance_id, client_id, type, payload, timeout_s, request_id}
  • ping

Server -> Operator

  • auth_ok / auth_error
  • clients
  • connect_status
  • command_result
  • pong
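The operator-side messages listed above can be sketched as plain JSON builders. This is a minimal illustration, not the project's client code: it assumes that the kind field carries the message name, that request_id can be any unique string, and that messages are sent as JSON text. The function names and the timeout_s default are hypothetical.

```python
import json
import uuid


def operator_auth(token: str) -> str:
    """Serialize the operator auth message: {kind, token}."""
    # Assumption: "kind" carries the message name from the protocol list.
    return json.dumps({"kind": "auth", "token": token})


def send_command(instance_id: str, client_id: str, command_type: str,
                 payload: dict, timeout_s: float = 15.0) -> str:
    """Serialize a send_command message for one (instance_id, client_id)."""
    return json.dumps({
        "kind": "send_command",
        "instance_id": instance_id,
        "client_id": client_id,
        "type": command_type,
        "payload": payload,
        "timeout_s": timeout_s,  # illustrative default, not taken from the project
        "request_id": str(uuid.uuid4()),  # assumption: any unique string works
    })


if __name__ == "__main__":
    print(operator_auth("..."))
    print(send_command("local-instance", "chrome-main",
                       "get_html", {"max_chars": 40000}))
```

Generating a fresh request_id per call keeps retries distinguishable from replays, which matters given the request_id dedup window described under Security Hardening.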

Auth Modes

Set BRIDGE_AUTH_MODE:

  • static (default): compare token against BRIDGE_SHARED_TOKEN (for clients) and BRIDGE_OPERATOR_TOKEN (for operator; defaults to shared token).
    • BRIDGE_OPERATOR_TOKEN must be at least 16 chars and include lowercase, uppercase, digit, and symbol.
  • jwt: validate JWT with BRIDGE_JWT_SECRET/BRIDGE_JWT_ALG.
    • Client JWT should include matching instance_id and client_id claims.
    • Operator JWT should include role=operator.
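The BRIDGE_OPERATOR_TOKEN rule for static mode can be checked before starting the server. The sketch below mirrors the documented rule (16+ characters with lowercase, uppercase, digit, and symbol); it assumes that "symbol" means ASCII punctuation, and the function name is hypothetical rather than part of the package.

```python
import string


def is_strong_operator_token(token: str) -> bool:
    """Return True if the token meets the documented operator-token rule."""
    return (
        len(token) >= 16
        and any(c.islower() for c in token)
        and any(c.isupper() for c in token)
        and any(c.isdigit() for c in token)
        # Assumption: "symbol" means ASCII punctuation.
        and any(c in string.punctuation for c in token)
    )


if __name__ == "__main__":
    print(is_strong_operator_token("Str0ng!Operator#42"))
    print(is_strong_operator_token("change-me-strong-token"))
```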

Production safety

  • BRIDGE_ENV=production enforces strong auth config:
    • static mode: BRIDGE_SHARED_TOKEN must not be empty/dev default.
    • jwt mode: BRIDGE_JWT_SECRET must not be default.

Install (pipx recommended)

python3 -m pip install --user pipx
python3 -m pipx ensurepath
pipx install browser-agent-bridge

Quick Start

1) (Optional) Generate local JWT secret file

browser-bridge setup-secret

If BRIDGE_AUTH_MODE=jwt and BRIDGE_JWT_SECRET is still the default, the server loads a local secret file at startup, creating it if needed (~/.browser_bridge/jwt_secret or BRIDGE_JWT_SECRET_FILE).

2) Start server

# static mode example
export BRIDGE_AUTH_MODE=static
export BRIDGE_SHARED_TOKEN='change-me-strong-token'
export BRIDGE_OPERATOR_TOKEN='Str0ng!Operator#42'
browser-bridge-server

3) Load extension

  1. Open chrome://extensions
  2. Enable Developer mode
  3. Load unpacked extension/
  4. In popup fill:
    • Bridge Server WS URL: ws://127.0.0.1:8765/ws/client (or wss://.../ws/client)
    • Instance ID: e.g. local-instance
    • Client ID: e.g. chrome-main
    • Auth Token / JWT: client token
  5. Save + Connect

4) Operator CLI usage

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' list-clients
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' connect-status --instance-id local-instance --client-id chrome-main
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' ping-tab --instance-id local-instance --client-id chrome-main
browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token 'Str0ng!Operator#42' observe --instance-id local-instance --client-id chrome-main

observe now returns stable references per node:

  • ref: stable element reference for follow-up actions
  • click_ref: reference biased toward a clickable ancestor (row/link/button)
  • clickable_selector: selector for the chosen clickable ancestor

Pass these back to click in the send-command payload as ref or click_ref, optionally with guardrails:

  • prefer: control (default), row, or link
  • avoid_roles: e.g. ["checkbox", "menuitem"]
  • avoid_tags: e.g. ["input"]
  • avoid_input_types: e.g. ["checkbox", "radio"]
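A click payload that uses these fields could be assembled as follows. This is a sketch under assumptions: the node dictionary stands for one entry of an observe result, the reference values shown are made up, and preferring click_ref over ref is a choice of this example rather than documented behavior. The helper name is hypothetical.

```python
import json


def build_click_payload(node: dict, prefer: str = "control",
                        avoid_roles=None, avoid_tags=None,
                        avoid_input_types=None) -> str:
    """Build a JSON click payload from one node of an observe result."""
    if prefer not in ("control", "row", "link"):
        raise ValueError(f"unsupported prefer value: {prefer!r}")
    # Choice of this sketch: use click_ref when present, else fall back to ref.
    if node.get("click_ref"):
        payload = {"click_ref": node["click_ref"]}
    else:
        payload = {"ref": node["ref"]}
    payload["prefer"] = prefer
    # Only include guardrails that were actually requested.
    if avoid_roles:
        payload["avoid_roles"] = list(avoid_roles)
    if avoid_tags:
        payload["avoid_tags"] = list(avoid_tags)
    if avoid_input_types:
        payload["avoid_input_types"] = list(avoid_input_types)
    return json.dumps(payload)


if __name__ == "__main__":
    # Reference values are made up; real ones come from observe.
    node = {"ref": "r12", "click_ref": "r9"}
    print(build_click_payload(node, prefer="row", avoid_roles=["checkbox"]))
```

The resulting string can be passed to send-command with --type click via --payload, or written to a file for --payload-file.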

Raw command:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token '...' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type get_html --payload '{"max_chars":40000}'

You can also avoid shell JSON escaping with --payload-file:

cat > /tmp/cmd.json <<'JSON'
{"selector":"input[name=\"q\"]","text":"openclaw"}
JSON

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token '...' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type type --payload-file /tmp/cmd.json

get_html result includes:

  • html: captured DOM text (possibly truncated)
  • truncated: whether output was cut to payload.max_chars
  • notes: actionable recommendations (for example, increase max_chars when truncated, or set preprocess=false for rawer DOM)
  • preprocess and removed_nodes: preprocessing mode and removed-node count
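An agent can use the truncated flag to decide whether to retry get_html with a larger max_chars. The helper below is a hypothetical sketch: the doubling strategy and the ceiling are choices of this example, not behavior of the bridge.

```python
from typing import Optional


def next_max_chars(result: dict, current_max: int,
                   ceiling: int = 400_000) -> Optional[int]:
    """Return a larger max_chars to retry with, or None if no retry is needed."""
    if not result.get("truncated"):
        return None
    # Doubling and the ceiling are choices of this sketch.
    if current_max >= ceiling:
        return None
    return min(current_max * 2, ceiling)


if __name__ == "__main__":
    result = {"html": "<html>...", "truncated": True}
    print(next_max_chars(result, 40_000))
```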

Adaptive load wait (navigate, click, type):

  • The extension waits for the tab to finish loading before replying, for at most 10s. The wait is adaptive: it returns immediately if the tab is already complete.
  • Override per command payload:
    • wait_for_load (default true)
    • wait_for_load_ms (default 10000, capped at 10000)
  • Command result includes load_wait diagnostics: waited_ms, completed, timed_out, final_status, enabled, max_wait_ms.
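On the operator side, the wait options can be added to a payload and the diagnostics inspected afterwards. This sketch assumes the diagnostics sit under a load_wait key in the command result; the helper names are hypothetical, and clamping negative values to 0 is a choice of this example.

```python
MAX_WAIT_MS = 10_000  # documented cap for wait_for_load_ms


def with_load_wait(payload: dict, wait_ms: int = MAX_WAIT_MS,
                   enabled: bool = True) -> dict:
    """Return a copy of payload with the wait fields set, clamped to the cap."""
    out = dict(payload)
    out["wait_for_load"] = enabled
    out["wait_for_load_ms"] = max(0, min(int(wait_ms), MAX_WAIT_MS))
    return out


def load_timed_out(result: dict) -> bool:
    """True if the load_wait diagnostics report a timeout."""
    # Assumption: diagnostics are nested under "load_wait" in the result.
    return bool(result.get("load_wait", {}).get("timed_out"))


if __name__ == "__main__":
    print(with_load_wait({"url": "https://example.com"}, wait_ms=25_000))
```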

Example:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token '...' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type navigate --payload '{"url":"https://example.com","wait_for_load_ms":4000}'

Human-like typing (type):

  • type now simulates typing character-by-character by default to better match human input behavior.
  • Optional payload fields:
    • human_like (default true)
    • clear_first (default true)
    • keystroke_delay_ms (default 45)
    • keystroke_jitter_ms (default 30)
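Character-by-character typing makes long inputs take noticeably longer, which is worth accounting for when choosing timeout_s. The estimate below is a rough sketch: it assumes one delay per character and that each delay falls within keystroke_delay_ms plus or minus keystroke_jitter_ms (never below zero). The actual jitter distribution is not documented here, and the function name is hypothetical.

```python
def typing_time_bounds_ms(text: str, keystroke_delay_ms: int = 45,
                          keystroke_jitter_ms: int = 30) -> tuple:
    """Estimate (min_ms, max_ms) spent typing text with human_like enabled."""
    # Assumption: one delay per character, each within delay +/- jitter.
    low = max(0, keystroke_delay_ms - keystroke_jitter_ms)
    high = keystroke_delay_ms + keystroke_jitter_ms
    n = len(text)
    return n * low, n * high


if __name__ == "__main__":
    print(typing_time_bounds_ms("hello world"))
```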

Example:

browser-bridge --server-ws-url ws://127.0.0.1:8765/ws/operator --token '...' \
  send-command --instance-id local-instance --client-id chrome-main \
  --type type --payload '{"selector":"input[name=\"q\"]","text":"hello world","keystroke_delay_ms":70,"keystroke_jitter_ms":45}'

Security Hardening

  • Use TLS in non-local deployments (wss://).
  • Use strong static tokens or a strong JWT secret. The operator static token must be at least 16 chars and include lowercase, uppercase, digit, and symbol characters.
  • Optional command allowlist: BRIDGE_COMMAND_ALLOWLIST=observe,ping_tab,get_html.
  • Optional client allowlist in static mode: BRIDGE_ALLOWED_CLIENTS=instance1:client1,instance2:client2.
  • Request idempotency and replay protection are enforced by a request_id dedup window.
  • Max payload limit is enforced by BRIDGE_MAX_MESSAGE_BYTES.
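The two allowlist variables use simple comma-separated formats. The following sketch shows one way to parse and validate such values before deployment; it illustrates the documented formats only, and how the server itself parses them is not shown here. Both function names are hypothetical.

```python
def parse_command_allowlist(value: str) -> set:
    """Parse 'observe,ping_tab,get_html' into a set of command names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def parse_allowed_clients(value: str) -> set:
    """Parse 'instance1:client1,instance2:client2' into (instance, client) pairs."""
    pairs = set()
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        instance_id, sep, client_id = item.partition(":")
        if not sep or not instance_id or not client_id:
            raise ValueError(f"expected instance_id:client_id, got {item!r}")
        pairs.add((instance_id, client_id))
    return pairs


if __name__ == "__main__":
    print(sorted(parse_command_allowlist("observe,ping_tab,get_html")))
    print(sorted(parse_allowed_clients("instance1:client1,instance2:client2")))
```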

Testing

pytest -v

Coverage includes WS auth success/failure, command routing, disconnect handling, wrong target routing, CLI failure paths, and reconnect replacement behavior.

Contributing

Contributions are very welcome.

If you want to help, great places to start are:

  • bug fixes and reliability improvements
  • new command handlers and protocol hardening
  • better docs and examples
  • tests for real-world edge cases

Quick contributor workflow:

  1. Fork the repo and create a focused branch.
  2. Run tests locally (pytest -v).
  3. Open a PR with a clear description, motivation, and test notes.

For detailed guidelines, see CONTRIBUTING.md.

If you have ideas but no patch yet, opening an issue/discussion is also appreciated.

License

MIT (see LICENSE).


Created by the creator of openclaw-setup.me.
