Control your browser from the command line via a Chrome extension + WebSocket bridge

browser-ctl

Control Chrome from your terminal. A lightweight CLI tool for browser automation — navigate, click, type, scroll, screenshot, and more, all through simple commands.

pip install browser-ctl

bctl go https://github.com
bctl click "a.search-button"
bctl type "input[name=q]" "browser-ctl"
bctl press Enter
bctl screenshot results.png

Why browser-ctl?

  • Zero-config CLI — single bctl command, JSON output, works in any shell or script
  • No browser binary management — uses your existing Chrome with a lightweight extension
  • Stdlib-only CLI — the CLI itself has zero external Python dependencies
  • AI-agent friendly — ships with an AI coding skill file (SKILL.md) for Cursor / OpenCode integration
  • Local & private — all communication stays on localhost, no data leaves your machine

How It Works

Terminal (bctl)  ──HTTP──▶  Bridge Server  ◀──WebSocket──  Chrome Extension
  1. The CLI (bctl) sends commands via HTTP to a local bridge server
  2. The bridge server relays them over WebSocket to the Chrome extension
  3. The extension executes commands using Chrome APIs and content scripts
  4. Results flow back the same path as JSON

The bridge server auto-starts on first command — no manual setup needed.

Installation

1. Install the Python package

pip install browser-ctl

2. Load the Chrome extension

bctl setup

This copies the extension to ~/.browser-ctl/extension/ and opens Chrome's extension page. Then:

  1. Open chrome://extensions (if bctl setup did not already open it)
  2. Enable Developer mode (top right)
  3. Click Load unpacked
  4. Select the ~/.browser-ctl/extension/ directory

3. Verify

bctl ping

You should see {"success": true, "data": {"server": true, "extension": true}}.
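
For scripted setups, the same check can be done programmatically. The following is a minimal sketch in Python, assuming the ping reply has exactly the shape shown above. The helper name `bridge_ready` is made up for illustration and is not part of browser-ctl.

```python
import json


def bridge_ready(ping_output: str) -> bool:
    """Return True only if both the server and the extension report ready.

    Hypothetical helper: assumes the JSON shape printed by `bctl ping`.
    """
    try:
        reply = json.loads(ping_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(reply, dict) or reply.get("success") is not True:
        return False
    data = reply.get("data")
    if not isinstance(data, dict):
        return False
    return bool(data.get("server")) and bool(data.get("extension"))


sample = '{"success": true, "data": {"server": true, "extension": true}}'
print(bridge_ready(sample))
```

If the function returns False, the likely causes are that the bridge server is not running or that the extension has not been loaded in Chrome.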

Commands

Navigation

bctl navigate <url>       # Navigate to URL (aliases: nav, go)
bctl back                 # Go back in history
bctl forward              # Go forward (alias: fwd)
bctl reload               # Reload current page

Interaction

bctl click <sel> [-i N]           # Click element (CSS selector, optional Nth match)
bctl hover <sel> [-i N]           # Hover over element
bctl type <sel> <text>            # Type text into input/textarea
bctl press <key>                  # Press key (Enter, Escape, Tab, etc.)
bctl scroll <dir|sel> [pixels]    # Scroll: up/down/top/bottom or element into view
bctl select-option <sel> <val>    # Select dropdown option (alias: sopt) [--text]
bctl drag <src> [target]          # Drag to element or offset [--dx N --dy N]

DOM Query

bctl text [sel]           # Get text content (default: body)
bctl html [sel]           # Get innerHTML
bctl attr <sel> [name]    # Get attribute(s) [-i N for Nth element]
bctl select <sel> [-l N]  # List matching elements (alias: sel, limit default: 20)
bctl count <sel>          # Count matching elements
bctl status               # Current page URL and title

JavaScript

bctl eval <code>          # Execute JS in page context (auto-bypasses CSP)

Tabs

bctl tabs                 # List all tabs
bctl tab <id>             # Switch to tab by ID
bctl new-tab [url]        # Open new tab
bctl close-tab [id]       # Close tab (default: active)

Screenshot & Files

bctl screenshot [path]    # Capture screenshot (alias: ss)
bctl download <target>    # Download file/image (alias: dl) [-o file] [-i N]
bctl upload <sel> <files> # Upload file(s) to <input type="file">

Wait & Dialog

bctl wait <sel|seconds> [timeout]  # Wait for element (with optional timeout) or sleep for N seconds
bctl dialog [accept|dismiss] [--text <val>]  # Handle next alert/confirm/prompt

Server

bctl ping                 # Check server & extension status
bctl serve                # Start server in foreground
bctl stop                 # Stop server

Examples

Search and extract

bctl go "https://news.ycombinator.com"
bctl select "span.titleline > a" -l 5   # Top 5 links with text, href, etc.

Fill a form

bctl type "input[name=email]" "user@example.com"
bctl type "input[name=password]" "hunter2"
bctl select-option "select#country" "US"
bctl upload "input[type=file]" ./resume.pdf
bctl click "button[type=submit]"

Scroll and screenshot

bctl go "https://en.wikipedia.org/wiki/Web_browser"
bctl scroll down 1000
bctl ss page.png

Handle dialogs

bctl dialog accept              # Set up handler BEFORE triggering
bctl click "#delete-button"     # This triggers a confirm() dialog

Drag and drop

bctl drag ".task-card" ".done-column"
bctl drag ".range-slider" --dx 50 --dy 0

Use in shell scripts

# Extract all image URLs from a page
bctl go "https://example.com"
bctl eval "JSON.stringify(Array.from(document.images).map(i=>i.src))"

# Wait for SPA content to load
bctl go "https://app.example.com/dashboard"
bctl wait ".dashboard-loaded" 15
bctl text ".metric-value"

AI Agent Integration

browser-ctl ships with a SKILL.md file designed for AI coding assistants. Install it for your tool:

bctl setup cursor       # Install skill for Cursor IDE
bctl setup opencode     # Install skill for OpenCode
bctl setup /path/to/dir # Install to custom directory

Once installed, AI agents can use bctl commands to automate browser tasks on your behalf.

Output Format

All commands return JSON to stdout:

// Success
{"success": true, "data": {"url": "https://example.com", "title": "Example"}}

// Error
{"success": false, "error": "Element not found: .missing"}

Commands exit with a non-zero code on error, so they work naturally with set -e and && chains.
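
Because every reply uses the same envelope, a thin wrapper can turn failures into exceptions. The following is a sketch in Python, assuming error replies are also written to stdout as described above. `BctlError`, `parse_reply`, and `run_bctl` are hypothetical names and are not shipped with browser-ctl.

```python
import json
import subprocess


class BctlError(RuntimeError):
    """Raised when a bctl reply carries "success": false."""


def parse_reply(stdout: str):
    """Decode one bctl JSON reply and return its data, or raise BctlError."""
    reply = json.loads(stdout)
    if not reply.get("success"):
        raise BctlError(reply.get("error", "unknown error"))
    return reply.get("data")


def run_bctl(*args: str):
    """Run `bctl <args>` and return the decoded data field.

    check=False: the error detail is read from the JSON body,
    not from the exit code.
    """
    proc = subprocess.run(
        ["bctl", *args], capture_output=True, text=True, check=False
    )
    return parse_reply(proc.stdout)


# Parsing the success example from above; no browser is needed for this part.
ok = parse_reply(
    '{"success": true, "data": {"url": "https://example.com", "title": "Example"}}'
)
print(ok["title"])
```

With bctl installed and the extension connected, `run_bctl("status")` would return the current page's URL and title as a dictionary.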

Architecture

┌─────────────────────────────────────────────────┐
│  Terminal                                       │
│  $ bctl click "button.submit"                   │
│       │                                         │
│       ▼ HTTP POST localhost:19876/command       │
│  ┌─────────────────────┐                        │
│  │   Bridge Server     │ (Python, aiohttp)      │
│  │   :19876            │                        │
│  └────────┬────────────┘                        │
│           │ WebSocket                           │
│           ▼                                     │
│  ┌─────────────────────┐                        │
│  │  Chrome Extension   │ (Manifest V3)          │
│  │  Service Worker     │                        │
│  └────────┬────────────┘                        │
│           │ chrome.scripting / chrome.debugger  │
│           ▼                                     │
│  ┌─────────────────────┐                        │
│  │  Web Page           │                        │
│  └─────────────────────┘                        │
└─────────────────────────────────────────────────┘
  • CLI → stdlib only, communicates via HTTP
  • Bridge Server → async relay (aiohttp), auto-daemonizes
  • Extension → MV3 service worker, auto-reconnects via chrome.alarms
  • Eval → dual strategy: MAIN-world injection (fast) with CDP fallback (CSP-safe)
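
To make the first hop of the diagram concrete, here is a sketch in Python of how a stdlib-only client could build the HTTP request. The host, port, and path come from the diagram and the Privacy section. The body layout (`action` plus `params`) is purely an assumption for illustration, since the real wire format is not documented here, and `build_command_request` is a hypothetical name.

```python
import json
import urllib.request

# Port and path as shown in the architecture diagram.
BRIDGE_URL = "http://127.0.0.1:19876/command"


def build_command_request(action: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for the bridge server.

    ASSUMPTION: the {"action": ..., "params": ...} body is invented for
    this example and may not match the actual protocol.
    """
    body = json.dumps({"action": action, "params": params}).encode("utf-8")
    return urllib.request.Request(
        BRIDGE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_command_request("click", selector="button.submit")
print(req.get_method(), req.full_url)
```

The request is only constructed here, never sent. In practice the bctl CLI remains the supported entry point, and it also takes care of starting the bridge server.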

Requirements

  • Python >= 3.11
  • Chrome / Chromium with the extension loaded
  • macOS, Linux, or Windows

Privacy

All communication is local (127.0.0.1). No analytics, no telemetry, no external servers. See PRIVACY.md for the full privacy policy.

License

MIT
