AI-friendly browser automation via CDP with profile-based login persistence

harness-browser

AI-friendly browser automation via Chrome DevTools Protocol (CDP).

An agent-first browser runtime built on pure CDP. Predictable DOM snapshots, persistent profile sessions, and a typed Python API designed for LLM tool-calling — no Playwright, no driver layer in between.

Why harness-browser

Concern	How we address it
Token cost	4-level DOM with `interactive` mode (~200–500 tokens) returns only clickable/typeable elements with stable refs — not raw HTML
Stable element targeting	Refs (`btn_2`, `inp_search`) survive layout reflows and are auto-invalidated on navigation, so the agent never points at a stale node
Login persistence	One Chrome user-data-dir per profile under `~/.harness-browser/profiles/<name>/` — log in once, every subsequent run reuses cookies and storage
No Playwright tax	Pure CDP over WebSocket (`websockets>=12.0`); no browser binaries shipped, no patched Chromium, no driver layer
Observability	Every action emits `ActionMetrics` (`duration_ms`, `dom_nodes_scanned`, `estimated_tokens`, `screenshot_size_kb`) plus `before_action` / `after_action` / `action_error` / `page_navigated` hooks
Configuration	Ten `BROWSER_USE_*` env vars cover paths, ports, timeouts, launch mode, and remote/Docker Chrome via `BROWSER_USE_CDP_WS_URL`, so nothing in code changes between dev, CI, and prod
Agent integrations	Stateless `browser_tool(action=..., profile=...)` for any framework, MCP server (`python -m harness_browser.mcp_server`), and a ready-to-copy Claude Code skill in `skills/`

Features

Pure CDP — direct WebSocket connection, no Playwright dependency
Profile-based login persistence — Chrome user-data-dir per profile, cookies/sessions reused across runs
4-level DOM output — minimal (~50 tokens), interactive (~200–500 tokens), full (~1000–3000 tokens), structured (JSON)
Ref system — stable element references across actions, invalidated on navigation
Hook system — before_action, after_action, action_error, page_navigated
Per-action metrics — duration_ms, estimated_tokens, screenshot_size_kb
Environment-variable configuration — all paths, ports, and timeouts configurable without code changes
Remote/Docker Chrome support — bypass launcher via BROWSER_USE_CDP_WS_URL
MCP Server — expose actions as MCP tools for Claude Code and other MCP clients
Drop-in Claude Code skill — copy skills/harness-browser/ into any agent project
Strict typing — mypy strict, ruff clean, 34 unit tests covering DOM, refs, hooks, settings, and CDP framing

Requirements

Python 3.11+
Chrome or Chromium

# Ubuntu (on Debian the package is named chromium)
sudo apt install chromium-browser

# macOS
brew install --cask google-chrome

Installation

pip install harness-browser

Quick Start

Python API

import asyncio
from harness_browser import BrowserSession

async def main():
    async with await BrowserSession.create(profile="default") as sess:
        await sess.navigate("https://example.com")
        result = await sess.dom_tree(level="interactive")
        print(result.content)
        # → [ref=inp_1] input[text] placeholder="Search"
        # → [ref=btn_2] button "Go"
        await sess.click(ref="btn_2")

asyncio.run(main())
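
An agent usually needs the refs from the snapshot as data rather than as printed text. The sketch below parses lines shaped like the two sample lines in the comments above. The line format is an assumption inferred from that sample only, and `parse_refs` is a hypothetical helper, not part of the harness-browser API.

```python
import re

# Assumed line shape, taken from the two sample lines above:
#   [ref=<id>] <description>
REF_LINE = re.compile(r"^\[ref=(?P<ref>[A-Za-z0-9_]+)\]\s+(?P<desc>.+)$")


def parse_refs(snapshot: str) -> dict[str, str]:
    """Map each ref id to the rest of its line; lines without a ref are skipped."""
    refs: dict[str, str] = {}
    for line in snapshot.splitlines():
        match = REF_LINE.match(line.strip())
        if match:
            refs[match["ref"]] = match["desc"]
    return refs


sample = '[ref=inp_1] input[text] placeholder="Search"\n[ref=btn_2] button "Go"'
refs = parse_refs(sample)
```

If the real snapshot carries more fields per line, the `structured` level is likely the safer source for programmatic use.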

AI Framework Usage (stateless)

import asyncio
from harness_browser import browser_tool

async def main():
    # All calls route to the same session by profile name
    result = await browser_tool(action="navigate", url="https://github.com", profile="work")
    result = await browser_tool(action="dom_tree", level="interactive", profile="work")
    result = await browser_tool(action="click", ref="btn_search", profile="work")
    # No ref given: per the Actions Reference, ref is optional for type
    result = await browser_tool(action="type", text="harness", profile="work")

asyncio.run(main())

DOM Levels

Level	Tokens	Use case
`minimal`	~50	Confirm page loaded, check title/URL
`interactive`	~200–500	Find clickable/typeable elements (default)
`full`	~1000–3000	Read page content
`structured`	varies	JSON for programmatic processing
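
When an agent works under a token budget, the table above can drive the choice of level. The following is a small sketch that picks the most detailed level whose upper estimate fits. `richest_level_within` is a hypothetical helper, and the numbers are the upper ends of the ranges in the table, not measured values.

```python
# Upper token estimates copied from the table above; "structured" is omitted
# because its size varies with the page.
LEVEL_MAX_TOKENS = {"minimal": 50, "interactive": 500, "full": 3000}


def richest_level_within(budget: int) -> str:
    """Return the most detailed level whose upper estimate fits the budget."""
    fitting = [name for name, cost in LEVEL_MAX_TOKENS.items() if cost <= budget]
    # dicts keep insertion order, so the last fitting entry is the richest one.
    return fitting[-1] if fitting else "minimal"
```

The result could then be passed as the `level` argument of `dom_tree`.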

Login State Reuse

Profiles persist Chrome sessions in ~/.harness-browser/profiles/<name>/:

# First run: navigate to login page, log in manually
await browser_tool(action="navigate", url="https://github.com/login", profile="github")

# All future runs: login state reused automatically
await browser_tool(action="navigate", url="https://github.com/settings", profile="github")
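
Login reuse works because Chrome stores cookies and site storage inside its user-data-dir, so pointing every run of a profile at the same directory restores the session. The sketch below builds the relevant Chrome flags for one profile. `chrome_args` is a hypothetical helper; the three flags are standard Chrome switches, but the exact command line harness-browser uses is not shown in this README.

```python
from pathlib import PurePosixPath


def chrome_args(profile: str, profiles_dir: PurePosixPath, port: int, headless: bool) -> list[str]:
    """Build Chrome flags that pin one user-data-dir and one debug port to a profile."""
    args = [
        f"--user-data-dir={profiles_dir / profile}",
        f"--remote-debugging-port={port}",
    ]
    if headless:
        args.append("--headless=new")
    return args


args = chrome_args("github", PurePosixPath("/data/browser-profiles"), 9222, headless=True)
```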

Hook System

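import asyncio
from harness_browser import BrowserSession

async def main():
    async with await BrowserSession.create(profile="work") as sess:
        # Runs before every action; the event is indexed like a dict
        @sess.on("before_action")
        async def log_action(event):
            print(f"[{event['action']}] starting")

        # Runs after every action with that action's metrics
        @sess.on("after_action")
        async def log_metrics(metrics):
            print(f"  done in {metrics.duration_ms}ms (~{metrics.estimated_tokens} tokens)")

        await sess.navigate("https://example.com")

asyncio.run(main())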
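
The decorator form `@sess.on("before_action")` is a common registration pattern: `on` returns a function that stores the handler and hands it back unchanged. The following is a minimal sketch of such an event bus. `HookBus` is a hypothetical class written for illustration, and whether harness-browser runs handlers sequentially or concurrently is an assumption here.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable

Handler = Callable[[Any], Awaitable[None]]


class HookBus:
    """Hypothetical event bus with the same decorator shape as `sess.on(...)`."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event: str) -> Callable[[Handler], Handler]:
        def register(handler: Handler) -> Handler:
            self._handlers[event].append(handler)
            return handler  # returned unchanged so the function stays callable

        return register

    async def emit(self, event: str, payload: Any) -> None:
        # Handlers run in registration order, one at a time.
        for handler in self._handlers[event]:
            await handler(payload)


bus = HookBus()
seen: list[str] = []


@bus.on("before_action")
async def record(event: dict[str, str]) -> None:
    seen.append(event["action"])


asyncio.run(bus.emit("before_action", {"action": "navigate"}))
```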

MCP Server

python -m harness_browser.mcp_server

Add to Claude Code settings.json:

{
  "mcpServers": {
    "harness-browser": {
      "command": "python",
      "args": ["-m", "harness_browser.mcp_server"],
      "env": {
        "BROWSER_USE_MODE": "auto",
        "BROWSER_USE_PROFILES_DIR": "/data/browser-profiles"
      }
    }
  }
}

All BROWSER_USE_* environment variables can be passed through the MCP env block — this is the recommended way to configure mode, profile location, and remote CDP endpoints for an MCP-hosted browser.

Available MCP tools: browser_navigate, browser_dom_tree, browser_screenshot, browser_click, browser_type, browser_eval_js.

Screenshots

screenshot writes a PNG to disk and returns its path — never raw base64. That keeps token usage flat regardless of image size and lets dashboards preview the file directly.

# default: timestamped file in BROWSER_USE_SCREENSHOTS_DIR
result = await sess.screenshot()
print(result.content)
# → /home/user/.harness-browser/screenshots/harness-1779462725763.png

# full scrollable page (uses Page.getLayoutMetrics + captureBeyondViewport)
await sess.screenshot(full_page=True)

# crop to a single element discovered via dom_tree
await sess.screenshot(element_ref="btn_2")

# pin the file path — every call overwrites the same file
await sess.screenshot(path="/tmp/latest.png")

result.metadata carries the page url / title / width / height / size_kb / full_page so callers can render context without an extra Runtime.evaluate.
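
The `full_page=True` comment above names two CDP pieces: `Page.getLayoutMetrics`, which reports the size of the whole scrollable document, and the `captureBeyondViewport` flag of `Page.captureScreenshot`. One plausible way to combine them is sketched below as a pure function that turns the first call's result into the second call's parameters. `full_page_capture_params` is a hypothetical helper, and the sample metrics are made up for illustration.

```python
import math
from typing import Any


def full_page_capture_params(layout_metrics: dict[str, Any]) -> dict[str, Any]:
    """Turn a Page.getLayoutMetrics result into Page.captureScreenshot params."""
    size = layout_metrics["cssContentSize"]  # full scrollable area in CSS pixels
    return {
        "format": "png",
        "captureBeyondViewport": True,
        "clip": {
            "x": 0,
            "y": 0,
            "width": math.ceil(size["width"]),
            "height": math.ceil(size["height"]),
            "scale": 1,
        },
    }


params = full_page_capture_params(
    {"cssContentSize": {"x": 0, "y": 0, "width": 1280, "height": 4210.5}}
)
```

Rounding up keeps the last partial pixel row inside the clip rather than cutting it off.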

Claude Code Skill

A ready-to-use skill ships under skills/:

# Copy into another agent project as a skill. The path below targets CodeBuddy;
# Claude Code normally reads project skills from .claude/skills/ instead.
cp -r skills/harness-browser /path/to/other-project/.codebuddy/skills/
# or the Chinese variant
cp -r skills/harness-browser-zh /path/to/other-project/.codebuddy/skills/

The skill teaches the agent the standard navigate → dom_tree → click/type loop and the ref discipline (always re-fetch DOM after navigation).
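
The ref discipline follows from how refs are scoped: a ref names a node in the current document, so a navigation makes every earlier ref meaningless. The sketch below models that rule with a registry that is cleared on navigation and raises on stale lookups. `RefRegistry` and `StaleRefError` are hypothetical names, and the `<prefix>_<counter>` naming merely mimics the refs shown in the Quick Start.

```python
class StaleRefError(LookupError):
    """Raised when a ref from an earlier page is used after navigation."""


class RefRegistry:
    """Hypothetical registry: hands out refs and drops them all on navigation."""

    def __init__(self) -> None:
        self._nodes: dict[str, int] = {}
        self._counter = 0

    def register(self, prefix: str, node_id: int) -> str:
        self._counter += 1
        ref = f"{prefix}_{self._counter}"
        self._nodes[ref] = node_id
        return ref

    def resolve(self, ref: str) -> int:
        try:
            return self._nodes[ref]
        except KeyError:
            raise StaleRefError(f"{ref} is unknown or stale; re-fetch the DOM") from None

    def on_navigate(self) -> None:
        # A new document means every old node id is meaningless.
        self._nodes.clear()
        self._counter = 0


registry = RefRegistry()
inp = registry.register("inp", node_id=41)
btn = registry.register("btn", node_id=42)
```

Failing loudly on a stale ref is what lets an agent recover: the error tells it to call `dom_tree` again instead of clicking the wrong element.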

Actions Reference

Action	Required	Optional
`navigate`	`url`
`dom_tree`		`level` (default: `interactive`)
`screenshot`		`element_ref`, `full_page`, `path`
`click`	one of: `ref`, `selector`, `x`+`y`
`type`	`text`	`ref`
`scroll`		`direction`, `amount`
`hover`	`ref`
`eval_js`	`expression`
`go_back`
`go_forward`
`reload`
`list_tabs`
`new_tab`		`url`
`switch_tab`	`tab_id`
`close_tab`		`tab_id`
`close_session`
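
A tool-calling wrapper can check the "Required" column before sending anything to the browser. The following is a sketch of such a check built from the table above. `missing_params` and `REQUIRED` are hypothetical names, and the library may validate differently or report errors in another shape.

```python
# Required parameters per action, transcribed from the table above.
# Actions not listed here have no required parameters.
REQUIRED: dict[str, tuple[str, ...]] = {
    "navigate": ("url",),
    "type": ("text",),
    "hover": ("ref",),
    "eval_js": ("expression",),
    "switch_tab": ("tab_id",),
}


def missing_params(action: str, **params: object) -> list[str]:
    """Return the names of required parameters that were not supplied."""
    if action == "click":
        # click accepts any one of three ways to name its target.
        has_target = (
            "ref" in params
            or "selector" in params
            or {"x", "y"}.issubset(params)
        )
        return [] if has_target else ["ref | selector | x+y"]
    return [name for name in REQUIRED.get(action, ()) if name not in params]
```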

Configuration

All settings can be configured via environment variables. No code changes required.

Environment Variable	Default	Description
`BROWSER_USE_PROFILES_DIR`	`~/.harness-browser/profiles`	Root directory for Chrome user-data-dirs
`BROWSER_USE_SCREENSHOTS_DIR`	`~/.harness-browser/screenshots`	Directory where the `screenshot` action writes PNG files
`BROWSER_USE_CDP_HOST`	`localhost`	Host or IP serving Chrome's CDP HTTP/WebSocket endpoint
`BROWSER_USE_CDP_PORT_START`	`9222`	First CDP debug port assigned to profiles
`BROWSER_USE_MODE`	`auto`	Launch mode: `auto` / `headed` / `headless`. `auto` picks headed when `DISPLAY`/`WAYLAND_DISPLAY` is set (or on macOS/Windows), else headless
`BROWSER_USE_CHROME_BIN`	auto-detect	Absolute path to Chrome/Chromium executable
`BROWSER_USE_CDP_TIMEOUT`	`30.0`	Seconds to wait for a CDP command response
`BROWSER_USE_LAUNCH_RETRIES`	`20`	Number of times to poll Chrome for readiness after launch
`BROWSER_USE_LAUNCH_DELAY`	`0.25`	Seconds between launch poll attempts
`BROWSER_USE_CDP_WS_URL`	—	Direct connect: bypass launcher, connect to this WebSocket URL
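
Reading the table as code can help when reasoning about defaults. The sketch below loads a subset of these variables into a frozen dataclass and derives the launch wait budget. `EnvSettings` is a hypothetical class, not the library's `HarnessSettings`, and the budget assumes the launcher simply sleeps `LAUNCH_DELAY` between each of `LAUNCH_RETRIES` polls.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical subset of the settings, not the library's HarnessSettings."""

    cdp_host: str = "localhost"
    cdp_port_start: int = 9222
    cdp_timeout: float = 30.0
    launch_retries: int = 20
    launch_delay: float = 0.25

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "EnvSettings":
        """Read each variable if present, otherwise keep the table's default."""
        d = cls()
        return cls(
            cdp_host=env.get("BROWSER_USE_CDP_HOST", d.cdp_host),
            cdp_port_start=int(env.get("BROWSER_USE_CDP_PORT_START", d.cdp_port_start)),
            cdp_timeout=float(env.get("BROWSER_USE_CDP_TIMEOUT", d.cdp_timeout)),
            launch_retries=int(env.get("BROWSER_USE_LAUNCH_RETRIES", d.launch_retries)),
            launch_delay=float(env.get("BROWSER_USE_LAUNCH_DELAY", d.launch_delay)),
        )

    @property
    def max_launch_wait(self) -> float:
        """Longest time spent polling a freshly launched Chrome, in seconds."""
        return self.launch_retries * self.launch_delay


cfg = EnvSettings.from_env({"BROWSER_USE_CDP_TIMEOUT": "60"})
```

With the defaults, 20 polls spaced 0.25 s apart give Chrome about 5 seconds to come up; slow CI machines may need a larger `BROWSER_USE_LAUNCH_RETRIES`.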

Common scenarios

Custom profile storage:

export BROWSER_USE_PROFILES_DIR=/data/browser-profiles

Force headed or headless mode (the default, auto, picks based on DISPLAY/WAYLAND_DISPLAY):

export BROWSER_USE_MODE=headless   # always headless (CI, containers)
export BROWSER_USE_MODE=headed     # always headed (force a window even without DISPLAY)
# unset / "auto" → headed when a desktop is detected, headless otherwise
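
The `auto` rule from the configuration table can be written as a small pure function. The following sketch follows the description given there: headed on macOS and Windows, headed on other systems when `DISPLAY` or `WAYLAND_DISPLAY` is set, headless otherwise. `resolve_mode` is a hypothetical helper, and the platform checks are an assumption about how "macOS/Windows" is detected.

```python
import sys
from typing import Mapping


def resolve_mode(mode: str, env: Mapping[str, str], platform: str = sys.platform) -> str:
    """Collapse auto/headed/headless into the concrete headed/headless choice."""
    if mode in ("headed", "headless"):
        return mode
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    if platform == "darwin" or platform.startswith("win"):
        return "headed"  # desktop OSes are assumed to have a window server
    has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return "headed" if has_display else "headless"
```

Passing the environment and platform as arguments keeps the rule testable without touching the real process environment.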

Non-standard Chrome path:

export BROWSER_USE_CHROME_BIN=/opt/google/chrome/chrome

Connect to a remote or Docker Chrome (bypasses launcher entirely):

# Start Chrome with --remote-debugging-port=9222 --remote-debugging-address=0.0.0.0
export BROWSER_USE_CDP_WS_URL="ws://remote-host:9222/devtools/browser/xxxxxxxx"

Talk to Chrome on another host or container (keeps the attach/launcher logic, just changes the host):

# Chrome already running with --remote-debugging-port=9222 --remote-debugging-address=0.0.0.0
export BROWSER_USE_CDP_HOST=10.0.0.42
# harness will hit http://10.0.0.42:9222/json/version and use the webSocketDebuggerUrl it returns
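
Chrome's `/json/version` endpoint returns a JSON object whose `webSocketDebuggerUrl` field is the browser-level WebSocket address, which is also the kind of value `BROWSER_USE_CDP_WS_URL` expects. The sketch below shows the lookup with the standard library only. The function names are hypothetical, and `discover` needs a reachable Chrome, so only the two pure helpers are exercised here.

```python
import json
from urllib.request import urlopen


def version_url(host: str, port: int) -> str:
    """Address of Chrome's version endpoint for a given debug host and port."""
    return f"http://{host}:{port}/json/version"


def browser_ws_url(version_payload: str) -> str:
    """Extract the browser-level WebSocket URL from a /json/version response body."""
    return json.loads(version_payload)["webSocketDebuggerUrl"]


def discover(host: str, port: int, timeout: float = 5.0) -> str:
    """Fetch /json/version from a running Chrome and return its WebSocket URL."""
    with urlopen(version_url(host, port), timeout=timeout) as resp:
        return browser_ws_url(resp.read().decode("utf-8"))


# Illustrative payload; "abc" stands in for the real browser id.
payload = '{"Browser": "Chrome", "webSocketDebuggerUrl": "ws://10.0.0.42:9222/devtools/browser/abc"}'
ws_url = browser_ws_url(payload)
```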

Override settings in code (useful for testing or multi-instance setups):

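import asyncio
from harness_browser import BrowserSession, HarnessSettings

cfg = HarnessSettings(
    cdp_port_start=9300,
    cdp_timeout=60.0,
    profiles_dir="/data/profiles",
)

async def main():
    # Settings passed here override the environment-derived defaults
    async with await BrowserSession.create(profile="work", settings=cfg) as sess:
        await sess.navigate("https://example.com")

asyncio.run(main())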

Development

# Clone
git clone https://git.woa.com/orcakit/browser-use.git
cd browser-use

# Install with dev extras
uv sync --extra dev

# Install pre-commit hooks
pre-commit install

# Run tests
make test

# Lint + type check
make lint

# Format
make format

# Build wheel
make build

Contributing

See CONTRIBUTING.md.

License

MIT

Release history

0.1.3

Jun 4, 2026

This version

0.1.2

May 24, 2026

0.1.1

May 24, 2026

Download files

Source Distribution

harness_browser-0.1.2.tar.gz (215.5 kB)

Uploaded May 24, 2026 Source

Built Distribution

harness_browser-0.1.2-py3-none-any.whl (37.4 kB)

Uploaded May 24, 2026 Python 3

File details

Details for the file harness_browser-0.1.2.tar.gz.

File metadata

Download URL: harness_browser-0.1.2.tar.gz
Upload date: May 24, 2026
Size: 215.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2

File hashes

Hashes for harness_browser-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`df022ce84700d255244b41e436a74d84422f0f3e80061c456d7ca354fff0f080`
MD5	`5fe1572966e73987d78d16cac35f6e42`
BLAKE2b-256	`5b93755840587be443866669abec092c0f94aedf7f26855bb9731176d37e1f99`

File details

Details for the file harness_browser-0.1.2-py3-none-any.whl.

File metadata

Download URL: harness_browser-0.1.2-py3-none-any.whl
Upload date: May 24, 2026
Size: 37.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.10.2

File hashes

Hashes for harness_browser-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5e179482b86abef545f082211fbf6c227f2882d9b78a554988be623cc6cd4944`
MD5	`488a64328637dd5e8ce2cfa369c98954`
BLAKE2b-256	`acb34ee302816b539302fd2861463563500da56efbdd8bc8240ea31b2846586c`
