AI-friendly browser automation via CDP with profile-based login persistence

Harness Browser

AI-friendly browser automation via Chrome DevTools Protocol (CDP).

Highlights

  • Pure CDP, Zero Playwright — Direct WebSocket connection to Chrome DevTools, no browser binaries shipped, no driver layer
  • Token-Efficient DOM — 4-level output from ~50 to ~3000 tokens; interactive mode returns only clickable/typeable elements with stable refs
  • Login Persistence — One Chrome user-data-dir per profile; log in once, every subsequent run reuses cookies and storage
  • Agent-First API — Stateless browser_tool() function, MCP server, and Claude Code skill ready to drop into any agent project
  • Full Observability — Per-action metrics (duration_ms, estimated_tokens, screenshot_size_kb) plus lifecycle hooks

Overview

Harness Browser is an agent-first browser runtime built on pure CDP. It provides predictable DOM snapshots, persistent profile sessions, and a typed Python API designed for LLM tool-calling. No Playwright, no Selenium — just a direct WebSocket connection to Chrome.

The library solves three problems for AI agents:

  1. Token cost — Multi-level DOM output keeps context windows lean
  2. Element stability — Ref-based targeting survives layout reflows
  3. Authentication — Profile persistence eliminates repeated logins

Core Technology

| Component | Technology | Purpose |
|---|---|---|
| Transport | WebSocket (websockets>=12.0) | Direct CDP communication with Chrome |
| DOM Engine | Custom tree walker | Multi-level DOM serialization with ref assignment |
| Configuration | Pydantic models | Typed settings with env-var override |
| MCP Server | mcp>=1.0 | Expose browser actions as MCP tools |
| Profiles | Chrome user-data-dir | Persistent login state per named profile |
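
To make the Transport row concrete: CDP traffic is JSON sent over a WebSocket, where each command carries an `id`, a `method`, and optional `params`, and the reply echoes the same `id`. The sketch below shows only that framing, using the standard library. `CdpFramer` is a hypothetical name and not part of harness_browser; the socket I/O itself is left out.

```python
import itertools
import json
from typing import Any, Optional


class CdpFramer:
    """Builds CDP command frames and matches replies by id (illustrative only)."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: dict = {}

    def command(self, method: str, params: Optional[dict] = None) -> str:
        """Serialize one command and remember which method the id belongs to."""
        msg_id = next(self._ids)
        self._pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params or {}})

    def resolve(self, raw: str) -> Optional[tuple]:
        """Return (method, result) for a reply, or None for an unsolicited event."""
        msg = json.loads(raw)
        if "id" not in msg:
            return None  # events carry a method but no id
        method = self._pending.pop(msg["id"])
        if "error" in msg:
            raise RuntimeError(f"{method} failed: {msg['error'].get('message')}")
        return method, msg.get("result", {})


framer = CdpFramer()
frame = framer.command("Page.navigate", {"url": "https://example.com"})
resolved = framer.resolve('{"id": 1, "result": {"frameId": "F1"}}')
event = framer.resolve('{"method": "Page.loadEventFired", "params": {"timestamp": 1.0}}')
```

In a real client, the string returned by `command` would be sent over the WebSocket, and every incoming message would be passed to `resolve`.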

Features

  • Pure CDP — Direct WebSocket connection, no Playwright dependency
  • Profile-based login persistence — Chrome user-data-dir per profile, cookies/sessions reused across runs
  • 4-level DOM output: minimal (~50 tokens), interactive (~200–500 tokens), full (~1000–3000 tokens), structured (JSON)
  • Ref system — Stable element references across actions, invalidated on navigation
  • Hook system: before_action, after_action, action_error, page_navigated
  • Per-action metrics: duration_ms, estimated_tokens, screenshot_size_kb
  • Environment-variable configuration — All paths, ports, and timeouts configurable without code changes
  • Remote/Docker Chrome support — Bypass launcher via BROWSER_USE_CDP_WS_URL
  • MCP Server — Expose actions as MCP tools for Claude Code and other MCP clients
  • Drop-in Claude Code skill — Copy skills/harness-browser/ into any agent project
  • Strict typing — mypy strict, ruff clean, comprehensive test coverage
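
As a rough illustration of the per-action metrics, the sketch below times an awaited action and estimates its token cost. It is not the library's implementation: `ActionMetrics`, `measure`, and `fake_dom_tree` are hypothetical names, and the four-characters-per-token heuristic is an assumption made here.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class ActionMetrics:
    action: str
    duration_ms: float
    estimated_tokens: int


async def measure(action: str, fn: Callable[[], Awaitable[str]]) -> tuple:
    """Run one action and return (content, metrics)."""
    start = time.perf_counter()
    content = await fn()
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Assumption: roughly 4 characters per token; the real estimator may differ.
    tokens = len(content) // 4
    return content, ActionMetrics(action, round(elapsed_ms, 2), tokens)


async def fake_dom_tree() -> str:
    """Stand-in for a browser call so the example runs without Chrome."""
    await asyncio.sleep(0.01)
    return '[ref=inp_1] input[text] placeholder="Search"\n[ref=btn_2] button "Go"'


content, metrics = asyncio.run(measure("dom_tree", fake_dom_tree))
print(metrics)
```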

Quick Start

Requirements

  • Python 3.11+
  • Chrome or Chromium

# Ubuntu (on Debian, the package is named chromium)
sudo apt install chromium-browser

# macOS
brew install --cask google-chrome

Installation

pip install harness-browser

# Optional: download a Playwright-managed Chromium into the standard
# cache (~/.cache/ms-playwright/...) when you don't have a system Chrome.
# This installs `playwright` itself if it isn't already present, then
# fetches the browser binary. Set HARNESS_SKIP_PLAYWRIGHT_PIP=1 to skip
# the pip step (e.g. in pre-baked images).
harness-browser install-browser

Python API

import asyncio
from harness_browser import BrowserSession

async def main():
    async with await BrowserSession.create(profile="default") as sess:
        await sess.navigate("https://example.com")
        result = await sess.dom_tree(level="interactive")
        print(result.content)
        # → [ref=inp_1] input[text] placeholder="Search"
        # → [ref=btn_2] button "Go"
        await sess.click(ref="btn_2")

asyncio.run(main())
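
An agent usually needs the refs from the interactive output as data rather than as text. The parser below is a minimal sketch based only on the two sample lines shown above; the real output may contain fields this pattern does not cover, and `Element` and `parse_interactive` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# [ref=<id>] <tag>[<type>] <free-form description>
LINE_RE = re.compile(
    r"^\[ref=(?P<ref>[^\]]+)\]\s+(?P<tag>[A-Za-z0-9]+)"
    r"(?:\[(?P<type>[^\]]+)\])?\s*(?P<rest>.*)$"
)


@dataclass
class Element:
    ref: str
    tag: str
    input_type: Optional[str]
    description: str


def parse_interactive(content: str) -> list:
    """Turn interactive-level lines into Element records, skipping unknown lines."""
    elements = []
    for line in content.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            elements.append(
                Element(match["ref"], match["tag"], match["type"], match["rest"])
            )
    return elements


sample = '[ref=inp_1] input[text] placeholder="Search"\n[ref=btn_2] button "Go"'
elements = parse_interactive(sample)
buttons = [e.ref for e in elements if e.tag == "button"]
```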

Stateless Tool Interface (for AI Frameworks)

import asyncio
from harness_browser import browser_tool

async def main():
    # All calls route to the same session by profile name
    result = await browser_tool(action="navigate", url="https://github.com", profile="work")
    result = await browser_tool(action="dom_tree", level="interactive", profile="work")
    result = await browser_tool(action="click", ref="btn_search", profile="work")
    result = await browser_tool(action="type", text="harness", profile="work")

asyncio.run(main())

CLI

Every action is also a shell command. Sequential calls on the same --profile attach to the same Chrome process, so refs and login state carry across invocations:

# Drive the browser
harness-browser navigate "https://example.com" --profile work
harness-browser dom-tree --profile work
# → [ref=inp_1] input[text] placeholder="Search"
# → [ref=btn_2] button "Go"
harness-browser click --ref inp_1 --profile work
harness-browser type "harness" --profile work
harness-browser click --ref btn_2 --profile work
harness-browser screenshot --path /tmp/result.png --profile work

# Tear down (Chrome itself stays running for later attach)
harness-browser close-session --profile work

Add --json to any command for the full structured ToolResult (useful in scripts), and --auto / --headed / --headless to override the launch mode on the first call per profile. Run harness-browser --help for the full list of subcommands.
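
When calling the CLI from a script, it can help to build the argument list in one place. The helper below is a sketch that uses only the flags shown above (`--profile`, `--json`, `--ref`); `cli_args` is a hypothetical name, and the shape of the JSON output is not documented here, so no parsing is attempted.

```python
import shlex


def cli_args(action: str, *positional: str, profile: str,
             json_output: bool = True, **options: str) -> list:
    """Build an argv list such as: harness-browser click --ref btn_2 --profile work."""
    args = ["harness-browser", action, *positional]
    for name, value in options.items():
        # Python keyword names use underscores; CLI flags use dashes.
        args += [f"--{name.replace('_', '-')}", value]
    args += ["--profile", profile]
    if json_output:
        args.append("--json")
    return args


nav = cli_args("navigate", "https://example.com", profile="work")
click = cli_args("click", profile="work", ref="btn_2", json_output=False)
print(shlex.join(nav))
```

With harness-browser installed, the list can be passed to `subprocess.run(nav, capture_output=True, text=True, check=True)`.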

MCP Server

python -m harness_browser.mcp_server

Add to Claude Code settings.json:

{
  "mcpServers": {
    "harness-browser": {
      "command": "python",
      "args": ["-m", "harness_browser.mcp_server"],
      "env": {
        "BROWSER_USE_MODE": "auto",
        "BROWSER_USE_PROFILES_DIR": "/data/browser-profiles"
      }
    }
  }
}

Available MCP tools: browser_navigate, browser_dom_tree, browser_screenshot, browser_click, browser_type, browser_eval_js, install_browser.


DOM Levels

| Level | Tokens | Use Case |
|---|---|---|
| minimal | ~50 | Confirm page loaded, check title/URL |
| interactive | ~200–500 | Find clickable/typeable elements (default) |
| full | ~1000–3000 | Read page content |
| structured | varies | JSON for programmatic processing |
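
One way an agent could use this table is to pick a level from its remaining token budget. The sketch below takes the approximate upper bounds from the table; `pick_level` and its policy are assumptions for illustration, not something the library provides.

```python
# Approximate upper bounds per level, taken from the table above.
LEVEL_UPPER_BOUND = {"minimal": 50, "interactive": 500, "full": 3000}


def pick_level(token_budget: int, want_page_text: bool = False) -> str:
    """Choose the richest level that is likely to fit in the budget."""
    if want_page_text and token_budget >= LEVEL_UPPER_BOUND["full"]:
        return "full"
    if token_budget >= LEVEL_UPPER_BOUND["interactive"]:
        return "interactive"
    return "minimal"


choices = [
    pick_level(4000, want_page_text=True),
    pick_level(4000),
    pick_level(1000, want_page_text=True),
    pick_level(100),
]
```

The result can be passed as the `level` argument of `dom_tree`.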

Login State Reuse

Profiles persist Chrome sessions in ~/.harness-browser/profiles/<name>/:

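# Run these calls inside an async function (for example via asyncio.run)
from harness_browser import browser_tool

# First run: navigate to the login page, then log in manually in the browser window
await browser_tool(action="navigate", url="https://github.com/login", profile="github")

# All future runs: login state is reused automatically
await browser_tool(action="navigate", url="https://github.com/settings", profile="github")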
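
Persistence of this kind rests on Chrome's own `--user-data-dir` switch: pointing each profile at its own directory keeps its cookies and storage separate. The sketch below assembles such a launch command. `profile_dir` and `chrome_command` are hypothetical helpers; the switches are standard Chrome ones, but the library's launcher may pass a different set.

```python
from pathlib import Path
from typing import Optional


def profile_dir(name: str, root: Optional[Path] = None) -> Path:
    """Map a profile name to its user-data-dir, rejecting path tricks."""
    root = root or Path.home() / ".harness-browser" / "profiles"
    if not name or "/" in name or "\\" in name or name in {".", ".."}:
        raise ValueError(f"invalid profile name: {name!r}")
    return root / name


def chrome_command(chrome_bin: str, profile: str, port: int = 9222,
                   headless: bool = False, root: Optional[Path] = None) -> list:
    """Build the argv for a Chrome process bound to one profile and one CDP port."""
    cmd = [
        chrome_bin,
        f"--user-data-dir={profile_dir(profile, root)}",
        f"--remote-debugging-port={port}",
    ]
    if headless:
        cmd.append("--headless=new")
    return cmd


cmd = chrome_command(
    "/usr/bin/chromium", "github",
    root=Path("/data/browser-profiles"), headless=True,
)
```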

Hook System

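import asyncio
from harness_browser import BrowserSession

async def main():
    async with await BrowserSession.create(profile="work") as sess:
        # Runs before every action; the event carries the action name
        @sess.on("before_action")
        async def log_action(event):
            print(f"[{event['action']}] starting")

        # Runs after every action with the metrics collected for it
        @sess.on("after_action")
        async def log_metrics(metrics):
            print(f"  done in {metrics.duration_ms}ms (~{metrics.estimated_tokens} tokens)")

        await sess.navigate("https://example.com")

asyncio.run(main())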
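
For readers curious how a decorator-based hook API like this can work, here is a minimal registry sketch. It mirrors the `on(...)` decorator shape used above but is not the library's code; `Hooks`, `emit`, and `run_action` are hypothetical names.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable

Handler = Callable[[Any], Awaitable[None]]


class Hooks:
    """Stores async handlers per event name and calls them in registration order."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def on(self, event: str) -> Callable[[Handler], Handler]:
        def register(fn: Handler) -> Handler:
            self._handlers[event].append(fn)
            return fn  # return the function unchanged so it stays callable
        return register

    async def emit(self, event: str, payload: Any) -> None:
        for fn in self._handlers.get(event, []):
            await fn(payload)


hooks = Hooks()
log = []


@hooks.on("before_action")
async def record_start(event: dict) -> None:
    log.append(f"start:{event['action']}")


@hooks.on("after_action")
async def record_done(event: dict) -> None:
    log.append(f"done:{event['action']}")


async def run_action(action: str) -> None:
    await hooks.emit("before_action", {"action": action})
    await asyncio.sleep(0)  # placeholder for the real browser work
    await hooks.emit("after_action", {"action": action})


asyncio.run(run_action("navigate"))
```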

Screenshots

screenshot writes a PNG to disk and returns its path — never raw base64. That keeps token usage flat regardless of image size.

# Default: timestamped file in BROWSER_USE_SCREENSHOTS_DIR
result = await sess.screenshot()
# → /home/user/.harness-browser/screenshots/harness-1779462725763.png

# Full scrollable page
await sess.screenshot(full_page=True)

# Crop to a single element
await sess.screenshot(element_ref="btn_2")

# Pin the file path
await sess.screenshot(path="/tmp/latest.png")
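
The default file name in the sample output looks like `harness-<epoch milliseconds>.png`. The sketch below reproduces that naming and adds a cheap check that a returned file really is a PNG. Both helpers are hypothetical, and the naming rule is inferred from a single example.

```python
import time
from pathlib import Path
from typing import Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def default_screenshot_path(directory: Path, now_ms: Optional[int] = None) -> Path:
    """Build a timestamped file name inside the screenshots directory."""
    stamp = int(time.time() * 1000) if now_ms is None else now_ms
    return directory / f"harness-{stamp}.png"


def is_png(data: bytes) -> bool:
    """Check the file signature without decoding the image."""
    return data.startswith(PNG_SIGNATURE)


path = default_screenshot_path(Path("/tmp"), now_ms=1779462725763)
looks_valid = is_png(PNG_SIGNATURE + b"rest-of-file")
```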

Configuration

All settings can be configured via environment variables — no code changes required.

| Environment Variable | Default | Description |
|---|---|---|
| BROWSER_USE_PROFILES_DIR | ~/.harness-browser/profiles | Root directory for Chrome user-data-dirs |
| BROWSER_USE_SCREENSHOTS_DIR | ~/.harness-browser/screenshots | Directory for PNG screenshots |
| BROWSER_USE_CDP_HOST | localhost | Host serving Chrome's CDP endpoint |
| BROWSER_USE_CDP_PORT_START | 9222 | First CDP debug port |
| BROWSER_USE_MODE | auto | Launch mode: auto / headed / headless |
| BROWSER_USE_CHROME_BIN | auto-detect | Absolute path to Chrome/Chromium |
| BROWSER_USE_CDP_TIMEOUT | 30.0 | CDP command timeout (seconds) |
| BROWSER_USE_CDP_WS_URL | (none) | Direct WebSocket URL (bypasses launcher) |
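
The table can be read as a typed settings object with environment overrides. The library is described as using Pydantic models for this; the sketch below uses a standard-library dataclass instead so that it runs without dependencies. `Settings` and `load_settings` are hypothetical names, and only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

MODES = {"auto", "headed", "headless"}


@dataclass(frozen=True)
class Settings:
    profiles_dir: Path
    screenshots_dir: Path
    cdp_host: str
    cdp_port_start: int
    mode: str
    chrome_bin: Optional[str]   # None means auto-detect
    cdp_timeout: float
    cdp_ws_url: Optional[str]   # None means launch Chrome locally


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from env, falling back to the documented default."""
    base = Path.home() / ".harness-browser"
    mode = env.get("BROWSER_USE_MODE", "auto")
    if mode not in MODES:
        raise ValueError(f"BROWSER_USE_MODE must be one of {sorted(MODES)}, got {mode!r}")
    return Settings(
        profiles_dir=Path(env.get("BROWSER_USE_PROFILES_DIR", str(base / "profiles"))).expanduser(),
        screenshots_dir=Path(env.get("BROWSER_USE_SCREENSHOTS_DIR", str(base / "screenshots"))).expanduser(),
        cdp_host=env.get("BROWSER_USE_CDP_HOST", "localhost"),
        cdp_port_start=int(env.get("BROWSER_USE_CDP_PORT_START", "9222")),
        mode=mode,
        chrome_bin=env.get("BROWSER_USE_CHROME_BIN") or None,
        cdp_timeout=float(env.get("BROWSER_USE_CDP_TIMEOUT", "30.0")),
        cdp_ws_url=env.get("BROWSER_USE_CDP_WS_URL") or None,
    )


settings = load_settings({"BROWSER_USE_MODE": "headless", "BROWSER_USE_CDP_PORT_START": "9333"})
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.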

Actions Reference

Action Required Optional
navigate url
dom_tree level (default: interactive)
screenshot element_ref, full_page, path
click one of: ref, selector, x+y
type text ref
scroll direction, amount
hover ref
eval_js expression
go_back
go_forward
reload
list_tabs
new_tab url
switch_tab tab_id
close_tab tab_id
close_session
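
The `click` row accepts "one of: ref, selector, x+y". A caller can check that before sending the action, as in the sketch below. It reads "one of" as exactly one, which may be stricter than the library itself, and `click_target` is a hypothetical helper.

```python
from typing import Optional


def click_target(ref: Optional[str] = None, selector: Optional[str] = None,
                 x: Optional[int] = None, y: Optional[int] = None) -> tuple:
    """Return (kind, value) for the single target given, or raise ValueError."""
    if (x is None) != (y is None):
        raise ValueError("x and y must be given together")
    given = []
    if ref is not None:
        given.append(("ref", ref))
    if selector is not None:
        given.append(("selector", selector))
    if x is not None:
        given.append(("point", (x, y)))
    if len(given) != 1:
        raise ValueError("click needs exactly one of: ref, selector, x+y")
    return given[0]


by_ref = click_target(ref="btn_2")
by_point = click_target(x=10, y=20)
```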

Development

# Clone
git clone https://github.com/orcakit/harness-browser.git
cd harness-browser

# Install with dev extras
uv sync --extra dev

# Install pre-commit hooks
pre-commit install

# Run tests
make test

# Lint + type check
make lint

# Format
make format

# Build wheel
make build

Related Projects

| Project | Description |
|---|---|
| harness-agent | Production-grade AI agent platform built on LangChain Deep Agents |
| harness-memory | Pluggable memory system with hierarchical recall and FTS |
| harness-im-bridge | Multi-platform IM channel bridge for AI agents |

Contributing

See CONTRIBUTING.md.

License

MIT
