AI-friendly browser automation via CDP with profile-based login persistence
Harness Browser
AI-friendly browser automation via Chrome DevTools Protocol (CDP).
Highlights
- Pure CDP, Zero Playwright: Direct WebSocket connection to Chrome DevTools, no browser binaries shipped, no driver layer
- Token-Efficient DOM: 4-level output from ~50 to ~3000 tokens; `interactive` mode returns only clickable/typeable elements with stable refs
- Login Persistence: One Chrome user-data-dir per profile; log in once, every subsequent run reuses cookies and storage
- Agent-First API: Stateless `browser_tool()` function, MCP server, and Claude Code skill ready to drop into any agent project
- Full Observability: Per-action metrics (`duration_ms`, `estimated_tokens`, `screenshot_size_kb`) plus lifecycle hooks
Overview
Harness Browser is an agent-first browser runtime built on pure CDP. It provides predictable DOM snapshots, persistent profile sessions, and a typed Python API designed for LLM tool-calling. No Playwright, no Selenium — just a direct WebSocket connection to Chrome.
The library solves three problems for AI agents:
- Token cost — Multi-level DOM output keeps context windows lean
- Element stability — Ref-based targeting survives layout reflows
- Authentication — Profile persistence eliminates repeated logins
Core Technology
| Component | Technology | Purpose |
|---|---|---|
| Transport | WebSocket (`websockets>=12.0`) | Direct CDP communication with Chrome |
| DOM Engine | Custom tree walker | Multi-level DOM serialization with ref assignment |
| Configuration | Pydantic models | Typed settings with env-var override |
| MCP Server | `mcp>=1.0` | Expose browser actions as MCP tools |
| Profiles | Chrome user-data-dir | Persistent login state per named profile |
Features
- Pure CDP: Direct WebSocket connection, no Playwright dependency
- Profile-based login persistence: Chrome user-data-dir per profile, cookies/sessions reused across runs
- 4-level DOM output: `minimal` (~50 tokens), `interactive` (~200–500 tokens), `full` (~1000–3000 tokens), `structured` (JSON)
- Ref system: Stable element references across actions, invalidated on navigation
- Hook system: `before_action`, `after_action`, `action_error`, `page_navigated`
- Per-action metrics: `duration_ms`, `estimated_tokens`, `screenshot_size_kb`
- Environment-variable configuration: All paths, ports, and timeouts configurable without code changes
- Remote/Docker Chrome support: Bypass launcher via `BROWSER_USE_CDP_WS_URL`
- MCP Server: Expose actions as MCP tools for Claude Code and other MCP clients
- Drop-in Claude Code skill: Copy `skills/harness-browser/` into any agent project
- Strict typing: mypy strict, ruff clean, comprehensive test coverage
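The ref system can be pictured as a small lookup table that is wiped whenever the page changes. The sketch below is an illustration only: `RefRegistry`, its methods, and the integer node ids are assumptions made for this example and do not describe the library's internal implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RefRegistry:
    """Hypothetical ref table: maps short refs such as "btn_2" to node ids."""

    _nodes: dict[str, int] = field(default_factory=dict)
    _counter: int = 0

    def assign(self, prefix: str, node_id: int) -> str:
        # One shared counter, so refs read inp_1, btn_2, ... in assignment order.
        self._counter += 1
        ref = f"{prefix}_{self._counter}"
        self._nodes[ref] = node_id
        return ref

    def resolve(self, ref: str) -> int:
        if ref not in self._nodes:
            raise KeyError(f"stale or unknown ref: {ref}")
        return self._nodes[ref]

    def invalidate(self) -> None:
        # Called on navigation: old refs must not resolve against the new page.
        self._nodes.clear()
        self._counter = 0


refs = RefRegistry()
search_ref = refs.assign("inp", node_id=17)
go_ref = refs.assign("btn", node_id=42)
```

Raising on a stale ref, rather than silently resolving it, gives an agent a clear signal to request a fresh DOM snapshot after navigation.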
Quick Start
Requirements
- Python 3.11+
- Chrome or Chromium
# Ubuntu/Debian
sudo apt install chromium-browser
# macOS
brew install --cask google-chrome
Installation
pip install harness-browser
# Optional: download a Playwright-managed Chromium into the standard
# cache (~/.cache/ms-playwright/...) when you don't have a system Chrome.
# This installs `playwright` itself if it isn't already present, then
# fetches the browser binary. Set HARNESS_SKIP_PLAYWRIGHT_PIP=1 to skip
# the pip step (e.g. in pre-baked images).
harness-browser install-browser
Python API
import asyncio

from harness_browser import BrowserSession


async def main():
    async with await BrowserSession.create(profile="default") as sess:
        await sess.navigate("https://example.com")
        result = await sess.dom_tree(level="interactive")
        print(result.content)
        # → [ref=inp_1] input[text] placeholder="Search"
        # → [ref=btn_2] button "Go"
        await sess.click(ref="btn_2")


asyncio.run(main())
Stateless Tool Interface (for AI Frameworks)
import asyncio

from harness_browser import browser_tool


async def main():
    # All calls route to the same session by profile name
    result = await browser_tool(action="navigate", url="https://github.com", profile="work")
    result = await browser_tool(action="dom_tree", level="interactive", profile="work")
    result = await browser_tool(action="click", ref="btn_search", profile="work")
    result = await browser_tool(action="type", text="harness", profile="work")


asyncio.run(main())
CLI
Every action is also a shell command. Sequential calls on the same
--profile attach to the same Chrome process, so refs and login state
carry across invocations:
# Drive the browser
harness-browser navigate "https://example.com" --profile work
harness-browser dom-tree --profile work
# → [ref=inp_1] input[text] placeholder="Search"
# → [ref=btn_2] button "Go"
harness-browser click --ref inp_1 --profile work
harness-browser type "harness" --profile work
harness-browser click --ref btn_2 --profile work
harness-browser screenshot --path /tmp/result.png --profile work
# Tear down (Chrome itself stays running for later attach)
harness-browser close-session --profile work
Add --json to any command for the full structured ToolResult (useful in
scripts), and --auto / --headed / --headless to override the launch
mode on the first call per profile. Run harness-browser --help for the
full list of subcommands.
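For scripting, the same commands can be assembled in Python and handed to `subprocess.run()`. This is a small sketch that only builds the argument lists. The helper `harness_cmd` is hypothetical, and it uses only the subcommands and flags shown above (`--profile`, `--ref`, `--json`).

```python
import shlex


def harness_cmd(action: str, *args: str, profile: str, json_output: bool = False) -> list[str]:
    """Build the argv for one harness-browser call; pass it to subprocess.run()."""
    argv = ["harness-browser", action, *args, "--profile", profile]
    if json_output:
        argv.append("--json")
    return argv


steps = [
    harness_cmd("navigate", "https://example.com", profile="work"),
    harness_cmd("dom-tree", profile="work", json_output=True),
    harness_cmd("click", "--ref", "btn_2", profile="work"),
]

# Print each command as a copy-pasteable shell line.
for argv in steps:
    print(shlex.join(argv))
```

Passing a list rather than a shell string avoids quoting problems when URLs or typed text contain spaces or special characters.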
MCP Server
python -m harness_browser.mcp_server
Add to Claude Code settings.json:
{
  "mcpServers": {
    "harness-browser": {
      "command": "python",
      "args": ["-m", "harness_browser.mcp_server"],
      "env": {
        "BROWSER_USE_MODE": "auto",
        "BROWSER_USE_PROFILES_DIR": "/data/browser-profiles"
      }
    }
  }
}
Available MCP tools: browser_navigate, browser_dom_tree, browser_screenshot, browser_click, browser_type, browser_eval_js, install_browser.
DOM Levels
| Level | Tokens | Use Case |
|---|---|---|
| `minimal` | ~50 | Confirm page loaded, check title/URL |
| `interactive` | ~200–500 | Find clickable/typeable elements (default) |
| `full` | ~1000–3000 | Read page content |
| `structured` | varies | JSON for programmatic processing |
Login State Reuse
Profiles persist Chrome sessions in ~/.harness-browser/profiles/<name>/:
# First run: navigate to login page, log in manually
await browser_tool(action="navigate", url="https://github.com/login", profile="github")
# All future runs: login state reused automatically
await browser_tool(action="navigate", url="https://github.com/settings", profile="github")
Hook System
async with await BrowserSession.create(profile="work") as sess:

    @sess.on("before_action")
    async def log_action(event):
        print(f"[{event['action']}] starting")

    @sess.on("after_action")
    async def log_metrics(metrics):
        print(f" done in {metrics.duration_ms}ms (~{metrics.estimated_tokens} tokens)")

    await sess.navigate("https://example.com")
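The decorator style used above follows a common registry pattern. The sketch below shows how such an `on()` method could be built with plain asyncio. `HookRegistry` and `emit` are hypothetical names, and the real session may dispatch hooks differently.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable

Handler = Callable[[Any], Awaitable[None]]


class HookRegistry:
    """Hypothetical event registry with a decorator-style on() method."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event: str) -> Callable[[Handler], Handler]:
        def register(handler: Handler) -> Handler:
            self._handlers[event].append(handler)
            return handler  # returned unchanged so the function name stays usable

        return register

    async def emit(self, event: str, payload: Any) -> None:
        # Handlers run in registration order, one at a time.
        for handler in self._handlers[event]:
            await handler(payload)


hooks = HookRegistry()
seen: list[str] = []


@hooks.on("before_action")
async def record(event: dict[str, str]) -> None:
    seen.append(event["action"])


asyncio.run(hooks.emit("before_action", {"action": "navigate"}))
```

Awaiting handlers sequentially keeps log output ordered; a registry that runs them concurrently would trade that ordering for speed.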
Screenshots
screenshot writes a PNG to disk and returns its path — never raw base64. That keeps token usage flat regardless of image size.
# Default: timestamped file in BROWSER_USE_SCREENSHOTS_DIR
result = await sess.screenshot()
# → /home/user/.harness-browser/screenshots/harness-1779462725763.png
# Full scrollable page
await sess.screenshot(full_page=True)
# Crop to a single element
await sess.screenshot(element_ref="btn_2")
# Pin the file path
await sess.screenshot(path="/tmp/latest.png")
Configuration
All settings can be configured via environment variables — no code changes required.
| Environment Variable | Default | Description |
|---|---|---|
| `BROWSER_USE_PROFILES_DIR` | `~/.harness-browser/profiles` | Root directory for Chrome user-data-dirs |
| `BROWSER_USE_SCREENSHOTS_DIR` | `~/.harness-browser/screenshots` | Directory for PNG screenshots |
| `BROWSER_USE_CDP_HOST` | `localhost` | Host serving Chrome's CDP endpoint |
| `BROWSER_USE_CDP_PORT_START` | `9222` | First CDP debug port |
| `BROWSER_USE_MODE` | `auto` | Launch mode: `auto` / `headed` / `headless` |
| `BROWSER_USE_CHROME_BIN` | auto-detect | Absolute path to Chrome/Chromium |
| `BROWSER_USE_CDP_TIMEOUT` | `30.0` | CDP command timeout (seconds) |
| `BROWSER_USE_CDP_WS_URL` | unset | Direct WebSocket URL (bypasses launcher) |
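The override behaviour can be sketched with the standard library alone. The example below reads a subset of the variables from the table and falls back to the listed defaults. `Settings` and `load_settings` are hypothetical names, and a plain dataclass stands in for the Pydantic models the library uses.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    profiles_dir: Path
    cdp_host: str
    cdp_port_start: int
    mode: str
    cdp_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    mode = env.get("BROWSER_USE_MODE", "auto")
    if mode not in {"auto", "headed", "headless"}:
        raise ValueError(f"unsupported BROWSER_USE_MODE: {mode}")
    profiles = env.get("BROWSER_USE_PROFILES_DIR", "~/.harness-browser/profiles")
    return Settings(
        profiles_dir=Path(profiles).expanduser(),
        cdp_host=env.get("BROWSER_USE_CDP_HOST", "localhost"),
        cdp_port_start=int(env.get("BROWSER_USE_CDP_PORT_START", "9222")),
        mode=mode,
        cdp_timeout=float(env.get("BROWSER_USE_CDP_TIMEOUT", "30.0")),
    )


# Hypothetical usage with an explicit mapping instead of the real environment.
settings = load_settings(
    {
        "BROWSER_USE_MODE": "headless",
        "BROWSER_USE_CDP_PORT_START": "9333",
        "BROWSER_USE_PROFILES_DIR": "/data/browser-profiles",
    }
)
```

Accepting the environment as a parameter keeps the loader easy to test without mutating `os.environ`.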
Actions Reference
| Action | Required | Optional |
|---|---|---|
| `navigate` | `url` | |
| `dom_tree` | | `level` (default: `interactive`) |
| `screenshot` | | `element_ref`, `full_page`, `path` |
| `click` | one of: `ref`, `selector`, `x`+`y` | |
| `type` | `text` | `ref` |
| `scroll` | `direction`, `amount` | |
| `hover` | `ref` | |
| `eval_js` | `expression` | |
| `go_back` | | |
| `go_forward` | | |
| `reload` | | |
| `list_tabs` | | |
| `new_tab` | `url` | |
| `switch_tab` | `tab_id` | |
| `close_tab` | `tab_id` | |
| `close_session` | | |
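An agent framework may want to check arguments before dispatching an action. The sketch below validates a few actions whose requirements are stated in the table. `REQUIRED` and `missing_params` are hypothetical names and are not part of the harness-browser API.

```python
from typing import Any

# Required parameters for a few actions, taken from the table above.
REQUIRED = {"navigate": ["url"], "type": ["text"], "switch_tab": ["tab_id"]}


def missing_params(action: str, params: dict[str, Any]) -> list[str]:
    """Return a description of each required parameter absent from params."""
    if action == "click":
        # click needs a target: a ref, a selector, or both x and y coordinates.
        has_target = "ref" in params or "selector" in params or {"x", "y"} <= params.keys()
        return [] if has_target else ["one of: ref, selector, x+y"]
    # Actions not listed in REQUIRED are treated as having no required parameters.
    return [name for name in REQUIRED.get(action, []) if name not in params]
```

Returning a list of problems instead of raising lets the caller feed the message back to the model as a correctable tool error.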
Development
# Clone
git clone https://github.com/orcakit/harness-browser.git
cd harness-browser
# Install with dev extras
uv sync --extra dev
# Install pre-commit hooks
pre-commit install
# Run tests
make test
# Lint + type check
make lint
# Format
make format
# Build wheel
make build
Related Projects
| Project | Description |
|---|---|
| harness-agent | Production-grade AI agent platform built on LangChain Deep Agents |
| harness-memory | Pluggable memory system with hierarchical recall and FTS |
| harness-im-bridge | Multi-platform IM channel bridge for AI agents |
Contributing
See CONTRIBUTING.md.
License