CLI tool for AI agents to observe and interact with Chrome via CDP
chrome-agent
A CLI tool that gives AI coding agents the ability to observe and interact with Chrome browsers.
Built as a replacement for browser MCP tools: it is faster, has lower token overhead, and supports something MCP tools can't do, namely multiple agents sharing the same browser instance.
Why this exists
AI coding agents need to see and interact with browsers -- to test their code, debug automation, inspect page state. The standard approach (browser MCP tools) uses a persistent server with protocol negotiation and verbose response formatting. chrome-agent takes a different approach: each command is a standalone CLI call that connects to Chrome via the DevTools Protocol, does one thing, and disconnects. No server, no session state, no bloat.
This also enables a workflow that MCP tools can't support: one process drives the browser (your automation code) while a separate agent observes the same browser to diagnose issues and improve the code.
Installation
uv tool install chrome-agent
playwright install chromium
Or add to a project:
uv add chrome-agent
uv run playwright install chromium
Two ways to use it
Drive mode -- you control the browser
Launch a browser and interact with it directly. This is the MCP replacement use case.
chrome-agent launch &
chrome-agent navigate "https://example.com"
chrome-agent text # Read page content
chrome-agent element "h1" # Inspect an element
chrome-agent fill "#search" "query" # Fill a form field
chrome-agent click "#submit" # Click a button
chrome-agent screenshot /tmp/page.png # Capture the screen
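The same drive-mode sequence can be scripted from Python, one CLI call per step. This is a minimal sketch under two assumptions: the browser has already been started with `chrome-agent launch`, and chrome-agent exits with a non-zero status when a command fails (the README does not state this). `STEPS`, `run_steps`, and the injectable `run` parameter are illustrative names, not part of the tool.

```python
import subprocess
from typing import Callable

# Steps mirror the drive-mode commands shown above.
STEPS: list[list[str]] = [
    ["navigate", "https://example.com"],
    ["fill", "#search", "query"],
    ["click", "#submit"],
    ["screenshot", "/tmp/page.png"],
]


def _call(argv: list[str]) -> int:
    """Run one CLI call and return its exit code."""
    return subprocess.run(argv).returncode


def run_steps(steps: list[list[str]], run: Callable[[list[str]], int] = _call) -> int:
    """Run each step as its own chrome-agent call.

    Returns how many steps succeeded before the first failure.
    Assumption: a non-zero exit code signals failure.
    """
    completed = 0
    for step in steps:
        if run(["chrome-agent", *step]) != 0:
            break
        completed += 1
    return completed
```

Calling `run_steps(STEPS)` returns 4 when every step succeeds. Passing a different `run` callable makes the sequence easy to dry-run without a browser.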
Attach mode -- observe a running browser
Your automation code launches a browser with --remote-debugging-port=9222. You connect to observe what the code is doing, diagnose failures, and figure out what to change.
chrome-agent status # Is the browser running?
chrome-agent url # Where is it?
chrome-agent element "#submit-btn" # Why can't the code click this?
chrome-agent eval "document.querySelectorAll('.error').length"
chrome-agent screenshot # What does it look like?
The feedback loop: write code -> run it -> observe the browser -> diagnose -> modify code -> repeat.
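The "observe" step of that loop can be gathered into one helper that an agent calls after the automation code fails. The sketch below uses only the read-only commands listed in this README and the documented `--port` option; the helper names `observe` and `_capture`, and the choice of which commands to bundle, are assumptions made for illustration.

```python
import subprocess
from typing import Callable


def _capture(argv: list[str]) -> str:
    """Run one CLI call and return whatever it printed to stdout."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.stdout.strip()


def observe(selector: str, port: int = 9222,
            run: Callable[[list[str]], str] = _capture) -> dict[str, str]:
    """Collect read-only observations about the page and one element."""
    base = ["chrome-agent", "--port", str(port)]
    checks = {
        "status": ["status"],             # is the browser reachable?
        "url": ["url"],                   # where is it?
        "element": ["element", selector], # visibility, dimensions, disabled state
        "matches": ["find", selector],    # how many elements match?
    }
    # Each observation is a separate, short-lived CLI call.
    return {name: run(base + args) for name, args in checks.items()}
```

A report for the failing selector from the example above would be `observe("#submit-btn")`; the resulting dictionary can be pasted into the agent's context before it modifies the automation code.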
Commands
chrome-agent [--port PORT] <command> [args...]
Check browser status
status Check if a browser is running on the CDP port
launch Launch a browser with CDP enabled. Options: [--fingerprint PATH] [--headless] [--no-pin-desktop]
help Print command reference
Observe (read-only, always safe)
url Print current URL and page title
screenshot [path] Save a screenshot (default: /tmp/cdp-screenshot.png)
snapshot Print the ARIA accessibility tree
text Print visible text content
html [selector] Print page HTML or a specific element's HTML
element <selector> Detailed element inspection (visibility, dimensions, attributes, position, disabled state)
find <selector> Count and list all matching elements
value <selector> Get an input element's current value
eval <code> Execute JavaScript and print the result
cookies List all cookies
tabs List all open tabs/pages
wait <target> Wait for a selector, milliseconds, or load state
Navigate
navigate <url> Go to a URL
back Browser back
forward Browser forward
reload Reload the page
Interact
click <selector> Click an element (JS fallback for hidden elements)
fill <selector> <val> Fill a form field (clears first)
type <selector> <txt> Type text character by character
press <key> Press a keyboard key (Enter, Escape, Tab, etc.)
select <sel> <value> Select a dropdown option
check <selector> Check a checkbox
uncheck <selector> Uncheck a checkbox
hover <selector> Hover over an element
scroll <target> Scroll to element, or scroll up/down
clickxy <x> <y> Click at page coordinates
close Close the current page
viewport <w> <h> Resize the viewport
For AI agents
The primary user of this tool is an AI coding agent, not a human. See INSTRUCTIONS.md for comprehensive agent instructions covering:
- Drive mode vs attach mode mental model
- Safety rules for shared browser access
- The development feedback loop
- When to observe vs intervene
- Command recipes for common tasks
- Failure modes and recovery
Include the contents of INSTRUCTIONS.md in your project's CLAUDE.md or agent instructions file.
Browser fingerprinting (optional)
For sites that detect automated browsers, launch with a fingerprint profile:
chrome-agent launch --fingerprint path/to/fingerprint.json
The fingerprint JSON overrides the browser's user agent, viewport, locale, timezone, and platform to match a real desktop browser:
{
  "userAgent": "Mozilla/5.0 (X11; Linux x86_64) ...",
  "platform": "Linux x86_64",
  "vendor": "Google Inc.",
  "language": "en-US",
  "timezone": "America/Chicago",
  "viewport": {"width": 1920, "height": 1080}
}
Without --fingerprint, the browser launches with default Chromium settings.
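A fingerprint profile can also be generated and sanity-checked from a script before launch. This sketch assumes the six keys in the example above are the expected set; the README does not say which keys are required, so `EXPECTED_KEYS`, `missing_keys`, and `write_profile` are illustrative only. The truncated user agent string is kept exactly as it appears in the example.

```python
import json
from pathlib import Path

# Keys taken from the example profile above (assumed, not confirmed, to be the full set).
EXPECTED_KEYS = {"userAgent", "platform", "vendor", "language", "timezone", "viewport"}

EXAMPLE = {
    "userAgent": "Mozilla/5.0 (X11; Linux x86_64) ...",
    "platform": "Linux x86_64",
    "vendor": "Google Inc.",
    "language": "en-US",
    "timezone": "America/Chicago",
    "viewport": {"width": 1920, "height": 1080},
}


def missing_keys(profile: dict) -> list[str]:
    """Return the expected keys that the profile lacks, sorted by name."""
    return sorted(EXPECTED_KEYS - profile.keys())


def write_profile(profile: dict, path: str) -> Path:
    """Write the profile as JSON after checking that no expected key is absent."""
    missing = missing_keys(profile)
    if missing:
        raise ValueError(f"profile is missing keys: {', '.join(missing)}")
    target = Path(path)
    target.write_text(json.dumps(profile, indent=2), encoding="utf-8")
    return target
```

After `write_profile(EXAMPLE, "fingerprint.json")`, the file can be passed to `chrome-agent launch --fingerprint fingerprint.json`.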
Requirements
- Python >= 3.11
- Playwright >= 1.50.0
- Chromium (installed via playwright install chromium)
- Linux with xdotool (optional, for virtual desktop pinning)
License
MIT