Pydantic AI Toolsets for browser use

pai-browser-use

Pydantic AI Toolsets for browser automation using Chrome DevTools Protocol (CDP).

Inspired by browser-use, designed for Pydantic AI agents.

Features

  • Browser Automation Tools: Navigation, state inspection, interaction, and element queries
  • Multi-Modal Screenshots: Automatic image splitting for long pages with ToolReturn support
  • Type-Safe CDP Integration: Direct access to cdp-use API with full type hints
  • Fully Tested: Comprehensive test suite with Docker-based Chrome container
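The image splitting mentioned above can be pictured as cutting a tall page into vertical segments. The following is only a sketch of that idea, not the package's actual implementation: the function name `split_ranges` and the `max_segment_height` parameter are hypothetical and do not appear in the project.

```python
def split_ranges(total_height: int, max_segment_height: int) -> list[tuple[int, int]]:
    """Return (top, bottom) pixel ranges that cover a page of total_height.

    Hypothetical helper: every segment is at most max_segment_height tall,
    and the last segment holds whatever remains.
    """
    if max_segment_height <= 0:
        raise ValueError("max_segment_height must be positive")
    if total_height <= 0:
        return []
    return [
        (top, min(top + max_segment_height, total_height))
        for top in range(0, total_height, max_segment_height)
    ]
```

Each range could then be cropped from the full-page capture and returned as a separate image, so that no single image exceeds the height limit.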

Installation

Use pip:

pip install pai-browser-use

Or use uv:

uv add pai-browser-use

Quick Start

Prerequisites

Start a Chrome instance with CDP enabled:

# Option 1: Using Chrome directly
google-chrome --remote-debugging-port=9222

# Option 2: Using Docker container
./dev/start-browser-container.sh
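
Before running an agent, it can help to confirm that Chrome is actually listening. Chrome's remote debugging port serves a small JSON document at /json/version that includes a webSocketDebuggerUrl field. The sketch below uses only the standard library; the helper names `extract_ws_url` and `check_cdp` are illustrative and not part of pai-browser-use.

```python
import json
import urllib.request


def extract_ws_url(version_payload: str) -> str:
    """Return the browser-level WebSocket URL from a /json/version response body."""
    info = json.loads(version_payload)
    return info["webSocketDebuggerUrl"]


def check_cdp(endpoint: str = "http://localhost:9222/json/version", timeout: float = 3.0) -> str:
    """Fetch the version endpoint and return the WebSocket debugger URL.

    Raises urllib.error.URLError if nothing is listening on the port.
    """
    with urllib.request.urlopen(endpoint, timeout=timeout) as resp:
        return extract_ws_url(resp.read().decode("utf-8"))
```

Calling `check_cdp()` with Chrome running should return a `ws://` URL; a connection error usually means the browser was not started with `--remote-debugging-port=9222`.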

Basic Usage

import asyncio

from pydantic_ai import Agent
from pai_browser_use import BrowserUseToolset

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful assistant.",
    toolsets=[
        # Points at the Chrome instance started in Prerequisites
        BrowserUseToolset(cdp_url="http://localhost:9222/json/version"),
    ],
)


async def main() -> None:
    # agent.run is a coroutine, so it must be awaited inside an async function
    result = await agent.run("Find the number of stars of the wh1isper/pai-browser-use repo")
    print(result.output)


asyncio.run(main())

See examples/agent.py for a complete example.

Available Tools

Navigation (4 tools)

  • navigate_to_url, go_back, go_forward, reload_page

State Inspection (5 tools)

  • get_page_info, get_page_content, take_screenshot, take_element_screenshot, get_viewport_info

Interaction (4 tools)

  • click_element, type_text, execute_javascript, scroll_to

Query (3 tools)

  • find_elements, get_element_text, get_element_attributes
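
For quick reference in scripts or tests, the tool names listed above can be written down as a plain mapping. This is only a restatement of the list in this README; the package may not expose the tools in this shape, and the names `TOOLS_BY_CATEGORY` and `category_of` are hypothetical.

```python
from typing import Optional

# Tool names copied from the lists above, grouped by category
TOOLS_BY_CATEGORY: dict[str, list[str]] = {
    "navigation": ["navigate_to_url", "go_back", "go_forward", "reload_page"],
    "state_inspection": [
        "get_page_info",
        "get_page_content",
        "take_screenshot",
        "take_element_screenshot",
        "get_viewport_info",
    ],
    "interaction": ["click_element", "type_text", "execute_javascript", "scroll_to"],
    "query": ["find_elements", "get_element_text", "get_element_attributes"],
}


def category_of(tool_name: str) -> Optional[str]:
    """Return the category a tool belongs to, or None if the name is unknown."""
    for category, names in TOOLS_BY_CATEGORY.items():
        if tool_name in names:
            return category
    return None
```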

Logging

The project emits detailed logs for debugging and development. By default, only ERROR logs are shown; set the level to INFO or DEBUG to see more.

Enable Detailed Logging

# Set log level via environment variable
export PAI_BROWSER_USE_LOG_LEVEL=INFO

# Then run your script
python your_script.py

Available Log Levels

  • ERROR (default): Only show errors
  • WARNING: Show warnings and errors
  • INFO: Show detailed operation flow
  • DEBUG: Show all debugging information including actual data
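
One way to picture how the environment variable maps onto these levels is the standard-library sketch below. It is an assumption about the general approach, not the package's own code: the function name `resolve_log_level`, the case-insensitive matching, and the fallback to ERROR for unknown values are all illustrative choices.

```python
import logging
import os
from typing import Mapping, Optional

VALID_LEVELS = ("ERROR", "WARNING", "INFO", "DEBUG")


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Map PAI_BROWSER_USE_LOG_LEVEL to a logging level constant.

    Assumed behavior: missing or unrecognized values fall back to ERROR.
    """
    env = os.environ if env is None else env
    name = env.get("PAI_BROWSER_USE_LOG_LEVEL", "ERROR").strip().upper()
    if name not in VALID_LEVELS:
        name = "ERROR"
    return getattr(logging, name)
```

The returned value can be passed to `logging.basicConfig(level=...)` or to a specific logger's `setLevel`.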

What Gets Logged

INFO level shows:

  • CDP connection establishment
  • Browser target creation/reuse
  • Tool execution lifecycle
  • Page navigation steps
  • Screenshot capture and processing
  • Element interactions and queries
  • Session state updates
  • Resource cleanup

DEBUG level additionally shows:

  • Extracted text content (first 500 characters)
  • HTML content preview (first 500 characters)
  • Element details (tag, text, attributes, position)
  • JavaScript execution results
  • Tool call arguments and return types
  • Full page information (URL, title, viewport)
  • All intermediate data for debugging
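
The 500-character previews described above amount to truncating content before it is logged. A minimal sketch of that idea follows; the helper name `format_preview` is hypothetical, and the message text simply mirrors the DEBUG example shown in the Examples section rather than the package's real formatting code.

```python
def format_preview(text: str, limit: int = 500) -> str:
    """Build a log message that contains at most `limit` characters of `text`."""
    return f"Text content preview (first {limit} chars): {text[:limit]}"
```

Truncating keeps DEBUG output readable even when a page contains a large amount of text or HTML.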

Examples

INFO Level - See operation flow:

import asyncio
import os

os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "INFO"

from pai_browser_use import BrowserUseToolset


async def main() -> None:
    async with BrowserUseToolset(cdp_url="http://localhost:9222/json/version") as toolset:
        # INFO logs show which operations are being performed
        pass


asyncio.run(main())

DEBUG Level - See actual data:

import os
os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "DEBUG"

from pai_browser_use.tools.state import get_page_content

# Run inside an async function; the DEBUG log shows the actual text/HTML extracted
content = await get_page_content(content_format="text")
# DEBUG log shows: "Text content preview (first 500 chars): ..."

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run example
python examples/agent.py

# Try DEBUG logging demo (shows extracted content)
PAI_BROWSER_USE_LOG_LEVEL=DEBUG python demo_debug_logging.py

License

BSD 3-Clause License - see LICENSE for details.


