Pydantic AI Toolsets for browser use

pai-browser-use

Pydantic AI Toolsets for browser automation using Chrome DevTools Protocol (CDP).

Inspired by browser-use, designed for Pydantic AI agents.

Features

  • Browser Automation Tools: Navigation, state inspection, interaction, and element queries
  • Multi-Modal Screenshots: Automatic image splitting for long pages with ToolReturn support
  • Type-Safe CDP Integration: Direct access to cdp-use API with full type hints
  • Fully Tested: Comprehensive test suite with Docker-based Chrome container
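
To make the screenshot-splitting feature concrete, here is a minimal sketch of how a tall page could be cut into vertical segments. The function name `split_ranges` and the `max_height` parameter are hypothetical and not part of pai-browser-use; the library's actual splitting logic may differ.

```python
def split_ranges(total_height: int, max_height: int) -> list[tuple[int, int]]:
    """Return (top, bottom) pixel ranges that cover a page of total_height.

    Hypothetical helper for illustration only; not the library's API.
    """
    if total_height < 0 or max_height <= 0:
        raise ValueError("total_height must be >= 0 and max_height must be > 0")
    ranges = []
    top = 0
    while top < total_height:
        # The last segment may be shorter than max_height.
        bottom = min(top + max_height, total_height)
        ranges.append((top, bottom))
        top = bottom
    return ranges
```

For example, `split_ranges(2500, 1000)` yields three segments, the last one 500 px tall. Each range could then be used as a crop box for one image in a multi-image tool result.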

Installation

Use pip:

pip install pai-browser-use

Or use uv:

uv add pai-browser-use

Quick Start

Prerequisites

Start a Chrome instance with CDP enabled:

# Option 1: Using Chrome directly
google-chrome --remote-debugging-port=9222

# Option 2: Using Docker container
./dev/start-browser-container.sh
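
Before wiring up an agent, it can help to confirm that Chrome is actually listening. The sketch below queries Chrome's `/json/version` HTTP endpoint, which returns a JSON object that includes a `webSocketDebuggerUrl` field. The helper names `parse_ws_url` and `fetch_ws_url` are illustrative and not part of pai-browser-use; the port is assumed to be 9222 as in the commands above.

```python
import json
import urllib.request


def parse_ws_url(payload: bytes) -> str:
    """Extract the browser WebSocket URL from a /json/version response body."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def fetch_ws_url(version_url: str = "http://localhost:9222/json/version") -> str:
    """Query a running Chrome instance.

    Raises urllib.error.URLError (an OSError) if nothing is listening.
    """
    with urllib.request.urlopen(version_url, timeout=5) as response:
        return parse_ws_url(response.read())
```

Calling `fetch_ws_url()` with Chrome running should return a `ws://` URL; a connection error suggests the browser was not started with `--remote-debugging-port`.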

Basic Usage

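import asyncio

from pydantic_ai import Agent
from pai_browser_use import BrowserUseToolset

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful assistant.",
    toolsets=[
        # Points at the CDP endpoint started in the Prerequisites step
        BrowserUseToolset(cdp_url="http://localhost:9222/json/version"),
    ],
)


async def main() -> None:
    # agent.run is a coroutine, so it must be awaited inside an async function
    result = await agent.run("Find the number of stars of the wh1isper/pai-browser-use repo")
    print(result.output)


asyncio.run(main())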

See examples/agent.py for a complete example.

Available Tools

Navigation (4 tools)

  • navigate_to_url, go_back, go_forward, reload_page

State Inspection (5 tools)

  • get_page_info, get_page_content, take_screenshot, take_element_screenshot, get_viewport_info

Interaction (4 tools)

  • click_element, type_text, execute_javascript, scroll_to

Query (3 tools)

  • find_elements, get_element_text, get_element_attributes
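
The tool names above can be kept in a small lookup table, for instance to mention them in a system prompt or to recognise them in logs. This is only a sketch: `TOOLS_BY_CATEGORY` and `describe_tools` are hypothetical names, and the toolset registers its tools with the agent on its own.

```python
# Tool names copied from the lists above, grouped by category.
TOOLS_BY_CATEGORY: dict[str, list[str]] = {
    "navigation": ["navigate_to_url", "go_back", "go_forward", "reload_page"],
    "state_inspection": [
        "get_page_info",
        "get_page_content",
        "take_screenshot",
        "take_element_screenshot",
        "get_viewport_info",
    ],
    "interaction": ["click_element", "type_text", "execute_javascript", "scroll_to"],
    "query": ["find_elements", "get_element_text", "get_element_attributes"],
}


def describe_tools() -> str:
    """Build a plain-text summary with one line per category."""
    lines = []
    for category, names in TOOLS_BY_CATEGORY.items():
        lines.append(f"{category} ({len(names)}): {', '.join(names)}")
    return "\n".join(lines)
```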

Logging

The project emits detailed INFO- and DEBUG-level logs for debugging and development. By default, only ERROR logs are shown.

Enable Detailed Logging

# Set log level via environment variable
export PAI_BROWSER_USE_LOG_LEVEL=INFO

# Then run your script
python your_script.py

Available Log Levels

  • ERROR (default): Only show errors
  • WARNING: Show warnings and errors
  • INFO: Show detailed operation flow
  • DEBUG: Show all debugging information including actual data
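
As a rough illustration of how an environment variable like this can be mapped onto Python's standard `logging` levels, consider the sketch below. It is not the library's implementation: `resolve_log_level` is a hypothetical helper, and the case-insensitive matching and the fallback to ERROR for unknown values are assumptions.

```python
import logging
import os
from typing import Mapping, Optional

ENV_VAR = "PAI_BROWSER_USE_LOG_LEVEL"

# The four levels listed above, mapped to the standard logging constants.
_LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def resolve_log_level(environ: Optional[Mapping[str, str]] = None) -> int:
    """Return the logging level named in the environment, defaulting to ERROR."""
    environ = os.environ if environ is None else environ
    # Assumption: names are matched case-insensitively.
    name = environ.get(ENV_VAR, "ERROR").strip().upper()
    # Assumption: unrecognised values fall back to the default.
    return _LEVELS.get(name, logging.ERROR)
```

A script could pass the result to `logging.basicConfig(level=resolve_log_level())` to mirror the same level in its own loggers.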

What Gets Logged

INFO level shows:

  • CDP connection establishment
  • Browser target creation/reuse
  • Tool execution lifecycle
  • Page navigation steps
  • Screenshot capture and processing
  • Element interactions and queries
  • Session state updates
  • Resource cleanup

DEBUG level additionally shows:

  • Extracted text content (first 500 characters)
  • HTML content preview (first 500 characters)
  • Element details (tag, text, attributes, position)
  • JavaScript execution results
  • Tool call arguments and return types
  • Full page information (URL, title, viewport)
  • All intermediate data for debugging
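
The 500-character previews mentioned above keep DEBUG output readable even for large pages. A minimal sketch of that kind of truncation follows; `preview` and `log_preview` are hypothetical helpers, and the library may build its previews differently.

```python
import logging


def preview(text: str, limit: int = 500) -> str:
    """Return at most `limit` leading characters of text."""
    return text[:limit]


def log_preview(logger: logging.Logger, content: str, limit: int = 500) -> None:
    """Emit a DEBUG record containing only the truncated content."""
    logger.debug("Text content preview (first %d chars): %s", limit, preview(content, limit))
```

Using `%`-style arguments rather than an f-string defers string formatting until a handler actually emits the record.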

Examples

INFO Level - See operation flow:

import asyncio
import os

os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "INFO"

from pai_browser_use import BrowserUseToolset


async def main() -> None:
    async with BrowserUseToolset(cdp_url="http://localhost:9222/json/version") as toolset:
        # INFO logs show which operations are being performed
        pass


asyncio.run(main())

DEBUG Level - See actual data:

import os
os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "DEBUG"

from pai_browser_use.tools.state import get_page_content

# Must run inside an async function; DEBUG logs will show the actual text/HTML extracted
content = await get_page_content(content_format="text")
# DEBUG log shows: "Text content preview (first 500 chars): ..."

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run example
python examples/agent.py

# Try DEBUG logging demo (shows extracted content)
PAI_BROWSER_USE_LOG_LEVEL=DEBUG python demo_debug_logging.py

License

BSD 3-Clause License - see LICENSE for details.
