Pydantic AI Toolsets for browser use

pai-browser-use

Pydantic AI Toolsets for browser automation using Chrome DevTools Protocol (CDP).

Inspired by browser-use, designed for Pydantic AI agents.

Features

  • Browser Automation Tools: Navigation, state inspection, interaction, and element queries
  • Multi-Modal Screenshots: Automatic image splitting for long pages with ToolReturn support
  • Type-Safe CDP Integration: Direct access to cdp-use API with full type hints
  • Fully Tested: Comprehensive test suite with Docker-based Chrome container
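
To illustrate the idea behind splitting long-page screenshots, here is a minimal sketch of how a tall capture could be divided into vertical segments. The function name `split_ranges` and the `max_segment_height` parameter are hypothetical and are not part of the pai-browser-use API; the library's actual splitting logic may differ.

```python
def split_ranges(total_height: int, max_segment_height: int) -> list[tuple[int, int]]:
    """Return (top, bottom) pixel ranges that together cover a tall screenshot.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    if total_height < 0 or max_segment_height <= 0:
        raise ValueError("total_height must be >= 0 and max_segment_height must be > 0")
    return [
        # The last segment is clipped so it never extends past the page bottom.
        (top, min(top + max_segment_height, total_height))
        for top in range(0, total_height, max_segment_height)
    ]


# A 2500 px tall page with a 1000 px limit yields three segments.
print(split_ranges(2500, 1000))  # [(0, 1000), (1000, 2000), (2000, 2500)]
```

Each range could then be cropped from the full image and returned as a separate image part.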

Installation

Use pip:

pip install pai-browser-use

Or use uv:

uv add pai-browser-use

Quick Start

Prerequisites

Start a Chrome instance with CDP enabled:

# Option 1: Using Chrome directly
google-chrome --remote-debugging-port=9222

# Option 2: Using Docker container
./dev/start-browser-container.sh
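
Before wiring up an agent, it can help to confirm that Chrome is actually listening. The sketch below uses only the Python standard library and relies on Chrome's `/json/version` endpoint, which returns a JSON document containing a `webSocketDebuggerUrl` field. The helper names `fetch_version` and `websocket_url_from_version` are illustrative and are not part of pai-browser-use.

```python
import json
from urllib.request import urlopen


def websocket_url_from_version(payload: str) -> str:
    """Extract the browser-level WebSocket URL from a /json/version response body."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def fetch_version(cdp_url: str = "http://localhost:9222/json/version",
                  timeout: float = 5.0) -> str:
    """Fetch the raw /json/version body; raises URLError if Chrome is not reachable."""
    with urlopen(cdp_url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (requires a running Chrome with --remote-debugging-port=9222):
#   print(websocket_url_from_version(fetch_version()))
```

If the call raises a connection error, Chrome is most likely not running or is using a different debugging port.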

Basic Usage

import asyncio

from pydantic_ai import Agent
from pai_browser_use import BrowserUseToolset

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful assistant.",
    toolsets=[
        BrowserUseToolset(cdp_url="http://localhost:9222/json/version"),
    ],
)


async def main():
    # agent.run is a coroutine, so it has to be awaited inside an event loop
    result = await agent.run("Find the number of stars of the wh1isper/pai-browser-use repo")
    print(result.output)


asyncio.run(main())

See examples/agent.py for a complete example.

Available Tools

Navigation (4 tools)

  • navigate_to_url, go_back, go_forward, reload_page

State Inspection (5 tools)

  • get_page_info, get_page_content, take_screenshot, take_element_screenshot, get_viewport_info

Interaction (4 tools)

  • click_element, type_text, execute_javascript, scroll_to

Query (3 tools)

  • find_elements, get_element_text, get_element_attributes

Logging

The project emits detailed INFO-level logs for debugging and development. By default, only ERROR-level logs are shown.

Enable Detailed Logging

# Set log level via environment variable
export PAI_BROWSER_USE_LOG_LEVEL=INFO

# Then run your script
python your_script.py

Available Log Levels

  • ERROR (default): Only show errors
  • WARNING: Show warnings and errors
  • INFO: Show detailed operation flow
  • DEBUG: Show all debugging information including actual data
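
As a rough sketch of how an environment variable like this can be mapped onto the standard `logging` levels, consider the following. This is an assumption about the general approach, not the package's actual implementation; `resolve_log_level` is a hypothetical helper.

```python
import logging
import os

# The four levels documented above, mapped to the stdlib logging constants.
LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def resolve_log_level(env=None, default: str = "ERROR") -> int:
    """Read PAI_BROWSER_USE_LOG_LEVEL and fall back to the default when unset or unknown.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    name = env.get("PAI_BROWSER_USE_LOG_LEVEL", default).strip().upper()
    return LEVELS.get(name, LEVELS[default])


print(resolve_log_level({"PAI_BROWSER_USE_LOG_LEVEL": "info"}))  # 20
```

Falling back to ERROR for unrecognized values keeps a typo in the variable from silently enabling verbose output.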

What Gets Logged

INFO level shows:

  • CDP connection establishment
  • Browser target creation/reuse
  • Tool execution lifecycle
  • Page navigation steps
  • Screenshot capture and processing
  • Element interactions and queries
  • Session state updates
  • Resource cleanup

DEBUG level additionally shows:

  • Extracted text content (first 500 characters)
  • HTML content preview (first 500 characters)
  • Element details (tag, text, attributes, position)
  • JavaScript execution results
  • Tool call arguments and return types
  • Full page information (URL, title, viewport)
  • All intermediate data for debugging
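
The 500-character previews mentioned above can be pictured with a small truncation helper. This is only a sketch of the idea; the name `preview` and the trailing `...` marker are assumptions, and the library's real log format is not specified here.

```python
def preview(text: str, limit: int = 500) -> str:
    """Return at most `limit` characters of text, marking truncation with '...'.

    Hypothetical helper for illustration only.
    """
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


# Short strings pass through unchanged; long ones are cut at the limit.
print(preview("short"))          # short
print(len(preview("a" * 600)))   # 503
```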

Examples

INFO Level - See operation flow:

import asyncio
import os

# Set the log level before importing the package
os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "INFO"

from pai_browser_use import BrowserUseToolset


async def main():
    async with BrowserUseToolset(cdp_url="http://localhost:9222/json/version") as toolset:
        # INFO logs show which operations are being performed
        pass


asyncio.run(main())

DEBUG Level - See actual data:

import asyncio
import os

# Set the log level before importing the package
os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "DEBUG"

from pai_browser_use.tools.state import get_page_content


async def main():
    # DEBUG logs include the actual text/HTML extracted
    content = await get_page_content(content_format="text")
    # DEBUG log shows: "Text content preview (first 500 chars): ..."


asyncio.run(main())

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run example
python examples/agent.py

# Try DEBUG logging demo (shows extracted content)
PAI_BROWSER_USE_LOG_LEVEL=DEBUG python demo_debug_logging.py

License

BSD 3-Clause License - see LICENSE for details.
