Pydantic AI Toolsets for browser use

pai-browser-use

Pydantic AI Toolsets for browser automation using Chrome DevTools Protocol (CDP).

Inspired by browser-use, designed for Pydantic AI agents.

Features

  • Browser Automation Tools: Navigation, state inspection, interaction, and element queries
  • Multi-Modal Screenshots: Automatic image splitting for long pages with ToolReturn support
  • Type-Safe CDP Integration: Direct access to cdp-use API with full type hints
  • Fully Tested: Comprehensive test suite with Docker-based Chrome container
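The screenshot feature splits long pages into several images. The package's own splitting logic is not shown in this README; as a rough sketch of the idea, assuming a hypothetical helper `split_ranges` and an arbitrary maximum segment height, the vertical crop ranges for a tall page could be computed like this:

```python
def split_ranges(total_height: int, max_segment: int) -> list[tuple[int, int]]:
    """Return (top, bottom) pixel ranges that cover a page of total_height.

    Hypothetical helper for illustration only; not part of pai-browser-use.
    """
    if total_height < 0 or max_segment <= 0:
        raise ValueError("total_height must be >= 0 and max_segment must be > 0")
    ranges = []
    top = 0
    while top < total_height:
        # The last segment may be shorter than max_segment
        bottom = min(top + max_segment, total_height)
        ranges.append((top, bottom))
        top = bottom
    return ranges


if __name__ == "__main__":
    # A 2500 px tall page split into segments of at most 1000 px
    print(split_ranges(2500, 1000))
```

Each range could then be used as a crop box, so that a model receives several moderately sized images instead of one very tall one.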

Installation

Use pip:

pip install pai-browser-use

Or use uv:

uv add pai-browser-use

Quick Start

Prerequisites

Start a Chrome instance with CDP enabled:

# Option 1: Using Chrome directly
google-chrome --remote-debugging-port=9222

# Option 2: Using Docker container
./dev/start-browser-container.sh
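Before wiring the toolset into an agent, it can help to confirm that Chrome is actually listening. The sketch below uses only the standard library and relies on Chrome's `/json/version` HTTP endpoint, which returns a JSON object containing a `webSocketDebuggerUrl` field. The function names `extract_ws_url` and `check_cdp` are illustrative and not part of pai-browser-use.

```python
import json
from urllib.request import urlopen

CDP_VERSION_URL = "http://localhost:9222/json/version"


def extract_ws_url(payload: str) -> str:
    """Pull the browser WebSocket debugger URL out of a /json/version response."""
    info = json.loads(payload)
    return info["webSocketDebuggerUrl"]


def check_cdp(url: str = CDP_VERSION_URL, timeout: float = 3.0) -> str:
    """Fetch the version endpoint and return the WebSocket debugger URL.

    Raises urllib.error.URLError (an OSError) if nothing is listening on the port.
    """
    with urlopen(url, timeout=timeout) as response:
        return extract_ws_url(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(check_cdp())
```

If the script prints a `ws://...` URL, the browser is reachable at the address used in the examples in this README.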

Basic Usage

import asyncio

from pydantic_ai import Agent
from pai_browser_use import BrowserUseToolset

agent = Agent(
    model="anthropic:claude-sonnet-4-5",
    system_prompt="You are a helpful assistant.",
    toolsets=[
        # Points at the Chrome instance started in Prerequisites
        BrowserUseToolset(cdp_url="http://localhost:9222/json/version"),
    ],
)


async def main() -> None:
    # agent.run is a coroutine, so it must be awaited inside an event loop
    result = await agent.run("Find the number of stars of the wh1isper/pai-browser-use repo")
    print(result.output)


asyncio.run(main())

See examples/agent.py for a complete example.

Available Tools

Navigation (4 tools)

  • navigate_to_url, go_back, go_forward, reload_page

State Inspection (5 tools)

  • get_page_info, get_page_content, take_screenshot, take_element_screenshot, get_viewport_info

Interaction (4 tools)

  • click_element, type_text, execute_javascript, scroll_to

Query (3 tools)

  • find_elements, get_element_text, get_element_attributes

Logging

The project emits detailed logs at INFO and DEBUG level for debugging and development. By default, only ERROR logs are shown.

Enable Detailed Logging

# Set log level via environment variable
export PAI_BROWSER_USE_LOG_LEVEL=INFO

# Then run your script
python your_script.py

Available Log Levels

  • ERROR (default): Only show errors
  • WARNING: Show warnings and errors
  • INFO: Show detailed operation flow
  • DEBUG: Show all debugging information including actual data
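As an illustration of how an environment variable like this is commonly mapped onto Python's `logging` levels, here is a minimal sketch. It is not the package's implementation: the helper `resolve_level`, the case-insensitive matching, and the fallback to ERROR for unknown values are all assumptions.

```python
import logging
import os

ENV_VAR = "PAI_BROWSER_USE_LOG_LEVEL"

# The four levels listed in this README, mapped to stdlib logging constants
LEVELS = {
    "ERROR": logging.ERROR,
    "WARNING": logging.WARNING,
    "INFO": logging.INFO,
    "DEBUG": logging.DEBUG,
}


def resolve_level(environ=None) -> int:
    """Return the logging level named in the environment, defaulting to ERROR."""
    env = os.environ if environ is None else environ
    name = env.get(ENV_VAR, "ERROR").strip().upper()
    # Assumption: unknown values fall back to the documented default
    return LEVELS.get(name, logging.ERROR)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_level())
    logging.getLogger(__name__).info("visible only at INFO or DEBUG")
```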

What Gets Logged

INFO level shows:

  • CDP connection establishment
  • Browser target creation/reuse
  • Tool execution lifecycle
  • Page navigation steps
  • Screenshot capture and processing
  • Element interactions and queries
  • Session state updates
  • Resource cleanup

DEBUG level additionally shows:

  • Extracted text content (first 500 characters)
  • HTML content preview (first 500 characters)
  • Element details (tag, text, attributes, position)
  • JavaScript execution results
  • Tool call arguments and return types
  • Full page information (URL, title, viewport)
  • All intermediate data for debugging
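The 500-character previews mentioned above keep DEBUG output readable when a page is large. A truncation helper along these lines would produce that kind of preview; the name `preview` and the trailing `...` marker are assumptions for illustration, not the package's code.

```python
def preview(text: str, limit: int = 500) -> str:
    """Return text unchanged if short, else its first `limit` characters plus '...'."""
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


if __name__ == "__main__":
    page_text = "lorem ipsum " * 100  # 1200 characters of sample content
    print(f"Text content preview (first 500 chars): {preview(page_text)}")
```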

Examples

INFO Level - See operation flow:

import asyncio
import os

os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "INFO"

from pai_browser_use import BrowserUseToolset


async def main() -> None:
    async with BrowserUseToolset(cdp_url="http://localhost:9222/json/version") as toolset:
        # Shows what operations are being performed
        pass


asyncio.run(main())

DEBUG Level - See actual data:

import asyncio
import os

os.environ["PAI_BROWSER_USE_LOG_LEVEL"] = "DEBUG"

from pai_browser_use.tools.state import get_page_content


async def main() -> None:
    # Will show the actual text/HTML extracted
    content = await get_page_content(content_format="text")
    # DEBUG log shows: "Text content preview (first 500 chars): ..."


asyncio.run(main())

Development

# Install dependencies
uv sync

# Run tests
pytest tests/

# Run example
python examples/agent.py

# Try DEBUG logging demo (shows extracted content)
PAI_BROWSER_USE_LOG_LEVEL=DEBUG python demo_debug_logging.py

License

BSD 3-Clause License - see LICENSE for details.
