Python SDK for Owl Browser automation - async-first with dynamic OpenAPI method generation

Owl Browser Python SDK v2

Async-first Python SDK for Owl Browser automation with dynamic OpenAPI method generation and flow execution support.

Features

  • Dynamic Method Generation: Methods are automatically generated from the OpenAPI schema
  • Async-First Design: Built with asyncio for optimal performance
  • Sync Wrappers: Convenience methods for non-async code
  • Flow Execution: Execute test flows with variable resolution and expectations
  • Type Safety: Full type hints with Python 3.12+ features
  • Connection Pooling: Efficient HTTP connection management
  • Retry Logic: Automatic retries with exponential backoff

Installation

pip install owl-browser

For development:

pip install "owl-browser[dev]"

Quick Start

Connection Modes

The SDK supports two connection modes depending on your deployment:

from owl_browser import OwlBrowser, RemoteConfig

# Production (via nginx proxy) - this is the default
# Uses /api prefix: https://your-domain.com/api/execute/...
config = RemoteConfig(
    url="https://your-domain.com",
    token="your-token"
)

# Development (direct to http-server on port 8080)
# No prefix: http://localhost:8080/execute/...
config = RemoteConfig(
    url="http://localhost:8080",
    token="test-token",
    api_prefix=""  # Empty string for direct connection
)
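To make the difference between the two modes concrete, the sketch below joins a base URL and an API prefix the way the comments above describe. `execute_base` is a hypothetical helper written for illustration, not part of the SDK, and the SDK's internal URL handling may differ.

```python
def execute_base(url: str, api_prefix: str = "/api") -> str:
    """Return the base URL under which tool calls are made (hypothetical helper)."""
    # Drop a trailing slash so the join never produces "//".
    return f"{url.rstrip('/')}{api_prefix}/execute/"


# Production: default "/api" prefix behind the nginx proxy.
print(execute_base("https://your-domain.com"))
# Development: empty prefix for a direct connection.
print(execute_base("http://localhost:8080", api_prefix=""))
```

If requests against a local server fail with a 404, a missing `api_prefix=""` is a likely first thing to check.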

Async Usage (Recommended)

import asyncio
from owl_browser import OwlBrowser, RemoteConfig

async def main():
    config = RemoteConfig(
        url="https://your-domain.com",
        token="your-secret-token"
    )

    async with OwlBrowser(config) as browser:
        # Create a browser context
        ctx = await browser.create_context()
        context_id = ctx["context_id"]

        # Navigate to a page
        await browser.navigate(context_id=context_id, url="https://example.com")

        # Click an element
        await browser.click(context_id=context_id, selector="button#submit")

        # Take a screenshot
        screenshot = await browser.screenshot(context_id=context_id)

        # Extract text content
        text = await browser.extract_text(context_id=context_id, selector="h1")
        print(f"Page title: {text}")

        # Close the context
        await browser.close_context(context_id=context_id)

asyncio.run(main())

Sync Usage

from owl_browser import OwlBrowser, RemoteConfig

config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token",
    api_prefix=""  # Direct connection to http-server, no /api prefix
)

browser = OwlBrowser(config)
browser.connect_sync()

# Execute tools synchronously
ctx = browser.execute_sync("browser_create_context")
browser.execute_sync("browser_navigate", context_id=ctx["context_id"], url="https://example.com")
browser.execute_sync("browser_close_context", context_id=ctx["context_id"])

browser.close_sync()

Authentication

Bearer Token

config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token",
    api_prefix=""  # Direct connection to http-server, no /api prefix
)

JWT Authentication

from owl_browser import RemoteConfig, AuthMode, JWTConfig

config = RemoteConfig(
    url="http://localhost:8080",
    api_prefix="",  # Direct connection to http-server, no /api prefix
    auth_mode=AuthMode.JWT,
    jwt=JWTConfig(
        private_key_path="/path/to/private.pem",
        expires_in=3600,  # 1 hour
        refresh_threshold=300,  # Refresh 5 minutes before expiry
        issuer="my-app",
        subject="user-123"
    )
)

Flow Execution

Execute test flows from JSON files (compatible with Owl Browser frontend format):

from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.flow import FlowExecutor

async def run_flow():
    async with OwlBrowser(RemoteConfig(...)) as browser:
        ctx = await browser.create_context()
        executor = FlowExecutor(browser, ctx["context_id"])

        # Load and execute a flow
        flow = FlowExecutor.load_flow("test-flows/navigation.json")
        result = await executor.execute(flow)

        if result.success:
            print(f"Flow completed in {result.total_duration_ms:.0f}ms")
            for step in result.steps:
                print(f"  [{step.step_index}] {step.tool_name}: {'OK' if step.success else 'FAIL'}")
        else:
            print(f"Flow failed: {result.error}")

        await browser.close_context(context_id=ctx["context_id"])

Flow JSON Format

{
  "name": "Navigation Test",
  "description": "Test navigation tools",
  "steps": [
    {
      "type": "browser_navigate",
      "url": "https://example.com",
      "selected": true,
      "description": "Navigate to example.com"
    },
    {
      "type": "browser_extract_text",
      "selector": "h1",
      "selected": true,
      "expected": {
        "contains": "Example"
      }
    }
  ]
}
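One way to read this format is that `type` names the tool and the remaining keys are its parameters. The sketch below splits a step along those lines. Treating `selected`, `description`, and `expected` as metadata is an assumption made for illustration, and `step_to_call` is a hypothetical helper, not an SDK function.

```python
import json

# Keys assumed to describe the step rather than feed the tool (an assumption).
METADATA_KEYS = {"type", "selected", "description", "expected"}


def step_to_call(step: dict) -> tuple[str, dict]:
    """Split a flow step into (tool_name, params)."""
    params = {k: v for k, v in step.items() if k not in METADATA_KEYS}
    return step["type"], params


flow = json.loads("""
{
  "name": "Navigation Test",
  "steps": [
    {"type": "browser_navigate", "url": "https://example.com", "selected": true},
    {"type": "browser_extract_text", "selector": "h1",
     "expected": {"contains": "Example"}}
  ]
}
""")

for step in flow["steps"]:
    print(step_to_call(step))
```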

Variable Resolution

Use ${prev} to reference the previous step's result. A dotted path such as ${prev.url} selects a field of that result:

{
  "steps": [
    {
      "type": "browser_get_page_info",
      "description": "Get page info"
    },
    {
      "type": "browser_navigate",
      "url": "${prev.url}/about",
      "description": "Navigate to about page"
    }
  ]
}
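The substitution can be sketched in a few lines of standalone Python. This is an illustration of the idea only: the `resolve` function and its regular expression are hypothetical, and the SDK's resolver may support more syntax or handle missing fields differently.

```python
import re

# Matches ${prev} and ${prev.some.path}; the path is captured without "prev.".
PLACEHOLDER = re.compile(r"\$\{prev(?:\.([A-Za-z0-9_.]+))?\}")


def resolve(value: str, prev: object) -> str:
    """Replace ${prev} / ${prev.a.b} in `value` using the previous result."""
    def lookup(match: re.Match) -> str:
        current = prev
        path = match.group(1)
        for key in (path.split(".") if path else []):
            current = current[key]  # raises KeyError if the field is missing
        return str(current)

    return PLACEHOLDER.sub(lookup, value)


prev_result = {"url": "https://example.com", "title": "Example Domain"}
print(resolve("${prev.url}/about", prev_result))
```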

Expectations

Validate step results with expectations:

{
  "type": "browser_extract_text",
  "selector": "#count",
  "expected": {
    "greaterThan": 0,
    "field": "length"
  }
}

Supported expectations:

  • equals: Exact match
  • contains: String contains
  • length: Array/string length
  • greaterThan: Numeric comparison
  • lessThan: Numeric comparison
  • notEmpty: Not null/undefined/empty
  • matches: Regex pattern match
  • field: Nested field path (e.g., "data.count")
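As a rough mental model, each key in `expected` can be treated as a predicate applied to the step result (or to the value selected by `field`). The sketch below is a standalone approximation: `pick` and `check` are hypothetical names, and resolving a `"length"` path on a string or list to its length is an assumption drawn from the example above, not confirmed SDK behavior.

```python
import re
from typing import Any


def pick(result: Any, path: str) -> Any:
    """Follow a dotted path; "length" on a str/list yields len() (assumption)."""
    for key in path.split("."):
        if isinstance(result, dict) and key in result:
            result = result[key]
        elif key == "length" and isinstance(result, (str, list)):
            result = len(result)
        else:
            raise KeyError(key)
    return result


def check(result: Any, expected: dict) -> bool:
    """Return True when `result` satisfies every expectation in `expected`."""
    value = pick(result, expected["field"]) if "field" in expected else result
    checks = {
        "equals": lambda want: value == want,
        "contains": lambda want: want in value,
        "length": lambda want: len(value) == want,
        "greaterThan": lambda want: value > want,
        "lessThan": lambda want: value < want,
        "notEmpty": lambda want: (value not in (None, "", [], {})) == want,
        "matches": lambda want: re.search(want, str(value)) is not None,
    }
    # "field" only selects the value; every other key is a predicate.
    return all(checks[name](want)
               for name, want in expected.items() if name != "field")


print(check("Example Domain", {"contains": "Example"}))
print(check("42", {"greaterThan": 0, "field": "length"}))
print(check({"data": {"count": 3}}, {"field": "data.count", "lessThan": 2}))
```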

Available Tools

Methods are dynamically generated from the server's OpenAPI schema. Common tools include:

Context Management

  • create_context() - Create a new browser context
  • close_context(context_id) - Close a context

Navigation

  • navigate(context_id, url) - Navigate to URL
  • reload(context_id) - Reload page
  • go_back(context_id) - Navigate back
  • go_forward(context_id) - Navigate forward

Interaction

  • click(context_id, selector) - Click element
  • type(context_id, selector, text) - Type text
  • press_key(context_id, key) - Press keyboard key

Content Extraction

  • extract_text(context_id, selector) - Extract text
  • get_html(context_id) - Get page HTML
  • screenshot(context_id) - Take screenshot

AI Features

  • summarize_page(context_id) - Summarize page content
  • query_page(context_id, query) - Ask questions about page
  • solve_captcha(context_id) - Solve CAPTCHA challenges

Use browser.list_tools() to see all available tools.
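For readers curious how methods such as `navigate` can exist without being written by hand, the toy class below shows one common Python technique: `__getattr__` turns an unknown attribute into a call to a named tool. This is an illustrative sketch, not the SDK's implementation. `ToolClient` is hypothetical, the tool names are passed in directly rather than read from an OpenAPI schema, and the `browser_` prefix mapping is inferred from the sync example earlier in this document.

```python
import asyncio


class ToolClient:
    """Toy client that exposes one async method per known tool name."""

    def __init__(self, tool_names: list[str]) -> None:
        # A real client would derive these names from the server's schema.
        self._tools = set(tool_names)

    async def execute(self, tool_name: str, **params):
        # Stand-in for the HTTP request: echo what would be sent.
        return {"tool": tool_name, "params": params}

    def __getattr__(self, name: str):
        # Called only when normal attribute lookup fails.
        tool_name = f"browser_{name}"
        if tool_name not in self.__dict__.get("_tools", ()):
            raise AttributeError(name)

        async def method(**params):
            return await self.execute(tool_name, **params)

        method.__name__ = name
        return method


client = ToolClient(["browser_navigate", "browser_click"])
result = asyncio.run(
    client.navigate(context_id="ctx-1", url="https://example.com")
)
print(result)
```

A consequence of this style of design is that the set of available methods depends on the server the client is connected to, which is why listing the tools at runtime is useful.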

Error Handling

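import asyncio

from owl_browser import (
    OwlBrowser,
    RemoteConfig,
    OwlBrowserError,
    ConnectionError,  # Note: shadows the Python built-in of the same name
    AuthenticationError,
    ToolExecutionError,
    TimeoutError,  # Note: shadows the Python built-in of the same name
)

config = RemoteConfig(url="https://your-domain.com", token="your-token")

async def main():
    # "async with" is only valid inside a coroutine
    try:
        async with OwlBrowser(config) as browser:
            await browser.navigate(context_id="invalid", url="https://example.com")
    except AuthenticationError as e:
        print(f"Authentication failed: {e}")
    except ToolExecutionError as e:
        print(f"Tool {e.tool_name} failed: {e.message}")
    except TimeoutError as e:
        print(f"Operation timed out: {e}")
    except ConnectionError as e:
        print(f"Connection failed: {e}")
    except OwlBrowserError as e:
        # Assumed to be the SDK's base exception; catches anything not handled above
        print(f"Owl Browser error: {e}")

asyncio.run(main())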

Configuration Options

from owl_browser import RemoteConfig, RetryConfig

config = RemoteConfig(
    url="https://your-domain.com",
    token="secret",

    # Timeout settings
    timeout=30.0,  # seconds

    # Concurrency
    max_concurrent=10,

    # Retry configuration
    retry=RetryConfig(
        max_retries=3,
        initial_delay_ms=100,
        max_delay_ms=10000,
        backoff_multiplier=2.0,
        jitter_factor=0.1
    ),

    # API prefix - determines URL structure for API calls
    # Default: "/api" (production via nginx proxy)
    # Set to "" for direct connection to http-server (development)
    api_prefix="/api",

    # SSL verification
    verify_ssl=True
)

Requirements

  • Python 3.12+
  • aiohttp >= 3.9.0
  • pyjwt[crypto] >= 2.8.0
  • cryptography >= 42.0.0

License

MIT License - see LICENSE file for details.
