Python SDK for Owl Browser automation - async-first with dynamic OpenAPI method generation

Owl Browser Python SDK v2

Async-first Python SDK for Owl Browser automation with dynamic OpenAPI method generation and flow execution support.

Features

  • Dynamic Method Generation: Methods are automatically generated from the OpenAPI schema
  • Async-First Design: Built with asyncio for optimal performance
  • Sync Wrappers: Convenience methods for non-async code
  • Flow Execution: Execute test flows with variable resolution and expectations
  • Type Safety: Full type hints with Python 3.12+ features
  • Connection Pooling: Efficient HTTP connection management
  • Retry Logic: Automatic retries with exponential backoff

Installation

pip install owl-browser

For development:

pip install owl-browser[dev]

Quick Start

Connection Modes

The SDK supports two connection modes depending on your deployment:

from owl_browser import OwlBrowser, RemoteConfig

# Production (via nginx proxy) - this is the default
# Uses /api prefix: https://your-domain.com/api/execute/...
config = RemoteConfig(
    url="https://your-domain.com",
    token="your-token"
)

# Development (direct to http-server on port 8080)
# No prefix: http://localhost:8080/execute/...
config = RemoteConfig(
    url="http://localhost:8080",
    token="test-token",
    api_prefix=""  # Empty string for direct connection
)
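
As a rough illustration of how the two modes differ, the sketch below assembles a request URL from a base URL and an optional prefix. The helper build_execute_url is hypothetical and not part of the SDK, and placing a tool name after /execute/ is an assumption, since the comments above elide that part of the path.

```python
def build_execute_url(base_url: str, api_prefix: str, tail: str) -> str:
    """Join the base URL, an optional prefix, and the /execute/ segment."""
    parts = [base_url.rstrip("/")]
    prefix = api_prefix.strip("/")
    if prefix:  # an empty prefix means a direct connection
        parts.append(prefix)
    # Assumption: whatever follows /execute/ is passed in by the caller.
    parts.extend(["execute", tail.lstrip("/")])
    return "/".join(parts)


prod_url = build_execute_url("https://your-domain.com", "/api", "browser_navigate")
dev_url = build_execute_url("http://localhost:8080", "", "browser_navigate")
print(prod_url)
print(dev_url)
```

With the default prefix the path gains an /api segment. With api_prefix="" the request goes straight to /execute/ on the http-server.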

Async Usage (Recommended)

import asyncio
from owl_browser import OwlBrowser, RemoteConfig

async def main():
    config = RemoteConfig(
        url="https://your-domain.com",
        token="your-secret-token"
    )

    async with OwlBrowser(config) as browser:
        # Create a browser context
        ctx = await browser.create_context()
        context_id = ctx["context_id"]

        # Navigate to a page
        await browser.navigate(context_id=context_id, url="https://example.com")

        # Click an element
        await browser.click(context_id=context_id, selector="button#submit")

        # Take a screenshot
        screenshot = await browser.screenshot(context_id=context_id)

        # Extract text content
        text = await browser.extract_text(context_id=context_id, selector="h1")
        print(f"Page title: {text}")

        # Close the context
        await browser.close_context(context_id=context_id)

asyncio.run(main())
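
Because every call is a coroutine, several pages can be processed concurrently. The sketch below shows one common way to cap the number of tasks in flight at application level, using only asyncio. The names run_bounded and fake_visit are hypothetical. fake_visit stands in for a create_context / navigate / extract_text / close_context sequence so that the example runs without a server.

```python
import asyncio


async def run_bounded(urls, worker, max_concurrent=10):
    """Run worker(url) for every URL with at most max_concurrent in flight."""
    limit = asyncio.Semaphore(max_concurrent)

    async def guarded(url):
        async with limit:  # wait here while the limit is reached
            return await worker(url)

    # gather() returns results in the same order as the input URLs
    return await asyncio.gather(*(guarded(u) for u in urls))


async def fake_visit(url):
    # Stand-in for real browser work (assumption, not SDK code).
    await asyncio.sleep(0.01)
    return f"visited {url}"


results = asyncio.run(
    run_bounded(
        ["https://example.com/a", "https://example.com/b"],
        fake_visit,
        max_concurrent=1,
    )
)
print(results)
```

The SDK's own max_concurrent option (see Configuration Options) may already bound requests. This pattern is useful when the limit should apply to whole per-page workflows rather than to individual calls.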

Sync Usage

from owl_browser import OwlBrowser, RemoteConfig

config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token",
    api_prefix=""  # Direct connection to http-server (see Connection Modes)
)

browser = OwlBrowser(config)
browser.connect_sync()

# Execute tools synchronously
ctx = browser.execute_sync("browser_create_context")
browser.execute_sync("browser_navigate", context_id=ctx["context_id"], url="https://example.com")
browser.execute_sync("browser_close_context", context_id=ctx["context_id"])

browser.close_sync()
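
For readers curious how a sync wrapper can sit on top of an async-first client, here is a minimal sketch of the general pattern. It is not the SDK's implementation. AsyncClient and SyncFacade are hypothetical stand-ins, and the facade simply drives a private event loop.

```python
import asyncio


class AsyncClient:
    """Hypothetical async-first client that echoes the call it received."""

    async def execute(self, tool_name, **params):
        await asyncio.sleep(0)  # pretend to do network I/O
        return {"tool": tool_name, "params": params}


class SyncFacade:
    """Blocking wrapper that runs each coroutine on its own event loop."""

    def __init__(self, client):
        self._client = client
        self._loop = asyncio.new_event_loop()

    def execute_sync(self, tool_name, **params):
        coro = self._client.execute(tool_name, **params)
        return self._loop.run_until_complete(coro)

    def close_sync(self):
        self._loop.close()


facade = SyncFacade(AsyncClient())
result = facade.execute_sync("browser_navigate", url="https://example.com")
facade.close_sync()
print(result)
```

A wrapper built this way cannot be called from code that is already running inside an event loop, which is one reason the async API is the recommended path.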

Authentication

Bearer Token

config = RemoteConfig(
    url="http://localhost:8080",
    token="your-secret-token"
)

JWT Authentication

from owl_browser import RemoteConfig, AuthMode, JWTConfig

config = RemoteConfig(
    url="http://localhost:8080",
    auth_mode=AuthMode.JWT,
    jwt=JWTConfig(
        private_key_path="/path/to/private.pem",
        expires_in=3600,  # 1 hour
        refresh_threshold=300,  # Refresh 5 minutes before expiry
        issuer="my-app",
        subject="user-123"
    )
)
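
The interplay of expires_in and refresh_threshold can be sketched as a small predicate. This is an illustration of the usual "refresh shortly before expiry" rule under the values shown above, not the SDK's code. needs_refresh is a hypothetical helper.

```python
def needs_refresh(issued_at, expires_in, refresh_threshold, now):
    """Return True once the token is within refresh_threshold seconds of expiry."""
    expires_at = issued_at + expires_in
    return now >= expires_at - refresh_threshold


issued = 1_000_000.0  # arbitrary issue time, in seconds

# 3000 s after issue: 600 s of lifetime left, above the 300 s threshold
fresh = needs_refresh(issued, 3600, 300, now=issued + 3000)
# 3400 s after issue: 200 s left, inside the threshold
stale = needs_refresh(issued, 3600, 300, now=issued + 3400)
print(fresh, stale)
```

With expires_in=3600 and refresh_threshold=300, a token would be reissued from the 55-minute mark onward.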

Flow Execution

Execute JSON-described automation flows with declarative assertions, variable resolution, conditional branching, loops, retries, and notifications.

from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.flow import FlowExecutor

async def run_flow():
    async with OwlBrowser(RemoteConfig(...)) as browser:
        ctx = await browser.create_context()
        executor = FlowExecutor(browser, ctx["context_id"])

        flow = FlowExecutor.load_flow("test-flows/navigation.json")
        result = await executor.execute(flow)

        if result.success:
            print(f"Flow completed in {result.total_duration_ms:.0f}ms")
        else:
            print(f"Flow failed: {result.error}")

        await browser.close_context(context_id=ctx["context_id"])

For the full reference — flow file schema, every step-level field (expected, condition, for_each, capture, optional, timeoutMs, retry), variable scopes (${prev}, ${vars}, ${params}), the FlowNotifier API, troubleshooting, and ~30 worked examples drawn from enterprise/test-flows/ — see docs/FLOW_EXECUTOR.md.
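
To give a feel for variable resolution, the sketch below substitutes ${scope.name} placeholders from a dictionary of scopes. It is a simplified illustration only. The resolve helper is hypothetical, and the dotted-path form for vars and prev is an assumption (the source shows it only for ${params.NAME}). The authoritative rules are in docs/FLOW_EXECUTOR.md.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(template, scopes):
    """Replace each ${a.b.c} with the value found by walking scopes['a']['b']['c']."""

    def lookup(match):
        value = scopes
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError for unknown names
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


scopes = {
    "params": {"BASE_URL": "https://example.com"},
    "vars": {"page": 2},
    "prev": {"title": "Home"},
}
url = resolve("${params.BASE_URL}/list?page=${vars.page}", scopes)
label = resolve("Previous step saw: ${prev.title}", scopes)
print(url)
print(label)
```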

Playwright-Compatible API

A drop-in, Playwright-compatible API that translates Playwright calls into Owl Browser tools, so existing Playwright code can run with Owl Browser's antidetect capabilities.

from owl_browser.playwright import chromium, devices

async def main():
    browser = await chromium.connect("http://localhost:8080", token="your-token")
    context = await browser.new_context(**devices["iPhone 15 Pro"])
    page = await context.new_page()

    await page.goto("https://example.com")
    await page.click("button#submit")
    await page.fill("#search", "query")

    text = await page.text_content("h1")
    await page.screenshot(path="page.png")

    # Locators
    button = page.locator("button.primary")
    await button.click()

    # Playwright-style selectors
    login = page.get_by_role("button", name="Log in")
    search = page.get_by_placeholder("Enter email")
    heading = page.get_by_text("Welcome")

    await context.close()
    await browser.close()

Supported features: Page navigation, click/fill/type/press, locators (CSS, text, role, test-id, xpath), frames, keyboard & mouse input, screenshots, network interception (route/unroute), dialogs, downloads, viewport emulation, and 20+ device descriptors (iPhone, Pixel, Galaxy, iPad, Desktop).
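
The translation idea can be pictured as a thin adapter whose Playwright-style methods forward to tool calls. The sketch below is only a conceptual model. PageAdapter and RecordingBackend are hypothetical, and the tool name browser_click is an assumption by analogy with browser_navigate.

```python
import asyncio


class RecordingBackend:
    """Hypothetical backend that records tool calls instead of sending them."""

    def __init__(self):
        self.calls = []

    async def execute(self, tool_name, **params):
        self.calls.append((tool_name, params))
        return {}


class PageAdapter:
    """Maps a couple of Playwright-style page methods onto tool calls."""

    def __init__(self, backend, context_id):
        self._backend = backend
        self._context_id = context_id

    async def goto(self, url):
        await self._backend.execute(
            "browser_navigate", context_id=self._context_id, url=url
        )

    async def click(self, selector):
        # Assumed tool name; check browser.list_tools() on a real server.
        await self._backend.execute(
            "browser_click", context_id=self._context_id, selector=selector
        )


async def demo():
    backend = RecordingBackend()
    page = PageAdapter(backend, "ctx-1")
    await page.goto("https://example.com")
    await page.click("button#submit")
    return backend.calls


calls = asyncio.run(demo())
print(calls)
```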

Data Extraction

Universal structured data extraction from any website: CSS selectors, auto-detection, tables, metadata, and multi-page scraping with pagination. Extraction has no AI dependencies and runs deterministically on BeautifulSoup.

from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.extraction import Extractor

async def main():
    async with OwlBrowser(RemoteConfig(url="...", token="...")) as browser:
        ctx = await browser.create_context()
        ex = Extractor(browser, ctx["context_id"])
        await ex.goto("https://example.com/products")

        # CSS selector extraction
        products = await ex.select(".product-card", {
            "name": "h3",
            "price": ".price",
            "image": "img@src",
            "link": "a@href",
        })

        # Auto-detect repeating patterns (zero-config)
        patterns = await ex.detect()

        # Multi-page scraping with automatic pagination
        result = await ex.scrape(".product-card", {
            "fields": {"name": "h3", "price": ".price", "sku": "@data-sku"},
            "max_pages": 10,
            "deduplicate_by": "sku",
        })
        print(f"{result['total_items']} items from {result['pages_scraped']} pages")
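
What deduplicate_by implies for a multi-page result can be sketched with plain dictionaries. The merge_pages helper below is hypothetical, and the "items" key is an assumption. Only total_items and pages_scraped appear in the example above.

```python
def merge_pages(pages, deduplicate_by):
    """Concatenate per-page item lists, keeping the first item for each key value."""
    seen = set()
    merged = []
    for page in pages:
        for item in page:
            key = item.get(deduplicate_by)
            if key is not None:
                if key in seen:
                    continue  # same key already collected from an earlier page
                seen.add(key)
            merged.append(item)  # items without the key are always kept
    return {
        "items": merged,
        "total_items": len(merged),
        "pages_scraped": len(pages),
    }


pages = [
    [{"name": "A", "sku": "1"}, {"name": "B", "sku": "2"}],
    [{"name": "B", "sku": "2"}, {"name": "C", "sku": "3"}],
]
summary = merge_pages(pages, deduplicate_by="sku")
print(f"{summary['total_items']} items from {summary['pages_scraped']} pages")
```

Deduplication matters mostly for load-more and infinite-scroll pagination, where successive pages tend to repeat items that were already seen.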

Capabilities:

  • select() / select_first() - Extract with CSS selectors and field specs ("selector", "selector@attr", object specs with transforms)
  • table() / grid() / definition_list() - Parse <table>, CSS grid/flexbox, and <dl> structures
  • meta() / json_ld() - Extract OpenGraph, Twitter Card, JSON-LD, microdata, feeds
  • detect() / detect_and_extract() - Auto-discover repeating DOM patterns
  • lists() - Extract list/card containers with auto-field inference
  • scrape() - Multi-page with pagination detection (click-next, URL patterns, buttons, load-more, infinite scroll)
  • clean() - Remove cookie banners, modals, fixed elements, ads
  • html() / markdown() / text() - Raw content with cleaning levels

All extraction functions are also available as standalone pure functions for use without a browser connection.
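
The string field specs used above ("h3", "img@src", "@data-sku") follow a selector@attribute shape. As a sketch of how such a spec could be split, and not the library's actual parser, the hypothetical helper parse_field_spec below separates the selector from the attribute name.

```python
from typing import Optional, Tuple


def parse_field_spec(spec: str) -> Tuple[Optional[str], Optional[str]]:
    """Split 'selector@attr' into (selector, attr); either part may be absent."""
    selector, sep, attr = spec.rpartition("@")
    if not sep:
        # No '@' at all: the whole spec is a selector and the text is wanted.
        return spec, None
    # An empty selector ('@data-sku') refers to the matched item itself.
    return (selector or None), attr


text_field = parse_field_spec("h3")
attr_field = parse_field_spec("img@src")
self_attr = parse_field_spec("@data-sku")
print(text_field, attr_field, self_attr)
```

Splitting on the last @ keeps a selector such as a[title='x@y']@href intact. How the real parser treats such edge cases is not stated in this document.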

Available Tools

Methods are dynamically generated from the server's OpenAPI schema. Common tools include:

Context Management

  • create_context() - Create a new browser context
  • close_context(context_id) - Close a context

Navigation

  • navigate(context_id, url) - Navigate to URL
  • reload(context_id) - Reload page
  • go_back(context_id) - Navigate back
  • go_forward(context_id) - Navigate forward

Interaction

  • click(context_id, selector) - Click element
  • type(context_id, selector, text) - Type text
  • press_key(context_id, key) - Press keyboard key

Content Extraction

  • extract_text(context_id, selector) - Extract text
  • get_html(context_id) - Get page HTML
  • screenshot(context_id) - Take screenshot

AI Features

  • summarize_page(context_id) - Summarize page content
  • query_page(context_id, query) - Ask questions about page
  • solve_captcha(context_id) - Solve CAPTCHA challenges

Use browser.list_tools() to see all available tools.
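
Dynamic method generation can be pictured with a small model built on __getattr__. This is a conceptual sketch, not the SDK's mechanism. DynamicClient is hypothetical, and the rule "method name = tool name without the browser_ prefix" is inferred from the examples in this document (navigate and browser_navigate).

```python
import asyncio


class DynamicClient:
    """Hypothetical client that exposes one async method per known tool."""

    def __init__(self, tool_names):
        self._tools = set(tool_names)
        self.calls = []

    async def execute(self, tool_name, **params):
        self.calls.append((tool_name, params))
        return {"tool": tool_name}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        tool_name = f"browser_{name}"
        if tool_name not in self._tools:
            raise AttributeError(f"no tool for method {name!r}")

        async def method(**params):
            return await self.execute(tool_name, **params)

        return method


client = DynamicClient(["browser_navigate", "browser_click"])
result = asyncio.run(client.navigate(context_id="ctx-1", url="https://example.com"))
print(result, hasattr(client, "fly"))
```

Since the real method set depends on the server's schema, list_tools() and list_methods() are the reliable way to find out what a given deployment offers.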

Error Handling

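import asyncio

from owl_browser import (
    OwlBrowser,
    RemoteConfig,
    OwlBrowserError,
    ConnectionError,  # shadows the built-in ConnectionError in this module
    AuthenticationError,
    ToolExecutionError,
    TimeoutError,  # shadows the built-in TimeoutError in this module
)

config = RemoteConfig(url="https://your-domain.com", token="your-token")

async def main():
    # "async with" is only valid inside a coroutine
    try:
        async with OwlBrowser(config) as browser:
            await browser.navigate(context_id="invalid", url="https://example.com")
    except AuthenticationError as e:
        print(f"Authentication failed: {e}")
    except ToolExecutionError as e:
        print(f"Tool {e.tool_name} failed: {e.message}")
    except TimeoutError as e:
        print(f"Operation timed out: {e}")
    except ConnectionError as e:
        print(f"Connection failed: {e}")

asyncio.run(main())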

Configuration Options

from owl_browser import RemoteConfig, RetryConfig

config = RemoteConfig(
    url="https://your-domain.com",
    token="secret",

    # Timeout settings
    timeout=30.0,  # seconds

    # Concurrency
    max_concurrent=10,

    # Retry configuration
    retry=RetryConfig(
        max_retries=3,
        initial_delay_ms=100,
        max_delay_ms=10000,
        backoff_multiplier=2.0,
        jitter_factor=0.1
    ),

    # API prefix - determines URL structure for API calls
    # Default: "/api" (production via nginx proxy)
    # Set to "" for direct connection to http-server (development)
    api_prefix="/api",

    # SSL verification
    verify_ssl=True
)
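
To make the retry parameters concrete, the sketch below computes a delay schedule from them using a common exponential-backoff formula with symmetric jitter. The exact formula the SDK uses is not given here, so treat this as an assumption. backoff_delays_ms is a hypothetical helper.

```python
import random


def backoff_delays_ms(
    max_retries=3,
    initial_delay_ms=100,
    max_delay_ms=10000,
    backoff_multiplier=2.0,
    jitter_factor=0.1,
    rng=None,
):
    """Return one delay per retry: capped exponential growth plus random jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        base = min(initial_delay_ms * backoff_multiplier**attempt, max_delay_ms)
        # Jitter shifts the delay by up to +/- jitter_factor of the base value.
        jitter = base * jitter_factor * rng.uniform(-1.0, 1.0)
        delays.append(base + jitter)
    return delays


no_jitter = backoff_delays_ms(jitter_factor=0.0)
jittered = backoff_delays_ms(rng=random.Random(0))
print(no_jitter)
print(jittered)
```

Under this formula the values shown above give base delays of 100, 200, and 400 ms, each shifted by at most 10%. max_delay_ms only takes effect with more retries or a larger multiplier.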

API Reference

OwlBrowser

  • connect() / connect_sync() - Connect to server
  • close() / close_sync() - Close connection
  • execute(tool_name, **params) / execute_sync(...) - Execute any tool
  • health_check() - Check server health
  • list_tools() - List all tool names
  • list_methods() - List all method names
  • get_tool(name) - Get tool definition

FlowExecutor

  • execute(flow) - Execute a flow
  • abort() - Abort current execution
  • reset() - Reset abort flag and clear vars/params
  • set_params(params) - Override flow ${params.NAME} defaults
  • set_event_metadata(metadata) - Attach metadata to every emitted FlowEvent
  • load_flow(path) - Load flow from JSON file (static)

Full reference: docs/FLOW_EXECUTOR.md.

Extractor

  • goto(url, wait_for_idle=True) - Navigate to URL
  • select(selector, fields) - Extract from all matches
  • select_first(selector, fields) - Extract first match
  • count(selector) - Count matching elements
  • table(selector, options) - Parse HTML tables
  • grid(container, item) - Parse CSS grids
  • definition_list(selector) - Parse <dl> lists
  • detect_tables() - Auto-detect tables
  • meta() - Extract page metadata
  • json_ld() - Extract JSON-LD
  • detect(options) - Detect repeating patterns
  • detect_and_extract(options) - Detect + extract
  • lists(selector, options) - Extract lists/cards
  • scrape(selector, options) - Multi-page scrape
  • abort_scrape() - Abort running scrape
  • clean(options) - Remove obstructions
  • html(clean_level) - Get page HTML
  • markdown() - Get page markdown
  • text(selector, regex) - Get filtered text
  • detect_site() - Detect site type
  • site_data(template) - Site-specific extraction

Requirements

  • Python 3.12+
  • aiohttp >= 3.9.0
  • pyjwt[crypto] >= 2.8.0
  • cryptography >= 42.0.0
  • beautifulsoup4 >= 4.12.0

License

MIT License - see LICENSE file for details.
