Owl Browser Python SDK v2
Async-first Python SDK for Owl Browser automation with dynamic OpenAPI method generation and flow execution support.
Features
- Dynamic Method Generation: Methods are automatically generated from the OpenAPI schema
- Async-First Design: Built with asyncio for optimal performance
- Sync Wrappers: Convenience methods for non-async code
- Flow Execution: Execute test flows with variable resolution and expectations
- Type Safety: Full type hints with Python 3.12+ features
- Connection Pooling: Efficient HTTP connection management
- Retry Logic: Automatic retries with exponential backoff
Installation
pip install owl-browser
For development:
pip install "owl-browser[dev]"
Quick Start
Connection Modes
The SDK supports two connection modes depending on your deployment:
from owl_browser import OwlBrowser, RemoteConfig
# Production (via nginx proxy) - this is the default
# Uses /api prefix: https://your-domain.com/api/execute/...
config = RemoteConfig(
url="https://your-domain.com",
token="your-token"
)
# Development (direct to http-server on port 8080)
# No prefix: http://localhost:8080/execute/...
config = RemoteConfig(
url="http://localhost:8080",
token="test-token",
api_prefix="" # Empty string for direct connection
)
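The two modes differ only in how the request URL is assembled from the base URL, the `api_prefix`, and the tool path shown in the comments above. A minimal sketch of that assembly (the helper name is ours, not part of the SDK):

```python
def build_tool_url(base_url: str, api_prefix: str, tool_name: str) -> str:
    """Join base URL, optional API prefix, and the execute path for a tool."""
    return f"{base_url.rstrip('/')}{api_prefix}/execute/{tool_name}"

# Production via nginx proxy (default prefix "/api")
print(build_tool_url("https://your-domain.com", "/api", "browser_navigate"))
# -> https://your-domain.com/api/execute/browser_navigate

# Development, direct to http-server (empty prefix)
print(build_tool_url("http://localhost:8080", "", "browser_navigate"))
# -> http://localhost:8080/execute/browser_navigate
```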
Async Usage (Recommended)
import asyncio
from owl_browser import OwlBrowser, RemoteConfig
async def main():
config = RemoteConfig(
url="https://your-domain.com",
token="your-secret-token"
)
async with OwlBrowser(config) as browser:
# Create a browser context
ctx = await browser.create_context()
context_id = ctx["context_id"]
# Navigate to a page
await browser.navigate(context_id=context_id, url="https://example.com")
# Click an element
await browser.click(context_id=context_id, selector="button#submit")
# Take a screenshot
screenshot = await browser.screenshot(context_id=context_id)
# Extract text content
text = await browser.extract_text(context_id=context_id, selector="h1")
print(f"Page title: {text}")
# Close the context
await browser.close_context(context_id=context_id)
asyncio.run(main())
Sync Usage
from owl_browser import OwlBrowser, RemoteConfig
config = RemoteConfig(
url="http://localhost:8080",
token="your-secret-token"
)
browser = OwlBrowser(config)
browser.connect_sync()
# Execute tools synchronously
ctx = browser.execute_sync("browser_create_context")
browser.execute_sync("browser_navigate", context_id=ctx["context_id"], url="https://example.com")
browser.execute_sync("browser_close_context", context_id=ctx["context_id"])
browser.close_sync()
Authentication
Bearer Token
config = RemoteConfig(
url="http://localhost:8080",
token="your-secret-token"
)
JWT Authentication
from owl_browser import RemoteConfig, AuthMode, JWTConfig
config = RemoteConfig(
url="http://localhost:8080",
auth_mode=AuthMode.JWT,
jwt=JWTConfig(
private_key_path="/path/to/private.pem",
expires_in=3600, # 1 hour
refresh_threshold=300, # Refresh 5 minutes before expiry
issuer="my-app",
subject="user-123"
)
)
Flow Execution
Execute test flows from JSON files (compatible with Owl Browser frontend format):
from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.flow import FlowExecutor
async def run_flow():
async with OwlBrowser(RemoteConfig(...)) as browser:
ctx = await browser.create_context()
executor = FlowExecutor(browser, ctx["context_id"])
# Load and execute a flow
flow = FlowExecutor.load_flow("test-flows/navigation.json")
result = await executor.execute(flow)
if result.success:
print(f"Flow completed in {result.total_duration_ms:.0f}ms")
for step in result.steps:
print(f" [{step.step_index}] {step.tool_name}: {'OK' if step.success else 'FAIL'}")
else:
print(f"Flow failed: {result.error}")
await browser.close_context(context_id=ctx["context_id"])
Flow JSON Format
{
"name": "Navigation Test",
"description": "Test navigation tools",
"steps": [
{
"type": "browser_navigate",
"url": "https://example.com",
"selected": true,
"description": "Navigate to example.com"
},
{
"type": "browser_extract_text",
"selector": "h1",
"selected": true,
"expected": {
"contains": "Example"
}
}
]
}
Variable Resolution
Use ${prev} to reference the previous step's result:
{
"steps": [
{
"type": "browser_get_page_info",
"description": "Get page info"
},
{
"type": "browser_navigate",
"url": "${prev.url}/about",
"description": "Navigate to about page"
}
]
}
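The `FlowExecutor` internals are not documented here, but the substitution semantics the `${prev}` syntax implies can be sketched in a few lines (illustrative only; field names without dots and hyphens are not handled):

```python
import re

def resolve_vars(value: str, prev: dict) -> str:
    """Replace ${prev.<field>} placeholders with values from the previous step's result."""
    def sub(match: re.Match) -> str:
        path = match.group(1).split(".")
        node = prev
        for key in path[1:]:  # skip the leading "prev"
            node = node[key]
        return str(node)
    return re.sub(r"\$\{(prev(?:\.\w+)*)\}", sub, value)

prev_result = {"url": "https://example.com", "title": "Example Domain"}
print(resolve_vars("${prev.url}/about", prev_result))  # https://example.com/about
```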
Expectations
Validate step results with expectations:
{
"type": "browser_extract_text",
"selector": "#count",
"expected": {
"greaterThan": 0,
"field": "length"
}
}
Supported expectations:
- equals: Exact match
- contains: String contains
- length: Array/string length
- greaterThan: Numeric comparison
- lessThan: Numeric comparison
- notEmpty: Not null/undefined/empty
- matches: Regex pattern match
- field: Nested field path (e.g., "data.count")
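How the executor evaluates these internally is not documented; a plausible sketch of the semantics, consistent with the examples above (where "field": "length" applies the length of the extracted value before the numeric comparison):

```python
import re

def check_expectation(result, expected: dict) -> bool:
    """Evaluate a flow-step expectation against a result value (illustrative sketch)."""
    value = result
    if "field" in expected:  # walk a nested field path like "data.count"
        for key in expected["field"].split("."):
            value = len(value) if key == "length" else value[key]
    for op, operand in expected.items():
        if op == "field":
            continue
        if op == "equals" and value != operand:
            return False
        if op == "contains" and operand not in value:
            return False
        if op == "greaterThan" and not value > operand:
            return False
        if op == "lessThan" and not value < operand:
            return False
        if op == "notEmpty" and not value:
            return False
        if op == "matches" and not re.search(operand, str(value)):
            return False
    return True

print(check_expectation("Example Domain", {"contains": "Example"}))      # True
print(check_expectation("42", {"greaterThan": 0, "field": "length"}))    # True
```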
Playwright-Compatible API
A drop-in replacement for the Playwright API that translates Playwright calls into Owl Browser tools, so your existing Playwright code can run with Owl Browser's antidetect capabilities.
from owl_browser.playwright import chromium, devices
async def main():
browser = await chromium.connect("http://localhost:8080", token="your-token")
context = await browser.new_context(**devices["iPhone 15 Pro"])
page = await context.new_page()
await page.goto("https://example.com")
await page.click("button#submit")
await page.fill("#search", "query")
text = await page.text_content("h1")
await page.screenshot(path="page.png")
# Locators
button = page.locator("button.primary")
await button.click()
# Playwright-style selectors
login = page.get_by_role("button", name="Log in")
search = page.get_by_placeholder("Enter email")
heading = page.get_by_text("Welcome")
await context.close()
await browser.close()
Supported features: Page navigation, click/fill/type/press, locators (CSS, text, role, test-id, xpath), frames, keyboard & mouse input, screenshots, network interception (route/unroute), dialogs, downloads, viewport emulation, and 20+ device descriptors (iPhone, Pixel, Galaxy, iPad, Desktop).
Data Extraction
Universal structured data extraction from any website — CSS selectors, auto-detection, tables, metadata, and multi-page scraping with pagination. No AI dependencies, works deterministically with BeautifulSoup.
from owl_browser import OwlBrowser, RemoteConfig
from owl_browser.extraction import Extractor
async def main():
async with OwlBrowser(RemoteConfig(url="...", token="...")) as browser:
ctx = await browser.create_context()
ex = Extractor(browser, ctx["context_id"])
await ex.goto("https://example.com/products")
# CSS selector extraction
products = await ex.select(".product-card", {
"name": "h3",
"price": ".price",
"image": "img@src",
"link": "a@href",
})
# Auto-detect repeating patterns (zero-config)
patterns = await ex.detect()
# Multi-page scraping with automatic pagination
result = await ex.scrape(".product-card", {
"fields": {"name": "h3", "price": ".price", "sku": "@data-sku"},
"max_pages": 10,
"deduplicate_by": "sku",
})
print(f"{result['total_items']} items from {result['pages_scraped']} pages")
Capabilities:
| Method | Description |
|---|---|
| select() / select_first() | Extract with CSS selectors and field specs ("selector", "selector@attr", object specs with transforms) |
| table() / grid() / definition_list() | Parse <table>, CSS grid/flexbox, and <dl> structures |
| meta() / json_ld() | Extract OpenGraph, Twitter Card, JSON-LD, microdata, feeds |
| detect() / detect_and_extract() | Auto-discover repeating DOM patterns |
| lists() | Extract list/card containers with auto-field inference |
| scrape() | Multi-page with pagination detection (click-next, URL patterns, buttons, load-more, infinite scroll) |
| clean() | Remove cookie banners, modals, fixed elements, ads |
| html() / markdown() / text() | Raw content with cleaning levels |
All extraction functions are also available as standalone pure functions for use without a browser connection.
Available Tools
Methods are dynamically generated from the server's OpenAPI schema. Common tools include:
Context Management
- create_context() - Create a new browser context
- close_context(context_id) - Close a context
Navigation
- navigate(context_id, url) - Navigate to URL
- reload(context_id) - Reload page
- go_back(context_id) - Navigate back
- go_forward(context_id) - Navigate forward
Interaction
- click(context_id, selector) - Click element
- type(context_id, selector, text) - Type text
- press_key(context_id, key) - Press keyboard key
Content Extraction
- extract_text(context_id, selector) - Extract text
- get_html(context_id) - Get page HTML
- screenshot(context_id) - Take screenshot
AI Features
- summarize_page(context_id) - Summarize page content
- query_page(context_id, query) - Ask questions about page
- solve_captcha(context_id) - Solve CAPTCHA challenges
Use browser.list_tools() to see all available tools.
Error Handling
from owl_browser import (
OwlBrowserError,
ConnectionError,
AuthenticationError,
ToolExecutionError,
TimeoutError,
)
try:
async with OwlBrowser(config) as browser:
await browser.navigate(context_id="invalid", url="https://example.com")
except AuthenticationError as e:
print(f"Authentication failed: {e}")
except ToolExecutionError as e:
print(f"Tool {e.tool_name} failed: {e.message}")
except TimeoutError as e:
print(f"Operation timed out: {e}")
except ConnectionError as e:
print(f"Connection failed: {e}")
Configuration Options
from owl_browser import RemoteConfig, RetryConfig
config = RemoteConfig(
url="https://your-domain.com",
token="secret",
# Timeout settings
timeout=30.0, # seconds
# Concurrency
max_concurrent=10,
# Retry configuration
retry=RetryConfig(
max_retries=3,
initial_delay_ms=100,
max_delay_ms=10000,
backoff_multiplier=2.0,
jitter_factor=0.1
),
# API prefix - determines URL structure for API calls
# Default: "/api" (production via nginx proxy)
# Set to "" for direct connection to http-server (development)
api_prefix="/api",
# SSL verification
verify_ssl=True
)
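With the RetryConfig above, the delay before retry attempt n follows a standard exponential-backoff schedule capped at max_delay_ms. The SDK's exact jitter formula is not documented, so this sketch ignores jitter_factor:

```python
def backoff_delay_ms(attempt: int, initial: float = 100,
                     multiplier: float = 2.0, cap: float = 10000) -> float:
    """Delay in ms before retry `attempt` (0-based), without jitter."""
    return min(initial * multiplier ** attempt, cap)

print([backoff_delay_ms(n) for n in range(8)])
# [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0, 6400.0, 10000.0]
```

With the defaults shown, delays double from 100 ms per attempt until hitting the 10-second cap; jitter_factor would then spread each delay slightly to avoid retry stampedes.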
API Reference
OwlBrowser
- connect() / connect_sync() - Connect to server
- close() / close_sync() - Close connection
- execute(tool_name, **params) / execute_sync(...) - Execute any tool
- health_check() - Check server health
- list_tools() - List all tool names
- list_methods() - List all method names
- get_tool(name) - Get tool definition
FlowExecutor
- execute(flow) - Execute a flow
- abort() - Abort current execution
- reset() - Reset abort flag
- load_flow(path) - Load flow from JSON file
Extractor
- goto(url, wait_for_idle=True) - Navigate to URL
- select(selector, fields) - Extract from all matches
- select_first(selector, fields) - Extract first match
- count(selector) - Count matching elements
- table(selector, options) - Parse HTML tables
- grid(container, item) - Parse CSS grids
- definition_list(selector) - Parse <dl> lists
- detect_tables() - Auto-detect tables
- meta() - Extract page metadata
- json_ld() - Extract JSON-LD
- detect(options) - Detect repeating patterns
- detect_and_extract(options) - Detect + extract
- lists(selector, options) - Extract lists/cards
- scrape(selector, options) - Multi-page scrape
- abort_scrape() - Abort running scrape
- clean(options) - Remove obstructions
- html(clean_level) - Get page HTML
- markdown() - Get page markdown
- text(selector, regex) - Get filtered text
- detect_site() - Detect site type
- site_data(template) - Site-specific extraction
Requirements
- Python 3.12+
- aiohttp >= 3.9.0
- pyjwt[crypto] >= 2.8.0
- cryptography >= 42.0.0
- beautifulsoup4 >= 4.12.0
License
MIT License - see LICENSE file for details.
Links
- Website: https://www.owlbrowser.net
- Documentation: https://www.owlbrowser.net/docs
- GitHub: https://github.com/Olib-AI/olib-browser
File details
Details for the file owl_browser-2.0.7.tar.gz.
File metadata
- Download URL: owl_browser-2.0.7.tar.gz
- Upload date:
- Size: 130.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e223133182981d4f5b3f4404bccdf69144b0bc5323ff0bb4a90717d8971c1ca |
| MD5 | 7f120d522b135057cbac2a5111ca73f5 |
| BLAKE2b-256 | 3b7adfe6ae018531d34ec92d36a56c1b138f7aeaf500acac9b8ad214cb1b9c3e |
File details
Details for the file owl_browser-2.0.7-py3-none-any.whl.
File metadata
- Download URL: owl_browser-2.0.7-py3-none-any.whl
- Upload date:
- Size: 153.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e17638a0d04cbf66c1a39d4dd9ddea739ebd26e01c7d910214a3a76599f4697c |
| MD5 | 512b083c00fc3bb9d385f56f1b22497c |
| BLAKE2b-256 | 726ca10c4226570ba52b7cb61fb074039d7dc2a67d5bbf5391edded41e0cce37 |