Intuned Browser SDK

Intuned Browser SDK (Python)

Intuned's Python SDK for browser automation and web data extraction, designed to work seamlessly with the Intuned platform.

Installation

Using Poetry (Recommended)

poetry add intuned-browser

Using pip

pip install intuned-browser

Features

The Intuned Browser SDK groups its helpers for browser automation and data extraction into five areas:

🤖 AI-Powered Extraction

  • Structured Data Extraction - Extract structured data from web pages using AI
  • Schema Validation - Validate extracted data against JSON schemas
  • Smart Page Loading Detection - Determine when pages have fully loaded

🌐 Web Automation Helpers

  • Navigation - Advanced URL navigation with go_to_url()
  • Content Loading - Scroll to load dynamic content with scroll_to_load_content()
  • Network Monitoring - Wait for network activity with wait_for_network_settled()
  • DOM Monitoring - Wait for DOM changes with wait_for_dom_settled()
  • Click Automation - Click elements until exhausted with click_until_exhausted()
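To make the idea of "settled" concrete, here is a minimal sketch of one common definition: activity counts as settled once nothing has been pending for a short quiet window. This is an illustration only, not how `wait_for_network_settled()` or `wait_for_dom_settled()` are implemented; the name `wait_until_quiet`, the `pending` callback, and all parameters are hypothetical.

```python
import asyncio
import time
from typing import Callable


async def wait_until_quiet(
    pending: Callable[[], int],
    quiet_seconds: float = 0.5,
    timeout: float = 30.0,
    poll: float = 0.05,
) -> bool:
    """Return True once pending() has stayed at 0 for quiet_seconds.

    Returns False if that never happens before the timeout expires.
    `pending` is a hypothetical callback reporting in-flight work,
    for example the number of open network requests.
    """
    deadline = time.monotonic() + timeout
    quiet_since = None  # when the current quiet stretch began
    while time.monotonic() < deadline:
        now = time.monotonic()
        if pending() == 0:
            if quiet_since is None:
                quiet_since = now
            if now - quiet_since >= quiet_seconds:
                return True
        else:
            quiet_since = None  # activity resumed, restart the window
        await asyncio.sleep(poll)
    return False
```

The quiet window is what separates "settled" from "momentarily idle": a page that fires a new request every 100 ms never settles under a 500 ms window, so the timeout is needed as a backstop.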

📄 Content Processing

  • HTML Sanitization - Clean and sanitize HTML with sanitize_html()
  • Markdown Extraction - Convert HTML to markdown with extract_markdown()
  • URL Resolution - Resolve relative URLs with resolve_url()
  • Date Processing - Parse and process dates with process_date()
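As a point of reference for URL resolution, the sketch below shows what resolving a relative link against a page URL means, using only the Python standard library. It does not call `resolve_url()`, whose signature is not shown here; `resolve_relative` is a hypothetical helper name.

```python
from urllib.parse import urljoin


def resolve_relative(base_url: str, href: str) -> str:
    """Resolve href against base_url, as a browser does for <a href>."""
    return urljoin(base_url, href)


base = "https://example.com/catalog/page.html"

print(resolve_relative(base, "item/42"))       # https://example.com/catalog/item/42
print(resolve_relative(base, "/about"))        # https://example.com/about
print(resolve_relative(base, "../img/a.png"))  # https://example.com/img/a.png
print(resolve_relative(base, "https://other.example/x"))  # already absolute, unchanged
```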

📁 File Operations

  • File Downloads - Download files with download_file()
  • S3 Integration - Upload and save files to S3 with upload_file_to_s3() and save_file_to_s3()

✅ Data Validation

  • Schema Validation - Validate data structures with validate_data_using_schema()
  • Empty Value Filtering - Filter empty values with filter_empty_values()
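Empty value filtering can be pictured with the following sketch, which recursively drops `None`, empty strings, and empty containers from scraped data. It is an assumption about what "empty" means, not the behavior of `filter_empty_values()`; the names `drop_empty` and `_is_empty` are hypothetical.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """True for None, "", [] and {}. Note that 0 and False are NOT empty."""
    return value is None or (isinstance(value, (str, list, dict)) and len(value) == 0)


def drop_empty(value: Any) -> Any:
    """Recursively remove empty entries from nested dicts and lists."""
    if isinstance(value, dict):
        cleaned = {key: drop_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not _is_empty(item)}
    if isinstance(value, list):
        cleaned_items = [drop_empty(item) for item in value]
        return [item for item in cleaned_items if not _is_empty(item)]
    return value


record = {
    "title": "Widget",
    "price": 0,                        # kept: zero is a real value
    "tags": [],                        # dropped
    "meta": {"note": "", "sku": None}, # dropped once its children are removed
}
print(drop_empty(record))  # {'title': 'Widget', 'price': 0}
```

Children are cleaned before the parent is tested, so a container that holds only empty values is removed as well.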

Quick Start

from intuned_browser import (
    extract_markdown,
    sanitize_html,
    go_to_url,
    wait_for_network_settled,
)

# Example: Extract and process web content
async def extract_content(page):
    # Navigate to URL
    await go_to_url(page, "https://example.com")

    # Wait for network to settle
    await wait_for_network_settled(page)

    # Get and sanitize HTML
    html = await page.content()
    clean_html = sanitize_html(html)

    # Extract markdown
    markdown = extract_markdown(clean_html)

    return markdown

AI-Powered Data Extraction

from intuned_browser.ai import extract_structured_data
from intuned_browser.ai.types import JsonSchema

# Define your data schema
schema: JsonSchema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "description": {"type": "string"}
    },
    "required": ["title", "price"]
}

# Extract structured data using AI
async def extract_product_data(page):
    result = await extract_structured_data(
        page=page,
        schema=schema,
        prompt="Extract product information from this page"
    )
    return result
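AI-extracted output is worth checking before use. The sketch below is a deliberately small, standard-library check of a result against the `required` and `type` keywords of a schema like the one above. It is not the SDK's `validate_data_using_schema()` and covers only a fraction of JSON Schema; `check_against_schema`, `JSON_TYPES`, and `product_schema` are hypothetical names.

```python
from typing import Any

# Mapping from JSON Schema type names to Python types (subset only).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(data: dict, schema: dict) -> list:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        type_name = rule.get("type")
        if key not in data or not isinstance(type_name, str) or type_name not in JSON_TYPES:
            continue
        value: Any = data[key]
        # bool is a subclass of int in Python, but JSON treats it as its own type.
        is_bool_mismatch = isinstance(value, bool) and type_name != "boolean"
        if is_bool_mismatch or not isinstance(value, JSON_TYPES[type_name]):
            problems.append(f"{key}: expected {type_name}, got {type(value).__name__}")
    return problems


product_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "description": {"type": "string"},
    },
    "required": ["title", "price"],
}

print(check_against_schema({"title": "Widget", "price": 9.99}, product_schema))    # []
print(check_against_schema({"title": "Widget", "price": "9.99"}, product_schema))  # ['price: expected number, got str']
```

A price returned as the string "9.99" rather than a number is a typical extraction slip that this kind of check surfaces early.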

Documentation

For detailed documentation on all functions and types, see the documentation.

Support

For support, questions, or contributions, please contact the Intuned team at engineering@intunedhq.com.

About Intuned

Intuned provides powerful tools for browser automation, web scraping, and data extraction. Visit intunedhq.com to learn more.

Download files

Source Distribution

intuned_browser-0.1.10.tar.gz (110.0 kB view details)

Uploaded Source

Built Distribution

intuned_browser-0.1.10-py3-none-any.whl (141.8 kB view details)

Uploaded Python 3

File details

Details for the file intuned_browser-0.1.10.tar.gz.

File metadata

  • Download URL: intuned_browser-0.1.10.tar.gz
  • Size: 110.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for intuned_browser-0.1.10.tar.gz
Algorithm Hash digest
SHA256 ca40f6214e2230d341ec0d8c994d665e49e2050f1782d55fe9ffc07ca8a893e4
MD5 02e44d500e5c0e1d3807fbb33fe88c43
BLAKE2b-256 9bb13a2c96f56c808d01c23f722cf67f3ba2c00064a5e7cd2fb0afa1428d80f0

File details

Details for the file intuned_browser-0.1.10-py3-none-any.whl.

File hashes

Hashes for intuned_browser-0.1.10-py3-none-any.whl
Algorithm Hash digest
SHA256 bc9eed00152f444b4bb7086b2137ed35fa8b6359b441fe4f3fb5bf8a76e95a8a
MD5 3d74f92ed83d2e3ab81222f1f49cc11e
BLAKE2b-256 275c39d73fea6c67feac8a73260c0450baaa88d5d6e6e531d3c2026e067eef99
