
Automate online browsing using Python and AI


PyBA

Tell the AI what to do once. Get a Python script you can run forever.

PyBA uses LLMs to autonomously navigate any website, then exports the session as a standalone Playwright script - no API costs on repeat runs.

  



The Problem with AI Browser Agents

Every AI browser agent has the same issue: you pay for every single run.

  • Run it 100 times? Pay for 100 LLM calls.
  • Same task every day? Pay every day.
  • The AI figures out the same clicks over and over.

PyBA is different. Let the AI figure it out once, then export a deterministic script you own forever.

from pyba import Engine

engine = Engine(openai_api_key="sk-...")

# Step 1: AI navigates autonomously
engine.sync_run(
    prompt="Go to Hacker News, click the top story, extract all comments"
)

# Step 2: Export as a standalone Playwright script
engine.generate_code(output_path="hacker_news_scraper.py")

Now run python hacker_news_scraper.py forever. No AI. No API costs. Just Playwright.
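
For orientation, an exported script is ordinary Playwright code with no LLM in the loop. The sketch below is hand-written, not actual PyBA output: the item URL, the `.commtext` selector, and the function names `extract_comments` and `run` are assumptions made for illustration, and the file PyBA really generates may look quite different.

```python
"""Hand-written sketch of a standalone Playwright script (not real PyBA output)."""

# Hypothetical story URL, used only as a placeholder.
HN_ITEM_URL = "https://news.ycombinator.com/item?id=1"


def extract_comments(page):
    """Return the stripped text of every comment on an already-loaded page.

    Assumes comment bodies match the `.commtext` CSS selector.
    """
    return [text.strip() for text in page.locator(".commtext").all_inner_texts()]


def run(url=HN_ITEM_URL):
    """Open the page in headless Chromium and return its comments."""
    # Imported here so extract_comments can be exercised without a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        comments = extract_comments(page)
        browser.close()
    return comments
```

Calling `run()` needs Playwright and a Chromium build installed, but no API key, which is the point of exporting the session.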


Installation

pip install py-browser-automation

What Can You Do?

Automate Repetitive Browser Tasks

engine.sync_run(
    prompt="Login to my bank, download this month's statement as PDF",
    automated_login_sites=["swissbank"]
)
engine.generate_code("download_statement.py")

OSINT & Reconnaissance

from pyba import DFS

dfs = DFS(openai_api_key="sk-...")
dfs.sync_run(
    prompt="Find all social media accounts linked to username 'targetuser123'"
)

Structured Data Extraction

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    rating: float

engine.sync_run(
    prompt="Scrape all products from the first 3 pages",
    extraction_format=Product
)
# Data is extracted DURING navigation, stored in your database
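
To make the role of the schema concrete, the sketch below shows only what Pydantic itself does with a model like `Product`: it coerces raw values to the declared types and rejects rows that do not fit. How PyBA passes scraped values into the model is not described here, and the `raw_rows` sample data is invented for illustration.

```python
from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    name: str
    price: float
    rating: float


# Invented sample rows standing in for values scraped from a page.
raw_rows = [
    {"name": "Widget", "price": "19.99", "rating": "4.5"},
    {"name": "Gadget", "price": "not a number", "rating": "3.0"},
]

valid, rejected = [], []
for row in raw_rows:
    try:
        valid.append(Product(**row))  # numeric strings are coerced to float
    except ValidationError:
        rejected.append(row)  # rows that do not fit the schema are set aside
```

Declaring types up front means malformed rows surface as validation errors instead of silently landing in the database as strings.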

Authenticated Workflows

engine.sync_run(
    prompt="Go to my Instagram DMs and message John Paula 'Running 10 mins late'",
    automated_login_sites=["instagram"]
)
# Credentials come from env vars - never exposed to the LLM
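
The idea of keeping credentials in the environment can be sketched as follows. The variable naming scheme (`<SITE>_USERNAME` / `<SITE>_PASSWORD`) and the helper `load_credentials` are hypothetical; the names PyBA actually reads are documented in its docs, not here.

```python
import os


def load_credentials(site, environ=os.environ):
    """Read a username/password pair for `site` from environment variables.

    The <SITE>_USERNAME / <SITE>_PASSWORD naming scheme is hypothetical.
    """
    prefix = site.upper()
    try:
        return environ[f"{prefix}_USERNAME"], environ[f"{prefix}_PASSWORD"]
    except KeyError as missing:
        # Fail early with the name of the variable that needs to be set.
        raise RuntimeError(f"Set {missing.args[0]} before running") from None
```

Because the prompt only names the site, the secret values are looked up locally by the login step and never need to appear in text sent to the model.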

Four Exploration Modes

Mode   | Use Case                          | Example
Normal | Direct task execution             | "Fill out this form and submit"
Step   | Interactive, step-by-step control | "Click here" → "Now search for X" → "Extract that"
DFS    | Deep investigation                | "Analyze this GitHub user's contribution patterns"
BFS    | Wide discovery                    | "Map all pages linked from this homepage"

from pyba import Engine, Step, DFS, BFS

# Normal mode (default)
engine = Engine(openai_api_key="...")

# Step-by-step interactive mode
step = Step(openai_api_key="...")

# Depth-first exploration
dfs = DFS(openai_api_key="...")

# Breadth-first discovery
bfs = BFS(openai_api_key="...")
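
The difference between the DFS and BFS modes follows the usual graph-traversal meaning of those terms. The sketch below illustrates that general idea on an invented link graph; it is not PyBA's internal implementation, and `LINKS`, `bfs_order`, and `dfs_order` are names made up for the example.

```python
from collections import deque

# Toy link graph: each page maps to the pages it links to (invented data).
LINKS = {
    "home": ["about", "blog"],
    "about": ["team"],
    "blog": ["post-1", "post-2"],
    "team": [],
    "post-1": [],
    "post-2": [],
}


def bfs_order(start):
    """Visit pages level by level: everything one click away, then two, and so on."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for nxt in LINKS[page]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def dfs_order(start):
    """Follow one chain of links to its end before backtracking."""
    seen, order, stack = set(), [], [start]
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push in reverse so the first link on the page is explored first.
        stack.extend(reversed(LINKS[page]))
    return order
```

Breadth-first order suits "map everything nearby" tasks, while depth-first order suits following a single lead as far as it goes.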

Interactive Step-by-Step Automation

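import asyncio

from pyba import Step


async def main():
    step = Step(openai_api_key="sk-...")

    await step.start()
    await step.step("Go to google.com and search for 'playwright python'")
    await step.step("Click the first result")
    output = await step.step("Extract the installation instructions")
    await step.stop()
    return output

# The step methods are coroutines, so run them inside an event loop
output = asyncio.run(main())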

Key Features

Code Generation

Export any successful run as a standalone Python script. Run it forever without AI.

Trace Files

Every run generates a Playwright trace.zip — replay exactly what happened in Trace Viewer.

Low Memory Mode

Saves ~120 MB of idle RAM by lazy-loading heavy Python dependencies (oxymouse, google-genai, openai) and passes Chromium flags that improve stability in containers. Built for CI servers, containers, and low-spec machines.

engine = Engine(openai_api_key="sk-...", low_memory=True)

Stealth Mode

Anti-fingerprinting, random mouse movements, human-like delays. Bypass common bot detection.

Multi-Provider

Works with OpenAI, Google Vertex AI, or Gemini.

Database Logging

Store every action in SQLite, PostgreSQL, or MySQL. Audit trails and replay capability.
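
As a rough picture of what an action log offers, the sketch below records actions in SQLite using Python's built-in `sqlite3` module and reads one run back in order. The `actions` table, its columns, and the helper functions are hypothetical; PyBA's real schema is not described in this README.

```python
import sqlite3


def open_log(path=":memory:"):
    """Create the (hypothetical) actions table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS actions ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "run_id TEXT NOT NULL, "
        "action TEXT NOT NULL, "
        "target TEXT)"
    )
    return conn


def log_action(conn, run_id, action, target=None):
    """Append one browser action to the audit trail."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO actions (run_id, action, target) VALUES (?, ?, ?)",
            (run_id, action, target),
        )


def replay_plan(conn, run_id):
    """Return the actions of one run in the order they were recorded."""
    rows = conn.execute(
        "SELECT action, target FROM actions WHERE run_id = ? ORDER BY id",
        (run_id,),
    )
    return rows.fetchall()
```

Keying every row by a run identifier is what makes both auditing ("what did run X do?") and replay ("do it again in the same order") simple queries.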

Platform Logins

Built-in login handlers for Instagram, Gmail, Facebook. Credentials stay in env vars.


Quick Examples

Extract YouTube Video Metadata

engine.sync_run(
    prompt="Go to this YouTube video and extract: title, view count, like count, channel name, upload date"
)

Fill a Multi-Page Form

engine.sync_run(
    prompt="Fill out the job application: Name='John Doe', Email='john@email.com', upload resume from ~/resume.pdf, submit"
)
engine.generate_code("job_application.py")  # Replay anytime

Research a Company

dfs = DFS(openai_api_key="...")
dfs.sync_run(
    prompt="Find the leadership team, recent news, and funding history for Acme Corp"
)


Configuration

from pyba import Engine, Database

# With database logging
db = Database(engine="sqlite", name="runs.db")

engine = Engine(
    openai_api_key="sk-...",
    headless=False,           # Watch it work
    enable_tracing=True,      # Generate trace.zip
    max_depth=20,             # Max actions per run
    database=db               # Log everything
)

See full configuration options in the docs.


Origin

PyBA was built for automated intelligence and OSINT — replicating everything a human analyst can do in a browser, but with reproducibility and speed.

If you're doing security research, competitive intelligence, or just automating tedious browser tasks, this is for you.


Status

v0.3.0 - Active development. First stable release: December 18, 2025.

Breaking changes may occur. Pin your version in production.


If PyBA saved you time, consider giving it a ⭐

