Automate online browsing using Python and AI

PyBA

Tell the AI what to do once. Get a Python script you can run forever.

PyBA uses LLMs to autonomously navigate any website, then exports the session as a standalone Playwright script - no API costs on repeat runs.

  


The Problem with AI Browser Agents

Every AI browser agent has the same issue: you pay for every single run.

  • Run it 100 times? Pay for 100 LLM calls.
  • Same task every day? Pay every day.
  • The AI figures out the same clicks over and over.

PyBA is different. Let the AI figure it out once, then export a deterministic script you own forever.

from pyba import Engine

engine = Engine(openai_api_key="sk-...")

# Step 1: AI navigates autonomously
engine.sync_run(
    prompt="Go to Hacker News, click the top story, extract all comments"
)

# Step 2: Export as a standalone Playwright script
engine.generate_code(output_path="hacker_news_scraper.py")

Now run python hacker_news_scraper.py forever. No AI. No API costs. Just Playwright.


Installation

pip install py-browser-automation

What Can You Do?

Automate Repetitive Browser Tasks

engine.sync_run(
    prompt="Login to my bank, download this month's statement as PDF",
    automated_login_sites=["swissbank"]
)
engine.generate_code("download_statement.py")

OSINT & Reconnaissance

from pyba import DFS

dfs = DFS(openai_api_key="sk-...")
dfs.sync_run(
    prompt="Find all social media accounts linked to username 'targetuser123'"
)

Structured Data Extraction

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    rating: float

engine.sync_run(
    prompt="Scrape all products from the first 3 pages",
    extraction_format=Product
)
# Data is extracted DURING navigation, stored in your database

Authenticated Workflows

engine.sync_run(
    prompt="Go to my Instagram DMs and message John Paula 'Running 10 mins late'",
    automated_login_sites=["instagram"]
)
# Credentials come from env vars - never exposed to the LLM
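
One way to keep credentials away from the model is to let the LLM work with placeholder tokens and substitute the real values from the environment only at the moment a field is filled. The sketch below illustrates that idea. The placeholder syntax, the `resolve_secret` helper, and the `INSTAGRAM_PASSWORD` variable name are assumptions for illustration, not PyBA's documented interface.

```python
import os
import re

# Hypothetical placeholder syntax: the LLM only ever sees tokens like "{{INSTAGRAM_PASSWORD}}".
PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def resolve_secret(text, env=None):
    """Replace {{NAME}} tokens with values from the environment at fill time."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"missing environment variable: {name}")
        return env[name]

    return PLACEHOLDER.sub(substitute, text)


# The plan produced by the model contains placeholders, never real credentials.
llm_action = {
    "action": "fill",
    "selector": "input[name='password']",
    "value": "{{INSTAGRAM_PASSWORD}}",
}

# Substitution happens locally, just before the browser types the value.
fake_env = {"INSTAGRAM_PASSWORD": "hunter2"}  # stands in for os.environ here
typed_value = resolve_secret(llm_action["value"], env=fake_env)
```

With this split, the prompt and the model's responses can be logged or traced without leaking secrets, because the real value only exists in the local process.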

Four Exploration Modes

Mode    Use Case                           Example
Normal  Direct task execution              "Fill out this form and submit"
Step    Interactive, step-by-step control  "Click here" → "Now search for X" → "Extract that"
DFS     Deep investigation                 "Analyze this GitHub user's contribution patterns"
BFS     Wide discovery                     "Map all pages linked from this homepage"

from pyba import Engine, Step, DFS, BFS

# Normal mode (default)
engine = Engine(openai_api_key="...")

# Step-by-step interactive mode
step = Step(openai_api_key="...")

# Depth-first exploration
dfs = DFS(openai_api_key="...")

# Breadth-first discovery
bfs = BFS(openai_api_key="...")
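
The DFS and BFS names refer to the classic graph traversal orders. The sketch below shows the difference on a small hand-written link graph. The `LINKS` data and the `dfs_order` and `bfs_order` helpers are made up for illustration and say nothing about how PyBA schedules its own exploration.

```python
from collections import deque

# Hypothetical site map: each page lists the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["team"],
    "blog": ["post-1", "post-2"],
    "team": [],
    "post-1": [],
    "post-2": [],
}


def dfs_order(start):
    """Depth-first: follow one chain of links to the end before backtracking."""
    seen, order, stack = set(), [], [start]
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push in reverse so the first link on the page is explored first.
        stack.extend(reversed(LINKS[page]))
    return order


def bfs_order(start):
    """Breadth-first: visit every page at one link distance before going deeper."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for nxt in LINKS[page]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


print(dfs_order("home"))  # home, about, team, blog, post-1, post-2
print(bfs_order("home"))  # home, about, blog, team, post-1, post-2
```

Depth-first order suits a deep investigation of a single lead, while breadth-first order suits mapping everything reachable from a starting page, which matches the use cases in the table above.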

Interactive Step-by-Step Automation

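import asyncio

from pyba import Step

step = Step(openai_api_key="sk-...")


async def main():
    # The Step methods are awaited, so they must run inside a coroutine
    await step.start()
    await step.step("Go to google.com and search for 'playwright python'")
    await step.step("Click the first result")
    output = await step.step("Extract the installation instructions")
    await step.stop()
    return output


output = asyncio.run(main())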

Key Features

Code Generation

Export any successful run as a standalone Python script. Run it forever without AI.
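
Conceptually, export means turning a recorded list of browser actions into Playwright source code. The sketch below shows one way that could work. The action format, the example.com URL and selectors, and the `render_script` helper are all hypothetical, and PyBA's real exporter may be organized quite differently. The generated code uses Playwright's sync API (`sync_playwright`, `page.goto`, `page.click`, `page.fill`).

```python
# Hypothetical recorded session: one dict per browser action.
ACTIONS = [
    {"type": "goto", "url": "https://example.com/login"},
    {"type": "fill", "selector": "#username", "value": "demo-user"},
    {"type": "click", "selector": "#submit"},
]

# One source-code template per action type; !r quotes the values safely.
TEMPLATES = {
    "goto": "page.goto({url!r})",
    "click": "page.click({selector!r})",
    "fill": "page.fill({selector!r}, {value!r})",
}


def render_script(actions):
    """Render recorded actions as the source of a standalone Playwright script."""
    lines = [
        "from playwright.sync_api import sync_playwright",
        "",
        "with sync_playwright() as p:",
        "    browser = p.chromium.launch()",
        "    page = browser.new_page()",
    ]
    for action in actions:
        fields = {k: v for k, v in action.items() if k != "type"}
        lines.append("    " + TEMPLATES[action["type"]].format(**fields))
    lines.append("    browser.close()")
    return "\n".join(lines) + "\n"


source = render_script(ACTIONS)
print(source)
```

Because the output is ordinary Python with fixed URLs and selectors, rerunning it involves no model calls, which is the property the export feature is built around.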

Trace Files

Every run generates a Playwright trace.zip — replay exactly what happened in Trace Viewer.

Low Memory Mode

Saves about 120 MB of idle RAM by lazy-loading heavy Python dependencies (oxymouse, google-genai, openai), and applies Chromium flags that improve stability in containers. Built for CI servers, containers, and low-spec machines.

engine = Engine(openai_api_key="sk-...", low_memory=True)
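
Lazy loading defers an import until the first time the module is actually used, so a process that never touches a given provider never pays for loading it. A minimal sketch of the general pattern follows. The `LazyModule` class is illustrative only, and the standard-library `decimal` module stands in for a heavy dependency.

```python
import importlib


class LazyModule:
    """Proxy that imports the wrapped module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # Only called for names not found on the proxy itself,
        # i.e. for the wrapped module's own attributes.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# "decimal" stands in for a heavy dependency such as an LLM client library.
heavy = LazyModule("decimal")
loaded_before = heavy.loaded  # nothing has been imported yet
total = heavy.Decimal("0.1") + heavy.Decimal("0.2")
loaded_after = heavy.loaded   # the import happened on first use
```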

Stealth Mode

Anti-fingerprinting, random mouse movements, human-like delays. Bypass common bot detection.
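
Human-like delays and mouse movement usually come down to randomization: pauses drawn from a range instead of a fixed interval, and a cursor path with small jitter instead of a straight jump. The sketch below shows both ideas in plain Python. The function names and parameter values are assumptions for illustration, not PyBA's stealth implementation.

```python
import random


def human_delay(rng, low=0.4, high=1.6):
    """Pick a pause length in seconds from a range rather than a fixed value."""
    return rng.uniform(low, high)


def mouse_path(rng, start, end, steps=10, jitter=3.0):
    """Return points from start to end with small random offsets along the way."""
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        x = x0 + (x1 - x0) * t
        y = y0 + (y1 - y0) * t
        if i < steps:  # keep the final point exact so the click lands on target
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        points.append((x, y))
    return points


rng = random.Random(42)  # seeded only to make the example repeatable
pause = human_delay(rng)
path = mouse_path(rng, start=(0, 0), end=(200, 120))
```

In a Playwright script, each point could be passed to `page.mouse.move(x, y)` with a short pause between moves.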

Multi-Provider

Works with OpenAI, Google Vertex AI, or Gemini.

Database Logging

Store every action in SQLite, PostgreSQL, or MySQL. Audit trails and replay capability.
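
An action log of this kind can be pictured as one row per browser action. The sketch below uses the standard-library `sqlite3` module with an in-memory database. The `actions` table, its columns, and the `log_action` helper are a hypothetical schema for illustration, not the one PyBA creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path such as "runs.db" to persist
conn.execute(
    """
    CREATE TABLE actions (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        run_id TEXT NOT NULL,
        step INTEGER NOT NULL,
        action TEXT NOT NULL,
        target TEXT
    )
    """
)


def log_action(run_id, step, action, target=None):
    """Append one action to the audit trail."""
    conn.execute(
        "INSERT INTO actions (run_id, step, action, target) VALUES (?, ?, ?, ?)",
        (run_id, step, action, target),
    )
    conn.commit()


log_action("run-1", 1, "goto", "https://example.com")
log_action("run-1", 2, "click", "#submit")

# Replaying a run starts with reading its actions back in order.
replay = conn.execute(
    "SELECT step, action, target FROM actions WHERE run_id = ? ORDER BY step",
    ("run-1",),
).fetchall()
```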

Platform Logins

Built-in login handlers for Instagram, Gmail, Facebook. Credentials stay in env vars.


Quick Examples

Extract YouTube Video Metadata

engine.sync_run(
    prompt="Go to this YouTube video and extract: title, view count, like count, channel name, upload date"
)

Fill a Multi-Page Form

engine.sync_run(
    prompt="Fill out the job application: Name='John Doe', Email='john@email.com', upload resume from ~/resume.pdf, submit"
)
engine.generate_code("job_application.py")  # Replay anytime

Research a Company

dfs = DFS(openai_api_key="...")
dfs.sync_run(
    prompt="Find the leadership team, recent news, and funding history for Acme Corp"
)


Configuration

from pyba import Engine, Database

# With database logging
db = Database(engine="sqlite", name="runs.db")

engine = Engine(
    openai_api_key="sk-...",
    headless=False,           # Watch it work
    enable_tracing=True,      # Generate trace.zip
    max_depth=20,             # Max actions per run
    database=db               # Log everything
)

See full configuration options in the docs.


Origin

PyBA was built for automated intelligence and OSINT — replicating everything a human analyst can do in a browser, but with reproducibility and speed.

If you're doing security research, competitive intelligence, or just automating tedious browser tasks, this is for you.


Status

v0.3.0 - Active development. First stable release: December 18, 2025.

Breaking changes may occur. Pin your version in production.


If PyBA saved you time, consider giving it a ⭐
