Automate online browsing using Python and AI
PyBA
Tell the AI what to do once. Get a Python script you can run forever.
PyBA uses LLMs to autonomously navigate any website, then exports the session as a standalone Playwright script - no API costs on repeat runs.
The Problem with AI Browser Agents
Every AI browser agent has the same issue: you pay for every single run.
- Run it 100 times? Pay for 100 LLM calls.
- Same task every day? Pay every day.
- The AI figures out the same clicks over and over.
PyBA is different. Let the AI figure it out once, then export a deterministic script you own forever.
from pyba import Engine
engine = Engine(openai_api_key="sk-...")
# Step 1: AI navigates autonomously
engine.sync_run(
    prompt="Go to Hacker News, click the top story, extract all comments"
)
# Step 2: Export as a standalone Playwright script
engine.generate_code(output_path="hacker_news_scraper.py")
Now run python hacker_news_scraper.py forever. No AI. No API costs. Just Playwright.
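For a sense of what "just Playwright" means, here is a hand-written sketch of the kind of script such an export could resemble. It is not actual PyBA output: the function names and the CSS selectors are assumptions made for illustration. Only the Playwright calls themselves (`goto`, `locator`, `first`, `click`, `wait_for_load_state`, `all_inner_texts`) are real library API.

```python
"""Illustrative replay script. Hand-written sketch, NOT generated by PyBA."""

HN_URL = "https://news.ycombinator.com"


def scrape_top_story_comments(page):
    """Replay a fixed sequence of steps against a Playwright `page` object."""
    page.goto(HN_URL)
    # Assumed selector: first link under a story that points to its discussion page.
    page.locator(".subtext a[href^='item?id=']").first.click()
    page.wait_for_load_state("domcontentloaded")
    # Assumed selector: the text body of each comment on the discussion page.
    return page.locator(".commtext").all_inner_texts()


def main():
    # Imported lazily so the replay function can also be exercised with a stub page.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        comments = scrape_top_story_comments(page)
        browser.close()
    print(f"Extracted {len(comments)} comments")


# Call main() to run against a real browser
# (requires `pip install playwright` and `playwright install chromium`).
```

The point of the design is that every step is hard-coded: no model is consulted at run time, so repeat runs cost nothing beyond the browser itself, but the script will need regenerating if the site's markup changes.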
Installation
pip install py-browser-automation
What Can You Do?
Automate Repetitive Browser Tasks
engine.sync_run(
    prompt="Login to my bank, download this month's statement as PDF",
    automated_login_sites=["swissbank"]
)
engine.generate_code("download_statement.py")
OSINT & Reconnaissance
from pyba import DFS
dfs = DFS(openai_api_key="sk-...")
dfs.sync_run(
    prompt="Find all social media accounts linked to username 'targetuser123'"
)
Structured Data Extraction
from pydantic import BaseModel
class Product(BaseModel):
    name: str
    price: float
    rating: float
engine.sync_run(
    prompt="Scrape all products from the first 3 pages",
    extraction_format=Product
)
# Data is extracted DURING navigation, stored in your database
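To make the role of the schema concrete, the sketch below shows the kind of coercion a typed model implies: raw scraped strings go in, typed fields come out. It deliberately uses a standard-library dataclass as a stand-in for the Pydantic model so that it runs without dependencies. The `to_product` helper and the sample row are hypothetical and say nothing about how PyBA performs extraction internally.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Same three fields as the Pydantic model above.
    name: str
    price: float
    rating: float


def to_product(raw):
    """Coerce one scraped row (all values are strings) into a typed Product."""
    return Product(
        name=raw["name"].strip(),
        # Strip a currency symbol and thousands separators before parsing.
        price=float(raw["price"].replace("$", "").replace(",", "")),
        rating=float(raw["rating"]),
    )


row = {"name": " Widget ", "price": "$1,299.50", "rating": "4.5"}
product = to_product(row)
```

A row that cannot be coerced raises `ValueError` or `KeyError` here, which is the behaviour you generally want from a schema: bad data fails loudly instead of being stored.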
Authenticated Workflows
engine.sync_run(
    prompt="Go to my Instagram DMs and message Paula 'Running 10 mins late'",
    automated_login_sites=["instagram"]
)
# Credentials come from env vars - never exposed to the LLM
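One way to keep credentials out of the model's context is to read them from the environment only at the moment a login form is filled, and to scrub them from any text that is sent to the LLM. The sketch below illustrates that idea. The `<SITE>_USERNAME` / `<SITE>_PASSWORD` variable names and both helper functions are hypothetical, not PyBA's documented interface.

```python
import os


def load_credentials(site, environ=None):
    """Read (username, password) for `site` from environment variables.

    The naming scheme <SITE>_USERNAME / <SITE>_PASSWORD is an assumption.
    """
    environ = os.environ if environ is None else environ
    prefix = site.upper()
    try:
        return environ[f"{prefix}_USERNAME"], environ[f"{prefix}_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable {missing}") from None


def redact(text, secrets):
    """Replace every secret value in `text` before it is shown to a model."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text


# Example with an explicit mapping instead of the real environment.
env = {"INSTAGRAM_USERNAME": "me", "INSTAGRAM_PASSWORD": "hunter2"}
username, password = load_credentials("instagram", env)
safe_text = redact("typed hunter2 into the password field", [password])
```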
Three Exploration Modes
| Mode | Use Case | Example |
|---|---|---|
| Normal | Direct task execution | "Fill out this form and submit" |
| DFS | Deep investigation | "Analyze this GitHub user's contribution patterns" |
| BFS | Wide discovery | "Map all pages linked from this homepage" |
from pyba import Engine, DFS, BFS
# Normal mode (default)
engine = Engine(openai_api_key="...")
# Depth-first exploration
dfs = DFS(openai_api_key="...")
# Breadth-first discovery
bfs = BFS(openai_api_key="...")
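The difference between the DFS and BFS modes is easiest to see as a visiting order. The sketch below runs textbook depth-first and breadth-first traversals over a toy link graph. The graph and function names are made up for illustration, and this is the general algorithm rather than a description of PyBA's internals.

```python
from collections import deque

# Toy site map: each page lists the pages it links to.
LINKS = {
    "home": ["about", "blog"],
    "about": ["team"],
    "blog": ["post-1", "post-2"],
    "team": [],
    "post-1": [],
    "post-2": [],
}


def bfs_order(graph, start):
    """Visit pages level by level: everything one click away, then two, ..."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        page = queue.popleft()
        order.append(page)
        for nxt in graph[page]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def dfs_order(graph, start, seen=None):
    """Follow one chain of links to its end before backing up."""
    seen = set() if seen is None else seen
    seen.add(start)
    order = [start]
    for nxt in graph[start]:
        if nxt not in seen:
            order.extend(dfs_order(graph, nxt, seen))
    return order


print(bfs_order(LINKS, "home"))  # ['home', 'about', 'blog', 'team', 'post-1', 'post-2']
print(dfs_order(LINKS, "home"))  # ['home', 'about', 'team', 'blog', 'post-1', 'post-2']
```

Breadth-first reaches every top-level page early, which suits "map everything linked from here". Depth-first reaches `team` before it has even looked at `blog`, which suits following a single lead as far as it goes.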
Key Features
Code Generation
Export any successful run as a standalone Python script. Run it forever without AI.
Trace Files
Every run generates a Playwright trace.zip — replay exactly what happened in Trace Viewer.
Stealth Mode
Anti-fingerprinting, random mouse movements, human-like delays. Bypass common bot detection.
Multi-Provider
Works with OpenAI, Google Vertex AI, or Gemini.
Database Logging
Store every action in SQLite, PostgreSQL, or MySQL. Audit trails and replay capability.
Platform Logins
Built-in login handlers for Instagram, Gmail, Facebook. Credentials stay in env vars.
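The trace.zip mentioned under Trace Files is an ordinary zip archive, so it can be inspected with the standard library before opening it in Trace Viewer. The helper below is a small sketch. The `summarize_trace` name is hypothetical, and it makes no assumption about which entries a given trace contains.

```python
import zipfile


def summarize_trace(path):
    """Return a small summary of a trace archive without extracting it."""
    with zipfile.ZipFile(path) as archive:
        infos = archive.infolist()
        return {
            "entries": len(infos),
            "bytes": sum(info.file_size for info in infos),
            # First few entry names, sorted, as a quick sanity check.
            "names": sorted(info.filename for info in infos)[:10],
        }
```

To replay the run visually, Playwright's own CLI opens the archive directly: `playwright show-trace trace.zip`.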
Quick Examples
Extract YouTube Video Metadata
engine.sync_run(
    prompt="Go to this YouTube video and extract: title, view count, like count, channel name, upload date"
)
Fill a Multi-Page Form
engine.sync_run(
    prompt="Fill out the job application: Name='John Doe', Email='john@email.com', upload resume from ~/resume.pdf, submit"
)
engine.generate_code("job_application.py")  # Replay anytime
Research a Company
dfs = DFS(openai_api_key="...")
dfs.sync_run(
    prompt="Find the leadership team, recent news, and funding history for Acme Corp"
)
Configuration
from pyba import Engine, Database
# With database logging
db = Database(engine="sqlite", name="runs.db")
engine = Engine(
    openai_api_key="sk-...",
    headless=False,  # Watch it work
    enable_tracing=True,  # Generate trace.zip
    max_depth=20,  # Max actions per run
    database=db  # Log everything
)
See full configuration options in the docs.
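To illustrate what "log everything" can buy you, here is a minimal sketch of an action log in SQLite using only the standard library. The table layout and the `open_log`, `log_action`, and `replay_plan` helpers are assumptions for illustration. PyBA's actual schema is not described here and may look quite different.

```python
import json
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) an action log. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS actions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               run_id TEXT NOT NULL,
               step INTEGER NOT NULL,
               action TEXT NOT NULL,
               detail TEXT NOT NULL
           )"""
    )
    return conn


def log_action(conn, run_id, step, action, **detail):
    """Append one browser action, with its parameters stored as JSON."""
    conn.execute(
        "INSERT INTO actions (run_id, step, action, detail) VALUES (?, ?, ?, ?)",
        (run_id, step, action, json.dumps(detail)),
    )
    conn.commit()


def replay_plan(conn, run_id):
    """Return the ordered list of (action, detail) pairs for one run."""
    rows = conn.execute(
        "SELECT action, detail FROM actions WHERE run_id = ? ORDER BY step",
        (run_id,),
    )
    return [(action, json.loads(detail)) for action, detail in rows]


conn = open_log()
log_action(conn, "run-1", 1, "goto", url="https://example.com")
log_action(conn, "run-1", 2, "click", selector="#submit")
plan = replay_plan(conn, "run-1")
```

Storing each step with its order and parameters is what makes both uses named above possible: the rows are an audit trail as they stand, and reading them back in step order yields a replayable plan.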
Origin
PyBA was built for automated intelligence and OSINT — replicating everything a human analyst can do in a browser, but with reproducibility and speed.
If you're doing security research, competitive intelligence, or just automating tedious browser tasks, this is for you.
Status
v0.3.0 - Active development. First stable release: December 18, 2025.
Breaking changes may occur. Pin your version in production.
If PyBA saved you time, consider giving it a ⭐