AI-powered browser automation

AI Browser

AI Browser lets AI agents in Claude Code, Cursor, and Windsurf operate a browser autonomously: stable, cheap, and automation-ready. It is not a browser bot; it is the bridge layer between AI agents and the browser.


Demo

Natural Language Driven

> Search GitHub for the most starred Python repos
⠋ Navigating to GitHub...
✓ Navigate to https://github.com
✓ Fill search: Python
✓ Click search button
✓ Extract top 10 repo names and star counts

Task complete:
  1. public-apis/public-apis        ⭐ 382k
  2. donnemartin/system-design-primer ⭐ 310k
  3. AUTOMATIC1111/stable-diffusion  ⭐ 280k
  ...

Benchmark Results

4 tasks × 3 runs each, measured on real pages with Chrome. Full data in benchmark_results.json.

Accuracy: 100% (12/12 runs successful)

Token efficiency:

Task          DOM tokens sent to LLM
GitHub        656
DuckDuckGo    720
Wikipedia     3,408
Hacker News   4,998

Cost per 100 tasks (3 LLM calls each):

Model               Cost
deepseek-chat       ¥1.24
gpt-4o              $3.11
claude-sonnet-4-6   $4.00
Playbook replay     ¥0 (zero LLM calls)
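
A per-batch cost like the ones above can be estimated as tasks × LLM calls per task × tokens per call × price per token. The sketch below shows that arithmetic. The function name, the token counts, and the prices are illustrative assumptions, not the benchmark's actual inputs, so the result is not expected to reproduce the table.

```python
def estimate_cost(tasks, calls_per_task, input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate total LLM spend for a batch of tasks.

    Prices are per one million tokens; every value is a caller-supplied
    assumption.
    """
    calls = tasks * calls_per_task
    input_cost = calls * input_tokens * input_price_per_million / 1_000_000
    output_cost = calls * output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Hypothetical inputs: 100 tasks, 3 calls each, 1,000 input and 200 output
# tokens per call, priced at 2.00 and 8.00 per million tokens.
total = estimate_cost(100, 3, 1_000, 200, 2.00, 8.00)
print(f"{total:.2f}")

# A replayed task makes zero LLM calls, so its LLM cost is zero.
replay_total = estimate_cost(100, 0, 1_000, 200, 2.00, 8.00)
print(f"{replay_total:.2f}")
```

Because cost scales linearly with tokens per call, shrinking the page representation sent to the LLM lowers the bill proportionally.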

vs Browser Use

                  Browser Use                                     AI Browser
Sent to LLM       Raw accessibility tree (~5,000–77,000 tokens)   DomRouteMap (~200–5,000 tokens)
Token reduction   -                                               17x lower
Repeated tasks    LLM re-interprets every time                    Playbook replay, zero LLM cost
Search tasks      Opens browser → may trigger CAPTCHA             HTTP search (DuckDuckGo/Bing), no browser

Quick Start

1. Install

pip install aibrowser

2. Start

./scripts/start.sh --install

3. Connect to Claude Code / Cursor

Add to your MCP config (the script path is relative, so it resolves against the directory the command is launched from):

{
  "mcpServers": {
    "aibrowser": {
      "command": "bash",
      "args": ["-c", "exec python3 python_bridge/mcp_server.py"]
    }
  }
}
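
If the config file already lists other MCP servers, the entry needs to be merged in rather than pasted over the file. The sketch below shows one way to do that in Python. The helper name add_aibrowser_server and the "other" server in the example are hypothetical, and where the config file lives depends on your client.

```python
import json

# The server entry from the config shown above.
AIBROWSER_ENTRY = {
    "command": "bash",
    "args": ["-c", "exec python3 python_bridge/mcp_server.py"],
}


def add_aibrowser_server(config_text):
    """Return config JSON text with the aibrowser server merged in.

    Existing entries under "mcpServers" are preserved; empty input is
    treated as an empty config.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["aibrowser"] = AIBROWSER_ENTRY
    return json.dumps(config, indent=2)


# Example: a config that already lists another (hypothetical) server.
existing = '{"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}'
merged = add_aibrowser_server(existing)
print(merged)
```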

4. Use

In Claude Code, just say:

> What's the top story on Hacker News today?
> Compare iPhone 16 prices on Amazon vs eBay
> Log into my company dashboard and pull this week's report

Or control manually:

goto("https://example.com")
snapshot()       # structured page view
click("@3")      # stable element index
fill("@7", "hello")
screenshot()

Core Features

DomRouteMap — Stable Selectors, Every Time

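DomRouteMap avoids fragile CSS selectors and XPaths. Each page is compressed into semantic route keys such as button[Login] or textbox[Username], which keeps LLM interaction stable and cuts the tokens sent to the LLM by roughly 20x (17x in the Browser Use comparison above).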
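
To make the idea concrete, here is a minimal sketch of how interactive elements could be reduced to route keys of the form role[name]. It is an illustration only, not AI Browser's actual DomRouteMap implementation: the element dictionaries, the build_route_map function, and the #n suffix for duplicate keys are all assumptions.

```python
def build_route_map(elements):
    """Map semantic route keys such as 'button[Login]' to elements.

    Each element is a dict with 'role' and 'name' keys (a hypothetical
    shape). Duplicate keys get a '#2', '#3', ... suffix so every key
    stays unique.
    """
    route_map = {}
    seen = {}
    for element in elements:
        base = f"{element['role']}[{element['name']}]"
        seen[base] = seen.get(base, 0) + 1
        key = base if seen[base] == 1 else f"{base}#{seen[base]}"
        route_map[key] = element
    return route_map


# Hypothetical page elements with the kind of brittle selectors a raw
# DOM would expose.
page = [
    {"role": "textbox", "name": "Username", "selector": "#u_0_a"},
    {"role": "textbox", "name": "Password", "selector": "#u_0_b"},
    {"role": "button", "name": "Login", "selector": "div > form > button.x9f"},
    {"role": "button", "name": "Login", "selector": "footer button"},
]
for key in build_route_map(page):
    print(key)
```

The keys depend only on role and visible name, so they survive changes to class names or DOM nesting that would break the raw selectors.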

Playbook — Learn Once, Replay Forever

Every successful action chain is saved automatically as a Playbook. The next time the same task runs on the same site, it is replayed from the Playbook with zero LLM calls and zero LLM cost.

# First run: LLM explores the path
aib agent "Log into Wikipedia and search for AI"
# → 5 steps, 5 LLM calls

# Second run: plays from Playbook
aib agent "Log into Wikipedia and search for AI"
# → 0 LLM calls, instant
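
The learn-once, replay-later behavior can be pictured as a cache of action steps keyed by site and task. The sketch below is a simplified model under stated assumptions, not AI Browser's implementation: PlaybookStore, fake_planner, the cache key, and the one-LLM-call-per-step accounting are all hypothetical.

```python
class PlaybookStore:
    """Caches the action steps of successful runs, keyed by site and task."""

    def __init__(self, planner):
        self._planner = planner      # callable that stands in for the LLM
        self._playbooks = {}
        self.llm_calls = 0

    def run(self, site, task):
        key = (site, task.strip().lower())
        if key in self._playbooks:
            # Replay: reuse the saved steps without calling the planner.
            return self._playbooks[key]
        steps = self._planner(site, task)
        self.llm_calls += len(steps)  # assumption: one LLM call per step
        self._playbooks[key] = steps
        return steps


def fake_planner(site, task):
    # Hypothetical stand-in for LLM planning; returns fixed steps.
    return ["goto", "click login", "fill username", "fill password", "search"]


store = PlaybookStore(fake_planner)
store.run("wikipedia.org", "Log into Wikipedia and search for AI")
first_run_calls = store.llm_calls
store.run("wikipedia.org", "Log into Wikipedia and search for AI")
second_run_calls = store.llm_calls - first_run_calls
print(first_run_calls, second_run_calls)
```

A production version would also need to detect when a replayed step no longer matches the page and fall back to the LLM; that part is omitted here.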

MoE — Mixture of Experts

Complex requests are decomposed into sub-tasks assigned to specialized experts:

  • Information Expert: Research without opening a browser (HTTP search + crawl + extract)
  • Browser Expert: Stable DOM interaction
  • Site Planner: Plan workflows using saved site profiles
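
As a rough illustration of how sub-tasks might be routed to the three experts, consider the sketch below. The SubTask fields and the routing rules in pick_expert are assumptions made for this example; the source does not describe how AI Browser actually decides.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    description: str
    needs_interaction: bool = False   # requires clicking or typing on a page
    has_site_profile: bool = False    # a saved profile exists for the site


def pick_expert(subtask):
    """Choose an expert for a sub-task using simple, hypothetical rules."""
    if subtask.has_site_profile:
        return "Site Planner"
    if subtask.needs_interaction:
        return "Browser Expert"
    return "Information Expert"


plan = [
    SubTask("Find current iPhone 16 prices"),
    SubTask("Log into the dashboard", needs_interaction=True),
    SubTask("Export the weekly report", needs_interaction=True, has_site_profile=True),
]
for subtask in plan:
    print(f"{pick_expert(subtask)}: {subtask.description}")
```

Routing research-only sub-tasks away from the browser is what lets them run over plain HTTP, as the Information Expert description suggests.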

Human-in-the-Loop Assistance

AI Browser provides behavioral simulation and interaction prompts for common human-verification scenarios, helping users complete legitimate web operations. Users are responsible for ensuring their actions comply with target websites' terms of service.

Responsible Use

AI Browser is a general-purpose automation tool designed for legitimate use cases such as:

  • Testing your own websites and applications (multi-tab concurrency simulates multi-user scenarios)
  • Automating your own workflows (form filling, data entry, report generation)
  • Accessibility and usability research
  • Compliance monitoring of publicly available information

You are solely responsible for ensuring your use complies with applicable laws and the terms of service of any websites you interact with. This tool must not be used to bypass security measures, conduct unauthorized access, or scrape data in violation of applicable laws.

Privacy Protection Mode

An optimized browser compatibility configuration reduces environment differences in automated testing scenarios and supports stable long-running sessions.


Commands

Navigation:     goto, back, forward, reload
Reading:        text, html, snapshot, url, title, links, cookies, storage
Interaction:    click, fill, type, press, scroll, hover, select, upload
Inspection:     screenshot, inspect, get_attrs, is_element
Tabs:           tabs, tab, newtab, closetab, switch_tab
Sessions:       session-list, session-create, session-destroy, save_session, load_session
Agent:          agent, agent-stop, agent-status, agent-team, interrupt, history
Playbook:       playbook-list, playbook-save, playbook-replay, playbook-delete
Advanced:       js, chain, frame, state, cdp, download, privacy, handoff, shell

License

AI Browser is open-source under the MIT License.

Features

  • 90+ browser commands
  • DomRouteMap (stable selectors)
  • Playbook (saved workflows)
  • Multi-tab concurrency
  • MoE task decomposition
  • Knowledge base (semantic search)
  • Task scheduler (Cron)
  • Multi-tenant RBAC
  • Audit log
  • Unlimited agent runs
  • HTTP search (no browser needed)

Activate:

aib start

Configuration

Variable              Description
AIBRIDGE_PORT         Server port (default: 18936)
AIBRIDGE_AUTH_TOKEN   API auth token
LLM_PROVIDER          LLM provider (openai, ollama)
LLM_MODEL             LLM model name
LLM_API_KEY           LLM API key

Security tip: Store sensitive credentials like LLM_API_KEY in your system keychain (macOS Keychain / Windows Credential Manager / Linux Secret Service) rather than plaintext config files or env scripts.
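
For scripts that wrap AI Browser, the variables above can be read in one place and secrets masked before anything is logged. This is a minimal sketch using only the standard library; load_config and redact are hypothetical helpers, not part of the aibrowser package, and the example key is a placeholder.

```python
import os


def load_config(env=None):
    """Read AI Browser settings from environment variables."""
    env = os.environ if env is None else env
    return {
        "port": int(env.get("AIBRIDGE_PORT", "18936")),  # documented default
        "auth_token": env.get("AIBRIDGE_AUTH_TOKEN"),
        "llm_provider": env.get("LLM_PROVIDER"),
        "llm_model": env.get("LLM_MODEL"),
        "llm_api_key": env.get("LLM_API_KEY"),
    }


def redact(config):
    """Return a copy that is safe to log: set secrets are masked."""
    secret_keys = {"auth_token", "llm_api_key"}
    return {k: ("***" if k in secret_keys and v else v) for k, v in config.items()}


# A plain dict stands in for the real environment in this example.
config = load_config({"LLM_PROVIDER": "ollama", "LLM_API_KEY": "sk-example"})
print(redact(config))
```

Passing a dict instead of os.environ keeps the function easy to test and makes it simple to swap in values fetched from a keychain.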


Tech Stack

Component        Technology
Language         Python 3.11+
Browser engine   browser-use >= 0.12.6
Browser          System Chrome/Chromium
HTTP server      aiohttp
MCP SDK          FastMCP (stdio/SSE)
LLM providers    OpenAI, Ollama, DeepSeek, GPT-4o
Vector store     ChromaDB
CDP              pydoll-python, native CDP


This product includes third-party open-source components. See NOTICE for details.


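🇨🇳 Chinese Summary (translated)

AI Browser lets AI agents in Claude Code, Cursor, and Windsurf operate the browser autonomously: stable, cheap, and automation-ready.

  • DomRouteMap: stable selectors, 20x fewer tokens
  • Playbook: learn once, then replay with zero LLM cost
  • Multi-tab concurrency: operate multiple tabs at the same time in one browser
  • Free to use: core features are free of charge; upgrade as needed

Install: pip install aibrowser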
