Surfari: Modular browser automation with LLM

Surfari

Surfari is a modular, LLM-powered browser automation framework built on Playwright.
It enables secure, scriptable, and intelligent interactions with websites — perfect for data extraction, automated workflows, and AI-assisted navigation.


✨ Key Features

  • Automatic Record, Parameterize & Replay
    Surfari automatically records both the exact sequence of LLM actions and a generalized, parameterized workflow at the same time.
    When running new tasks, Surfari plugs in the new values, replays the known workflow, and invokes the LLM only for review or recovery.
    🔑 Unique: Replays are fast and stable, while parameterization makes them flexible and reusable for new but structurally similar tasks.

  • Self-Healing Replay
    If the recorded path fails due to layout drift, Surfari seamlessly switches to real-time LLM reasoning for that step, then resumes deterministic replay — combining stability with resilience.

  • Agent Delegation & Collaboration
    A Navigation Agent can pause its own run and delegate subtasks to another agent in a separate tab, then resume after the subtask completes.
    Enables branching workflows, multi-agent collaboration, and parallel subtasks — like a team of agents cooperating inside one browser.

    sequenceDiagram
        participant A as Agent A (Main Task)
        participant B as Agent B (Delegated Subtask)
        participant H as Human-in-the-Loop
    
        A->>A: Start workflow
        A->>B: Delegate subtask<br/>(new tab/session)
        B->>B: Complete delegated task
        B-->>A: Return result, control resumes
    
        A->>A: Continue main workflow
        A-->>H: Request help if blocked<br/>(e.g. CAPTCHA, unknown field)
        H-->>A: Human resolves and resumes
        A->>A: Complete task and verify
    
  • Human-in-the-Loop Delegation
    When needed, Surfari can gracefully delegate control back to a human operator.
    You complete the missing step in the live browser, then the agent continues the workflow automatically.

  • Stable, Text-Based UI Targets
    Instead of brittle XPaths or random IDs, Surfari uses semantic text annotations as selectors.
    Because targets are meaningful text rather than generated identifiers, recorded workflows replay reliably even when the underlying markup changes.

  • Visual Decisioning (Action Box Overlay)
    Surfari can show the LLM’s reasoning and intended action in an on-page action box overlay next to the targeted element — making the agent’s decisions transparent, reviewable, and debuggable.

  • Configurable LLM Models (No Coding Required)
    Switch between models such as Google Gemini, OpenAI GPT, and Anthropic Claude by changing the model name in the config; no code changes needed.

  • Information Masking
    Automatically masks and unmasks account numbers, balances, and any digit-like strings, so sensitive data stays protected in logs, prompts, and replays.

  • One or Multiple Actions Per Turn
    Choose between step-by-step interactivity (safer on dynamic sites) or multi-action per turn (faster on static or more predictable sites/workflows).

  • Custom Value Resolvers (Beyond Tool Calling)
    Unknown form values (inputs, select options, etc.) can be resolved automatically via direct APIs, retrieval-augmented search, or custom resolvers — without requiring tool calls through the LLM.

  • Tool Calling Integration

    • Python Tools: Easy integration via function calling.
    • MCP Tools: Stdio or HTTP servers supported for external integrations.

  • Screenshots for Grounding
    Use screenshots as additional context for the LLM for more accurate reasoning (a tad slower). Also supports saving screenshots for later review.

  • PDF Download Automation
    Downloads PDFs from both direct download links and embedded Chrome PDF viewers.

  • Batch Execution from CSV
    Run or schedule multiple tasks in one batch — each task can target a different site, goal, or credential set, with its own settings (e.g., single vs. multi-action per turn, record/replay on/off, masking enabled/disabled, screenshots enabled/disabled).

  • OTP Handling
    Handles text-message OTPs automatically: once SMS forwarding from your phone to your Gmail is set up, Surfari picks up the code and fills it in during login.

  • Google Tools Integration
    Out-of-the-box support for Gmail, Google Sheets, and Google Docs.

  • Deployment Options

    • CLI Binaries: Platform-specific executables — no Python setup required. Just download and run.
    • Docker Deployment: Cloud mode with VNC-based browser streaming. Provision a VM and access the remote browser directly from your web browser.
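
The record, parameterize, and replay idea, together with self-healing replay, can be pictured with a small sketch. This is not Surfari's implementation: the Step class, the replay function, and the execute and llm_fallback callables are hypothetical names used only for illustration, and the recorded workflow format is an assumption.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class Step:
    """One recorded action; `value` may hold $placeholders (hypothetical format)."""
    action: str
    target: str
    value: str = ""


def replay(steps, params, execute, llm_fallback):
    """Replay recorded steps with new parameter values.

    `execute` performs a step deterministically and raises LookupError when the
    target cannot be found; `llm_fallback` stands in for real-time LLM reasoning.
    Returns the number of steps that needed the fallback.
    """
    healed = 0
    for step in steps:
        # Plug the new values into the generalized step.
        concrete = Step(step.action, step.target, Template(step.value).substitute(params))
        try:
            execute(concrete)
        except LookupError:
            # Recorded path failed: handle this one step another way, then carry on.
            llm_fallback(concrete)
            healed += 1
    return healed


# A workflow recorded once, with the date range generalized into parameters.
workflow = [
    Step("click", "Statements"),
    Step("fill", "Start date", "$start"),
    Step("fill", "End date", "$end"),
    Step("click", "Download"),
]

log = []


def fake_execute(step):
    # Simulate layout drift: the "Download" target can no longer be found.
    if step.target == "Download":
        raise LookupError(step.target)
    log.append(("replay", step.action, step.target, step.value))


def fake_llm(step):
    log.append(("llm", step.action, step.target, step.value))


healed = replay(workflow, {"start": "2025-03-01", "end": "2025-04-30"}, fake_execute, fake_llm)
print(healed, log)
```

In this sketch only the one failing step goes through the fallback; the other three run deterministically with the substituted dates.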

📦 Installation

pip install surfari

Or from source:

git clone https://github.com/surfari-ai/surfari.git
cd surfari
pip install .

🚀 Quick Start

import asyncio

from surfari.agents.navigation_agent import NavigationAgent
from surfari.cdp_browser import ChromiumManager
from surfari.surfari_logger import getLogger

logger = getLogger(__name__)


async def test_navigation_agent():
    # Site to automate and the task, stated in plain language.
    site_name, task_goal = "cricket", "Download my March-April 2025 statements."

    # Get the managed Chromium instance and open a page in it.
    manager = await ChromiumManager.get_instance(use_system_chrome=False)
    try:
        page = await manager.get_new_page()

        # Run the agent; data masking is turned off for this example.
        nav_agent = NavigationAgent(site_name=site_name, enable_data_masking=False)
        answer = await nav_agent.run(page, task_goal=task_goal)
        print("Final answer:", answer)
    finally:
        # Stop the browser even if the run fails.
        await ChromiumManager.stop_instance()


if __name__ == "__main__":
    asyncio.run(test_navigation_agent())
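
The quick start turns masking off with enable_data_masking=False. To give a feel for what Information Masking means when it is on, here is a minimal sketch that swaps digit-like strings for placeholders before text is shared, then restores them afterwards. It is an illustration only, not Surfari's masking code: the mask and unmask functions, the regular expression, and the <NUM_n> placeholder format are all assumptions.

```python
import re

# A run of digits, optionally with "," "." or "-" between them (e.g. 1,204.50).
DIGIT_LIKE = re.compile(r"\d[\d,.\-]*\d|\d")


def mask(text):
    """Replace each digit-like string with a placeholder; return text and mapping."""
    mapping = {}

    def swap(match):
        token = f"<NUM_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token

    return DIGIT_LIKE.sub(swap, text), mapping


def unmask(text, mapping):
    """Restore the original values for any placeholders present in `text`."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


masked, mapping = mask("Account 1234-5678 has a balance of 1,204.50 USD.")
print(masked)
restored = unmask("Open the statement for <NUM_0>", mapping)
print(restored)
```

The placeholder text is what would appear in a prompt or log; the mapping stays local so the real values can be put back when an action is carried out.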

🔐 Credential Storage

  • Linux: Key stored in ~/.surfari/key_string with permissions rw------- (chmod 600).
  • macOS: Key stored in ~/.surfari/key_string or system keyring (via keyring library).
  • Windows: Key stored in system keyring (via keyring library).
  • Database: Encrypted SQLite in your Surfari environment.
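
For readers unfamiliar with the rw------- (chmod 600) permissions mentioned for Linux, the sketch below shows one way to create an owner-only file in Python and check its mode. It is not Surfari's code: write_private_file is a hypothetical helper, the file is written to a temporary directory rather than ~/.surfari, and the check assumes a POSIX system (Linux or macOS).

```python
import os
import stat
import tempfile


def write_private_file(path, data):
    """Create `path` readable and writable by the owner only (mode 0o600)."""
    # O_EXCL refuses to overwrite an existing file; the mode applies at creation.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    # Set the mode explicitly as well, in case the umask narrowed it further.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "key_string")
    write_private_file(key_path, "not-a-real-key")
    # Keep only the permission bits of the file mode.
    mode = stat.S_IMODE(os.stat(key_path).st_mode)
    print(oct(mode))
```

Running ls -l ~/.surfari/key_string on Linux should show the same owner-only permissions that the bullet above describes.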

🛠 Development

git clone https://github.com/surfari-ai/surfari.git
cd surfari
pip install -e ".[dev]"
python -m playwright install chromium

📂 Project Structure

src/surfari/
  ├── __init__.py
  ├── agents/
  │    └── navigation_agent/
  ├── security/
  │    ├── site_credential_manager.py
  │    └── credentials.db
  ├── util/
  │    └── config.json
  └── view/
       └── html_to_text.js

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/new-thing)
  3. Commit changes (git commit -m "Add new thing")
  4. Push to branch (git push origin feature/new-thing)
  5. Open a Pull Request

📜 License

MIT License — see LICENSE for details.
