browse-now

Browser automation CLI for AI agents. Control your browser from the command line with reliable element targeting through accessibility tree refs.

Part of Nowledge Mem - Personal memory for AI agents.

Why browse-now?

  • Low overhead: No tool definitions loaded into LLM context (unlike MCP)
  • Reliable targeting: Uses accessibility tree refs (@e1, @e2) for 95%+ click reliability
  • AI-agent optimized: Designed for the snapshot → click → repeat workflow
  • Multi-browser support: Works with Chrome, Arc, Edge, and other Chromium browsers

Installation

pip install browse-now

Prerequisites

  1. Nowledge Memory Exchange Chrome extension (version 2.0.71 or later) installed
  2. Nowledge Mem app running (provides the bridge server)

Quick Start

# Navigate to a page
browse-now open https://example.com

# Get interactive elements with refs
browse-now snapshot -i
# Output:
#   textbox "Search" [e1]
#   button "Submit" [e2]
#   link "About" [e3]

# Click by ref
browse-now click @e2

# Fill input
browse-now fill @e1 "AI agents" --submit
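
A script that drives the CLI usually needs the refs as data rather than as text. The following is a minimal sketch that parses snapshot output, assuming every element line has the shape shown in the sample output above (role, quoted name, bracketed ref). The `Element` type and `parse_snapshot` function are hypothetical helpers, not part of browse-now, and the real output may contain lines this pattern does not cover.

```python
import re
from typing import NamedTuple


class Element(NamedTuple):
    role: str
    name: str
    ref: str  # stored with the leading "@" so it can be passed straight to click/fill


# Assumed line shape, copied from the Quick Start sample: role "name" [eN]
_LINE = re.compile(r'^\s*([\w-]+)\s+"([^"]*)"\s+\[(e\d+)\]\s*$')


def parse_snapshot(text: str) -> list[Element]:
    """Return one Element per line matching the assumed shape; skip other lines."""
    elements = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            role, name, ref = match.groups()
            elements.append(Element(role, name, "@" + ref))
    return elements


sample = '''textbox "Search" [e1]
button "Submit" [e2]
link "About" [e3]'''

for element in parse_snapshot(sample):
    print(element.ref, element.role, element.name)
```

Lines that do not match are skipped rather than treated as errors, since the full format of the snapshot output is not documented here.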

Core Workflow

The browse-now workflow matches the Claude Chrome Extension pattern:

1. browse-now open <url>           # Open page (isolated agent tab)
2. browse-now snapshot -i          # Get refs [e1], [e2]...
3. browse-now click @e5            # Click by ref
4. browse-now fill @e3 "text"      # Fill by ref
5. (If page changed) → snapshot -i again
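
The steps above can be sketched as a small Python driver that shells out to the CLI. This is an illustration only: `cli_runner` and `open_and_fill` are hypothetical names, and `check=True` assumes the CLI signals failure with a non-zero exit status, which is not stated here. The runner is injectable so the sequence can be exercised without a browser.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def cli_runner(argv: Sequence[str]) -> str:
    """Run one browse-now command and return its stdout."""
    # Assumption: a failed command exits non-zero, so check=True raises.
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


def open_and_fill(url: str, ref: str, text: str, run: Runner = cli_runner) -> str:
    """open -> snapshot -i -> fill --submit -> snapshot -i again."""
    run(["browse-now", "open", url])
    run(["browse-now", "snapshot", "-i"])  # take a snapshot before using any ref
    run(["browse-now", "fill", ref, text, "--submit"])
    # Submitting usually changes the page, so take a fresh snapshot (step 5).
    return run(["browse-now", "snapshot", "-i"])
```

Calling `open_and_fill("https://example.com", "@e1", "AI agents")` would issue the same four commands as the Quick Start, ending with a new snapshot whose refs replace the earlier ones.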

Command Reference

Navigation

browse-now open <url>              # Navigate (isolated tab by default)
browse-now open <url> --no-isolated  # Use current active tab
browse-now back                    # Go back
browse-now forward                 # Go forward
browse-now reload                  # Reload page

Snapshot (Page Analysis)

browse-now snapshot                # Full accessibility tree
browse-now snapshot -i             # Interactive elements only (recommended)
browse-now snapshot -i -w 1000     # Wait 1s for late-loading content
browse-now snapshot -s "main"      # Scope to CSS selector

Interactions

# Click by ref (95%+ reliable - primary method)
browse-now click @e1

# Click by text (85% reliable - fallback for dialogs)
browse-now click -T "Submit"
browse-now click -T "确认"         # Works with Chinese

# Fill and type
browse-now fill @e2 "text"         # Clear and type
browse-now fill @e2 "text" --submit  # Fill and submit
browse-now type @e2 "more"         # Append without clearing

# Other interactions
browse-now press Enter             # Press keyboard key
browse-now hover @e1               # Hover (reveal hidden UI)
browse-now scroll down 500         # Scroll page

Get Information

browse-now get text @e1            # Get element text
browse-now get title               # Get page title
browse-now get url                 # Get current URL
browse-now get page-text           # Extract full page text

Screenshots

browse-now screenshot page.png     # Save to file
browse-now screenshot --full       # Full page screenshot
browse-now screenshot -e @e1       # Element screenshot

Wait

browse-now wait @e1                # Wait for element
browse-now wait 2                  # Wait 2 seconds
browse-now wait 500ms              # Wait 500 milliseconds

Tabs

browse-now tabs                    # List open tabs
browse-now switch <tab_id>         # Switch to tab by ID
browse-now agent-tab status        # Check agent tab status

Multi-Browser

browse-now browsers                # List connected browsers
browse-now -b arc_123 open <url>   # Target specific browser
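
In the examples in this README, global options such as -b appear before the subcommand. A minimal sketch of an argument builder that follows that ordering is shown below. `build_argv` is a hypothetical helper, and combining -b with -j in one call is an assumption, since only separate uses are shown.

```python
from typing import Optional, Sequence


def build_argv(
    command: Sequence[str],
    *,
    browser: Optional[str] = None,
    json_output: bool = False,
) -> list[str]:
    """Build a browse-now argv with global options placed before the subcommand."""
    argv = ["browse-now"]
    if browser is not None:
        argv += ["-b", browser]  # browser ID as listed by `browse-now browsers`
    if json_output:
        argv.append("-j")
    return argv + list(command)


print(build_argv(["open", "https://example.com"], browser="arc_123"))
```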

JSON Output

For programmatic use, add -j or --json:

browse-now -j snapshot -i
browse-now -j click @e1
browse-now -j get title
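
One way to consume the JSON output from Python is sketched below. It assumes that -j makes the command print a single JSON document to stdout. The structure of that document is not described here, so the sketch returns whatever `json.loads` produces without assuming any keys. `run_json` is a hypothetical helper name.

```python
import json
import subprocess
from typing import Any, Callable, Optional, Sequence


def run_json(
    args: Sequence[str],
    run: Optional[Callable[[Sequence[str]], str]] = None,
) -> Any:
    """Run `browse-now -j <args>` and decode stdout as JSON."""
    argv = ["browse-now", "-j", *args]
    if run is None:
        # Assumption: failures exit non-zero, so check=True raises on error.
        stdout = subprocess.run(argv, capture_output=True, text=True, check=True).stdout
    else:
        stdout = run(argv)  # injected runner, useful for testing without a browser
    return json.loads(stdout)
```

For example, `run_json(["get", "title"])` would run `browse-now -j get title`. Inspect the returned value before relying on particular fields.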

Python API

import asyncio

from browse_now import BrowserClient

async def main():
    # The client is used as an async context manager
    async with BrowserClient() as browser:
        await browser.navigate("https://example.com")
        # These calls pass CSS selectors
        await browser.click("#login-btn")
        await browser.fill("#email", "user@example.com")
        await browser.fill("#password", "secret", submit=True)

asyncio.run(main())

Sync API

from browse_now import BrowserClientSync

browser = BrowserClientSync()
browser.navigate("https://example.com")
browser.click("#btn")

Reliability Guide

Method            Reliability   Use Case
click @eN         95%+          Primary method
click -T "text"   85%           Fallback for dialogs/menus

Decision tree:

  1. snapshot -i → Found ref? → click @eN
  2. Sparse results? → Use screenshot + click -T "visible text"
  3. Re-snapshot with -w 1000 if content loads late
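
The first two branches of the decision tree can be sketched as a function that picks the click command for a given label. It assumes snapshot lines contain a quoted name followed by a bracketed ref, as in the Quick Start sample. `choose_click` is a hypothetical helper, and the third branch (re-snapshotting with -w 1000) is left to the caller.

```python
import re


def choose_click(snapshot_text: str, label: str) -> list[str]:
    """Prefer click @eN when the label has a ref; otherwise fall back to click -T."""
    for line in snapshot_text.splitlines():
        # Assumed line shape: ... "name" [eN]
        match = re.search(r'"([^"]*)"\s+\[(e\d+)\]', line)
        if match and match.group(1) == label:
            return ["browse-now", "click", "@" + match.group(2)]
    # No ref found for this label: use the text-based fallback.
    return ["browse-now", "click", "-T", label]


snapshot = 'textbox "Search" [e1]\nbutton "Submit" [e2]'
print(choose_click(snapshot, "Submit"))
print(choose_click(snapshot, "Cancel"))
```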

Example: Weibo Delete Flow

browse-now snapshot -i             # Find "更多" (More) [e45]
browse-now click @e45              # Open menu
browse-now snapshot -i             # Find "删除" (Delete) [e67]
browse-now click @e67              # Delete
browse-now snapshot -i             # Find "确定" (OK) [e89]
browse-now click @e89              # Confirm
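
The flow above repeats one pattern: snapshot, look up a label, click its ref. Below is a minimal sketch of that loop, assuming snapshot lines contain a quoted name followed by a bracketed ref. `find_ref` and `click_through` are hypothetical helpers, and `run` is any callable that executes an argv list and returns stdout.

```python
import re
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def find_ref(snapshot_text: str, label: str) -> str:
    """Return the ref (with leading "@") of the first element named `label`."""
    for match in re.finditer(r'"([^"]*)"\s+\[(e\d+)\]', snapshot_text):
        if match.group(1) == label:
            return "@" + match.group(2)
    raise LookupError(f"no ref for {label!r}; the page may need another snapshot")


def click_through(labels: Sequence[str], run: Runner) -> list[str]:
    """For each label in order: take a fresh snapshot, resolve the ref, click it."""
    clicked = []
    for label in labels:
        snapshot = run(["browse-now", "snapshot", "-i"])  # refs change after each click
        ref = find_ref(snapshot, label)
        run(["browse-now", "click", ref])
        clicked.append(ref)
    return clicked
```

With a runner that shells out to the CLI, `click_through(["更多", "删除", "确定"], run)` would issue the same six commands as the example. Because the last step confirms a deletion, a real script may want to stop and ask before that click.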

Troubleshooting

  1. No browser connected: Make sure the Nowledge Memory Exchange extension is installed and active
  2. Connection refused: Ensure Nowledge Mem is running
  3. Element not found: Run snapshot -i again after page changes
  4. Sparse results: Site may have poor accessibility. Use screenshot + click -T "visible text"
  5. Multiple browsers: Use -b <browser_id> to target specific browser

Acknowledgments

The browser automation approach is inspired by Claude Chrome Extension and vercel-labs/agent-browser.

License

Proprietary - see Nowledge Terms of Service
