CLI-first browser automation with platform intelligence

anyweb

CLI-first browser automation for AI agents. One command reads a webpage, one command clicks a button — no boilerplate, no browser lifecycle to manage.

Built on Playwright with anti-detection, a persistent daemon, and platform-specific extractors for X, Zhihu, XHS, and more.

Why anyweb

AI coding agents (Claude Code, Codex, etc.) need to interact with the web but can't drive a browser natively. anyweb fills that gap:

  • Atomic commands: open, click, input, keys, read. Each one maps to a single action
  • Persistent daemon: the browser stays alive between commands, ~50ms latency vs ~3s cold start
  • Multi-page parallelism: the --page flag runs named tabs concurrently, well suited to batch queries
  • Accessibility tree: state --ax returns a compact, structured page representation with text, URLs, and element indices, ideal for LLM context
  • Content-stable waiting: wait content-stable detects when dynamic content (Grok responses, streaming results) finishes loading
  • Platform intelligence: read auto-detects X/Zhihu/XHS and applies platform-specific extraction

Install

pip install anyweb        # or: uv tool install anyweb
anyweb install            # install Chromium

Quick start

# Read any webpage — auto-detects platform
anyweb read "https://x.com/karpathy"

# JSON output for programmatic use
anyweb --json read "https://x.com/karpathy"

# Search within a platform
anyweb search x "embodied AI news" --limit 10

# System health check
anyweb doctor

Commands

High-level

Command                What it does
read URL               Smart extract: auto-detect platform, render JS, return clean content
search PLATFORM QUERY  Search within X / Zhihu / XHS
login PLATFORM         Save session cookies (opens browser window)
status                 Daemon, browsers, pages, sessions, recommended actions
doctor [--fix]         Dependency + credential check with auto-repair

Atomic browser

Command              What it does
open URL             Navigate (starts daemon if needed)
state                List interactive elements with indices
state --ax           Accessibility tree: compact text + URLs for LLMs
click IDX            Click element by index or CSS selector
input IDX "text"     Click element, then type
type "text"          Type via keyboard events (no click)
keys Enter           Send keyboard events
hover IDX            Hover over element
scroll down          Scroll the page
eval "js"            Execute JavaScript (supports Promises)
screenshot           Capture screenshot
get text|title|url   Get page info
wait selector "h1"   Wait for CSS selector
wait content-stable  Wait for dynamic content to settle
back                 Navigate back
close                Close browser; daemon stays alive
shutdown             Stop everything
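
The atomic commands compose into short scripts. The following is a minimal sketch of a fill-and-submit flow that uses only commands from the table above. The function name submit_and_read is hypothetical, and passing a CSS selector to input (rather than an index) is an assumption based on the multi-page example in this README.

```shell
# Hypothetical helper: open a page, type into a field, submit, print the page text.
# Usage: submit_and_read URL SELECTOR TEXT
submit_and_read() {
  anyweb open "$1" &&            # navigate (starts the daemon if needed)
  anyweb input "$2" "$3" &&      # click the element, then type
  anyweb keys Enter &&           # submit with the Enter key
  anyweb wait content-stable &&  # wait for dynamic content to settle
  anyweb get text                # print the page text
}
```

Each step is chained with &&, so the flow stops at the first command that fails. A call might look like: submit_and_read "https://example.com" textarea "hello".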

Diagnostics

Command              What it does
log [-f] [--errors]  View, follow, or filter daemon command log
sessions             List saved sessions with expiry info
cookies              Cookie management

Multi-page parallelism

--page NAME runs multiple named tabs concurrently, which is essential for batch operations such as sending four Grok queries in parallel:

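# Open 4 pages (the first open runs in the foreground and starts the daemon if needed)
anyweb open --page g1 "https://x.com/i/grok?new=true"
anyweb open --page g2 "https://x.com/i/grok?new=true" &
anyweb open --page g3 "https://x.com/i/grok?new=true" &
anyweb open --page g4 "https://x.com/i/grok?new=true" &
wait

# Submit queries in parallel
anyweb input --page g1 textarea "Query 1" && anyweb keys --page g1 Enter &
anyweb input --page g2 textarea "Query 2" && anyweb keys --page g2 Enter &
anyweb input --page g3 textarea "Query 3" && anyweb keys --page g3 Enter &
anyweb input --page g4 textarea "Query 4" && anyweb keys --page g4 Enter &
wait

# Wait for all responses to finish
anyweb wait --page g1 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g2 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g3 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g4 content-stable --duration 5 --timeout 120000 &
wait

# Read results
anyweb state --page g1 --ax > /tmp/result1.txt
anyweb state --page g2 --ax > /tmp/result2.txt
anyweb state --page g3 --ax > /tmp/result3.txt
anyweb state --page g4 --ax > /tmp/result4.txt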
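
For more than a handful of pages, the same pattern can be wrapped in a loop. This is a sketch, not part of anyweb: the function name grok_batch, the page names g1..gN, and the result file names are hypothetical, while the commands and flags are copied from the example above.

```shell
# Hypothetical helper: run one query per named page, in parallel.
# Usage: grok_batch OUT_DIR "query 1" "query 2" ...
grok_batch() (
  [ "$#" -ge 2 ] || { echo "usage: grok_batch OUT_DIR QUERY..." >&2; exit 2; }
  out_dir=$1; shift
  url="https://x.com/i/grok?new=true"
  n=$#

  # Open the first page in the foreground (starts the daemon if needed),
  # then open the remaining pages in parallel.
  anyweb open --page g1 "$url" || exit 1
  i=2
  while [ "$i" -le "$n" ]; do
    anyweb open --page "g$i" "$url" &
    i=$((i + 1))
  done
  wait

  # Submit each query on its own page, then wait for the response to settle.
  i=1
  for query in "$@"; do
    {
      anyweb input --page "g$i" textarea "$query" &&
      anyweb keys --page "g$i" Enter &&
      anyweb wait --page "g$i" content-stable --duration 5 --timeout 120000
    } &
    i=$((i + 1))
  done
  wait

  # Save the accessibility tree of each page.
  i=1
  while [ "$i" -le "$n" ]; do
    anyweb state --page "g$i" --ax > "$out_dir/result$i.txt"
    i=$((i + 1))
  done
)
```

The function body runs in a subshell, so its wait calls only block on the jobs it started. A call such as grok_batch /tmp "Query 1" "Query 2" would write /tmp/result1.txt and /tmp/result2.txt.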

read manages pages internally — no --page needed for batch reads:

anyweb --json read "https://x.com/AnthropicAI" > /tmp/a.json &
anyweb --json read "https://x.com/OpenAI" > /tmp/b.json &
anyweb --json read "https://x.com/karpathy" > /tmp/c.json &
wait

Accessibility tree

state --ax returns the page structure optimized for LLM consumption — text, links, and interactive elements in a compact format:

URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128

Options: --interactive-only (clickable/typable elements only), --depth N (limit tree depth).
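
Because the tree is line-oriented, standard text tools can post-process it. The sketch below pulls out every link reference and its URL. The line layout it matches (eN:link "label" -> URL) is inferred only from the sample output above and may differ in practice; the function name ax_links is hypothetical.

```shell
# Hypothetical helper: print "ref<TAB>url" for each link line read from stdin.
# Assumes link lines look like:  e12:link "label" -> https://...
ax_links() {
  awk '/^[ \t]*e[0-9]+:link / && / -> / {
    ref = $1
    sub(/:link$/, "", ref)   # turn "e12:link" into "e12"
    print ref "\t" $NF       # the URL is the last whitespace-separated field
  }'
}
```

Typical use would be a pipeline such as anyweb state --ax | ax_links, which gives an agent a short list of references it can pass to click.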

Platform intelligence

read auto-detects the platform from the URL and applies the right extraction:

Platform   Detection           Extraction
X/Twitter  x.com, twitter.com  Tweets, threads, profiles, engagement stats, media URLs
Zhihu      zhihu.com           Auto-expand folded content, answers
XHS        xiaohongshu.com     Notes, comments, images
Generic    everything else     Article body, metadata

Force a platform: anyweb read -p x "https://example.com"
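
To illustrate the detection column, here is a rough sketch of host-based matching. It is not anyweb's actual implementation: the function name detect_platform, the subdomain handling, and the label generic are assumptions made for the example.

```shell
# Hypothetical sketch: map a URL to a platform name by its host.
detect_platform() (
  host=${1#*://}          # drop the scheme, if any
  host=${host%%[/:?#]*}   # drop port, path, query, and fragment
  case "$host" in
    x.com|*.x.com|twitter.com|*.twitter.com) echo x ;;
    zhihu.com|*.zhihu.com)                   echo zhihu ;;
    xiaohongshu.com|*.xiaohongshu.com)       echo xhs ;;
    *)                                       echo generic ;;
  esac
)
```

Matching on the host alone, rather than on the whole URL, keeps a path such as https://example.com/x.com from being mistaken for X.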

Session management

Log in once and reuse the session until its cookies expire. Cookies persist across daemon restarts:

anyweb login x                    # opens browser, you log in, cookies saved
anyweb login x --method import    # import from system Chrome (no manual login)
anyweb login x --method cookie    # paste cookies directly

doctor and status show session health:

=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs
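
A script can read the same report to find out which platforms still need a login. The sketch below assumes the session lines keep the layout shown in the sample above (platform name, a colon, then either session details or "no auth cookie"); the function name missing_sessions is hypothetical.

```shell
# Hypothetical helper: print platforms whose session line says "no auth cookie".
# Reads doctor or status output from stdin.
missing_sessions() {
  awk -F: '/no auth cookie/ {
    gsub(/[ \t]/, "", $1)   # strip the padding around the platform name
    print $1
  }'
}
```

For example, anyweb doctor | missing_sessions would print xhs for the report above.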

Output formats

anyweb read URL              # plain text (default, pipe-friendly)
anyweb --json read URL       # structured JSON with metadata
anyweb --markdown read URL   # markdown with frontmatter

Architecture

CLI ──▶ Unix socket ──▶ Daemon (persistent) ──▶ Playwright browser
                         │
                         ├── Headless engine (default, fast)
                         └── CDP engine (system Chrome, for login/headed)

  • Daemon auto-starts on first command and stays alive across invocations
  • Dual engine: headless Playwright for speed, CDP to system Chrome for --headed / login
  • Anti-detection: playwright-stealth applied automatically
  • Session isolation: per-platform cookie storage in ~/.anyweb/sessions/

Requirements

  • Python 3.11+
  • macOS or Linux

License

MIT
