CLI-first browser automation with platform intelligence

anyweb

CLI-first browser automation for AI agents. One command reads a webpage, one command clicks a button — no boilerplate, no browser lifecycle to manage.

Built on Playwright with anti-detection, a persistent daemon, and platform-specific extractors for X, Zhihu, XHS, and more.

Why anyweb

AI coding agents (Claude Code, Codex, etc.) need to interact with the web but can't drive a browser natively. anyweb fills that gap:

  • Atomic commands: open, click, input, keys, read. Each one maps to a single action
  • Persistent daemon: browser stays alive between commands, ~50ms latency vs ~3s cold start
  • Multi-page parallelism: the --page flag runs named tabs concurrently, well suited to batch queries
  • Accessibility tree: state --ax returns a compact, structured page representation with text, URLs, and element indices, suited to LLM context
  • Content-stable waiting: wait content-stable detects when dynamic content (Grok responses, streaming results) finishes loading
  • Platform intelligence: read auto-detects X/Zhihu/XHS and applies platform-specific extraction

Install

pip install anyweb        # or: uv tool install anyweb
anyweb install            # install Chromium

Quick start

# Read any webpage — auto-detects platform
anyweb read "https://x.com/karpathy"

# JSON output for programmatic use
anyweb --json read "https://x.com/karpathy"

# Search within a platform
anyweb search x "embodied AI news" --limit 10

# System health check
anyweb doctor
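
For agents that drive anyweb from Python rather than a shell, the commands above could be wrapped with subprocess. This is a minimal sketch, not part of anyweb: the helper names (anyweb_argv, run_anyweb, read_json) are hypothetical, it assumes anyweb is on PATH and exits non-zero on failure, and it assumes nothing about the JSON schema beyond stdout being valid JSON.

```python
import json
import subprocess
from typing import Any, Sequence


def anyweb_argv(command: str, *args: str, output: str = "text") -> list[str]:
    """Build an argv list; the format flag goes before the subcommand,
    as in `anyweb --json read URL`."""
    flags = {"text": [], "json": ["--json"], "markdown": ["--markdown"]}
    if output not in flags:
        raise ValueError(f"unknown output format: {output!r}")
    return ["anyweb", *flags[output], command, *args]


def run_anyweb(argv: Sequence[str], timeout_s: float = 120.0) -> str:
    """Run one anyweb command and return its stdout.

    Assumption: a failed command exits non-zero, so check=True raises
    subprocess.CalledProcessError.
    """
    done = subprocess.run(
        argv, capture_output=True, text=True, check=True, timeout=timeout_s
    )
    return done.stdout


def read_json(url: str) -> Any:
    """Hypothetical helper: run `anyweb --json read URL` and parse the result."""
    return json.loads(run_anyweb(anyweb_argv("read", url, output="json")))
```

Passing an argv list instead of a shell string avoids quoting problems when a URL or query contains spaces or shell metacharacters.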

Commands

High-level

Command                What it does
read URL               Smart extract: auto-detect platform, render JS, return clean content
search PLATFORM QUERY  Search within X / Zhihu / XHS
login PLATFORM         Save session cookies (opens browser window)
status                 Daemon, browsers, pages, sessions, recommended actions
doctor [--fix]         Dependency + credential check with auto-repair

Atomic browser

Command              What it does
open URL             Navigate (starts daemon if needed)
state                List interactive elements with indices
state --ax           Accessibility tree: compact text + URLs for LLMs
click IDX            Click element by index or CSS selector
input IDX "text"     Click element, then type
type "text"          Type via keyboard events (no click)
keys Enter           Send keyboard events
hover IDX            Hover over element
scroll down          Scroll the page
eval "js"            Execute JavaScript (supports Promises)
screenshot           Capture screenshot
get text|title|url   Get page info
wait selector "h1"   Wait for CSS selector
wait content-stable  Wait for dynamic content to settle
back                 Navigate back
close                Close browser; daemon stays alive
shutdown             Stop everything
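
Because each atomic command is a single action, a multi-step interaction is just an ordered list of commands. The sketch below shows one way an agent might plan and run such a list. The function names are hypothetical, passing a CSS selector to input follows the multi-page example in this README, and stopping at the first non-zero exit code assumes anyweb signals failure that way.

```python
import subprocess
from typing import Callable, Sequence

Command = list[str]


def plan_submit(url: str, field: str, text: str) -> list[Command]:
    """Plan: open a page, fill one field, press Enter, then dump the result."""
    return [
        ["anyweb", "open", url],
        ["anyweb", "wait", "selector", field],
        ["anyweb", "input", field, text],
        ["anyweb", "keys", "Enter"],
        ["anyweb", "wait", "content-stable"],
        ["anyweb", "state", "--ax"],
    ]


def _run(argv: Command) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(argv).returncode


def run_plan(plan: Sequence[Command], run: Callable[[Command], int] = _run) -> int:
    """Run commands in order and stop at the first failure.

    Returns the number of commands that completed successfully.
    """
    completed = 0
    for argv in plan:
        if run(argv) != 0:
            break
        completed += 1
    return completed
```

Injecting the runner keeps the plan testable without a browser: a fake runner can record the commands and simulate a failure at any step.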

Diagnostics

Command              What it does
log [-f] [--errors]  View, follow, or filter daemon command log
sessions             List saved sessions with expiry info
cookies              Cookie management

Multi-page parallelism

--page NAME runs multiple named tabs concurrently, which is essential for batch operations such as sending four Grok queries in parallel:

# Open 4 pages
anyweb open --page g1 "https://x.com/i/grok?new=true"
anyweb open --page g2 "https://x.com/i/grok?new=true" &
anyweb open --page g3 "https://x.com/i/grok?new=true" &
anyweb open --page g4 "https://x.com/i/grok?new=true" &
wait

# Submit queries in parallel (g3 and g4 follow the same pattern)
anyweb input --page g1 textarea "Query 1" && anyweb keys --page g1 Enter &
anyweb input --page g2 textarea "Query 2" && anyweb keys --page g2 Enter &
wait

# Wait for all responses to finish
anyweb wait --page g1 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g2 content-stable --duration 5 --timeout 120000 &
wait

# Read results
anyweb state --page g1 --ax > /tmp/result1.txt
anyweb state --page g2 --ax > /tmp/result2.txt
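
The idea behind wait content-stable can be sketched as a polling loop: take a snapshot of the page text, and report success once it has stopped changing for a given duration. This is one plausible way to implement such a check, not anyweb's actual code. The snapshot callable, the polling interval, and the use of seconds for every parameter are assumptions made for this sketch.

```python
import hashlib
import time
from typing import Callable


def wait_content_stable(
    snapshot: Callable[[], str],
    duration_s: float = 5.0,
    timeout_s: float = 120.0,
    poll_s: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once snapshot() has been unchanged for duration_s seconds,
    or False if timeout_s elapses first."""
    deadline = clock() + timeout_s
    last_digest = None
    stable_since = 0.0
    while True:
        now = clock()
        digest = hashlib.sha256(snapshot().encode("utf-8")).hexdigest()
        if digest != last_digest:
            # Content changed (or first sample): restart the stability window.
            last_digest = digest
            stable_since = now
        elif now - stable_since >= duration_s:
            return True
        if now >= deadline:
            return False
        sleep(poll_s)
```

Comparing digests rather than full strings keeps memory use flat on large pages. The clock and sleep parameters exist only so the loop can be tested without real waiting.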

read manages pages internally — no --page needed for batch reads:

anyweb --json read "https://x.com/AnthropicAI" > /tmp/a.json &
anyweb --json read "https://x.com/OpenAI" > /tmp/b.json &
anyweb --json read "https://x.com/karpathy" > /tmp/c.json &
wait
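
The shell pattern of backgrounding commands with & and then calling wait has a direct Python analogue in a thread pool. The following is a sketch under the assumption that anyweb is on PATH. The helper names are hypothetical, and only the --page flag shown in this README is used.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

Command = list[str]


def open_commands(url: str, pages: Sequence[str]) -> list[Command]:
    """One `anyweb open --page NAME URL` command per named tab."""
    return [["anyweb", "open", "--page", name, url] for name in pages]


def _run(argv: Command) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(argv, capture_output=True, text=True).returncode


def run_parallel(
    commands: Sequence[Command],
    run: Callable[[Command], int] = _run,
    max_workers: int = 4,
) -> list[int]:
    """Run all commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

Threads are sufficient here because each worker only blocks on a child process. For batch reads, the same run_parallel call works without --page, since read manages its pages internally.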

Accessibility tree

state --ax returns the page structure optimized for LLM consumption — text, links, and interactive elements in a compact format:

URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128

Options: --interactive-only (clickable/typable elements only), --depth N (limit tree depth).
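
An agent often needs only the indexed elements from this output. The sketch below parses lines shaped like the sample above (eN:role "name" -> url). It is based solely on that sample, so other roles or line shapes that anyweb may emit are an open assumption, and the Ref class and parse_refs function are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches e.g.:  e12:link "View tweet" -> https://x.com/...
# The "-> url" part is treated as optional.
REF_LINE = re.compile(r'^\s*(e\d+):(\w+)\s+"(.*)"(?:\s+->\s+(\S+))?\s*$')


@dataclass(frozen=True)
class Ref:
    ref: str            # element reference as printed, e.g. "e12"
    role: str           # accessibility role, e.g. "link"
    name: str           # accessible name shown in quotes
    url: Optional[str]  # link target, if the line has one


def parse_refs(ax_text: str) -> list[Ref]:
    """Collect every indexed element line; all other lines are skipped."""
    refs = []
    for line in ax_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs.append(Ref(*match.groups()))
    return refs


def links_matching(refs: list[Ref], needle: str) -> list[Ref]:
    """Case-insensitive search over element names."""
    return [r for r in refs if needle.lower() in r.name.lower()]
```

Header lines (URL, Title, Refs) and StaticText nodes do not match the pattern and are ignored, so the result contains only elements an agent could act on.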

Platform intelligence

read auto-detects the platform from the URL and applies the right extraction:

Platform   Detection           Extraction
X/Twitter  x.com, twitter.com  Tweets, threads, profiles, engagement stats, media URLs
Zhihu      zhihu.com           Auto-expand folded content, answers
XHS        xiaohongshu.com     Notes, comments, images
Generic    everything else     Article body, metadata

Force a platform: anyweb read -p x "https://example.com"
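
The detection column in the table amounts to a hostname lookup. As a rough sketch of that mapping (not anyweb's actual detection code), the function below uses the platform names x, zhihu, and xhs as they appear in the login and session commands. The "generic" label and the subdomain handling are assumptions.

```python
from urllib.parse import urlsplit

# Domains taken from the table above.
PLATFORM_DOMAINS = {
    "x": ("x.com", "twitter.com"),
    "zhihu": ("zhihu.com",),
    "xhs": ("xiaohongshu.com",),
}


def detect_platform(url: str) -> str:
    """Map a URL to a platform name, falling back to "generic"."""
    host = (urlsplit(url).hostname or "").lower()
    for platform, domains in PLATFORM_DOMAINS.items():
        for domain in domains:
            # Match the bare domain or any subdomain, but not lookalikes
            # such as "notx.com".
            if host == domain or host.endswith("." + domain):
                return platform
    return "generic"
```

Matching on the parsed hostname rather than a substring of the URL avoids false positives when a platform domain appears in a path or query string.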

Session management

Log in once and reuse the session until its cookies expire. Cookies persist across daemon restarts:

anyweb login x                    # opens browser, you log in, cookies saved
anyweb login x --method import    # import from system Chrome (no manual login)
anyweb login x --method cookie    # paste cookies directly

doctor and status show session health:

=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs
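
A script can act on this report by picking out platforms whose session is missing or expired. The parser below is a sketch written against the sample output shown above only. The exact line format may differ between anyweb versions, and the function name is hypothetical.

```python
import re
from datetime import date

# Matches e.g.:  x : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
SESSION_LINE = re.compile(
    r"^\s*(\w+)\s*:\s*saved (\d{4}-\d{2}-\d{2}) \| cookies:\s*(\d+)\s*"
    r"\| expires (\d{4}-\d{2}-\d{2})\s*$"
)
# Matches e.g.:  xhs : no auth cookie
MISSING_LINE = re.compile(r"^\s*(\w+)\s*:\s*no auth cookie")


def platforms_needing_login(report: str, today: date) -> list[str]:
    """Return platforms with no auth cookie or an expiry date on or before today."""
    needs_login = []
    for line in report.splitlines():
        saved = SESSION_LINE.match(line)
        if saved:
            if date.fromisoformat(saved.group(4)) <= today:
                needs_login.append(saved.group(1))
            continue
        missing = MISSING_LINE.match(line)
        if missing:
            needs_login.append(missing.group(1))
    return needs_login
```

Each returned name can then be passed to anyweb login, which is the same action the report's recommended-actions section suggests.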

Output formats

anyweb read URL              # plain text (default, pipe-friendly)
anyweb --json read URL       # structured JSON with metadata
anyweb --markdown read URL   # markdown with frontmatter

Architecture

CLI ──▶ Unix socket ──▶ Daemon (persistent) ──▶ Playwright browser
                         │
                         ├── Headless engine (default, fast)
                         └── CDP engine (system Chrome, for login/headed)
  • Daemon auto-starts on first command, stays alive across invocations
  • Dual engine: headless Playwright for speed, CDP to system Chrome for --headed / login
  • Anti-detection: playwright-stealth applied automatically
  • Session isolation: per-platform cookie storage in ~/.anyweb/sessions/
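
The CLI-to-daemon hop in the diagram can be illustrated with a toy example. This is a sketch of the general pattern only: the JSON-lines message format, the serve and send names, and the counter standing in for the browser are all invented here and are not anyweb's actual protocol or code.

```python
import json
import socket
import threading


def serve(sock_path: str, ready: threading.Event, stop: threading.Event) -> None:
    """Toy daemon: holds long-lived state and answers one JSON line per connection."""
    state = {"commands_seen": 0}  # stands in for the persistent browser
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen()
        server.settimeout(0.1)  # wake up regularly to check the stop flag
        ready.set()
        while not stop.is_set():
            try:
                conn, _ = server.accept()
            except socket.timeout:
                continue
            with conn, conn.makefile("rw", encoding="utf-8") as stream:
                request = json.loads(stream.readline())
                state["commands_seen"] += 1
                reply = {
                    "ok": True,
                    "command": request["command"],
                    "seq": state["commands_seen"],
                }
                stream.write(json.dumps(reply) + "\n")
                stream.flush()


def send(sock_path: str, command: str, *args: str) -> dict:
    """Toy CLI side: connect, send one command, read one reply, disconnect."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        with client.makefile("rw", encoding="utf-8") as stream:
            stream.write(json.dumps({"command": command, "args": list(args)}) + "\n")
            stream.flush()
            return json.loads(stream.readline())
```

The seq counter increases across separate send calls, which shows the point of the design: each short-lived client invocation talks to a process whose state outlives it, so nothing has to be started cold per command.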

Requirements

  • Python 3.11+
  • macOS or Linux

License

MIT
