CLI-first browser automation with platform intelligence

anyweb

CLI-first browser automation for AI agents. One command reads a webpage, one command clicks a button — no boilerplate, no browser lifecycle to manage.

Built on Playwright with anti-detection, a persistent daemon, and platform-specific extractors for X, Zhihu, XHS, and more.

Why anyweb

AI coding agents (Claude Code, Codex, etc.) need to interact with the web but can't drive a browser natively. anyweb fills that gap:

  • Atomic commands: open, click, input, keys, read. Each one maps to a single action.
  • Persistent daemon: browser stays alive between commands, ~50ms latency vs ~3s cold start
  • Multi-page parallelism: the --page flag runs named tabs concurrently, perfect for batch queries
  • Accessibility tree: state --ax returns a compact, structured page representation with text, URLs, and element indices, ideal for LLM context
  • Content-stable waiting: wait content-stable detects when dynamic content (Grok responses, streaming results) finishes loading
  • Platform intelligence: read auto-detects X/Zhihu/XHS and applies platform-specific extraction
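
The idea behind content-stable waiting can be sketched as polling the page text until it has stopped changing for a quiet period. This is a minimal illustration, not anyweb's implementation: `wait_content_stable`, `get_text`, and the polling interval are hypothetical names and values, and both time limits are in seconds here, which may not match the units the CLI flags use.

```python
import time
from typing import Callable


def wait_content_stable(
    get_text: Callable[[], str],
    duration: float = 5.0,
    timeout: float = 120.0,
    interval: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once get_text() has been unchanged for `duration` seconds.

    Returns False if `timeout` seconds pass first. `get_text` is a
    hypothetical callable that returns the current page text.
    """
    start = clock()
    last_text = get_text()
    last_change = start
    while True:
        now = clock()
        if now - last_change >= duration:
            return True  # quiet for long enough: content has settled
        if now - start >= timeout:
            return False  # still changing when the deadline hit
        sleep(interval)
        current = get_text()
        if current != last_text:
            # Content moved: remember it and restart the quiet period.
            last_text = current
            last_change = clock()
```

The clock and sleep functions are injectable so the loop can be exercised without real delays.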

Install

pip install anyweb        # or: uv tool install anyweb
anyweb install            # install Chromium

Quick start

# Read any webpage — auto-detects platform
anyweb read "https://x.com/karpathy"

# JSON output for programmatic use
anyweb --json read "https://x.com/karpathy"

# Search within a platform
anyweb search x "embodied AI news" --limit 10

# System health check
anyweb doctor

Commands

High-level

Command                What it does
read URL               Smart extract: auto-detect platform, render JS, return clean content
search PLATFORM QUERY  Search within X / Zhihu / XHS
login PLATFORM         Save session cookies (opens browser window)
status                 Daemon, browsers, pages, sessions, recommended actions
doctor [--fix]         Dependency + credential check with auto-repair

Atomic browser

Command              What it does
open URL             Navigate (starts daemon if needed)
state                List interactive elements with indices
state --ax           Accessibility tree: compact text + URLs for LLMs
click IDX            Click element by index or CSS selector
input IDX "text"     Click element, then type
type "text"          Type via keyboard events (no click)
keys Enter           Send keyboard events
hover IDX            Hover over element
scroll down          Scroll the page
eval "js"            Execute JavaScript (supports Promises)
screenshot           Capture screenshot
get text|title|url   Get page info
wait selector "h1"   Wait for CSS selector
wait content-stable  Wait for dynamic content to settle
back                 Navigate back
close                Close browser; daemon stays alive
shutdown             Stop everything

Diagnostics

Command               What it does
log [-f] [--errors]   View, follow, or filter daemon command log
sessions              List saved sessions with expiry info
cookies               Cookie management

Multi-page parallelism

--page NAME runs multiple named tabs concurrently. This is essential for batch operations such as sending four Grok queries in parallel:

# Open 4 pages
anyweb open --page g1 "https://x.com/i/grok?new=true"
anyweb open --page g2 "https://x.com/i/grok?new=true" &
anyweb open --page g3 "https://x.com/i/grok?new=true" &
anyweb open --page g4 "https://x.com/i/grok?new=true" &
wait

# Submit queries in parallel
anyweb input --page g1 textarea "Query 1" && anyweb keys --page g1 Enter &
anyweb input --page g2 textarea "Query 2" && anyweb keys --page g2 Enter &
wait

# Wait for all responses to finish
anyweb wait --page g1 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g2 content-stable --duration 5 --timeout 120000 &
wait

# Read results
anyweb state --page g1 --ax > /tmp/result1.txt
anyweb state --page g2 --ax > /tmp/result2.txt
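
For an agent written in Python, the same fan-out can be sketched with a thread pool. This is an illustrative wrapper, not part of anyweb: `run_cli`, `open_command`, `query_commands`, `run_page`, and `ask_all` are hypothetical helpers, and the only CLI arguments used are the ones shown in the shell example above. The first page is opened on its own before the rest run concurrently, mirroring that example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run one command and return its stdout; raises on a non-zero exit."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def open_command(page: str, url: str) -> list[str]:
    return ["anyweb", "open", "--page", page, url]


def query_commands(page: str, query: str) -> list[list[str]]:
    """Submit a query, wait for the answer to settle, then dump the page."""
    return [
        ["anyweb", "input", "--page", page, "textarea", query],
        ["anyweb", "keys", "--page", page, "Enter"],
        ["anyweb", "wait", "--page", page, "content-stable",
         "--duration", "5", "--timeout", "120000"],
        ["anyweb", "state", "--page", page, "--ax"],
    ]


def run_page(page: str, query: str, url: str, runner: Runner, already_open: bool) -> str:
    """Run one page's commands in order; return the final `state --ax` output."""
    if not already_open:
        runner(open_command(page, url))
    output = ""
    for argv in query_commands(page, query):
        output = runner(argv)
    return output


def ask_all(url: str, queries: Sequence[str], runner: Runner = run_cli) -> list[str]:
    """Send each query to its own named page (g1, g2, ...) concurrently."""
    if not queries:
        return []
    pages = [f"g{i}" for i in range(1, len(queries) + 1)]
    runner(open_command(pages[0], url))  # first open runs alone, as in the shell example
    with ThreadPoolExecutor(max_workers=len(pages)) as pool:
        futures = [
            pool.submit(run_page, page, query, url, runner, index == 0)
            for index, (page, query) in enumerate(zip(pages, queries))
        ]
        return [future.result() for future in futures]
```

Passing a different `runner` makes it possible to log or dry-run the commands without starting a browser.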

read manages pages internally — no --page needed for batch reads:

anyweb --json read "https://x.com/AnthropicAI" > /tmp/a.json &
anyweb --json read "https://x.com/OpenAI" > /tmp/b.json &
anyweb --json read "https://x.com/karpathy" > /tmp/c.json &
wait

Accessibility tree

state --ax returns the page structure optimized for LLM consumption — text, links, and interactive elements in a compact format:

URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128

Options: --interactive-only (clickable/typable elements only), --depth N (limit tree depth).
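
To act on this output programmatically, the indexed lines can be pulled out with a small parser. The sketch below assumes the line shape seen in the sample above (`eN:role "name" -> url`); the real format may have variations this pattern does not cover, and `Ref` and `parse_refs` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the sample output only; treat it as an assumption.
REF_LINE = re.compile(
    r'^\s*e(?P<index>\d+):(?P<role>\w+) "(?P<name>.*)"(?: -> (?P<url>\S+))?\s*$'
)

SAMPLE = '''URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128
'''


@dataclass
class Ref:
    index: int
    role: str
    name: str
    url: Optional[str]


def parse_refs(ax_text: str) -> list[Ref]:
    """Collect the indexed (interactive) elements; other lines are skipped."""
    refs = []
    for line in ax_text.splitlines():
        match = REF_LINE.match(line)
        if match:
            refs.append(
                Ref(int(match["index"]), match["role"], match["name"], match["url"])
            )
    return refs


if __name__ == "__main__":
    for ref in parse_refs(SAMPLE):
        print(ref.index, ref.role, ref.name, ref.url)
```

An agent could use the parsed index as the IDX argument to click or hover.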

Platform intelligence

read auto-detects the platform from the URL and applies the right extraction:

Platform   Detection           Extraction
X/Twitter  x.com, twitter.com  Tweets, threads, profiles, engagement stats, media URLs
Zhihu      zhihu.com           Auto-expand folded content, answers
XHS        xiaohongshu.com     Notes, comments, images
Generic    everything else     Article body, metadata

Force a platform: anyweb read -p x "https://example.com"
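
The detection column above amounts to a hostname lookup. A minimal sketch of that idea follows; it is not anyweb's actual detection logic. `PLATFORM_DOMAINS` and `detect_platform` are hypothetical names, the subdomain matching is an assumption, and the `"generic"` label is made up for the fallback row.

```python
from urllib.parse import urlsplit

# Domains taken from the table above; the dict itself is illustrative.
PLATFORM_DOMAINS = {
    "x": ("x.com", "twitter.com"),
    "zhihu": ("zhihu.com",),
    "xhs": ("xiaohongshu.com",),
}


def detect_platform(url: str) -> str:
    """Map a URL to a platform id by hostname, falling back to "generic"."""
    host = (urlsplit(url).hostname or "").lower()
    for platform, domains in PLATFORM_DOMAINS.items():
        for domain in domains:
            # Match the domain itself or any subdomain, but not lookalikes
            # such as "netflix.com" for "x.com".
            if host == domain or host.endswith("." + domain):
                return platform
    return "generic"
```

Matching on a leading dot keeps unrelated hosts that merely end in the same letters from being misclassified.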

Session management

Log in once and reuse the session until its cookies expire. Cookies persist across daemon restarts:

anyweb login x                    # opens browser, you log in, cookies saved
anyweb login x --method import    # import from system Chrome (no manual login)
anyweb login x --method cookie    # paste cookies directly

doctor and status show session health:

=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs
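
A script can read this report to decide which platforms need a fresh login. The sketch below assumes the report looks exactly like the sample above; `parse_sessions` and `needs_login` are hypothetical helpers, and treating a session as stale on its expiry date is an assumption.

```python
import re
from datetime import date
from typing import Optional

SESSION_LINE = re.compile(r"^\s*(?P<platform>\w+)\s*:\s*(?P<detail>.*)$")
EXPIRES = re.compile(r"expires (\d{4}-\d{2}-\d{2})")

SAMPLE = """=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs
"""


def parse_sessions(report: str) -> dict[str, Optional[date]]:
    """Map each platform in the Sessions section to its expiry date, or None."""
    sessions: dict[str, Optional[date]] = {}
    in_section = False
    for line in report.splitlines():
        if line.startswith("==="):
            in_section = line.strip() == "=== Sessions ==="
            continue
        match = SESSION_LINE.match(line) if in_section else None
        if match:
            found = EXPIRES.search(match["detail"])
            sessions[match["platform"]] = (
                date.fromisoformat(found.group(1)) if found else None
            )
    return sessions


def needs_login(sessions: dict[str, Optional[date]], today: date) -> list[str]:
    """Platforms with no saved session or an expiry on or before `today`."""
    return [
        platform
        for platform, expires in sessions.items()
        if expires is None or expires <= today
    ]
```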

Output formats

anyweb read URL              # plain text (default, pipe-friendly)
anyweb --json read URL       # structured JSON with metadata
anyweb --markdown read URL   # markdown with frontmatter
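
From Python, the JSON form is the easiest to consume. A minimal sketch, assuming only that `anyweb --json read URL` prints a single JSON document on stdout; the fields inside that document are not specified here, so the result is returned as parsed. `run_cli` and `read_json` are hypothetical helper names.

```python
import json
import subprocess
from typing import Any, Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run one command and return its stdout; raises on a non-zero exit."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def read_json(url: str, runner: Runner = run_cli) -> Any:
    """Invoke `anyweb --json read URL` and parse whatever JSON it prints."""
    return json.loads(runner(["anyweb", "--json", "read", url]))
```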

Architecture

CLI ──▶ Unix socket ──▶ Daemon (persistent) ──▶ Playwright browser
                         │
                         ├── Headless engine (default, fast)
                         └── CDP engine (system Chrome, for login/headed)
  • Daemon auto-starts on first command, stays alive across invocations
  • Dual engine: headless Playwright for speed, CDP to system Chrome for --headed / login
  • Anti-detection: playwright-stealth applied automatically
  • Session isolation: per-platform cookie storage in ~/.anyweb/sessions/
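
The CLI-to-daemon hop over a Unix socket can be illustrated with a toy request/reply exchange. Everything about the wire format here is an assumption made for the sketch (one JSON object per line, an `echo` reply); anyweb's actual protocol and socket path are not documented in this README, and `serve_once`, `send_command`, and `demo` are hypothetical names.

```python
import json
import os
import socket
import tempfile
import threading


def serve_once(server: socket.socket) -> None:
    """Toy daemon: accept one connection, read one JSON line, reply once."""
    conn, _ = server.accept()
    with conn, conn.makefile("rwb") as stream:
        request = json.loads(stream.readline())
        reply = {"ok": True, "echo": request["command"]}
        stream.write(json.dumps(reply).encode() + b"\n")
        stream.flush()


def send_command(path: str, command: list[str]) -> dict:
    """Toy CLI side: connect to the socket, send a command, return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        with client.makefile("rwb") as stream:
            stream.write(json.dumps({"command": command}).encode() + b"\n")
            stream.flush()
            return json.loads(stream.readline())


def demo() -> dict:
    """Start the toy daemon on a temporary socket and send it one command."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "daemon.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            worker = threading.Thread(target=serve_once, args=(server,))
            worker.start()
            reply = send_command(path, ["get", "title"])
            worker.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

Because the long-lived process owns the browser, each CLI invocation only pays for a local socket round trip rather than a browser launch.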

Requirements

  • Python 3.11+
  • macOS or Linux

License

MIT
