CLI-first browser automation with platform intelligence

These details have not been verified by PyPI

Project links

Project description

anyweb

CLI-first browser automation for AI agents. One command reads a webpage, one command clicks a button — no boilerplate, no browser lifecycle to manage.

Built on Playwright with anti-detection, a persistent daemon, and platform-specific extractors for X, Zhihu, XHS, and more.

PyPI: https://pypi.org/project/anyweb/

Why anyweb

AI coding agents (Claude Code, Codex, etc.) need to interact with the web but can't drive a browser natively. anyweb fills that gap:

Atomic commands — open, click, input, keys, read — each one maps to a single action
Persistent daemon — browser stays alive between commands, ~50ms latency vs ~3s cold start
Lazy browser — the daemon starts the browser only when a command needs it; status, sessions, and log work with no browser running
Multi-page parallelism — --page flag runs named tabs concurrently, perfect for batch queries
Accessibility tree — state --ax returns a compact, structured page representation with text, URLs, and element indices — ideal for LLM context
Content-stable waiting — wait content-stable detects when dynamic content (Grok responses, streaming results) finishes loading
Platform intelligence — read auto-detects X/Zhihu/XHS and applies platform-specific extraction

Install

pip install anyweb        # or: uv tool install anyweb
anyweb install            # install Chromium
anyweb init               # optional: write ~/.anyweb/config.toml
anyweb doctor             # verify dependencies, config, and runtime

Quick start

# Read any webpage — auto-detects platform
anyweb read "https://x.com/karpathy"

# JSON output for programmatic use
anyweb --json read "https://x.com/karpathy"

# Search within a platform
anyweb search x "embodied AI news" --limit 10

# System health check
anyweb doctor

Commands

High-level

Command	What it does
`read URL`	Smart extract: auto-detect platform, render JS, return clean content
`search PLATFORM QUERY`	Search within X / Zhihu / XHS / WeChat
`login PLATFORM`	Save session cookies (opens real Chrome)
`intercept URL --pattern P`	Open a page and capture API responses matching a pattern
`status`	Daemon, browser, pages, sessions, recommended actions
`doctor [--fix]`	Dependency + config + runtime + credential check with auto-repair

Atomic browser

Command	What it does
`open URL`	Navigate (starts daemon + browser if needed)
`state`	List interactive elements with indices
`state --ax`	Accessibility tree — compact text + URLs for LLMs
`click IDX`	Click element by index, AX ref (`e5`), or CSS selector
`input IDX "text"`	Click element, then type
`type "text"`	Type via keyboard events (no click)
`keys Enter`	Send keyboard events
`hover IDX`	Hover over element
`scroll down`	Scroll the page
`eval "js"`	Execute JavaScript (supports Promises)
`screenshot`	Capture screenshot
`get text\|title\|url\|html`	Get page info
`wait selector "h1"`	Wait for CSS selector
`wait content-stable`	Wait for dynamic content to settle
`back`	Navigate back
`close`	Close browser; daemon stays alive
`shutdown`	Stop everything (browser + daemon)

All atomic commands accept --page NAME for parallel named tabs.

Setup & diagnostics

Command	What it does
`init`	Create `~/.anyweb/config.toml` (`--mode`, `--port`, `--proxy`, `--force`)
`install`	Install Playwright browsers
`doctor [--fix]`	Dependencies, config, runtime, credentials
`log [-f] [--errors]`	View, follow, or filter daemon command log
`sessions`	List active browser sessions and pages
`status`	Daemon PID/RSS, browser, pages, cookie expiry
`cookies get\|export\|import\|clear`	Cookie management

Configuration

anyweb works with zero config. To pin behavior, write ~/.anyweb/config.toml:

anyweb init                              # mode=auto, port=9222, no proxy
anyweb init --mode headless --proxy http://127.0.0.1:7890 --force

# ~/.anyweb/config.toml
[browser]
mode  = "auto"        # auto | headed | headless
port  = 9222          # CDP port used in headed mode
proxy = ""            # e.g. http://127.0.0.1:7890

Resolution order: environment variables > config.toml > defaults.

Setting	Env override	Default
Browser mode	`ANYWEB_MODE`	`auto`
Proxy	`ANYWEB_PROXY` (falls back to `ALL_PROXY`/`HTTPS_PROXY`/`HTTP_PROXY`)	none

mode = auto resolves to headed on a desktop (macOS, or Linux with DISPLAY/WAYLAND_DISPLAY) and headless on a headless server. doctor reports the resolved mode under its Config and Runtime checks.

WSL

On WSL, mode = auto uses a layered browser strategy:

Prefer Linux Chrome inside WSL with WSLg/X11/Wayland.
Fall back to Windows Chrome launched from WSL when Linux Chrome is missing.
Use headless Playwright when no headed browser path is available.

Run anyweb doctor to see which path your machine is using. For server-style WSL sessions, pin headless mode:

ANYWEB_MODE=headless anyweb read https://example.com

Multi-page parallelism

--page NAME runs multiple named tabs concurrently — essential for batch operations like querying Grok 4 times in parallel:

# Open 4 pages
anyweb open --page g1 "https://x.com/i/grok?new=true"
anyweb open --page g2 "https://x.com/i/grok?new=true" &
anyweb open --page g3 "https://x.com/i/grok?new=true" &
anyweb open --page g4 "https://x.com/i/grok?new=true" &
wait

# Submit queries in parallel
anyweb input --page g1 textarea "Query 1" && anyweb keys --page g1 Enter &
anyweb input --page g2 textarea "Query 2" && anyweb keys --page g2 Enter &
wait

# Wait for all responses to finish
anyweb wait --page g1 content-stable --duration 5 --timeout 120000 &
anyweb wait --page g2 content-stable --duration 5 --timeout 120000 &
wait

# Read results
anyweb state --page g1 --ax > /tmp/result1.txt
anyweb state --page g2 --ax > /tmp/result2.txt

read manages pages internally — no --page needed for batch reads:

anyweb --json read "https://x.com/AnthropicAI" > /tmp/a.json &
anyweb --json read "https://x.com/OpenAI" > /tmp/b.json &
anyweb --json read "https://x.com/karpathy" > /tmp/c.json &
wait

Accessibility tree

state --ax returns the page structure optimized for LLM consumption — text, links, and interactive elements in a compact format:

URL: https://x.com/karpathy
Title: Andrej Karpathy (@karpathy) / X
Refs: 42 interactive elements
---
RootWebArea "Andrej Karpathy (@karpathy) / X"
  e0:link "Home" -> https://x.com/home
  e1:link "Search" -> https://x.com/explore
  ...
  StaticText "The hottest new programming language is English"
  e12:link "View tweet" -> https://x.com/karpathy/status/1617979122625712128

Options: -i / --interactive-only (clickable/typable elements only), --depth N (limit tree depth). Click any ref directly: anyweb click e12.

Platform intelligence

read auto-detects the platform from the URL and applies the right extraction:

Platform	Detection	Extraction
X/Twitter	`x.com`, `twitter.com`	Tweets, threads, profiles, engagement stats, media URLs
Zhihu	`zhihu.com`	Auto-expand folded content, answers
XHS	`xiaohongshu.com`	Notes, comments, images
Generic	everything else	Article body, metadata

Force a platform: anyweb read -p x "https://example.com"

Session management

anyweb login x                    # opens real Chrome, you log in, cookies saved
anyweb login x --method import    # import from system Chrome (no manual login)
anyweb login x --method cookie    # paste cookies directly

doctor and status show session health:

=== Sessions ===
  x         : saved 2026-05-18 | cookies: 465 | expires 2027-05-25
  zhihu     : saved 2026-05-08 | cookies: 36  | expires 2026-11-03
  xhs       : no auth cookie ❌

=== Recommended Actions ===
  anyweb login xhs

Output formats

anyweb read URL              # plain text (default, pipe-friendly)
anyweb --json read URL       # structured JSON with metadata
anyweb --markdown read URL   # markdown with frontmatter

Architecture

CLI ──▶ Unix socket ──▶ Daemon (persistent) ──▶ one browser
MCP ──▶ stdio JSON-RPC ─┘    │
                             ├── mode=headed   → real Chrome via CDP (login state, watchable)
                             └── mode=headless → Playwright Chromium (servers, fast)

Single browser, auto-detected mode — config.toml sets mode (auto/headed/headless); auto picks headed on a desktop, headless on a server. No more engine routing or --headed/--headless flags.
Lazy startup — the daemon launches the browser only when a command needs it. status, sessions, and log respond instantly with no browser running.
Daemon auto-starts on first command (guarded by a file lock), stays alive across invocations, ~50ms per command.
Headed = CDP to a dedicated Chrome on the configured port — so cookies/login state are real and you can watch it work. Headless = Playwright Chromium — faster, no display required.
Anti-detection: playwright-stealth + behavior simulation applied automatically.
Session isolation: per-platform cookie storage in ~/.anyweb/sessions/.
Runtime layout: ~/.anyweb/ holds config.toml, sessions/, the daemon socket/PID, and daemon.log.

Full design doc: docs/architecture.md.

Requirements

Python 3.11+
macOS or Linux

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.6.3

Jul 8, 2026

0.6.2

May 25, 2026

0.6.1

May 25, 2026

0.6.0

May 25, 2026

0.5.3

May 24, 2026

0.5.2

May 24, 2026

0.5.1

May 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

anyweb-0.6.3.tar.gz (175.9 kB view details)

Uploaded Jul 8, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

anyweb-0.6.3-py3-none-any.whl (71.9 kB view details)

Uploaded Jul 8, 2026 Python 3

File details

Details for the file anyweb-0.6.3.tar.gz.

File metadata

Download URL: anyweb-0.6.3.tar.gz
Upload date: Jul 8, 2026
Size: 175.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for anyweb-0.6.3.tar.gz
Algorithm	Hash digest
SHA256	`130fe1207d3527a613c82b594fa923b2cfc623da0191cb098333cf2160b4303b`
MD5	`1d6655bd07807811510c8d31d3ea9e39`
BLAKE2b-256	`c6afd790c4efbb3607ba9b46e5f6f7f6f50f10dc718af18aebf32804f9d5beff`

See more details on using hashes here.

File details

Details for the file anyweb-0.6.3-py3-none-any.whl.

File metadata

Download URL: anyweb-0.6.3-py3-none-any.whl
Upload date: Jul 8, 2026
Size: 71.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for anyweb-0.6.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5059d8ab967801e93606554f5983b677015bfe2a175fc07172d94a15aed2f96e`
MD5	`ce2473593d4f475ff700e98a33249dea`
BLAKE2b-256	`035de7e9d697c301cdb667c70d70309f3a7ad0d7c35c621be4394ce1b981ad3c`

See more details on using hashes here.

anyweb 0.6.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

anyweb

Why anyweb

Install

Quick start

Commands

High-level

Atomic browser

Setup & diagnostics

Configuration

WSL

Multi-page parallelism

Accessibility tree

Platform intelligence

Session management

Output formats

Architecture

Requirements

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes