Terminal-based web page inspector for AI debugging sessions
webtap
Browser debugging via Chrome DevTools Protocol with native event storage and dynamic querying.
Features
- Native CDP Storage - Events stored exactly as received in DuckDB
- Dynamic Field Discovery - Automatically indexes all field paths from events
- Smart Filtering - Built-in filters for ads, tracking, analytics noise
- SQL Querying - Direct DuckDB access for complex analysis
- MCP Ready - Tools and resources for Claude/LLMs
- Rich Display - Tables, alerts, and formatted output
- Python Inspection - Full Python environment for data exploration
Prerequisites
Required system dependencies:
- google-chrome-stable or chromium - Browser with DevTools Protocol support
# macOS
brew install --cask google-chrome
# Ubuntu/Debian
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
sudo apt update
sudo apt install google-chrome-stable
# Arch Linux
yay -S google-chrome # or google-chrome-stable from AUR
# Fedora
sudo dnf install google-chrome-stable
Installation
# Install via uv tool (recommended)
uv tool install webtap-tool
# Or with pipx
pipx install webtap-tool
# Update to latest
uv tool upgrade webtap-tool
# Uninstall
uv tool uninstall webtap-tool
Quick Start
# 1. Install webtap
uv tool install webtap-tool
# 2. Optional: Setup helpers (first time only)
webtap --cli setup-filters # Download default filter configurations
webtap --cli setup-extension # Download Chrome extension files
webtap --cli setup-chrome # Install Chrome wrapper for debugging
# 3. Launch Chrome with debugging
webtap --cli run-chrome # Or manually: google-chrome-stable --remote-debugging-port=9222
# 4. Start webtap REPL
webtap
# 5. Connect and explore
>>> pages() # List available Chrome pages
>>> connect(0) # Connect to first page
>>> network() # View network requests (filtered)
>>> console() # View console messages
>>> events({"url": "*api*"}) # Query any CDP field dynamically
MCP Setup for Claude
# Quick setup with Claude CLI
claude mcp add webtap -- webtap --mcp
Or manually configure Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"webtap": {
"command": "webtap",
"args": ["--mcp"]
}
}
}
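If a Claude Desktop config file already exists, the webtap entry needs to be merged into it rather than pasted over it. The following is a minimal sketch of that merge, assuming the file follows the JSON shape shown above. The helper name `add_webtap_server` is hypothetical and not part of webtap.

```python
import json


def add_webtap_server(config_text: str) -> str:
    """Return config JSON with a webtap MCP entry added (hypothetical helper)."""
    # Treat an empty file as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Same command and args as the manual configuration above.
    servers["webtap"] = {"command": "webtap", "args": ["--mcp"]}
    return json.dumps(config, indent=2)


# Example input with one unrelated server already configured (made-up entry).
existing = '{"mcpServers": {"other": {"command": "other-tool", "args": []}}}'
print(add_webtap_server(existing))
```

Existing servers are preserved because only the `webtap` key is written.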
Usage
Interactive REPL
webtap # Start REPL
webtap --mcp # Start as MCP server
CLI Commands
webtap --cli setup-filters # Download filter configurations
webtap --cli setup-extension # Download Chrome extension
webtap --cli setup-chrome # Install Chrome wrapper script
webtap --cli run-chrome # Launch Chrome with debugging
webtap --cli --help # Show all CLI commands
Commands
>>> pages() # List available Chrome pages
>>> connect(0) # Connect to first page
>>> network() # View network requests (filtered)
>>> console() # View console messages
>>> events({"url": "*api*"}) # Query any CDP field dynamically
>>> body(50) # Get response body
>>> inspect(49) # View event details
>>> js("document.title") # Execute JavaScript
Command Reference
| Command | Description |
|---|---|
| `pages()` | List available Chrome pages |
| `connect(page=0)` | Connect to page by index |
| `disconnect()` | Disconnect from current page |
| `navigate(url)` | Navigate to URL |
| `network(no_filters=False)` | View network requests |
| `console()` | View console messages |
| `events(filters)` | Query events dynamically |
| `inspect(rowid, expr=None)` | Inspect event details |
| `body(response_id, expr=None)` | Get response body |
| `js(code, wait_return=True)` | Execute JavaScript |
| `filters(action="list")` | Manage noise filters |
| `clear(events=True)` | Clear events/console/cache |
Core Commands
Connection & Navigation
pages() # List Chrome pages
connect(0) # Connect by index (shorthand)
connect(page=1) # Connect by index (explicit)
connect(page_id="xyz") # Connect by page ID
disconnect() # Disconnect from current page
navigate("https://...") # Navigate to URL
reload(ignore_cache=False) # Reload page
back() / forward() # Navigate history
page() # Show current page info
Dynamic Event Querying
# Query ANY field across ALL event types using dict filters
events({"url": "*github*"}) # Find GitHub requests
events({"status": 404}) # Find all 404s
events({"type": "xhr", "method": "POST"}) # Find AJAX POSTs
events({"headers": "*"}) # Extract all headers
# Field names are fuzzy-matched and case-insensitive
events({"URL": "*api*"}) # Works! Finds 'url', 'URL', 'documentURL'
events({"err": "*"}) # Finds 'error', 'errorText', 'err'
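To make the matching behaviour concrete, here is a rough sketch of how fuzzy field names and `*` wildcards could be resolved. It assumes field names match by case-insensitive substring and values match by glob. Both are assumptions about webtap's internals, and `match_field_names` and `match_value` are hypothetical names.

```python
from fnmatch import fnmatchcase


def match_field_names(query: str, known_fields: list[str]) -> list[str]:
    """Case-insensitive substring match of a query against known field names."""
    needle = query.lower()
    return [name for name in known_fields if needle in name.lower()]


def match_value(pattern: str, value: object) -> bool:
    """Glob-style match such as '*api*', compared case-insensitively."""
    return fnmatchcase(str(value).lower(), pattern.lower())


fields = ["url", "documentURL", "error", "errorText", "status"]
print(match_field_names("URL", fields))  # ['url', 'documentURL']
print(match_field_names("err", fields))  # ['error', 'errorText']
print(match_value("*api*", "https://api.github.com/graphql"))  # True
```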
Network Monitoring
network() # Filtered network requests (default)
network(no_filters=True) # Show everything (noisy!)
network(filters=["ads", "tracking"]) # Specific filter categories
Filter Management
# Manage noise filters
filters() # Show current filters (default action="list")
filters(action="load") # Load from .webtap/filters.json
filters(action="add", config={"domain": "*doubleclick*", "category": "ads"})
filters(action="save") # Persist to disk
filters(action="toggle", config={"category": "ads"}) # Toggle category
# Built-in categories: ads, tracking, analytics, telemetry, cdn, fonts, images
Data Inspection
# Inspect events by rowid
inspect(49) # View event details by rowid
inspect(50, expr="data['params']['response']['headers']") # Extract field
# Response body inspection with Python expressions
body(49) # Get response body
body(49, expr="import json; json.loads(body)") # Parse JSON
body(49, expr="len(body)") # Check size
# Request interception
fetch("enable") # Enable request interception
fetch("disable") # Disable request interception
requests() # Show paused requests
resume(123) # Continue paused request by ID
fail(123) # Fail paused request by ID
Console & JavaScript
console() # View console messages
js("document.title") # Evaluate JavaScript (returns value)
js("console.log('Hello')", wait_return=False) # Execute without waiting
clear() # Clear events (default)
clear(console=True) # Clear browser console
clear(events=True, console=True, cache=True) # Clear everything
Architecture
Native CDP Storage Philosophy
Chrome Tab
↓ CDP Events (WebSocket)
DuckDB Storage (events table)
↓ SQL Queries + Field Discovery
Service Layer (WebTapService)
├── NetworkService - Request filtering
├── ConsoleService - Message handling
├── FetchService - Request interception
└── BodyService - Response caching
↓
Commands (Thin Wrappers)
├── events() - Query any field
├── network() - Filtered requests
├── console() - Messages
├── body() - Response bodies
└── js() - JavaScript execution
↓
API Server (FastAPI on :8765)
└── Chrome Extension Integration
How It Works
- Events stored as-is - No transformation, full CDP data preserved
- Field paths indexed - Every unique path like params.response.status tracked
- Dynamic discovery - Fuzzy matching finds fields without schemas
- SQL generation - User queries converted to DuckDB JSON queries
- On-demand fetching - Bodies, cookies fetched only when needed
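The field path indexing step can be pictured as a walk over each event's nested JSON. This is only a sketch under that assumption, not webtap's actual implementation. `iter_paths` and `index_event` are hypothetical names, the sample events are invented, and lists inside events are not traversed here.

```python
from collections import defaultdict


def iter_paths(node: object, prefix: str = ""):
    """Yield (key, dotted_path) for every key in a nested dict."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            yield key, path
            yield from iter_paths(value, path)


def index_event(event: dict, index: dict[str, set[str]]) -> None:
    """Record every path in the event under its lower-cased leaf key."""
    for key, path in iter_paths(event):
        index[key.lower()].add(path)


index: dict[str, set[str]] = defaultdict(set)
index_event(
    {"method": "Network.responseReceived",
     "params": {"response": {"url": "https://example.com/", "status": 200}}},
    index,
)
index_event(
    {"method": "Network.requestWillBeSent",
     "params": {"request": {"url": "https://example.com/"},
                "documentURL": "https://example.com/"}},
    index,
)
print(sorted(index["url"]))  # ['params.request.url', 'params.response.url']
```

Keying the index by lower-cased field name is one way to get case-insensitive lookups without a schema.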
Advanced Usage
Direct SQL Queries
# Access DuckDB directly
sql = """
SELECT json_extract_string(event, '$.params.response.url') as url,
json_extract_string(event, '$.params.response.status') as status
FROM events
WHERE json_extract_string(event, '$.method') = 'Network.responseReceived'
"""
results = state.cdp.query(sql)
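The "SQL generation" step described earlier could produce queries of the same shape from a field path and a `*` pattern. A hedged sketch follows, assuming wildcards map to SQL `LIKE` patterns. `glob_to_like` and `build_query` are hypothetical helpers. Real code should prefer parameter binding over string interpolation, and this sketch does not escape literal `%` or `_` in the pattern.

```python
def glob_to_like(pattern: str) -> str:
    """Translate '*' wildcards to SQL LIKE '%' wildcards, doubling single quotes."""
    return pattern.replace("'", "''").replace("*", "%")


def build_query(path: str, pattern: str) -> str:
    """Build a DuckDB query matching one JSON path against a glob pattern."""
    json_path = "$." + path
    return (
        f"SELECT rowid, json_extract_string(event, '{json_path}') AS value "
        f"FROM events "
        f"WHERE json_extract_string(event, '{json_path}') "
        f"LIKE '{glob_to_like(pattern)}'"
    )


print(build_query("params.response.url", "*api*"))
```

The resulting string could then be passed to `state.cdp.query(...)` as in the example above.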
Field Discovery
# See what fields are available
state.cdp.field_paths.keys() # All discovered field names
# Find all paths for a field
state.cdp.discover_field_paths("url")
# Returns: ['params.request.url', 'params.response.url', 'params.documentURL', ...]
Direct CDP Access
# Send CDP commands directly
state.cdp.execute("Network.getResponseBody", {"requestId": "123"})
state.cdp.execute("Storage.getCookies", {})
state.cdp.execute("Runtime.evaluate", {"expression": "window.location.href"})
Chrome Extension
Install the extension from packages/webtap/extension/:
- Open chrome://extensions/
- Enable Developer mode
- Load unpacked → Select extension folder
- Click extension icon to connect to pages
Examples
List and Connect to Pages
>>> pages()
## Chrome Pages
| Index | Title | URL | ID | Connected |
|:------|:---------------------|:-------------------------------|:-------|:----------|
| 0 | Messenger | https://www.m...1743198803269/ | DC8... | No |
| 1 | GitHub - replkit2 | https://githu...elsen/replkit2 | DD4... | No |
| 2 | YouTube Music | https://music.youtube.com/ | F83... | No |
_3 pages available_
<pages: 1 fields>
>>> connect(1)
## Connection Established
**Page:** GitHub - angelsen/replkit2
**URL:** https://github.com/angelsen/replkit2
<connect: 1 fields>
Monitor Network Traffic
>>> network()
## Network Requests
| ID | ReqID | Method | Status | URL | Type | Size |
|:-----|:-------------|:-------|:-------|:------------------------------------------------|:---------|:-----|
| 3264 | 682214.9033 | GET | 200 | https://api.github.com/graphql | Fetch | 22KB |
| 2315 | 682214.8985 | GET | 200 | https://api.github.com/repos/angelsen/replkit2 | Fetch | 16KB |
| 359 | 682214.8638 | GET | 200 | https://github.githubassets.com/assets/app.js | Script | 21KB |
_3 requests_
### Next Steps
- **Analyze responses:** `body(3264)` - fetch response body
- **Parse HTML:** `body(3264, "bs4(body, 'html.parser').find('title').text")`
- **Extract JSON:** `body(3264, "json.loads(body)['data']")`
- **Find patterns:** `body(3264, "re.findall(r'/api/\\w+', body)")`
- **Decode JWT:** `body(3264, "jwt.decode(body, options={'verify_signature': False})")`
- **Search events:** `events({'url': '*api*'})` - find all API calls
- **Intercept traffic:** `fetch('enable')` then `requests()` - pause and modify
<network: 1 fields>
View Console Messages
>>> console()
## Console Messages
| ID | Level | Source | Message | Time |
|:-----|:-----------|:---------|:----------------------------------------------------------------|:---------|
| 5939 | WARNING | security | An iframe which has both allow-scripts and allow-same-origin... | 11:42:46 |
| 2319 | LOG | console | API request completed | 11:42:40 |
| 32 | ERROR | network | Failed to load resource: the server responded with a status... | 12:47:41 |
_3 messages_
### Next Steps
- **Inspect error:** `inspect(32)` - view full stack trace
- **Find all errors:** `events({'level': 'error'})` - filter console errors
- **Extract stack:** `inspect(32, "data.get('stackTrace', {})")`
- **Search messages:** `events({'message': '*failed*'})` - pattern match
- **Check network:** `network()` - may show failed requests causing errors
<console: 1 fields>
Find and Analyze API Calls
>>> events({"url": "*api*", "method": "POST"})
## Query Results
| RowID | Method | URL | Status |
|:------|:----------------------------|:--------------------------------|:-------|
| 49 | Network.requestWillBeSent | https://api.github.com/graphql | - |
| 50 | Network.responseReceived | https://api.github.com/graphql | 200 |
_2 events_
<events: 1 fields>
>>> body(50, expr="import json; json.loads(body)['data']")
{'viewer': {'login': 'octocat', 'name': 'The Octocat'}}
>>> inspect(49) # View full request details
Debug Failed Requests
>>> events({"status": 404})
## Query Results
| RowID | Method | URL | Status |
|:------|:-------------------------|:----------------------------------|:-------|
| 32 | Network.responseReceived | https://api.example.com/missing | 404 |
| 29 | Network.responseReceived | https://api.example.com/notfound | 404 |
_2 events_
<events: 1 fields>
>>> events({"errorText": "*"}) # Find network errors
>>> events({"type": "Failed"}) # Find failed resources
Monitor Specific Domains
>>> events({"url": "*myapi.com*"}) # Your API
>>> events({"url": "*localhost*"}) # Local development
>>> events({"url": "*stripe*"}) # Payment APIs
Extract Headers and Cookies
>>> events({"headers": "*authorization*"}) # Find auth headers
>>> state.cdp.execute("Storage.getCookies", {}) # Get all cookies
>>> events({"setCookie": "*"}) # Find Set-Cookie headers
Filter Configuration
WebTap includes aggressive default filters to reduce noise. Customize in .webtap/filters.json:
{
"ads": {
"domains": ["*doubleclick*", "*googlesyndication*", "*adsystem*"],
"types": ["Ping", "Beacon"]
},
"tracking": {
"domains": ["*google-analytics*", "*segment*", "*mixpanel*"],
"types": ["Image", "Script"]
}
}
Design Principles
- Store AS-IS - No transformation of CDP events
- Query On-Demand - Extract only what's needed
- Dynamic Discovery - No predefined schemas
- SQL-First - Leverage DuckDB's JSON capabilities
- Minimal Memory - Store only CDP data
Requirements
- Chrome/Chromium with debugging enabled
- Python 3.12+
- Dependencies: websocket-client, duckdb, replkit2, fastapi, uvicorn, beautifulsoup4
Architecture
Built on ReplKit2 for dual REPL/MCP functionality.
Documentation
- Architecture - System design
- Vision - Design philosophy
- Services - Service layer implementations
- Commands - Command implementations
Development
# Clone repository
git clone https://github.com/angelsen/tap-tools
cd tap-tools
# Install for development
uv sync --package webtap
# Run development version
uv run --package webtap webtap
# Run tests and checks
make check-webtap # Check build
make format # Format code
make lint # Fix linting
API Server
WebTap automatically starts a FastAPI server on port 8765 for Chrome extension integration:
- GET /status - Connection status
- GET /pages - List available Chrome pages
- POST /connect - Connect to a page
- POST /disconnect - Disconnect from current page
- POST /clear - Clear events/console/cache
- GET /fetch/paused - Get paused requests
- POST /filters/toggle/{category} - Toggle filter categories
The API server runs in a background thread and doesn't block the REPL.
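The endpoints can also be reached from a script while the REPL is running. A minimal sketch using only the standard library follows. It assumes the server listens on localhost and returns JSON, neither of which is stated above, and `build_request` and `call` are hypothetical helper names. Request bodies for the POST endpoints are not documented here, so none are shown.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # port from the text above; host is an assumption


def build_request(method: str, path: str) -> urllib.request.Request:
    """Prepare a request for one of the endpoints listed above."""
    return urllib.request.Request(BASE_URL + path, method=method)


def call(request: urllib.request.Request):
    """Send the request and decode the reply, assuming a JSON response."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read())


status_request = build_request("GET", "/status")
print(status_request.get_method(), status_request.full_url)
# With webtap running: call(status_request)
```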
License
MIT - see LICENSE for details.