Terminal-based web page inspector for AI debugging sessions

webtap

Browser debugging via Chrome DevTools Protocol with native event storage and dynamic querying.

✨ Features

🔍 Native CDP Storage - Events stored exactly as received in DuckDB
🎯 Dynamic Field Discovery - Automatically indexes all field paths from events
🚫 Smart Filtering - Built-in filters for ads, tracking, analytics noise
📊 SQL Querying - Direct DuckDB access for complex analysis
🔌 MCP Ready - Tools and resources for Claude/LLMs
🎨 Rich Display - Tables, alerts, and formatted output
🐍 Python Inspection - Full Python environment for data exploration

📋 Prerequisites

Required system dependencies:

google-chrome-stable or chromium - Browser with DevTools Protocol support

# macOS
brew install --cask google-chrome

# Ubuntu/Debian
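# apt-key is deprecated; store the signing key in a dedicated keyring instead
wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee /etc/apt/sources.list.d/google-chrome.list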
sudo apt update
sudo apt install google-chrome-stable

# Arch Linux
yay -S google-chrome  # or google-chrome-stable from AUR

# Fedora
sudo dnf install google-chrome-stable

📦 Installation

# Install via uv tool (recommended)
uv tool install webtap-tool

# Or with pipx
pipx install webtap-tool

# Update to latest
uv tool upgrade webtap-tool

# Uninstall
uv tool uninstall webtap-tool

🚀 Quick Start

# 1. Install webtap
uv tool install webtap-tool

# 2. Optional: Setup helpers (first time only)
webtap --cli setup-filters       # Download default filter configurations
webtap --cli setup-extension     # Download Chrome extension files
webtap --cli setup-chrome        # Install Chrome wrapper for debugging

# 3. Launch Chrome with debugging
webtap --cli run-chrome          # Or manually: google-chrome-stable --remote-debugging-port=9222

# 4. Start webtap REPL
webtap

# 5. Connect and explore
>>> pages()                          # List available Chrome pages
>>> connect(0)                       # Connect to first page
>>> network()                        # View network requests (filtered)
>>> console()                        # View console messages
>>> events({"url": "*api*"})         # Query any CDP field dynamically
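When Chrome runs with --remote-debugging-port=9222 it also serves a small HTTP endpoint, http://localhost:9222/json, that lists debuggable targets. The sketch below shows how a page list like the one from pages() could be derived from that response. Whether WebTap uses this endpoint internally is an assumption; the helper names and the sample data are hypothetical.

```python
import json
from urllib.request import urlopen


def list_page_targets(targets):
    """Keep only targets of type 'page' and return (index, title, url, id) tuples."""
    pages = [t for t in targets if t.get("type") == "page"]
    return [
        (i, t.get("title", ""), t.get("url", ""), t.get("id", ""))
        for i, t in enumerate(pages)
    ]


def fetch_targets(port=9222):
    """Fetch the raw target list from a Chrome started with remote debugging."""
    with urlopen(f"http://localhost:{port}/json", timeout=5) as resp:
        return json.load(resp)


# Offline sample shaped like Chrome's /json response (values are made up).
sample = [
    {"id": "DC8", "type": "page", "title": "Messenger", "url": "https://example.com/a"},
    {"id": "SW1", "type": "service_worker", "title": "sw", "url": "https://example.com/sw.js"},
    {"id": "DD4", "type": "page", "title": "GitHub", "url": "https://example.com/b"},
]
for row in list_page_targets(sample):
    print(row)
```

With a live browser, list_page_targets(fetch_targets()) would produce the same kind of rows; if the call fails, Chrome is probably not listening on the debugging port.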

🔌 MCP Setup for Claude

# Quick setup with Claude CLI
claude mcp add webtap -- webtap --mcp

Or configure Claude Desktop manually (on macOS the file is ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "webtap": {
      "command": "webtap",
      "args": ["--mcp"]
    }
  }
}
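If the config file already lists other MCP servers, the webtap entry has to be merged in rather than pasted over them. A minimal sketch of that merge, assuming the file is plain JSON with a top-level "mcpServers" object as shown above; the function name add_webtap_server is hypothetical and the demo writes to a temporary file, not the real config.

```python
import json
import tempfile
from pathlib import Path


def add_webtap_server(config_path):
    """Add the webtap MCP entry to a config file, keeping any other entries."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["webtap"] = {"command": "webtap", "args": ["--mcp"]}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file rather than the real Claude Desktop config.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-tool"}}}))
    merged = add_webtap_server(demo_path)
    print(sorted(merged["mcpServers"]))
```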

🎮 Usage

Interactive REPL

webtap                     # Start REPL
webtap --mcp               # Start as MCP server

CLI Commands

webtap --cli setup-filters      # Download filter configurations
webtap --cli setup-extension    # Download Chrome extension
webtap --cli setup-chrome       # Install Chrome wrapper script
webtap --cli run-chrome         # Launch Chrome with debugging
webtap --cli --help             # Show all CLI commands

Commands

>>> pages()                          # List available Chrome pages
>>> connect(0)                       # Connect to first page
>>> network()                        # View network requests (filtered)
>>> console()                        # View console messages
>>> events({"url": "*api*"})         # Query any CDP field dynamically
>>> body(50)                         # Get response body
>>> inspect(49)                      # View event details
>>> js("document.title")             # Execute JavaScript

Command Reference

| Command                        | Description                  |
|:-------------------------------|:-----------------------------|
| `pages()`                      | List available Chrome pages  |
| `connect(page=0)`              | Connect to page by index     |
| `disconnect()`                 | Disconnect from current page |
| `navigate(url)`                | Navigate to URL              |
| `network(no_filters=False)`    | View network requests        |
| `console()`                    | View console messages        |
| `events(filters)`              | Query events dynamically     |
| `inspect(rowid, expr=None)`    | Inspect event details        |
| `body(response_id, expr=None)` | Get response body            |
| `js(code, wait_return=True)`   | Execute JavaScript           |
| `filters(action="list")`       | Manage noise filters         |
| `clear(events=True)`           | Clear events/console/cache   |

Core Commands

Connection & Navigation

pages()                      # List Chrome pages
connect(0)                   # Connect by index (shorthand)
connect(page=1)              # Connect by index (explicit)
connect(page_id="xyz")       # Connect by page ID  
disconnect()                 # Disconnect from current page
navigate("https://...")      # Navigate to URL
reload(ignore_cache=False)   # Reload page
back() / forward()           # Navigate history
page()                       # Show current page info

Dynamic Event Querying

# Query ANY field across ALL event types using dict filters
events({"url": "*github*"})              # Find GitHub requests
events({"status": 404})                  # Find all 404s
events({"type": "xhr", "method": "POST"})   # Find AJAX POSTs  
events({"headers": "*"})                 # Extract all headers

# Field names are fuzzy-matched and case-insensitive
events({"URL": "*api*"})     # Works! Finds 'url', 'URL', 'documentURL'
events({"err": "*"})         # Finds 'error', 'errorText', 'err'
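The filter semantics above (fuzzy, case-insensitive field names plus `*` wildcards in values) can be illustrated in plain Python. This is a sketch under stated assumptions, not WebTap's implementation: it assumes a field matches when the filter key is a substring of the last path segment, that multiple filter keys are combined with AND, and that non-string values are compared by equality. The names value_matches and event_matches are hypothetical.

```python
from fnmatch import fnmatchcase


def value_matches(pattern, value):
    """Glob-match strings case-insensitively; compare other types by equality."""
    if isinstance(pattern, str):
        return fnmatchcase(str(value).lower(), pattern.lower())
    return pattern == value


def event_matches(filters, flat_event):
    """True when every filter key fuzzy-matches a field whose value matches."""
    for key, pattern in filters.items():
        candidates = [
            value
            for path, value in flat_event.items()
            if key.lower() in path.rsplit(".", 1)[-1].lower()
        ]
        if not any(value_matches(pattern, v) for v in candidates):
            return False
    return True


# A CDP event flattened to {dotted.path: value}; the values are made up.
flat = {
    "method": "Network.responseReceived",
    "params.response.url": "https://api.github.com/graphql",
    "params.response.status": 404,
}
print(event_matches({"URL": "*api*"}, flat))     # True
print(event_matches({"status": 404}, flat))      # True
print(event_matches({"url": "*stripe*"}, flat))  # False
```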

Network Monitoring

network()                              # Filtered network requests (default)
network(no_filters=True)               # Show everything (noisy!)
network(filters=["ads", "tracking"])   # Specific filter categories

Filter Management

# Manage noise filters
filters()                                # Show current filters (default action="list")
filters(action="load")                   # Load from .webtap/filters.json
filters(action="add", config={"domain": "*doubleclick*", "category": "ads"})
filters(action="save")                   # Persist to disk
filters(action="toggle", config={"category": "ads"})  # Toggle category

# Built-in categories: ads, tracking, analytics, telemetry, cdn, fonts, images

Data Inspection

# Inspect events by rowid
inspect(49)                         # View event details by rowid
inspect(50, expr="data['params']['response']['headers']")  # Extract field

# Response body inspection with Python expressions
body(49)                            # Get response body
body(49, expr="import json; json.loads(body)")  # Parse JSON
body(49, expr="len(body)")         # Check size

# Request interception
fetch("enable")                     # Enable request interception
fetch("disable")                    # Disable request interception
requests()                          # Show paused requests
resume(123)                         # Continue paused request by ID
fail(123)                           # Fail paused request by ID
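The expr arguments to body() and inspect() mix statements and expressions, as in "import json; json.loads(body)". One common way to support that style is to execute the leading statements and then evaluate the final expression. The sketch below shows this pattern with the standard ast module; it is an assumption about how such strings could be handled, not a description of WebTap's internals, and eval_expr is a hypothetical name.

```python
import ast


def eval_expr(source, **names):
    """Run any leading statements, then return the value of the final expression."""
    tree = ast.parse(source, mode="exec")
    namespace = dict(names)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so its value can be returned.
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<expr>", "exec"), namespace)
        return eval(compile(last, "<expr>", "eval"), namespace)
    # No trailing expression: just run the statements.
    exec(compile(tree, "<expr>", "exec"), namespace)
    return None


sample_body = '{"data": {"viewer": {"login": "octocat"}}}'
print(eval_expr("import json; json.loads(body)['data']", body=sample_body))
print(eval_expr("len(body)", body=sample_body))
```

Because this runs arbitrary Python, it only makes sense for strings typed by the person driving the session.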

Console & JavaScript

console()                           # View console messages
js("document.title")                # Evaluate JavaScript (returns value)
js("console.log('Hello')", wait_return=False)  # Execute without waiting
clear()                             # Clear events (default)
clear(console=True)                 # Clear browser console
clear(events=True, console=True, cache=True)  # Clear everything

Architecture

Native CDP Storage Philosophy

Chrome Tab
    ↓ CDP Events (WebSocket)
DuckDB Storage (events table)
    ↓ SQL Queries + Field Discovery
Service Layer (WebTapService)
    ├── NetworkService - Request filtering
    ├── ConsoleService - Message handling
    ├── FetchService - Request interception
    └── BodyService - Response caching
    ↓
Commands (Thin Wrappers)
    ├── events() - Query any field
    ├── network() - Filtered requests  
    ├── console() - Messages
    ├── body() - Response bodies
    └── js() - JavaScript execution
    ↓
API Server (FastAPI on :8765)
    └── Chrome Extension Integration

How It Works

Events stored as-is - No transformation, full CDP data preserved
Field paths indexed - Every unique path like params.response.status tracked
Dynamic discovery - Fuzzy matching finds fields without schemas
SQL generation - User queries converted to DuckDB JSON queries
On-demand fetching - Bodies, cookies fetched only when needed
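To make the indexing and discovery steps concrete, here is a minimal sketch of walking a stored event, recording every dotted path under its leaf name, and then finding paths by a fuzzy name. It is an illustration only: the function names are hypothetical, list elements are assumed to share their parent's path, and the sample event is made up.

```python
from collections import defaultdict


def iter_paths(value, prefix=""):
    """Yield dotted paths for every leaf inside nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from iter_paths(child, prefix)
    else:
        yield prefix


def index_event(event, index):
    """Record each path under its lowercased leaf name."""
    for path in iter_paths(event):
        leaf = path.rsplit(".", 1)[-1].lower()
        index[leaf].add(path)


def discover(index, name):
    """Return all paths whose leaf name contains `name`, ignoring case."""
    needle = name.lower()
    return sorted(p for leaf, paths in index.items() if needle in leaf for p in paths)


index = defaultdict(set)
index_event(
    {
        "method": "Network.responseReceived",
        "params": {
            "documentURL": "https://example.com/",
            "response": {"url": "https://example.com/api", "status": 200},
        },
    },
    index,
)
print(discover(index, "url"))  # ['params.documentURL', 'params.response.url']
```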

Advanced Usage

Direct SQL Queries

# Access DuckDB directly
sql = """
    SELECT json_extract_string(event, '$.params.response.url') as url,
           json_extract_string(event, '$.params.response.status') as status
    FROM events 
    WHERE json_extract_string(event, '$.method') = 'Network.responseReceived'
"""
results = state.cdp.query(sql)
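Queries like the one above follow a regular shape, so they can be generated from a list of field paths. The sketch below builds the SQL text and a parameter list for `?` placeholders. It is a hedged illustration: build_query is a hypothetical helper, and whether state.cdp.query accepts bound parameters is not confirmed by this document. Paths are validated before being interpolated, because they end up inside the SQL string.

```python
import re

_PATH = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def build_query(select_paths, where=None):
    """Build a DuckDB query string plus parameters from dotted JSON paths."""
    for path in list(select_paths) + list(where or {}):
        if not _PATH.fullmatch(path):
            raise ValueError(f"unsupported field path: {path!r}")
    columns = ", ".join(
        f"json_extract_string(event, '$.{path}') AS {path.rsplit('.', 1)[-1]}"
        for path in select_paths
    )
    sql = f"SELECT {columns} FROM events"
    params = []
    if where:
        clauses = []
        for path, value in where.items():
            clauses.append(f"json_extract_string(event, '$.{path}') = ?")
            params.append(str(value))
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql_text, sql_params = build_query(
    ["params.response.url", "params.response.status"],
    {"method": "Network.responseReceived"},
)
print(sql_text)
print(sql_params)
```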

Field Discovery

# See what fields are available
state.cdp.field_paths.keys()  # All discovered field names

# Find all paths for a field
state.cdp.discover_field_paths("url")
# Returns: ['params.request.url', 'params.response.url', 'params.documentURL', ...]

Direct CDP Access

# Send CDP commands directly
state.cdp.execute("Network.getResponseBody", {"requestId": "123"})
state.cdp.execute("Storage.getCookies", {})
state.cdp.execute("Runtime.evaluate", {"expression": "window.location.href"})
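On the wire, each CDP command is a JSON message with an integer id, a method, and params; the browser answers with a message carrying the same id and either a result or an error, while unsolicited events carry a method and no id. The sketch below shows that framing. The helper names are hypothetical and nothing here is taken from WebTap's code; it only illustrates what a call such as state.cdp.execute has to send and receive.

```python
import itertools
import json

_ids = itertools.count(1)


def make_command(method, params=None):
    """Serialize a CDP command; each command carries a unique integer id."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})


def classify(raw):
    """Tell a command response (has 'id') apart from an event (no 'id')."""
    msg = json.loads(raw)
    if "id" in msg:
        if "error" in msg:
            return ("error", msg["id"], msg["error"])
        return ("result", msg["id"], msg.get("result", {}))
    return ("event", msg["method"], msg.get("params", {}))


print(make_command("Runtime.evaluate", {"expression": "window.location.href"}))
print(classify('{"id": 1, "result": {"result": {"type": "string", "value": "https://example.com/"}}}'))
print(classify('{"method": "Network.requestWillBeSent", "params": {"requestId": "123"}}'))
```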

Chrome Extension

Install the extension from packages/webtap/extension/:

Open chrome://extensions/
Enable Developer mode
Load unpacked → Select extension folder
Click extension icon to connect to pages

Examples

List and Connect to Pages

>>> pages()
## Chrome Pages

| Index | Title                | URL                            | ID     | Connected |
|:------|:---------------------|:-------------------------------|:-------|:----------|
| 0     | Messenger            | https://www.m...1743198803269/ | DC8... | No        |
| 1     | GitHub - replkit2    | https://githu...elsen/replkit2 | DD4... | No        |
| 2     | YouTube Music        | https://music.youtube.com/     | F83... | No        |

_3 pages available_
<pages: 1 fields>

>>> connect(1)
## Connection Established

**Page:** GitHub - angelsen/replkit2

**URL:** https://github.com/angelsen/replkit2
<connect: 1 fields>

Monitor Network Traffic

>>> network()
## Network Requests

| ID   | ReqID        | Method | Status | URL                                             | Type     | Size |
|:-----|:-------------|:-------|:-------|:------------------------------------------------|:---------|:-----|
| 3264 | 682214.9033  | GET    | 200    | https://api.github.com/graphql                  | Fetch    | 22KB |
| 2315 | 682214.8985  | GET    | 200    | https://api.github.com/repos/angelsen/replkit2  | Fetch    | 16KB |
| 359  | 682214.8638  | GET    | 200    | https://github.githubassets.com/assets/app.js   | Script   | 21KB |

_3 requests_

### Next Steps

- **Analyze responses:** `body(3264)` - fetch response body
- **Parse HTML:** `body(3264, "bs4(body, 'html.parser').find('title').text")`
- **Extract JSON:** `body(3264, "json.loads(body)['data']")`
- **Find patterns:** `body(3264, "re.findall(r'/api/\\w+', body)")`
- **Decode JWT:** `body(3264, "jwt.decode(body, options={'verify_signature': False})")`
- **Search events:** `events({'url': '*api*'})` - find all API calls
- **Intercept traffic:** `fetch('enable')` then `requests()` - pause and modify
<network: 1 fields>

View Console Messages

>>> console()
## Console Messages

| ID   | Level      | Source   | Message                                                         | Time     |
|:-----|:-----------|:---------|:----------------------------------------------------------------|:---------|
| 5939 | WARNING    | security | An iframe which has both allow-scripts and allow-same-origin... | 11:42:46 |
| 2319 | LOG        | console  | API request completed                                           | 11:42:40 |
| 32   | ERROR      | network  | Failed to load resource: the server responded with a status...  | 12:47:41 |

_3 messages_

### Next Steps

- **Inspect error:** `inspect(32)` - view full stack trace
- **Find all errors:** `events({'level': 'error'})` - filter console errors
- **Extract stack:** `inspect(32, "data.get('stackTrace', {})")`
- **Search messages:** `events({'message': '*failed*'})` - pattern match
- **Check network:** `network()` - may show failed requests causing errors
<console: 1 fields>

Find and Analyze API Calls

>>> events({"url": "*api*", "method": "POST"})
## Query Results

| RowID | Method                      | URL                             | Status |
|:------|:----------------------------|:--------------------------------|:-------|
| 49    | Network.requestWillBeSent   | https://api.github.com/graphql  | -      |
| 50    | Network.responseReceived    | https://api.github.com/graphql  | 200    |

_2 events_
<events: 1 fields>

>>> body(50, expr="import json; json.loads(body)['data']")
{'viewer': {'login': 'octocat', 'name': 'The Octocat'}}

>>> inspect(49)  # View full request details

Debug Failed Requests

>>> events({"status": 404})
## Query Results

| RowID | Method                   | URL                               | Status |
|:------|:-------------------------|:----------------------------------|:-------|
| 32    | Network.responseReceived | https://api.example.com/missing   | 404    |
| 29    | Network.responseReceived | https://api.example.com/notfound  | 404    |

_2 events_
<events: 1 fields>

>>> events({"errorText": "*"})  # Find network errors
>>> events({"type": "Failed"})  # Find failed resources

Monitor Specific Domains

>>> events({"url": "*myapi.com*"})  # Your API
>>> events({"url": "*localhost*"})  # Local development
>>> events({"url": "*stripe*"})     # Payment APIs

Extract Headers and Cookies

>>> events({"headers": "*authorization*"})  # Find auth headers
>>> state.cdp.execute("Storage.getCookies", {})  # Get all cookies
>>> events({"setCookie": "*"})  # Find Set-Cookie headers

Filter Configuration

WebTap ships with aggressive default filters to reduce noise. Customize them in .webtap/filters.json:

{
  "ads": {
    "domains": ["*doubleclick*", "*googlesyndication*", "*adsystem*"],
    "types": ["Ping", "Beacon"]
  },
  "tracking": {
    "domains": ["*google-analytics*", "*segment*", "*mixpanel*"],
    "types": ["Image", "Script"]
  }
}
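To show how a configuration of this shape could be applied, the sketch below classifies a request as noise when its URL matches one of a category's domain patterns and its resource type is in that category's type list. The AND between the two conditions, the matching against the full URL, and the name noise_category are all assumptions made for illustration; the actual matching rules are not stated here.

```python
from fnmatch import fnmatchcase

# Same structure as the example filters.json above.
FILTERS = {
    "ads": {
        "domains": ["*doubleclick*", "*googlesyndication*", "*adsystem*"],
        "types": ["Ping", "Beacon"],
    },
    "tracking": {
        "domains": ["*google-analytics*", "*segment*", "*mixpanel*"],
        "types": ["Image", "Script"],
    },
}


def noise_category(url, resource_type, filters=FILTERS):
    """Return the first category whose domain pattern and type both match, else None."""
    for category, rule in filters.items():
        domain_hit = any(
            fnmatchcase(url.lower(), pattern.lower())
            for pattern in rule.get("domains", [])
        )
        types = rule.get("types", [])
        type_hit = not types or resource_type in types
        if domain_hit and type_hit:
            return category
    return None


print(noise_category("https://stats.g.doubleclick.net/collect", "Ping"))
print(noise_category("https://www.google-analytics.com/analytics.js", "Script"))
print(noise_category("https://api.github.com/graphql", "Fetch"))
```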

Design Principles

Store AS-IS - No transformation of CDP events
Query On-Demand - Extract only what's needed
Dynamic Discovery - No predefined schemas
SQL-First - Leverage DuckDB's JSON capabilities
Minimal Memory - Store only CDP data

Requirements

Chrome/Chromium with debugging enabled
Python 3.12+
Dependencies: websocket-client, duckdb, replkit2, fastapi, uvicorn, beautifulsoup4

🏗️ Architecture

Built on ReplKit2 for dual REPL/MCP functionality.

📚 Documentation

Architecture - System design
Vision - Design philosophy
Services - Service layer implementations
Commands - Command implementations

🛠️ Development

# Clone repository
git clone https://github.com/angelsen/tap-tools
cd tap-tools

# Install for development
uv sync --package webtap

# Run development version
uv run --package webtap webtap

# Run tests and checks
make check-webtap   # Check build
make format         # Format code
make lint           # Fix linting

API Server

WebTap automatically starts a FastAPI server on port 8765 for Chrome extension integration:

GET /status - Connection status
GET /pages - List available Chrome pages
POST /connect - Connect to a page
POST /disconnect - Disconnect from current page
POST /clear - Clear events/console/cache
GET /fetch/paused - Get paused requests
POST /filters/toggle/{category} - Toggle filter categories

The API server runs in a background thread and doesn't block the REPL.
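The endpoints listed above can be reached from any HTTP client while the REPL is running. The following sketch uses only the standard library and assumes the server listens on localhost and returns JSON; request and response bodies are not documented here, so none are assumed. build_request and call are hypothetical helper names.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8765"


def build_request(method, path, payload=None):
    """Create a request for one of the endpoints, with an optional JSON body."""
    data = json.dumps(payload).encode() if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(method, path, payload=None):
    """Send the request and decode the JSON response (requires a running WebTap)."""
    with urlopen(build_request(method, path, payload), timeout=5) as resp:
        return json.load(resp)


# Building requests works offline; call(...) needs the server to be up.
status_request = build_request("GET", "/status")
toggle_request = build_request("POST", "/filters/toggle/ads")
print(status_request.get_method(), status_request.full_url)
print(toggle_request.get_method(), toggle_request.full_url)
```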

📄 License

MIT - see LICENSE for details.
