Lightweight CDP browser control for Python — with an AI agent that can browse, read PDFs, manage files, and automate tasks.
Project description
tappi
Your own AI agent that controls a real browser and manages files — running entirely on your machine.
🌐 tappi.synthworx.com — Official home page & docs. Tappi is and will always be fully open source (MIT).
Give it a task in plain English. It opens your browser, navigates pages, clicks buttons, fills forms, reads content, creates PDFs, updates spreadsheets, and schedules recurring jobs. All your logins and cookies carry over. Everything stays local — your data never leaves your machine.
Think of it as a personal automation assistant with two superpowers: browser control and file management, sandboxed to one directory. Secure enough for work. Powerful enough to replace most browser automation scripts you've ever written.
Why tappi?
Every AI browser tool today pays a tax — either in tokens or in capability:
- Screenshot-based agents (Operator, Computer Use) send full page images to the LLM. The model squints at pixels, guesses coordinates, and prays it clicks the right button. A single interaction can burn 5-10K tokens on vision alone.
- DOM/accessibility tree tools (Playwright MCP, browser tools) dump the entire page structure into context. A single Reddit page can produce 50K+ tokens of nested elements. The LLM reads a novel just to find a button.
Tappi does neither. It indexes interactive elements into a compact numbered list:
[0] (link) Homepage → https://github.com/
[1] (button) Sign in
[2] (link) Explore → /explore
[3] (button) Submit Order
The LLM says click 3. Done. ~200 tokens instead of 5-50K. That's the difference.
- 10x more token-efficient than both screenshot-based and DOM-dump approaches. Structured element lists give the model exactly what it needs — nothing more.
- Better LLM decisions. Numbered elements with semantic labels (
[3] (button) Submit Order) are unambiguous. No hallucinated CSS selectors. No coordinate guessing. No wading through thousands of DOM nodes. - Real browser, real sessions. Connects to Chrome via CDP — your saved logins, cookies, and extensions are all there. Log in once, automate forever.
- Sandboxed by design. One workspace directory. One browser. No filesystem access beyond the sandbox. Safe for corporate environments where you can't install full automation platforms.
- Works everywhere. Linux, macOS, Windows. Python 3.10+. Single
pip install.
pip install tappi # Everything: CDP + MCP server + AI agent
Table of Contents
- Installation
- Quick Start
- AI Agent Mode ← New
- Web UI ← New
- Tutorial: Your First Automation
- How It Works
- Python Library
- CLI Reference
- Profiles
- Shadow DOM Support
- MCP Server ← New
- FAQ
- License
Installation
One-Line Installer (recommended)
Downloads Python if needed, creates a virtual environment, installs tappi, and drops a "Launch tappi" shortcut on your Desktop. Inspect the scripts first if you like — they're in the repo.
macOS:
curl -fsSL https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-macos.sh | bash
Linux (Debian/Ubuntu, Fedora, Arch):
curl -fsSL https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-linux.sh | bash
Windows (PowerShell):
irm https://raw.githubusercontent.com/shaihazher/tappi/main/install/install-windows.ps1 | iex
After install, double-click "Launch tappi" on your Desktop — it starts the browser, launches the web UI, and opens it automatically. Pick your AI provider and API key in the Settings page on first launch. See the Web UI Tutorial for a visual walkthrough.
Manual Install (with venv)
If you prefer to set things up yourself. Requires Python 3.10+.
macOS
# Install Python 3.13 (skip if you already have 3.10+)
brew install python@3.13
# Create and activate a virtual environment
python3.13 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate
# Install tappi
pip install --upgrade pip
pip install tappi
# Verify
bpy --version
To auto-activate on every new terminal, add to your ~/.zshrc (or ~/.bash_profile):
source ~/.tappi-venv/bin/activate
Linux (Debian/Ubuntu)
# Install Python 3.13 and venv (skip if you already have 3.10+)
sudo apt update
sudo apt install -y python3 python3-pip python3-venv
# Create and activate a virtual environment
python3 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate
# Install tappi
pip install --upgrade pip
pip install tappi
# Verify
bpy --version
To auto-activate on every new terminal, add to your ~/.bashrc:
source ~/.tappi-venv/bin/activate
Linux (Fedora/RHEL)
# Install Python 3.13 (skip if you already have 3.10+)
sudo dnf install -y python3 python3-pip
# Create and activate a virtual environment
python3 -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate
# Install tappi
pip install --upgrade pip
pip install tappi
# Verify
bpy --version
Linux (Arch)
# Install Python (skip if you already have 3.10+)
sudo pacman -Sy python python-pip
# Create and activate a virtual environment
python -m venv ~/.tappi-venv
source ~/.tappi-venv/bin/activate
# Install tappi
pip install --upgrade pip
pip install tappi
# Verify
bpy --version
Windows
# Install Python 3.13 (skip if you already have 3.10+)
winget install Python.Python.3.13
# Create and activate a virtual environment
python -m venv $env:USERPROFILE\.tappi-venv
& "$env:USERPROFILE\.tappi-venv\Scripts\Activate.ps1"
# Install tappi
pip install --upgrade pip
pip install tappi
# Verify
bpy --version
To auto-activate on every new terminal, add to your PowerShell profile (notepad $PROFILE):
. "$env:USERPROFILE\.tappi-venv\Scripts\Activate.ps1"
Quick Install (no venv)
If you just want to get going and don't care about virtual environments:
pip install tappi
Quick Start
If you used the one-line installer, just double-click "Launch tappi" on your Desktop. Done.
From the terminal:
# Launch browser + web UI (opens http://127.0.0.1:8321)
bpy launch && bpy serve
# Or use the CLI agent directly
bpy setup # one-time: pick provider + API key
bpy launch # start the browser
bpy agent "Go to github.com and find today's trending Python repos"
AI Agent Mode
The agent is an LLM with 6 tools that can browse the web, read/write files, create PDFs, manage spreadsheets, run shell commands, and schedule recurring tasks — all within a sandboxed workspace directory.
Setup
bpy setup
The wizard walks you through:
- LLM Provider — OpenRouter, Anthropic, Claude Max (OAuth), OpenAI, AWS Bedrock, Azure, Google Vertex
- API Key — paste your key (or OAuth token for Claude Max)
- Model — defaults per provider, fully configurable
- Workspace — sandboxed directory for all file operations
- Browser Profile — which browser profile the agent uses
- Shell Access — toggle on/off
All config lives in ~/.tappi/config.json.
Providers
| Provider | Auth | Status |
|---|---|---|
| OpenRouter | API key | ✅ Ready |
| Anthropic | API key | ✅ Ready |
| Claude Max (OAuth) | OAuth token (sk-ant-oat01-...) |
✅ Ready |
| OpenAI | API key | ✅ Ready |
| AWS Bedrock | AWS credentials | ✅ Ready (via LiteLLM) |
| Azure OpenAI | API key + endpoint | ✅ Ready (via LiteLLM) |
| Google Vertex AI | Service account | ✅ Ready (via LiteLLM) |
All providers work through LiteLLM — one interface, any model.
Claude Max (OAuth) — Use Your Subscription
If you have a Claude Pro/Max subscription ($20-200/mo), you can use your OAuth token instead of paying per-API-call. This is the same token Claude Code uses.
bpy setup
# Choose "Claude Max (OAuth)"
# Paste your token: sk-ant-oat01-...
Where to find your token:
- If you use Claude Code: check your credentials file or environment
- The token format is
sk-ant-oat01-...(different from API keys which aresk-ant-api03-...) - It works as a drop-in replacement — no proxy, no special config
CLI Usage
Interactive mode
bpy agent
tappi agent (type 'quit' to exit, 'reset' to clear)
You: Go to hacker news and find the top post about AI
🔧 browser → launch
🔧 browser → open
🔧 browser → elements
🔧 browser → text
Agent: The top AI-related post on Hacker News right now is "GPT-5 Released"
with 342 points. It links to openai.com/blog/gpt5 and the discussion has
127 comments. Want me to read the article or the comments?
One-shot mode
bpy agent "Create a PDF report of today's weather in Houston"
The agent figures out the steps: open a weather site → extract data → create HTML → convert to PDF → save to workspace.
Tools
The agent has 6 tools, each exposed as a JSON schema the LLM calls natively:
| Tool | What it does |
|---|---|
| browser | Navigate, click, type, read pages, screenshots, tab management. Uses your real browser with saved logins. |
| files | Read, write, list, move, copy, delete files — sandboxed to workspace. |
| Read text from PDFs (PyMuPDF), create PDFs from HTML (WeasyPrint). | |
| spreadsheet | Read/write CSV and Excel (.xlsx) files, create new ones with headers. |
| shell | Run shell commands (cwd = workspace). Can be disabled in settings. |
| cron | Schedule recurring tasks with cron expressions or intervals. |
How the Agent Loop Works
User message
↓
┌──────────────────────────┐
│ LLM (via LiteLLM) │ ◄── Sees all 6 tools as JSON schemas
│ Decides what to do │
└──────────┬───────────────┘
│
▼
┌─ Tool calls? ──┐
│ │
Yes No → Return text response
│
▼
Execute each tool call
│
▼
Append results to conversation
│
▼
Loop back to LLM ────────────► (max 50 iterations)
The loop is synchronous — each tool call blocks until complete. No timeouts. The LLM sees tool results and decides the next step, just like a human would.
Cron (Scheduled Tasks)
Tell the agent to schedule recurring tasks:
You: Schedule a job to check trending repos on GitHub every morning at 9 AM
Agent: Done. Created job "GitHub Trends" with schedule "0 9 * * *".
Jobs are stored in ~/.tappi/jobs.json and persist across restarts. When bpy serve is running, APScheduler fires each job in its own agent session.
# Via CLI
bpy agent "List my scheduled jobs"
bpy agent "Pause the GitHub Trends job"
bpy agent "Remove job abc123"
Web UI
bpy serve # http://127.0.0.1:8321
bpy serve --port 9000 # custom port
The web UI has 4 sections:
💬 Chat
Full chat interface with live tool call visibility. As the agent works, you see each tool call and its result in real-time via WebSocket.
🌍 Browser Profiles
View and create browser profiles. Each profile has its own Chrome sessions (cookies, logins) and CDP port. Create profiles for different use cases — work, personal, social media.
⏰ Scheduled Jobs
View all cron jobs with their schedule, status (active/paused), and task description. Jobs are created via chat ("schedule a task to...").
⚙️ Settings
- Model — change the LLM model
- Browser Profile — select which profile the agent uses
- Shell Access — enable/disable shell commands
- Workspace — view the sandboxed directory
Note: Provider and API key changes require
bpy setup(CLI) — these aren't exposed in the web UI for security.
Tutorial: Your First Automation
Step 1: Launch the browser
bpy launch
✓ Chrome launched on port 9222
Profile: ~/.tappi/profiles/default
⚡ First launch — a fresh Chrome window opened.
Log into the sites you want to automate (Gmail, GitHub, etc.).
Those sessions will persist for all future launches.
First time only: A fresh Chrome window opens. Log into the websites you want to automate. Close the window when done. Your sessions are saved in the profile.
Step 2: Control it
bpy open github.com # Navigate
bpy elements # See what's clickable
bpy click 3 # Click element [3]
bpy type 5 "hello world" # Type into element [5]
bpy text # Read the page
bpy screenshot page.png # Screenshot
Every interactive element gets a number. Use that number with click and type.
How It Works
The connection
┌─────────────┐ CDP (WebSocket) ┌──────────────────┐
│ tappi │ ◄──────────────────────► │ Chrome/Chromium │
│ (your code) │ localhost:9222 │ (your sessions) │
└─────────────┘ └──────────────────┘
bpy launch starts Chrome with --remote-debugging-port=9222 and a persistent --user-data-dir. All commands connect to that port via WebSocket.
Real mouse events
click uses CDP's Input.dispatchMouseEvent — real mouse presses, not .click(). Works with React, Vue, Angular, and every framework.
Shadow DOM piercing
The element scanner recursively enters every shadow root. Reddit, GitHub, Salesforce, Angular Material — all work automatically.
Framework-aware typing
type dispatches proper input and change events using React's native value setter. SPAs with controlled components get the value update correctly.
Using as a Python Library
from tappi import Browser
Browser.launch() # Start Chrome
b = Browser() # Connect
b.open("https://github.com")
elements = b.elements() # List interactive elements
b.click(1) # Click by index
b.type(2, "search query") # Type into input
text = b.text() # Read visible text
b.screenshot("page.png") # Screenshot
b.upload("~/file.pdf") # Upload file
Profile management
from tappi.profiles import create_profile, list_profiles, get_profile
create_profile("work") # → port 9222
create_profile("personal") # → port 9223
# Run multiple simultaneously
work = get_profile("work")
Browser.launch(port=work["port"], user_data_dir=work["path"])
b = Browser(f"http://127.0.0.1:{work['port']}")
Agent as a library
from tappi.agent.loop import Agent
agent = Agent(
browser_profile="default",
on_tool_call=lambda name, params, result: print(f"🔧 {name}"),
)
response = agent.chat("Go to github.com and find trending repos")
print(response)
# Multi-turn
response = agent.chat("Now check the first one and summarize the README")
print(response)
# Reset conversation
agent.reset()
CLI Reference
Agent Commands
| Command | Description |
|---|---|
bpy setup |
Configure LLM provider, workspace, browser |
bpy agent [message] |
Chat with the agent (interactive or one-shot) |
bpy serve [--port 8321] |
Start the web UI |
Browser Commands
| Command | Description |
|---|---|
bpy launch [name] |
Start Chrome with a named profile |
bpy launch new [name] |
Create a new profile |
bpy launch list |
List all profiles |
bpy launch --default <name> |
Set the default profile |
Navigation
| Command | Description |
|---|---|
bpy open <url> |
Navigate to URL |
bpy url |
Print current URL |
bpy back / forward / refresh |
History navigation |
Interaction
| Command | Description |
|---|---|
bpy elements [selector] |
List interactive elements (numbered) |
bpy click <index> |
Click element by number |
bpy type <index> <text> |
Type into element |
bpy upload <path> [selector] |
Upload file |
Content
| Command | Description |
|---|---|
bpy text [selector] |
Extract visible text |
bpy html <selector> |
Get element HTML |
bpy eval <js> |
Run JavaScript |
bpy screenshot [path] |
Save screenshot |
Other
| Command | Description |
|---|---|
bpy tabs / tab <n> / newtab / close |
Tab management |
bpy scroll <dir> [px] |
Scroll the page |
bpy wait <ms> |
Wait (for scripts) |
Profiles
Each profile is a separate Chrome session with its own logins, cookies, and CDP port.
bpy launch # Default profile (port 9222)
bpy launch new work # Create "work" (port 9223)
bpy launch work # Launch it
bpy launch list # See all profiles
bpy launch --default work # Set default
bpy launch delete old # Remove a profile
# Run multiple simultaneously
bpy launch # Terminal 1: default on 9222
bpy launch work # Terminal 2: work on 9223
CDP_URL=http://127.0.0.1:9223 bpy tabs # Control work profile
Profiles live at ~/.tappi/profiles/<name>/. Config at ~/.tappi/config.json.
Shadow DOM Support
tappi automatically pierces shadow DOM boundaries. No configuration needed.
bpy open reddit.com
bpy elements # Finds elements inside shadow roots
bpy click 5 # Works normally
Environment Variables
| Variable | Description | Default |
|---|---|---|
CDP_URL |
CDP endpoint URL | http://127.0.0.1:9222 |
NO_COLOR |
Disable colored output | (unset) |
ANTHROPIC_API_KEY |
Anthropic/Claude Max key | (from config) |
OPENROUTER_API_KEY |
OpenRouter key | (from config) |
OPENAI_API_KEY |
OpenAI key | (from config) |
MCP Server
tappi includes a built-in MCP (Model Context Protocol) server, so you can use it with Claude Desktop, Cursor, Windsurf, OpenClaw, or any MCP-compatible AI agent.
Claude Desktop — One-Click Install (.mcpb)
The easiest way to add tappi to Claude Desktop is the .mcpb bundle — a single file that installs everything:
- Download
tappi-0.5.1.mcpbfrom the latest release - Double-click it — Claude Desktop installs the extension automatically
- Start Chrome with
tappi launchor--remote-debugging-port=9222 - Ask Claude to browse the web
No pip install. No config editing. No Python on your PATH. The bundle includes all source code and dependencies — Claude Desktop manages the runtime via uv.
See it in action: Real Claude Desktop conversation using tappi MCP
Manual Setup (pip)
If you prefer manual installation or use other MCP clients:
pip install tappi
Add to your claude_desktop_config.json:
{
"mcpServers": {
"tappi": {
"command": "tappi",
"args": ["mcp"],
"env": {
"CDP_URL": "http://127.0.0.1:9222"
}
}
}
}
Don't want to install anything? Use uvx (comes with uv):
{
"mcpServers": {
"tappi": {
"command": "uvx",
"args": ["tappi", "mcp"],
"env": {
"CDP_URL": "http://127.0.0.1:9222"
}
}
}
}
Prefer npm? There's a thin wrapper that delegates to the Python server:
npx tappi-mcp
Claude Desktop config with npx:
{
"mcpServers": {
"tappi": {
"command": "npx",
"args": ["tappi-mcp"],
"env": {
"CDP_URL": "http://127.0.0.1:9222"
}
}
}
}
Cursor / Windsurf
Same config format — add the tappi server to your MCP settings with the command above.
OpenClaw
tappi is available as an OpenClaw skill on ClawHub:
clawhub install tappi
HTTP/SSE Transport
For MCP clients that prefer HTTP instead of stdio:
tappi mcp --sse # default: 127.0.0.1:8377
tappi mcp --sse --port 9000 # custom port
Available Tools
The MCP server exposes 23 tools:
| Tool | Description |
|---|---|
tappi_open |
Navigate to a URL |
tappi_elements |
List interactive elements (numbered, shadow DOM piercing) |
tappi_click |
Click element by index |
tappi_type |
Type into element by index |
tappi_text |
Extract visible page text |
tappi_eval |
Run JavaScript in page context |
tappi_screenshot |
Capture page screenshot |
tappi_tabs |
List open tabs |
tappi_tab |
Switch tab |
tappi_scroll |
Scroll page |
tappi_upload |
Upload file (bypasses OS dialog) |
tappi_click_xy |
Click at coordinates (cross-origin iframes) |
tappi_iframe_rect |
Get iframe bounding box |
| ... and 10 more | newtab, close, url, back, forward, refresh, html, hover_xy, drag_xy, wait |
How It's Different
Unlike Playwright MCP or browser tool ARIA snapshots, tappi's MCP server:
- Connects to your existing Chrome — all sessions, cookies, extensions carry over
- Pierces shadow DOM — Gmail, Reddit, GitHub all work natively
- Returns compact indexed output —
[3] (button) Submitinstead of a 50K-token accessibility tree - Uses 3-10x fewer tokens per interaction
- No headless browser — runs in your real Chrome, invisible to bot detection
Prerequisites
Start Chrome with remote debugging enabled:
# Option 1: tappi launch (manages profiles for you)
tappi launch
# Option 2: Manual
google-chrome --remote-debugging-port=9222
Set CDP_URL in your MCP config to point to your Chrome instance (default: http://127.0.0.1:9222).
FAQ
Q: What's the difference between bpy agent and bpy commands?
bpy agent talks to an LLM that decides what to do. bpy click 3 directly executes a browser command. Use agent mode for complex multi-step tasks; use direct commands for scripting.
Q: Can I use my Claude Max subscription instead of paying per-API-call?
Yes. Choose "Claude Max (OAuth)" during bpy setup and paste your OAuth token (sk-ant-oat01-...). Same token Claude Code uses.
Q: Do I need to log in every time?
No. Log in once during your first bpy launch. Sessions persist in the profile directory.
Q: What browsers are supported? Chrome, Chromium, Brave, Microsoft Edge — anything Chromium-based with CDP support.
Q: Does it work headless?
Yes. bpy launch --headless runs without a visible window. Log in with a visible window first to set up sessions.
Q: Is my data safe?
File operations are sandboxed to your workspace directory. The agent cannot access files outside it. Shell access can be disabled. API keys are stored locally in ~/.tappi/config.json.
Q: How is this different from Selenium/Playwright?
| tappi | Selenium | Playwright | |
|---|---|---|---|
| Session reuse | ✅ | ❌ | Partial |
| AI agent | ✅ | ❌ | ❌ |
| Shadow DOM | ✅ | ❌ | ❌ |
| Dependencies | 1 (core) | Heavy | Heavy |
| Install size | ~100KB | ~50MB | ~200MB+ |
Architecture
tappi/
├── tappi/
│ ├── core.py # CDP engine (Phase 1)
│ ├── cli.py # bpy CLI
│ ├── profiles.py # Named profile management
│ ├── js_expressions.py # Injected JS for element scanning
│ ├── agent/
│ │ ├── loop.py # Agentic while-loop (LiteLLM)
│ │ ├── config.py # Provider/workspace/model config
│ │ ├── setup.py # Interactive setup wizard
│ │ └── tools/
│ │ ├── browser.py # Browser tool (wraps core.py)
│ │ ├── files.py # Sandboxed file ops
│ │ ├── pdf.py # PDF read (PyMuPDF) + create (WeasyPrint)
│ │ ├── spreadsheet.py # CSV + Excel (openpyxl)
│ │ ├── shell.py # Sandboxed shell execution
│ │ └── cron.py # APScheduler cron jobs
│ └── server/
│ └── app.py # FastAPI web UI + API
└── pyproject.toml
Blog Posts
- 🏆 Tappi Is the Most Token-Efficient Browser Tool for AI Agents — Deep competitive analysis with live benchmarks vs Agent-Browser, Playwright CLI, and more
- 🚀 Every AI Browser Tool Is Broken Except One — The original benchmark (59K tokens 3/3 ✅ vs 252K for browser tools)
- 🔌 Tappi MCP Is Live — MCP server for Claude Desktop
- 🖥️ Tappi Web UI Tutorial — Visual walkthrough
License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tappi-0.7.0.tar.gz.
File metadata
- Download URL: tappi-0.7.0.tar.gz
- Upload date:
- Size: 5.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ca9ca66b33c7bcb10fd8778d4bac1da0df74ebe0ab2b39b7d68cbfc6feb85c72
|
|
| MD5 |
05ef00688b57fc80480156b9ead9fc7e
|
|
| BLAKE2b-256 |
24e9b4e99624cb25b093dca4932a13e265fe7f141c760398763f1268e7548423
|
File details
Details for the file tappi-0.7.0-py3-none-any.whl.
File metadata
- Download URL: tappi-0.7.0-py3-none-any.whl
- Upload date:
- Size: 129.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a60d1feb3d7292a63cc1feff1d424579dd94e5229f1539c582d0efc3cf0c9eca
|
|
| MD5 |
843cebca0e9ffcae5a11d2a3290f74b7
|
|
| BLAKE2b-256 |
3684e83d1cae7d0e90ae746c4e8477b176e45294d7fe82053cf4845efb56b58a
|