Lightweight CDP browser control for Python — with an AI agent that can browse, read PDFs, manage files, and automate tasks.

These details have not been verified by PyPI

Project links

Project description

tappi

Your own AI agent that controls a real browser and manages files — running entirely on your machine.

🌐 tappi.synthworx.com — Official home page & docs. Tappi is and will always be fully open source (MIT).

Give it a task in plain English. It opens your browser, navigates pages, clicks buttons, fills forms, reads content, creates PDFs, updates spreadsheets, and schedules recurring jobs. All your logins and cookies carry over. Everything stays local — your data never leaves your machine.

Think of it as a personal automation assistant with two superpowers: browser control and file management, sandboxed to one directory. Secure enough for work. Powerful enough to replace most browser automation scripts you've ever written.

Why tappi?

Every AI browser tool today pays a tax — either in tokens or in capability:

Screenshot-based agents (Operator, Computer Use) send full page images to the LLM. The model squints at pixels, guesses coordinates, and prays it clicks the right button. A single interaction can burn 5-10K tokens on vision alone.
DOM/accessibility tree tools (Playwright MCP, browser tools) dump the entire page structure into context. A single Reddit page can produce 50K+ tokens of nested elements. The LLM reads a novel just to find a button.

Tappi does neither. It indexes interactive elements into a compact numbered list:

[0] (link) Homepage → https://github.com/
[1] (button) Sign in
[2] (link) Explore → /explore
[3] (button) Submit Order

The LLM says click 3. Done. ~200 tokens instead of 5-50K. That's the difference.

10x more token-efficient than both screenshot-based and DOM-dump approaches. Structured element lists give the model exactly what it needs — nothing more.
Better LLM decisions. Numbered elements with semantic labels ([3] (button) Submit Order) are unambiguous. No hallucinated CSS selectors. No coordinate guessing. No wading through thousands of DOM nodes.
Real browser, real sessions. Connects to Chrome via CDP — your saved logins, cookies, and extensions are all there. Log in once, automate forever.
Sandboxed by design. One workspace directory. One browser. No filesystem access beyond the sandbox. Safe for corporate environments where you can't install full automation platforms.
Works everywhere. Linux, macOS, Windows. Python 3.10+. Single pip install.

pip install tappi            # Everything: CDP + MCP server + AI agent

Quick Start
AI Agent Mode ← New
Web UI ← New
Tutorial: Your First Automation
How It Works
Python Library
CLI Reference
Profiles
Shadow DOM Support
MCP Server ← New
FAQ
License

Quick Start

# Install tappi (includes CDP, MCP server, and AI agent)
pip install tappi

# One-time setup: choose provider, enter API key, set workspace
bpy setup

# Launch a browser
bpy launch

# Chat with the agent
bpy agent "Go to github.com and find today's trending Python repos"

# Or use the web UI
bpy serve

AI Agent Mode

The agent is an LLM with 6 tools that can browse the web, read/write files, create PDFs, manage spreadsheets, run shell commands, and schedule recurring tasks — all within a sandboxed workspace directory.

Setup

bpy setup

The wizard walks you through:

LLM Provider — OpenRouter, Anthropic, Claude Max (OAuth), OpenAI, AWS Bedrock, Azure, Google Vertex
API Key — paste your key (or OAuth token for Claude Max)
Model — defaults per provider, fully configurable
Workspace — sandboxed directory for all file operations
Browser Profile — which browser profile the agent uses
Shell Access — toggle on/off

All config lives in ~/.tappi/config.json.

Providers

Provider	Auth	Status
OpenRouter	API key	✅ Ready
Anthropic	API key	✅ Ready
Claude Max (OAuth)	OAuth token (`sk-ant-oat01-...`)	✅ Ready
OpenAI	API key	✅ Ready
AWS Bedrock	AWS credentials	✅ Ready (via LiteLLM)
Azure OpenAI	API key + endpoint	✅ Ready (via LiteLLM)
Google Vertex AI	Service account	✅ Ready (via LiteLLM)

All providers work through LiteLLM — one interface, any model.

Claude Max (OAuth) — Use Your Subscription

If you have a Claude Pro/Max subscription ($20-200/mo), you can use your OAuth token instead of paying per-API-call. This is the same token Claude Code uses.

bpy setup
# Choose "Claude Max (OAuth)"
# Paste your token: sk-ant-oat01-...

Where to find your token:

If you use Claude Code: check your credentials file or environment
The token format is sk-ant-oat01-... (different from API keys which are sk-ant-api03-...)
It works as a drop-in replacement — no proxy, no special config

CLI Usage

Interactive mode

bpy agent

tappi agent (type 'quit' to exit, 'reset' to clear)

You: Go to hacker news and find the top post about AI
  🔧 browser → launch
  🔧 browser → open
  🔧 browser → elements
  🔧 browser → text

Agent: The top AI-related post on Hacker News right now is "GPT-5 Released"
with 342 points. It links to openai.com/blog/gpt5 and the discussion has
127 comments. Want me to read the article or the comments?

One-shot mode

bpy agent "Create a PDF report of today's weather in Houston"

The agent figures out the steps: open a weather site → extract data → create HTML → convert to PDF → save to workspace.

Tools

The agent has 6 tools, each exposed as a JSON schema the LLM calls natively:

Tool	What it does
browser	Navigate, click, type, read pages, screenshots, tab management. Uses your real browser with saved logins.
files	Read, write, list, move, copy, delete files — sandboxed to workspace.
pdf	Read text from PDFs (PyMuPDF), create PDFs from HTML (WeasyPrint).
spreadsheet	Read/write CSV and Excel (.xlsx) files, create new ones with headers.
shell	Run shell commands (cwd = workspace). Can be disabled in settings.
cron	Schedule recurring tasks with cron expressions or intervals.

How the Agent Loop Works

User message
    ↓
┌──────────────────────────┐
│   LLM (via LiteLLM)      │ ◄── Sees all 6 tools as JSON schemas
│   Decides what to do      │
└──────────┬───────────────┘
           │
           ▼
    ┌─ Tool calls? ──┐
    │                 │
   Yes               No → Return text response
    │
    ▼
Execute each tool call
    │
    ▼
Append results to conversation
    │
    ▼
Loop back to LLM ────────────►  (max 50 iterations)

The loop is synchronous — each tool call blocks until complete. No timeouts. The LLM sees tool results and decides the next step, just like a human would.

Cron (Scheduled Tasks)

Tell the agent to schedule recurring tasks:

You: Schedule a job to check trending repos on GitHub every morning at 9 AM
Agent: Done. Created job "GitHub Trends" with schedule "0 9 * * *".

Jobs are stored in ~/.tappi/jobs.json and persist across restarts. When bpy serve is running, APScheduler fires each job in its own agent session.

# Via CLI
bpy agent "List my scheduled jobs"
bpy agent "Pause the GitHub Trends job"
bpy agent "Remove job abc123"

Web UI

bpy serve                    # http://127.0.0.1:8321
bpy serve --port 9000        # custom port

The web UI has 4 sections:

💬 Chat

Full chat interface with live tool call visibility. As the agent works, you see each tool call and its result in real-time via WebSocket.

🌍 Browser Profiles

View and create browser profiles. Each profile has its own Chrome sessions (cookies, logins) and CDP port. Create profiles for different use cases — work, personal, social media.

⏰ Scheduled Jobs

View all cron jobs with their schedule, status (active/paused), and task description. Jobs are created via chat ("schedule a task to...").

⚙️ Settings

Model — change the LLM model
Browser Profile — select which profile the agent uses
Shell Access — enable/disable shell commands
Workspace — view the sandboxed directory

Note: Provider and API key changes require bpy setup (CLI) — these aren't exposed in the web UI for security.

Tutorial: Your First Automation

Step 1: Launch the browser

bpy launch

✓ Chrome launched on port 9222
  Profile: ~/.tappi/profiles/default

⚡ First launch — a fresh Chrome window opened.
   Log into the sites you want to automate (Gmail, GitHub, etc.).
   Those sessions will persist for all future launches.

First time only: A fresh Chrome window opens. Log into the websites you want to automate. Close the window when done. Your sessions are saved in the profile.

Step 2: Control it

bpy open github.com         # Navigate
bpy elements                # See what's clickable
bpy click 3                 # Click element [3]
bpy type 5 "hello world"    # Type into element [5]
bpy text                    # Read the page
bpy screenshot page.png     # Screenshot

Every interactive element gets a number. Use that number with click and type.

How It Works

The connection

┌─────────────┐     CDP (WebSocket)     ┌──────────────────┐
│  tappi  │ ◄──────────────────────► │  Chrome/Chromium  │
│  (your code) │     localhost:9222       │  (your sessions)  │
└─────────────┘                          └──────────────────┘

bpy launch starts Chrome with --remote-debugging-port=9222 and a persistent --user-data-dir. All commands connect to that port via WebSocket.

Real mouse events

click uses CDP's Input.dispatchMouseEvent — real mouse presses, not .click(). Works with React, Vue, Angular, and every framework.

Shadow DOM piercing

The element scanner recursively enters every shadow root. Reddit, GitHub, Salesforce, Angular Material — all work automatically.

Framework-aware typing

type dispatches proper input and change events using React's native value setter. SPAs with controlled components get the value update correctly.

Using as a Python Library

from tappi import Browser

Browser.launch()              # Start Chrome
b = Browser()                 # Connect

b.open("https://github.com")
elements = b.elements()       # List interactive elements
b.click(1)                    # Click by index
b.type(2, "search query")     # Type into input
text = b.text()               # Read visible text
b.screenshot("page.png")      # Screenshot
b.upload("~/file.pdf")        # Upload file

Profile management

from tappi.profiles import create_profile, list_profiles, get_profile

create_profile("work")        # → port 9222
create_profile("personal")    # → port 9223

# Run multiple simultaneously
work = get_profile("work")
Browser.launch(port=work["port"], user_data_dir=work["path"])
b = Browser(f"http://127.0.0.1:{work['port']}")

Agent as a library

from tappi.agent.loop import Agent

agent = Agent(
    browser_profile="default",
    on_tool_call=lambda name, params, result: print(f"🔧 {name}"),
)

response = agent.chat("Go to github.com and find trending repos")
print(response)

# Multi-turn
response = agent.chat("Now check the first one and summarize the README")
print(response)

# Reset conversation
agent.reset()

CLI Reference

Agent Commands

Command	Description
`bpy setup`	Configure LLM provider, workspace, browser
`bpy agent [message]`	Chat with the agent (interactive or one-shot)
`bpy serve [--port 8321]`	Start the web UI

Browser Commands

Command	Description
`bpy launch [name]`	Start Chrome with a named profile
`bpy launch new [name]`	Create a new profile
`bpy launch list`	List all profiles
`bpy launch --default <name>`	Set the default profile

Navigation

Command	Description
`bpy open <url>`	Navigate to URL
`bpy url`	Print current URL
`bpy back` / `forward` / `refresh`	History navigation

Interaction

Command	Description
`bpy elements [selector]`	List interactive elements (numbered)
`bpy click <index>`	Click element by number
`bpy type <index> <text>`	Type into element
`bpy upload <path> [selector]`	Upload file

Content

Command	Description
`bpy text [selector]`	Extract visible text
`bpy html <selector>`	Get element HTML
`bpy eval <js>`	Run JavaScript
`bpy screenshot [path]`	Save screenshot

Other

Command	Description
`bpy tabs` / `tab <n>` / `newtab` / `close`	Tab management
`bpy scroll <dir> [px]`	Scroll the page
`bpy wait <ms>`	Wait (for scripts)

Profiles

Each profile is a separate Chrome session with its own logins, cookies, and CDP port.

bpy launch                  # Default profile (port 9222)
bpy launch new work         # Create "work" (port 9223)
bpy launch work             # Launch it
bpy launch list             # See all profiles
bpy launch --default work   # Set default
bpy launch delete old       # Remove a profile

# Run multiple simultaneously
bpy launch                  # Terminal 1: default on 9222
bpy launch work             # Terminal 2: work on 9223
CDP_URL=http://127.0.0.1:9223 bpy tabs   # Control work profile

Profiles live at ~/.tappi/profiles/<name>/. Config at ~/.tappi/config.json.

Shadow DOM Support

tappi automatically pierces shadow DOM boundaries. No configuration needed.

bpy open reddit.com
bpy elements        # Finds elements inside shadow roots
bpy click 5         # Works normally

Environment Variables

Variable	Description	Default
`CDP_URL`	CDP endpoint URL	`http://127.0.0.1:9222`
`NO_COLOR`	Disable colored output	(unset)
`ANTHROPIC_API_KEY`	Anthropic/Claude Max key	(from config)
`OPENROUTER_API_KEY`	OpenRouter key	(from config)
`OPENAI_API_KEY`	OpenAI key	(from config)

MCP Server

tappi includes a built-in MCP (Model Context Protocol) server, so you can use it with Claude Desktop, Cursor, Windsurf, OpenClaw, or any MCP-compatible AI agent.

Claude Desktop — One-Click Install (.mcpb)

The easiest way to add tappi to Claude Desktop is the .mcpb bundle — a single file that installs everything:

Download tappi-0.5.1.mcpb from the latest release
Double-click it — Claude Desktop installs the extension automatically
Start Chrome with tappi launch or --remote-debugging-port=9222
Ask Claude to browse the web

No pip install. No config editing. No Python on your PATH. The bundle includes all source code and dependencies — Claude Desktop manages the runtime via uv.

See it in action: Real Claude Desktop conversation using tappi MCP

Manual Setup (pip)

If you prefer manual installation or use other MCP clients:

pip install tappi

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "tappi": {
      "command": "tappi",
      "args": ["mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}

Don't want to install anything? Use uvx (comes with uv):

{
  "mcpServers": {
    "tappi": {
      "command": "uvx",
      "args": ["tappi", "mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}

Prefer npm? There's a thin wrapper that delegates to the Python server:

npx tappi-mcp

Claude Desktop config with npx:

{
  "mcpServers": {
    "tappi": {
      "command": "npx",
      "args": ["tappi-mcp"],
      "env": {
        "CDP_URL": "http://127.0.0.1:9222"
      }
    }
  }
}

Cursor / Windsurf

Same config format — add the tappi server to your MCP settings with the command above.

OpenClaw

tappi is available as an OpenClaw skill on ClawHub:

clawhub install tappi

HTTP/SSE Transport

For MCP clients that prefer HTTP instead of stdio:

tappi mcp --sse                    # default: 127.0.0.1:8377
tappi mcp --sse --port 9000        # custom port

Available Tools

The MCP server exposes 23 tools:

Tool	Description
`tappi_open`	Navigate to a URL
`tappi_elements`	List interactive elements (numbered, shadow DOM piercing)
`tappi_click`	Click element by index
`tappi_type`	Type into element by index
`tappi_text`	Extract visible page text
`tappi_eval`	Run JavaScript in page context
`tappi_screenshot`	Capture page screenshot
`tappi_tabs`	List open tabs
`tappi_tab`	Switch tab
`tappi_scroll`	Scroll page
`tappi_upload`	Upload file (bypasses OS dialog)
`tappi_click_xy`	Click at coordinates (cross-origin iframes)
`tappi_iframe_rect`	Get iframe bounding box
... and 10 more	`newtab`, `close`, `url`, `back`, `forward`, `refresh`, `html`, `hover_xy`, `drag_xy`, `wait`

How It's Different

Unlike Playwright MCP or browser tool ARIA snapshots, tappi's MCP server:

Connects to your existing Chrome — all sessions, cookies, extensions carry over
Pierces shadow DOM — Gmail, Reddit, GitHub all work natively
Returns compact indexed output — [3] (button) Submit instead of a 50K-token accessibility tree
Uses 3-10x fewer tokens per interaction
No headless browser — runs in your real Chrome, invisible to bot detection

Prerequisites

Start Chrome with remote debugging enabled:

# Option 1: tappi launch (manages profiles for you)
tappi launch

# Option 2: Manual
google-chrome --remote-debugging-port=9222

Set CDP_URL in your MCP config to point to your Chrome instance (default: http://127.0.0.1:9222).

FAQ

Q: What's the difference between bpy agent and bpy commands? bpy agent talks to an LLM that decides what to do. bpy click 3 directly executes a browser command. Use agent mode for complex multi-step tasks; use direct commands for scripting.

Q: Can I use my Claude Max subscription instead of paying per-API-call? Yes. Choose "Claude Max (OAuth)" during bpy setup and paste your OAuth token (sk-ant-oat01-...). Same token Claude Code uses.

Q: Do I need to log in every time? No. Log in once during your first bpy launch. Sessions persist in the profile directory.

Q: What browsers are supported? Chrome, Chromium, Brave, Microsoft Edge — anything Chromium-based with CDP support.

Q: Does it work headless? Yes. bpy launch --headless runs without a visible window. Log in with a visible window first to set up sessions.

Q: Is my data safe? File operations are sandboxed to your workspace directory. The agent cannot access files outside it. Shell access can be disabled. API keys are stored locally in ~/.tappi/config.json.

Q: How is this different from Selenium/Playwright?

	tappi	Selenium	Playwright
Session reuse	✅	❌	Partial
AI agent	✅	❌	❌
Shadow DOM	✅	❌	❌
Dependencies	1 (core)	Heavy	Heavy
Install size	~100KB	~50MB	~200MB+

Architecture

tappi/
├── tappi/
│   ├── core.py                 # CDP engine (Phase 1)
│   ├── cli.py                  # bpy CLI
│   ├── profiles.py             # Named profile management
│   ├── js_expressions.py       # Injected JS for element scanning
│   ├── agent/
│   │   ├── loop.py             # Agentic while-loop (LiteLLM)
│   │   ├── config.py           # Provider/workspace/model config
│   │   ├── setup.py            # Interactive setup wizard
│   │   └── tools/
│   │       ├── browser.py      # Browser tool (wraps core.py)
│   │       ├── files.py        # Sandboxed file ops
│   │       ├── pdf.py          # PDF read (PyMuPDF) + create (WeasyPrint)
│   │       ├── spreadsheet.py  # CSV + Excel (openpyxl)
│   │       ├── shell.py        # Sandboxed shell execution
│   │       └── cron.py         # APScheduler cron jobs
│   └── server/
│       └── app.py              # FastAPI web UI + API
└── pyproject.toml

License

MIT

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.9.0

Feb 21, 2026

0.8.3

Feb 21, 2026

0.8.2

Feb 21, 2026

0.8.1

Feb 21, 2026

0.8.0

Feb 21, 2026

0.7.5

Feb 21, 2026

0.7.4

Feb 21, 2026

0.7.3

Feb 21, 2026

0.7.2

Feb 21, 2026

0.7.1

Feb 21, 2026

0.7.0

Feb 21, 2026

0.6.6

Feb 21, 2026

0.6.5

Feb 21, 2026

0.6.4

Feb 21, 2026

0.6.3

Feb 20, 2026

0.6.2

Feb 20, 2026

0.6.1

Feb 20, 2026

0.6.0

Feb 20, 2026

0.5.5

Feb 20, 2026

0.5.4

Feb 20, 2026

0.5.3

Feb 20, 2026

This version

0.5.2

Feb 20, 2026

0.5.1

Feb 20, 2026

0.5.0

Feb 20, 2026

0.4.1

Feb 19, 2026

0.4.0

Feb 19, 2026

0.3.0

Feb 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tappi-0.5.2.tar.gz (3.6 MB view details)

Uploaded Feb 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

tappi-0.5.2-py3-none-any.whl (114.4 kB view details)

Uploaded Feb 20, 2026 Python 3

File details

Details for the file tappi-0.5.2.tar.gz.

File metadata

Download URL: tappi-0.5.2.tar.gz
Upload date: Feb 20, 2026
Size: 3.6 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for tappi-0.5.2.tar.gz
Algorithm	Hash digest
SHA256	`5ad607ca547ad91fc5ff36e0cc0643822bdd874ff90016007bbcaa915c79afe0`
MD5	`36b11d96e58c9f6f5fba7bb8e93afcc9`
BLAKE2b-256	`ca9d56b5126f94299452ee49d63183330c0a9480d3945078e87155a4a9087519`

See more details on using hashes here.

File details

Details for the file tappi-0.5.2-py3-none-any.whl.

File metadata

Download URL: tappi-0.5.2-py3-none-any.whl
Upload date: Feb 20, 2026
Size: 114.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for tappi-0.5.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ae91b50086431a8614d6b2f3938b97a0090a32b43f4e6daf104442435a877f4d`
MD5	`6f945c9ea1e5088a80d5ff4296af1431`
BLAKE2b-256	`0f319a7a1d395757799da94e240979de569db8d762019dfd010c00b77c910da6`

See more details on using hashes here.

tappi 0.5.2

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

tappi

Why tappi?

Table of Contents

Quick Start

AI Agent Mode

Setup

Providers

Claude Max (OAuth) — Use Your Subscription

CLI Usage

Interactive mode

One-shot mode

Tools

How the Agent Loop Works

Cron (Scheduled Tasks)

Web UI

💬 Chat

🌍 Browser Profiles

⏰ Scheduled Jobs

⚙️ Settings

Tutorial: Your First Automation

Step 1: Launch the browser

Step 2: Control it

How It Works

The connection

Real mouse events

Shadow DOM piercing

Framework-aware typing

Using as a Python Library

Profile management

Agent as a library

CLI Reference

Agent Commands

Browser Commands

Navigation

Interaction

Content

Other

Profiles

Shadow DOM Support

Environment Variables

MCP Server

Claude Desktop — One-Click Install (.mcpb)

Manual Setup (pip)

Cursor / Windsurf

OpenClaw

HTTP/SSE Transport

Available Tools

How It's Different

Prerequisites

FAQ

Architecture

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes