Browser-based human-in-the-loop UIs for AI coding agents

OpenWebGoggles

AI coding agents are good at writing code. They are not good at showing you things. An agent can generate a 200-line diff, but it has no way to pull up a side-by-side review UI, highlight the parts that matter, and wait for you to say "approved" or "try again with fewer abstractions."

OpenWebGoggles fixes that. It gives any agent — Claude Code, a shell script, anything that can write JSON — the ability to open a browser-based UI and get structured decisions back from a human.

Not a chat interface. Not a terminal dump. A real interactive panel: forms, approval flows, dashboards, multi-step wizards. The kind of thing you'd build if you had a few days and a frontend team. Except the agent builds it on the fly from a JSON schema, and the whole round-trip takes seconds.

Agent ←→ OpenWebGoggles Server ←→ Browser UI ←→ Human

"The goggles — they do everything."

The Big Use Case: Review Before You Commit

Your agent just finished a round of work — refactored the auth module, updated three API endpoints, added tests. Before it commits, it opens a review UI in your browser that groups the changes by category and shows you what changed and why:

{
  "title": "Pre-Commit Review",
  "message": "### 3 categories of changes across 8 files\nReview each category and approve or request changes.",
  "message_format": "markdown",
  "status": "pending_review",
  "data": {
    "sections": [
      { "type": "text", "title": "Auth Refactor (4 files)", "format": "markdown",
        "content": "Replaced session-based auth with JWT tokens.\n\n- `auth.py` — new `create_token()` / `verify_token()` functions\n- `middleware.py` — swapped session lookup for token validation\n- `login.py` / `logout.py` — updated to issue/revoke JWTs" },
      { "type": "text", "title": "API Updates (2 files)", "format": "markdown",
        "content": "Updated `/users` and `/settings` to use new auth middleware.\n\n- Response codes unchanged, no breaking changes" },
      { "type": "text", "title": "Tests (2 files)", "format": "markdown",
        "content": "Added 14 tests for token lifecycle. All passing." },
      { "type": "form", "fields": [
        { "key": "feedback", "label": "Notes (optional)", "type": "textarea",
          "placeholder": "Anything you want changed before committing?" }
      ]}
    ]
  },
  "actions_requested": [
    { "id": "approve", "type": "approve", "label": "Commit & Push" },
    { "id": "revise", "type": "reject", "label": "Request Changes" }
  ]
}

You see the summary, scan the categories, and either approve or type "the logout endpoint should also clear the cookie" and hit Request Changes. The agent gets your structured response and acts on it. No scrolling through git diff in a terminal.

This pattern works as a pre-commit checkpoint, a PR summary review, or any time an agent wants sign-off before taking an irreversible action.
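To make the round-trip concrete, here is a minimal sketch of the agent side, assuming the blocking `webview` tool takes this state and returns the actions payload shown later. The helper names (`build_precommit_state`, `handle_response`) are ours for illustration, not part of the package:

```python
def build_precommit_state(categories, file_count):
    """Assemble the pending_review state shown above: one text section
    per category of changes, plus an optional feedback form."""
    sections = [
        {"type": "text", "title": f"{name} ({n} files)", "format": "markdown",
         "content": body}
        for name, n, body in categories
    ]
    sections.append({"type": "form", "fields": [
        {"key": "feedback", "label": "Notes (optional)", "type": "textarea",
         "placeholder": "Anything you want changed before committing?"},
    ]})
    return {
        "title": "Pre-Commit Review",
        "message": f"### {len(categories)} categories of changes across {file_count} files",
        "message_format": "markdown",
        "status": "pending_review",
        "data": {"sections": sections},
        "actions_requested": [
            {"id": "approve", "type": "approve", "label": "Commit & Push"},
            {"id": "revise", "type": "reject", "label": "Request Changes"},
        ],
    }


def handle_response(response):
    """Branch on the structured action the human sent back."""
    action = response["actions"][0]
    if action["type"] == "approve":
        return "commit"
    feedback = (action.get("value") or {}).get("feedback", "")
    return f"revise: {feedback}"
```

An agent would pass `build_precommit_state(...)` to `webview`, block until the human decides, and feed the result to `handle_response`.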

More Examples

Here's another example — a security audit where the agent has 12 findings to triage. Without OpenWebGoggles, it dumps them into the terminal and asks you to type approve or reject twelve times. With OpenWebGoggles, it opens a tabbed wizard in your browser — one finding per screen, editable severity dropdowns, analyst notes, a progress bar — and reads back your structured decisions when you're done.

The agent doesn't need to know HTML. It writes a JSON object describing what it wants to show, and the built-in dynamic renderer handles the rest:

{
  "title": "Security Finding 1 of 12",
  "status": "waiting_input",
  "data": {
    "sections": [
      { "type": "text", "content": "**SQL Injection** in `/api/users` endpoint" },
      { "type": "form", "fields": [
        { "key": "severity", "label": "Severity", "type": "select",
          "options": ["critical", "high", "medium", "low"], "value": "high" },
        { "key": "notes", "label": "Analyst Notes", "type": "textarea" }
      ]}
    ]
  },
  "actions_requested": [
    { "id": "confirm", "type": "approve", "label": "Confirmed" },
    { "id": "fp", "type": "reject", "label": "False Positive" }
  ]
}

The agent gets back:

{
  "actions": [{
    "action_id": "confirm",
    "type": "approve",
    "value": { "severity": "critical", "notes": "Escalated — no parameterized queries anywhere in this module." }
  }]
}

Structured data in, structured data out. The browser is just the rendering layer in between.
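The twelve-finding wizard from above reduces to a loop. A sketch, with the blocking `webview` call abstracted behind a `show` callable so the flow is testable (the function and its parameters are illustrative, not part of the package):

```python
def triage_findings(findings, show):
    """Walk findings one per screen. `show` stands in for the blocking
    webview(state) call and must return an actions payload like the one above."""
    decisions = []
    total = len(findings)
    for i, finding in enumerate(findings, 1):
        state = {
            "title": f"Security Finding {i} of {total}",
            "status": "waiting_input",
            "data": {"sections": [
                {"type": "text", "content": finding["summary"]},
                {"type": "form", "fields": [
                    {"key": "severity", "label": "Severity", "type": "select",
                     "options": ["critical", "high", "medium", "low"],
                     "value": finding["severity"]},
                    {"key": "notes", "label": "Analyst Notes", "type": "textarea"},
                ]},
            ]},
            "actions_requested": [
                {"id": "confirm", "type": "approve", "label": "Confirmed"},
                {"id": "fp", "type": "reject", "label": "False Positive"},
            ],
        }
        # One blocking round-trip per finding; collect the structured decision.
        decisions.append(show(state)["actions"][0])
    return decisions
```

Each iteration is one human decision; the agent ends up with a list of structured verdicts instead of twelve lines of terminal input.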

Quick Start

Install from PyPI:

# Recommended — isolates dependencies, puts binary on PATH
pipx install openwebgoggles

# Alternative — works fine, but shares dependencies with your Python environment
pip install openwebgoggles

Don't have pipx?

pipx installs Python CLI tools in isolated environments. Install it once:

# macOS (Homebrew)
brew install pipx && pipx ensurepath

# Linux / macOS (without Homebrew)
python3 -m pip install --user pipx && pipx ensurepath

# Then restart your terminal

See pipx.pypa.io for more options.

Then bootstrap for your editor:

Claude Code

cd your-project
openwebgoggles init claude

Creates .mcp.json and .claude/settings.json in your project. Restart Claude Code and you're live. Run this in each project where you want the tools available.

OpenCode

openwebgoggles init opencode

Adds to ~/.config/opencode/opencode.json (global config) — available in every project. Restart OpenCode and you're live.

To set up a specific project instead: openwebgoggles init opencode /path/to/project

Try It

Tell your agent:

"Show me a review UI for these changes and wait for my approval."

"Create a dashboard showing the build progress."

"Walk me through these security findings one at a time with severity dropdowns."

The agent figures out the JSON schema, calls webview, and a panel opens in your browser. You make your decisions, click approve, and the agent continues with your structured response.

What Gets Installed

Five MCP tools — that's the entire API surface:

  • webview(state) — Show a UI and block until the human responds
  • webview_update(state) — Push live updates without blocking (progress, logs, status)
  • webview_read() — Poll for actions without blocking
  • webview_status() — Check if a session is active
  • webview_close() — Close the session

Manual Setup

The init command is recommended — it resolves the absolute path to the binary so editors don't depend on PATH. But if you'd rather configure by hand, use the full path to the binary in your project's .mcp.json:

{
  "mcpServers": {
    "openwebgoggles": {
      "command": "/full/path/to/openwebgoggles"
    }
  }
}

Or for OpenCode, add to opencode.json:

{
  "mcp": {
    "openwebgoggles": {
      "type": "local",
      "command": ["/full/path/to/openwebgoggles"],
      "enabled": true
    }
  }
}

Tip: Find your binary path with which openwebgoggles or pipx list.

Bash Scripts (for shell-based agents)

If your agent orchestrates via shell scripts — or if you just want to understand the mechanics — the bash interface exposes the same capabilities:

# Start a session
bash scripts/start_webview.sh --app dynamic

# Push state to the browser
bash scripts/write_state.sh '{"version":1, "status":"pending_review", "title":"Review Changes", ...}'

# Block until the human responds (up to 5 minutes)
ACTIONS=$(bash scripts/wait_for_action.sh --timeout 300)

# Clean up
bash scripts/stop_webview.sh

  • start_webview.sh --app <name> [--port N] — Launch server and open browser
  • write_state.sh '<json>' — Atomic state write
  • wait_for_action.sh [--timeout N] — Block until human acts
  • read_actions.sh [--clear] — Read actions, optionally clear
  • stop_webview.sh — Graceful shutdown
  • init_webview_app.sh <name> — Scaffold a custom app

How It Works Under the Hood

The architecture is deliberately simple. Three JSON files in a .openwebgoggles/ directory are the entire interface between the agent and the browser.

  • state.json (Agent → Browser) — What to show: data, UI schema, requested actions
  • actions.json (Browser → Agent) — What the human decided
  • manifest.json (shared) — Session config: ports, app name, auth token

The Python server watches these files and pushes updates to the browser over WebSocket in real time. The browser renders the UI and writes responses back. The agent reads the response file and continues.

This means you can debug the entire system by looking at three JSON files. No hidden state, no message queues, no databases. If something looks wrong in the browser, cat .openwebgoggles/state.json and you'll see exactly what the agent sent.

The Dynamic Renderer

Most use cases don't require custom HTML. The built-in dynamic app takes a JSON schema and renders a complete, styled interface.

Section types: text, items, form, actions, progress, log, diff, table, tabs

Form field types: text, textarea, number, select, checkbox, email, url, static

Action styles: primary, success, danger, warning, ghost, approve, reject, submit, delete

You can combine these to build approval flows, configuration wizards, data entry forms, triage interfaces — really any structured interaction that runs on fields, selections, and decisions. For 80% of use cases, you never touch HTML.
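For instance, a single state can mix a diff, a table, a form, and actions. This is illustrative only: the exact field names for `diff` and `table` sections used here (`content`, `columns`, `rows`) are assumptions, so check the Data Contract for the real shapes:

```python
# Hypothetical combined state: diff + table + form + actions in one screen.
# Section-specific field names below are guesses; see the Data Contract.
review_state = {
    "title": "Test Run Review",
    "status": "pending_review",
    "data": {"sections": [
        {"type": "diff", "title": "auth.py",
         "content": "--- a/auth.py\n+++ b/auth.py\n@@ -1,2 +1,2 @@\n-old\n+new"},
        {"type": "table", "title": "Test Results",
         "columns": ["test", "status"],
         "rows": [["test_login", "pass"], ["test_logout", "pass"]]},
        {"type": "form", "fields": [
            {"key": "notes", "label": "Notes", "type": "textarea"}]},
    ]},
    "actions_requested": [
        {"id": "ship", "type": "approve", "label": "Ship It"},
        {"id": "hold", "type": "reject", "label": "Hold"},
    ],
}
```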

Rich Section Types

Beyond basic forms and text, the renderer supports content types purpose-built for developer workflows:

  • progress — Task checklist with status icons and percentage bar. Pair with webview_update(merge=True) to stream live progress as your agent works.
  • log — Scrolling terminal output with ANSI color support. Great for build output, test results, or any streaming text.
  • diff — Unified diff viewer with line numbers, green/red coloring, and hunk headers. Show code changes without forcing the human to read raw patches.
  • table — Sortable data table with optional row selection. Good for test results, dependency lists, or any tabular data.
  • tabs — Client-side tabbed panels. Nest any other section types inside each tab. No server round-trip on tab switch.

Live Updates

webview_update() pushes state changes to the browser without blocking. The agent can continue working while the UI updates in real time:

webview_update({"status": "processing", "message": "Running tests..."}, merge=True)

Use merge=True to update specific fields without replacing the entire state. Or use presets for common patterns:

webview_update({"tasks": [...], "percentage": 75}, preset="progress")
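As an illustration, an agent working through a task list might emit one progress patch per finished task. A sketch of that sequence using the `tasks`/`percentage` keys from the preset above (the generator itself is ours, not part of the package):

```python
def progress_updates(task_names):
    """Yield the sequence of merge patches an agent could push,
    one per completed task, for the "progress" preset."""
    total = len(task_names)
    for done in range(1, total + 1):
        yield {
            "tasks": [
                {"label": name, "status": "done" if i < done else "pending"}
                for i, name in enumerate(task_names)
            ],
            "percentage": int(100 * done / total),
        }
```

Each yielded dict would go to `webview_update(patch, preset="progress")` while the agent keeps working.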

Field Validation

Fields support client-side validation that blocks form submission until resolved:

{"key": "email", "type": "email", "label": "Email", "required": true,
 "pattern": "^[^@]+@[^@]+$", "errorMessage": "Enter a valid email"}

Available validators: required, pattern (regex), minLength, maxLength.

Conditional Fields

Show or hide fields based on other field values using behaviors:

{
  "data": {"sections": [...]},
  "behaviors": [
    {"when": {"field": "type", "equals": "custom"}, "show": ["custom_name"]},
    {"when": {"field": "confirm", "checked": true}, "enable": ["submit"]}
  ]
}

Conditions: equals, notEquals, in, notIn, checked, unchecked, empty, notEmpty, matches.

Multi-Panel Layouts

Use layout + panels for side-by-side content:

{
  "layout": {"type": "sidebar", "sidebarWidth": "280px"},
  "panels": {
    "sidebar": {"sections": [{"type": "items", "items": [...]}]},
    "main": {"sections": [{"type": "text", "content": "..."}]}
  }
}

Layout types: sidebar (main + nav), split (equal columns). Both collapse to single-column on mobile.

Custom Apps

When the dynamic renderer isn't enough — complex visualizations, custom layouts, domain-specific interactions — you can build a custom app:

bash scripts/init_webview_app.sh my-dashboard

This scaffolds index.html, app.js, and style.css with the SDK already wired up. The client SDK is vanilla JavaScript with zero dependencies:

const wv = new OpenWebGoggles();
await wv.connect();

// Listen for state updates from the agent
wv.onStateUpdate((state) => {
  // Render however you want
});

// Send structured responses back
await wv.approve("action-id", { comment: "Looks good" });
await wv.reject("action-id");
await wv.submitInput("field-id", "user input");
await wv.sendAction("custom-id", "custom", { any: "data" });

Two working examples are included in examples/:

  • approval-review — Code review UI with unified diffs, per-file toggles, approve/reject with comments
  • security-qa — Step-by-step security findings triage with editable fields, severity dropdowns, and a progress bar

These aren't toy demos. They're functional interfaces that handle real workflows. Start by reading their source if you're building something custom.

Patterns That Work Well

Single approval. Agent shows a summary, human clicks approve or reject. The simplest case, and probably the most common.

Pre-commit change review. The killer use case, described in detail above: the agent finishes a round of changes and, before committing, opens a review UI that summarizes what changed and why, grouped by category. You see the diffs and the rationale, then approve or send questions right in the browser. Far faster than scrolling through a terminal dump of git diff, and the agent gets structured feedback back if something needs to change.

Multi-step wizard. For N items that need review, show one at a time. The agent calls webview in a loop, advancing to the next item after each response. This avoids overwhelming the user with a wall of decisions.

Live dashboard. Agent calls webview to display initial state, then uses webview_update(merge=True) to stream progress, logs, and status changes in real time. The human sees a live-updating UI and can act when ready. Pair with progress and log sections for build pipelines, test runs, or deployment workflows.

Batch triage. Show all items at once with per-item actions — tabs, cards, or a list with inline controls. Works well when the total count is under 10 or so.
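A sketch of such a batch state, using a `tabs` section with one tab per item and keyed form fields for per-item decisions (the tab field names here are assumptions; check the Data Contract, and the helper is ours):

```python
def batch_triage_state(items):
    """All items on screen at once: one tab per item, with a keyed
    decision field so responses can be matched back to items."""
    tabs = []
    for i, item in enumerate(items):
        tabs.append({
            "label": item["name"],  # tab field names are assumptions
            "sections": [
                {"type": "text", "content": item["summary"]},
                {"type": "form", "fields": [
                    {"key": f"decision_{i}", "label": "Decision", "type": "select",
                     "options": ["keep", "fix", "drop"]},
                ]},
            ],
        })
    return {
        "title": f"Triage {len(items)} items",
        "status": "waiting_input",
        "data": {"sections": [{"type": "tabs", "tabs": tabs}]},
        "actions_requested": [
            {"id": "submit", "type": "submit", "label": "Submit All"},
        ],
    }
```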

Security

The trust model is straightforward: the agent and the browser are on the same machine, and nobody else should be able to read or tamper with the communication between them.

Nine defense layers enforce this, all enabled by default:

  • Localhost-only binding — the server only listens on 127.0.0.1
  • Bearer token auth — 32-byte session token, constant-time comparison
  • WebSocket first-message auth — token verified before any data flows
  • Ed25519 signatures — server signs every state update (cryptographic proof of origin)
  • HMAC-SHA256 — browser signs every action (tamper detection)
  • Nonce replay prevention — each action can only be submitted once
  • Content Security Policy — per-request nonce blocks inline script injection
  • SecurityGate — 22 XSS patterns, zero-width character detection, schema validation
  • Rate limiting — 30 actions per minute per session

All cryptographic keys are ephemeral — generated in memory at session start, zeroed on shutdown, never written to disk in plaintext. The test suite covers OWASP Top 10, MITRE ATT&CK techniques, and LLM-specific attack vectors across 748 tests.
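Two of those layers are easy to illustrate with the standard library. This is not the package's code, just the underlying techniques: constant-time token comparison via `hmac.compare_digest`, and a once-only nonce set for replay prevention:

```python
import hmac
import secrets


def make_token():
    """32-byte session token, like the bearer-token layer above."""
    return secrets.token_bytes(32)


def token_ok(expected, presented):
    """Constant-time comparison: response time doesn't leak how much
    of the token matched, unlike a plain == on bytes."""
    return hmac.compare_digest(expected, presented)


class NonceGuard:
    """Replay prevention in miniature: each nonce is accepted exactly once."""

    def __init__(self):
        self._seen = set()

    def accept(self, nonce):
        if nonce in self._seen:
            return False  # replayed action — reject
        self._seen.add(nonce)
        return True
```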

The tradeoff is real, though: this level of defense adds complexity to the codebase. If you want to understand what each layer actually does, the security tests are the best documentation.

Development

# Run the full test suite
python -m pytest -v

# Lint
ruff check scripts/

Python 3.11+ required. Core dependencies: websockets, PyNaCl, mcp.

Reference Documentation

For the full details:

  • Data Contract — JSON file formats, state lifecycle, status values
  • SDK API — Complete client SDK reference
  • Integration Guide — Step-by-step patterns for connecting from other tools

License

Apache License 2.0 — see LICENSE.


Built by Techtoboggan.
