frontend-visualqa
Gives coding agents eyes for frontend work — visual QA and verification powered by Yutori n1.
What it does
- Verifies explicit visual claims against a running localhost frontend
- Captures screenshots for quick visual inspection
- Reuses browser sessions across MCP tool calls for multi-step debugging
- Works as a CLI (`frontend-visualqa verify`), MCP server (`frontend-visualqa serve`), or agent skill (`/frontend-visualqa`)
Does not start your dev server. If the URL is unreachable, claims return not_testable.
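Because the tool does not start the dev server, a script that drives it may want to confirm the URL responds before spending a run on claims that would come back `not_testable`. A minimal sketch using only the Python standard library; `is_reachable` is a hypothetical helper, not part of frontend-visualqa:

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy: the target is expected to be a local dev server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, even with an error status."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example 404 or 500), so something is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing usable is listening.
        return False
```

A wrapper could call `is_reachable("http://localhost:3000")` and print a hint to start the dev server instead of invoking the verifier.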
Why visualqa?
Playwright MCP can click, type, and assert against the DOM — but it cannot see the page. It can run cleanly on the wrong page, assert modal.isVisible() on a modal rendered off-screen, or miss a layout that broke on mobile.
n1 is a pixels-to-actions model trained with RL on live websites. Two capabilities matter here:
- Self-correcting navigation: Point the agent at the product catalog instead of a specific product page and n1 recognizes the wrong page, clicks through to the right one, and reports `trace.wrong_page_recovered: true`. Playwright MCP would run assertions on the wrong page and silently pass: garbage in, garbage out.
  Figure: n1 lands on the product catalog → navigates to the correct product page
- Rich visual evaluation: On the cart page, both items show sale prices ($149.99 and $79.99) but n1 caught that the subtotal of $279.98 uses the original prices, so the discount was never applied. On the API dashboard, the quota label reads "100%" but the progress bar is visibly only two-thirds full. Playwright MCP would pass both, because the DOM text is consistent and the progress bar width is just a CSS value.
  Figures: n1 catches the discount-not-applied bug; the label says 100% but the bar is only about two-thirds full
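The arithmetic behind the cart example can be checked by hand. A small sketch using the prices quoted above (`Decimal` avoids float rounding on currency values):

```python
from decimal import Decimal

sale_prices = [Decimal("149.99"), Decimal("79.99")]  # prices shown on the cart page
displayed_subtotal = Decimal("279.98")                # subtotal rendered on the page

expected_subtotal = sum(sale_prices, Decimal("0"))
discrepancy = displayed_subtotal - expected_subtotal

print(expected_subtotal)  # 229.98
print(discrepancy)        # 50.00
```

The $50.00 gap is the discount that the page displays per item but never applies to the subtotal.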
Known limitation
- Native `<select>` dropdowns: n1 cannot see or interact with native HTML `<select>` dropdown options because they render as OS-level widgets outside the browser viewport. If your page uses native selects, replace them with custom in-browser dropdown components for visual testing, or pre-fill the selection via URL parameters.
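Pre-filling via URL parameters only helps if the page reads its selection from the query string. A sketch of building such a URL with the standard library; the `/checkout` path and the `country` parameter are hypothetical placeholders for whatever your app actually supports:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query(url: str, **params: str) -> str:
    """Return `url` with `params` merged into its existing query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_query("http://localhost:3000/checkout", country="US"))
# http://localhost:3000/checkout?country=US
```

The resulting URL can be passed to `frontend-visualqa verify` in place of the bare page URL.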
Install
Prerequisites
Install uv if you don't already have it:
curl -LsSf https://astral.sh/uv/install.sh | sh
Quick install (recommended)
- Install CLIs:
uv tool install frontend-visualqa \
  --with-executables-from yutori \
  --with-executables-from playwright
playwright install chromium
This installs the `frontend-visualqa`, `yutori`, and `playwright` CLIs and downloads the Chromium browser binary.
- Log into Yutori API:
yutori auth login
This opens your browser to save your Yutori API key to `~/.yutori/config.json`.
Or, manually add your API key:
Go to platform.yutori.com and add your key to the config file:
mkdir -p ~/.yutori
cat > ~/.yutori/config.json << 'EOF'
{"api_key": "yt-your-api-key"}
EOF
- Register the MCP server using add-mcp (works with all clients):
npx add-mcp -g -n frontend-visualqa "frontend-visualqa serve"
Pick the clients you want to configure.
- Install skills using skills.sh:
npx skills add yutori-ai/frontend-visualqa -g
Adds the `/frontend-visualqa` slash command for claim-based visual QA guidance. `-g` installs at user scope. Omit `-g` for project-local install.
- Restart your agent (Codex, Claude Code, etc.) so the installs are picked up.
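If a run later fails on authentication, it helps to confirm that the key file from the login step is in place. A minimal sketch that assumes the config file has the `{"api_key": ...}` shape shown in the manual step; `parse_api_key` and `load_api_key` are hypothetical helpers, not part of the `yutori` CLI:

```python
import json
from pathlib import Path
from typing import Optional

DEFAULT_CONFIG = Path.home() / ".yutori" / "config.json"


def parse_api_key(text: str) -> Optional[str]:
    """Extract a non-empty "api_key" string from JSON text, or return None."""
    try:
        data = json.loads(text)
    except ValueError:
        return None
    key = data.get("api_key") if isinstance(data, dict) else None
    return key if isinstance(key, str) and key else None


def load_api_key(path: Path = DEFAULT_CONFIG) -> Optional[str]:
    """Read the config file; return None if it is missing, unreadable, or malformed."""
    try:
        return parse_api_key(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None


if load_api_key() is None:
    print("No API key found; run `yutori auth login` first.")
```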
To uninstall later:
uv tool uninstall frontend-visualqa
npx skills remove -g frontend-visualqa
`add-mcp` has no remove command. Delete the `frontend-visualqa` entry from your client's MCP config (e.g. `~/.mcp.json`).
Manual per-client setup
Claude Code
Plugin (recommended) — installs MCP tools + skill together:
/plugin marketplace add yutori-ai/frontend-visualqa
/plugin install frontend-visualqa@frontend-visualqa-plugins
MCP only (if you prefer not to use the plugin):
claude mcp add --scope user frontend-visualqa -- frontend-visualqa serve
Codex
codex mcp add frontend-visualqa -- frontend-visualqa serve
Skills can be installed via npx skills add above, or with $skill-installer inside Codex:
$skill-installer install https://github.com/yutori-ai/frontend-visualqa/tree/main/.agents/skills/frontend-visualqa
Cursor / VS Code / other MCP hosts
Use the checked-in .mcp.json, or point your client at frontend-visualqa serve.
From source
uv sync
uv run playwright install chromium
Register the MCP server with your client using uvx --from /absolute/path/to/frontend-visualqa frontend-visualqa serve as the command.
Uninstall
Claude Code plugin
/plugin uninstall frontend-visualqa@frontend-visualqa-plugins -s user
Codex
Remove the MCP server entry from ~/.codex/config.toml, then delete the skill directory:
rm -rf ~/.agents/skills/frontend-visualqa
Restart Codex after removing.
Examples
The repo includes demo pages you can use immediately — no dev server required:
# From the repo root, serve the included demo pages
cd /path/to/frontend-visualqa
lsof -ti:8000 | xargs kill 2>/dev/null; python3 -m http.server 8000 -d examples &
Self-correcting navigation — start on the wrong page and watch n1 find its way:
# n1 lands on the product catalog, clicks through to find the product detail page
# The Yutori cursor leads each action with visual feedback
frontend-visualqa verify http://localhost:8000/ecommerce_store.html \
--headed \
--claims 'The product detail page shows Wireless Headphones Pro priced at $149.99'
Catching regressions — mix passing and failing claims:
frontend-visualqa verify http://localhost:8000/analytics_dashboard.html \
--headed \
--claims \
'The API status indicator shows Active' \
'The monthly quota progress bar is completely filled'
# → first claim passes, second fails (label says 100% but bar is ~65% full)
Catching pricing bugs — verify that discounts are actually applied:
frontend-visualqa verify 'http://localhost:8000/ecommerce_store.html#/cart' \
--headed \
--claims 'The cart subtotal is correct'
# → fails: n1 sums the sale prices ($229.98) and catches the $279.98 subtotal
Use against your own frontend the same way — just swap the URL:
frontend-visualqa screenshot http://localhost:3000
frontend-visualqa verify http://localhost:3000/dashboard \
--claims 'The revenue chart is visible without scrolling'
More examples
Navigation hint for claims that require interaction:
frontend-visualqa verify http://localhost:8000/ecommerce_store.html \
--claims 'The cart badge shows 3 items' \
--navigation-hint "Click 'Add to Cart' on the Mechanical Keyboard K7 product card."
Autonomous form filling — n1 picks a date and catches a timezone bug:
frontend-visualqa verify 'http://localhost:8000/booking_form.html#step3' \
--claims 'The date on the confirmation page matches the date selected on the calendar'
# → fails: n1 picks a date, books the slot, and catches the off-by-one on the confirmation page
Scrolling to find off-screen content:
frontend-visualqa verify http://localhost:8000/analytics_dashboard.html \
--claims 'The /api/v1/webhooks endpoint returned a 200 OK status'
# → fails: n1 scrolls to the request table and finds a 500 Error
MCP tools
| Tool | Description |
|---|---|
| `verify_visual_claims` | Structured pass/fail visual checks with screenshot evidence |
| `take_screenshot` | Capture current page state |
| `manage_browser` | Inspect, reset, close, or resize the shared browser session |
Recommended agent workflow
- Ensure the local frontend is running
- `take_screenshot` to confirm page state
- Write 1–5 concrete visual claims
- `verify_visual_claims`
- Fix code, rerun claims until they pass
CLI reference
frontend-visualqa <command> [options]
| Command | Description |
|---|---|
| `verify` | Verify visual claims against a URL |
| `screenshot` | Capture a screenshot |
| `login` | Open a headed browser to log in and save the session |
| `serve` | Start the MCP stdio server |
| `status` | Show browser status as JSON |
verify options
frontend-visualqa verify <url> --claims 'claim1' 'claim2' [options]
| Flag | Default | Description |
|---|---|---|
| `--claims` | (required) | One or more visual claims |
| `--navigation-hint` | | Interaction guidance before judging |
| `--width` / `--height` | 1280 / 800 | Viewport size |
| `--device-scale-factor` | 1.0 | DPR |
| `--headed` | off | Show the browser (implies `--visualize`) |
| `--visualize` / `--no-visualize` | on when headed | Show in-browser action overlay (cursor, click pulses, scroll dots, status chip) |
| `--browser-mode` | `ephemeral` | `ephemeral` or `persistent` |
| `--user-data-dir` | | Custom profile directory |
| `--session-key` | `default` | Named browser session |
| `--max-steps-per-claim` | 12 | Max actions per claim |
| `--claim-timeout-seconds` | 120 | Per-claim timeout |
| `--run-timeout-seconds` | 300 | Whole-run timeout |
| `--reporter` | `native` | Output reporter (`native`, `ctrf`). Repeat for multiple. |
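When a script or CI job drives the CLI, the documented flags can be assembled programmatically. A sketch that uses only flags from the table above; `build_verify_command` and `run_verify` are hypothetical helpers, and the argument list is meant to be run without a shell so claims need no extra quoting:

```python
import subprocess
from typing import List, Optional, Sequence


def build_verify_command(
    url: str,
    claims: Sequence[str],
    navigation_hint: Optional[str] = None,
    width: int = 1280,
    height: int = 800,
    headed: bool = False,
) -> List[str]:
    """Build the argv for `frontend-visualqa verify` from documented options."""
    if not claims:
        raise ValueError("--claims is required: pass at least one claim")
    cmd = ["frontend-visualqa", "verify", url, "--claims", *claims]
    if navigation_hint:
        cmd += ["--navigation-hint", navigation_hint]
    cmd += ["--width", str(width), "--height", str(height)]
    if headed:
        cmd.append("--headed")
    return cmd


def run_verify(cmd: List[str]) -> int:
    """Run the command and return its exit code (needs the CLI on PATH)."""
    return subprocess.run(cmd).returncode


cmd = build_verify_command(
    "http://localhost:3000/dashboard",
    ["The revenue chart is visible without scrolling"],
)
print(" ".join(cmd[:3]))  # frontend-visualqa verify http://localhost:3000/dashboard
```

The URL is placed before `--claims` to match the usage line, so it cannot be mistaken for one of the claims.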
Browser modes and visualization
| Mode | Flag | Cookies persist? | Use case |
|---|---|---|---|
| Ephemeral (default) | — | No | Public pages, CI |
| Persistent | `--browser-mode persistent` | Yes | Auth-gated local dev |
Persistent profile setup
Log in once, reuse for all future runs:
# 1. One-time login — opens a headed browser, log in, press Enter to save
frontend-visualqa login http://localhost:3000/login
# 2. Subsequent runs reuse the saved session
frontend-visualqa verify http://localhost:3000/dashboard \
--browser-mode persistent \
--claims 'The user avatar is visible in the header'
Profile stored at ~/.cache/frontend-visualqa/browser-profile/ by default. Override with --user-data-dir:
frontend-visualqa login http://localhost:3000/login \
--user-data-dir /tmp/my-project-profile
frontend-visualqa verify http://localhost:3000/dashboard \
--browser-mode persistent \
--user-data-dir /tmp/my-project-profile \
--claims 'The dashboard loads without a login redirect'
Action visualization
When running in headed mode (--headed), the browser shows visual effects illustrating what n1 is doing:
- cursor-led click, scroll, drag, and typing effects
- a compact thought card when a tool-using model turn includes reasoning text
- read-only feedback for `extract_elements`, `extract_content`, and `find`, with a scan effect and short preview panel
To disable it, use --no-visualize:
frontend-visualqa verify http://localhost:3000 \
--headed --no-visualize \
--claims 'The API status indicator shows Active'
The MCP tool verify_visual_claims accepts a per-call visualize parameter to control this independently of the server's default.
Overlay elements are automatically hidden during screenshot capture so they never appear in evidence sent to n1 or saved artifacts.
Writing good claims
Claims should be observable, scoped, and provable from pixels.
| Good | Weak |
|---|---|
| The cart total is $261.37 | The cart works correctly |
| The product price shows $149.99 in monospace font | The page looks polished |
| At 375px width, the stat cards stack in a single column | The dashboard is responsive |
If a claim requires interaction first, use --navigation-hint instead of encoding steps in the claim text.
Result statuses
| Status | Meaning |
|---|---|
| `passed` | Claim matched the visual evidence |
| `failed` | Claim was visually false |
| `inconclusive` | Runner explored but couldn't determine confidently |
| `not_testable` | Environment blocked verification (server down, auth wall) |
For the CLI, frontend-visualqa verify exits 0 only when every claim passes. It exits 1 if any claim is failed, inconclusive, or not_testable. Usage errors still exit with argparse's standard 2.
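The exit-code rule is easy to mirror in a wrapper that post-processes results. A sketch of the same logic in Python; `exit_code_for` is a hypothetical name, and the status strings are the four listed in the table above:

```python
from typing import Iterable

KNOWN_STATUSES = {"passed", "failed", "inconclusive", "not_testable"}


def exit_code_for(statuses: Iterable[str]) -> int:
    """Return 0 only when every claim passed, otherwise 1."""
    statuses = list(statuses)
    unknown = set(statuses) - KNOWN_STATUSES
    if unknown:
        raise ValueError(f"unexpected status values: {sorted(unknown)}")
    # Anything other than "passed" (failed, inconclusive, not_testable) fails the run.
    return 0 if all(s == "passed" for s in statuses) else 1


print(exit_code_for(["passed", "inconclusive"]))  # 1
```

Treating `inconclusive` and `not_testable` as failures means a CI job cannot go green just because the dev server was down.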
Reporters
Output format for persisted artifacts. Does not affect CLI stdout or MCP tool responses (always native JSON).
| Reporter | File | Description |
|---|---|---|
| `native` (default) | `run_result.json` | Full domain-specific schema with all fields |
| `ctrf` | `ctrf-report.json` | CTRF standard JSON for CI/CD integration |
Each claim result contains:
- `finding`: the verdict explanation (what was observed)
- `proof`: the decisive artifact paths, step number, and a compact extracted-text preview
- `page`: URL and viewport where the claim was evaluated
- `trace`: the execution trace, covering actions taken, rich events, screenshot paths, and the saved trace path
Example claim result
{
"claim": "The monthly quota progress bar is completely filled",
"status": "failed",
"finding": "The quota label reads '100%' and '12,500 / 12,500 requests used', but the progress bar is visually only about 65% filled — the bar and the label disagree.",
"proof": {
"screenshot_path": "artifacts/run-.../claim-02/step-04.webp",
"step": 4,
"after_action": "extract_elements()",
"text": "Monthly Quota\n12,500 / 12,500 requests used 100%\n...",
"text_path": "artifacts/run-.../claim-02/step-04.txt"
},
"page": {
"url": "http://localhost:8000/analytics_dashboard.html",
"viewport": { "width": 1280, "height": 800, "device_scale_factor": 1.0 }
},
"trace": {
"steps_taken": 4,
"wrong_page_recovered": false,
"screenshot_paths": ["..."],
"actions": ["..."],
"trace_path": "artifacts/run-.../claim-02/trace.json"
}
}
proof.screenshot_path points to the screenshot n1 was examining when it rendered the verdict.
proof.text is intentionally compact for token efficiency; if proof.text_path is present, open that file for the full extracted DOM/content readout.
trace.trace_path points to trace.json, which contains the full machine-readable event trace with reasoning and verdict metadata. Events are excluded from the JSON output by default to keep it compact; access them programmatically via result.trace.events or read trace.json directly.
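For triage, the fields described above can be pulled out of a claim result with a few lines of Python. A sketch that assumes a single claim result shaped like the example; how claim results are nested inside `run_result.json` is not shown here, so the hypothetical `summarize_claim` helper takes one result dict. The sample abbreviates the example's `finding` text:

```python
import json
from typing import Any, Dict


def summarize_claim(result: Dict[str, Any]) -> str:
    """Format one claim result as a short, human-readable summary."""
    proof = result.get("proof") or {}
    trace = result.get("trace") or {}
    lines = [
        f"[{result['status'].upper()}] {result['claim']}",
        f"  finding: {result.get('finding', '')}",
        f"  evidence: {proof.get('screenshot_path', 'n/a')} (step {proof.get('step', '?')})",
    ]
    if trace.get("wrong_page_recovered"):
        lines.append("  note: n1 started on the wrong page and recovered")
    return "\n".join(lines)


sample = json.loads("""
{
  "claim": "The monthly quota progress bar is completely filled",
  "status": "failed",
  "finding": "The bar and the label disagree.",
  "proof": {"screenshot_path": "artifacts/run-.../claim-02/step-04.webp", "step": 4},
  "trace": {"steps_taken": 4, "wrong_page_recovered": false}
}
""")
print(summarize_claim(sample))
```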
To persist both report formats in one run, repeat `--reporter`:
frontend-visualqa verify http://localhost:3000 \
--claims 'The checkout total matches the sum of line items' \
--reporter native --reporter ctrf
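In CI, the CTRF file can gate a build without parsing the native schema. A sketch that reads the `results.summary` counters defined by the CTRF format; how frontend-visualqa maps `inconclusive` and `not_testable` onto CTRF statuses is not documented here, so the check treats anything short of all tests passing as a failure. The sample report, including the tool name, is made up for illustration:

```python
import json
from typing import Any, Dict


def ctrf_all_passed(report: Dict[str, Any]) -> bool:
    """True only if the report has at least one test and all of them passed."""
    summary = report["results"]["summary"]
    return summary["tests"] > 0 and summary["passed"] == summary["tests"]


# Hand-written sample in CTRF shape, not real tool output.
sample_report = json.loads("""
{"results": {
  "tool": {"name": "frontend-visualqa"},
  "summary": {"tests": 2, "passed": 1, "failed": 1, "pending": 0,
              "skipped": 0, "other": 0, "start": 0, "stop": 0},
  "tests": [
    {"name": "claim 1", "status": "passed", "duration": 0},
    {"name": "claim 2", "status": "failed", "duration": 0}
  ]}}
""")
print(ctrf_all_passed(sample_report))  # False
```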
Development
uv sync
uv run playwright install chromium
uv run frontend-visualqa --help
Editable install:
uv pip install -e .