frontend-visualqa
Visual QA for local frontends — gives coding agents a pixel-level verification loop powered by Yutori n1.
What it does
- Verifies explicit visual claims against a running localhost frontend
- Captures screenshots for quick visual inspection
- Reuses browser sessions across MCP tool calls for multi-step debugging
- Works as a CLI (frontend-visualqa verify) or MCP server (frontend-visualqa serve)
Does not start your dev server. If the URL is unreachable, claims return not_testable.
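Because the tool does not start the dev server, it can help to confirm that the URL answers before running claims that would otherwise come back as not_testable. The following is a minimal sketch using only the Python standard library. The helper name is_reachable and the 2-second timeout are illustrative assumptions, not part of frontend-visualqa.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the server at `url` answers with any HTTP response."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example 404 or 500), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing usable is listening.
        return False


if __name__ == "__main__":
    url = "http://localhost:3000"  # hypothetical dev server address
    if is_reachable(url):
        print(f"{url} is up; ready for frontend-visualqa verify")
    else:
        print(f"{url} is unreachable; start the dev server first")
```

An HTTP error status still counts as reachable here, because the point is only to tell "server down" apart from "server answering".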
Why n1
Playwright MCP can click, type, and assert against the DOM — but it cannot see the page. It can run cleanly on the wrong page, assert modal.isVisible() on a modal rendered off-screen, or miss a layout that broke on mobile.
n1 is a pixels-to-actions model trained with RL on live websites. Two capabilities matter here:
- Self-correcting navigation: Send the tool to /tasks instead of /tasks/123 and n1 recognizes the wrong page, clicks through to the right one, and reports wrong_page_recovered: true. A DOM-based tool would assert on the wrong page and report success.
- Rich visual evaluation: After clicking "Mark Complete", n1 reported three changes: status badge blue→green, button label→"Completed", toast notification appeared. Playwright MCP would need three hand-written assertions.
Install
Quick install (recommended)
- Log in to Yutori (provides the n1 vision model):
uvx yutori auth login
This opens your browser and saves your API key to ~/.yutori/config.json.
Or, manually add your API key: go to platform.yutori.com and add your key to the config file:
mkdir -p ~/.yutori
cat > ~/.yutori/config.json << 'EOF'
{"api_key": "yt-your-api-key"}
EOF
- Install the MCP server using add-mcp (works with all clients):
npx add-mcp -n frontend-visualqa "uvx frontend-visualqa serve"
Pick the clients you want to configure.
- Install workflow skills using skills.sh:
npx skills add yutori-ai/frontend-visualqa -g
Adds the /frontend-visualqa slash command for claim-based visual QA guidance. -g installs at user scope. Omit -g for project-local install.
- Restart the agent client.
To list or remove later:
npx skills ls -g
npx skills remove -g frontend-visualqa
add-mcp has no remove command. Delete the frontend-visualqa entry from the .mcp.json it wrote to (project-level or ~/.mcp.json).
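The manual API-key step above can also be scripted. The sketch below writes the same {"api_key": ...} JSON to config.json inside a config directory (~/.yutori by default). The function name write_yutori_config is hypothetical, and the permission tightening is an added precaution rather than something the setup steps require. Like the shell heredoc, it overwrites any existing file.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_yutori_config(api_key: str, config_dir: Optional[Path] = None) -> Path:
    """Write {"api_key": ...} to <config_dir>/config.json and return its path."""
    config_dir = config_dir or Path.home() / ".yutori"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config_path.write_text(json.dumps({"api_key": api_key}) + "\n", encoding="utf-8")
    # Assumption: owner-only permissions are a precaution added in this sketch.
    config_path.chmod(0o600)
    return config_path


if __name__ == "__main__":
    # Demo against a temporary directory so a real config is never clobbered.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_yutori_config("yt-your-api-key", Path(tmp) / ".yutori")
        print(path.read_text(encoding="utf-8"))
```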
Manual per-client setup
Claude Code
Plugin (recommended) — installs MCP tools + skill together:
/plugin marketplace add yutori-ai/frontend-visualqa
/plugin install frontend-visualqa@frontend-visualqa-plugins
MCP only (if you prefer not to use the plugin):
claude mcp add --scope user frontend-visualqa -- uvx frontend-visualqa serve
Codex
codex mcp add frontend-visualqa -- uvx frontend-visualqa serve
Skills can be installed via npx skills add above, or with $skill-installer inside Codex:
$skill-installer install https://github.com/yutori-ai/frontend-visualqa/tree/main/.agents/skills/frontend-visualqa
Cursor / VS Code / other MCP hosts
Use the checked-in .mcp.json, or point your client at uvx frontend-visualqa serve.
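For hosts that need a hand-written entry, the sketch below prints one plausible configuration that launches uvx frontend-visualqa serve. It assumes the host reads the mcpServers layout used by several MCP clients; the repository's checked-in .mcp.json may differ, and some hosts use a different top-level key, so check your client's documentation before copying it.

```python
import json


def mcp_server_entry() -> dict:
    """Build a config entry that launches `uvx frontend-visualqa serve`.

    Assumption: the host expects a top-level "mcpServers" map of
    server name -> {"command", "args"}.
    """
    return {
        "mcpServers": {
            "frontend-visualqa": {
                "command": "uvx",
                "args": ["frontend-visualqa", "serve"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry(), indent=2))
```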
From source
uv sync
uv run playwright install chromium
Register the MCP server with your client using uvx --from /absolute/path/to/frontend-visualqa frontend-visualqa serve as the command.
Uninstall
Claude Code plugin
/plugin uninstall frontend-visualqa@frontend-visualqa-plugins -s user
Codex
Remove the MCP server entry from ~/.codex/config.toml, then delete the skill directory:
rm -rf ~/.agents/skills/frontend-visualqa
Restart Codex after removing.
Quick start
The repo includes a test page you can use immediately — no dev server required:
# From the repo root, serve the included test pages
cd /path/to/frontend-visualqa
lsof -ti:8000 | xargs kill 2>/dev/null; python3 -m http.server 8000 -d examples &
Self-correcting navigation — start on the wrong page and watch n1 find its way:
# n1 lands on the home page, clicks Tasks, then clicks Task #123
frontend-visualqa verify http://localhost:8000/multi_page_app.html \
--headed \
--claims "The task detail heading reads 'Task #123: Landing page polish'"
Catching regressions — mix passing and failing claims:
frontend-visualqa verify http://localhost:8000/comprehensive_test.html \
--headed \
--claims \
"The sidebar contains links labeled Dashboard, Tasks, and Settings" \
"The progress bar shows 100%"
# → first claim passes, second fails (actual value is 65%)
Use against your own frontend the same way — just swap the URL:
frontend-visualqa screenshot http://localhost:3000
frontend-visualqa verify http://localhost:3000/tasks/123 \
--claims "The Save button is visible without scrolling"
MCP tools
| Tool | Description |
|---|---|
| verify_visual_claims | Structured pass/fail visual checks with screenshot evidence |
| take_screenshot | Capture current page state |
| manage_browser | Inspect, reset, close, or resize the shared browser session |
Recommended agent workflow
- Ensure the local frontend is running
- take_screenshot to confirm page state
- Write 1–5 concrete visual claims
- verify_visual_claims
- Fix code, rerun claims until they pass
CLI reference
frontend-visualqa <command> [options]
| Command | Description |
|---|---|
| verify | Verify visual claims against a URL |
| screenshot | Capture a screenshot |
| login | Open a headed browser to log in and save the session |
| serve | Start the MCP stdio server |
| status | Show browser status as JSON |
verify options
frontend-visualqa verify <url> --claims "claim1" "claim2" [options]
| Flag | Default | Description |
|---|---|---|
| --claims | (required) | One or more visual claims |
| --navigation-hint | | Interaction guidance before judging |
| --width / --height | 1280 / 800 | Viewport size |
| --device-scale-factor | 1.0 | DPR |
| --headed | off | Show the browser |
| --browser-mode | ephemeral | ephemeral or persistent |
| --user-data-dir | | Custom profile directory |
| --session-key | default | Named browser session |
| --max-steps-per-claim | 12 | Max actions per claim |
| --claim-timeout-seconds | 120 | Per-claim timeout |
| --run-timeout-seconds | 300 | Whole-run timeout |
| --reporter | native | Output reporter (native, ctrf). Repeat for multiple. |
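When the CLI is driven from a script, it can be convenient to assemble the argument list programmatically. The sketch below builds an argv list from a few of the flags in the table above. The function build_verify_command is a hypothetical helper, not part of the package, and it only constructs the command without running it.

```python
import shlex
from typing import List, Optional, Sequence


def build_verify_command(
    url: str,
    claims: Sequence[str],
    navigation_hint: Optional[str] = None,
    width: Optional[int] = None,
    height: Optional[int] = None,
    headed: bool = False,
) -> List[str]:
    """Return an argv list for `frontend-visualqa verify`."""
    if not claims:
        raise ValueError("--claims is required: pass at least one claim")
    cmd = ["frontend-visualqa", "verify", url, "--claims", *claims]
    if navigation_hint:
        cmd += ["--navigation-hint", navigation_hint]
    # Omitted sizes fall back to the CLI defaults (1280 / 800).
    if width is not None:
        cmd += ["--width", str(width)]
    if height is not None:
        cmd += ["--height", str(height)]
    if headed:
        cmd.append("--headed")
    return cmd


if __name__ == "__main__":
    cmd = build_verify_command(
        "http://localhost:8000/comprehensive_test.html",
        ["A hamburger menu button is visible"],
        width=375,
        height=812,
    )
    print(shlex.join(cmd))  # shell-quoted form, handy for logs
```

Passing the list to subprocess.run avoids shell-quoting problems with claims that contain quotes or spaces.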
More examples
Navigation hint for claims that require interaction:
frontend-visualqa verify http://localhost:8000/comprehensive_test.html \
--claims "The dropdown label reads 'Priority: High'" \
--navigation-hint "Open the Priority Selector dropdown and click High."
Mobile viewport:
frontend-visualqa verify http://localhost:8000/comprehensive_test.html \
--claims "A hamburger menu button is visible" \
--width 375 --height 812
Browser modes
| Mode | Flag | Cookies persist? | Use case |
|---|---|---|---|
| Ephemeral (default) | — | No | Public pages, CI |
| Persistent | --browser-mode persistent | Yes | Auth-gated local dev |
Persistent profile setup
Log in once, reuse for all future runs:
# 1. One-time login — opens a headed browser, log in, press Enter to save
frontend-visualqa login http://localhost:3000/login
# 2. Subsequent runs reuse the saved session
frontend-visualqa verify http://localhost:3000/dashboard \
--browser-mode persistent \
--claims "The user avatar is visible in the header"
The profile is stored at ~/.cache/frontend-visualqa/browser-profile/ by default. Override it with --user-data-dir:
frontend-visualqa login http://localhost:3000/login \
--user-data-dir /tmp/my-project-profile
frontend-visualqa verify http://localhost:3000/dashboard \
--browser-mode persistent \
--user-data-dir /tmp/my-project-profile \
--claims "The dashboard loads without a login redirect"
Writing good claims
Claims should be observable, scoped, and provable from pixels.
| Good | Weak |
|---|---|
| The modal title reads "Edit Task" | The modal works correctly |
| The Save button is visible without scrolling | The page looks polished |
| At 375px width, navigation collapses behind a menu button | The UI is intuitive |
If a claim requires interaction first, use --navigation-hint instead of encoding steps in the claim text.
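The guidance above can be approximated with a small pre-flight check on claim text. The sketch below is a rough heuristic, not something frontend-visualqa ships: the phrase lists and the function name weak_claim_reasons are assumptions made for illustration, seeded from the weak examples in the table.

```python
from typing import List

# Subjective phrases that cannot be proven from pixels (illustrative list).
VAGUE_PHRASES = (
    "works correctly",
    "looks polished",
    "is intuitive",
    "looks good",
    "works as expected",
)

# Openers suggesting the claim encodes steps that belong in --navigation-hint.
STEP_OPENERS = ("click ", "open ", "type ", "after ")


def weak_claim_reasons(claim: str) -> List[str]:
    """Return reasons a claim looks weak; an empty list means no flags."""
    lowered = claim.strip().lower()
    reasons = [f"subjective phrase: {p!r}" for p in VAGUE_PHRASES if p in lowered]
    if lowered.startswith(STEP_OPENERS):
        reasons.append("encodes interaction steps; consider --navigation-hint")
    return reasons


if __name__ == "__main__":
    for claim in ('The modal title reads "Edit Task"', "The modal works correctly"):
        print(claim, "->", weak_claim_reasons(claim) or "ok")
```

A check like this only catches obvious wording problems; whether a claim is actually observable still needs human judgment.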
Result statuses
| Status | Meaning |
|---|---|
| passed | Claim matched the visual evidence |
| failed | Claim was visually false |
| inconclusive | Runner explored but couldn't determine confidently |
| not_testable | Environment blocked verification (server down, auth wall) |
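One way to act on these four statuses in a script is to tally them and map the tally to an exit code. The sketch below works on a plain list of status strings, since the result schema is not described here. The exit-code policy (1 for any failure, 2 for anything unresolved) is an assumption chosen for illustration, not the tool's behavior.

```python
from collections import Counter
from typing import Dict, Iterable

STATUSES = ("passed", "failed", "inconclusive", "not_testable")


def summarize(statuses: Iterable[str]) -> Dict[str, int]:
    """Count claims per status, rejecting anything outside the four known values."""
    statuses = list(statuses)
    unknown = set(statuses) - set(STATUSES)
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    counts = Counter(statuses)
    return {status: counts[status] for status in STATUSES}


def exit_code(statuses: Iterable[str]) -> int:
    """Illustrative policy: 1 if any claim failed, 2 if any is unresolved, else 0."""
    counts = summarize(statuses)
    if counts["failed"]:
        return 1
    if counts["inconclusive"] or counts["not_testable"]:
        return 2
    return 0


if __name__ == "__main__":
    run = ["passed", "failed", "passed", "not_testable"]
    print(summarize(run), "exit", exit_code(run))
```

Keeping not_testable separate from failed is useful because it usually points at the environment (server down, auth wall) rather than at the frontend code.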
Reporters
Reporters control the output format of persisted artifacts. They do not affect CLI stdout or MCP tool responses, which are always native JSON.
| Reporter | File | Description |
|---|---|---|
| native (default) | run_result.json | Full domain-specific schema with all fields |
| ctrf | ctrf-report.json | CTRF standard JSON for CI/CD integration |
frontend-visualqa verify http://localhost:3000 \
--claims "The heading reads 'Dashboard'" \
--reporter native --reporter ctrf
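A CI step might read the CTRF artifact to list failing checks. The sketch below assumes ctrf-report.json follows the published CTRF layout, with a top-level results object holding a tests array whose entries carry name and status. The sample report is made up for illustration and is not output from frontend-visualqa; how the tool maps inconclusive or not_testable claims onto CTRF statuses is not covered here.

```python
import json
from pathlib import Path
from typing import List


def load_report(path: str) -> dict:
    """Load a CTRF JSON report from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def failed_tests(report: dict) -> List[str]:
    """Return the names of tests whose CTRF status is "failed"."""
    tests = report.get("results", {}).get("tests", [])
    return [t.get("name", "<unnamed>") for t in tests if t.get("status") == "failed"]


# Hypothetical sample in CTRF shape, for illustration only.
SAMPLE_REPORT = {
    "results": {
        "tool": {"name": "example-tool"},
        "summary": {
            "tests": 2, "passed": 1, "failed": 1,
            "pending": 0, "skipped": 0, "other": 0,
            "start": 0, "stop": 0,
        },
        "tests": [
            {"name": "The heading reads 'Dashboard'", "status": "passed", "duration": 0},
            {"name": "The progress bar shows 100%", "status": "failed", "duration": 0},
        ],
    }
}

if __name__ == "__main__":
    for name in failed_tests(SAMPLE_REPORT):
        print("FAILED:", name)
```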
Development
uv sync
uv run playwright install chromium
uv run frontend-visualqa --help
Editable install:
uv pip install -e .
Skill packaging
The canonical skill lives in skills/frontend-visualqa/SKILL.md.
- skills/frontend-visualqa/ is the source of truth.
- .agents/skills/frontend-visualqa/ is a compatibility wrapper for Codex and other OpenAI-compatible installers.
- .claude-plugin/ and .cursor-plugin/ contain plugin marketplace manifests.
- docs/skill-ecosystem.md records the packaging rationale.