CapusQA: persona-driven LLM agent testing for macOS and web apps, served as a local MCP daemon

These details have been verified by PyPI

Project links

Homepage

GitHub Statistics

Maintainers

DanielBirk04

These details have not been verified by PyPI

Project description

CapusQA

AI usability testing for real app workflows.

CapusQA lets Claude, Codex, Cursor, and other MCP-capable agents test local web apps and native macOS apps like realistic users: run persona sessions, click through workflows, file reproducible findings, and produce evidence bundles your coding agent can use to fix and verify issues.

Runs locally on 127.0.0.1. CapusQA stores artifacts, masks secrets, drives browsers or macOS windows, and does not make hidden LLM calls.

Why CapusQA

Traditional UI tests prove that selectors still work. CapusQA looks for the product failures scripted tests miss: dead controls, confusing flows, broken business rules, inconsistent copy, accessibility friction, and crashes.

Use CapusQA when you want an agent to explore the app like a user, collect evidence like a tester, and return findings a developer can reproduce.

Best for:

Local web apps, prototypes, dashboards, and product workflows.
MCP-driven testing with Claude, Codex, Cursor, or another coding agent.
Evidence-heavy usability, workflow, and business-rule checks.
Fast feedback before demos, releases, design reviews, and agent-assisted fix loops.

Not a replacement for:

Unit tests, API tests, or deterministic browser regression suites.
Production monitoring.
Unsupervised testing against live production accounts.

Guiding Principles

CapusQA is designed around a few constraints that make agent-driven UI testing useful, reproducible, and safe to hand to a coding agent:

Local-first: The daemon binds to 127.0.0.1 by default and stores run data on your machine.
Agent-native: Any MCP-capable coding agent can drive the same daemon, dashboard, traces, and reports.
Evidence-first: Findings are expected vs. observed behavior with screenshots, traces, oracle signals, and stable IDs.
Replayable: Traces are first-class artifacts so fixes can be checked against the workflow that found the issue.
No hidden reasoning: The daemon observes and acts. Your agent, or the optional runner, does the reasoning.

Quickstart

Install CapusQA with browser support:

uv tool install --python 3.12 'capusqa[browser]'
capusqa setup

Or the one-liner, which installs uv if needed and runs setup:

curl -fsSL https://raw.githubusercontent.com/DanielBirk04/capusqa/main/scripts/install.sh | sh

If you do not have uv yet:

curl -LsSf https://astral.sh/uv/install.sh | sh
uv tool update-shell

Windows

On Windows, CapusQA supports the web/URL testing path only. Native macOS app testing is macOS-only, and its dependencies are skipped automatically during installation. In PowerShell:

powershell -ExecutionPolicy Bypass -c "irm https://astral.sh/uv/install.ps1 | iex"
uv tool install --python 3.12 'capusqa[browser]'
capusqa setup

Or the one-liner, which installs uv if needed and runs setup:

powershell -ExecutionPolicy Bypass -c "irm https://raw.githubusercontent.com/DanielBirk04/capusqa/main/scripts/install.ps1 | iex"

Open a new terminal if capusqa is not found after installation.

capusqa setup prepares browser support, can wire supported MCP clients, and normally starts the local daemon. To start it later:

capusqa serve --open

Dashboard:

http://127.0.0.1:7777/

MCP endpoint:

http://127.0.0.1:7777/mcp
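
Before pointing an agent at the endpoint, it can help to confirm that something is answering on the dashboard URL. The following is a minimal sketch, not part of CapusQA: the helper name `daemon_is_up` is hypothetical, and it only checks that an HTTP server responds on the address shown above. `capusqa doctor` remains the supported way to check a local setup.

```python
import urllib.error
import urllib.request

# Dashboard URL as documented above.
DASHBOARD_URL = "http://127.0.0.1:7777/"

# The daemon is local, so skip any proxy configured in the environment.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def daemon_is_up(url: str = DASHBOARD_URL, timeout: float = 2.0) -> bool:
    """Return True if any HTTP server answers at `url` (hypothetical helper)."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("daemon reachable:", daemon_is_up())
```

If this prints `False`, starting the daemon with `capusqa serve --open` is the likely next step.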

Useful commands:

capusqa doctor                 # Check local setup.
capusqa capacity               # Estimate local browser capacity.
capusqa issues                 # List stored findings.
capusqa report RUN_ID          # Write report.html, report.md, feedback.json.
capusqa agents --run-id RUN_ID # Play queued sessions; needs Codex, Claude Code, OPENAI_API_KEY, or ANTHROPIC_API_KEY.

Tutorials And Recipes

Pick the path that matches what you are trying to do:

| Goal | Start here |
| --- | --- |
| Run CapusQA for the first time | Quickstart |
| Prove the browser pipeline works | Try the invoice demo |
| Connect Claude, Codex, Cursor, Cline, Windsurf, VS Code, or Zed | client/mcp/CONNECT.md |
| Teach any MCP agent how to drive CapusQA | client/mcp/DRIVER.md |
| Use CapusQA from Codex | client/codex/AGENTS.md |
| Test a local web app | Start CapusQA, then point a run at `http://127.0.0.1:<port>` or a `file://` URL |
| Test a native macOS app | Read Targets, install the `vision` extra, and run `capusqa doctor --request` |
| Hand findings to a coding agent | Generate the evidence bundle |

Common first prompts:

Use CapusQA to test my local app at http://127.0.0.1:3000. Act as realistic users,
report reproducible findings, and generate the CapusQA report artifacts.

Use CapusQA to run the invoice demo in examples/invoice_web with the scenario pack
at examples/invoice_web/spec.yaml. Report every planted bug with evidence.

Try the Demo

The bundled invoice app is a fast end-to-end proof: CapusQA should find four planted product bugs and generate report artifacts for the run.

Clone the repository to use the demo files:

git clone https://github.com/DanielBirk04/capusqa.git
cd capusqa

Demo files:

app: examples/invoice_web/index.html
scenario pack: examples/invoice_web/spec.yaml

Planted bugs:

Export PDF does nothing.
The promised 10 percent discount is never applied.
Sending an invoice confirms with the wrong message.
Invalid amounts are silently ignored.
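
To make the discount bug concrete, here is a rough sketch of the kind of check a tester could apply to observed totals. It is an illustration only: the 100 EUR threshold is taken from the example finding later in this README, while the rounding mode and the function names are assumptions and do not come from the demo app or from CapusQA.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rule as worded in the example finding: subtotal above 100 EUR gets 10 percent off.
THRESHOLD_EUR = Decimal("100")
DISCOUNT_RATE = Decimal("0.10")


def expected_total(subtotal: Decimal) -> Decimal:
    """Total the rule promises for a given subtotal (rounding is an assumption)."""
    discount = subtotal * DISCOUNT_RATE if subtotal > THRESHOLD_EUR else Decimal("0")
    return (subtotal - discount).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def violates_discount_rule(subtotal: Decimal, observed_total: Decimal) -> bool:
    """True when the total shown by the app differs from the promised total."""
    return observed_total != expected_total(subtotal)


if __name__ == "__main__":
    # The planted bug leaves subtotal and total identical.
    print(violates_discount_rule(Decimal("120"), Decimal("120")))
```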

Print a copy-pasteable file:// URL for the dashboard:

python3 -c 'from pathlib import Path; print(Path("examples/invoice_web/index.html").resolve().as_uri())'

Or ask a connected agent:

Use CapusQA to test examples/invoice_web/index.html with the scenario pack in
examples/invoice_web/spec.yaml. Report the findings and generate the CapusQA
report artifacts.

Source checkout only:

capusqa dev test-run --out /tmp/capusqa-invoice-web

A useful run should produce findings for dead controls, rule violations, inconsistent confirmation copy, and missing validation.

Evidence You Can Hand To A Coding Agent

Every run can produce a fix-ready evidence bundle: screenshots, traces, findings, expected vs. observed behavior, and machine-readable feedback.json for follow-up automation.

Default storage:

~/.capusqa/
  capusqa.db
  artifacts/<run-id>/
    report.html
    report.md
    feedback.json
    screenshots
    traces
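
For follow-up scripts, the layout above can be turned into paths. The sketch below assumes only what this README states: the default root is `~/.capusqa/`, artifacts live under `artifacts/<run-id>/`, and `CAPUSQA_DATA_DIR` relocates the root. The function names are hypothetical, and the `--data-dir` flag is not modeled.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Artifact names as listed in the storage layout above.
ARTIFACT_NAMES = ("report.html", "report.md", "feedback.json", "screenshots", "traces")


def data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the data root: CAPUSQA_DATA_DIR if set, else ~/.capusqa."""
    env = os.environ if env is None else env
    override = env.get("CAPUSQA_DATA_DIR")
    return Path(override).expanduser() if override else Path.home() / ".capusqa"


def run_artifacts(run_id: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Map each artifact name to its expected path for one run (hypothetical helper)."""
    base = data_dir(env) / "artifacts" / run_id
    return {name: base / name for name in ARTIFACT_NAMES}


if __name__ == "__main__":
    for name, path in run_artifacts("RUN_ID").items():
        print(f"{name:15} {path}")
```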

Core artifacts:

| Artifact | Use it for |
| --- | --- |
| `report.html` | Review screenshots, sessions, findings, and evidence in a browser. |
| `report.md` | Share a compact developer report. |
| `feedback.json` | Feed stable finding IDs, repro steps, expected vs. observed behavior, evidence, and status to a coding agent. |
| Traces | Replay action histories and verify fixes. |

Example finding shape:

{
  "id": "CAP-001",
  "kind": "rule-violation",
  "title": "Volume discount is not applied above 100 EUR",
  "expected": "Subtotal above 100 EUR applies a 10 percent discount",
  "observed": "Subtotal and total remain identical after adding qualifying items",
  "evidence": ["screenshots", "repro_trace"]
}
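
A coding agent or script could summarize findings of this shape roughly as follows. This is a sketch under stated assumptions: the top-level `findings` key is a guess at how `feedback.json` wraps its entries, the second sample finding is hypothetical, and only the per-finding fields shown above are taken from this README.

```python
import json
from collections import Counter

# Sample input. The "findings" wrapper key is an assumption, not a documented schema.
SAMPLE_FEEDBACK = """
{
  "findings": [
    {
      "id": "CAP-001",
      "kind": "rule-violation",
      "title": "Volume discount is not applied above 100 EUR",
      "expected": "Subtotal above 100 EUR applies a 10 percent discount",
      "observed": "Subtotal and total remain identical after adding qualifying items",
      "evidence": ["screenshots", "repro_trace"]
    },
    {
      "id": "CAP-002",
      "kind": "dead-control",
      "title": "Export PDF does nothing",
      "expected": "Hypothetical entry for illustration",
      "observed": "Hypothetical entry for illustration",
      "evidence": ["screenshots"]
    }
  ]
}
"""


def summarize(feedback_text: str):
    """Return (count of findings per kind, one 'ID: title' line per finding)."""
    findings = json.loads(feedback_text).get("findings", [])
    by_kind = Counter(f.get("kind", "unknown") for f in findings)
    lines = [f'{f["id"]}: {f["title"]}' for f in findings]
    return by_kind, lines


if __name__ == "__main__":
    kinds, lines = summarize(SAMPLE_FEEDBACK)
    print(dict(kinds))
    print("\n".join(lines))
```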

Set CAPUSQA_DATA_DIR or pass --data-dir to store data somewhere else.

Connect an Agent

CapusQA is built for MCP clients. Point your agent at:

http://127.0.0.1:7777/mcp

Agent-specific guides:

client/mcp/CONNECT.md - connect MCP clients to CapusQA.
client/mcp/DRIVER.md - portable tester playbook for any MCP client.
client/codex/AGENTS.md - Codex driver guide.

Claude Code and Codex users can run capusqa setup to register the same local MCP server. Claude Code also gets the optional /capusqa command menu; the main loop there is /capusqa:test, /capusqa:runs, and /capusqa:issues.

Targets

| Target | Use it for | Setup |
| --- | --- | --- |
| Web URL or `file://` | Local web apps, demos, parallel runs, CI-style checks | `capusqa[browser]`; no Screen Recording or Accessibility permissions |
| Native macOS app | Desktop workflows, AppKit/Cocoa targets, real-window testing | Advanced path; requires Screen Recording and Accessibility permissions |

Browser targets run in isolated Chromium contexts. Native targets use window screenshots, OCR/vision perception, and synthesized mouse and keyboard input.

For native macOS targets:

uv tool install --force --python 3.12 'capusqa[browser,vision]'
capusqa models download
capusqa doctor --request
export CAPUSQA_MACOS_EXPERIMENTAL=1
capusqa serve --open

Native runs synthesize real mouse and keyboard input, so leave the machine untouched while they are in progress. Browser runs do not contend with your mouse.

How It Works

persona goals or scenario specs
        |
        v
MCP client or optional capusqa agents runner
        |
        v
CapusQA daemon on 127.0.0.1
        |
        +-- browser driver: isolated Chromium sessions
        +-- macOS driver: native window screenshots and input
        |
        v
dashboard, SQLite store, reports, feedback.json, replayable traces

The core loop is:

run_create -> task_claim -> session_start
           -> observe -> click/type/scroll/press/wait
           -> finding_report / checkpoint_mark / rule_mark
           -> session_end -> report_generate

The split is deliberate:

The client decides what a persona should try and how to interpret evidence.
The daemon observes, actuates, stores, masks secrets, reports, and replays.
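
One way to read the core loop is as an ordering constraint on tool calls: a fixed opening, a free-form middle of observing, acting, and marking, and a fixed close. The sketch below encodes that reading as a small checker. The tool names come from the loop above; the checker itself, its name, and the strictness of the ordering are assumptions, and nothing here says the daemon enforces it.

```python
from typing import Iterable

# Tool names as listed in the core loop above.
OPENING = ["run_create", "task_claim", "session_start"]
MIDDLE = {
    "observe",
    "click", "type", "scroll", "press", "wait",
    "finding_report", "checkpoint_mark", "rule_mark",
}
CLOSING = ["session_end", "report_generate"]


def follows_core_loop(calls: Iterable[str]) -> bool:
    """Check a single-session call sequence against the loop (hypothetical helper)."""
    calls = list(calls)
    if len(calls) < len(OPENING) + len(CLOSING):
        return False
    if calls[: len(OPENING)] != OPENING or calls[-len(CLOSING):] != CLOSING:
        return False
    middle = calls[len(OPENING): -len(CLOSING)]
    return all(name in MIDDLE for name in middle)


if __name__ == "__main__":
    session = OPENING + ["observe", "click", "observe", "finding_report"] + CLOSING
    print(follows_core_loop(session))
```

A checker like this could be run over a recorded trace to spot sessions that ended without `session_end`, assuming traces expose tool names in order.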

Examples

examples/invoice_web - self-contained browser demo with planted bugs and a scenario pack.
examples/invoice_mini - native Cocoa invoice demo with matching product rules.
examples/collab_board - multi-user collaboration fixture.
examples/saas_mini - small SaaS-style target.

Security and Privacy

CapusQA runs locally and binds to 127.0.0.1 by default. The dashboard and MCP server assume a localhost trust boundary.

Set CAPUSQA_DASHBOARD_TOKEN before exposing the dashboard beyond localhost. Mutating dashboard routes and sensitive reads honor it as a Bearer token when the token is set.
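
From a script, sending the token as a Bearer credential could look like the sketch below. It uses the standard `Authorization: Bearer <token>` header form and the dashboard URL from this README. The helper name is hypothetical, and which routes actually require the header is not specified here.

```python
import os
import urllib.request
from typing import Optional


def dashboard_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a request that carries the dashboard token, if one is available."""
    if token is None:
        # Fall back to the environment variable named in this README.
        token = os.environ.get("CAPUSQA_DASHBOARD_TOKEN")
    request = urllib.request.Request(url)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    req = dashboard_request("http://127.0.0.1:7777/", token="example-token")
    print(req.get_header("Authorization"))
```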

Credentials for test accounts live in a local SQLite vault. Fields whose names look secret, such as password, secret, token, pin, key, otp, or code, are masked in traces and reports as {{secret:...}}. Replay resolves them locally.
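
The name-based masking described above can be illustrated with a short sketch. The hint words are the ones listed in this README. The matching rule (splitting a field name into words and comparing case-insensitively) is an assumption, the function names are hypothetical, and the placeholder is reproduced literally because the text after `secret:` is not specified here.

```python
import re
from typing import Dict

# Hint words as listed above.
SECRET_HINTS = {"password", "secret", "token", "pin", "key", "otp", "code"}

# Literal placeholder from this README; what follows "secret:" is left unspecified.
PLACEHOLDER = "{{secret:...}}"


def looks_secret(field_name: str) -> bool:
    """Assumed rule: any word in the field name matches a hint, ignoring case."""
    # Splits snake_case, kebab-case, and camelCase into words.
    words = re.findall(r"[A-Za-z][a-z]*", field_name)
    return any(word.lower() in SECRET_HINTS for word in words)


def mask_fields(fields: Dict[str, str]) -> Dict[str, str]:
    """Replace values of secret-looking fields with the placeholder."""
    return {
        name: PLACEHOLDER if looks_secret(name) else value
        for name, value in fields.items()
    }


if __name__ == "__main__":
    print(mask_fields({"email": "qa@example.test", "userPassword": "hunter2"}))
```

Matching whole words rather than substrings keeps a field such as `shipping_address` from being masked just because it contains `pin`; whether CapusQA makes the same choice is not stated.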

Use dedicated test accounts. Do not point CapusQA at production systems unless you have explicitly designed the run, data, and account permissions for that risk.

Generated reports and traces may contain app content. Attach only sanitized artifacts to public issues.

CapusQA Intelligence and CapusQA Atlas are optional retrieval and hosted-assistance features. They are off by default and require explicit environment configuration plus local consent:

capusqa intelligence status
capusqa intelligence accept
capusqa intelligence export
capusqa intelligence withdraw

Development

From a source checkout:

uv venv --python 3.12 .venv
uv pip install --python .venv/bin/python -e '.[browser]'
.venv/bin/playwright install chromium
.venv/bin/capusqa doctor
.venv/bin/capusqa serve --open

Repository map:

| Path | Purpose |
| --- | --- |
| `capusqad/` | Python daemon, MCP server, drivers, dashboard server, reports, and CLI. |
| `client/` | MCP prompts, connection guides, Codex guide, and Claude Code plugin assets. |
| `examples/` | Demo apps and scenario packs. |
| `scripts/install.sh` | Source-checkout installer and setup helper. |
| `pyproject.toml` | Package metadata, dependencies, extras, and build configuration. |

Contributing

Keep contributions evidence-oriented:

Bug reports should include the target app, CapusQA version, install method, relevant run ID, logs or report artifacts, and expected vs. observed behavior.
Pull requests should include the smallest useful change plus the focused check or demo command that covers it.
Security-sensitive issues should not include live credentials, production data, or unredacted reports.

License

Apache-2.0. OmniParser v2 icon-detector weights are AGPL-3.0; review their license before redistributing a package or service that includes those weights.

Project details

Release history

This version

2.3.0

Jun 22, 2026

2.2.0

Jun 22, 2026

2.1.2

Jun 22, 2026

2.1.1

Jun 21, 2026

Download files

Source Distribution

capusqa-2.3.0.tar.gz (1.3 MB)

Uploaded Jun 22, 2026 Source

Built Distribution

capusqa-2.3.0-py3-none-any.whl (1.4 MB)

Uploaded Jun 22, 2026 Python 3

File details

Details for the file capusqa-2.3.0.tar.gz.

File metadata

Download URL: capusqa-2.3.0.tar.gz
Upload date: Jun 22, 2026
Size: 1.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for capusqa-2.3.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `db946df95cb2d6214bf38577790fb058ff29b174a4430fd3e9d7280f84530cb7` |
| MD5 | `6a05eee320be9b1b0035d980dc64407a` |
| BLAKE2b-256 | `6de8f864408f6062b6c529a5bc281f799d915df08a1754cd4104ef8862c8eea7` |
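
A downloaded sdist can be compared against the published SHA256 digest with a few lines of standard-library Python. This is a generic sketch, not a CapusQA tool: the function names are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib
import sys

# SHA256 digest published above for capusqa-2.3.0.tar.gz.
EXPECTED_SHA256 = "db946df95cb2d6214bf38577790fb058ff29b174a4430fd3e9d7280f84530cb7"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python verify_hash.py path/to/capusqa-2.3.0.tar.gz
    print("hash matches:", matches_published_hash(sys.argv[1]))
```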

Provenance

The following attestation bundles were made for capusqa-2.3.0.tar.gz:

Publisher: release.yml on DanielBirk04/capusqa

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: capusqa-2.3.0.tar.gz
- Subject digest: db946df95cb2d6214bf38577790fb058ff29b174a4430fd3e9d7280f84530cb7
- Sigstore transparency entry: 1910457678
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: DanielBirk04/capusqa@30926687d5fe22436f65061f05beba592eb739f9
- Branch / Tag: refs/tags/v2.3.0
- Owner: https://github.com/DanielBirk04
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@30926687d5fe22436f65061f05beba592eb739f9
- Trigger Event: push

File details

Details for the file capusqa-2.3.0-py3-none-any.whl.

File metadata

Download URL: capusqa-2.3.0-py3-none-any.whl
Upload date: Jun 22, 2026
Size: 1.4 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for capusqa-2.3.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `ff542cfdce60357eaff4daaf0eede8badfa6951af6dd0299ed1fd3f778c186a7` |
| MD5 | `025ee35b1e20816701788da00911273f` |
| BLAKE2b-256 | `d51d5eecd2a77cdb40d78f5a6fa1cf7cc007f5cec3bf7721d15da3dc307d0c61` |

Provenance

The following attestation bundles were made for capusqa-2.3.0-py3-none-any.whl:

Publisher: release.yml on DanielBirk04/capusqa

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: capusqa-2.3.0-py3-none-any.whl
- Subject digest: ff542cfdce60357eaff4daaf0eede8badfa6951af6dd0299ed1fd3f778c186a7
- Sigstore transparency entry: 1910457749
- Sigstore integration time: Jun 22, 2026
Source repository:
- Permalink: DanielBirk04/capusqa@30926687d5fe22436f65061f05beba592eb739f9
- Branch / Tag: refs/tags/v2.3.0
- Owner: https://github.com/DanielBirk04
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@30926687d5fe22436f65061f05beba592eb739f9
- Trigger Event: push
