aat-devqa

AWT (AI Watch Tester) — AI-powered E2E testing with self-healing DevQA loop

These details have not been verified by PyPI

AWT — AI Watch Tester
Give it a URL. AWT tests your web app — no test code, no setup, no maintenance.

What is AWT?

AWT is a browser testing tool that writes and fixes its own tests.

You give it your web app's URL. AWT opens a real browser, figures out what's on the page (buttons, forms, links), writes test steps, runs them, and tells you what passed and what failed. If something breaks, the DevQA Loop kicks in — AI reads the error, updates the test or your code, and tries again.

No test code to write. No recording sessions. No manual updates when the UI changes.

Start in 5 Minutes

Option 1 — Cloud (no install, free)

1. Visit https://ai-watch-tester.vercel.app
2. Sign up (email or GitHub — takes 30 seconds)
3. Paste your app URL
4. Watch AWT test your site live

Option 2 — Local CLI (runs on your machine)

# Install (requires Python 3.11+)
pip install aat-devqa
playwright install chromium

# Run the visual dashboard
aat dashboard
# → Opens at http://localhost:9500

# Or test directly from the command line
aat devqa "test the login flow" --url https://your-app.com

That's it. AWT scans your page, writes a test plan, shows it to you for approval, then runs it in a real Chrome window.

How It Works

You give AWT a URL
        │
        ▼
  🔍 SCAN — AWT opens Chrome and reads every button, input, and link
        │
        ▼
  📝 GENERATE — AI writes a step-by-step test plan (you review & approve)
        │
        ▼
  ▶️  RUN — AWT clicks, types, and navigates like a real user
        │
        ├── ✅ All passed → screenshot report saved
        │
        └── ❌ Something failed
                    │
                    ▼
            🔄 DEVQA LOOP — AI reads the failure,
               fixes the test (or your code),
               and tries again (up to 5 times)

The DevQA Loop — AWT's Core Feature

Most testing tools stop when a test fails and wait for a human. AWT keeps going.

When a step fails, AWT:

1. Takes a screenshot of exactly what the browser shows
2. Reads the error message and the visible page content
3. Re-scans the page to check if anything moved or changed
4. Patches the specific failing step and retries

If the failure is a bug in your source code (not just a wrong selector), AWT can trace it — finding the route handler, component, or API endpoint that's misbehaving — and suggest or apply a fix.
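The retry behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not AWT's implementation: `devqa_loop`, `StepResult`, `fake_run`, and `fake_patch` are hypothetical names, and the "patch" here is a simple string replacement standing in for the AI-driven re-scan and fix. Only the limit of 5 attempts comes from this README.

```python
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 5  # the README states the loop tries up to 5 times


@dataclass
class StepResult:
    passed: bool
    error: str = ""


def devqa_loop(
    step: str,
    run_step: Callable[[str], StepResult],
    patch_step: Callable[[str, str], str],
    max_attempts: int = MAX_ATTEMPTS,
) -> tuple[bool, str, int]:
    """Run a step, patching and retrying on failure.

    Returns (passed, final_step, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        result = run_step(step)
        if result.passed:
            return True, step, attempt
        # Hypothetical stand-in for "read the error, re-scan, patch the step".
        step = patch_step(step, result.error)
    return False, step, max_attempts


# Toy scenario: the selector "#login-btn" was renamed to "#signin-btn".
def fake_run(step: str) -> StepResult:
    if "#signin-btn" in step:
        return StepResult(passed=True)
    return StepResult(passed=False, error="element #login-btn not found")


def fake_patch(step: str, error: str) -> str:
    return step.replace("#login-btn", "#signin-btn")


outcome = devqa_loop("click #login-btn", fake_run, fake_patch)
print(outcome)  # (True, 'click #signin-btn', 2)
```

The first attempt fails, the step is patched, and the second attempt passes. A step that can never be fixed stops after `max_attempts` runs instead of looping forever.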

# Watch the loop run live
aat devqa "checkout flow test" --url http://localhost:3000

# Or use it with your AI coding tool (Claude Code, Cursor, Copilot...)
# "Test the registration page" → AWT scans, generates, runs, fixes

Four Ways to Use AWT

	Cloud	Local CLI	Agent Skill	MCP Server
How to start	Sign up at ai-watch-tester.vercel.app	`pip install aat-devqa`	`npx skills add ksgisang/awt-skill`	`pip install aat-devqa mcp`
Browser	Headless (server)	Real Chrome on your machine	Real Chrome on your machine	Real Chrome on your machine
AI key needed	No (server-provided or BYOK)	Yes (your OpenAI / Anthropic / Ollama)	No — your AI tool is the brain	No
Best for	Quick tests, PMs, planners	Developers, CI/CD	AI-assisted development	Claude Desktop, Cursor, Windsurf
Price	Free (5 tests/mo) · Pro $28.99/mo · Team $98.99/mo	Free forever (MIT)	Free forever	Free forever

Agent Skill — Let your AI coding tool drive AWT

# One-line install
npx skills add ksgisang/awt-skill --skill awt -g

# Then ask your AI tool:
"Test the login flow on http://localhost:3000"
"Check if the signup form works"
"Run regression tests after my last commit"
# → AWT scans, generates test steps, runs them, and reports back

MCP Server — Protocol-native

# Add to Claude Code
claude mcp add awt -- python mcp/server.py

# Tools available: aat_run, aat_doctor, aat_list_scenarios, aat_validate, aat_cost

What AWT Is Great At

	Feature	Description
🤖	Zero-code test generation	Point at a URL — AI generates complete test steps with real selectors
🔄	Self-healing DevQA Loop	Tests fail? AI fixes and retries automatically (up to 5 attempts)
👁️	Visual verification	Screenshots before/after every action — not just DOM checks
🌐	Real browser	Chrome with human-like mouse movement and typing speed
📱	Flutter support	Native CanvasKit + Semantics detection — tests Flutter web apps too
📄	Document-based generation	Feed a PDF/DOCX spec — AI generates tests from requirements
⚡	Speed modes	`fast` for React/Next.js · `slow` for Flutter/animations
📸	Smart screenshots	`all` / `before-after` / `on-failure` — choose your audit level
🔌	Plugin architecture	Swap engines, matchers, AI providers via simple registries

AWT vs Other Tools

vs Playwright / Cypress

Playwright and Cypress are excellent — and AWT is built on top of Playwright. The difference is who writes the tests:

	AWT	Playwright / Cypress
Who writes tests	AI (from your URL)	You (code)
Maintenance when UI changes	AI auto-heals	You update selectors manually
Learning curve	Zero — just paste a URL	Moderate (framework API + JS/TS)
Flexibility	High (YAML scenarios)	Maximum (full code control)

Use Playwright/Cypress when you want full programmatic control. Use AWT when you want tests without writing them.

vs testRigor

	AWT	testRigor
Test authoring	AI generates from URL — you write nothing	Plain English (you write commands)
Self-healing	DevQA Loop (AI re-generates automatically)	Built-in auto-maintenance
Pricing	Free (MIT, self-host)	Enterprise (~$800+/mo)
Open source	✅ MIT License	❌

vs Applitools

Applitools specializes in visual regression (pixel-by-pixel screenshot comparison). AWT specializes in functional testing (does the login actually work?). They complement each other — run AWT for functional tests, add Applitools for pixel-perfect visual checks.

Speed & Screenshot Modes

Control the trade-off between thoroughness and speed:

# CI/CD — fastest, minimal storage
aat run --verbosity=concise --screenshots=on-failure scenarios/

# Standard QA — balanced (recommended)
aat run --verbosity=concise --screenshots=before-after scenarios/

# Full audit — every step recorded
aat run --verbosity=detailed --screenshots=all scenarios/

Mode	Steps	Screenshots	~Time	Use For
`concise` + `on-failure`	12–15	0–1	~1 min	CI/CD gates
`concise` + `before-after`	12–15	24	~2 min	Daily QA
`detailed` + `all`	60–80	68	~5 min	Compliance / audit
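If you launch AWT from a script, one way to keep these combinations consistent is a small command builder. The sketch below is an assumption about how you might wrap the CLI: the preset names (`ci`, `daily`, `audit`) and the `build_run_command` helper are hypothetical, while the `--verbosity` and `--screenshots` values are the ones listed in the table above.

```python
import os
import shlex

# Flag values copied from the table above; the preset names are hypothetical.
PRESETS = {
    "ci": ("concise", "on-failure"),
    "daily": ("concise", "before-after"),
    "audit": ("detailed", "all"),
}


def build_run_command(preset: str, scenarios_dir: str = "scenarios/") -> list[str]:
    """Return the argument list for an `aat run` call with the chosen preset."""
    verbosity, screenshots = PRESETS[preset]
    return [
        "aat", "run",
        f"--verbosity={verbosity}",
        f"--screenshots={screenshots}",
        scenarios_dir,
    ]


# Many CI systems set CI=true; treat that as the signal to use the lean preset.
preset = "ci" if os.environ.get("CI") else "daily"
print(shlex.join(build_run_command(preset)))
```

The list returned by `build_run_command` can be passed directly to `subprocess.run`, which avoids shell quoting problems.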

Supported AI Providers

Provider	Models	Cost	Setup
OpenAI	gpt-4o, gpt-4o-mini	Pay-per-use	`export OPENAI_API_KEY=sk-...`
Anthropic	Claude Sonnet 4	Pay-per-use	`export ANTHROPIC_API_KEY=sk-ant-...`
Ollama	codellama, llama3, mistral	Free (local)	`ollama serve`

# aat.yaml
ai:
  provider: openai        # openai | anthropic | ollama
  model: gpt-4o
  api_key: ${OPENAI_API_KEY}

Architecture

aat devqa / aat run / aat dashboard
              │
              ▼
    ┌─────────────────────────────────────┐
    │           CLI (Typer)               │
    ├─────────────────────────────────────┤
    │         Core Orchestrator           │
    │  Executor · Comparator · DevQALoop  │
    ├────────────┬──────────┬─────────────┤
    │   Engine   │ Matcher  │  AI Adapter │
    │ web/desktop│ocr/cv/ai │ openai/etc. │
    ├────────────┴──────────┴─────────────┤
    │  Pydantic v2 Models · SQLite Learn  │
    └─────────────────────────────────────┘

All modules follow a plugin registry pattern — add a new engine, matcher, or AI provider by implementing one base class and registering it in __init__.py.
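As a rough picture of what a plugin registry looks like in Python, here is a generic sketch. Every name in it (`BaseMatcher`, `MATCHERS`, `register_matcher`, `TextMatcher`) is hypothetical and chosen for illustration; AWT's actual base classes and registration mechanism may look different.

```python
from abc import ABC, abstractmethod


class BaseMatcher(ABC):
    """Hypothetical base class that every matcher plugin implements."""

    @abstractmethod
    def find(self, target: str, page_text: str) -> bool:
        """Return True if the target can be located on the page."""


# The registry maps a short name to the class that implements it.
MATCHERS: dict[str, type[BaseMatcher]] = {}


def register_matcher(name: str):
    """Class decorator that adds a matcher to the registry under `name`."""
    def decorator(cls: type[BaseMatcher]) -> type[BaseMatcher]:
        MATCHERS[name] = cls
        return cls
    return decorator


@register_matcher("text")
class TextMatcher(BaseMatcher):
    def find(self, target: str, page_text: str) -> bool:
        return target.lower() in page_text.lower()


# Callers look plugins up by name and never import the class directly.
matcher = MATCHERS["text"]()
print(matcher.find("Sign in", "Welcome back. SIGN IN to continue."))  # True
```

The benefit of this design is that adding a plugin touches only the new class and its registration, not the code that consumes it.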

Development

Prerequisites

Python 3.11+
Tesseract OCR: brew install tesseract (macOS) or apt install tesseract-ocr (Debian/Ubuntu)

Commands

Command	What it does
`make dev`	Install all dependencies + Playwright + pre-commit
`make lint`	Check code style (ruff)
`make format`	Auto-fix formatting
`make typecheck`	Strict type checking (mypy)
`make test`	Run all tests (pytest)
`make test-cov`	Tests + coverage report

git clone https://github.com/ksgisang/AI-Watch-Tester.git
cd AI-Watch-Tester
python -m venv .venv && source .venv/bin/activate
make dev
make test        # verify everything works
aat dashboard    # launch at http://localhost:9500

Contributing

See CONTRIBUTING.md — contributions, bug reports, and new plugins are welcome.

git checkout -b feat/my-feature
make format && make lint && make typecheck && make test
git commit -m "feat(scope): description"

FAQ

Do I need to know how to code?

No. The Cloud version at ai-watch-tester.vercel.app needs nothing — just a browser. The local CLI needs one terminal command to install.

The only thing AWT needs from you is a URL and (optionally) a description of what to test.

What does "self-healing" mean?

When a web app changes — a button moves, a label changes, a new form field appears — traditional tests break and stay broken until someone manually updates them.

AWT's DevQA Loop re-scans the page after a failure, finds the updated element, and patches the test step automatically. You don't have to touch the test files.

How do I install it?

Cloud (no install): ai-watch-tester.vercel.app

Local:

pip install aat-devqa
playwright install chromium
aat dashboard     # opens at http://localhost:9500

From source:

git clone https://github.com/ksgisang/AI-Watch-Tester.git
cd AI-Watch-Tester
make dev && aat dashboard

What's the difference between aat devqa and aat loop?

	`aat devqa`	`aat loop`
Starting point	Just a description + URL	Existing scenario file
Test generation	Automatic (scans and writes)	Uses your file
Failure fixing	Patches the test YAML	AI patches your source code
Best for	First run, quick testing	Iterative dev with code fixes

Use aat devqa when starting from scratch. Use aat loop when you want AWT to also fix your application code.

How do I control speed and screenshot output?

--verbosity — how many steps run:

detailed (default): all steps including wait/assert/screenshot
concise: core actions only (navigate, click, type) — faster

--screenshots — how many images are saved:

all (default): after every step
before-after: before + after each click/type/navigate (~70% fewer files)
on-failure: only when a step fails (great for CI/CD)

# Recommended for daily QA
aat run --verbosity=concise --screenshots=before-after scenarios/

# For CI/CD pipelines
aat run --verbosity=concise --screenshots=on-failure scenarios/

Which AI providers are supported?

Provider	Models	Cost
OpenAI	gpt-4o, gpt-4o-mini	Pay-per-use
Anthropic	Claude Sonnet 4	Pay-per-use
Ollama	codellama, llama3, mistral	Free (local GPU)

Cloud BYOK keys are encrypted at rest (Fernet/AES-128-CBC).

How much does the Cloud version cost?

Plan	Price	Tests/month
Free	$0	5
Pro	$28.99/mo	100
Team	$98.99/mo	500

The local CLI is free forever with no limits.

Can I use it in CI/CD?

Yes. For local runs, use the --screenshots=on-failure flag to keep output minimal. For cloud, the API accepts a POST request:

curl -X POST https://your-awt-server.com/api/v1/run \
  -H "X-API-Key: awt_your_key" \
  -H "Content-Type: application/json" \
  -d '{"target_url": "https://staging.example.com"}'

See the CI/CD Guide for GitHub Actions and GitLab CI examples.
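The same request can be built from Python with only the standard library. This sketch mirrors the curl call above: the endpoint path, the `X-API-Key` header, and the `target_url` field are taken from it, while the `build_run_request` helper is hypothetical and the host and key are the same placeholders. The shape of the response is not documented here, so the example stops at building the request.

```python
import json
import urllib.request


def build_run_request(base_url: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build the POST request for the run endpoint without sending it."""
    payload = json.dumps({"target_url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/run",
        data=payload,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "https://your-awt-server.com", "awt_your_key", "https://staging.example.com"
)
print(request.get_method(), request.full_url)

# Sending is left commented out because the host above is a placeholder:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status, response.read().decode("utf-8"))
```

In a CI job, read the key from a secret environment variable instead of hard-coding it.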

Is my data secure?

All traffic encrypted via HTTPS/TLS
BYOK API keys: Fernet-encrypted (AES-128-CBC + HMAC-SHA256) at rest
Screenshots: auto-deleted after 7 days
Local mode: nothing leaves your machine
See our Privacy Policy

License

MIT — free for personal and commercial use.

Built with Playwright, OpenCV, and a lot of AI. Made by @ksgisang.

Release history

Version	Released
1.6.2 (this version)	Apr 15, 2026
1.6.1	Mar 31, 2026
1.6.0	Mar 31, 2026
1.5.5	Mar 23, 2026
1.5.4	Mar 23, 2026
1.5.3	Mar 23, 2026
1.5.2	Mar 23, 2026
1.5.1	Mar 23, 2026
1.5.0	Mar 22, 2026
1.4.0	Mar 22, 2026
1.3.3	Mar 22, 2026
1.3.2	Mar 21, 2026
1.3.1	Mar 21, 2026
1.3.0	Mar 21, 2026
1.2.1	Mar 21, 2026
1.2.0	Mar 20, 2026
1.1.0	Mar 16, 2026
1.0.0	Mar 16, 2026

Download files

Source Distribution

aat_devqa-1.6.1.tar.gz (11.2 MB)

Uploaded Mar 31, 2026 Source

Built Distribution

aat_devqa-1.6.1-py3-none-any.whl (221.0 kB)

Uploaded Mar 31, 2026 Python 3

File details

Details for the file aat_devqa-1.6.1.tar.gz.

File metadata

Download URL: aat_devqa-1.6.1.tar.gz
Upload date: Mar 31, 2026
Size: 11.2 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for aat_devqa-1.6.1.tar.gz
Algorithm	Hash digest
SHA256	`d9ce57c769d0f3c21d9a41f08683ecc44fc16cfc1f55412feb3ce986bbeed5d8`
MD5	`b4801a50b36227c598018e7b6ea3a34b`
BLAKE2b-256	`bf5cd1c84a150e3a76795224010a5f8f4a5116cf55394f970895d63435d92fb5`

File details

Details for the file aat_devqa-1.6.1-py3-none-any.whl.

File metadata

Download URL: aat_devqa-1.6.1-py3-none-any.whl
Upload date: Mar 31, 2026
Size: 221.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for aat_devqa-1.6.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9c5e177b12254cacf696c3caac9cef897b53a65c2598d4b25fc69e951e0aadf1`
MD5	`433766bd2a6f3327c4b79f42af01e08e`
BLAKE2b-256	`e461993eab3f3bf66db1953d456c4a5c387375e80572dd5b77918490849bb687`
