AWT (AI Watch Tester) — AI-powered E2E testing with self-healing DevQA loop
AWT — AI Watch Tester
Enter a URL. AI generates, executes, and heals E2E tests — automatically.
Demo
Enter a URL → AI scans your site → generates test scenarios → executes with live screenshots → reports results.
Why AWT?
Most E2E testing tools still require you to write code or record flows before you can run a single test. AWT flips that model:
- You provide a URL (and optionally a spec document).
- AI analyzes the page structure, forms, navigation, and auth flows.
- AI generates complete YAML test scenarios with selectors, test data, and assertions.
- Playwright executes the scenarios in a real browser with humanized input.
- If a test fails, the DevQA Loop kicks in — AI reads the failure, fixes the scenario or source code, and re-runs.
No test code. No recording. No manual maintenance.
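The run, fail, heal, re-run cycle described above can be pictured as a bounded retry loop. The following is a minimal sketch, not AWT's actual implementation: `RunResult`, `devqa_loop`, and the `run` and `heal` callables are hypothetical names introduced only for illustration, and the cap of three attempts is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    """Outcome of one scenario run (hypothetical shape, for illustration)."""
    passed: bool
    failure: str = ""  # failure description; empty when the run passed


def devqa_loop(
    scenario: str,
    run: Callable[[str], RunResult],
    heal: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[bool, str, int]:
    """Run a scenario and ask `heal` for a revised scenario after each failure.

    Returns (passed, final_scenario, attempts_used).
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        result = run(scenario)
        if result.passed:
            return True, scenario, attempts
        if attempts < max_attempts:
            # Hand the failure text to the healer and retry with its output.
            scenario = heal(scenario, result.failure)
    return False, scenario, attempts
```

In a real setup `run` would execute the scenario in a browser and `heal` would call an AI model; with simple stub functions the loop can be exercised without either.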
Four Ways to Use AWT
Pick one — or use all four.
| | Cloud | Local CLI | Agent Skill | MCP Server |
|---|---|---|---|---|
| URL | ai-watch-tester.vercel.app | aat dashboard → localhost:9500 | Works inside your AI coding tool | Works inside MCP-compatible tools |
| Install | None, just sign up | pip install aat-devqa | npx skills add ksgisang/awt-skill | pip install aat-devqa mcp |
| Browser | Headless Chromium on server | Real Chromium on your machine | Real Chromium on your machine | Real Chromium on your machine |
| AI key | Server-provided or BYOK | Your own key (OpenAI / Anthropic / Ollama) | None needed (your AI tool is the brain) | None needed |
| Best for | PMs, planners, quick tests | Developers, CI/CD, offline use | AI-assisted dev with integrated testing | Claude Desktop, Cursor, Windsurf |
| Pricing | Free (5/mo) · Pro $28.99 · Team $98.99 | Free forever (MIT, unlimited) | Free forever | Free forever |
| Data | Stored on our servers | Never leaves your machine | Never leaves your machine | Never leaves your machine |
Cloud — Start in 30 seconds
1. Visit https://ai-watch-tester.vercel.app
2. Sign up (email or GitHub)
3. Enter your target URL
4. Watch AI generate and execute tests
Local CLI — Full control
pip install aat-devqa
playwright install chromium
# Option 1: Web dashboard
aat dashboard # http://localhost:9500
# Option 2: CLI
aat start # guided mode
aat generate --url https://example.com --provider openai
aat run scenarios/
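If you would rather drive the two CLI steps from a script, a thin wrapper is enough. This is a sketch under stated assumptions: it uses only the `aat generate --url ... --provider ...` and `aat run scenarios/` invocations shown above, the helper names are hypothetical, and it assumes `aat` exits with a non-zero status on failure, which is not confirmed here.

```python
import shlex
import subprocess


def generate_cmd(url: str, provider: str = "openai") -> list[str]:
    """Argument list for `aat generate`, mirroring the command shown above."""
    return ["aat", "generate", "--url", url, "--provider", provider]


def run_cmd(scenario_dir: str = "scenarios/") -> list[str]:
    """Argument list for `aat run`, mirroring the command shown above."""
    return ["aat", "run", scenario_dir]


def run_pipeline(
    url: str,
    provider: str = "openai",
    scenario_dir: str = "scenarios/",
    dry_run: bool = True,
) -> int:
    """Print (and optionally execute) generate followed by run.

    Returns the first non-zero exit code, or 0 if every step succeeded.
    With dry_run=True nothing is executed and 0 is returned.
    """
    for cmd in (generate_cmd(url, provider), run_cmd(scenario_dir)):
        print("$", shlex.join(cmd))
        if not dry_run:
            completed = subprocess.run(cmd, check=False)
            if completed.returncode != 0:
                # Assumption: a non-zero exit status signals failure.
                return completed.returncode
    return 0
```

Calling `run_pipeline("https://example.com")` only prints the commands; pass `dry_run=False` once `aat` is installed and on your PATH.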
Agent Skill — Let your AI coding tool drive
AWT is also available as an Agent Skill for AI coding tools like Claude Code, Cursor, Codex, and 11+ others. Your AI tool writes YAML scenarios and runs AWT — no extra AI API key needed.
# Install globally (one-line)
npx skills add ksgisang/awt-skill --skill awt -g
# Then just ask your AI coding tool:
# "Test the login flow on https://mysite.com"
# → It writes scenarios, runs aat, reads results, and fixes failures automatically.
MCP Server — Protocol-native integration
AWT is available as an MCP server for tools that support the Model Context Protocol — Claude Code, Claude Desktop, Cursor, Windsurf, and more.
# Claude Code (one-line)
claude mcp add awt -- python mcp/server.py
# Claude Desktop / Cursor / Windsurf — see mcp/README.md for config
Six tools are exposed: aat_run, aat_run_skill_mode, aat_doctor, aat_list_scenarios, aat_validate, aat_cost
From Source
git clone https://github.com/ksgisang/AI-Watch-Tester.git
cd AI-Watch-Tester
python -m venv .venv && source .venv/bin/activate
make dev # install deps + playwright + pre-commit
make test # verify everything works
aat dashboard # launch web UI
Features
| | Feature | Description |
|---|---|---|
| :robot: | AI Scenario Generation | Provide a URL or upload a spec doc (PDF/DOCX/MD) and AI creates E2E test scenarios |
| :globe_with_meridians: | Real Browser Testing | Playwright-driven Chromium with Bezier mouse curves and variable-speed typing |
| :recycle: | Self-Healing DevQA Loop | AI analyzes failures, patches code or scenarios, and re-runs automatically |
| :cloud: | Cloud + Local | Cloud mode (no install, browser dashboard) or local mode (real browser, full control) |
| :bar_chart: | Live Dashboard | Real-time screenshot streaming, step-by-step progress, event log |
| :page_facing_up: | Document-Based Generation | Feed PDF/DOCX/Markdown specs — AI generates scenarios from requirements |
| :key: | BYOK | Bring your own AI API key (OpenAI, Anthropic, Ollama) — encrypted at rest |
| :test_tube: | CI/CD Ready | One-line curl integration with any pipeline |
| :jigsaw: | Plugin Architecture | Engines, matchers, AI adapters, and reporters are all swappable via registries |
| :wrench: | Agent Skill | Use AWT inside Claude Code, Cursor, Codex, and 11+ AI coding tools — no extra AI key needed |
| :electric_plug: | MCP Server | Protocol-native integration for Claude Desktop, Cursor, Windsurf via Model Context Protocol |
Supported AI Providers
| Provider | Models | Cost | Setup |
|---|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini | Pay-per-use | export OPENAI_API_KEY=sk-... |
| Anthropic | Claude Sonnet 4 | Pay-per-use | export ANTHROPIC_API_KEY=sk-ant-... |
| Ollama | codellama, llama3, mistral | Free (local GPU) | ollama serve |
Configure in aat.yaml or via environment variables:
ai:
  provider: openai # openai | anthropic | ollama
  model: gpt-4o
  api_key: ${OPENAI_API_KEY}
Cloud users can bring their own API key (BYOK) via Settings > AI Provider.
How It Compares
vs testRigor
| | AWT | testRigor |
|---|---|---|
| Test authoring | AI generates from URL/docs — zero input | Plain English commands (you write) |
| Self-healing | DevQA Loop (AI re-generates) | Built-in auto-maintenance |
| Pricing | Free (MIT, self-host) | Enterprise pricing (~$800+/mo) |
| Open source | Yes | No |
| Setup time | Seconds (enter URL) | Minutes (write English scripts) |
Choose AWT if you want fully automated test generation with no scripting at all, or need a self-hostable open-source tool. Choose testRigor if you prefer writing plain-English test specs with enterprise support.
vs Applitools
| | AWT | Applitools |
|---|---|---|
| Primary focus | Functional E2E test generation + execution | Visual regression + cross-browser comparison |
| AI role | Generates entire test scenarios | Compares screenshots for visual differences |
| Standalone | Yes (full pipeline) | No (requires Cypress/Playwright/Selenium) |
| Pricing | Free (MIT) | Free tier + paid plans |
Choose AWT for AI-driven functional testing where you need scenarios generated automatically. Choose Applitools when pixel-perfect visual consistency across browsers is the priority. They complement each other — AWT generates and runs tests, Applitools can validate visual output.
vs Playwright / Cypress
Both are excellent browser automation frameworks, and AWT itself runs its tests through Playwright. The difference is who writes the tests: you (Playwright/Cypress) or AI (AWT). If your team wants full programmatic control, use them directly. If you want AI to handle test creation and maintenance, AWT fills that gap.
See docs/COMPARISON.md for a detailed breakdown against Playwright, Cypress, Testim, Katalon, and Mabl.
Architecture
aat start / aat dashboard
│
▼
┌─────────────────────────────────────────────┐
│ CLI (Typer) │
├─────────────────────────────────────────────┤
│ Core Orchestrator │
│ ┌──────────┐ ┌──────────┐ ┌─────────────┐ │
│ │ Executor │ │Comparator│ │ DevQA Loop │ │
│ └────┬─────┘ └────┬─────┘ └──────┬──────┘ │
├───────┼─────────────┼──────────────┼────────┤
│ ┌────▼────┐ ┌─────▼─────┐ ┌────▼─────┐ │
│ │ Engine │ │ Matcher │ │ Adapter │ │
│ │Registry │ │ Registry │ │ Registry │ │
│ └─────────┘ └───────────┘ └──────────┘ │
│ web|desktop template|ocr openai|claude │
│ feature|hybrid ollama │
├─────────────────────────────────────────────┤
│ Models (Pydantic v2) │ Config (Settings) │
└─────────────────────────────────────────────┘
All modules follow an ABC + plugin registry pattern: extend the base class, register it in __init__.py, and the new plugin is available.
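As a rough picture of what an ABC + registry pattern looks like in Python, here is a self-contained sketch. The names `BaseMatcher`, `MATCHERS`, `register`, `ExactMatcher`, and `get_matcher` are hypothetical and chosen for illustration; AWT's real base classes and registries may be shaped differently.

```python
from abc import ABC, abstractmethod


class BaseMatcher(ABC):
    """Abstract base class that every matcher plugin extends (hypothetical)."""

    name: str  # registry key, set by each subclass

    @abstractmethod
    def match(self, expected: str, actual: str) -> bool:
        """Return True when `actual` satisfies `expected`."""


# The registry maps a plugin name to its class.
MATCHERS: dict[str, type[BaseMatcher]] = {}


def register(cls: type[BaseMatcher]) -> type[BaseMatcher]:
    """Class decorator that adds a matcher to the registry under cls.name."""
    MATCHERS[cls.name] = cls
    return cls


@register
class ExactMatcher(BaseMatcher):
    name = "exact"

    def match(self, expected: str, actual: str) -> bool:
        return expected == actual


def get_matcher(name: str) -> BaseMatcher:
    """Instantiate a registered matcher, or raise ValueError for unknown names."""
    try:
        return MATCHERS[name]()
    except KeyError:
        raise ValueError(f"unknown matcher: {name!r}") from None
```

The benefit of this design is that the orchestrator only ever looks plugins up by name, so adding a new engine, matcher, or adapter does not require touching the code that uses it.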
Development
Prerequisites
- Python 3.11+
- Tesseract OCR: brew install tesseract / apt install tesseract-ocr
- Git
Make Commands
| Command | Description |
|---|---|
| make dev | Install all deps + Playwright + pre-commit |
| make lint | ruff check |
| make format | ruff format + auto-fix |
| make typecheck | mypy strict |
| make test | pytest |
| make test-cov | pytest + coverage report |
| make clean | Remove caches and build artifacts |
Contributing
Contributions are welcome! See CONTRIBUTING.md for:
- Development environment setup
- Code style (ruff + mypy strict)
- Test writing guidelines
- Pull request process
- Plugin development (adding new engines, matchers, or AI adapters)
git checkout -b feat/my-feature
# make changes
make format && make lint && make typecheck && make test
git commit -m "feat(scope): description"
# open PR
Documentation
| Document | Description |
|---|---|
| Quick Start Guide | Install, configure, run your first test |
| API Reference | REST API + WebSocket documentation |
| Comparison | AWT vs Playwright, Cypress, Testim, Katalon, Mabl |
| FAQ | Common questions |
| CI/CD Guide | Pipeline integration (GitHub Actions, GitLab CI) |
| Cloud Backend | Self-hosting the cloud backend |
FAQ
What is AWT?
AWT (AI Watch Tester) is an open-source, AI-powered E2E testing tool. You give it a URL, and it automatically generates test scenarios, executes them in a real browser (Playwright), and reports results — no test code required.
How do I install it?
Cloud (no install): Visit ai-watch-tester.vercel.app and enter a URL.
Local:
pip install aat-devqa
playwright install chromium
aat dashboard
From source:
git clone https://github.com/ksgisang/AI-Watch-Tester.git
cd AI-Watch-Tester
make dev && aat dashboard
Which AI providers are supported?
| Provider | Models | Cost |
|---|---|---|
| OpenAI | gpt-4o, gpt-4o-mini | Pay-per-use |
| Anthropic | Claude Sonnet 4 | Pay-per-use |
| Ollama | codellama, llama3, mistral | Free (local GPU) |
Cloud users can bring their own API key (BYOK) via the Settings page. Keys are Fernet-encrypted at rest.
How much does it cost?
| Plan | Price | Tests/month | Concurrent |
|---|---|---|---|
| Free | $0 | 5 | 1 |
| Pro | $28.99/mo | 100 | 3 |
| Team | $98.99/mo | 500 | 10 |
The open-source local mode is completely free with no limits — you just need your own AI API key.
Is it open source?
Yes. AWT is licensed under the MIT License — free for personal and commercial use. You can self-host, modify, and distribute it. Contributions are welcome!
Can I use it in CI/CD?
Yes. Pro and Team plans include API keys for CI/CD integration:
curl -X POST https://your-awt-server.com/api/v1/run \
-H "X-API-Key: awt_your_key" \
-H "Content-Type: application/json" \
-d '{"target_url": "https://staging.example.com"}'
See the CI/CD Guide for GitHub Actions and GitLab CI examples.
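The same request can be issued from a Python step in a pipeline. This sketch mirrors the curl call above (same endpoint path, headers, and JSON body) using only the standard library. The function names are hypothetical, and because the response format is not described here, the body is returned as raw text rather than parsed.

```python
import json
import urllib.request


def build_run_request(server: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build the POST /api/v1/run request shown in the curl example."""
    body = json.dumps({"target_url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server.rstrip('/')}/api/v1/run",
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def trigger_run(req: urllib.request.Request, timeout: float = 30.0) -> tuple[int, str]:
    """Send the request and return (HTTP status, raw response text).

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, which would
    fail the pipeline step unless the caller catches it.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

A pipeline step would call `trigger_run(build_run_request("https://your-awt-server.com", "awt_your_key", "https://staging.example.com"))`, with the key read from a CI secret rather than hard-coded.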
Is my data secure?
- All traffic is encrypted via HTTPS/TLS
- BYOK API keys are Fernet-encrypted (AES-128-CBC + HMAC-SHA256) at rest
- Screenshots are auto-deleted after 7 days
- Database hosted on Supabase (AWS Seoul region)
- See our Privacy Policy for full details
License
MIT — free for personal and commercial use.
Built with Playwright, OpenCV, and a lot of AI. Made by @ksgisang.