AI-driven UI automation testing framework with pluggable platform adapters.
vibe-tester
AI-driven UI automation testing for desktop and (soon) web apps — Cucumber-style tests, pluggable platform adapters, ships with the AI assets your coding agent needs to author and run them.
Status: alpha. Public API may change. The Windows desktop adapter is scaffolded but not yet implemented.
What it does
- Lets you describe a UI test in natural language (in Copilot Chat, Claude CLI, Cursor, …) and generates a runnable Gherkin `.feature` file using real element locators from your project's element store.
- Executes scenarios at any granularity (one feature, one app, everything, or a tag expression) and produces a Markdown report plus optional JSON output for the AI to parse.
- Walks your app interactively with you to record UI element paths into a YAML store the executor can resolve.
The framework ships AI assets (agents, skills, an AGENTS.md
template) and a deterministic CLI (vibe-tester). It does not
embed an LLM and does not run an MCP server: your AI tool of choice
provides the intelligence, and the CLI is the integration surface.
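Because the CLI is the integration surface, a script (or an agent's tool wrapper) can call it as a subprocess and parse the result. The following is a minimal sketch, not part of the framework: `run_json` is a hypothetical helper, and it assumes that a command run with `--json` prints a single JSON document to stdout. The shape of that document is not specified here, so the helper returns it untouched.

```python
import json
import subprocess
from typing import Any, Sequence


def run_json(argv: Sequence[str], timeout: float = 600.0) -> Any:
    """Run a command and parse its stdout as JSON (hypothetical helper).

    Assumption: the command prints exactly one JSON document to stdout.
    """
    completed = subprocess.run(
        list(argv),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # a failing test run may still print a valid report
    )
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"{argv[0]} exited with {completed.returncode} and did not print JSON:\n"
            f"{completed.stderr.strip()}"
        ) from exc
```

With the package installed, a call might look like `run_json(["vibe-tester", "list", "adapters", "--json"])`. The exit code is deliberately not treated as fatal, on the assumption that a run with failing scenarios can still emit a usable report.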
Do I need an AI agent?
No. The framework is a Cucumber/behave runner with a UI-automation adapter and a YAML element vocabulary — you can author and run tests entirely by hand. The shipped agents are productivity multipliers, not runtime dependencies.
| Capability | AI needed? |
|---|---|
| Run tests (`vibe-tester run …`) | No |
| Write `.feature` files by hand using `elements.yaml` | No |
| Element collection: basic capture (`vibe-tester collect …`) | No |
| Element collection: interactive "navigate to the next page" loop | Recommended (agent) |
| Per-SUT customizations (`hooks/handlers.py`, `hooks/steps.py`) | No |
| Visual regression baselines + assertions | No |
| `@setup:` / `@clean:` tag-driven scenario isolation | No |
| Markdown / JSON reports | No |
| Translating a natural-language request → `.feature` | Yes (Test Writer) |
| Structured root-cause analysis on a failed scenario | Yes (Test Debugger) |
| Auto-proposing `@clean:` tags + handler stubs from element `role:` | Yes (Test Writer) |
| Detecting unmapped step phrases + scaffolding custom-step stubs | Yes (Test Writer) |
Bottom line: CLI + framework run standalone. The agents add
natural-language authoring and structured failure triage. If you don't
have Copilot / Claude CLI / Cursor available, skip the .github/agents/
prompts and write .feature files directly — every step phrase the
runner accepts is documented in the uia-assertions and
element-locators skill files (also shipped to your project, plain
Markdown, readable without an LLM).
Install
```shell
# default: every adapter that ships today
pip install vibe-tester
# pick one (smaller install); the quotes keep shells such as zsh from globbing the brackets
pip install "vibe-tester[windows-desktop]"
# pick several
pip install "vibe-tester[windows-desktop,web]"
```
| Extra | Drives | Status |
|---|---|---|
| `windows-desktop` | WinUI3 / Win32 / WPF / WebView2 / tray / shell menu | In progress |
| `web` | Browser SUTs | Stub |
| `macos` | macOS-native SUTs | Stub |
Quickstart
```shell
# 1. Create a fresh test project (or scaffold into an existing folder)
mkdir my-app-tests
cd my-app-tests
vibe-tester init

# 2. Capture your first SUT (interactive; your app should be running)
vibe-tester collect --app my-app

# 3. Ask your AI agent (Copilot Chat / Claude CLI / …) to write a test:
#    "Write a smoke test that opens Settings and verifies the title."
#    The Test Writer agent uses elements.yaml + the framework's CLI.

# 4. Run it
vibe-tester run --app my-app
```
After step 1 your project looks like:
```
my-app-tests/
├── AGENTS.md                 # AI instructions for this project
├── .github/
│   ├── agents/               # element-collector, test-writer, test-runner, test-debugger
│   └── skills/               # element-locators, uia-assertions, image-testing, failure-diagnosis
└── features/
    ├── environment.py        # framework glue; do not edit
    └── steps/
        └── _framework.py     # framework glue; do not edit
```
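If you want to confirm that the scaffold landed as described, for example in CI before running anything, a small check is enough. This is a sketch under the assumption that the layout is exactly the tree above; `EXPECTED_AFTER_INIT` and `missing_scaffold_paths` are hypothetical names, not part of the framework.

```python
from pathlib import Path

# Paths taken from the tree above; adjust if your scaffold differs.
EXPECTED_AFTER_INIT = (
    "AGENTS.md",
    ".github/agents",
    ".github/skills",
    "features/environment.py",
    "features/steps/_framework.py",
)


def missing_scaffold_paths(project_root: Path) -> list[str]:
    """Return the expected scaffold paths that do not exist under project_root."""
    return [rel for rel in EXPECTED_AFTER_INIT if not (project_root / rel).exists()]
```

An empty list from `missing_scaffold_paths(Path("my-app-tests"))` means every path in the tree is present.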
After step 2 a SUT subfolder is added:
```
features/
└── my-app/
    ├── elements.yaml         # the element vocabulary your tests use
    ├── *.feature             # Gherkin tests (the AI writes these)
    ├── baselines/            # visual regression PNGs (optional)
    └── hooks/
        ├── environment.py    # per-SUT cleanup (optional)
        └── steps.py          # per-SUT custom step defs (optional)
```
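The `.feature` files in a SUT folder carry the framework's tags (`@app:`, `@setup:`, `@clean:`). When writing features by hand it can help to see which tags a file declares. The sketch below assumes the tags take the form `@prefix:value` with no whitespace inside, which is a guess at the syntax rather than something documented here; `tags_in_feature` is a hypothetical helper, and the tag values in the usage note are invented for illustration.

```python
import re
from collections import defaultdict

# Assumed tag shape: "@app:<value>", "@setup:<value>", "@clean:<value>".
TAG_RE = re.compile(r"@(app|setup|clean):(\S+)")


def tags_in_feature(text: str) -> dict[str, list[str]]:
    """Collect @app:/@setup:/@clean: tag values from Gherkin source text."""
    found: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped.startswith("@"):
            continue  # Gherkin tags sit on their own lines, which start with '@'
        for prefix, value in TAG_RE.findall(stripped):
            found[prefix].append(value)
    return dict(found)
```

For a file that begins with `@app:my-app` and tags one scenario with `@clean:settings`, the function returns `{"app": ["my-app"], "clean": ["settings"]}`. Text that merely looks like a tag inside a step line is ignored, because only lines starting with `@` are scanned.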
CLI reference
| Command | What it does |
|---|---|
| `vibe-tester init [--target] [--adapter] [--overwrite] [--json]` | Scaffold a project from shipped assets |
| `vibe-tester list adapters [--json]` | Show installed adapters |
| `vibe-tester list features [--json]` | List `.feature` files and their `@app:` tag |
| `vibe-tester list elements --app <name> [--details] [--json]` | Print the element vocabulary for one SUT |
| `vibe-tester collect --app <name> [--kind]` | Interactive element capture |
| `vibe-tester run [--feature\|--app\|--tag] [--scenario] [--json]` | Execute behave + emit Markdown / JSON report |
All commands accept --json for machine-readable output (intended for
the AI agent to parse). Default output is human-friendly Rich tables
and Markdown reports under ./results/.
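The `run` row above lists the selectors as `[--feature|--app|--tag]`, which reads as "pick at most one". A wrapper that builds the argument list can enforce that before the CLI is ever invoked. This is a sketch with a hypothetical function name; it assumes each selector takes a single string value (only `--app my-app` is shown in this README) and that omitting all three runs everything.

```python
from typing import Optional


def build_run_command(
    *,
    feature: Optional[str] = None,
    app: Optional[str] = None,
    tag: Optional[str] = None,
    scenario: Optional[str] = None,
    as_json: bool = False,
) -> list[str]:
    """Build an argv list for `vibe-tester run` (hypothetical helper)."""
    selectors = {"--feature": feature, "--app": app, "--tag": tag}
    chosen = [(flag, value) for flag, value in selectors.items() if value is not None]
    if len(chosen) > 1:
        # Assumption: the "|" in the usage line means the selectors are exclusive.
        raise ValueError("pick at most one of --feature, --app, --tag")

    argv = ["vibe-tester", "run"]
    for flag, value in chosen:
        argv += [flag, value]
    if scenario is not None:
        argv += ["--scenario", scenario]
    if as_json:
        argv.append("--json")
    return argv
```

`build_run_command(app="my-app", as_json=True)` yields `["vibe-tester", "run", "--app", "my-app", "--json"]`, which can be passed directly to `subprocess.run`.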
How the AI assets work
vibe-tester init drops four agents and four skills into .github/
plus an AGENTS.md at the project root. Any AI coding tool that
follows the AGENTS.md convention — Copilot,
Claude CLI, Cursor, etc. — will pick them up automatically.
Agents (one each):
| Agent | Use when |
|---|---|
| Element Collector | Adding a new SUT or new pages to an existing one |
| Test Writer | Authoring .feature files from a natural-language ask |
| Test Runner | Executing tests and producing a Markdown report |
| Test Debugger | A test failed and you want a structured RCA |
Skills:
| Skill | Topic |
|---|---|
| element-locators | Locator syntax, dot-notation, element store schema |
| uia-assertions | All assertion types the framework supports |
| image-testing | Visual regression / baseline strategy |
| failure-diagnosis | RCA methodology + known-issues catalog |
Architecture (one paragraph)
A user project has one element store (elements.yaml) per SUT.
Its app.kind (e.g. windows-desktop) tells the executor which
adapter to use. The CLI dispatches to that adapter for collect /
launch / click / screenshot operations; the core layer is
adapter-agnostic and never imports an adapter directly. New platforms
plug in by adding a sub-package under
vibe_tester/adapters/. See
doc/design/architecture.md for the full
picture.
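The dispatch described above, where the core picks an adapter from `app.kind` without importing any adapter itself, is commonly done with a registry. The sketch below illustrates that pattern only; it is not vibe-tester's actual code, and every name in it (`Adapter`, `register_adapter`, `adapter_for`, `FakeDesktopAdapter`) is hypothetical.

```python
from typing import Callable, Dict, Protocol


class Adapter(Protocol):
    """The operations the core layer expects from any platform adapter."""

    def launch(self) -> str: ...
    def click(self, locator: str) -> str: ...


# Core-side registry: maps an app.kind string to an adapter factory.
_REGISTRY: Dict[str, Callable[[], Adapter]] = {}


def register_adapter(kind: str, factory: Callable[[], Adapter]) -> None:
    """Called by an adapter package to make itself available to the core."""
    _REGISTRY[kind] = factory


def adapter_for(kind: str) -> Adapter:
    """Resolve app.kind to an adapter instance; the core never names a class."""
    try:
        factory = _REGISTRY[kind]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY)) or "none"
        raise LookupError(
            f"no adapter registered for app.kind={kind!r} (installed: {known})"
        ) from None
    return factory()


class FakeDesktopAdapter:
    """Stand-in adapter used only to exercise the registry."""

    def launch(self) -> str:
        return "launched"

    def click(self, locator: str) -> str:
        return f"clicked {locator}"


register_adapter("windows-desktop", FakeDesktopAdapter)
```

With this in place, `adapter_for("windows-desktop").click("settings.title")` returns `"clicked settings.title"`, while an unregistered kind raises `LookupError` naming the kinds that are installed. The core depends only on the `Adapter` protocol, so adding a platform means registering one more factory.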
Contributing
This repo is the framework itself. See AGENTS.md for dev-context guidance (rules, layout, common tasks). Bug reports and PRs welcome at https://github.com/Haroldlei/vibe-tester.
License: MIT.