LLM-driven BDD test authoring for Robot Framework — turn an intention + live app into a .robot suite

Maintainers

bayeslearner

Project description

aitester-bdd

📖 Docs: bayeslearner.github.io/aitester-bdd · 📦 PyPI: aitester-bdd

A general end-to-end testing framework for any web app — where you describe the test in English and an agent writes a real, runnable test file by actually using your app.

Not bolted onto a specific product, framework, or stack. Point it at example.com, your own SaaS, an internal SPA, a customer site — same library, same workflow.

The pitch

You built a web app. Every time you change something you click through the same flows in a browser to check it still works. You know automating this is called end-to-end testing, but every time you've looked into it you've found yourself wiring up Playwright, learning a test framework, and writing a hundred lines of code to check what your eyes could check in ten seconds.

aitester-bdd is the missing middle: describe what should happen in English, get back a real .robot test file. Run it as many times as you want; pinned lines need no LLM at runtime, so reruns are free.

pip install aitester-bdd
aitester init-browser   # one-time Playwright setup

aitester author \
  --story "Open the homepage, search for 'BDD', confirm the article heading appears and a paragraph mentions BDD." \
  --base-url https://en.wikipedia.org \
  --out wiki_test.robot

aitester run wiki_test.robot

About a minute later you have wiki_test.robot — checked-in, human-readable, runs in ~3 seconds with no LLM in the loop.
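To script the same two steps from Python (for example in a CI helper), a minimal sketch follows. It only builds the argument lists for the `aitester author` and `aitester run` commands shown above and hands them to `subprocess`. The function names `author_command`, `run_command`, and `author_then_run` are hypothetical helpers, not part of aitester-bdd.

```python
import shlex
import subprocess


def author_command(story: str, base_url: str, out: str) -> list[str]:
    """Build the argv for `aitester author`, using only the flags shown above."""
    return ["aitester", "author", "--story", story, "--base-url", base_url, "--out", out]


def run_command(suite: str) -> list[str]:
    """Build the argv for `aitester run`."""
    return ["aitester", "run", suite]


def author_then_run(story: str, base_url: str, out: str) -> int:
    """Author a suite, then run it; returns the exit code of the run step."""
    # check=True makes subprocess raise CalledProcessError on a non-zero exit.
    subprocess.run(author_command(story, base_url, out), check=True)
    return subprocess.run(run_command(out)).returncode


# Preview the commands without executing anything.
cmd = author_command(
    "Open the homepage, search for 'BDD', confirm the article heading appears.",
    "https://en.wikipedia.org",
    "wiki_test.robot",
)
print(shlex.join(cmd))
print(shlex.join(run_command("wiki_test.robot")))
```

Calling `author_then_run(...)` would actually invoke the CLI, so it is left out of the preview.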

Two kinds of test line, one file

Most lines in an authored suite are pinned — strict, deterministic, no LLM at runtime:

Given I am on "/"
When I fill "input[name='search']" with "BDD"
And I click "button.cdx-search-input__end-button"
Then I see "Behavior-driven development"

But some parts of an app are too volatile to pin (AI chat replies, dynamic dashboards, search results). For those, you drop in a fluid line:

When I explore "ask the chat 'what does the pro plan cost' and check the answer mentions a dollar amount"

At runtime that one line spins up a small LLM agent that uses the same browser (same tab, same cookies, same login session), does the semantic check, and returns pass/fail. Use it sparingly — the rest of your suite stays cheap.

You choose per line whether you want strictness or flexibility.
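To see how much of a suite depends on an LLM at runtime, the lines can be classified by keyword. The sketch below is a hypothetical helper, not part of aitester-bdd: it assumes a fluid line is any step whose keyword begins with `I explore`, as in the examples above, and treats everything else as pinned.

```python
from collections import Counter

BDD_PREFIXES = {"Given", "When", "Then", "And", "But"}


def line_kind(line: str) -> str:
    """Classify one suite line as 'fluid' or 'pinned' by its keyword."""
    words = line.strip().split()
    # Drop the leading BDD prefix if present.
    if words and words[0] in BDD_PREFIXES:
        words = words[1:]
    return "fluid" if words[:2] == ["I", "explore"] else "pinned"


suite = [
    'Given I am on "/"',
    'When I fill "input[name=\'search\']" with "BDD"',
    'And I click "button.cdx-search-input__end-button"',
    'Then I see "Behavior-driven development"',
    'When I explore "ask the chat \'what does the pro plan cost\' and check the answer mentions a dollar amount"',
]

counts = Counter(line_kind(line) for line in suite)
print(counts["pinned"], "pinned,", counts["fluid"], "fluid")
```

A count like this gives a rough idea of how many steps per run will invoke an LLM.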

Promotion path: explore now, pin later

A third keyword does both at once — runs the story as a fluid test, and if it succeeds, writes a pinned .robot from the experience:

When I explore and author "log in, navigate to settings, change the theme to dark, verify it stuck" output=settings.robot

Use it when you're not sure a flow is stable enough to commit to a pinned test yet. Prototype with I explore, promote to pinned when the flow settles.
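The control flow described above (run the story as a fluid test, write a pinned file only on success) can be sketched as follows. Everything here is an illustrative assumption: `ExploreResult`, `explore_and_author`, the stub explorers, and the selectors are hypothetical, and the sketch writes bare step lines rather than reproducing the real .robot file layout.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class ExploreResult:
    passed: bool
    steps: list[str] = field(default_factory=list)  # pinned lines observed during the run


def explore_and_author(
    story: str, explore: Callable[[str], ExploreResult], output: Path
) -> ExploreResult:
    """Run the story as a fluid test; write pinned steps only if it passed."""
    result = explore(story)
    if result.passed:
        output.write_text("\n".join(result.steps) + "\n", encoding="utf-8")
    return result


# Stubs standing in for the LLM-driven exploration (hypothetical selectors).
def fake_pass(story: str) -> ExploreResult:
    return ExploreResult(True, ['Given I am on "/settings"', 'When I click "#theme-dark"'])


def fake_fail(story: str) -> ExploreResult:
    return ExploreResult(False)


with tempfile.TemporaryDirectory() as tmp:
    ok_path = Path(tmp) / "settings.robot"
    bad_path = Path(tmp) / "broken.robot"
    explore_and_author("change the theme to dark", fake_pass, ok_path)
    explore_and_author("change the theme to dark", fake_fail, bad_path)
    wrote_on_pass = ok_path.exists()
    wrote_on_fail = bad_path.exists()

print(wrote_on_pass, wrote_on_fail)
```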

How it works

Authoring (one-time, ~30s–2min, uses an LLM).

A DeepAgents + LangGraph agent (Claude by default, any OpenAI-compatible model works) reads your story, opens your app in a real browser, takes structured snapshots of every interactive element on the page, picks one, acts on it, snapshots the new state, and loops. Every selector it writes is one it touched during exploration — never guessed from the story. If the app is too broken to test (login dead, page won't load), it writes a markdown bug report to triage/ instead of inventing a fake test.
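The snapshot, pick, act, loop cycle can be illustrated with a toy model. This is a sketch under stated assumptions, not the project's implementation: `FakeApp` stands in for a live page, `choose` stands in for the LLM's decision, and the selectors are invented. The triage path for broken apps is not modeled. The point it shows is the invariant from the paragraph above: a selector is recorded only if it appeared in a snapshot and was acted on.

```python
from typing import Callable


class FakeApp:
    """Toy stand-in for a live page: each state exposes selectors that lead to a next state."""

    def __init__(self) -> None:
        self.state = "home"
        self.pages = {
            "home": {"input[name='q']": "typed"},
            "typed": {"button[type='submit']": "results"},
            "results": {},
        }

    def snapshot(self) -> list[str]:
        """Return the interactive selectors visible in the current state."""
        return list(self.pages[self.state])

    def act(self, selector: str) -> None:
        self.state = self.pages[self.state][selector]


def author(app: FakeApp, choose: Callable[[list[str]], str], max_steps: int = 10) -> list[str]:
    """Snapshot, pick one element, act, and loop; return the selectors actually touched."""
    touched: list[str] = []
    for _ in range(max_steps):
        elements = app.snapshot()
        if not elements:
            break  # nothing left to interact with
        selector = choose(elements)
        if selector not in elements:
            # Refuse selectors that were not observed on the page.
            raise ValueError(f"{selector!r} was not in the snapshot")
        app.act(selector)
        touched.append(selector)
    return touched


app = FakeApp()
touched = author(app, choose=lambda elements: elements[0])
print(touched)
```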

Running (every time, ~seconds, no LLM for pinned lines).

aitester run is just Robot Framework executing the suite. Run it 1,000 times in CI and the pinned portion costs nothing in LLM calls. Fluid I explore lines pay for an LLM call at runtime; that is the explicit trade you opted into when you wrote them.

Failures get a diagnosis.

When a rule fails, an aspect hands the trajectory (what was clicked, page state before and after) to an LLM and asks "is this a test bug or an app bug?" You get a short natural-language explanation attached to the failure — not a raw stack trace.
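One way to picture a diagnose aspect is as a decorator around each step. The sketch below is an assumption about the general pattern, not the project's code: `diagnosed`, `diagnose`, and `see_text` are hypothetical names, and `diagnose` is a stub where a real implementation would send the trajectory to an LLM.

```python
import functools

trajectory: list[dict] = []  # record of every step attempted so far


def diagnose(failure: Exception, steps: list[dict]) -> str:
    """Stub for the LLM call; only formats what would be sent for classification."""
    last = steps[-1] if steps else {}
    return f"step {last.get('step')!r} failed ({failure}); classify as test bug or app bug"


def diagnosed(step_fn):
    """Wrap a step so that an assertion failure carries a diagnosis."""

    @functools.wraps(step_fn)
    def wrapper(*args, **kwargs):
        trajectory.append({"step": step_fn.__name__, "args": args})
        try:
            return step_fn(*args, **kwargs)
        except AssertionError as exc:
            explanation = diagnose(exc, trajectory)
            raise AssertionError(f"{exc}\nDiagnosis: {explanation}") from exc

    return wrapper


@diagnosed
def see_text(page_text: str, expected: str) -> None:
    assert expected in page_text, f"expected {expected!r} on page"


try:
    see_text("Welcome", "Dashboard")
except AssertionError as err:
    message = str(err)
    print(message)
```

Keeping the diagnosis in a wrapper means the step functions themselves stay free of failure-handling code.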

Quick speed reference

The timings below are for authoring with headless DeepAgents on Claude Opus 4.7. Typical wall time:

Story	Steps	Wall time
example.com smoke (heading + link)	9	~27s
en.wikipedia.org search + article check (5 assertions)	27	~70s
Real SPA login + chat + multi-rule verification	50–80	2–3 min

The agent batches multiple browser ops per LLM round-trip, so most of the remaining wall time is bound by the system under test (SUT), for example waiting for the app's own LLM to stream a response, rather than by authoring overhead.
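As a rough reading of the table, wall time per agent step can be worked out directly. The figures are the approximate ones from the table; the bounds for the SPA row assume steps and wall time vary independently across the stated ranges, which is an assumption of this sketch.

```python
# (agent steps, wall time in seconds) from the table; "~" values are approximate.
measured = {
    "example.com smoke": (9, 27),
    "en.wikipedia.org search + article check": (27, 70),
}
per_step = {
    story: round(seconds / steps, 1) for story, (steps, seconds) in measured.items()
}

# The SPA row is a range: 50-80 steps in 2-3 minutes.
spa_fastest = round(120 / 80, 1)  # shortest time over the most steps
spa_slowest = round(180 / 50, 1)  # longest time over the fewest steps

for story, value in per_step.items():
    print(f"{story}: {value} s/step")
print(f"real SPA: {spa_fastest} to {spa_slowest} s/step")
```

On these numbers each step takes on the order of a few seconds across all three rows.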

Backends

Set AITESTER_BROWSER=<backend> to pick the driver at runtime:

Backend	Default?	Setup	Best for
`playwright`	✓	`aitester init-browser` once	consistent engine for pinned + fluid, reliable text reads, native Playwright waits, in-process speed
`agent-browser`		`npm i -g agent-browser`	zero install friction, same CLI used during authoring
`nodriver`		`pip install aitester-bdd[stealth]` + Edge/Chrome	bot-detected sites (DataDome / Cloudflare Bot Management)

The same .robot file runs on any of the three backends because every selector is a CSS selector. With the default playwright backend, pinned and fluid lines share one in-process browser session.
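A wrapper script that wants to validate the setting before launching a run might resolve it as below. This is a minimal sketch: the three names and the `playwright` default come from the table, while `pick_backend` is a hypothetical helper and the handling of unknown or empty values is an assumption, not documented aitester behavior.

```python
import os
from typing import Mapping, Optional

KNOWN_BACKENDS = ("playwright", "agent-browser", "nodriver")


def pick_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the backend name from AITESTER_BROWSER, defaulting to playwright."""
    env = os.environ if env is None else env
    name = env.get("AITESTER_BROWSER", "").strip() or "playwright"
    if name not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend {name!r}; expected one of {KNOWN_BACKENDS}")
    return name


print(pick_backend({}))
print(pick_backend({"AITESTER_BROWSER": "nodriver"}))
```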

Status

Alpha. Verified end-to-end on public sites (example.com, en.wikipedia.org, the-internet.herokuapp.com) and on a real internal SPA (login + chat + tool-rendering verification).

Architecture, one paragraph

The LLM is the author, not the runtime — except where you explicitly ask for it. Authoring drives the live target via Playwright, snapshots real DOM, and emits a .robot file. Runtime is Robot Framework executing pinned rules deterministically; fluid I explore rules invoke an LLM at runtime against the same browser session as the pinned rules. Failures fire an AOP diagnose aspect that produces a natural-language explanation. Backends are pluggable; the walker is engine-agnostic.

License

MIT


Release history

Version	Released
0.4.2 (this version)	Jul 7, 2026
0.4.1	Jul 7, 2026
0.4.0	Jul 7, 2026
0.3.0	Jun 2, 2026
0.2.0	May 16, 2026

Download files

Source Distribution

aitester_bdd-0.4.2.tar.gz (415.6 kB), uploaded Jul 7, 2026, Source

Built Distribution

aitester_bdd-0.4.2-py3-none-any.whl (131.4 kB), uploaded Jul 7, 2026, Python 3

File details

Details for the file aitester_bdd-0.4.2.tar.gz.

File metadata

Download URL: aitester_bdd-0.4.2.tar.gz
Upload date: Jul 7, 2026
Size: 415.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for aitester_bdd-0.4.2.tar.gz
Algorithm	Hash digest
SHA256	`401e6f75463e0895c982159451e9b6d6c681ee29f4698f92d57be25f55cd69a5`
MD5	`7e686ef40cfcf592bdc400845de11c50`
BLAKE2b-256	`9a8aafdf28e21dd6f496195c691315b04446bc388a4bac9b7773693f4d95976c`
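A downloaded file can be checked against the published SHA256 digest with the standard library. In this sketch `sha256_of` and `matches_published_hash` are hypothetical helper names; the expected digest is the one listed above for the source distribution, and the self-check at the end uses a temporary file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path

# SHA256 listed above for aitester_bdd-0.4.2.tar.gz
EXPECTED_SHA256 = "401e6f75463e0895c982159451e9b6d6c681ee29f4698f92d57be25f55cd69a5"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected.lower()


# Self-check on a temporary file with known content.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"abc")
    demo_digest = sha256_of(sample)
    demo_matches = matches_published_hash(sample)

print(demo_digest)
```

For the real archive, the call would be `matches_published_hash(Path("aitester_bdd-0.4.2.tar.gz"))` after downloading it.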


Provenance

The following attestation bundles were made for aitester_bdd-0.4.2.tar.gz:

Publisher: publish-pypi.yml on bayeslearner/aitester-bdd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aitester_bdd-0.4.2.tar.gz
- Subject digest: 401e6f75463e0895c982159451e9b6d6c681ee29f4698f92d57be25f55cd69a5
- Sigstore transparency entry: 2104705418
- Sigstore integration time: Jul 7, 2026
Source repository:
- Permalink: bayeslearner/aitester-bdd@269a493f7000b63e323928cbbf75da44fd130152
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/bayeslearner
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@269a493f7000b63e323928cbbf75da44fd130152
- Trigger Event: push

File details

Details for the file aitester_bdd-0.4.2-py3-none-any.whl.

File metadata

Download URL: aitester_bdd-0.4.2-py3-none-any.whl
Upload date: Jul 7, 2026
Size: 131.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for aitester_bdd-0.4.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4d744a4efc93f7b52c4cfed1f9b60d01c72203e749373e9a72319ed43ffab70d`
MD5	`e3f639f206382ad7c044b6665a5f0440`
BLAKE2b-256	`668a92d6c1b1b2f69673c2546e2be154131f349eaafe71491540ea1e4ed9316e`


Provenance

The following attestation bundles were made for aitester_bdd-0.4.2-py3-none-any.whl:

Publisher: publish-pypi.yml on bayeslearner/aitester-bdd

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aitester_bdd-0.4.2-py3-none-any.whl
- Subject digest: 4d744a4efc93f7b52c4cfed1f9b60d01c72203e749373e9a72319ed43ffab70d
- Sigstore transparency entry: 2104705569
- Sigstore integration time: Jul 7, 2026
Source repository:
- Permalink: bayeslearner/aitester-bdd@269a493f7000b63e323928cbbf75da44fd130152
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/bayeslearner
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@269a493f7000b63e323928cbbf75da44fd130152
- Trigger Event: push
