AI-driven browser test automation framework

Skiritai

AI-Powered Test Automation Agent

Named after the Skiritai — Sparta's elite reconnaissance troops who scouted the path ahead of the main army.


Requires Python 3.11+ and Playwright. License: MIT.


What is Skiritai?

Skiritai is an AI-driven browser test automation framework that scouts automation paths before executing them.

Like the ancient Skiritai who reconnoitered the terrain before the Spartan army advanced, Skiritai's agent first explores the target application (navigating pages, discovering UI elements, and working out the correct sequence of actions), then generates replayable scripts that execute the same path roughly 30x faster, with no AI inference.

Explore Mode (Scout the path)
  AI Agent → analyze page → decide actions → generate scripts
         ↓
Replay Mode (Execute the proven path)
  Script → direct execution → no AI needed → 30x faster

Key Features

| Feature | Description |
| --- | --- |
| Explore → Replay Loop | AI explores and generates scripts on first run; replays them instantly on subsequent runs |
| 30x Performance | Replay mode skips AI inference entirely: 74s → 2.5s in the example run |
| Python-native Cases | Define test cases as Python classes with decorators |
| Auto-Solidification | Successful explorations are automatically saved as replayable scripts |
| Multi-level Fallback | fill → click_force → eval_js for resilient element interaction |
| Flexible LLM | Supports OpenAI, Anthropic, Qwen, and any compatible API |
| Optional Web UI | FastAPI backend with REST + WebSocket for external frontends |
| Visual Reports | Standalone HTML report built with Vue 3 + Ant Design, with screenshots, assertions, and step details |
| CLI | skiritai run/serve/list/browser commands |
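
The multi-level fallback idea (fill → click_force → eval_js) can be pictured as an ordered chain of strategies, where each level is tried only if the previous one raised. The sketch below illustrates that pattern and is not Skiritai's actual implementation: `first_success` and the three stand-in coroutines are hypothetical names, and real strategies would call Playwright instead of raising canned errors.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

# Hypothetical type: a strategy is a label plus an async callable.
Strategy = tuple[str, Callable[[], Awaitable[None]]]


async def first_success(strategies: Sequence[Strategy]) -> str:
    """Try each strategy in order; return the name of the first that succeeds."""
    errors: list[str] = []
    for name, attempt in strategies:
        try:
            await attempt()
            return name
        except Exception as exc:  # fall through to the next level
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


# Stand-ins for real element interactions (assumptions for illustration only).
async def fill() -> None:
    raise TimeoutError("element not editable")


async def click_force() -> None:
    raise TimeoutError("element detached")


async def eval_js() -> None:
    return None  # pretend the JavaScript fallback worked


def demo() -> str:
    chain = [("fill", fill), ("click_force", click_force), ("eval_js", eval_js)]
    return asyncio.run(first_success(chain))
```

Collecting the error from every failed level, rather than only the last one, keeps the final exception useful when all three levels fail.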

How It Works

from skiritai import BaseCase, step_mode


class SearchTest(BaseCase):
    async def setup(self):
        await self.launch_browser()

    async def teardown(self):
        await self.close_browser()

    async def open_site(self):
        await self.ai.action("Navigate to https://example.com")

    @step_mode("explore")  # Force AI exploration for this step
    async def search(self):
        await self.ai.action("Search for 'automation testing'")

    async def verify(self):
        await self.ai.action("Verify search results are displayed")

First run — AI explores each step, generates scripts:

[Step] open_site   (explore)  → 20s  → scripts/open_site.py   ✓
[Step] search      (explore)  → 30s  → scripts/search.py      ✓
[Step] verify      (explore)  → 24s  → scripts/verify.py      ✓
Total: 74s

Second run — scripts replay directly, no AI:

[Step] open_site   (replay)   → 0.8s → direct execution       ✓
[Step] search      (replay)   → 0.8s → direct execution       ✓
[Step] verify      (replay)   → 0.8s → direct execution       ✓
Total: 2.5s
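
The 30x figure follows directly from the two totals above. A quick check, using only the timings printed in the example runs:

```python
# Per-step explore timings and totals from the two example runs, in seconds.
explore_steps = [20, 30, 24]
explore_total = sum(explore_steps)  # matches the reported total of 74s
replay_total = 2.5

speedup = explore_total / replay_total
print(f"Replay is about {speedup:.1f}x faster")  # 74 / 2.5 = 29.6
```

The ratio depends on how long the LLM takes during exploration, so other cases may see a different multiple.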

Quick Start

1. Install

pip install skiritai
playwright install chromium

2. Configure

# .env
OPENAI_API_KEY=your-api-key
OPENAI_BASE_URL=https://api.openai.com/v1
LLM_MODEL=gpt-4o
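
Before a first run it can help to confirm that these settings are visible to the process. The sketch below checks for them with the standard library only. It makes two assumptions: that all three variables from the example are expected, and that they are already exported into the environment. This README does not show how Skiritai itself loads `.env`, and `missing_settings` is a hypothetical helper rather than part of the package.

```python
import os

# Variable names taken from the .env example above.
EXPECTED = ("OPENAI_API_KEY", "OPENAI_BASE_URL", "LLM_MODEL")


def missing_settings(env=None) -> list[str]:
    """Return the names of expected settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("LLM settings look complete")
```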

3. Run

# Run an example case
skiritai run examples/tutorial/minimal

# List available cases
skiritai list examples/

Or programmatically:

import asyncio
from pathlib import Path
from skiritai import run_case

report = asyncio.run(run_case(Path("examples/tutorial/minimal")))
print(report)

4. (Optional) Start Web Server

pip install "skiritai[web]"  # quotes keep shells such as zsh from expanding the brackets
skiritai serve --host 0.0.0.0 --port 8000

Project Structure

skiritai/
├── core/                      # Core engine (always installed)
│   ├── agent_loop.py          # LangGraph ReAct Agent
│   ├── ai_context.py          # Explore/Replay execution context
│   ├── base_case.py           # Test case base class
│   ├── runner.py              # Case discovery and execution
│   ├── tools.py               # Playwright tool set (14 tools)
│   ├── browser.py             # Browser lifecycle management
│   └── ...
├── llm/                       # LLM provider abstraction
│   ├── openai_provider.py
│   └── anthropic_provider.py
├── events/                    # Event bus
├── web/                       # [optional] FastAPI server (pip install skiritai[web])
│   ├── app.py                 # Application factory
│   ├── routers/               # REST + WebSocket endpoints
│   └── ws_manager.py          # Event → WebSocket bridge
└── cli.py                     # CLI entry point

report/                        # Visual report project (Vue 3 + Ant Design)
├── src/                       #   Components: ReportHeader, SummaryBar, StepCard, ScreenshotViewer
├── dist/                      #   Build output (single-file HTML, data injected by _render_html)
└── package.json               #   skiritai-report

examples/                      # Sample test cases
├── tutorial/                  # Teaching examples (learn framework features)
│   ├── minimal/               #   Pure Playwright, no AI needed
│   ├── step_modes/            #   auto/explore/replay execution modes
│   ├── failure_policies/      #   ABORT/SKIP/RETRY failure strategies
│   ├── hooks_demo/            #   before_step/after_step/on_step_error hooks
│   └── context_demo/          #   Cross-step context sharing via self.ctx
├── baidu_search/              # [First Try] Full E2E AI-driven test + replay scripts
└── ktovoz_blog/               # [Advanced] 11-step long-range blog test

tests/                         # Framework tests
├── unit/
├── functional/
├── acceptance/
└── e2e/

Examples

Examples are organized into three tiers:

Teaching (learn framework features)

| Example | What It Teaches |
| --- | --- |
| minimal/ | BaseCase structure: pure Playwright, no LLM required |
| step_modes/ | auto / explore / replay execution modes via @step_mode |
| failure_policies/ | @on_failure(SKIP) / @on_failure(RETRY) error handling |
| hooks_demo/ | before_step / after_step / on_step_error lifecycle hooks |
| context_demo/ | Cross-step data sharing with self.ctx.store |
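
To make the three failure strategies concrete, here is a small standalone sketch of how ABORT, SKIP, and RETRY could differ in a step runner. It is an illustration under assumptions, not Skiritai's code: `Policy`, `run_steps`, and `max_retries` are hypothetical names, and the framework's real semantics (for example its retry count) may differ.

```python
from enum import Enum
from typing import Callable


class Policy(Enum):  # hypothetical; mirrors the policy names only
    ABORT = "abort"
    SKIP = "skip"
    RETRY = "retry"


Step = tuple[str, Callable[[], None], Policy]


def run_steps(steps: list[Step], max_retries: int = 2) -> list[str]:
    """Run steps in order and return one outcome entry per step reached."""
    log: list[str] = []
    for name, step, policy in steps:
        attempts = 1 + (max_retries if policy is Policy.RETRY else 0)
        for attempt in range(attempts):
            try:
                step()
                log.append(f"{name}: passed")
                break
            except Exception:
                if attempt + 1 < attempts:
                    continue  # RETRY: run the same step again
                if policy is Policy.SKIP:
                    log.append(f"{name}: skipped")
                    break  # SKIP: record the failure and move on
                log.append(f"{name}: failed")
                return log  # ABORT, or retries exhausted: stop the case
    return log
```

In this sketch a RETRY step that never succeeds ends the case like ABORT does. That is one possible design choice, made here for simplicity.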

First Try (real-world end-to-end)

| Example | Description |
| --- | --- |
| baidu_search/ | Complete E2E: open Baidu → search → verify results. Demonstrates Explore→Replay loop in a real scenario. |

Advanced (long-range testing)

| Example | Description |
| --- | --- |
| ktovoz_blog/ | 11-step blog test: homepage, articles, tags, about, footer, search, summary. Demonstrates the framework's capability for complex multi-step scenarios. |

# Start with a teaching example (no AI needed)
skiritai run examples/tutorial/minimal

# Try a real-world test (needs LLM configured)
skiritai run examples/baidu_search

# Advanced long-range test
skiritai run examples/ktovoz_blog

Roadmap

Vision Perception Layer

Current AI exploration relies on DOM analysis and CSS selectors. The next evolution adds visual perception — the agent will "see" the page like a human tester, enabling:

  • Visual-based AI exploration — interpret screenshots, identify UI elements by appearance, and interact with canvas/WebGL-based interfaces that lack accessible DOM
  • Multimodal model support — leverage vision-language models (GPT-4o, Claude 3.5 Sonnet, Gemini) and native multimodal models for richer page understanding
  • Visual regression detection — compare screenshots across runs to catch unexpected UI changes

Multi-Platform Testing

Skiritai currently supports Web (Playwright/Chromium). We plan to extend to:

| Platform | Planned Approach | Status |
| --- | --- | --- |
| Mobile (iOS/Android) | Appium / browser-use mobile integration | Planned |
| API Testing | HTTP request tools for the AI agent | Planned |
| Desktop (Electron, native) | Playwright Electron / OS-level automation | Under investigation |

The goal is a unified test framework where the same Explore → Replay workflow works across Web, Mobile, and API — write once, test everywhere.


CLI Commands

skiritai run <case_dir>               # Run a test case
skiritai serve [--host] [--port]       # Start web server
skiritai list [cases_root]            # List available cases
skiritai browser status [case_dir]    # Check persistent browser session
skiritai browser cleanup [case_dir]   # Kill orphan browser process

Tool Set

14 Playwright tools available to the AI agent:

| Tool | Description |
| --- | --- |
| navigate | Navigate to URL |
| click | Click element |
| click_force | Force click (for hidden elements) |
| fill | Fill input field |
| type_text | Type character by character |
| focus | Focus on element |
| get_text | Get element text content |
| get_page_info | Get page title, URL, and text summary |
| wait_for | Wait for element to appear |
| scroll | Scroll page |
| eval_js | Execute JavaScript |
| select_option | Select dropdown option |
| hover | Hover over element |
| screenshot | Capture page screenshot |

Execution Modes

Control how each step executes via the mode parameter of ai.action() or the @step_mode decorator:

| Mode | Behavior | Use Case |
| --- | --- | --- |
| auto (default) | Replay if script exists, otherwise explore | Most steps |
| explore | Always use AI, overwrite existing script | New features, re-exploration |
| replay | Always replay, error if no script | CI/CD regression |

# Via decorator
@step_mode("explore")
async def my_step(self, ai):
    await ai.action("...")


# Via parameter (overrides decorator)
await ai.action("...", mode="replay")
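
The rules in the mode table, together with the precedence note above (the parameter overrides the decorator), can be summarized as a small decision function. This is a sketch of the documented behavior, not the framework's source: `resolve_mode` and `plan_step` are hypothetical names, and the script path is whatever file a step's exploration would have produced.

```python
from pathlib import Path
from typing import Optional

VALID_MODES = {"auto", "explore", "replay"}


def resolve_mode(call_mode: Optional[str], decorator_mode: Optional[str]) -> str:
    """The ai.action() parameter wins over the decorator; the default is auto."""
    mode = call_mode or decorator_mode or "auto"
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    return mode


def plan_step(mode: str, script: Path) -> str:
    """Decide what to do for one step, following the mode table."""
    if mode == "explore":
        return "explore"  # always use AI and overwrite the script
    if script.exists():
        return "replay"  # auto or replay with a saved script
    if mode == "replay":
        raise FileNotFoundError(f"no script to replay: {script}")
    return "explore"  # auto with no script yet
```

For example, `resolve_mode("replay", "explore")` yields `"replay"`, matching the case where a call-site parameter overrides `@step_mode("explore")`.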

Author

Joe Shen


License

MIT

Contributing

Contributions, issues, and feature requests are welcome!
