AI Agent layer for RodSki test automation framework

rodski-agent

rodski-agent is the AI Agent layer for the RodSki test automation framework.

Harness Agent (Claude Code / CI/CD)
        | CLI + JSON stdout
        v
  rodski-agent (this project)     <-- test design + execution + smart repair
        | CLI call
        v
    rodski (execution engine)     <-- XML parse -> keyword execution -> result

Installation

# From PyPI
pip install rodski-agent

# Development (editable + test tools)
pip install -e "rodski-agent/[dev]"

Dependencies

Package              Purpose                              Required
langgraph            Workflow orchestration (StateGraph)  Yes
langchain-anthropic  Claude LLM integration               Yes
langchain-openai     OpenAI LLM integration               Yes
click                CLI framework                        Yes
pyyaml               Configuration files                  Yes
requests             OmniParser HTTP client               Yes
pillow               Image processing                     Yes

Quick Start

1. Execute a test case

# Run a test module directory
rodski-agent run --case path/to/test_module/ --format json

# Specify browser and headed mode
rodski-agent run --case path/to/test_module/ --browser firefox --no-headless

# Set max retry count
rodski-agent run --case path/to/test_module/ --max-retry 5
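
A harness that drives rodski-agent from Python could wrap the run command roughly as follows. This is a minimal sketch that uses only the flags shown above; the helper names build_run_command and run_case are hypothetical, and it assumes that stdout carries nothing but the JSON document when --format json is set.

```python
import json
import subprocess
from typing import List, Optional


def build_run_command(case_path: str,
                      browser: Optional[str] = None,
                      headless: bool = True,
                      max_retry: Optional[int] = None) -> List[str]:
    """Assemble the argv for `rodski-agent run` from the flags shown above."""
    cmd = ["rodski-agent", "run", "--case", case_path, "--format", "json"]
    if browser:
        cmd += ["--browser", browser]
    if not headless:
        cmd.append("--no-headless")
    if max_retry is not None:
        cmd += ["--max-retry", str(max_retry)]
    return cmd


def run_case(case_path: str, **options) -> dict:
    """Hypothetical helper: invoke the CLI and decode its JSON stdout.

    Assumption: stdout contains only the JSON document. The exit code is
    not checked here because its meaning is not documented above.
    """
    proc = subprocess.run(build_run_command(case_path, **options),
                          capture_output=True, text=True)
    return json.loads(proc.stdout)
```

Building the argument list separately from the subprocess call keeps the command easy to log and to unit-test without rodski-agent installed.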

2. Design a test case from requirements

# Generate test case from natural language
rodski-agent design \
  --requirement "Test login with username/password" \
  --output output/login/

# Enable visual exploration with a target URL
rodski-agent design \
  --requirement "Test login" \
  --url "https://app.example.com/login" \
  --output output/login/

3. Full pipeline (design + validate + execute)

# Basic pipeline
rodski-agent pipeline \
  --requirement "Test user registration" \
  --url "https://app.example.com/register" \
  --output output/register/ \
  --format json

# Parallel execution with custom retry settings
rodski-agent pipeline \
  --requirement "Test checkout flow" \
  --url "https://app.example.com" \
  --output output/checkout/ \
  --parallel --max-workers 4 \
  --max-retry 5 --max-fix-attempts 3

4. Diagnose failed tests

# Diagnose from result directory
rodski-agent diagnose --result output/login/result/

# Diagnose from specific result file
rodski-agent diagnose --result output/login/execution_summary.json --format json

5. View configuration

rodski-agent config show

Output Format

All commands support --format json (default: human). JSON output follows a unified contract:

{
  "status": "success | failure | error",
  "command": "run | design | pipeline | diagnose",
  "output": { ... },
  "error": null
}
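
A caller might decode and sanity-check this envelope before looking at the command-specific output. The sketch below assumes only the four top-level fields shown above; the AgentResult class and parse_result function are hypothetical names, not part of rodski-agent.

```python
import json
from dataclasses import dataclass
from typing import Any

# Status values taken from the contract above.
VALID_STATUSES = {"success", "failure", "error"}


@dataclass
class AgentResult:
    status: str
    command: str
    output: Any
    error: Any


def parse_result(raw: str) -> AgentResult:
    """Decode one JSON document and reject unknown status values."""
    doc = json.loads(raw)
    status = doc.get("status")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return AgentResult(
        status=status,
        command=doc.get("command", ""),
        output=doc.get("output"),
        error=doc.get("error"),  # absent and null both become None
    )
```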

run output example

{
  "status": "success",
  "command": "run",
  "output": {
    "total": 2,
    "passed": 2,
    "failed": 0,
    "cases": [
      {"id": "c001", "status": "PASS", "time": 2.1},
      {"id": "c002", "status": "PASS", "time": 1.8}
    ]
  }
}
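
For CI gating, the run output can be reduced to a pass rate and a list of non-passing case IDs. This is a sketch under assumptions: only the "PASS" status appears in the example above, so any other status value is treated as not passed, and summarize_run is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_run(result: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the pass rate and the IDs of non-passing cases."""
    output = result.get("output") or {}
    cases = output.get("cases", [])
    # Assumption: anything other than "PASS" counts as not passed.
    failed_ids = [case["id"] for case in cases if case.get("status") != "PASS"]
    total = output.get("total", len(cases))
    passed = output.get("passed", 0)
    pass_rate = passed / total if total else 0.0
    return {"failed_ids": failed_ids, "pass_rate": pass_rate}
```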

design output example

{
  "status": "success",
  "command": "design",
  "output": {
    "cases": ["case/c001.xml"],
    "models": ["model/model.xml"],
    "data": ["data/data.xml"],
    "summary": "Generated 3 file(s)"
  }
}

Configuration

rodski-agent looks for agent_config.yaml in this order:

  1. $RODSKI_AGENT_CONFIG env var
  2. ./agent_config.yaml (current directory)
  3. <project_root>/config/agent_config.yaml
  4. Built-in defaults
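
The lookup order above can be sketched as a first-match search. The function name find_config is hypothetical, and one behavior here is an assumption rather than documented: if $RODSKI_AGENT_CONFIG points at a missing file, the sketch falls through to the next candidate instead of raising an error.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_config(project_root: Path,
                cwd: Optional[Path] = None,
                env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the first existing config file, or None for built-in defaults."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd

    candidates = []
    override = env.get("RODSKI_AGENT_CONFIG")
    if override:
        candidates.append(Path(override))                             # 1. env var
    candidates.append(cwd / "agent_config.yaml")                      # 2. current directory
    candidates.append(project_root / "config" / "agent_config.yaml")  # 3. project config

    for path in candidates:
        if path.is_file():
            return path
    return None                                                       # 4. built-in defaults
```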

Environment variables override config file values. For example, RODSKI_AGENT_LLM__DESIGN__MODEL=gpt-4o overrides llm.design.model.
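
One plausible reading of the variable name RODSKI_AGENT_LLM__DESIGN__MODEL is a RODSKI_AGENT_ prefix followed by a key path joined with double underscores. The sketch below implements that reading; the mapping rule is inferred from the single example and is an assumption, as is the function name apply_env_overrides. Values are kept as strings, with no type coercion.

```python
import os
from typing import Any, Dict, Mapping, Optional

PREFIX = "RODSKI_AGENT_"


def apply_env_overrides(config: Dict[str, Any],
                        env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Overlay RODSKI_AGENT_A__B__C=value onto config["a"]["b"]["c"] in place."""
    env = os.environ if env is None else env
    for name, value in env.items():
        # Skip unrelated variables and prefix-only names such as RODSKI_AGENT_CONFIG.
        if not name.startswith(PREFIX) or "__" not in name:
            continue
        keys = name[len(PREFIX):].lower().split("__")
        node = config
        for key in keys[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                child = {}
                node[key] = child
            node = child
        node[keys[-1]] = value  # assumption: values stay as plain strings
    return config
```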

LLM Configuration

Design and Execution agents use separate LLM configurations:

llm:
  design:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.7
    max_tokens: 4096
  execution:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.1
    max_tokens: 2048
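
The api_key_env field names an environment variable rather than holding the key itself. A sketch of how a caller might model one of these sections and resolve the key is shown below; LLMSettings and load_llm_settings are hypothetical names, and the config is passed in as an already-parsed dict.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    provider: str
    model: str
    base_url: Optional[str]
    api_key_env: str
    temperature: float
    max_tokens: int

    def api_key(self, env: Optional[Mapping[str, str]] = None) -> str:
        """Read the key from the environment variable named by api_key_env."""
        env = os.environ if env is None else env
        key = env.get(self.api_key_env)
        if not key:
            raise RuntimeError(f"environment variable {self.api_key_env} is not set")
        return key


def load_llm_settings(config: Dict[str, Any], role: str) -> LLMSettings:
    """Pick the 'design' or 'execution' section out of the parsed config."""
    section = config["llm"][role]
    return LLMSettings(
        provider=section["provider"],
        model=section["model"],
        base_url=section.get("base_url"),
        api_key_env=section["api_key_env"],
        temperature=float(section["temperature"]),
        max_tokens=int(section["max_tokens"]),
    )
```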

Architecture

Design Agent (LangGraph)

analyze_req -> explore_page -> identify_elem -> plan_cases -> design_data -> generate_xml -> validate_xml
                                                                                  ^              |
                                                                                  +-- (fail) ----+

Node           Description
analyze_req    LLM extracts test scenarios from requirements
explore_page   Playwright screenshot + OmniParser element detection
identify_elem  LLM Vision adds semantic labels to detected elements
plan_cases     Plans test case structure (phases, steps, models)
design_data    Designs test data tables
generate_xml   Generates case/model/data XML files
validate_xml   Validates with rodski validate, retries on failure
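
The generate_xml / validate_xml loop at the end of the graph can be illustrated without LangGraph as a bounded retry. This is a plain-Python stand-in, not the project's graph code: the generate and validate callables, the max_attempts limit, and the idea of feeding validation errors back into generation are all assumptions made for the sketch.

```python
from typing import Callable, List, Tuple


def generate_until_valid(generate: Callable[[List[str]], str],
                         validate: Callable[[str], List[str]],
                         max_attempts: int = 3) -> Tuple[str, int]:
    """Regenerate XML until validation reports no errors.

    generate receives the previous attempt's errors (empty on the first call);
    validate returns a list of error messages, empty when the XML is valid.
    """
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        xml = generate(errors)
        errors = validate(xml)
        if not errors:
            return xml, attempt
    raise RuntimeError(f"XML still invalid after {max_attempts} attempts: {errors}")
```

Bounding the loop matters because an LLM-backed generator may never converge on valid XML.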

Execution Agent (LangGraph)

pre_check -> execute -> parse_result -[pass]-> report
                                     -[fail]-> diagnose -> retry_decide -[retry]-> apply_fix -> execute
                                                                        -[give_up]-> report

Node          Description
pre_check     Validate case path, directory structure, rodski version
execute       Call rodski run to execute tests
parse_result  Parse execution results
diagnose      LLM diagnoses failure root cause
retry_decide  Decide retry based on diagnosis confidence
apply_fix     Smart repair (wait/locator/data strategies)
report        Generate final report
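
The control flow of the execution graph can be approximated in plain Python as a loop. The sketch below is not the LangGraph implementation: the callables, the shape of the result and diagnosis dicts, and the min_confidence threshold are assumptions. Only the existence of a retry limit (--max-retry) and a confidence-based retry decision come from the text above.

```python
from typing import Any, Callable, Dict

Result = Dict[str, Any]


def run_with_repair(execute: Callable[[], Result],
                    diagnose: Callable[[Result], Result],
                    apply_fix: Callable[[Result], Any],
                    max_retry: int = 3,
                    min_confidence: float = 0.5) -> Result:
    """Execute, and on failure diagnose, fix, and retry within limits."""
    retries = 0
    while True:
        result = execute()                          # execute + parse_result
        if result.get("failed", 0) == 0:
            return {"status": "success", "retries": retries, "result": result}
        diagnosis = diagnose(result)                # diagnose
        # retry_decide: give up when out of retries or the diagnosis is weak
        if retries >= max_retry or diagnosis.get("confidence", 0.0) < min_confidence:
            return {"status": "failure", "retries": retries,
                    "result": result, "diagnosis": diagnosis}
        apply_fix(diagnosis)                        # apply_fix, then loop to execute
        retries += 1
```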

Smart Repair Strategies

Strategy  Trigger                      Fix Action
Wait      Timeout / element not ready  Insert wait step before failing step
Locator   Element not found            LLM suggests new locator, update model XML
Data      Data mismatch                LLM suggests new data value, update data XML
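
The trigger-to-strategy mapping in the table can be pictured as a small dispatch. In rodski-agent the diagnosis comes from an LLM; the keyword matching below is a deliberately simple stand-in to show the mapping, and the keyword lists and the choose_strategy name are assumptions.

```python
from typing import Optional

# Keywords are illustrative guesses derived from the Trigger column above.
STRATEGY_TRIGGERS = {
    "wait": ("timeout", "not ready"),
    "locator": ("element not found",),
    "data": ("data mismatch",),
}


def choose_strategy(failure_message: str) -> Optional[str]:
    """Return the first strategy whose trigger keywords match the message."""
    text = failure_message.lower()
    for strategy, keywords in STRATEGY_TRIGGERS.items():
        if any(keyword in text for keyword in keywords):
            return strategy
    return None  # no known trigger: leave the failure for the final report
```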

Pipeline Orchestrator

The pipeline command chains Design -> Validation Gate -> Execution:

  1. Design Phase: Generate test case XML from requirements
  2. Validation Gate: Run rodski validate on all generated XML; fail-fast if invalid
  3. Execution Phase: Execute generated cases (sequential or parallel with --parallel)
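
The three phases can be sketched as a function that refuses to execute when any generated file fails validation. The callables stand in for the Design Agent, rodski validate, and the Execution Agent; their signatures and the shape of the returned dict are assumptions for illustration.

```python
from typing import Any, Callable, Dict, List


def run_pipeline(design: Callable[[], List[str]],
                 validate: Callable[[str], bool],
                 execute: Callable[[List[str]], Dict[str, Any]]) -> Dict[str, Any]:
    """Chain design -> validation gate -> execution, stopping at the gate on failure."""
    files = design()                                      # 1. design phase
    invalid = [path for path in files if not validate(path)]
    if invalid:                                           # 2. validation gate
        return {"stage": "validation", "invalid": invalid, "output": None}
    return {"stage": "execution", "invalid": [],          # 3. execution phase
            "output": execute(files)}
```

Checking every file before stopping means the report lists all invalid files at once rather than only the first.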

Development

# Install dev dependencies
pip install -e "rodski-agent/[dev]"

# Run tests (424 tests)
cd rodski-agent
PYTHONPATH=src python3 -m pytest tests/ -v

# Run with coverage
PYTHONPATH=src python3 -m pytest tests/ --cov=rodski_agent

Project Structure

rodski-agent/
├── pyproject.toml
├── README.md
├── config/
│   └── agent_config.yaml     # Default configuration
├── schemas/
│   └── output_schema.json    # JSON output schema
├── src/rodski_agent/
│   ├── cli.py                # CLI entry point (Click)
│   ├── common/
│   │   ├── config.py         # Configuration management
│   │   ├── contracts.py      # JSON output contract
│   │   ├── errors.py         # Error classification
│   │   ├── formatters.py     # Output formatting
│   │   ├── llm_bridge.py     # LLM abstraction (langchain)
│   │   ├── omniparser_client.py  # OmniParser HTTP client
│   │   ├── result_parser.py  # Result parser
│   │   ├── rodski_knowledge.py   # RodSki constraint KB
│   │   ├── rodski_tools.py   # RodSki CLI wrappers
│   │   ├── state.py          # LangGraph state definitions
│   │   └── xml_builder.py    # XML generator
│   ├── design/
│   │   ├── graph.py          # Design Agent graph
│   │   ├── nodes.py          # Design workflow nodes
│   │   ├── prompts.py        # LLM prompts
│   │   └── visual.py         # Visual exploration
│   ├── execution/
│   │   ├── graph.py          # Execution Agent graph
│   │   ├── nodes.py          # Execution workflow nodes
│   │   ├── prompts.py        # Diagnosis prompts
│   │   └── fixer.py          # Smart repair strategies
│   └── pipeline/
│       └── orchestrator.py   # Design -> Execution pipeline
└── tests/                    # 424 unit tests

License

MIT

