AI Agent layer for RodSki test automation framework

rodski-agent

rodski-agent is the AI Agent layer for the RodSki test automation framework.

Harness Agent (Claude Code / CI/CD)
        | CLI + JSON stdout
        v
  rodski-agent (this project)     <-- test design + execution + smart repair
        | CLI call
        v
    rodski (execution engine)     <-- XML parse -> keyword execution -> result

Installation

# From PyPI
pip install rodski-agent

# Development (editable + test tools)
pip install -e "rodski-agent/[dev]"

Dependencies

Package	Purpose	Required
langgraph	Workflow orchestration (StateGraph)	Yes
langchain-anthropic	Claude LLM integration	Yes
langchain-openai	OpenAI LLM integration	Yes
click	CLI framework	Yes
pyyaml	Configuration files	Yes
requests	OmniParser HTTP client	Yes
pillow	Image processing	Yes

Quick Start

1. Execute a test case

# Run a test module directory
rodski-agent run --case path/to/test_module/ --format json

# Specify browser and headed mode
rodski-agent run --case path/to/test_module/ --browser firefox --no-headless

# Set max retry count
rodski-agent run --case path/to/test_module/ --max-retry 5
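
A harness that drives the CLI from Python could assemble the invocation as follows. This is a minimal sketch that uses only the flags shown above; `build_run_command` and `run_case` are hypothetical helper names, not part of rodski-agent.

```python
import json
import subprocess
from typing import List, Optional


def build_run_command(case_path: str,
                      browser: Optional[str] = None,
                      headless: bool = True,
                      max_retry: Optional[int] = None) -> List[str]:
    """Assemble the argv for `rodski-agent run` (hypothetical helper)."""
    cmd = ["rodski-agent", "run", "--case", case_path, "--format", "json"]
    if browser:
        cmd += ["--browser", browser]
    if not headless:
        cmd.append("--no-headless")
    if max_retry is not None:
        cmd += ["--max-retry", str(max_retry)]
    return cmd


def run_case(case_path: str, **options) -> dict:
    """Run the CLI and decode its JSON stdout (needs rodski-agent on PATH)."""
    proc = subprocess.run(build_run_command(case_path, **options),
                          capture_output=True, text=True)
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems when the case path contains spaces.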

2. Design a test case from requirements

# Generate test case from natural language
rodski-agent design \
  --requirement "Test login with username/password" \
  --output output/login/

# Enable visual exploration with a target URL
rodski-agent design \
  --requirement "Test login" \
  --url "https://app.example.com/login" \
  --output output/login/

3. Full Pipeline (design + validate + execute)

# Basic pipeline
rodski-agent pipeline \
  --requirement "Test user registration" \
  --url "https://app.example.com/register" \
  --output output/register/ \
  --format json

# Parallel execution with custom retry settings
rodski-agent pipeline \
  --requirement "Test checkout flow" \
  --url "https://app.example.com" \
  --output output/checkout/ \
  --parallel --max-workers 4 \
  --max-retry 5 --max-fix-attempts 3

4. Diagnose failed tests

# Diagnose from result directory
rodski-agent diagnose --result output/login/result/

# Diagnose from specific result file
rodski-agent diagnose --result output/login/execution_summary.json --format json

5. View configuration

rodski-agent config show

Output Format

All commands support --format json (default: human). JSON output follows a unified contract:

{
  "status": "success | failure | error",
  "command": "run | design | pipeline | diagnose",
  "output": { ... },
  "error": null
}
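
A caller may want to check this envelope before trusting its contents. The sketch below is one way to do that, assuming the `status` and `command` values listed above are exhaustive; `parse_envelope` is a hypothetical name and is not shipped with rodski-agent.

```python
import json

# Values taken from the contract above; treating them as exhaustive is an assumption.
ALLOWED_STATUS = {"success", "failure", "error"}
ALLOWED_COMMANDS = {"run", "design", "pipeline", "diagnose"}


def parse_envelope(stdout: str) -> dict:
    """Decode CLI stdout and reject envelopes that break the contract."""
    envelope = json.loads(stdout)
    if not isinstance(envelope, dict):
        raise ValueError("envelope must be a JSON object")
    if envelope.get("status") not in ALLOWED_STATUS:
        raise ValueError(f"unexpected status: {envelope.get('status')!r}")
    if envelope.get("command") not in ALLOWED_COMMANDS:
        raise ValueError(f"unexpected command: {envelope.get('command')!r}")
    # "output" and "error" are read leniently because examples may omit them.
    envelope.setdefault("output", {})
    envelope.setdefault("error", None)
    return envelope
```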

run output example

{
  "status": "success",
  "command": "run",
  "output": {
    "total": 2,
    "passed": 2,
    "failed": 0,
    "cases": [
      {"id": "c001", "status": "PASS", "time": 2.1},
      {"id": "c002", "status": "PASS", "time": 1.8}
    ]
  }
}
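
The `output` object of a run can be condensed into a few numbers for a CI log. A minimal sketch, assuming every status other than "PASS" counts as a failure (the source only shows "PASS"); `summarize_run` is a hypothetical helper.

```python
def summarize_run(output: dict) -> dict:
    """Condense a `run` output object into pass rate, failed ids and total time."""
    cases = output.get("cases", [])
    # Assumption: any status other than "PASS" is treated as a failure.
    failed_ids = [case["id"] for case in cases if case["status"] != "PASS"]
    total = output.get("total", 0)
    pass_rate = output.get("passed", 0) / total if total else 0.0
    total_time = round(sum(case.get("time", 0.0) for case in cases), 3)
    return {"pass_rate": pass_rate, "failed_ids": failed_ids, "total_time": total_time}
```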

design output example

{
  "status": "success",
  "command": "design",
  "output": {
    "cases": ["case/c001.xml"],
    "models": ["model/model.xml"],
    "data": ["data/data.xml"],
    "summary": "Generated 3 file(s)"
  }
}

Configuration

rodski-agent looks for agent_config.yaml in this order:

1. $RODSKI_AGENT_CONFIG env var
2. ./agent_config.yaml (current directory)
3. <project_root>/config/agent_config.yaml
4. Built-in defaults

Environment variables override config file values. For example, RODSKI_AGENT_LLM__DESIGN__MODEL=gpt-4o overrides llm.design.model.
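
The lookup order and the override rule can be sketched in a few lines. This is an illustration, not the code in config.py: it assumes that a double underscore separates nesting levels, that keys are lower-cased, that overridden values stay strings, and that a missing $RODSKI_AGENT_CONFIG file falls through to the next candidate. `find_config` and `apply_env_overrides` are hypothetical names.

```python
import copy
from pathlib import Path
from typing import Mapping, Optional

ENV_PREFIX = "RODSKI_AGENT_"
CONFIG_PATH_VAR = "RODSKI_AGENT_CONFIG"


def find_config(env: Mapping[str, str], cwd: Path, project_root: Path) -> Optional[Path]:
    """Return the first existing config file, or None for built-in defaults."""
    candidates = []
    if env.get(CONFIG_PATH_VAR):
        candidates.append(Path(env[CONFIG_PATH_VAR]))
    candidates.append(cwd / "agent_config.yaml")
    candidates.append(project_root / "config" / "agent_config.yaml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def apply_env_overrides(config: dict, env: Mapping[str, str]) -> dict:
    """Return a copy of config with RODSKI_AGENT_A__B__C=value applied as a.b.c."""
    merged = copy.deepcopy(config)
    for key, value in env.items():
        if not key.startswith(ENV_PREFIX) or key == CONFIG_PATH_VAR:
            continue
        path = key[len(ENV_PREFIX):].lower().split("__")
        node = merged
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value  # assumption: no type coercion
    return merged
```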

LLM Configuration

Design and Execution agents use separate LLM configurations:

llm:
  design:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.7
    max_tokens: 4096
  execution:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.1
    max_tokens: 2048
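
Note that `api_key_env` names an environment variable rather than holding the key itself. The sketch below shows how a loader might represent one of these blocks and resolve the key at call time; `LLMSettings` and `load_llm_settings` are hypothetical names, and the config is passed as an already-parsed dict so the example needs no YAML library.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """One `llm.<role>` block from agent_config.yaml (illustrative only)."""
    provider: str
    model: str
    api_key_env: str
    temperature: float
    max_tokens: int
    base_url: Optional[str] = None

    def api_key(self, env: Mapping[str, str] = os.environ) -> str:
        """Look up the key in the variable named by api_key_env."""
        try:
            return env[self.api_key_env]
        except KeyError:
            raise RuntimeError(f"environment variable {self.api_key_env} is not set")


def load_llm_settings(config: dict, role: str) -> LLMSettings:
    """Build settings for role 'design' or 'execution' from a parsed config."""
    return LLMSettings(**config["llm"][role])
```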

Architecture

Design Agent (LangGraph)

analyze_req -> explore_page -> identify_elem -> plan_cases -> design_data -> generate_xml -> validate_xml
                                                                                  ^              |
                                                                                  +-- (fail) ----+

Node	Description
analyze_req	LLM extracts test scenarios from requirements
explore_page	Playwright screenshot + OmniParser element detection
identify_elem	LLM Vision adds semantic labels to detected elements
plan_cases	Plans test case structure (phases, steps, models)
design_data	Designs test data tables
generate_xml	Generates case/model/data XML files
validate_xml	Validates with `rodski validate`, retries on failure
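
The generate_xml / validate_xml loop at the end of the graph can be sketched without LangGraph. The version below is plain Python, not the project's graph code: it assumes validation errors are fed back into the next generation attempt and that the number of attempts is capped. `generate_until_valid` and its parameters are hypothetical.

```python
from typing import Callable, List, Tuple


def generate_until_valid(generate: Callable[[List[str]], str],
                         validate: Callable[[str], List[str]],
                         max_attempts: int = 3) -> Tuple[str, int]:
    """Regenerate XML until validate() reports no errors.

    generate receives the previous attempt's errors (empty on the first call);
    validate returns a list of error messages, empty when the XML is valid.
    """
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        xml = generate(errors)
        errors = validate(xml)
        if not errors:
            return xml, attempt
    raise RuntimeError(f"XML still invalid after {max_attempts} attempts: {errors}")
```

In the real agent, `validate` would correspond to a call to `rodski validate` and `generate` to the LLM-backed generate_xml node.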

Execution Agent (LangGraph)

pre_check -> execute -> parse_result -[pass]-> report
                                     -[fail]-> diagnose -> retry_decide -[retry]-> apply_fix -> execute
                                                                        -[give_up]-> report

Node	Description
pre_check	Validates case path, directory structure, rodski version
execute	Calls `rodski run` to execute tests
parse_result	Parses execution results
diagnose	LLM diagnoses failure root cause
retry_decide	Decides whether to retry based on diagnosis confidence
apply_fix	Applies smart repair (wait/locator/data strategies)
report	Generates the final report
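
The branch taken at retry_decide can be illustrated with a small function. This is a sketch under stated assumptions: the confidence threshold of 0.6 and the rule that low-confidence diagnoses are not retried are guesses, and the function signature is hypothetical.

```python
def retry_decide(confidence: float, attempt: int, max_retry: int,
                 min_confidence: float = 0.6) -> str:
    """Return "retry" or "give_up", mirroring the two edges in the graph above.

    confidence: diagnosis confidence in [0, 1]
    attempt: number of executions already made (1 after the first run)
    max_retry: upper bound on executions, as set by --max-retry
    min_confidence: assumed threshold, not taken from the project
    """
    if attempt >= max_retry:
        return "give_up"
    if confidence < min_confidence:
        return "give_up"
    return "retry"
```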

Smart Repair Strategies

Strategy	Trigger	Fix Action
Wait	Timeout / element not ready	Insert wait step before failing step
Locator	Element not found	LLM suggests new locator, update model XML
Data	Data mismatch	LLM suggests new data value, update data XML
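
The table maps failure triggers to strategies. The sketch below shows that mapping as a dispatch on the error message. The source states that an LLM performs the diagnosis, so the keyword matching here is only a stand-in, and the keywords and `pick_strategy` are illustrative assumptions.

```python
from enum import Enum
from typing import Optional


class Strategy(Enum):
    WAIT = "wait"
    LOCATOR = "locator"
    DATA = "data"


# Keyword lists are illustrative; the real triggers come from LLM diagnosis.
_TRIGGERS = [
    (Strategy.WAIT, ("timeout", "timed out", "not ready")),
    (Strategy.LOCATOR, ("not found", "no such element")),
    (Strategy.DATA, ("mismatch",)),
]


def pick_strategy(error_message: str) -> Optional[Strategy]:
    """Return the first strategy whose trigger keywords match, else None."""
    text = error_message.lower()
    for strategy, keywords in _TRIGGERS:
        if any(keyword in text for keyword in keywords):
            return strategy
    return None
```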

Pipeline Orchestrator

The pipeline command chains Design -> Validation Gate -> Execution:

1. Design Phase: Generate test case XML from requirements
2. Validation Gate: Run rodski validate on all generated XML; fail-fast if invalid
3. Execution Phase: Execute generated cases (sequential or parallel with --parallel)
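
The three phases can be sketched as a single function with a fail-fast gate in the middle. This is not the code in orchestrator.py: the callables, the thread pool used for parallel execution, and the keys of the returned dict are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_pipeline(design: Callable[[], List[str]],
                 validate: Callable[[str], bool],
                 execute: Callable[[str], bool],
                 parallel: bool = False,
                 max_workers: int = 4) -> Dict[str, object]:
    """Chain design -> validation gate -> execution (illustrative only)."""
    cases = design()

    # Validation gate: stop before execution if any generated file is invalid.
    invalid = [case for case in cases if not validate(case)]
    if invalid:
        return {"status": "failure", "stage": "validation",
                "invalid": invalid, "results": []}

    if parallel:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(execute, cases))  # keeps input order
    else:
        results = [execute(case) for case in cases]

    status = "success" if all(results) else "failure"
    return {"status": status, "stage": "execution",
            "invalid": [], "results": results}
```

Running the gate over all cases before executing any of them means an invalid file never costs a browser session.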

Development

# Install dev dependencies
pip install -e "rodski-agent/[dev]"

# Run tests (424 tests)
cd rodski-agent
PYTHONPATH=src python3 -m pytest tests/ -v

# Run with coverage
PYTHONPATH=src python3 -m pytest tests/ --cov=rodski_agent

Project Structure

rodski-agent/
├── pyproject.toml
├── README.md
├── config/
│   └── agent_config.yaml     # Default configuration
├── schemas/
│   └── output_schema.json    # JSON output schema
├── src/rodski_agent/
│   ├── cli.py                # CLI entry point (Click)
│   ├── common/
│   │   ├── config.py         # Configuration management
│   │   ├── contracts.py      # JSON output contract
│   │   ├── errors.py         # Error classification
│   │   ├── formatters.py     # Output formatting
│   │   ├── llm_bridge.py     # LLM abstraction (langchain)
│   │   ├── omniparser_client.py  # OmniParser HTTP client
│   │   ├── result_parser.py  # Result parser
│   │   ├── rodski_knowledge.py   # RodSki constraint KB
│   │   ├── rodski_tools.py   # RodSki CLI wrappers
│   │   ├── state.py          # LangGraph state definitions
│   │   └── xml_builder.py    # XML generator
│   ├── design/
│   │   ├── graph.py          # Design Agent graph
│   │   ├── nodes.py          # Design workflow nodes
│   │   ├── prompts.py        # LLM prompts
│   │   └── visual.py         # Visual exploration
│   ├── execution/
│   │   ├── graph.py          # Execution Agent graph
│   │   ├── nodes.py          # Execution workflow nodes
│   │   ├── prompts.py        # Diagnosis prompts
│   │   └── fixer.py          # Smart repair strategies
│   └── pipeline/
│       └── orchestrator.py   # Design -> Execution pipeline
└── tests/                    # 424 unit tests

License

MIT

Release history

Version	Released
2.2.0	Apr 17, 2026
2.1.0 (this version)	Apr 16, 2026

Download files

Source Distribution

rodski_agent-2.1.0.tar.gz (112.6 kB)

Uploaded Apr 16, 2026 Source

Built Distribution

rodski_agent-2.1.0-py3-none-any.whl (74.0 kB)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file rodski_agent-2.1.0.tar.gz.

File metadata

Download URL: rodski_agent-2.1.0.tar.gz
Upload date: Apr 16, 2026
Size: 112.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for rodski_agent-2.1.0.tar.gz
Algorithm	Hash digest
SHA256	`7fb2a36a451d81e6fe36c611e1e6279736285e753292423ee4e29cf89b539e50`
MD5	`7f7498e8343b79c1601dc4f677df4007`
BLAKE2b-256	`dfdd58fc317e306a1cdecd40a74f18c24a52175801fd993017ac3bbe4f9ff413`

File details

Details for the file rodski_agent-2.1.0-py3-none-any.whl.

File metadata

Download URL: rodski_agent-2.1.0-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 74.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for rodski_agent-2.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7d599e4d80cfa085b9db9905d47bc0419e6ecf7e93ea8d4b74e24686b8fa6bd8`
MD5	`b34101356da10afcabb91f75dcb90fd1`
BLAKE2b-256	`1582fcfc1786f4e4a64fddaf51dbf51341344757a553d5074de5fc7bac9e35c6`
