AI Agent layer for RodSki test automation framework
rodski-agent
rodski-agent is the AI Agent layer for the RodSki test automation framework.
Harness Agent (Claude Code / CI/CD)
| CLI + JSON stdout
v
rodski-agent (this project) <-- test design + execution + smart repair
| CLI call
v
rodski (execution engine) <-- XML parse -> keyword execution -> result
Installation
# From PyPI
pip install rodski-agent
# Development (editable + test tools)
pip install -e "rodski-agent/[dev]"
Dependencies
| Package | Purpose | Required |
|---|---|---|
| langgraph | Workflow orchestration (StateGraph) | Yes |
| langchain-anthropic | Claude LLM integration | Yes |
| langchain-openai | OpenAI LLM integration | Yes |
| click | CLI framework | Yes |
| pyyaml | Configuration files | Yes |
| requests | OmniParser HTTP client | Yes |
| pillow | Image processing | Yes |
Quick Start
1. Execute a test case
# Run a test module directory
rodski-agent run --case path/to/test_module/ --format json
# Specify browser and headed mode
rodski-agent run --case path/to/test_module/ --browser firefox --no-headless
# Set max retry count
rodski-agent run --case path/to/test_module/ --max-retry 5
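A harness that drives the CLI from Python could assemble the run command along these lines. This is only a sketch that uses the flags shown above; the helper names build_run_command and run_case, and their defaults, are hypothetical and not part of rodski-agent.

```python
import subprocess
from typing import List, Optional


def build_run_command(
    case_path: str,
    browser: Optional[str] = None,
    headless: bool = True,
    max_retry: Optional[int] = None,
    output_format: str = "json",
) -> List[str]:
    """Assemble an argv list for `rodski-agent run` (hypothetical helper)."""
    argv = ["rodski-agent", "run", "--case", case_path, "--format", output_format]
    if browser is not None:
        argv += ["--browser", browser]
    if not headless:
        # Headed mode is requested with --no-headless, as in the example above.
        argv.append("--no-headless")
    if max_retry is not None:
        argv += ["--max-retry", str(max_retry)]
    return argv


def run_case(case_path: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and capture stdout; requires rodski-agent on PATH."""
    argv = build_run_command(case_path, **options)
    return subprocess.run(argv, capture_output=True, text=True, check=False)
```

Passing an argv list instead of a shell string avoids quoting problems when the case path contains spaces.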
2. Design a test case from requirements
# Generate test case from natural language
rodski-agent design \
--requirement "Test login with username/password" \
--output output/login/
# Enable visual exploration with a target URL
rodski-agent design \
--requirement "Test login" \
--url "https://app.example.com/login" \
--output output/login/
3. Full Pipeline (design + validate + execute)
# Basic pipeline
rodski-agent pipeline \
--requirement "Test user registration" \
--url "https://app.example.com/register" \
--output output/register/ \
--format json
# Parallel execution with custom retry settings
rodski-agent pipeline \
--requirement "Test checkout flow" \
--url "https://app.example.com" \
--output output/checkout/ \
--parallel --max-workers 4 \
--max-retry 5 --max-fix-attempts 3
4. Diagnose failed tests
# Diagnose from result directory
rodski-agent diagnose --result output/login/result/
# Diagnose from specific result file
rodski-agent diagnose --result output/login/execution_summary.json --format json
5. View configuration
rodski-agent config show
Output Format
All commands support --format json (default: human). JSON output follows a unified contract:
{
  "status": "success | failure | error",
  "command": "run | design | pipeline | diagnose",
  "output": { ... },
  "error": null
}
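A caller might check the envelope before trusting its contents. The sketch below validates only the status field against the three values listed in the contract; the function name parse_envelope is hypothetical, and the real schema (schemas/output_schema.json) may enforce more than this.

```python
import json
from typing import Any, Dict

# Status values taken from the contract above.
VALID_STATUSES = {"success", "failure", "error"}


def parse_envelope(stdout: str) -> Dict[str, Any]:
    """Parse CLI stdout and check the top-level contract fields."""
    payload = json.loads(stdout)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object at the top level")
    status = payload.get("status")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    if not isinstance(payload.get("command"), str):
        raise ValueError("missing or non-string 'command' field")
    return payload
```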
run output example
{
  "status": "success",
  "command": "run",
  "output": {
    "total": 3,
    "passed": 3,
    "failed": 0,
    "cases": [
      {"id": "c001", "status": "PASS", "time": 2.1},
      {"id": "c002", "status": "PASS", "time": 1.8}
    ]
  }
}
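In CI, the per-case list is usually what matters. The following sketch collects the IDs of cases that did not pass. It assumes any status other than "PASS" counts as a failure (the example above shows only "PASS"), and failed_case_ids is a hypothetical helper name.

```python
from typing import Any, Dict, List


def failed_case_ids(envelope: Dict[str, Any]) -> List[str]:
    """Return the IDs of cases whose status is not "PASS" (assumed convention)."""
    output = envelope.get("output") or {}
    cases = output.get("cases", [])
    return [case["id"] for case in cases if case.get("status") != "PASS"]
```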
design output example
{
  "status": "success",
  "command": "design",
  "output": {
    "cases": ["case/c001.xml"],
    "models": ["model/model.xml"],
    "data": ["data/data.xml"],
    "summary": "Generated 3 file(s)"
  }
}
Configuration
rodski-agent looks for agent_config.yaml in this order:
- $RODSKI_AGENT_CONFIG env var
- ./agent_config.yaml (current directory)
- <project_root>/config/agent_config.yaml
- Built-in defaults
Environment variables override config file values. For example, RODSKI_AGENT_LLM__DESIGN__MODEL=gpt-4o overrides the design model.
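The lookup order and the override rule can be sketched as follows. This is an illustration under assumptions, not the code in common/config.py: the function names are hypothetical, the double underscore is assumed to separate nesting levels (inferred from RODSKI_AGENT_LLM__DESIGN__MODEL), and a missing file at any step is assumed to fall through to the next one.

```python
import copy
from pathlib import Path
from typing import Any, Dict, Mapping, Optional

ENV_PREFIX = "RODSKI_AGENT_"


def find_config_file(
    environ: Mapping[str, str], cwd: Path, project_root: Path
) -> Optional[Path]:
    """Return the first existing config file, or None to use built-in defaults."""
    candidates = []
    explicit = environ.get("RODSKI_AGENT_CONFIG")
    if explicit:
        candidates.append(Path(explicit))
    candidates.append(cwd / "agent_config.yaml")
    candidates.append(project_root / "config" / "agent_config.yaml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def apply_env_overrides(
    config: Dict[str, Any], environ: Mapping[str, str]
) -> Dict[str, Any]:
    """Overlay RODSKI_AGENT_A__B__C=value onto config["a"]["b"]["c"]."""
    result = copy.deepcopy(config)
    for name, value in environ.items():
        # Names without "__" (such as RODSKI_AGENT_CONFIG) are not overrides.
        if not name.startswith(ENV_PREFIX) or "__" not in name:
            continue
        keys = name[len(ENV_PREFIX):].lower().split("__")
        node = result
        for key in keys[:-1]:
            child = node.get(key)
            if not isinstance(child, dict):
                child = {}
                node[key] = child
            node = child
        node[keys[-1]] = value  # values stay strings; no type coercion here
    return result
```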
LLM Configuration
Design and Execution agents use separate LLM configurations:
llm:
  design:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.7
    max_tokens: 4096
  execution:
    provider: claude
    model: claude-sonnet-4-20250514
    base_url: "http://code.casstime.ai"
    api_key_env: ANTHROPIC_API_KEY
    temperature: 0.1
    max_tokens: 2048
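Note that api_key_env names an environment variable rather than holding the key itself. A minimal sketch of how such a section could be loaded into a typed object is shown below; LLMSettings and settings_for are hypothetical names, and the real loader in common/config.py may differ.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """One llm.<role> section from agent_config.yaml (hypothetical type)."""

    provider: str
    model: str
    api_key_env: str
    temperature: float
    max_tokens: int
    base_url: Optional[str] = None

    def api_key(self, environ: Mapping[str, str] = os.environ) -> str:
        """Read the key from the environment variable named by api_key_env."""
        key = environ.get(self.api_key_env)
        if not key:
            raise RuntimeError(f"environment variable {self.api_key_env} is not set")
        return key


def settings_for(config: Dict[str, Any], role: str) -> LLMSettings:
    """Build settings for "design" or "execution" from a parsed config dict."""
    return LLMSettings(**config["llm"][role])
```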
Architecture
Design Agent (LangGraph)
analyze_req -> explore_page -> identify_elem -> plan_cases -> design_data -> generate_xml -> validate_xml
                                                                                   ^              |
                                                                                   +-- (fail) ----+
| Node | Description |
|---|---|
| analyze_req | LLM extracts test scenarios from requirements |
| explore_page | Playwright screenshot + OmniParser element detection |
| identify_elem | LLM Vision adds semantic labels to detected elements |
| plan_cases | Plans test case structure (phases, steps, models) |
| design_data | Designs test data tables |
| generate_xml | Generates case/model/data XML files |
| validate_xml | Validates with rodski validate, retries on failure |
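The retry edge between validate_xml and generate_xml can be illustrated in plain Python, without LangGraph. The sketch assumes that validation errors are fed back into the next generation attempt and that attempts are capped; the function name, the feedback mechanism, and the default cap are all assumptions, not the project's actual node code.

```python
from typing import Callable, List, Tuple


def generate_until_valid(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], List[str]],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Regenerate XML until validate() reports no errors.

    generate receives the errors from the previous attempt (empty at first);
    validate returns a list of error messages, empty when the XML is valid.
    """
    errors: List[str] = []
    for attempt in range(1, max_attempts + 1):
        xml = generate(errors)
        errors = validate(xml)
        if not errors:
            return xml, attempt
    raise RuntimeError(f"XML still invalid after {max_attempts} attempts: {errors}")
```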
Execution Agent (LangGraph)
pre_check -> execute -> parse_result -[pass]-> report
                                     -[fail]-> diagnose -> retry_decide -[retry]-> apply_fix -> execute
                                                                        -[give_up]-> report
| Node | Description |
|---|---|
| pre_check | Validate case path, directory structure, rodski version |
| execute | Call rodski run to execute tests |
| parse_result | Parse execution results |
| diagnose | LLM diagnoses failure root cause |
| retry_decide | Decide retry based on diagnosis confidence |
| apply_fix | Smart repair (wait/locator/data strategies) |
| report | Generate final report |
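The control flow of the table above can be sketched as a plain loop. This illustration leaves out pre_check and parse_result, and it assumes a numeric confidence threshold and a retry cap; Diagnosis, RunReport, run_with_repair, and both default values are hypothetical and do not reflect the actual LangGraph implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Diagnosis:
    cause: str
    confidence: float  # assumed to range from 0.0 to 1.0


@dataclass
class RunReport:
    passed: bool
    attempts: int
    history: List[str] = field(default_factory=list)


def run_with_repair(
    execute: Callable[[], bool],
    diagnose: Callable[[], Diagnosis],
    apply_fix: Callable[[Diagnosis], None],
    max_retry: int = 3,
    min_confidence: float = 0.5,
) -> RunReport:
    """execute -> (pass) report, or (fail) diagnose -> retry_decide -> apply_fix."""
    history: List[str] = []
    attempts = 0
    while True:
        attempts += 1
        history.append("execute")
        if execute():
            history.append("report")
            return RunReport(True, attempts, history)
        diagnosis = diagnose()
        history.append("diagnose")
        # retry_decide: give up when retries are exhausted or confidence is low.
        retries_used = attempts - 1
        if retries_used >= max_retry or diagnosis.confidence < min_confidence:
            history += ["give_up", "report"]
            return RunReport(False, attempts, history)
        apply_fix(diagnosis)
        history.append("apply_fix")
```

With this convention, a run makes at most 1 + max_retry execution attempts.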
Smart Repair Strategies
| Strategy | Trigger | Fix Action |
|---|---|---|
| Wait | Timeout / element not ready | Insert wait step before failing step |
| Locator | Element not found | LLM suggests new locator, update model XML |
| Data | Data mismatch | LLM suggests new data value, update data XML |
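As a rough illustration of the trigger column, the sketch below maps an error message to a strategy by keyword matching. This is a deliberately simplified stand-in: in rodski-agent the diagnosis comes from the LLM, and the keyword lists, the RepairStrategy enum, and choose_strategy are all hypothetical.

```python
from enum import Enum
from typing import Optional


class RepairStrategy(Enum):
    WAIT = "wait"
    LOCATOR = "locator"
    DATA = "data"


# Keyword triggers are illustrative guesses, checked in order.
_TRIGGERS = (
    (RepairStrategy.WAIT, ("timeout", "timed out", "not ready")),
    (RepairStrategy.LOCATOR, ("element not found", "no such element")),
    (RepairStrategy.DATA, ("data mismatch",)),
)


def choose_strategy(error_message: str) -> Optional[RepairStrategy]:
    """Return the first strategy whose keywords appear in the message."""
    text = error_message.lower()
    for strategy, keywords in _TRIGGERS:
        if any(keyword in text for keyword in keywords):
            return strategy
    return None  # no known trigger matched; leave the case unrepaired
```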
Pipeline Orchestrator
The pipeline command chains Design -> Validation Gate -> Execution:
- Design Phase: Generate test case XML from requirements
- Validation Gate: Run rodski validate on all generated XML; fail-fast if invalid
- Execution Phase: Execute generated cases (sequential or parallel with --parallel)
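The three phases can be sketched as a single function with a fail-fast gate in the middle. This is a minimal illustration that takes the phases as callables; run_pipeline is a hypothetical name, and the use of a thread pool for parallel mode is an assumption (the real orchestrator may use a different mechanism).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_pipeline(
    design: Callable[[], List[str]],
    validate: Callable[[str], bool],
    execute: Callable[[str], bool],
    parallel: bool = False,
    max_workers: int = 4,
) -> Dict[str, bool]:
    """Design -> validation gate -> execution; returns pass/fail per case file."""
    case_files = design()

    # Validation gate: stop before execution if any generated file is invalid.
    invalid = [path for path in case_files if not validate(path)]
    if invalid:
        raise ValueError(f"validation gate failed for: {invalid}")

    if parallel:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(execute, case_files))  # keeps input order
    else:
        results = [execute(path) for path in case_files]
    return dict(zip(case_files, results))
```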
Development
# Install dev dependencies
pip install -e "rodski-agent/[dev]"
# Run tests (424 tests)
cd rodski-agent
PYTHONPATH=src python3 -m pytest tests/ -v
# Run with coverage
PYTHONPATH=src python3 -m pytest tests/ --cov=rodski_agent
Project Structure
rodski-agent/
├── pyproject.toml
├── README.md
├── config/
│   └── agent_config.yaml          # Default configuration
├── schemas/
│   └── output_schema.json         # JSON output schema
├── src/rodski_agent/
│   ├── cli.py                     # CLI entry point (Click)
│   ├── common/
│   │   ├── config.py              # Configuration management
│   │   ├── contracts.py           # JSON output contract
│   │   ├── errors.py              # Error classification
│   │   ├── formatters.py          # Output formatting
│   │   ├── llm_bridge.py          # LLM abstraction (langchain)
│   │   ├── omniparser_client.py   # OmniParser HTTP client
│   │   ├── result_parser.py       # Result parser
│   │   ├── rodski_knowledge.py    # RodSki constraint KB
│   │   ├── rodski_tools.py        # RodSki CLI wrappers
│   │   ├── state.py               # LangGraph state definitions
│   │   └── xml_builder.py         # XML generator
│   ├── design/
│   │   ├── graph.py               # Design Agent graph
│   │   ├── nodes.py               # Design workflow nodes
│   │   ├── prompts.py             # LLM prompts
│   │   └── visual.py              # Visual exploration
│   ├── execution/
│   │   ├── graph.py               # Execution Agent graph
│   │   ├── nodes.py               # Execution workflow nodes
│   │   ├── prompts.py             # Diagnosis prompts
│   │   └── fixer.py               # Smart repair strategies
│   └── pipeline/
│       └── orchestrator.py        # Design -> Execution pipeline
└── tests/                         # 424 unit tests
License
MIT