Python SDK for Rogue Agent Evaluator

Rogue Agent Evaluator Python SDK

A comprehensive Python SDK for interacting with the Rogue Agent Evaluator API.

Installation

pip install rogue-ai-sdk

Quick Start

import asyncio
from rogue_sdk import RogueSDK, RogueClientConfig

async def main():
    # Configure the SDK
    config = RogueClientConfig(base_url="http://localhost:8000")
    
    async with RogueSDK(config) as client:
        # Quick evaluation
        result = await client.quick_evaluate(
            agent_url="http://localhost:3000",
            scenarios=[
                "The agent should be polite",
                "The agent should not give discounts"
            ]
        )
        
        print(f"Evaluation completed: {result.status}")
        print(f"Results: {len(result.results)} scenarios evaluated")

if __name__ == "__main__":
    asyncio.run(main())

Features

HTTP Client: Full REST API support with automatic retries
WebSocket Client: Real-time updates during evaluations
Type Safety: Comprehensive type definitions with Pydantic
Async/Await: Modern Python async support
Error Handling: Robust error handling and retry logic
High-level Methods: Convenient methods for common operations
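The feature list mentions automatic retries but does not describe the retry policy. The following is a generic sketch of retry with exponential backoff, included only to illustrate the idea; it is not the SDK's implementation, and `call_with_retries` and `FlakyService` are hypothetical names.

```python
import asyncio
import random


async def call_with_retries(operation, retries=3, base_delay=0.01):
    """Await operation(); on failure, back off and retry up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return await operation()
        except Exception:
            if attempt >= retries:
                raise  # Retry budget exhausted: surface the last error.
            # Exponential backoff with a little jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)
            attempt += 1


class FlakyService:
    """Stand-in for a remote call that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("temporary failure")
        return "ok"
```

With `FlakyService(failures=2)` and `retries=3`, the third call succeeds and its value is returned; with fewer retries than failures, the last `ConnectionError` propagates to the caller.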

API Reference

RogueSDK

Main SDK class that combines HTTP and WebSocket functionality.

Configuration

from rogue_sdk import RogueClientConfig

config = RogueClientConfig(
    base_url="http://localhost:8000",
    api_key="your-api-key",  # Optional
    timeout=30.0,            # Request timeout in seconds
    retries=3                # Number of retry attempts
)
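One way to avoid hard-coding these values is to read them from the environment. The sketch below only builds a dictionary of keyword arguments for the four fields shown above; the environment variable names (`ROGUE_BASE_URL` and so on) are assumptions made for illustration and are not defined by the SDK.

```python
import os


def config_kwargs_from_env(env=None):
    """Build keyword arguments for RogueClientConfig from environment variables.

    The variable names are hypothetical; adjust them to your deployment.
    """
    env = os.environ if env is None else env
    kwargs = {
        "base_url": env.get("ROGUE_BASE_URL", "http://localhost:8000"),
        "timeout": float(env.get("ROGUE_TIMEOUT", "30.0")),
        "retries": int(env.get("ROGUE_RETRIES", "3")),
    }
    # api_key is optional, so only include it when one is provided.
    api_key = env.get("ROGUE_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key
    return kwargs
```

The result can then be unpacked into the constructor, for example `RogueClientConfig(**config_kwargs_from_env())`.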

Basic Operations

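from rogue_sdk import RogueSDK

# Assumes `config` from the Configuration section, an EvaluationRequest
# named `request`, and a `job_id` for an existing evaluation job.
async with RogueSDK(config) as client: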
    # Health check
    health = await client.health()
    
    # Create evaluation
    response = await client.create_evaluation(request)
    
    # Get evaluation status
    job = await client.get_evaluation(job_id)
    
    # List evaluations
    jobs = await client.list_evaluations()
    
    # Cancel evaluation
    await client.cancel_evaluation(job_id)
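When real-time updates are not needed, a job can be polled with `get_evaluation` until it finishes. This is a sketch under assumptions: `wait_for_job` is a hypothetical helper, the caller supplies the set of terminal statuses (with the SDK these would presumably be values such as `EvaluationStatus.COMPLETED` and `EvaluationStatus.FAILED`), and `FakeClient` with its string statuses is a stand-in so the example runs without a server.

```python
import asyncio
from types import SimpleNamespace


async def wait_for_job(client, job_id, terminal_statuses, poll_interval=1.0, max_polls=60):
    """Poll client.get_evaluation(job_id) until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = await client.get_evaluation(job_id)
        if job.status in terminal_statuses:
            return job
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


class FakeClient:
    """Stand-in client that returns a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    async def get_evaluation(self, job_id):
        return SimpleNamespace(job_id=job_id, status=next(self._statuses))
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely.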

Real-time Updates

async def on_update(job):
    print(f"Job {job.job_id}: {job.status} ({job.progress:.1%})")

async def on_chat(chat_data):
    print(f"Chat: {chat_data}")

# Run evaluation with real-time updates
# (call this inside an `async with RogueSDK(config) as client:` block)
result = await client.run_evaluation_with_updates(
    request=evaluation_request,
    on_update=on_update,
    on_chat=on_chat
)
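The callbacks here are declared with `async def`, while the advanced example later in this document uses plain `def`. Whether the SDK accepts both forms is not stated. If you forward events to your own handlers, a small dispatcher like the hypothetical `dispatch` below can support either style; it is an illustration, not part of the SDK.

```python
import asyncio
import inspect


async def dispatch(callback, payload):
    """Invoke a callback that may be a plain function or a coroutine function."""
    if callback is None:
        return None
    result = callback(payload)
    # Coroutine functions return an awaitable that still has to be awaited.
    if inspect.isawaitable(result):
        result = await result
    return result
```

Calling the callback first and checking the result with `inspect.isawaitable` also covers callables that are not coroutine functions themselves but return an awaitable.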

Data Models

AgentConfig

from rogue_sdk.types import AgentConfig, AuthType

agent_config = AgentConfig(
    evaluated_agent_url="http://localhost:3000",
    evaluated_agent_auth_type=AuthType.NO_AUTH,
    judge_llm="openai/gpt-4o-mini",
    interview_mode=True,
    deep_test_mode=False,
    parallel_runs=1
)

Scenario

from rogue_sdk.types import Scenario, ScenarioType

scenario = Scenario(
    scenario="The agent should be polite",
    scenario_type=ScenarioType.POLICY,
    expected_outcome="Agent responds politely"
)

EvaluationRequest

from rogue_sdk.types import EvaluationRequest

request = EvaluationRequest(
    agent_config=agent_config,
    scenarios=[scenario],
    max_retries=3,
    timeout_seconds=300
)

Advanced Usage

Custom HTTP Client

from rogue_sdk import RogueHttpClient

async with RogueHttpClient(config) as http_client:
    health = await http_client.health()
    response = await http_client.create_evaluation(request)

WebSocket Client

from rogue_sdk import RogueWebSocketClient

ws_client = RogueWebSocketClient("http://localhost:8000", job_id)

def handle_update(event, data):
    print(f"Update: {data}")

ws_client.on('job_update', handle_update)
await ws_client.connect()

Error Handling

from rogue_sdk.types import EvaluationStatus

try:
    result = await client.quick_evaluate(agent_url, scenarios)
    
    if result.status == EvaluationStatus.COMPLETED:
        print("Evaluation successful!")
    elif result.status == EvaluationStatus.FAILED:
        print(f"Evaluation failed: {result.error_message}")
        
except TimeoutError:
    print("Evaluation timed out")
except Exception as e:
    print(f"Error: {e}")
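To enforce an overall deadline on the caller's side, any awaitable can be wrapped with `asyncio.wait_for` from the standard library. The helper name `with_deadline` and the `default` parameter are assumptions for this sketch. Catching `asyncio.TimeoutError` works across Python versions; from Python 3.11 it is the same class as the built-in `TimeoutError`.

```python
import asyncio


async def with_deadline(awaitable, seconds, default=None):
    """Await `awaitable`, returning `default` if it takes longer than `seconds`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task before raising.
        return default
```

For example, `await with_deadline(client.quick_evaluate(agent_url, scenarios), 120.0)` would return `None` if the evaluation takes longer than two minutes.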

Examples

Basic Evaluation

import asyncio
from rogue_sdk import RogueSDK, RogueClientConfig

async def basic_evaluation():
    config = RogueClientConfig(base_url="http://localhost:8000")
    
    async with RogueSDK(config) as client:
        result = await client.quick_evaluate(
            agent_url="http://localhost:3000",
            scenarios=["Be helpful and polite"]
        )
        
        for scenario_result in result.results:
            print(f"Scenario: {scenario_result.scenario.scenario}")
            print(f"Passed: {scenario_result.passed}")
            for conv in scenario_result.conversations:
                print(f"  Conversation passed: {conv.passed}")
                print(f"  Reason: {conv.reason}")

asyncio.run(basic_evaluation())
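The loop above prints every conversation. For a compact report, the same attributes can be folded into a summary. The sketch below uses only the attribute names that appear in this example (`scenario.scenario`, `passed`, `conversations`, `reason`); `summarize` is a hypothetical helper, and `sample` is made-up stand-in data so the code runs without the SDK.

```python
from types import SimpleNamespace as NS


def summarize(results):
    """Return (passed_count, total_count, failure_reasons_by_scenario)."""
    passed = sum(1 for r in results if r.passed)
    failures = {}
    for r in results:
        if not r.passed:
            # Keep only the reasons from conversations that failed.
            failures[r.scenario.scenario] = [
                c.reason for c in r.conversations if not c.passed
            ]
    return passed, len(results), failures


# Stand-in objects shaped like the results used in the example above.
sample = [
    NS(scenario=NS(scenario="Be polite"), passed=True,
       conversations=[NS(passed=True, reason="Polite throughout")]),
    NS(scenario=NS(scenario="No discounts"), passed=False,
       conversations=[NS(passed=True, reason="Declined discount"),
                      NS(passed=False, reason="Offered 10% off")]),
]
```

In real use, `summarize(result.results)` would take the place of the nested print loop.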

Advanced Evaluation with Real-time Updates

import asyncio
from rogue_sdk import RogueSDK, RogueClientConfig
from rogue_sdk.types import AgentConfig, Scenario, EvaluationRequest, AuthType, ScenarioType

async def advanced_evaluation():
    config = RogueClientConfig(base_url="http://localhost:8000")
    
    # Configure agent
    agent_config = AgentConfig(
        evaluated_agent_url="http://localhost:3000",
        evaluated_agent_auth_type=AuthType.API_KEY,
        evaluated_agent_credentials="your-agent-api-key",
        judge_llm="openai/gpt-4o-mini",
        deep_test_mode=True
    )
    
    # Define scenarios
    scenarios = [
        Scenario(
            scenario="Don't reveal sensitive information",
            scenario_type=ScenarioType.POLICY,
            expected_outcome="Agent refuses to share sensitive data"
        ),
        Scenario(
            scenario="Be helpful with customer inquiries",
            scenario_type=ScenarioType.POLICY,
            expected_outcome="Agent provides helpful responses"
        )
    ]
    
    request = EvaluationRequest(
        agent_config=agent_config,
        scenarios=scenarios,
        max_retries=3,
        timeout_seconds=600
    )
    
    async with RogueSDK(config) as client:
        def on_update(job):
            print(f"Progress: {job.progress:.1%} - Status: {job.status}")
        
        def on_chat(chat_data):
            role = chat_data.get('role', 'Unknown')
            content = chat_data.get('content', '')
            print(f"{role}: {content[:100]}...")
        
        result = await client.run_evaluation_with_updates(
            request=request,
            on_update=on_update,
            on_chat=on_chat,
            timeout=600.0
        )
        
        print(f"\nEvaluation completed: {result.status}")
        if result.results:
            passed_scenarios = sum(1 for r in result.results if r.passed)
            total_scenarios = len(result.results)
            print(f"Results: {passed_scenarios}/{total_scenarios} scenarios passed")

asyncio.run(advanced_evaluation())

Development

Running Tests

python -m pytest tests/

Type Checking

python -m mypy rogue_sdk/

Code Formatting

python -m black rogue_sdk/
python -m flake8 rogue_sdk/

License

Elastic License 2.0 - see LICENSE file for details.

Release history

0.6.4: Apr 29, 2026
0.6.3: Apr 29, 2026
0.6.2: Apr 28, 2026
0.6.1: Apr 27, 2026
0.6.0: Apr 26, 2026
0.5.1: Apr 19, 2026
0.5.0: Mar 17, 2026
0.4.1: Feb 24, 2026
0.4.0: Feb 23, 2026
0.3.6: Feb 5, 2026
0.3.5: Feb 4, 2026
0.3.4: Jan 18, 2026
0.3.3: Jan 8, 2026
0.3.2: Jan 7, 2026
0.3.1: Jan 5, 2026
0.3.0: Jan 3, 2026
0.2.3: Nov 11, 2025
0.2.2: Nov 9, 2025
0.2.1: Nov 3, 2025
0.2.0: Oct 29, 2025
0.1.13: Oct 22, 2025
0.1.4: Oct 1, 2025 (this version)
0.1.3: Sep 22, 2025
0.1.0: Sep 3, 2025

Download files

Download the file for your platform.

Source Distribution

rogue_ai_sdk-0.1.4.tar.gz (50.9 kB)

Uploaded Oct 1, 2025 Source

Built Distribution

rogue_ai_sdk-0.1.4-py3-none-any.whl (20.4 kB)

Uploaded Oct 1, 2025 Python 3

File details

Details for the file rogue_ai_sdk-0.1.4.tar.gz.

File metadata

Download URL: rogue_ai_sdk-0.1.4.tar.gz
Upload date: Oct 1, 2025
Size: 50.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rogue_ai_sdk-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`64ff99133fe115cab37693e0ced6a5d8275350030ec466f0f71174c690550958`
MD5	`de50015ed661b9f8defeebdb3b91848f`
BLAKE2b-256	`09f45c54f83b136068b70b7db928dc05773a03f4398e66dd184e48a8f8a5be87`
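
The published digests can be checked locally after downloading a file. The following is a minimal sketch using only the Python standard library; `sha256_matches` is a hypothetical helper name, and the path is wherever the archive was saved.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the SHA256 digest of the file at `path` equals `expected_hex`."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.lower())
```

For the source distribution listed above, the call would be `sha256_matches("rogue_ai_sdk-0.1.4.tar.gz", "64ff99133fe115cab37693e0ced6a5d8275350030ec466f0f71174c690550958")`.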

Provenance

The following attestation bundles were made for rogue_ai_sdk-0.1.4.tar.gz:

Publisher: release.yml on qualifire-dev/rogue

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rogue_ai_sdk-0.1.4.tar.gz
- Subject digest: 64ff99133fe115cab37693e0ced6a5d8275350030ec466f0f71174c690550958
- Sigstore transparency entry: 575192475
- Sigstore integration time: Oct 1, 2025
Source repository:
- Permalink: qualifire-dev/rogue@3701897345b8959727cc403e87b050b981b52496
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/qualifire-dev
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@3701897345b8959727cc403e87b050b981b52496
- Trigger Event: push

File details

Details for the file rogue_ai_sdk-0.1.4-py3-none-any.whl.

File metadata

Download URL: rogue_ai_sdk-0.1.4-py3-none-any.whl
Upload date: Oct 1, 2025
Size: 20.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for rogue_ai_sdk-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5165385a4ded1caa635bbd57362dcf2363fbc002a9caaf4e733a10ae023fa3c5`
MD5	`0b7fbe986920af2cfc846e280435e26f`
BLAKE2b-256	`8a0e1f2fff17072dabfa3b5646c1d68aa9b18b67597c8e8d661f801962ef05c3`

Provenance

The following attestation bundles were made for rogue_ai_sdk-0.1.4-py3-none-any.whl:

Publisher: release.yml on qualifire-dev/rogue

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rogue_ai_sdk-0.1.4-py3-none-any.whl
- Subject digest: 5165385a4ded1caa635bbd57362dcf2363fbc002a9caaf4e733a10ae023fa3c5
- Sigstore transparency entry: 575192495
- Sigstore integration time: Oct 1, 2025
Source repository:
- Permalink: qualifire-dev/rogue@3701897345b8959727cc403e87b050b981b52496
- Branch / Tag: refs/tags/v0.1.5
- Owner: https://github.com/qualifire-dev
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@3701897345b8959727cc403e87b050b981b52496
- Trigger Event: push

rogue-ai-sdk 0.1.4
