Rogue agent evaluator by Qualifire

Rogue — AI Agent Evaluator & Red Team Platform

Stress-test your AI agents before attackers do.


Two Ways to Harden Your Agent

🎯 Automatic Evaluation

Test your agent against business policies and expected behaviors.

  • Define scenarios & expected outcomes
  • Verify compliance with business rules
  • Watch live conversations as Rogue probes your agent
  • Get detailed pass/fail reports with reasoning

Best for: Regression testing, behavior validation, policy compliance

🔴 Red Teaming

Simulate adversarial attacks to find security vulnerabilities.

  • 75+ vulnerabilities across 12 security categories
  • 20 attack techniques (encoding, social engineering, injection)
  • CVSS-based risk scoring
  • 8 compliance frameworks (OWASP, MITRE, NIST, GDPR, EU AI Act)

Best for: Security audits, penetration testing, compliance reporting


Architecture

Rogue operates on a client-server architecture with multiple interfaces:

| Component | Description |
| --- | --- |
| Server | Core evaluation & red team logic |
| TUI | Modern terminal interface (Go + Bubble Tea) |
| CLI | Non-interactive mode for CI/CD pipelines |

https://github.com/user-attachments/assets/b5c04772-6916-4aab-825b-6a7476d77787

Supported Protocols

| Protocol | Transport | Description |
| --- | --- | --- |
| A2A | HTTP | Google's Agent-to-Agent protocol |
| MCP | SSE, STREAMABLE_HTTP | Model Context Protocol via send_message tool |

See examples in examples/ for reference implementations.


🔥 Quick Start

Prerequisites

  • uvx (install uv)
  • Python 3.10+
  • LLM API key (OpenAI, Anthropic, or Google)

Installation

# TUI (recommended)
uvx rogue-ai

# CLI / CI/CD
uvx rogue-ai cli

Try It With the Example Agent

# All-in-one: starts both Rogue and a sample T-shirt store agent
uvx rogue-ai --example=tshirt_store

Configure in the UI:

  • Agent URL: http://localhost:10001
  • Mode: Choose Automatic Evaluation or Red Teaming

Running Modes

| Mode | Command | Description |
| --- | --- | --- |
| Default | uvx rogue-ai | Server + TUI |
| Server | uvx rogue-ai server | Backend only |
| TUI | uvx rogue-ai tui | Terminal client |
| CLI | uvx rogue-ai cli | Non-interactive (CI/CD) |

Server Options

uvx rogue-ai server --host 0.0.0.0 --port 8000 --debug

CLI Options

uvx rogue-ai cli \
  --evaluated-agent-url http://localhost:10001 \
  --judge-llm openai/gpt-4o-mini \
  --business-context-file ./.rogue/business_context.md

| Option | Description |
| --- | --- |
| --config-file | Path to config JSON |
| --evaluated-agent-url | Agent endpoint (required) |
| --judge-llm | LLM for evaluation (required) |
| --business-context | Context string (alternatively, pass a file path with --business-context-file) |
| --input-scenarios-file | Scenarios JSON |
| --output-report-file | Report output path |
| --deep-test-mode | Extended testing |
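
For CI/CD jobs it can help to assemble the command line in one place. The sketch below is a minimal illustration, assuming only the flags listed above; the helper name `build_rogue_cli_args` and its parameters are hypothetical and not part of Rogue.

```python
import shlex
from typing import Optional


def build_rogue_cli_args(
    evaluated_agent_url: str,
    judge_llm: str,
    business_context_file: Optional[str] = None,
    input_scenarios_file: Optional[str] = None,
    output_report_file: Optional[str] = None,
) -> list[str]:
    """Assemble an argv list for `uvx rogue-ai cli` (hypothetical helper)."""
    # The two options the table marks as required always come first.
    args = [
        "uvx", "rogue-ai", "cli",
        "--evaluated-agent-url", evaluated_agent_url,
        "--judge-llm", judge_llm,
    ]
    # Optional flags are appended only when a value was supplied.
    optional = {
        "--business-context-file": business_context_file,
        "--input-scenarios-file": input_scenarios_file,
        "--output-report-file": output_report_file,
    }
    for flag, value in optional.items():
        if value is not None:
            args += [flag, value]
    return args


args = build_rogue_cli_args(
    "http://localhost:10001",
    "openai/gpt-4o-mini",
    business_context_file="./.rogue/business_context.md",
)
# Print a copy-pasteable shell command for the CI log.
print(shlex.join(args))
```

The resulting list can be handed to `subprocess.run(args)` from a pipeline step; passing a list rather than a single string avoids shell quoting issues.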

Red Teaming

Scan Types

| Type | Vulnerabilities | Attacks | Time |
| --- | --- | --- | --- |
| Basic | 5 curated | 6 | ~2-3 min |
| Full | 75+ | 40+ | ~30-45 min |
| Custom | User-selected | User-selected | Varies |

Compliance Frameworks

  • OWASP LLM Top 10 — Prompt injection, sensitive data exposure, excessive agency
  • MITRE ATLAS — Adversarial threat landscape for AI systems
  • NIST AI RMF — AI risk management framework
  • ISO/IEC 42001 — AI management system standard
  • EU AI Act — European AI regulation compliance
  • GDPR — Data protection requirements
  • OWASP API Top 10 — API security best practices

Attack Categories

| Category | Examples |
| --- | --- |
| Encoding | Base64, ROT13, Leetspeak |
| Social Engineering | Roleplay, trust building |
| Injection | Prompt injection, SQL injection |
| Semantic | Goal redirection, context poisoning |
| Technical | Gray-box probing, permission escalation |
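
To make the Encoding category concrete, the sketch below shows how a single probe string can be rewritten in the three encodings named above. It only illustrates the idea and is not Rogue's implementation; the function name `encode_variants` and the leetspeak substitution table are assumptions.

```python
import base64
import codecs

# Hypothetical leetspeak table; real tools may use a different mapping.
LEET_MAP = str.maketrans(
    {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
)


def encode_variants(prompt: str) -> dict[str, str]:
    """Return the prompt re-encoded as Base64, ROT13, and leetspeak."""
    return {
        "base64": base64.b64encode(prompt.encode("utf-8")).decode("ascii"),
        "rot13": codecs.encode(prompt, "rot13"),
        "leetspeak": prompt.lower().translate(LEET_MAP),
    }


for name, text in encode_variants("hello").items():
    print(f"{name}: {text}")
```

An agent that decodes and follows such input without re-applying its policy checks is the kind of weakness this attack category is meant to surface.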

Risk Scoring (CVSS-based)

Each vulnerability receives a 0-10 risk score based on:

  • Impact — Severity if exploited
  • Exploitability — Success rate likelihood
  • Human Factor — Manual exploitation potential
  • Complexity — Attack difficulty
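
This README does not state how the four factors are combined. Purely as an illustration of how such inputs could yield a 0-10 score, the sketch below uses a weighted average; the weights, the 0-10 input scale, and the names `RiskFactors` and `risk_score` are all assumptions and do not describe Rogue's actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights (sum to 1.0); NOT Rogue's real scoring formula.
WEIGHTS = {
    "impact": 0.40,
    "exploitability": 0.30,
    "human_factor": 0.15,
    "complexity": 0.15,
}


@dataclass(frozen=True)
class RiskFactors:
    impact: float          # 0-10, severity if exploited
    exploitability: float  # 0-10, likelihood the attack succeeds
    human_factor: float    # 0-10, potential for manual exploitation
    complexity: float      # 0-10, higher means the attack is harder


def risk_score(factors: RiskFactors) -> float:
    """Combine the four factors into a 0-10 score (illustrative only)."""
    for name in WEIGHTS:
        value = getattr(factors, name)
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be between 0 and 10, got {value}")
    # A harder attack should lower the risk, so complexity is inverted.
    ease = 10 - factors.complexity
    raw = (
        WEIGHTS["impact"] * factors.impact
        + WEIGHTS["exploitability"] * factors.exploitability
        + WEIGHTS["human_factor"] * factors.human_factor
        + WEIGHTS["complexity"] * ease
    )
    # Clamp to guard against floating-point drift, then round for display.
    return round(min(10.0, max(0.0, raw)), 1)


print(risk_score(RiskFactors(impact=8, exploitability=6, human_factor=4, complexity=3)))
```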

Reproducible Scans

# Use random seeds for reproducible results
uvx rogue-ai cli --random-seed 42

Perfect for regression testing and validating security fixes.
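
The principle behind a seeded scan can be sketched in a few lines: a generator initialized with the same seed makes the same choices on every run. This is a general illustration, not Rogue's internals; `ATTACKS` and `plan_scan` are hypothetical names, and the attack labels are taken from the categories table above.

```python
import random

# Hypothetical pool of attack labels drawn from the categories above.
ATTACKS = ["base64", "rot13", "leetspeak", "roleplay", "prompt-injection"]


def plan_scan(seed: int, count: int = 3) -> list[str]:
    """Pick `count` attacks using a generator local to this call."""
    # A dedicated Random instance avoids interference from other code
    # that uses the module-level generator.
    rng = random.Random(seed)
    return rng.sample(ATTACKS, k=count)


first = plan_scan(42)
second = plan_scan(42)
print(first == second)  # same seed, same selection
```

A seed fixes only the choices drawn from the generator, so comparing two runs is most meaningful when the other inputs are kept the same as well.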


Configuration

Environment Variables

OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-..."
GOOGLE_API_KEY="..."
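
Before starting a long scan, a short preflight check can confirm that at least one of these variables is set. The sketch below is a convenience script, not part of Rogue; `configured_keys` is a hypothetical helper.

```python
import os
from typing import Mapping

# The three variables listed above.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def configured_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of provider keys that are set and non-empty."""
    return [name for name in PROVIDER_KEYS if env.get(name, "").strip()]


found = configured_keys()
if found:
    print("API keys found:", ", ".join(found))
else:
    print("No LLM API key is set; export one before running a scan.")
```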

Config File (.rogue/user_config.json)

{
  "evaluated_agent_url": "http://localhost:10001",
  "judge_llm": "openai/gpt-4o-mini"
}
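
A config file like this can be generated and sanity-checked from a script. The sketch below writes the two keys shown above to a temporary directory and reads them back; treating both keys as required mirrors the CLI options table and is an assumption about the config file, and `load_user_config` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

# Assumed required, mirroring the options the CLI marks as required.
REQUIRED_KEYS = ("evaluated_agent_url", "judge_llm")


def load_user_config(path: Path) -> dict:
    """Read a config JSON file and verify the assumed-required keys."""
    config = json.loads(path.read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing keys: {', '.join(missing)}")
    return config


# Demonstrate with a throwaway directory instead of the real project root.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / ".rogue" / "user_config.json"
    config_path.parent.mkdir(parents=True)
    config_path.write_text(
        json.dumps(
            {
                "evaluated_agent_url": "http://localhost:10001",
                "judge_llm": "openai/gpt-4o-mini",
            },
            indent=2,
        ),
        encoding="utf-8",
    )
    loaded = load_user_config(config_path)

print(loaded["judge_llm"])
```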

Key Features

| Feature | Description |
| --- | --- |
| 🔄 Dynamic Scenarios | Auto-generate tests from business context |
| 👀 Live Monitoring | Watch agent conversations in real-time |
| 📊 Comprehensive Reports | Markdown, CSV, JSON exports |
| 🔍 Multi-Faceted Testing | Policy compliance + security vulnerabilities |
| 🤖 Model Support | OpenAI, Anthropic, Google (via LiteLLM) |
| 🛡️ CVSS Scoring | Industry-standard risk assessment |
| 🔁 Reproducible | Deterministic scans with random seeds |

Documentation


Contributing

  1. Fork the repository
  2. Create a branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

Licensed under a proprietary license — see LICENSE.

Free for personal and internal use. Commercial hosting requires licensing. Contact: admin@qualifire.ai
