Rogue agent evaluator by Rogue Security
Project description
Rogue — AI Agent Evaluator & Red Team Platform
Two Ways to Harden Your Agent
🎯 Automatic EvaluationTest your agent against business policies and expected behaviors.
Best for: Regression testing, behavior validation, policy compliance |
🔴 Red TeamingSimulate adversarial attacks to find security vulnerabilities.
Best for: Security audits, penetration testing, compliance reporting |
Architecture
Rogue operates on a client-server architecture with multiple interfaces:
| Component | Description |
|---|---|
| Server | Core evaluation & red team logic |
| TUI | Modern terminal interface (Go + Bubble Tea) |
| CLI | Non-interactive mode for CI/CD pipelines |
https://github.com/user-attachments/assets/b5c04772-6916-4aab-825b-6a7476d77787
Supported Protocols
| Protocol | Transport | Description |
|---|---|---|
| A2A | HTTP | Google's Agent-to-Agent protocol |
| MCP | SSE, STREAMABLE_HTTP | Model Context Protocol via send_message tool |
| Python | — | Direct Python function calls (no network protocol) |
See examples in examples/ for reference implementations.
Python Entrypoint
For agents implemented as Python functions without A2A or MCP:
- Create a Python file with a
call_agentfunction:
def call_agent(messages: list[dict]) -> str:
"""
Process conversation and return response.
Args:
messages: List of {"role": "user"|"assistant", "content": "..."}
Returns:
Agent's response as a string
"""
# Your agent logic here
latest = messages[-1]["content"]
return f"Response to: {latest}"
- Run Rogue with Python protocol:
uvx rogue-ai cli \
--protocol python \
--python-entrypoint-file ./my_agent.py \
--judge-llm openai/gpt-4o-mini
Or via TUI: select "Python" as the protocol and enter the file path.
See examples/python_entrypoint_stub.py for a complete example.
🔥 Quick Start
Prerequisites
uvx— Install uv- Python 3.10+
- LLM API key (OpenAI, Anthropic, or Google)
Installation
# TUI (recommended)
uvx rogue-ai
# CLI / CI/CD
uvx rogue-ai cli
Try It With the Example Agent
# All-in-one: starts both Rogue and a sample T-shirt store agent
uvx rogue-ai --example=tshirt_store
Configure in the UI:
- Agent URL:
http://localhost:10001 - Mode: Choose
Automatic EvaluationorRed Teaming
Running Modes
| Mode | Command | Description |
|---|---|---|
| Default | uvx rogue-ai |
Server + TUI |
| Server | uvx rogue-ai server |
Backend only |
| TUI | uvx rogue-ai tui |
Terminal client |
| CLI | uvx rogue-ai cli |
Non-interactive (CI/CD) |
Server Options
uvx rogue-ai server --host 0.0.0.0 --port 8000 --debug
CLI Options
uvx rogue-ai cli \
--evaluated-agent-url http://localhost:10001 \
--judge-llm openai/gpt-4o-mini \
--business-context-file ./.rogue/business_context.md
| Option | Description |
|---|---|
--config-file |
Path to config JSON |
--evaluated-agent-url |
Agent endpoint (required) |
--judge-llm |
LLM for evaluation (required) |
--business-context |
Context string or --business-context-file |
--input-scenarios-file |
Scenarios JSON |
--output-report-file |
Report output path |
Red Teaming
Scan Types
| Type | Vulnerabilities | Attacks | Time |
|---|---|---|---|
| Basic | 5 curated | 6 | ~2-3 min |
| Full | 75+ | 40+ | ~30-45 min |
| Custom | User-selected | User-selected | Varies |
Compliance Frameworks
- OWASP LLM Top 10 — Prompt injection, sensitive data exposure, excessive agency
- MITRE ATLAS — Adversarial threat landscape for AI systems
- NIST AI RMF — AI risk management framework
- ISO/IEC 42001 — AI management system standard
- EU AI Act — European AI regulation compliance
- GDPR — Data protection requirements
- OWASP API Top 10 — API security best practices
Attack Categories
| Category | Examples |
|---|---|
| Encoding | Base64, ROT13, Leetspeak |
| Social Engineering | Roleplay, trust building |
| Injection | Prompt injection, SQL injection |
| Semantic | Goal redirection, context poisoning |
| Technical | Gray-box probing, permission escalation |
Risk Scoring (CVSS-based)
Each vulnerability receives a 0-10 risk score based on:
- Impact — Severity if exploited
- Exploitability — Success rate likelihood
- Human Factor — Manual exploitation potential
- Complexity — Attack difficulty
Reproducible Scans
# Use random seeds for reproducible results
uvx rogue-ai cli --random-seed 42
Perfect for regression testing and validating security fixes.
Configuration
Environment Variables
OPENAI_API_KEY="sk-..."
ANTHROPIC_API_KEY="sk-..."
GOOGLE_API_KEY="..."
Config File (.rogue/user_config.json)
{
"evaluated_agent_url": "http://localhost:10001",
"judge_llm": "openai/gpt-4o-mini"
}
Key Features
| Feature | Description |
|---|---|
| 🔄 Dynamic Scenarios | Auto-generate tests from business context |
| 👀 Live Monitoring | Watch agent conversations in real-time |
| 📊 Comprehensive Reports | Markdown, CSV, JSON exports |
| 🔍 Multi-Faceted Testing | Policy compliance + security vulnerabilities |
| 🤖 Model Support | OpenAI, Anthropic, Google (via LiteLLM) |
| 🛡️ CVSS Scoring | Industry-standard risk assessment |
| 🔁 Reproducible | Deterministic scans with random seeds |
Documentation
- Quick Reference — One-page cheat sheet
- Red Team Workflow — Technical deep-dive
- Implementation Status — Feature breakdown
- Attack Mapping — Vulnerability coverage
Contributing
- Fork the repository
- Create a branch (
git checkout -b feature/amazing-feature) - Commit changes (
git commit -m 'Add amazing feature') - Push (
git push origin feature/amazing-feature) - Open a Pull Request
License
Licensed under a proprietary license — see LICENSE.
Free for personal and internal use. Commercial hosting requires licensing.
Contact: hello@rogue.security
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file rogue_ai-0.6.3.tar.gz.
File metadata
- Download URL: rogue_ai-0.6.3.tar.gz
- Upload date:
- Size: 14.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9c97e18c016801c7bf921d9fc176a4f57303fe0af2e11218ea79b2d8eebd9961
|
|
| MD5 |
94438b3fc5f2e77394176678f625cec7
|
|
| BLAKE2b-256 |
86e6969ab991d8c1390db78341fa048fb9cb635d0899be3dbe6caaa6cb36b8f0
|
Provenance
The following attestation bundles were made for rogue_ai-0.6.3.tar.gz:
Publisher:
release.yml on qualifire-dev/rogue
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
rogue_ai-0.6.3.tar.gz -
Subject digest:
9c97e18c016801c7bf921d9fc176a4f57303fe0af2e11218ea79b2d8eebd9961 - Sigstore transparency entry: 1400604701
- Sigstore integration time:
-
Permalink:
qualifire-dev/rogue@394435851c88225c3d75175577e1442faa02c825 -
Branch / Tag:
refs/tags/v0.6.3 - Owner: https://github.com/qualifire-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@394435851c88225c3d75175577e1442faa02c825 -
Trigger Event:
push
-
Statement type:
File details
Details for the file rogue_ai-0.6.3-py3-none-any.whl.
File metadata
- Download URL: rogue_ai-0.6.3-py3-none-any.whl
- Upload date:
- Size: 342.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
231ec3059999d79fa28c17167dfc9717e74d9a9520d0001baeff6c2e0ba27ebb
|
|
| MD5 |
b50c45ce0416b4fbe14a5d15db802c0b
|
|
| BLAKE2b-256 |
ff08d728132e556f20a269217077c4f63022a2df5bb398cf03442437c8636e0a
|
Provenance
The following attestation bundles were made for rogue_ai-0.6.3-py3-none-any.whl:
Publisher:
release.yml on qualifire-dev/rogue
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
rogue_ai-0.6.3-py3-none-any.whl -
Subject digest:
231ec3059999d79fa28c17167dfc9717e74d9a9520d0001baeff6c2e0ba27ebb - Sigstore transparency entry: 1400604800
- Sigstore integration time:
-
Permalink:
qualifire-dev/rogue@394435851c88225c3d75175577e1442faa02c825 -
Branch / Tag:
refs/tags/v0.6.3 - Owner: https://github.com/qualifire-dev
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@394435851c88225c3d75175577e1442faa02c825 -
Trigger Event:
push
-
Statement type: