Dreadnode SDK
Dreadnode Strikes SDK
Strikes is a comprehensive platform for building, experimenting with, and evaluating AI security agents.
Key Features
- Agents - Build multi-step reasoning agents with tools, hooks, and scoring
- Tasks & Runs - Structure experiments with tracked inputs, outputs, and metrics
- Evaluations - Run agents against datasets with composable scorers
- AIRT - Adversarial AI Research Tools for security testing (TAP, GOAT, image attacks)
- Observability - OpenTelemetry-based tracing with span hierarchy
- Datasets & Models - HuggingFace integration with local CAS storage
- Deployment - Serve agents via FastAPI, Cloudflare Workers, or Ray
Quick Example
import asyncio

import dreadnode as dn

dn.configure()

# Define a tool
@dn.tool
def search_database(query: str) -> list[str]:
    """Search the vulnerability database."""
    return ["CVE-2024-1234", "CVE-2024-5678"]

# Create an agent with tools
@dn.agent(model="openai/gpt-4o", tools=[search_database])
def security_analyst():
    """You are a security analyst. Find and analyze vulnerabilities."""

# Run the agent - tracing is automatic
async def main():
    trajectory = await security_analyst.run(
        "Analyze recent vulnerabilities in the database"
    )
    print(f"Completed in {len(trajectory.steps)} steps")
    print(f"Token usage: {trajectory.usage.total_tokens}")

# Start the event loop so main() actually runs
asyncio.run(main())
Agents
Create agents with tools, hooks, and real-time scoring:
import dreadnode as dn
from dreadnode import tool
from dreadnode.core.agents.reactions import Finish, Continue

# Tools with type hints
@tool
def scan_ports(host: str) -> list[int]:
    """Scan for open ports on a host."""
    return [22, 80, 443]  # Simplified example

# Agent with configuration
@dn.agent(
    model="anthropic/claude-3-5-sonnet",
    tools=[scan_ports],
    max_steps=10,
)
def pentester():
    """You are a penetration tester. Find security issues."""

# Hooks for control flow
@pentester.hook
async def check_progress(event):
    if "found vulnerability" in str(event):
        return Finish("Vulnerability discovered")
    return Continue()

# Run the agent (top-level await: use a notebook or wrap this in an async function)
trajectory = await pentester.run("Test the web application at localhost:8080")
print(f"Completed in {len(trajectory.steps)} steps")
print(f"Token usage: {trajectory.usage.total_tokens}")
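The hook above steers the agent loop by returning a reaction. As a rough illustration of that control flow, here is a minimal plain-Python sketch that does not use the SDK: the `Continue` and `Finish` classes, the `run_steps` loop, and the string events are all hypothetical stand-ins, and the real agent loop may order or combine hooks differently.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Continue:
    """Hypothetical reaction: keep stepping."""


@dataclass
class Finish:
    """Hypothetical reaction: stop the loop and record why."""
    reason: str


Reaction = Union[Continue, Finish]
Hook = Callable[[str], Reaction]


def run_steps(events: list[str], hooks: list[Hook], max_steps: int = 10) -> tuple[int, str]:
    """Feed events to the hooks one step at a time; return (steps_taken, stop_reason)."""
    steps = 0
    for event in events[:max_steps]:
        steps += 1
        for hook in hooks:
            reaction = hook(event)
            if isinstance(reaction, Finish):
                # The first Finish ends the run early
                return steps, reaction.reason
    return steps, "events exhausted or max_steps reached"


def check_progress(event: str) -> Reaction:
    # Same substring test as the hook above, applied to a plain string event
    if "found vulnerability" in event:
        return Finish("Vulnerability discovered")
    return Continue()


events = ["scanned ports", "found vulnerability in login form", "wrote report"]
steps, reason = run_steps(events, [check_progress])
print(steps, reason)  # 2 Vulnerability discovered
```

The third event is never processed because the hook returns `Finish` on the second step, which is the early-exit behavior the hook is meant to provide.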
Evaluations
Run systematic evaluations with datasets and scorers:
from dreadnode import Evaluation
from dreadnode.scorers import contains, llm_judge, and_, not_

# Compose scorers
quality = and_(
    contains("vulnerability", case_sensitive=False),
    not_(contains("error")),
)
judge = llm_judge(
    model="openai/gpt-4o-mini",
    rubric="Rate the security analysis from 1-10 based on thoroughness.",
)

# Create evaluation
evaluation = Evaluation(
    name="security-eval",
    task=pentester.as_task(),
    dataset=[
        {"target": "webapp-1", "goal": "Find SQL injection"},
        {"target": "webapp-2", "goal": "Find XSS vulnerabilities"},
        {"target": "api-server", "goal": "Test authentication"},
    ],
    scorers=[quality, judge],
    concurrency=3,
)

# Run evaluation
result = await evaluation.run()
print(f"Average score: {result.metrics['judge'].mean()}")
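To make the scorer composition concrete, the following plain-Python sketch reimplements `contains`, `and_`, and `not_` as simple boolean predicates. These are illustrative stand-ins written for this example, not the `dreadnode.scorers` implementations, which presumably return richer score objects than a bare `bool`.

```python
from typing import Callable

Scorer = Callable[[str], bool]


def contains(needle: str, case_sensitive: bool = True) -> Scorer:
    """Pass when `needle` occurs in the output."""
    def score(output: str) -> bool:
        if case_sensitive:
            return needle in output
        return needle.lower() in output.lower()
    return score


def not_(scorer: Scorer) -> Scorer:
    """Invert a scorer."""
    return lambda output: not scorer(output)


def and_(*scorers: Scorer) -> Scorer:
    """Pass only when every wrapped scorer passes."""
    return lambda output: all(s(output) for s in scorers)


quality = and_(
    contains("vulnerability", case_sensitive=False),
    not_(contains("error")),
)

print(quality("Found a SQL injection Vulnerability in /login"))  # True
print(quality("error: target unreachable"))  # False
```

Under this reading, `quality` passes only for outputs that mention a vulnerability (in any letter case) and do not contain the lowercase string "error".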
AIRT (Adversarial AI Research Tools)
Security testing tools for LLMs and classifiers:
from dreadnode.airt import LLMTarget
from dreadnode.airt.attacks import tap_attack, prompt_attack
from dreadnode.airt.transforms.cipher import rot13_cipher, caesar_cipher

# Define target
target = LLMTarget(model="openai/gpt-4o-mini")

# TAP attack (Tree of Attacks)
attack = tap_attack(
    goal="Extract the system prompt",
    target=target,
    attacker_model="openai/gpt-4o-mini",
    evaluator_model="openai/gpt-4o-mini",
    beam_width=10,
)
result = await attack.run(max_iterations=20)
print(f"Best score: {result.best_score}")

# Text transforms for evasion
rot13 = rot13_cipher()
caesar = caesar_cipher(offset=3)
combined = rot13 | caesar  # Compose transforms
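For intuition about what a composed cipher transform does to text, here is a standalone sketch using only the standard library. The `caesar` and `compose` helpers are hypothetical names for this example, and it assumes composition applies transforms left to right; the SDK's `|` operator may behave differently.

```python
import string
from typing import Callable

Transform = Callable[[str], str]


def caesar(offset: int) -> Transform:
    """Shift ASCII letters by `offset`, leaving other characters unchanged."""
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    k = offset % 26
    table = str.maketrans(lower + upper, lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return lambda text: text.translate(table)


def compose(*transforms: Transform) -> Transform:
    """Apply transforms left to right."""
    def apply(text: str) -> str:
        for transform in transforms:
            text = transform(text)
        return text
    return apply


rot13 = caesar(13)  # ROT13 is a Caesar shift of 13
combined = compose(rot13, caesar(3))

print(rot13("Hello"))     # Uryyb
print(combined("Hello"))  # Xubbe
```

Note that chaining two Caesar-style shifts is equivalent to a single shift by the sum of the offsets (here 16), so stacking them adds no extra obfuscation beyond a different offset.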
Datasets & Models
HuggingFace integration with local storage:
from dreadnode.datasets import Dataset
from dreadnode.models import Model
# Load dataset
dataset = Dataset.from_hf("squad", split="train[:100]")
# Transform and filter
dataset = dataset.map(lambda x: {"input": x["question"]})
dataset = dataset.filter(lambda x: len(x["input"]) > 10)
# Save locally
dataset.save("my-dataset")
# Load models
model = Model.from_hf("bert-base-uncased")
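The map and filter steps above can be pictured on plain dictionaries. This sketch assumes `map` merges the returned keys into each row (as Hugging Face `datasets` does) and that `filter` keeps rows for which the predicate is true; the sample records are made up for illustration.

```python
records = [
    {"question": "Why?"},
    {"question": "How does SQL injection work?"},
]

# map: derive an "input" field from each record's "question"
mapped = [{**row, "input": row["question"]} for row in records]

# filter: keep only rows whose input is longer than 10 characters
filtered = [row for row in mapped if len(row["input"]) > 10]

print(len(mapped), len(filtered))  # 2 1
print(filtered[0]["input"])        # How does SQL injection work?
```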
Tracing & Observability
Agents have built-in observability. For lower-level task workflows, use explicit tracing:
import dreadnode as dn

# Agents trace automatically
trajectory = await security_analyst.run("Analyze the target")
# All steps, tool calls, and generations are traced

# For custom task workflows, use explicit runs
@dn.task
async def analyze(target: str) -> dict:
    dn.log_input("target", target)
    result = {"status": "complete"}
    dn.log_output("result", result)
    dn.log_metric("quality", 0.95)
    return result

with dn.run(name="custom-analysis"):
    await analyze("webapp")
Deployment
Serve agents as HTTP endpoints:
from dreadnode.core.integrations.serve import Serve, AuthMode

# Configure server
server = (
    Serve()
    .with_auth(AuthMode.API_KEY)
    .add(security_analyst, path="/analyze")
    .add(pentester, path="/pentest")
)

# Run server
server.run(host="0.0.0.0", port=8000)

# Or get FastAPI app for custom configuration
app = server.app()
Installation
Install from PyPI:
pip install -U dreadnode
With optional features:
# Multimodal support (audio, video, images)
pip install -U "dreadnode[multimodal]"
# Training integration (transformers callbacks)
pip install -U "dreadnode[training]"
# All optional features
pip install -U "dreadnode[all]"
From source:
git clone https://github.com/dreadnode/sdk
cd sdk
uv sync --all-extras
Notebooks
Comprehensive Jupyter notebook tutorials are available in notebooks/:
| Category | Notebooks |
|---|---|
| Getting Started | 01: SDK Basics, 02: Tasks & Runs |
| Agents | 03-07: Basics, Tools, Hooks, Scoring, Advanced |
| Evaluation | 08-11: Scorers, Evaluations |
| Data | 12-14: Datasets, Models, Data Types |
| Security (AIRT) | 15-17: Targets, Attacks, Transforms |
| Advanced | 18-24: Search, Generators, Data Designer, Deployment, Tracing, Packaging, Training |
| Developers | 25-27: Config System, Context Injection, Custom Components |
| Environments | 28: Docker, Jupyter Kernel, Kubernetes Sandbox |
| Studies | 29: Optimization Studies, Agent Tuning, Search Strategies |
Documentation
- Installation Guide - Setup options
- Introduction - Getting started guide
- API Reference - Complete API documentation
Examples
Check out dreadnode/example-agents for real-world use cases.
License
See LICENSE for details.