Dreadnode SDK

Dreadnode Strikes SDK


Strikes is a platform for building, experimenting with, and evaluating AI security agents.

Key Features

  • Agents - Build multi-step reasoning agents with tools, hooks, and scoring
  • Tasks & Runs - Structure experiments with tracked inputs, outputs, and metrics
  • Evaluations - Run agents against datasets with composable scorers
  • AIRT - Adversarial AI Research Tools for security testing (TAP, GOAT, image attacks)
  • Observability - OpenTelemetry-based tracing with span hierarchy
  • Datasets & Models - HuggingFace integration with local CAS storage
  • Deployment - Serve agents via FastAPI, Cloudflare Workers, or Ray

Quick Example

import asyncio

import dreadnode as dn

dn.configure()

# Define a tool
@dn.tool
def search_database(query: str) -> list[str]:
    """Search the vulnerability database."""
    return ["CVE-2024-1234", "CVE-2024-5678"]  # Placeholder results

# Create an agent with tools
@dn.agent(model="openai/gpt-4o", tools=[search_database])
def security_analyst():
    """You are a security analyst. Find and analyze vulnerabilities."""

# Run the agent - tracing is automatic
async def main():
    trajectory = await security_analyst.run(
        "Analyze recent vulnerabilities in the database"
    )

    print(f"Completed in {len(trajectory.steps)} steps")
    print(f"Token usage: {trajectory.usage.total_tokens}")

# Start the event loop so main() actually executes
asyncio.run(main())

Agents

Create agents with tools, hooks, and real-time scoring:

import dreadnode as dn
from dreadnode import tool
from dreadnode.core.agents.reactions import Finish, Continue

# Tools with type hints
@tool
def scan_ports(host: str) -> list[int]:
    """Scan for open ports on a host."""
    return [22, 80, 443]  # Simplified example

# Agent with configuration
@dn.agent(
    model="anthropic/claude-3-5-sonnet",
    tools=[scan_ports],
    max_steps=10,
)
def pentester():
    """You are a penetration tester. Find security issues."""

# Hooks for control flow
@pentester.hook
async def check_progress(event):
    if "found vulnerability" in str(event):
        return Finish("Vulnerability discovered")
    return Continue()

# Run the agent (top-level await works in a notebook; in a script, call this from an async function)
trajectory = await pentester.run("Test the web application at localhost:8080")
print(f"Completed in {len(trajectory.steps)} steps")
print(f"Token usage: {trajectory.usage.total_tokens}")
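The hook above steers the agent by returning a reaction. As a rough illustration of that control flow, here is a minimal, synchronous sketch in plain Python. The `Finish` and `Continue` classes, the `run_loop` function, and the string events are hypothetical stand-ins written for this example; they are not the SDK's classes, and the real agent loop may work differently.

```python
from dataclasses import dataclass
from typing import Callable, Union


# Hypothetical stand-ins for reaction types; not the SDK's real classes.
@dataclass
class Finish:
    reason: str


@dataclass
class Continue:
    pass


Reaction = Union[Finish, Continue]
Hook = Callable[[str], Reaction]


def run_loop(events: list[str], hooks: list[Hook], max_steps: int = 10) -> tuple[int, str]:
    """Process events in order until a hook returns Finish or max_steps is hit.

    Returns (steps_taken, stop_reason).
    """
    steps = 0
    for event in events[:max_steps]:
        steps += 1
        for hook in hooks:
            reaction = hook(event)
            if isinstance(reaction, Finish):
                return steps, reaction.reason
    return steps, "step limit or end of events reached"


def check_progress(event: str) -> Reaction:
    # Same condition as the hook above, applied to a plain string event.
    if "found vulnerability" in event:
        return Finish("Vulnerability discovered")
    return Continue()


events = ["scanning ports", "found vulnerability in login form", "writing report"]
steps, reason = run_loop(events, [check_progress])
print(steps, reason)
```

In this sketch the third event is never processed, because the hook ends the loop on the second one.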

Evaluations

Run systematic evaluations with datasets and scorers:

from dreadnode import Evaluation
from dreadnode.scorers import contains, llm_judge, and_, not_

# Compose scorers
quality = and_(
    contains("vulnerability", case_sensitive=False),
    not_(contains("error")),
)

judge = llm_judge(
    model="openai/gpt-4o-mini",
    rubric="Rate the security analysis from 1-10 based on thoroughness.",
)

# Create evaluation
evaluation = Evaluation(
    name="security-eval",
    task=pentester.as_task(),
    dataset=[
        {"target": "webapp-1", "goal": "Find SQL injection"},
        {"target": "webapp-2", "goal": "Find XSS vulnerabilities"},
        {"target": "api-server", "goal": "Test authentication"},
    ],
    scorers=[quality, judge],
    concurrency=3,
)

# Run evaluation
result = await evaluation.run()
print(f"Average score: {result.metrics['judge'].mean()}")
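To make the two ideas in this example concrete (composing scorers, and limiting how many dataset rows run at once), here is a self-contained sketch using only the standard library. The helper names `contains_text`, `all_of`, `negate`, `fake_task`, and `evaluate` are hypothetical and chosen for this illustration; they only mirror the shape of the SDK calls above and say nothing about how the SDK implements them.

```python
import asyncio
from typing import Callable

Scorer = Callable[[str], bool]


# Hypothetical helpers that mirror the idea of composable scorers;
# these are not the SDK's functions.
def contains_text(needle: str, case_sensitive: bool = True) -> Scorer:
    def score(output: str) -> bool:
        if case_sensitive:
            return needle in output
        return needle.lower() in output.lower()
    return score


def all_of(*scorers: Scorer) -> Scorer:
    return lambda output: all(s(output) for s in scorers)


def negate(scorer: Scorer) -> Scorer:
    return lambda output: not scorer(output)


quality = all_of(
    contains_text("vulnerability", case_sensitive=False),
    negate(contains_text("error")),
)


async def fake_task(row: dict[str, str]) -> str:
    """Stand-in for an agent run; returns canned text per target."""
    await asyncio.sleep(0)
    if row["target"] == "api-server":
        return "error: connection refused"
    return f"Vulnerability found on {row['target']}"


async def evaluate(dataset: list[dict[str, str]], concurrency: int) -> float:
    """Score every row with at most `concurrency` tasks in flight; return the pass rate."""
    limit = asyncio.Semaphore(concurrency)

    async def run_one(row: dict[str, str]) -> bool:
        async with limit:
            return quality(await fake_task(row))

    results = await asyncio.gather(*(run_one(row) for row in dataset))
    return sum(results) / len(results)


dataset = [
    {"target": "webapp-1", "goal": "Find SQL injection"},
    {"target": "webapp-2", "goal": "Find XSS vulnerabilities"},
    {"target": "api-server", "goal": "Test authentication"},
]
pass_rate = asyncio.run(evaluate(dataset, concurrency=3))
print(f"Pass rate: {pass_rate:.2f}")
```

The semaphore is what bounds concurrency here: `asyncio.gather` schedules every row immediately, but only `concurrency` of them can be inside the `async with` block at a time.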

AIRT (Adversarial AI Research Tools)

Security testing tools for LLMs and classifiers:

from dreadnode.airt import LLMTarget
from dreadnode.airt.attacks import tap_attack, prompt_attack
from dreadnode.airt.transforms.cipher import rot13_cipher, caesar_cipher

# Define target
target = LLMTarget(model="openai/gpt-4o-mini")

# TAP attack (Tree of Attacks)
attack = tap_attack(
    goal="Extract the system prompt",
    target=target,
    attacker_model="openai/gpt-4o-mini",
    evaluator_model="openai/gpt-4o-mini",
    beam_width=10,
)

result = await attack.run(max_iterations=20)
print(f"Best score: {result.best_score}")

# Text transforms for evasion
rot13 = rot13_cipher()
caesar = caesar_cipher(offset=3)
combined = rot13 | caesar  # Compose transforms
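For readers unfamiliar with these ciphers, the sketch below shows what ROT13 and a Caesar shift do to text, and one way a `|` operator can chain transforms. The `TextTransform` class and `caesar` helper are hypothetical, written only for this example; the left-to-right application order is an assumption and may not match the SDK.

```python
import codecs
from typing import Callable


class TextTransform:
    """Hypothetical wrapper (not the SDK's class) that composes with `|`."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn

    def __call__(self, text: str) -> str:
        return self.fn(text)

    def __or__(self, other: "TextTransform") -> "TextTransform":
        # Assumed order: apply the left transform first, then the right one.
        return TextTransform(lambda text: other(self(text)))


def caesar(offset: int) -> TextTransform:
    """Shift ASCII letters by `offset`, wrapping within the alphabet."""
    def shift(text: str) -> str:
        out = []
        for ch in text:
            if ch.isascii() and ch.isalpha():
                base = ord("a") if ch.islower() else ord("A")
                out.append(chr((ord(ch) - base + offset) % 26 + base))
            else:
                out.append(ch)
        return "".join(out)
    return TextTransform(shift)


# ROT13 via the standard library's "rot_13" text codec.
rot13 = TextTransform(lambda text: codecs.encode(text, "rot_13"))
combined = rot13 | caesar(3)

print(rot13("hello"))     # uryyb
print(combined("hello"))  # xubbe
```

Note that ROT13 followed by a shift of 3 is the same as a single shift of 16, so stacking two rotation ciphers changes the encoding but not its strength.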

Datasets & Models

HuggingFace integration with local storage:

from dreadnode.datasets import Dataset
from dreadnode.models import Model

# Load dataset
dataset = Dataset.from_hf("squad", split="train[:100]")

# Transform and filter
dataset = dataset.map(lambda x: {"input": x["question"]})
dataset = dataset.filter(lambda x: len(x["input"]) > 10)

# Save locally
dataset.save("my-dataset")

# Load models
model = Model.from_hf("bert-base-uncased")
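The feature list mentions local CAS storage. Assuming CAS here stands for content-addressable storage, the general idea is that each object is stored under a hash of its own bytes, so identical content is saved once and can be verified on read. The sketch below illustrates that idea with the standard library only; `cas_put`, `cas_get`, and the directory layout are hypothetical and are not taken from the SDK.

```python
import hashlib
import tempfile
from pathlib import Path


def cas_put(root: Path, data: bytes) -> str:
    """Store `data` under its SHA-256 digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    path = root / digest[:2] / digest
    if not path.exists():  # identical content is written only once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return digest


def cas_get(root: Path, digest: str) -> bytes:
    """Read an object back and check that it still matches its digest."""
    data = (root / digest[:2] / digest).read_bytes()
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("stored object does not match its digest")
    return data


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = cas_put(root, b"example dataset shard")
    second = cas_put(root, b"example dataset shard")  # same bytes, same key
    roundtrip = cas_get(root, first)
    file_count = len([p for p in root.rglob("*") if p.is_file()])

print(first == second, file_count)
```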

Tracing & Observability

Agents have built-in observability. For lower-level task workflows, use explicit tracing:

import dreadnode as dn

# Agents trace automatically
trajectory = await security_analyst.run("Analyze the target")
# All steps, tool calls, and generations are traced

# For custom task workflows, use explicit runs
@dn.task
async def analyze(target: str) -> dict:
    dn.log_input("target", target)
    result = {"status": "complete"}
    dn.log_output("result", result)
    dn.log_metric("quality", 0.95)
    return result

with dn.run(name="custom-analysis"):
    await analyze("webapp")
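One way to picture how logging calls inside a task can attach to the enclosing run is a context variable that holds the active run record. The sketch below shows that pattern with the standard library; the `run` and `log` helpers and the record layout are hypothetical, and the SDK's actual tracing (described above as OpenTelemetry-based) is not implemented this way as far as this document states.

```python
import asyncio
import contextvars
from contextlib import contextmanager
from typing import Iterator, Optional

# Hypothetical run record held in a context variable for this illustration.
_current_run: contextvars.ContextVar[Optional[dict]] = contextvars.ContextVar(
    "current_run", default=None
)


@contextmanager
def run(name: str) -> Iterator[dict]:
    """Open a run scope; log() calls inside it write to this record."""
    record: dict = {"name": name, "inputs": {}, "outputs": {}, "metrics": {}}
    token = _current_run.set(record)
    try:
        yield record
    finally:
        _current_run.reset(token)


def log(kind: str, key: str, value: object) -> None:
    """Attach a value to the active run, or fail if there is none."""
    record = _current_run.get()
    if record is None:
        raise RuntimeError("log called outside of a run")
    record[kind][key] = value


async def analyze(target: str) -> dict:
    log("inputs", "target", target)
    result = {"status": "complete"}
    log("outputs", "result", result)
    log("metrics", "quality", 0.95)
    return result


async def main() -> dict:
    with run("custom-analysis") as record:
        await analyze("webapp")
    return record


record = asyncio.run(main())
print(record)
```

Because the task never receives the run as an argument, the same `analyze` function can be reused under different runs without changing its signature.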

Deployment

Serve agents as HTTP endpoints:

from dreadnode.core.integrations.serve import Serve, AuthMode

# Configure server
server = (
    Serve()
    .with_auth(AuthMode.API_KEY)
    .add(security_analyst, path="/analyze")
    .add(pentester, path="/pentest")
)

# Run server
server.run(host="0.0.0.0", port=8000)

# Or get FastAPI app for custom configuration
app = server.app()

Installation

Install from PyPI:

pip install -U dreadnode

With optional features:

# Multimodal support (audio, video, images)
pip install -U "dreadnode[multimodal]"

# Training integration (transformers callbacks)
pip install -U "dreadnode[training]"

# All optional features
pip install -U "dreadnode[all]"

From source:

git clone https://github.com/dreadnode/sdk
cd sdk
uv sync --all-extras
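After installing, a quick way to confirm which version ended up in the active environment is to query the package metadata with the standard library. This is a small sketch; the `installed_version` helper is a name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("dreadnode")
print(f"dreadnode: {found or 'not installed'}")
```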

Notebooks

Comprehensive Jupyter notebook tutorials are available in notebooks/:

Category          Notebooks
Getting Started   01: SDK Basics, 02: Tasks & Runs
Agents            03-07: Basics, Tools, Hooks, Scoring, Advanced
Evaluation        08-11: Scorers, Evaluations
Data              12-14: Datasets, Models, Data Types
Security (AIRT)   15-17: Targets, Attacks, Transforms
Advanced          18-24: Search, Generators, Data Designer, Deployment, Tracing, Packaging, Training
Developers        25-27: Config System, Context Injection, Custom Components
Environments      28: Docker, Jupyter Kernel, Kubernetes Sandbox
Studies           29: Optimization Studies, Agent Tuning, Search Strategies

Documentation

Examples

Check out dreadnode/example-agents for real-world use cases.

License

See LICENSE for details.
