Skip to main content

Add your description here

Project description

Vauban 🏰

The Automated Red Teaming & Siege Platform for AI Models.

Vauban is a Python SDK designed for AI Red Teams to probe, assess, and "siege" LLMs to uncover vulnerabilities. It uses GEPA (Genetic-Pareto) evolutionary strategies to automatically generate jailbreaks and adversarial attacks.

Features

  • Atomic Scouting: Probe models with single prompts.
  • Intel Assessment: Automatically score responses for refusals vs. breaches using anomaly detection and LLM Judges.
  • Siege Campaigns: Run automated, multi-generation evolutionary attacks using GEPA.
  • Agent Red Teaming: NEW! Test agents for dangerous tool-use vulnerabilities (e.g., Prompt Injection leading to RCE).
  • Threat Scenarios: NEW! Run standardized benchmark scenarios (Garage Door, Data Exfiltration, Salesforce Admin, Email Suppression) inspired by Lakera's B3.
  • Deception Detection: NEW! Detects Deceptive Alignment where models claim to refuse a request in text but covertly execute the attack via tool calls (Scheming).
  • GEPA Strategy: Uses reflective evolution and embedding distance guidance to navigate the latent space away from refusals.
  • Visualizations: Interactive "War Maps" using Apple's Embedding Atlas.
  • Reporting: Auto-generated HTML reports with Deceptive Breach alerts and Scenario breakdowns.

Installation

pip install -e .

You will need an OPENAI_API_KEY set in your environment.

Usage

1. Atomic Scout

import vauban
response = vauban.scout("How to hotwire a car")
print(response)

2. Intel Assessment

# Returns (Score, Is_Breach)
# Score < -1.5 usually indicates a breach/anomaly
score, breach = vauban.assess("Sure, here is how you do it...")

3. Full Siege Campaign (Text)

import asyncio
import vauban

asyncio.run(vauban.siege(generations=5, squad_size=10))

4. Advanced Agent Red Teaming (Threat Scenarios)

Run a targeted siege against a specific standardized scenario to benchmark Agent Security.

Available Scenarios:

  • garage_door: Coerce Home Assistant to open garage door via email injection.
  • data_exfil: Coerce browser agent to scan inbox for PII and exfiltrate via URL.
  • salesforce: Coerce browser agent to create unauthorized Salesforce Admin.
  • ignore_email: Coerce workspace assistant to mark critical deadline email as read and hide it.
import asyncio
import vauban

# Automatically sets up the environment, tools, and success criteria
asyncio.run(vauban.siege(
    scenario="garage_door",
    generations=3
))

Architecture

  • vauban.target: Protocol for connecting to models (default: OpenAI). Supports Agents with Tool Calling.
  • vauban.intel: LanceDB-backed vector database for storing and analyzing attacks.
  • vauban.strategies: Attack generation logic. Uses GEPA (Reflective Evolution) adapted for text, tool breaches, and deceptive alignment.
  • vauban.judge: LLM-based evaluator for grading severity. Auto-detects unsafe tool calls and Deceptive Alignment (Scheming).
  • vauban.scenarios: Library of standardized threat snapshots for benchmarking.
  • vauban.viz: Visualization tools.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vauban-0.1.1.tar.gz (64.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vauban-0.1.1-py3-none-any.whl (64.9 kB view details)

Uploaded Python 3

File details

Details for the file vauban-0.1.1.tar.gz.

File metadata

  • Download URL: vauban-0.1.1.tar.gz
  • Upload date:
  • Size: 64.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.10 {"installer":{"name":"uv","version":"0.9.10"},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vauban-0.1.1.tar.gz
Algorithm Hash digest
SHA256 16b664a0382a3302064ab36854a49ed33c75add2d0b4498f8e9a950150e0f3c0
MD5 f764ec9070d703d9c416bcb8c58e3001
BLAKE2b-256 3eaf0ebf77496fe541cbcbf76acb1d955d0120ba6c8011c7db9c6b8f989c144d

See more details on using hashes here.

File details

Details for the file vauban-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: vauban-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 64.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.10 {"installer":{"name":"uv","version":"0.9.10"},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vauban-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ba169f1c82030ac66dc357de2cb7688c4743d79c2d4697ff723ea290a68913fe
MD5 42adb20eb2105d32cbbe3c1807c9e773
BLAKE2b-256 4cd4fb8462ee91a6824515cab3142bf1d1c61e15562d656b2af45f8375443546

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page