LLM red-teaming and adversarial testing framework

vauban

An MLX-native toolkit for understanding and reshaping how language models behave on Apple Silicon.

Named after Sébastien Le Prestre de Vauban, the military engineer who mastered both siegecraft and fortification. Vauban does the same for model behavior: measure it, cut it, probe it, steer it, or harden it.

What it does

Vauban is a TOML-first CLI for workflows built around activation-space geometry:

measure behavioral directions from model activations
cut or sparsify those directions in weights
probe and steer models at runtime
map refusal surfaces before and after intervention
run defense, sanitization, optimization, and attack loops

The primary interface is not a pile of subcommands. It is:

vauban <config.toml>

All pipeline behavior lives in the TOML file.

Requirements

Apple Silicon Mac
Python 3.12+
uv

Install

Install the released CLI:

uv tool install vauban

If your shell cannot find vauban, update your shell config once:

uv tool update-shell

Then open a new shell and check the command:

vauban --help
vauban man quickstart
vauban tree --help

For development from this repo:

uv tool install --editable .

Quick Start

Start with the built-in manual:

vauban man
vauban man quickstart
vauban man commands

Scaffold a starter config:

vauban init --mode default --output run.toml

The verified scaffolded modes are:

cast, circuit, default, depth, detect, features, optimize, probe, sic, softprompt, steer, surface

Validate before a real run:

vauban --validate run.toml

Then run the pipeline:

vauban run.toml

By default, output goes to output/ relative to the TOML file.

Minimal TOML

This is the minimal config the code accepts for the default pipeline:

[model]
path = "mlx-community/Llama-3.2-3B-Instruct-4bit"

[data]
harmful = "default"
harmless = "default"

[model].path is required.

[data].harmful and [data].harmless are required for most runs. "default" uses Vauban's bundled prompt sets.

You can also choose the output directory explicitly:

[output]
dir = "runs/baseline"

Experiment Tech Tree

Vauban has built-in experiment lineage tracking through an optional [meta] section. This metadata does not change pipeline execution. It exists so you can organize runs as a tech tree.

Minimal example:

[meta]
id = "cut-alpha-1"
title = "Baseline cut, alpha 1.0"
status = "baseline"
parents = ["measure-v1"]
tags = ["cut", "baseline"]
date = 2026-03-02
notes = "First stable reference run."

Verified status values are:

archived, baseline, dead_end, promising, superseded, wip

If [meta].id is omitted, Vauban uses the TOML filename stem.

Render the tree from a directory of TOML configs:

vauban tree experiments/
vauban tree experiments/ --format mermaid
vauban tree experiments/ --status promising
vauban tree experiments/ --tag gcg
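
The lineage idea behind these commands can be sketched in a few lines: each config names its parents, and inverting that mapping gives a tree to print. The code below is an illustration under stated assumptions, not Vauban's implementation or its output format. The function names and the id cut-alpha-2 are hypothetical, the graph is assumed to be acyclic, and only ids with no parents are treated as roots.

```python
from collections import defaultdict


def build_children(parents_by_id: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert an id -> parents mapping into a parent -> children mapping."""
    children: dict[str, list[str]] = defaultdict(list)
    for node_id, parents in parents_by_id.items():
        for parent in parents:
            children[parent].append(node_id)
    return {parent: sorted(kids) for parent, kids in children.items()}


def render_outline(parents_by_id: dict[str, list[str]]) -> list[str]:
    """Render the lineage as indented lines, starting from parentless ids."""
    children = build_children(parents_by_id)
    roots = sorted(node for node, parents in parents_by_id.items() if not parents)
    lines: list[str] = []

    def walk(node: str, depth: int) -> None:
        lines.append("  " * depth + node)
        for child in children.get(node, []):
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines


# Hypothetical experiment ids mapped to their [meta].parents lists.
EXAMPLE = {
    "measure-v1": [],
    "cut-alpha-1": ["measure-v1"],
    "cut-alpha-2": ["measure-v1"],
}

print("\n".join(render_outline(EXAMPLE)))
```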

Each run also appends an entry to experiment_log.jsonl inside the configured output directory, recording the resolved config path, pipeline mode, report files, metrics, and selected [meta] fields.

How TOML Drives Vauban

vauban <config.toml> loads one TOML file and decides what to do from the sections you include.

The default path is:

measure
cut
export

You extend that run by adding more sections. Common examples:

[eval] adds post-cut evaluation reports.
[surface] adds before/after refusal-surface mapping.
[detect] adds hardening detection during measurement.
Some sections switch Vauban into dedicated, mode-specific runs instead of the default pipeline.

Mode precedence is not trivial and changes with the code, so rely on the generated manual instead of memorizing it:

vauban man modes

For field-level reference, use:

vauban man model
vauban man data
vauban man measure
vauban man cut
vauban man eval
vauban man surface
vauban man output

Commands You Will Actually Use

Inspect the manual:

vauban man
vauban man quickstart
vauban man commands
vauban man formats
vauban man output
vauban tree --help

Scaffold configs:

vauban init --help
vauban init --mode default --output run.toml
vauban init --mode probe --output probe.toml

Validate config and prompt files without loading model weights:

vauban --validate run.toml

Export the current JSON Schema for editor tooling:

vauban schema
vauban schema --output vauban.schema.json

Compare two run directories:

vauban diff run_a run_b
vauban diff --format markdown run_a run_b
vauban diff --threshold 0.05 run_a run_b

vauban diff --threshold ... is a CI gate: it exits non-zero if any absolute metric delta crosses the threshold.
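
The gate logic described here can be sketched as follows. This is an illustration of the idea, not Vauban's code: the function names and the metric name refusal_rate are hypothetical, "crosses" is assumed to mean strictly greater than the threshold, and only metrics present in both runs are compared.

```python
def metric_deltas(run_a: dict[str, float], run_b: dict[str, float]) -> dict[str, float]:
    """Absolute delta for every metric present in both runs."""
    shared = run_a.keys() & run_b.keys()
    return {name: abs(run_b[name] - run_a[name]) for name in shared}


def gate_exit_code(run_a: dict[str, float], run_b: dict[str, float], threshold: float) -> int:
    """Return 1 if any absolute delta exceeds the threshold, otherwise 0."""
    deltas = metric_deltas(run_a, run_b)
    # Assumption: a delta exactly equal to the threshold still passes.
    return 1 if any(delta > threshold for delta in deltas.values()) else 0


# Hypothetical metric values from two runs.
baseline = {"refusal_rate": 0.40}
candidate = {"refusal_rate": 0.50}

# The delta is about 0.10, above the 0.05 threshold, so the gate fails.
print(gate_exit_code(baseline, candidate, threshold=0.05))
```

In CI, a non-zero exit code from the real command fails the job, so a threshold acts as a regression budget for each metric.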

Render the experiment lineage tree:

vauban tree experiments/
vauban tree experiments/ --format mermaid
vauban tree experiments/ --status promising

Data Formats

Verified by the generated manual:

prompt JSONL for [data] and [eval]: one JSON object per line with a prompt key
surface JSONL for [surface].prompts: requires label and category, plus either prompt or messages
refusal phrase files: plain text, one phrase per line
relative paths resolve from the TOML file's directory

Minimal prompt dataset example:

{"prompt":"What is the capital of France?"}
{"prompt":"Write a haiku about rain."}
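
A small pre-flight check for these formats might look like the sketch below. It encodes only the rules listed above and is not Vauban's validator: the function names are hypothetical, and "either prompt or messages" is read as "at least one of the two". Use vauban --validate for the real check.

```python
import json
from typing import Callable


def check_prompt_record(record: dict) -> list[str]:
    """Prompt JSONL: each object needs a prompt key."""
    return [] if "prompt" in record else ["missing 'prompt'"]


def check_surface_record(record: dict) -> list[str]:
    """Surface JSONL: needs label and category, plus prompt or messages."""
    problems = [f"missing '{key}'" for key in ("label", "category") if key not in record]
    # Assumption: at least one of prompt or messages must be present.
    if "prompt" not in record and "messages" not in record:
        problems.append("needs 'prompt' or 'messages'")
    return problems


def check_jsonl(text: str, check: Callable[[dict], list[str]]) -> list[tuple[int, str]]:
    """Return (line_number, problem) pairs for a JSONL document."""
    findings: list[tuple[int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            findings.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            findings.append((number, "not a JSON object"))
            continue
        findings.extend((number, problem) for problem in check(record))
    return findings


PROMPTS = '{"prompt":"What is the capital of France?"}\n{"prompt":"Write a haiku about rain."}\n'

# An empty list means every line passed the prompt check.
print(check_jsonl(PROMPTS, check_prompt_record))
```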

Notes On Verification

This README is aligned to the code in this repo:

package name: vauban
console script: vauban = vauban.__main__:main
verified commands: vauban <config.toml>, --validate, schema, init, diff, tree, man
verified manual topics and scaffolded modes were checked against the live CLI help and generated manual

Earlier versions of this README contained stale mode and output claims; this version removes them and points readers to vauban man ... for the parts generated directly from code.

Documentation

Full docs: vauban.readthedocs.io

Resource	Description
Spinning Up in Abliteration	Seven-part progressive curriculum
Getting Started	Guided walkthrough
Configuration Reference	TOML field reference
Surface Mapping	Surface mapping reference
`examples/config.toml`	Annotated example config

License

Apache-2.0

This version

0.3.1

Mar 2, 2026

