LLM-powered adversarial skill file fuzzer for skillscan-security


skillscan-fuzzer


M19 — LLM-powered adversarial SKILL.md variant generator

skillscan-fuzzer generates adversarial variants of AI skill files to probe the detection boundaries of skillscan. It uses an OpenAI-compatible LLM API (GPT-4.1-mini by default, or any local Ollama model) to apply semantic mutations to seed skill files, producing variants with unified diffs and optional scan results.

The fuzzer is the controlled-input complement to the public scan feed (M14): instead of scanning skills found in the wild, it generates skills designed to stress-test the scanner's static rules and ML model. The output feeds directly into skillscan-trace (M18) for behavioral verification.


Installation

# From PyPI
pip install skillscan-fuzzer

# Or as an extra from the main package
pip install skillscan-security[fuzzer]

# Development install from source
git clone https://github.com/kurtpayne/skillscan-fuzzer
cd skillscan-fuzzer
pip install -e ".[dev]"

Quick Start

# Fuzz 5 variants of each malicious seed using evasion strategy
skillscan-fuzzer --strategy evasion --variants 5 --seed-dir path/to/corpus/adversarial

# Fuzz a specific seed with injection strategy and run skillscan on each variant
skillscan-fuzzer --strategy injection \
             --seed-file path/to/corpus/benign/gh_example.md \
             --scan

# Use a local Ollama endpoint (no API key needed)
skillscan-fuzzer --strategy obfuscation \
             --model llama3.1:8b \
             --base-url http://localhost:11434/v1 \
             --api-key ollama \
             --seed-dir path/to/seeds

# Dry run to verify output structure without making LLM calls
skillscan-fuzzer --strategy evasion --dry-run --variants 2 --seed-dir path/to/seeds

Seeds are not bundled — use skillscan-corpus or provide your own.


Mutation Strategies

Strategy       Seed Type   Goal
-------------  ----------  --------------------------------------------------------------------
evasion        malicious   Preserve malicious intent while changing surface patterns to evade rule matching
injection      benign      Add one subtle attack vector to an otherwise clean skill
benign_drift   benign      Add security-adjacent vocabulary without actual malice (false-positive stress test)
obfuscation    malicious   Apply encoding/homoglyph/zero-width obfuscation to hide attack patterns
authority      any         Inject fake system headers, vendor impersonation, or role-escalation claims
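Each strategy corresponds to a different mutation instruction sent to the LLM. A minimal sketch of how such strategy-specific prompts could be built; the template wording, the goal strings, and the build_mutation_prompt name are illustrative assumptions, not the package's actual internals:

```python
# Sketch of strategy-specific prompt construction. The goal phrasing below
# paraphrases the strategy table; the real prompt templates may differ.
STRATEGY_GOALS = {
    "evasion": "Preserve the malicious intent but rephrase surface patterns "
               "so keyword and regex rules no longer match.",
    "injection": "Add exactly one subtle attack vector to this clean skill.",
    "benign_drift": "Add security-adjacent vocabulary without any actual malice.",
    "obfuscation": "Hide attack patterns using encoding, homoglyphs, or "
                   "zero-width characters.",
    "authority": "Inject fake system headers, vendor impersonation, or "
                 "role-escalation claims.",
}

def build_mutation_prompt(strategy: str, seed_text: str) -> str:
    """Return the user prompt for one mutation request."""
    if strategy not in STRATEGY_GOALS:
        raise ValueError(f"unknown strategy: {strategy}")
    return (
        f"Rewrite the SKILL.md below. Goal: {STRATEGY_GOALS[strategy]}\n"
        "Return only the complete rewritten file.\n\n"
        f"{seed_text}"
    )
```

The resulting prompt is sent once per requested variant, with the sampling temperature (default 0.9) providing diversity across variants of the same seed.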

Output Structure

fuzz-output/
  evasion/
    ah01_goal_substitution_calendar/
      variant_001.md          ← complete mutated SKILL.md
      variant_001.diff        ← unified diff against original
      variant_001.scan.json   ← skillscan SARIF result (if --scan)
      variant_002.md
      variant_002.diff
      ...
    summary.json              ← evasion rate, detection rate, per-seed results
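The variant_NNN.diff files are standard unified diffs of each variant against its seed. Generating one takes only the stdlib; a sketch (not the package's actual code):

```python
import difflib

def unified_diff(seed_text: str, variant_text: str,
                 seed_name: str = "SKILL.md",
                 variant_name: str = "variant_001.md") -> str:
    """Return a unified diff of a mutated variant against the original seed."""
    return "".join(difflib.unified_diff(
        seed_text.splitlines(keepends=True),
        variant_text.splitlines(keepends=True),
        fromfile=seed_name,
        tofile=variant_name,
    ))
```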

The summary.json contains:

{
  "strategy": "evasion",
  "total_seeds": 10,
  "total_variants": 50,
  "errors": 0,
  "scanned": 50,
  "evasion_rate": 0.34,
  "false_positive_rate": null,
  "per_seed": [...]
}
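The evasion_rate is the fraction of scanned variants that produced zero findings. A sketch of the aggregation arithmetic, using the field names from summary.json above (the per-variant detection flags are an assumed input shape, not the package's internal representation):

```python
def summarize(strategy: str, detections: list) -> dict:
    """Aggregate per-variant outcomes into summary.json-style fields.

    detections: one entry per generated variant -- True if skillscan
    flagged it, False if it evaded detection, None if it was not scanned.
    """
    scanned = [d for d in detections if d is not None]
    return {
        "strategy": strategy,
        "total_variants": len(detections),
        "scanned": len(scanned),
        "evasion_rate": (
            sum(1 for d in scanned if not d) / len(scanned) if scanned else None
        ),
    }
```

Note that unscanned variants (e.g. from a run without --scan) are excluded from the denominator, which is why evasion_rate can be null.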

CLI Reference

Usage: skillscan-fuzzer [OPTIONS]

Options:
  -s, --strategy [evasion|injection|benign_drift|obfuscation|authority]
                                  Mutation strategy  [default: evasion]
  -n, --variants INTEGER          Variants per seed  [default: 5]
  --seed-dir DIRECTORY            Directory of seed SKILL.md files
  --seed-file PATH                Specific seed file (repeatable)
  -o, --output-dir DIRECTORY      Output root  [default: fuzz-output]
  -m, --model TEXT                LLM model  [default: gpt-4.1-mini]
  --base-url TEXT                 OpenAI-compatible API base URL
  --api-key TEXT                  API key (reads OPENAI_API_KEY env var)
  --temperature FLOAT             Sampling temperature  [default: 0.9]
  --max-tokens INTEGER            Max response tokens  [default: 4096]
  --scan / --no-scan              Run skillscan on each variant  [default: no-scan]
  --dry-run                       Skip LLM calls; write placeholder variants
  --max-seeds INTEGER             Limit number of seeds processed
  -v, --verbose                   Enable debug logging
  -h, --help                      Show this message and exit

API Key Configuration

The fuzzer resolves its API key from the following sources, in priority order:

  1. --api-key CLI flag
  2. OPENAI_API_KEY environment variable
  3. ~/.skillscan-secrets (the shared credential store used by all skillscan tools)

For Ollama, pass --base-url http://localhost:11434/v1 --api-key ollama.
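That resolution order can be sketched as follows; the KEY=value format assumed for ~/.skillscan-secrets is an illustration, and the real store format may differ:

```python
import os
from pathlib import Path

def resolve_api_key(cli_key=None,
                    secrets_path=Path.home() / ".skillscan-secrets"):
    """Resolve the API key: CLI flag, then env var, then shared store."""
    if cli_key:
        return cli_key
    env_key = os.environ.get("OPENAI_API_KEY")
    if env_key:
        return env_key
    if secrets_path.is_file():
        # Assumed format: KEY=value lines, one credential per line.
        for line in secrets_path.read_text().splitlines():
            if line.startswith("OPENAI_API_KEY="):
                return line.split("=", 1)[1].strip()
    return None
```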


Corpus Integration

Generated variants are intended to be reviewed and, if they represent genuine evasion gaps or new attack patterns, committed to skillscan-corpus as labeled training examples. The recommended workflow:

  1. Run the fuzzer with --scan to identify variants that evade detection.
  2. Manually review evading variants for quality (does the attack intent survive?).
  3. Commit confirmed evasion variants to skillscan-corpus/adversarial/ with label malicious.
  4. Update the static rules or retrain the ML model to close the gap.
  5. Repeat.
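Step 1's "evaded detection" check amounts to reading each variant_NNN.scan.json and testing for an empty SARIF results list. A sketch, assuming the standard SARIF runs/results layout (the package may record outcomes differently):

```python
import json
from pathlib import Path

def evading_variants(strategy_dir: Path) -> list:
    """Return variant .md paths whose SARIF scan produced zero findings."""
    evaders = []
    for scan_file in sorted(strategy_dir.rglob("variant_*.scan.json")):
        sarif = json.loads(scan_file.read_text())
        results = [r for run in sarif.get("runs", [])
                   for r in run.get("results", [])]
        if not results:
            evaders.append(
                scan_file.with_name(scan_file.name.replace(".scan.json", ".md"))
            )
    return evaders
```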

For behavioral verification (rather than static rule verification), pipe variants into skillscan-trace (M18) using the bundled pipeline script:

python scripts/fuzzer_tracer_pipeline.py \
  --seeds path/to/skillscan-corpus/adversarial/agent_hijacker \
  --dry-run

Running Tests

pip install -e ".[dev]"
pytest tests/ -v

All tests use --dry-run mode and do not require an API key or network access.
