
skillscan-fuzzer

CI · PyPI version · Python · License: MIT

M19 — LLM-powered adversarial SKILL.md variant generator

skill-fuzzer generates adversarial variants of AI skill files to probe the detection boundaries of skillscan. It uses an OpenAI-compatible LLM API (GPT-4.1-mini by default, or any local Ollama model) to apply semantic mutations to seed skill files, producing variants with unified diffs and optional scan results.

The fuzzer is the controlled-input complement to the public scan feed (M14): instead of scanning skills found in the wild, it generates skills designed to stress-test the scanner's static rules and ML model. The output feeds directly into skillscan-trace (M18) for behavioral verification.


Installation

# From PyPI
pip install skillscan-fuzzer

# Or as an extra from the main package
pip install skillscan-security[fuzzer]

# Development install from source
git clone https://github.com/kurtpayne/skillscan-fuzzer
cd skillscan-fuzzer
pip install -e ".[dev]"

Quick Start

# Fuzz 5 variants of each malicious seed using evasion strategy
skill-fuzzer --strategy evasion --variants 5 --seed-dir path/to/corpus/adversarial

# Fuzz a specific seed with injection strategy and run skillscan on each variant
skill-fuzzer --strategy injection \
             --seed-file path/to/corpus/benign/gh_example.md \
             --scan

# Use a local Ollama endpoint (no API key needed)
skill-fuzzer --strategy obfuscation \
             --model llama3.1:8b \
             --base-url http://localhost:11434/v1 \
             --api-key ollama \
             --seed-dir path/to/seeds

# Dry run to verify output structure without making LLM calls
skill-fuzzer --strategy evasion --dry-run --variants 2 --seed-dir path/to/seeds

Seeds are not bundled — use skillscan-corpus or provide your own.
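A seed directory is simply a folder of SKILL.md files. As an illustration only (this helper is hypothetical, not part of the package), discovering seeds might look like:

```python
from pathlib import Path

def discover_seeds(seed_dir: str) -> list[Path]:
    """Collect candidate seed files: every .md file anywhere under the directory."""
    return sorted(p for p in Path(seed_dir).rglob("*.md") if p.is_file())
```

Pointing this at a skillscan-corpus checkout (or any directory of your own skill files) yields the same set of seeds the --seed-dir flag would pick up.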


Mutation Strategies

Strategy       Seed Type    Goal
evasion        malicious    Preserve malicious intent while changing surface patterns to evade rule matching
injection      benign       Add one subtle attack vector to an otherwise clean skill
benign_drift   benign       Add security-adjacent vocabulary without actual malice (false-positive stress test)
obfuscation    malicious    Apply encoding/homoglyph/zero-width obfuscation to hide attack patterns
authority      any          Inject fake system headers, vendor impersonation, or role-escalation claims
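The Seed Type column can be checked before any LLM call is spent. A minimal sketch of that pre-check, assuming seeds carry a benign/malicious label (the mapping mirrors the table above; the function name is illustrative):

```python
# Which seed labels each strategy accepts, per the strategy table.
STRATEGY_SEED_TYPES = {
    "evasion": {"malicious"},
    "injection": {"benign"},
    "benign_drift": {"benign"},
    "obfuscation": {"malicious"},
    "authority": {"benign", "malicious"},  # "any" seed type
}

def seed_allowed(strategy: str, seed_label: str) -> bool:
    """Return True if a seed with this label is valid input for the strategy."""
    return seed_label in STRATEGY_SEED_TYPES[strategy]
```

Feeding a benign seed to the evasion strategy, for example, would be rejected up front rather than producing a meaningless variant.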

Output Structure

fuzz-output/
  evasion/
    ah01_goal_substitution_calendar/
      variant_001.md          ← complete mutated SKILL.md
      variant_001.diff        ← unified diff against original
      variant_001.scan.json   ← skillscan SARIF result (if --scan)
      variant_002.md
      variant_002.diff
      ...
    summary.json              ← evasion rate, detection rate, per-seed results
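The variant_NNN.diff files are standard unified diffs of the variant against its seed, which Python's difflib can reproduce (the file names passed here are illustrative):

```python
import difflib

def make_variant_diff(original: str, mutated: str,
                      seed_name: str, variant_name: str) -> str:
    """Render a unified diff of the mutated variant against the original seed."""
    return "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            mutated.splitlines(keepends=True),
            fromfile=seed_name,
            tofile=variant_name,
        )
    )
```

Because the format is standard, the .diff files can be piped straight into patch, git apply, or any diff viewer during review.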

The summary.json contains:

{
  "strategy": "evasion",
  "total_seeds": 10,
  "total_variants": 50,
  "errors": 0,
  "scanned": 50,
  "evasion_rate": 0.34,
  "false_positive_rate": null,
  "per_seed": [...]
}
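The evasion_rate field is the fraction of scanned variants that skillscan failed to flag. A minimal aggregation sketch (the per-variant verdict representation is assumed, not documented above):

```python
def evasion_rate(verdicts: list[bool]) -> float:
    """verdicts[i] is True if skillscan detected variant i.
    The evasion rate is the share of variants that went undetected."""
    if not verdicts:
        return 0.0
    undetected = sum(1 for detected in verdicts if not detected)
    return round(undetected / len(verdicts), 2)
```

With 50 scanned variants of which 17 evade detection, this yields the 0.34 shown in the example summary.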

CLI Reference

Usage: skill-fuzzer [OPTIONS]

Options:
  -s, --strategy [evasion|injection|benign_drift|obfuscation|authority]
                                  Mutation strategy  [default: evasion]
  -n, --variants INTEGER          Variants per seed  [default: 5]
  --seed-dir DIRECTORY            Directory of seed SKILL.md files
  --seed-file PATH                Specific seed file (repeatable)
  -o, --output-dir DIRECTORY      Output root  [default: fuzz-output]
  -m, --model TEXT                LLM model  [default: gpt-4.1-mini]
  --base-url TEXT                 OpenAI-compatible API base URL
  --api-key TEXT                  API key (reads OPENAI_API_KEY env var)
  --temperature FLOAT             Sampling temperature  [default: 0.9]
  --max-tokens INTEGER            Max response tokens  [default: 4096]
  --scan / --no-scan              Run skillscan on each variant  [default: no-scan]
  --dry-run                       Skip LLM calls; write placeholder variants
  --max-seeds INTEGER             Limit number of seeds processed
  -v, --verbose                   Enable debug logging
  -h, --help                      Show this message and exit

API Key Configuration

The fuzzer resolves the API key from the following sources, in priority order:

  1. --api-key CLI flag
  2. OPENAI_API_KEY environment variable
  3. ~/.skillscan-secrets (the shared credential store used by all skillscan tools)

For Ollama, pass --base-url http://localhost:11434/v1 --api-key ollama.


Corpus Integration

Generated variants are intended to be reviewed and, if they represent genuine evasion gaps or new attack patterns, committed to skillscan-corpus as labeled training examples. The recommended workflow:

  1. Run the fuzzer with --scan to identify variants that evade detection.
  2. Manually review evading variants for quality (does the attack intent survive?).
  3. Commit confirmed evasion variants to skillscan-corpus/adversarial/ with label malicious.
  4. Update the static rules or retrain the ML model to close the gap.
  5. Repeat.
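Step 1's triage can be automated by reading summary.json. A sketch, assuming each per_seed entry carries a list of variants with a detected flag (this structure is an assumption — the per_seed schema is elided above):

```python
import json
from pathlib import Path

def evading_variants(summary_path: str) -> list[str]:
    """List variant paths that evaded detection, as candidates for manual review.
    Assumed per_seed entry shape:
    {"seed": ..., "variants": [{"path": ..., "detected": bool}, ...]}.
    """
    summary = json.loads(Path(summary_path).read_text())
    return [
        v["path"]
        for seed in summary.get("per_seed", [])
        for v in seed.get("variants", [])
        if not v.get("detected", True)  # missing flag counts as detected (conservative)
    ]
```

Only the variants this returns need human review in step 2; everything else was already caught by the scanner.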

For behavioral verification (rather than static rule verification), pipe variants into skillscan-trace (M18) using the bundled pipeline script:

python scripts/fuzzer_tracer_pipeline.py \
  --seeds path/to/skillscan-corpus/adversarial/agent_hijacker \
  --dry-run

Running Tests

pip install -e ".[dev]"
pytest tests/ -v

All tests use --dry-run mode and do not require an API key or network access.


Download files

Source Distribution: skillscan_fuzzer-0.1.0.tar.gz (19.8 kB)
Built Distribution: skillscan_fuzzer-0.1.0-py3-none-any.whl (29.0 kB)

File details

skillscan_fuzzer-0.1.0.tar.gz (19.8 kB, source, uploaded via Trusted Publishing with twine/6.1.0 on CPython/3.13.7)

  SHA256       4fc6e0e71e3d516153f91e9d67bf929f32a73d9d609165131a2238bc8cd1870f
  MD5          62e1add6d141bb477c186f3ffbdf7735
  BLAKE2b-256  e35f1b515f84a253078028929e5b521567c3d67b5ce62cdd132e8ae7e8860bf4

skillscan_fuzzer-0.1.0-py3-none-any.whl (29.0 kB, Python 3)

  SHA256       3e3d90bf3039eace52b523041bf7601e96c3657dcac256397299afc8091d758d
  MD5          823d7228c641580a79fbb7a0f1e285bc
  BLAKE2b-256  bc55cabc8a3aeaad6f82b770171e70aca9bea1480ad0b24bcafde016941fedb6

Provenance: attestation bundles for both files were published via Trusted Publishing by release-pypi.yml on kurtpayne/skillscan-fuzzer.
