skill-warden

Security scanner for GitHub Copilot skills - detect prompt injection, jailbreaks, secret grabbing, and more

Maintainers

forefy

Project description

Security scanner for GitHub Copilot skills - detect prompt injection, jailbreaks, secret grabbing, token smuggling, and more.

Overview

skill-warden is a static security analyzer for GitHub Copilot skills. It scans skill repositories for malicious patterns before you install or run them - catching supply chain attacks, jailbreak attempts, secret exfiltration payloads, and AI slop.

Features

Prompt Injection Detection - Catches instructions attempting to override AI system context
Jailbreak Detection - Identifies content that tries to remove AI safety constraints
Secret Grabbing Detection - Flags references to SSH keys, cloud credentials, wallets
Token Smuggling Detection - Detects LLM control tokens hidden in skill content
External Fetch Coercion - Warns when skills push the AI to install or download packages
Obfuscation Detection - Spots zero-width chars, homoglyphs, base64 blobs, non-ASCII blocks
Quality Checks - Validates description, length, and reference structure
AI Slop Score - Heuristic signal for AI-generated skill content (0–100)
SARIF 2.1.0 Output - Native GitHub Security tab integration
Rich Terminal UI - Colorized output with Rich, falls back to plain text
GitHub Actions - Drop-in skill-warden-action for CI/CD pipelines
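
To make the obfuscation signals listed above more concrete, here is a minimal sketch of how zero-width characters, base64-looking blobs, and mixed-script text could be flagged. It is an illustration only and not skill-warden's implementation: the function name `find_obfuscation`, the character set, and the 40-character threshold are all assumptions.

```python
import re
import unicodedata

# Invisible characters often used to hide text (illustrative set, an assumption).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Long runs of base64-alphabet characters; the 40-character threshold is an assumption.
BASE64_BLOB = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def find_obfuscation(text: str) -> list[str]:
    """Return human-readable findings for one piece of skill text (hypothetical helper)."""
    findings = []
    for ch in sorted(set(text) & ZERO_WIDTH):
        findings.append(f"zero-width character U+{ord(ch):04X}")
    if BASE64_BLOB.search(text):
        findings.append("possible base64 blob")
    # Letters outside ASCII mixed with ASCII letters can indicate homoglyph tricks.
    has_non_ascii_letter = any(
        ord(c) > 127 and unicodedata.category(c).startswith("L") for c in text
    )
    has_ascii_letter = any(c.isascii() and c.isalpha() for c in text)
    if has_non_ascii_letter and has_ascii_letter:
        findings.append("mixed ASCII and non-ASCII letters (possible homoglyphs)")
    return findings
```

A heuristic like this will flag legitimate multilingual text as well, which is one reason such checks are better treated as warnings than as hard failures.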

Installation

pip install skill-warden

Or install from source:

git clone https://github.com/W3OSC/skill-warden
cd skill-warden
pip install -e ".[dev]"

Quick Start

Scan a GitHub repository

# Scan all skills in a repo
skill-warden scan owner/repo

# Scan a specific skill folder
skill-warden scan https://github.com/owner/repo/tree/main/skills/my-skill

# Scan with GitHub token (for private repos)
skill-warden scan owner/repo --github-token ghp_...

Scan a local skill

skill-warden scan ./my-skill/
skill-warden scan /path/to/skills/

Output formats

# Pretty terminal output (default)
skill-warden scan owner/repo --output pretty

# JSON output
skill-warden scan owner/repo --output json

# SARIF output (for GitHub Security tab)
skill-warden scan owner/repo --output sarif --output-file results.sarif

# Fail on advisory violations too
skill-warden scan owner/repo --fail-on-advisory
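
A SARIF file written with `--output sarif --output-file results.sarif` can be post-processed with a few lines of Python. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[]` with `ruleId` and `level`). That skill-warden puts its detector IDs in `ruleId` is an assumption, and `summarize_sarif` is a hypothetical helper rather than part of the package.

```python
import json
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count results per (ruleId, level) across all runs of a SARIF 2.1.0 log."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<unknown>")
            # A missing level is treated as "warning" here (a simplification).
            level = result.get("level", "warning")
            counts[(rule_id, level)] += 1
    return counts


def load_sarif(path: str) -> dict:
    """Read a SARIF file from disk, e.g. load_sarif("results.sarif")."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `summarize_sarif(load_sarif("results.sarif"))` returns a `Counter` keyed by `(ruleId, level)` that can be printed or compared against a baseline.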

Exit codes

Code	Meaning
`0`	All hard security checks passed
`1`	One or more hard security violations found
`2`	Advisory violations found (only with `--fail-on-advisory`)
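
When skill-warden is called from a script, the exit codes above can be mapped back to their documented meanings. This is a minimal sketch: `describe_exit` and `run_scan` are hypothetical helper names, and it assumes the `skill-warden` executable is on `PATH`.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "All hard security checks passed",
    1: "One or more hard security violations found",
    2: "Advisory violations found (only with --fail-on-advisory)",
}


def describe_exit(code: int) -> str:
    """Translate a skill-warden exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"Undocumented exit code {code}")


def run_scan(target: str, fail_on_advisory: bool = False) -> tuple[int, str]:
    """Run `skill-warden scan` on target and return (exit code, meaning)."""
    cmd = ["skill-warden", "scan", target]
    if fail_on_advisory:
        cmd.append("--fail-on-advisory")
    # check=False: a non-zero exit is an expected outcome, not an exception.
    completed = subprocess.run(cmd, check=False)
    return completed.returncode, describe_exit(completed.returncode)
```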

Detection Categories

ID	Name	Severity	Type	Description
`prompt-injection`	Prompt Injection	Critical	Hard fail	Instructions that override AI system context
`jailbreak`	Jailbreak Attempt	Critical	Hard fail	Content removing AI safety constraints
`token-smuggling`	Token Smuggling	High	Hard fail	LLM control tokens injected into skill content
`secret-grabbing`	Secret Grabbing	High	Advisory	References to credential files and env secrets
`external-fetch-coercion`	External Fetch Coercion	Medium	Advisory	Instructions to download/install external content
`obfuscation`	Content Obfuscation	Medium	Advisory	Hidden characters, homoglyphs, base64 blobs
`description-correctness`	Description Correctness	Info	Quality	Missing/invalid description in frontmatter
`skill-md-length`	SKILL.md Length	Info	Quality	SKILL.md exceeds 500 lines
`nested-references`	Nested References	Info	Quality	Referenced files contain further file references
`large-reference-without-toc`	Large Reference Without TOC	Info	Quality	Large referenced files missing table of contents
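
The table above can be read as a lookup from detector ID to severity and type. The sketch below transcribes it into a dictionary and shows how the hard-fail and advisory distinction could drive a pass/fail decision. `should_fail` is a hypothetical helper, and treating quality findings as never failing is an assumption based on the exit-code table, not confirmed behavior.

```python
# (severity, type) per detector ID, transcribed from the table above.
DETECTORS = {
    "prompt-injection": ("critical", "hard fail"),
    "jailbreak": ("critical", "hard fail"),
    "token-smuggling": ("high", "hard fail"),
    "secret-grabbing": ("high", "advisory"),
    "external-fetch-coercion": ("medium", "advisory"),
    "obfuscation": ("medium", "advisory"),
    "description-correctness": ("info", "quality"),
    "skill-md-length": ("info", "quality"),
    "nested-references": ("info", "quality"),
    "large-reference-without-toc": ("info", "quality"),
}


def should_fail(found_ids, fail_on_advisory: bool = False) -> bool:
    """Decide whether a set of triggered detector IDs should fail a build.

    Raises KeyError for IDs that are not in the table (e.g. custom detectors).
    """
    types = {DETECTORS[detector_id][1] for detector_id in found_ids}
    if "hard fail" in types:
        return True
    # Assumption: quality findings alone never fail the run.
    return fail_on_advisory and "advisory" in types
```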

YAML Template Format

Each detector is defined as a YAML template in skill_warden/templates/. Security and advisory detectors use a patterns field (a list of regexes); quality checks instead reference a Python function via a check field.

id: prompt-injection
version: "1.0.0"
name: Prompt Injection
severity: critical    # critical, high, medium, low, info
category: security    # security, advisory, quality
advisory: false       # false = hard fail, true = warning only
description: >
  Detects instructions that attempt to override the AI's prior context and system
  prompts, a key vector for malicious skill supply chain attacks.
impact: >
  A compromised skill could reprogram the AI's behavior, bypassing safety controls
  and user expectations.
action-items:
  - "Remove any instructions attempting to override or ignore prior system context."
  - "Review skill for social engineering patterns targeting the AI model."
references:
  - "https://github.com/W3OSC/web3-opsec-standard"
  - "https://owasp.org/www-project-top-10-for-large-language-model-applications/"
patterns:
  - '(?i)ignore\s+(all\s+)?(previous|prior)\s+(instructions?|prompts?|context|rules?)'
  - '(?i)your\s+new\s+(instructions?|system\s+prompt)\s+(is|are)'
  # ... more patterns

To add a custom detector, drop a new .yaml file into skill_warden/templates/ and skill-warden will pick it up automatically.
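
Before dropping a new template into skill_warden/templates/, it can help to try its regexes against sample text. The sketch below does this for the two patterns shown in the template above. It assumes the patterns are evaluated with Python's `re` module using search semantics (the README does not state how skill-warden applies them), and `matching_patterns` is a hypothetical helper.

```python
import re

# The two patterns shown in the template above, copied verbatim.
PATTERNS = [
    r"(?i)ignore\s+(all\s+)?(previous|prior)\s+(instructions?|prompts?|context|rules?)",
    r"(?i)your\s+new\s+(instructions?|system\s+prompt)\s+(is|are)",
]

# Compiling up front also surfaces syntax errors in a pattern early.
COMPILED = [re.compile(p) for p in PATTERNS]


def matching_patterns(text: str) -> list[int]:
    """Return the indexes of the patterns that match anywhere in text."""
    return [i for i, rx in enumerate(COMPILED) if rx.search(text)]
```

Under these assumptions, `matching_patterns("Please ignore all previous instructions")` returns `[0]`, while `"Ignore the previous instructions"` matches nothing because of the extra word. Checking near-miss phrasings like this is a quick way to find gaps in a pattern.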

GitHub Actions Integration

Add skill-warden to your CI pipeline to block unsafe skills before they reach users.

Basic usage

# .github/workflows/skill-scan.yml
name: Skill Security Scan

on:
  push:
    branches: [main]
  pull_request:

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: W3OSC/skill-warden-action@v1
        with:
          target: ${{ github.repository }}
          output-format: sarif
          sarif-file: skill-warden-results.sarif
          upload-sarif: 'true'
          github-token: ${{ secrets.GITHUB_TOKEN }}

With advisory enforcement

      - uses: W3OSC/skill-warden-action@v1
        with:
          target: ${{ github.repository }}
          fail-on-advisory: 'true'
          github-token: ${{ secrets.GITHUB_TOKEN }}

Inputs

Input	Description	Default
`target`	GitHub URL or local path to scan	required
`output-format`	`pretty`, `json`, or `sarif`	`sarif`
`sarif-file`	Path for SARIF output	`skill-warden-results.sarif`
`fail-on-advisory`	Fail if advisory violations found	`false`
`github-token`	Token for private repos	`${{ github.token }}`
`upload-sarif`	Upload SARIF to Security tab	`true`

Outputs

Output	Description
`hard-passed`	Whether all hard security checks passed
`has-advisories`	Whether advisory violations were found
`sarif-file`	Path to the SARIF output file

Advanced Usage

Run specific detectors only

skill-warden scan owner/repo --template prompt-injection --template jailbreak

Skip quality checks or AI scoring

skill-warden scan owner/repo --no-quality --no-ai-score

Write JSON output to file

skill-warden scan owner/repo --output json --output-file report.json

PyPI Release

# Install released version
pip install skill-warden

# Install specific version
pip install skill-warden==1.0.0

# Check installed version
skill-warden --version

Releases are published to PyPI automatically via GitHub Actions on each tagged release.

Contributing

skill-warden is an open-source initiative by W3OSC - Web3 Opsec Security Community.

We welcome:

New detector templates (add a .yaml to skill_warden/templates/)
Improved regex patterns for existing detectors
Additional quality checks
Bug reports and security disclosures

Development setup

git clone https://github.com/W3OSC/skill-warden
cd skill-warden
pip install -e ".[dev]"
pytest tests/ -v

Adding a detector

Create skill_warden/templates/my-detector.yaml following the template format
Add test cases in tests/test_my_detector.py
Open a pull request

Security

To report a vulnerability in skill-warden itself, please open a GitHub Security Advisory rather than a public issue.

Built by W3OSC - Web3 Opsec Security Community

Project details

Release history

1.0.1 - May 12, 2026
1.0.0 - May 12, 2026 (this version)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

skill_warden-1.0.0.tar.gz (33.5 kB)

Uploaded May 12, 2026 Source

Built Distribution

skill_warden-1.0.0-py3-none-any.whl (30.2 kB)

Uploaded May 12, 2026 Python 3

File details

Details for the file skill_warden-1.0.0.tar.gz.

File metadata

Download URL: skill_warden-1.0.0.tar.gz
Upload date: May 12, 2026
Size: 33.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for skill_warden-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`04c69763e99587de00ef7d5ea5ff8040d48f6607a0cfcfd648c0f67c7b9593d4`
MD5	`c0cf14203308ff452c8ddcd9d8639004`
BLAKE2b-256	`83cfb1e116b7f7c8b9c219f8fdf73285f424bf6bdcd014985ece36a65a50ce80`
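
A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch; `sha256_of` and `matches_expected` are hypothetical helper names, and the file path passed in is whatever location the archive was downloaded to.

```python
import hashlib

# SHA256 digest for skill_warden-1.0.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "04c69763e99587de00ef7d5ea5ff8040d48f6607a0cfcfd648c0f67c7b9593d4"


def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected value, ignoring hex case."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("skill_warden-1.0.0.tar.gz")` should return `True` for an intact download of the source distribution.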


Provenance

The following attestation bundles were made for skill_warden-1.0.0.tar.gz:

Publisher: pypi.yml on W3OSC/skill-warden

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: skill_warden-1.0.0.tar.gz
- Subject digest: 04c69763e99587de00ef7d5ea5ff8040d48f6607a0cfcfd648c0f67c7b9593d4
- Sigstore transparency entry: 1516580016
- Sigstore integration time: May 12, 2026
Source repository:
- Permalink: W3OSC/skill-warden@c64c5798cdbc479a2f129e60daca2bc1aa266e35
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/W3OSC
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@c64c5798cdbc479a2f129e60daca2bc1aa266e35
- Trigger Event: release

File details

Details for the file skill_warden-1.0.0-py3-none-any.whl.

File metadata

Download URL: skill_warden-1.0.0-py3-none-any.whl
Upload date: May 12, 2026
Size: 30.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for skill_warden-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c3f20ba35b41c54bd01d08d6082d2c62ca5737efa4aac9326feda5b940d22e58`
MD5	`c78c8c5dd6d6dec60db330d788273664`
BLAKE2b-256	`664511892096350d4494617ca6aea9a8288e914b545be7fa12c44d672a75e1a4`


Provenance

The following attestation bundles were made for skill_warden-1.0.0-py3-none-any.whl:

Publisher: pypi.yml on W3OSC/skill-warden

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: skill_warden-1.0.0-py3-none-any.whl
- Subject digest: c3f20ba35b41c54bd01d08d6082d2c62ca5737efa4aac9326feda5b940d22e58
- Sigstore transparency entry: 1516580403
- Sigstore integration time: May 12, 2026
Source repository:
- Permalink: W3OSC/skill-warden@c64c5798cdbc479a2f129e60daca2bc1aa266e35
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/W3OSC
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi.yml@c64c5798cdbc479a2f129e60daca2bc1aa266e35
- Trigger Event: release

skill-warden 1.0.0
