Security scanner for GitHub Copilot skills - detect prompt injection, jailbreaks, secret grabbing, and more
Project description
Security scanner for GitHub Copilot skills - detect prompt injection, jailbreaks, secret grabbing, token smuggling, and more.
Overview
skill-warden is a static security analyzer for GitHub Copilot skills. It scans skill repositories for malicious patterns before you install or run them - catching supply chain attacks, jailbreak attempts, secret exfiltration payloads, and AI slop.
Features
- Prompt Injection Detection - Catches instructions attempting to override AI system context
- Jailbreak Detection - Identifies content that tries to remove AI safety constraints
- Secret Grabbing Detection - Flags references to SSH keys, cloud credentials, wallets
- Token Smuggling Detection - Detects LLM control tokens hidden in skill content
- External Fetch Coercion - Warns when skills push the AI to install or download packages
- Obfuscation Detection - Spots zero-width chars, homoglyphs, base64 blobs, non-ASCII blocks
- Quality Checks - Validates description, length, and reference structure
- AI Slop Score - Heuristic signal for AI-generated skill content (0–100)
- SARIF 2.1.0 Output - Native GitHub Security tab integration
- Rich Terminal UI - Colorized output with Rich, falls back to plain text
- GitHub Actions - Drop-in
skill-warden-actionfor CI/CD pipelines
Installation
pip install skill-warden
Or install from source:
git clone https://github.com/W3OSC/skill-warden
cd skill-warden
pip install -e ".[dev]"
Quick Start
Scan a GitHub repository
# Scan all skills in a repo
skill-warden scan owner/repo
# Scan a specific skill folder
skill-warden scan https://github.com/owner/repo/tree/main/skills/my-skill
# Scan with GitHub token (for private repos)
skill-warden scan owner/repo --github-token ghp_...
Scan a local skill
skill-warden scan ./my-skill/
skill-warden scan /path/to/skills/
Output formats
# Pretty terminal output (default)
skill-warden scan owner/repo --output pretty
# JSON output
skill-warden scan owner/repo --output json
# SARIF output (for GitHub Security tab)
skill-warden scan owner/repo --output sarif --output-file results.sarif
# Fail on advisory violations too
skill-warden scan owner/repo --fail-on-advisory
Exit codes
| Code | Meaning |
|---|---|
0 |
All hard security checks passed |
1 |
One or more hard security violations found |
2 |
Advisory violations found (only with --fail-on-advisory) |
Detection Categories
| ID | Name | Severity | Type | Description |
|---|---|---|---|---|
prompt-injection |
Prompt Injection | Critical | Hard fail | Instructions that override AI system context |
jailbreak |
Jailbreak Attempt | Critical | Hard fail | Content removing AI safety constraints |
token-smuggling |
Token Smuggling | High | Hard fail | LLM control tokens injected into skill content |
secret-grabbing |
Secret Grabbing | High | Advisory | References to credential files and env secrets |
external-fetch-coercion |
External Fetch Coercion | Medium | Advisory | Instructions to download/install external content |
obfuscation |
Content Obfuscation | Medium | Advisory | Hidden characters, homoglyphs, base64 blobs |
description-correctness |
Description Correctness | Info | Quality | Missing/invalid description in frontmatter |
skill-md-length |
SKILL.md Length | Info | Quality | SKILL.md exceeds 500 lines |
nested-references |
Nested References | Info | Quality | Referenced files contain further file references |
large-reference-without-toc |
Large Reference Without TOC | Info | Quality | Large referenced files missing table of contents |
YAML Template Format
Each detector is defined as a YAML template in skill_warden/templates/. Security and advisory detectors use patterns (regex lists); quality checks reference a Python function via check.
id: prompt-injection
version: "1.0.0"
name: Prompt Injection
severity: critical # critical, high, medium, low, info
category: security # security, advisory, quality
advisory: false # false = hard fail, true = warning only
description: >
Detects instructions that attempt to override the AI's prior context and system
prompts, a key vector for malicious skill supply chain attacks.
impact: >
A compromised skill could reprogram the AI's behavior, bypassing safety controls
and user expectations.
action-items:
- "Remove any instructions attempting to override or ignore prior system context."
- "Review skill for social engineering patterns targeting the AI model."
references:
- "https://github.com/W3OSC/web3-opsec-standard"
- "https://owasp.org/www-project-top-10-for-large-language-model-applications/"
patterns:
- '(?i)ignore\s+(all\s+)?(previous|prior)\s+(instructions?|prompts?|context|rules?)'
- '(?i)your\s+new\s+(instructions?|system\s+prompt)\s+(is|are)'
# ... more patterns
To add a custom detector, drop a new .yaml file into skill_warden/templates/ and skill-warden will pick it up automatically.
GitHub Actions Integration
Add skill-warden to your CI pipeline to block unsafe skills before they reach users.
Basic usage
# .github/workflows/skill-scan.yml
name: Skill Security Scan
on:
push:
branches: [main]
pull_request:
jobs:
scan:
runs-on: ubuntu-latest
permissions:
security-events: write
contents: read
steps:
- uses: actions/checkout@v4
- uses: W3OSC/skill-warden-action@v1
with:
target: ${{ github.repository }}
output-format: sarif
sarif-file: skill-warden-results.sarif
upload-sarif: 'true'
github-token: ${{ secrets.GITHUB_TOKEN }}
With advisory enforcement
- uses: W3OSC/skill-warden-action@v1
with:
target: ${{ github.repository }}
fail-on-advisory: 'true'
github-token: ${{ secrets.GITHUB_TOKEN }}
Inputs
| Input | Description | Default |
|---|---|---|
target |
GitHub URL or local path to scan | required |
output-format |
pretty, json, or sarif |
sarif |
sarif-file |
Path for SARIF output | skill-warden-results.sarif |
fail-on-advisory |
Fail if advisory violations found | false |
github-token |
Token for private repos | ${{ github.token }} |
upload-sarif |
Upload SARIF to Security tab | true |
Outputs
| Output | Description |
|---|---|
hard-passed |
Whether all hard security checks passed |
has-advisories |
Whether advisory violations were found |
sarif-file |
Path to the SARIF output file |
Advanced Usage
Run specific detectors only
skill-warden scan owner/repo --template prompt-injection --template jailbreak
Skip quality checks or AI scoring
skill-warden scan owner/repo --no-quality --no-ai-score
Write JSON output to file
skill-warden scan owner/repo --output json --output-file report.json
PyPI Release
# Install released version
pip install skill-warden
# Install specific version
pip install skill-warden==1.0.0
# Check installed version
skill-warden --version
Releases are published to PyPI automatically via GitHub Actions on each tagged release.
Contributing
skill-warden is an open-source initiative by W3OSC - Web3 Opsec Security Community.
We welcome:
- New detector templates (add a
.yamltoskill_warden/templates/) - Improved regex patterns for existing detectors
- Additional quality checks
- Bug reports and security disclosures
Development setup
git clone https://github.com/W3OSC/skill-warden
cd skill-warden
pip install -e ".[dev]"
pytest tests/ -v
Adding a detector
- Create
skill_warden/templates/my-detector.yamlfollowing the template format - Add test cases in
tests/test_my_detector.py - Open a pull request
Security
To report a vulnerability in skill-warden itself, please open a GitHub Security Advisory rather than a public issue.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file skill_warden-1.0.0.tar.gz.
File metadata
- Download URL: skill_warden-1.0.0.tar.gz
- Upload date:
- Size: 33.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
04c69763e99587de00ef7d5ea5ff8040d48f6607a0cfcfd648c0f67c7b9593d4
|
|
| MD5 |
c0cf14203308ff452c8ddcd9d8639004
|
|
| BLAKE2b-256 |
83cfb1e116b7f7c8b9c219f8fdf73285f424bf6bdcd014985ece36a65a50ce80
|
Provenance
The following attestation bundles were made for skill_warden-1.0.0.tar.gz:
Publisher:
pypi.yml on W3OSC/skill-warden
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
skill_warden-1.0.0.tar.gz -
Subject digest:
04c69763e99587de00ef7d5ea5ff8040d48f6607a0cfcfd648c0f67c7b9593d4 - Sigstore transparency entry: 1516580016
- Sigstore integration time:
-
Permalink:
W3OSC/skill-warden@c64c5798cdbc479a2f129e60daca2bc1aa266e35 -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/W3OSC
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi.yml@c64c5798cdbc479a2f129e60daca2bc1aa266e35 -
Trigger Event:
release
-
Statement type:
File details
Details for the file skill_warden-1.0.0-py3-none-any.whl.
File metadata
- Download URL: skill_warden-1.0.0-py3-none-any.whl
- Upload date:
- Size: 30.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c3f20ba35b41c54bd01d08d6082d2c62ca5737efa4aac9326feda5b940d22e58
|
|
| MD5 |
c78c8c5dd6d6dec60db330d788273664
|
|
| BLAKE2b-256 |
664511892096350d4494617ca6aea9a8288e914b545be7fa12c44d672a75e1a4
|
Provenance
The following attestation bundles were made for skill_warden-1.0.0-py3-none-any.whl:
Publisher:
pypi.yml on W3OSC/skill-warden
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
skill_warden-1.0.0-py3-none-any.whl -
Subject digest:
c3f20ba35b41c54bd01d08d6082d2c62ca5737efa4aac9326feda5b940d22e58 - Sigstore transparency entry: 1516580403
- Sigstore integration time:
-
Permalink:
W3OSC/skill-warden@c64c5798cdbc479a2f129e60daca2bc1aa266e35 -
Branch / Tag:
refs/tags/v1.0.0 - Owner: https://github.com/W3OSC
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
pypi.yml@c64c5798cdbc479a2f129e60daca2bc1aa266e35 -
Trigger Event:
release
-
Statement type: