Skill Lab
A Python CLI tool for evaluating agent skills through static analysis, trigger testing, and trace analysis.
Features
- SKILL.md Parsing: Parse YAML frontmatter and markdown body from skill definitions
- 19 Static Checks: Comprehensive checks across 4 dimensions
  - Structure: File existence, folder organization, frontmatter validation, standard fields
  - Naming: Format, directory matching
  - Description: Required, non-empty, max length
  - Content: Examples, line budget, reference depth
- Trigger Testing: Test skill activation with 4 trigger types (explicit, implicit, contextual, negative)
- Trigger Generation: LLM-powered test case generation via Anthropic API
- Skill Inspector: sklab info shows metadata and token cost estimates
- Prompt Export: sklab prompt exports skills as XML, Markdown, or JSON for agent platforms
- Quality Scoring: Weighted 0-100 score based on check results
- Multiple Output Formats: Console (rich formatting) and JSON
Installation
# From PyPI
pip install skill-lab
# With LLM-based trigger generation (requires Anthropic API)
pip install skill-lab[generate]
# From source
pip install -e .
# With development dependencies
pip install -e ".[dev]"
Setup
API Key (required for sklab generate)
sklab generate uses the Anthropic API to generate trigger test cases. Set your API key:
export ANTHROPIC_API_KEY=sk-ant-...
Get your key at console.anthropic.com.
Model Configuration (optional)
The default model is claude-haiku-4-5-20251001. Override it per-command or globally:
# Per-command
sklab generate ./my-skill --model claude-sonnet-4-5-20250929
# Global default via environment variable
export SKLAB_MODEL=claude-sonnet-4-5-20250929
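The override order can be pictured with a small Python sketch. It assumes the --model flag wins over SKLAB_MODEL, which in turn wins over the built-in default; this README does not spell out that order, and resolve_model is a hypothetical helper, not part of sklab.

```python
import os

DEFAULT_MODEL = "claude-haiku-4-5-20251001"  # default stated in this README


def resolve_model(cli_model=None, env=None):
    """Pick a model name: CLI flag first, then SKLAB_MODEL, then the default.

    Hypothetical helper; the precedence order is an assumption.
    """
    env = os.environ if env is None else env
    return cli_model or env.get("SKLAB_MODEL") or DEFAULT_MODEL


# No flag, no environment variable: falls back to the default.
print(resolve_model(env={}))
# Environment variable set: used as the global default.
print(resolve_model(env={"SKLAB_MODEL": "claude-sonnet-4-5-20250929"}))
# Flag given: used for this command only.
print(resolve_model("claude-sonnet-4-5-20250929", env={}))
```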
Quick Start
# Evaluate a skill (path defaults to current directory)
sklab evaluate ./my-skill
sklab evaluate # Uses current directory
# Quick validation (pass/fail)
sklab validate ./my-skill
# Inspect skill metadata and token costs
sklab info ./my-skill
sklab info ./my-skill --json # Machine-readable output
# Export skills as a prompt for agent platforms
sklab prompt ./skill-a ./skill-b # XML (default)
sklab prompt -f markdown # Markdown format
sklab prompt -f json # JSON format
# Generate trigger test cases (requires ANTHROPIC_API_KEY)
sklab generate ./my-skill
# Run trigger tests
sklab trigger ./my-skill
# List available checks
sklab list-checks
Usage
Evaluate a Skill
# Console output (default)
sklab evaluate ./my-skill
# JSON output
sklab evaluate ./my-skill --format json
# Save to file
sklab evaluate ./my-skill --output report.json
# Verbose (show all checks, not just failures)
sklab evaluate ./my-skill --verbose
# Spec-only (skip quality suggestions)
sklab evaluate ./my-skill --spec-only
Quick Validation
# Returns exit code 0 if valid, 1 if invalid
sklab validate ./my-skill
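Because the result is reported through the exit code, a script can branch on it directly. The sketch below is one way to do that from Python; passes is a hypothetical helper, and the commands it runs are stand-ins so the example works without sklab installed.

```python
import subprocess
import sys


def passes(command):
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


# Stand-in commands that exit with 0 and 1 respectively.
# In real use the command would be ["sklab", "validate", "./my-skill"].
ok = passes([sys.executable, "-c", "raise SystemExit(0)"])
bad = passes([sys.executable, "-c", "raise SystemExit(1)"])
print(ok, bad)
```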
List Available Checks
# List all checks
sklab list-checks
# Filter by dimension
sklab list-checks --dimension structure
# Show only spec-required checks
sklab list-checks --spec-only
Inspect Skill Metadata
View skill metadata and token cost estimates:
# Rich-formatted panel (default)
sklab info ./my-skill
# JSON output (pipe-friendly)
sklab info ./my-skill --json
# Extract a single field
sklab info ./my-skill --field name
sklab info ./my-skill --field tokens
Token estimates show discovery cost (name + description, what agents see when choosing skills) and activation cost (full SKILL.md, loaded when the skill is invoked).
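The difference between the two costs can be illustrated with a rough Python sketch. It assumes about 4 characters per token, which is a common rule of thumb and not necessarily how sklab counts, and it parses only flat key: value frontmatter. split_frontmatter and estimate_tokens are hypothetical helpers, and the sample SKILL.md is made up.

```python
SKILL_MD = """---
name: my-skill
description: Does something useful when asked.
---
# My Skill

Step-by-step instructions live in the body.
"""


def split_frontmatter(text):
    """Split a SKILL.md string into (frontmatter dict, markdown body).

    Handles only flat `key: value` lines; real YAML needs a YAML parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":  # closing delimiter
            break
    else:
        return {}, text  # no closing delimiter: treat everything as body
    meta = {}
    for line in lines[1:i]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[i + 1:])


def estimate_tokens(text):
    """Very rough estimate: about 4 characters per token (assumption)."""
    return (len(text) + 3) // 4


meta, body = split_frontmatter(SKILL_MD)
# Discovery: only the name and description are considered.
discovery = estimate_tokens(meta["name"] + " " + meta["description"])
# Activation: the full SKILL.md is loaded.
activation = estimate_tokens(SKILL_MD)
print(f"discovery ~{discovery} tokens, activation ~{activation} tokens")
```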
Export as Prompt
Export one or more skills into a prompt format for agent platforms:
# XML format (default, recommended for Claude)
sklab prompt ./skill-a ./skill-b
# Markdown format
sklab prompt ./my-skill -f markdown
# JSON format
sklab prompt ./my-skill -f json
Output goes to stdout for easy piping. A token estimate summary is printed to stderr.
Generate Trigger Tests
Auto-generate trigger test cases from a SKILL.md using an LLM:
# Generate tests (writes to .skill-lab/tests/triggers.yaml)
sklab generate ./my-skill
# Use a specific model
sklab generate ./my-skill --model claude-sonnet-4-5-20250929
# Overwrite existing tests
sklab generate ./my-skill --force
Generates ~13 test cases across 4 trigger types:
- explicit (3): Direct $skill-name invocation
- implicit (3): Describes the need without naming the skill
- contextual (3): Realistic prompts with project context
- negative (4): Adjacent requests that should NOT trigger
Token usage and cost are displayed after each run.
Trigger Testing
Run the generated (or hand-written) trigger tests against a real LLM:
# Run trigger tests (path defaults to current directory)
sklab trigger ./my-skill
sklab trigger # Uses current directory
# Filter by trigger type
sklab trigger --type explicit
sklab trigger --type negative
Prerequisites: Trigger testing requires:
- Claude CLI: Install via npm install -g @anthropic-ai/claude-code
Test Definition (.skill-lab/tests/triggers.yaml):
skill: my-skill
test_cases:
  - id: explicit-1
    name: "Direct invocation to do something"
    type: explicit
    prompt: "$my-skill do something"
    expected: trigger
  - id: negative-1
    name: "Unrelated question (should not trigger)"
    type: negative
    prompt: "unrelated question"
    expected: no_trigger
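Hand-written test files are easy to get subtly wrong, so a quick sanity check before running them can help. The sketch below works on an already-parsed document (a plain dict mirroring the YAML above). check_test_cases is a hypothetical helper, and the rules it enforces are assumptions drawn from the example, not sklab's own validation: the four trigger types listed in this README and the two expected values shown above.

```python
TRIGGER_TYPES = {"explicit", "implicit", "contextual", "negative"}
EXPECTATIONS = {"trigger", "no_trigger"}  # the two values shown in the example


def check_test_cases(doc):
    """Return a list of problems found in a parsed triggers.yaml document."""
    problems = []
    if not doc.get("skill"):
        problems.append("missing 'skill'")
    seen = set()
    for case in doc.get("test_cases", []):
        cid = case.get("id", "<no id>")
        if cid in seen:
            problems.append(f"{cid}: duplicate id")
        seen.add(cid)
        if case.get("type") not in TRIGGER_TYPES:
            problems.append(f"{cid}: unknown type {case.get('type')!r}")
        if case.get("expected") not in EXPECTATIONS:
            problems.append(f"{cid}: unknown expected {case.get('expected')!r}")
        if case.get("type") == "negative" and case.get("expected") == "trigger":
            problems.append(f"{cid}: negative cases should expect no_trigger")
        if not case.get("prompt"):
            problems.append(f"{cid}: empty prompt")
    return problems


good = {
    "skill": "my-skill",
    "test_cases": [
        {"id": "explicit-1", "name": "Direct invocation to do something",
         "type": "explicit", "prompt": "$my-skill do something",
         "expected": "trigger"},
        {"id": "negative-1", "name": "Unrelated question (should not trigger)",
         "type": "negative", "prompt": "unrelated question",
         "expected": "no_trigger"},
    ],
}
bad = {
    "skill": "my-skill",
    "test_cases": [
        {"id": "negative-1", "name": "Mislabelled case", "type": "negative",
         "prompt": "unrelated question", "expected": "trigger"},
    ],
}

print(check_test_cases(good))
print(check_test_cases(bad))
```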
Output Format (JSON)
{
  "skill_path": "/path/to/skill",
  "skill_name": "my-skill",
  "timestamp": "2026-01-25T14:30:00Z",
  "duration_ms": 45.3,
  "quality_score": 87.5,
  "overall_pass": true,
  "checks_run": 19,
  "checks_passed": 17,
  "checks_failed": 2,
  "results": [...],
  "summary": {
    "by_severity": {...},
    "by_dimension": {...}
  }
}
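A report in this shape is straightforward to consume from a script, for example as a CI gate. The sketch below uses only the top-level fields shown above and leaves out results and summary, whose contents are elided here. gate is a hypothetical helper and the 80-point threshold is an arbitrary choice for illustration.

```python
import json

# Sample report using the top-level fields from the example above.
REPORT = json.loads("""
{
  "skill_path": "/path/to/skill",
  "skill_name": "my-skill",
  "quality_score": 87.5,
  "overall_pass": true,
  "checks_run": 19,
  "checks_passed": 17,
  "checks_failed": 2
}
""")


def gate(report, min_score=80.0):
    """Return (ok, message) for a CI gate; min_score is an arbitrary threshold."""
    ok = bool(report["overall_pass"]) and report["quality_score"] >= min_score
    message = (
        f"{report['skill_name']}: score {report['quality_score']}, "
        f"{report['checks_passed']}/{report['checks_run']} checks passed"
    )
    return ok, message


ok, message = gate(REPORT)
print(message)
print("gate passed" if ok else "gate failed")
```

In real use the report would come from a file written with sklab evaluate ./my-skill --output report.json and be loaded with json.load.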
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=skill_lab
# Type checking
mypy src/
# Linting
ruff check src/
# Format code
ruff format src/
License
MIT