
Skill Lab

PyPI version Python 3.10+ License: MIT

A Python CLI tool for evaluating agent skills through static analysis, trigger testing, and trace analysis.

Features

  • SKILL.md Parsing: Parse YAML frontmatter and markdown body from skill definitions
  • 19 Static Checks: Comprehensive checks across 4 dimensions
    • Structure: File existence, folder organization, frontmatter validation, standard fields
    • Naming: Format, directory matching
    • Description: Required, non-empty, max length
    • Content: Examples, line budget, reference depth
  • Trigger Testing: Test skill activation with 4 trigger types (explicit, implicit, contextual, negative)
  • Trigger Generation: LLM-powered test case generation via Anthropic API
  • Skill Inspector: sklab info shows metadata and token cost estimates
  • Prompt Export: sklab prompt exports skills as XML, Markdown, or JSON for agent platforms
  • Quality Scoring: Weighted 0-100 score based on check results
  • Multiple Output Formats: Console (rich formatting) and JSON
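
The SKILL.md format above splits into YAML frontmatter and a markdown body. A minimal sketch of that split, using only the `---` delimiters (the function is illustrative and not sklab's actual parser):

```python
def split_skill_md(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) from a SKILL.md string."""
    if text.startswith("---"):
        # Frontmatter is delimited by '---' lines at the top of the file.
        _, frontmatter, body = text.split("---", 2)
        return frontmatter.strip(), body.strip()
    return "", text.strip()

sample = """---
name: my-skill
description: Does something useful.
---
# My Skill

Instructions go here.
"""
frontmatter, body = split_skill_md(sample)
print(frontmatter)
```

A real parser would then feed the frontmatter through a YAML loader and validate the standard fields.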

Installation

# From PyPI
pip install skill-lab

# With LLM-based trigger generation (requires Anthropic API)
pip install skill-lab[generate]

# From source
pip install -e .

# With development dependencies
pip install -e ".[dev]"

Setup

API Key (required for sklab generate)

sklab generate uses the Anthropic API to generate trigger test cases. Set your API key:

export ANTHROPIC_API_KEY=sk-ant-...

Get your key at console.anthropic.com.

Model Configuration (optional)

The default model is claude-haiku-4-5-20251001. Override it per-command or globally:

# Per-command
sklab generate ./my-skill --model claude-sonnet-4-5-20250929

# Global default via environment variable
export SKLAB_MODEL=claude-sonnet-4-5-20250929

Quick Start

After installing, run sklab to scan your repo for skills and see an initial evaluation:

sklab                             # First run: scans repo + shows Getting Started guide
# Evaluate a skill (path defaults to current directory)
sklab evaluate ./my-skill
sklab evaluate                    # Uses current directory
sklab evaluate --all              # Evaluate every skill in the current directory
sklab evaluate --repo             # Evaluate every skill from the git repo root

# Quick validation (pass/fail)
sklab validate ./my-skill

# Inspect skill metadata and token costs
sklab info ./my-skill
sklab info ./my-skill --json      # Machine-readable output

# Export skills as a prompt for agent platforms
sklab prompt ./skill-a ./skill-b  # XML (default)
sklab prompt -f markdown          # Markdown format
sklab prompt -f json              # JSON format

# Generate trigger test cases (requires ANTHROPIC_API_KEY)
sklab generate ./my-skill

# Run trigger tests
sklab trigger ./my-skill

# List available checks
sklab list-checks

Usage

Evaluate a Skill

# Console output (default)
sklab evaluate ./my-skill

# JSON output
sklab evaluate ./my-skill --format json

# Save to file
sklab evaluate ./my-skill --output report.json

# Verbose (show all checks, not just failures)
sklab evaluate ./my-skill --verbose

# Spec-only (skip quality suggestions)
sklab evaluate ./my-skill --spec-only

# Bulk evaluation
sklab evaluate --all              # All skills under current directory
sklab evaluate --repo             # All skills from git repo root

Quick Validation

# Returns exit code 0 if valid, 1 if invalid
sklab validate ./my-skill

List Available Checks

# List all checks
sklab list-checks

# Filter by dimension
sklab list-checks --dimension structure

# Show only spec-required checks
sklab list-checks --spec-only

Inspect Skill Metadata

View skill metadata and token cost estimates:

# Rich-formatted panel (default)
sklab info ./my-skill

# JSON output (pipe-friendly)
sklab info ./my-skill --json

# Extract a single field
sklab info ./my-skill --field name
sklab info ./my-skill --field tokens

Token estimates show discovery cost (name + description, what agents see when choosing skills) and activation cost (full SKILL.md, loaded when the skill is invoked).
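
You can approximate those two numbers yourself with the common ~4-characters-per-token heuristic (a rough approximation, not sklab's tokenizer):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

name = "my-skill"
description = "Evaluates agent skills via static analysis."
skill_md = "---\nname: my-skill\n---\n" + "Full instructions go here.\n" * 40

discovery_cost = estimate_tokens(name + " " + description)  # what agents see when choosing
activation_cost = estimate_tokens(skill_md)                 # loaded when the skill is invoked
print(discovery_cost, activation_cost)
```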

Export as Prompt

Export one or more skills into a prompt format for agent platforms:

# XML format (default, recommended for Claude)
sklab prompt ./skill-a ./skill-b

# Markdown format
sklab prompt ./my-skill -f markdown

# JSON format
sklab prompt ./my-skill -f json

Output goes to stdout for easy piping. A token estimate summary is printed to stderr.
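
The XML export wraps each skill in its own element, roughly like this (the shape and tag names below are illustrative, not sklab's actual output format):

```python
from xml.sax.saxutils import escape, quoteattr

def skills_to_xml(skills: list[dict]) -> str:
    """Wrap skills in XML elements. Tag and attribute names are illustrative."""
    parts = ["<skills>"]
    for skill in skills:
        # quoteattr adds surrounding quotes and escapes special characters.
        parts.append(f"  <skill name={quoteattr(skill['name'])}>")
        parts.append(f"    {escape(skill['body'])}")
        parts.append("  </skill>")
    parts.append("</skills>")
    return "\n".join(parts)

doc = skills_to_xml([{"name": "skill-a", "body": "Do A."},
                     {"name": "skill-b", "body": "Do B."}])
print(doc)
```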

Generate Trigger Tests

Auto-generate trigger test cases from a SKILL.md using an LLM:

# Generate tests (writes to .skill-lab/tests/triggers.yaml)
sklab generate ./my-skill

# Use a specific model
sklab generate ./my-skill --model claude-sonnet-4-5-20250929

# Overwrite existing tests
sklab generate ./my-skill --force

Generates ~13 test cases across 4 trigger types:

  • explicit (3): Direct $skill-name invocation
  • implicit (3): Describes the need without naming the skill
  • contextual (3): Realistic prompts with project context
  • negative (4): Adjacent requests that should NOT trigger

Token usage and cost are displayed after each run.

Trigger Testing

Run the generated (or hand-written) trigger tests against a real LLM:

# Run trigger tests (path defaults to current directory)
sklab trigger ./my-skill
sklab trigger                     # Uses current directory

# Filter by trigger type
sklab trigger --type explicit
sklab trigger --type negative

Prerequisites:

  • Claude CLI: install via npm install -g @anthropic-ai/claude-code

Test Definition (.skill-lab/tests/triggers.yaml):

skill: my-skill
test_cases:
  - id: explicit-1
    name: "Direct invocation to do something"
    type: explicit
    prompt: "$my-skill do something"
    expected: trigger
  - id: negative-1
    name: "Unrelated question (should not trigger)"
    type: negative
    prompt: "unrelated question"
    expected: no_trigger
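
Once loaded (e.g. with PyYAML), each test case is a small dict. A sketch of the kind of validation you might run over them (field names are taken from the YAML above; the validation logic itself is illustrative):

```python
REQUIRED = {"id", "name", "type", "prompt", "expected"}
TYPES = {"explicit", "implicit", "contextual", "negative"}
EXPECTED = {"trigger", "no_trigger"}

def validate_case(case: dict) -> list[str]:
    """Return a list of problems with one test case (empty list = valid)."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - case.keys())]
    if case.get("type") not in TYPES:
        problems.append(f"unknown type: {case.get('type')!r}")
    if case.get("expected") not in EXPECTED:
        problems.append(f"unknown expected: {case.get('expected')!r}")
    return problems

ok = {"id": "explicit-1", "name": "Direct invocation", "type": "explicit",
      "prompt": "$my-skill do something", "expected": "trigger"}
bad = {"id": "x", "type": "sometimes", "expected": "maybe"}
print(validate_case(ok))   # []
print(validate_case(bad))
```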

Output Format (JSON)

{
  "skill_path": "/path/to/skill",
  "skill_name": "my-skill",
  "timestamp": "2026-01-25T14:30:00Z",
  "duration_ms": 45.3,
  "quality_score": 87.5,
  "overall_pass": true,
  "checks_run": 19,
  "checks_passed": 17,
  "checks_failed": 2,
  "results": [...],
  "summary": {
    "by_severity": {...},
    "by_dimension": {...}
  }
}
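
The JSON report is easy to post-process, for example to fail a CI job below a score threshold. A minimal sketch using only stdlib `json` (field names come from the schema above; the threshold is arbitrary):

```python
import json

report_json = """
{
  "skill_name": "my-skill",
  "quality_score": 87.5,
  "overall_pass": true,
  "checks_run": 19,
  "checks_passed": 17,
  "checks_failed": 2
}
"""

report = json.loads(report_json)
pass_rate = report["checks_passed"] / report["checks_run"]
meets_bar = report["overall_pass"] and report["quality_score"] >= 80  # threshold is arbitrary
print(f"{report['skill_name']}: {pass_rate:.0%} checks passed, meets bar: {meets_bar}")
```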

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=skill_lab

# Type checking
mypy src/

# Linting
ruff check src/

# Format code
ruff format src/

Telemetry

sklab collects anonymous usage data to help improve the tool. A notice is printed on the first interactive run. Skill content, file paths, and flag values are never collected.

What is collected: command names, flags used, duration, exit codes, OS, Python version, sklab version, skill names, skill versions, evaluation scores, token counts (input + output), session ID, CI environment, and error types (class name only — no messages or content).

To opt out, set an environment variable before running any sklab command:

export SKLAB_NO_ANALYTICS=1   # sklab-specific
export DO_NOT_TRACK=1          # standard cross-tool opt-out
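
Honoring both variables is just an environment check; a minimal sketch (not sklab's actual implementation):

```python
import os

def telemetry_disabled() -> bool:
    """True if either opt-out variable is set to a non-empty value."""
    return bool(os.environ.get("SKLAB_NO_ANALYTICS") or os.environ.get("DO_NOT_TRACK"))

os.environ.pop("SKLAB_NO_ANALYTICS", None)
os.environ.pop("DO_NOT_TRACK", None)
print(telemetry_disabled())   # False
os.environ["DO_NOT_TRACK"] = "1"
print(telemetry_disabled())   # True
```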

See docs/PRIVACY.md for the full privacy policy.

License

MIT
