Skill Lab

Requires Python 3.10+. License: MIT.

A Python CLI tool for evaluating agent skills through static analysis and quality checks.

Features

  • SKILL.md Parsing: Parse YAML frontmatter and markdown body from skill definitions
  • 18 Static Checks: Comprehensive checks across 4 dimensions
    • Structure: File existence, folder organization, frontmatter validation
    • Naming: Format, directory matching
    • Description: Length, trigger information
    • Content: Examples, line budget, reference depth
  • Quality Scoring: Weighted 0-100 score based on check results
  • Multiple Output Formats: Console (rich formatting) and JSON
  • Trace Evaluation: Analyze execution traces against defined checks
  • Trigger Testing: Verify skill activation with different prompt types
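
The weighted score can be pictured with a minimal sketch. The severity weights and the pass-fraction formula below are assumptions chosen for illustration, and `CheckResult` is a hypothetical stand-in for whatever model Skill Lab uses; the tool's actual scoring may work differently.

```python
from dataclasses import dataclass

# Hypothetical severity weights; Skill Lab's real weights may differ.
SEVERITY_WEIGHTS = {"ERROR": 3.0, "WARNING": 2.0, "INFO": 1.0}


@dataclass
class CheckResult:
    """Hypothetical result record for one check."""
    check_id: str
    severity: str  # "ERROR", "WARNING", or "INFO"
    passed: bool


def quality_score(results: list[CheckResult]) -> float:
    """Return the weighted share of passed checks, scaled to 0-100."""
    total = sum(SEVERITY_WEIGHTS[r.severity] for r in results)
    if total == 0:
        return 0.0
    earned = sum(SEVERITY_WEIGHTS[r.severity] for r in results if r.passed)
    return round(100.0 * earned / total, 1)


results = [
    CheckResult("naming.required", "ERROR", True),
    CheckResult("content.line-budget", "WARNING", False),
    CheckResult("content.has-examples", "INFO", True),
]
print(quality_score(results))  # 66.7
```

Under these assumed weights, a failed ERROR check costs three times as much as a failed INFO check.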

Installation

# From PyPI
pip install skill-lab

# From source
pip install -e .

# With development dependencies
pip install -e ".[dev]"

Quick Start

# Evaluate a skill
sklab evaluate ./my-skill

# Quick validation (pass/fail)
sklab validate ./my-skill

# List available checks
sklab list-checks

Usage

Evaluate a Skill

# Console output (default)
sklab evaluate ./my-skill

# JSON output
sklab evaluate ./my-skill --format json

# Save to file
sklab evaluate ./my-skill --output report.json

# Verbose (show all checks, not just failures)
sklab evaluate ./my-skill --verbose

# Spec-only (skip quality suggestions)
sklab evaluate ./my-skill --spec-only

Quick Validation

# Returns exit code 0 if valid, 1 if invalid
sklab validate ./my-skill
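
Because validation reports its result through the exit code, it can gate a script or CI step. The sketch below uses only the standard library; `command_succeeds` and `skill_is_valid` are hypothetical helper names, and the demonstration runs the Python interpreter itself so that it works even where `sklab` is not installed.

```python
import subprocess
import sys


def command_succeeds(command: list[str]) -> bool:
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


def skill_is_valid(skill_path: str) -> bool:
    """Hypothetical helper: True when `sklab validate` exits with 0."""
    return command_succeeds(["sklab", "validate", skill_path])


# Demonstrate the exit-code convention without requiring sklab.
print(command_succeeds([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(command_succeeds([sys.executable, "-c", "raise SystemExit(1)"]))  # False
```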

List Available Checks

# List all checks
sklab list-checks

# Filter by dimension
sklab list-checks --dimension structure

# Show only spec-required checks
sklab list-checks --spec-only

Test Triggers

# Run trigger tests
sklab test-triggers ./my-skill

# Filter by trigger type
sklab test-triggers ./my-skill --type explicit

Evaluate Traces

# Evaluate an execution trace
sklab eval-trace ./my-skill --trace ./trace.jsonl
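
The `.jsonl` extension suggests a JSON Lines file with one JSON object per line. The trace schema is not described here, so the sketch below treats each record as an opaque dictionary; `read_jsonl` is a hypothetical helper and the `step` field in the sample is made up for illustration.

```python
import io
import json
from typing import Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one dict per non-blank line of a JSON Lines stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        yield record


# "step" is an invented field, used only to show the record shape.
sample = io.StringIO('{"step": 1}\n\n{"step": 2}\n')
print(list(read_jsonl(sample)))  # [{'step': 1}, {'step': 2}]
```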

Check Categories

Structure Checks

| Check ID | Severity | Description |
|---|---|---|
| structure.skill-md-exists | ERROR | SKILL.md file exists |
| structure.valid-frontmatter | ERROR | YAML frontmatter is parseable |
| frontmatter.compatibility-length | ERROR | Compatibility under 500 chars |
| frontmatter.metadata-format | ERROR | Metadata is string-to-string map |
| structure.scripts-valid | WARNING | /scripts contains valid files |
| structure.references-valid | WARNING | /references contains valid files |
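
To make the frontmatter checks concrete, here is a minimal sketch that separates the frontmatter from the body and tests the string-to-string rule. It assumes the frontmatter is delimited by `---` lines, and both function names are hypothetical. It does not parse YAML; a real check would hand the frontmatter text to a YAML parser such as PyYAML's `yaml.safe_load`.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body).

    Raises ValueError when the opening or closing '---' line is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' delimiter")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:]).strip()
            return frontmatter, body
    raise ValueError("missing closing '---' delimiter")


def is_string_map(value: object) -> bool:
    """True when value is a dict whose keys and values are all strings."""
    return isinstance(value, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in value.items()
    )


sample = "---\nname: my-skill\ndescription: Use when reviewing PDFs.\n---\n# My Skill\n\nBody text.\n"
frontmatter, body = split_frontmatter(sample)
print(frontmatter)
print(is_string_map({"author": "me", "version": "1.0"}))  # True
print(is_string_map({"version": 1.0}))  # False
```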

Naming Checks

| Check ID | Severity | Description |
|---|---|---|
| naming.required | ERROR | Name field is present |
| naming.format | ERROR | Lowercase, hyphens only, max 64 chars |
| naming.matches-directory | ERROR | Name matches parent directory |
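
The naming rules above can be approximated with a regular expression and a path comparison. This is a sketch under stated assumptions, not Skill Lab's implementation: it assumes digits are allowed alongside lowercase letters, and that a name may not start or end with a hyphen or contain two hyphens in a row. Both function names are hypothetical.

```python
import re
from pathlib import Path

# Assumptions: digits are allowed; no leading, trailing, or doubled hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
MAX_NAME_LENGTH = 64


def name_format_ok(name: str) -> bool:
    """Check length and character rules for a skill name."""
    return len(name) <= MAX_NAME_LENGTH and NAME_PATTERN.fullmatch(name) is not None


def name_matches_directory(name: str, skill_md_path: str) -> bool:
    """Check that the name equals the directory containing SKILL.md."""
    return Path(skill_md_path).parent.name == name


print(name_format_ok("my-skill"))   # True
print(name_format_ok("My_Skill"))   # False
print(name_matches_directory("my-skill", "skills/my-skill/SKILL.md"))  # True
```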

Description Checks

| Check ID | Severity | Description |
|---|---|---|
| description.required | ERROR | Description field is present |
| description.not-empty | ERROR | Description is not empty |
| description.max-length | ERROR | Max 1024 characters |
| description.includes-triggers | INFO | Describes when to use |

Content Checks

| Check ID | Severity | Description |
|---|---|---|
| content.body-not-empty | WARNING | Body has content (min 50 chars) |
| content.line-budget | WARNING | Under 500 lines |
| content.has-examples | INFO | Contains code examples |
| content.reference-depth | WARNING | References max 1 level deep |
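
Three of the content checks reduce to simple text measurements, sketched below. The thresholds come from the table, but the details are assumptions: the sketch counts lines of the body only, treats surrounding whitespace as not counting toward the minimum, and takes a fenced code block as evidence of a code example. All function names are hypothetical.

```python
# A Markdown code fence, built from single characters so this snippet
# does not contain a literal fence of its own.
FENCE = "`" * 3


def body_not_empty(body: str, min_chars: int = 50) -> bool:
    """True when the stripped body has at least min_chars characters."""
    return len(body.strip()) >= min_chars


def within_line_budget(body: str, max_lines: int = 500) -> bool:
    """True when the body has fewer than max_lines lines."""
    return len(body.splitlines()) < max_lines


def has_examples(body: str) -> bool:
    """Assumption: any fenced code block counts as a code example."""
    return FENCE in body


body = "Use this skill to review PDFs.\n" + FENCE + "python\nprint('hi')\n" + FENCE
print(body_not_empty(body), within_line_budget(body), has_examples(body))
```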

Output Format (JSON)

{
  "skill_path": "/path/to/skill",
  "skill_name": "my-skill",
  "timestamp": "2026-01-25T14:30:00Z",
  "duration_ms": 45.3,
  "quality_score": 87.5,
  "overall_pass": true,
  "checks_run": 18,
  "checks_passed": 16,
  "checks_failed": 2,
  "results": [...],
  "summary": {
    "by_severity": {...},
    "by_dimension": {...}
  }
}
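
A report in this shape is easy to consume from a script. The sketch below reads only top-level fields shown above and uses an inline sample instead of a real report file. It assumes every check that ran either passed or failed, which may not hold if the tool can skip checks. Both function names are hypothetical.

```python
import json


def summarize_report(report: dict) -> str:
    """Build a one-line summary from the top-level report fields."""
    status = "PASS" if report["overall_pass"] else "FAIL"
    return (
        f"{report['skill_name']}: {status}, "
        f"score {report['quality_score']}, "
        f"{report['checks_passed']}/{report['checks_run']} checks passed"
    )


def counts_are_consistent(report: dict) -> bool:
    """Assumption: passed + failed should equal the number of checks run."""
    return report["checks_passed"] + report["checks_failed"] == report["checks_run"]


raw = (
    '{"skill_name": "my-skill", "quality_score": 87.5, "overall_pass": true, '
    '"checks_run": 18, "checks_passed": 16, "checks_failed": 2}'
)
report = json.loads(raw)
print(summarize_report(report))  # my-skill: PASS, score 87.5, 16/18 checks passed
print(counts_are_consistent(report))  # True
```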

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=skill_lab

# Type checking
mypy src/

# Linting
ruff check src/

# Format code
ruff format src/

Project Structure

sklab/
├── src/skill_lab/
│   ├── cli.py                    # CLI interface (Typer)
│   ├── core/
│   │   ├── models.py             # Data models
│   │   ├── registry.py           # Check registration
│   │   └── scoring.py            # Quality scoring
│   ├── parsers/
│   │   └── skill_parser.py       # SKILL.md parsing
│   ├── checks/
│   │   ├── base.py               # Base check class
│   │   └── static/               # Static checks
│   ├── evaluators/
│   │   ├── static_evaluator.py   # Static analysis
│   │   └── trace_evaluator.py    # Trace evaluation
│   ├── triggers/
│   │   └── trigger_evaluator.py  # Trigger testing
│   └── reporters/
│       ├── json_reporter.py
│       └── console_reporter.py
├── tests/
├── docs/
└── examples/

License

MIT
