skillet

Evaluation-driven Claude Code skill development.

Three levels of documentation:

  1. This README — the concise overview, mirroring the structure below
  2. docs/ — full markdown documentation, shipped with the package
  3. skillet.run — the rendered docs site, 1:1 with docs/

Install

pip install pyskillet

Why

Anthropic recommends building evaluations before writing skills:

Create evaluations BEFORE writing extensive documentation. This ensures your Skill solves real problems rather than documenting imagined ones.

But they don't provide tooling:

We do not currently provide a built-in way to run these evaluations.

skillet fills that gap.

Quick Start

Capture failures with /skillet:add in Claude Code, then run the loop:

skillet eval my-skill                              # baseline
skillet create my-skill                            # generate skill from evals
skillet eval my-skill ~/.claude/skills/my-skill    # eval with skill
skillet tune my-skill ~/.claude/skills/my-skill    # iteratively improve
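If you want to script the loop, the four commands above can be built programmatically. The following is a minimal sketch that only prints the commands in order; the helper name quick_start_commands and the skills_root parameter are made up for illustration, and the argument layout is copied from the commands above rather than from the CLI reference.

```python
from pathlib import PurePosixPath


def quick_start_commands(name: str, skills_root: str = "~/.claude/skills") -> list[list[str]]:
    """Return the four Quick Start commands, in order, as argv lists.

    Hypothetical helper: it mirrors the commands shown in the README
    and does not call skillet itself.
    """
    skill_path = str(PurePosixPath(skills_root) / name)
    return [
        ["skillet", "eval", name],              # baseline
        ["skillet", "create", name],            # generate skill from evals
        ["skillet", "eval", name, skill_path],  # eval with skill
        ["skillet", "tune", name, skill_path],  # iteratively improve
    ]


# Print the commands so they can be reviewed before being run in a shell.
for argv in quick_start_commands("my-skill"):
    print(" ".join(argv))
```

The commands are printed rather than executed so the tilde in the skill path is left for the shell to expand.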

Overview

Skillet captures failures, runs systematic evaluations, and iterates on skills with quantitative feedback. docs/index.md

Getting Started

End-to-end walkthrough from your first capture to a tuned skill. docs/getting-started.md

Concepts

Skills vs Agents

Skillet evaluates skills (instructions that shape behavior), not agents (the underlying capability). docs/concepts/skills-vs-agents.md

Capability vs Regression

The same eval format serves two purposes: capability (pass@k, exploratory) during development, regression (pass^k, strict) in CI. docs/concepts/capability-vs-regression.md
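To make the two metrics concrete, here is a small sketch using the common definitions: pass@k counts an eval as passed if at least one of k trials passes, while pass^k requires all k trials to pass. This README does not spell out how skillet computes either number, so treat the function names and formulas below as illustrative assumptions, not the package's implementation.

```python
def pass_at_k(trials: list[bool]) -> bool:
    """Capability view: passed if ANY of the k trials passed."""
    if not trials:
        raise ValueError("need at least one trial")
    return any(trials)


def pass_hat_k(trials: list[bool]) -> bool:
    """Regression view: passed only if ALL k trials passed."""
    if not trials:
        raise ValueError("need at least one trial")
    return all(trials)


def expected_rates(p: float, k: int) -> tuple[float, float]:
    """Expected (pass@k, pass^k) for k independent trials with per-trial pass rate p."""
    return 1 - (1 - p) ** k, p ** k


trials = [True, False, True]       # three trials of the same eval
print(pass_at_k(trials))           # True: good enough while exploring
print(pass_hat_k(trials))          # False: too flaky for a CI gate
print(expected_rates(0.5, 2))      # (0.75, 0.25)
```

The same flaky eval looks healthy under pass@k and fails under pass^k, which is why the strict metric is the one suited to CI.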

Balanced Problem Sets

A good eval suite needs negative cases — prompts where the skill should not trigger — to catch overtriggering. docs/concepts/balanced-problem-sets.md
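As an illustration of checking balance, the sketch below counts positive and negative cases in a suite. The should_trigger key is hypothetical and exists only for this example; how skillet actually marks negative cases is described in docs/concepts/balanced-problem-sets.md, not here.

```python
from collections import Counter


def balance_report(evals: list[dict]) -> dict:
    """Summarize how many cases expect the skill to trigger versus stay silent.

    Each case is a dict with a hypothetical boolean key "should_trigger".
    """
    counts = Counter(
        "positive" if case["should_trigger"] else "negative" for case in evals
    )
    total = len(evals)
    return {
        "positive": counts["positive"],
        "negative": counts["negative"],
        "negative_share": counts["negative"] / total if total else 0.0,
    }


suite = [
    {"name": "formats-commit-message", "should_trigger": True},
    {"name": "writes-changelog-entry", "should_trigger": True},
    {"name": "summarizes-diff", "should_trigger": True},
    {"name": "unrelated-math-question", "should_trigger": False},
]
print(balance_report(suite))
```

A suite whose negative share is zero cannot detect overtriggering at all, so a check like this is a cheap first signal.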

Guides

Capture with /skillet:add

Use /skillet:add in Claude Code to record failures as YAML eval files. docs/guides/capture-with-slash-command.md

Linting

skillet lint <path> checks a SKILL.md against 14 rules covering naming, frontmatter, body length, and recommended fields. docs/guides/linting.md

Contributing

Development setup, testing strategy, code style, and PR conventions. docs/guides/contributing.md

Reference

CLI

skillet ships with seven commands: eval, create, tune, compare, show, lint, and generate-evals. docs/reference/cli.md

Eval Format

YAML schema for eval files: required name/prompt/expected, optional domain/setup/teardown. docs/reference/eval-format.md
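The field list above can be expressed as a small check over an already-parsed eval file. This is a sketch under stated assumptions: check_eval_fields is a made-up helper, the eval is represented as a plain dict instead of being loaded from YAML, and flagging unknown fields is a choice made for this example; the authoritative schema is in docs/reference/eval-format.md.

```python
REQUIRED_FIELDS = {"name", "prompt", "expected"}
OPTIONAL_FIELDS = {"domain", "setup", "teardown"}


def check_eval_fields(eval_case: dict) -> list[str]:
    """Return a list of problems found in one parsed eval case (empty if none)."""
    present = set(eval_case)
    problems = []
    for key in sorted(REQUIRED_FIELDS - present):
        problems.append(f"missing required field: {key}")
    # Assumption for this sketch: anything outside the documented fields is flagged.
    for key in sorted(present - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unknown field: {key}")
    return problems


example = {
    "name": "uses-conventional-commits",
    "prompt": "Write a commit message for this change.",
    "domain": "git",
}
print(check_eval_fields(example))  # ['missing required field: expected']
```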

Python API

Programmatic interface: evaluate(), tune(), create_skill(), generate_evals(), show(), lint_skill(). docs/reference/python-api.md
