skillet
Evaluation-driven Claude Code skill development.
Three levels of documentation:
- This README: the concise overview, mirroring the structure below
- docs/: full markdown documentation, shipped with the package
- skillet.run: the rendered docs site, 1:1 with docs/
Install
pip install pyskillet
Why
Anthropic recommends building evaluations before writing skills:
"Create evaluations BEFORE writing extensive documentation. This ensures your Skill solves real problems rather than documenting imagined ones."
But they don't provide tooling:
"We do not currently provide a built-in way to run these evaluations."
skillet fills that gap.
Quick Start
Capture failures with /skillet:add in Claude Code, then run the loop:
skillet eval my-skill # baseline
skillet create my-skill # generate skill from evals
skillet eval my-skill ~/.claude/skills/my-skill # eval with skill
skillet tune my-skill ~/.claude/skills/my-skill # iteratively improve
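If you want to drive the same loop from a script, one option is to build the four commands as argument lists and hand them to `subprocess`. The sketch below only constructs the commands shown above; the helper name `quick_start_commands` and the `skills_dir` parameter are hypothetical and not part of skillet. The path is expanded explicitly because `subprocess` does not expand `~` the way a shell does.

```python
import os


def quick_start_commands(skill_name, skills_dir="~/.claude/skills"):
    """Return the Quick Start loop as argv lists (hypothetical helper)."""
    # Expand "~" ourselves; no shell is involved when passing argv lists.
    skill_path = os.path.expanduser(f"{skills_dir}/{skill_name}")
    return [
        ["skillet", "eval", skill_name],               # baseline
        ["skillet", "create", skill_name],             # generate skill from evals
        ["skillet", "eval", skill_name, skill_path],   # eval with skill
        ["skillet", "tune", skill_name, skill_path],   # iteratively improve
    ]


if __name__ == "__main__":
    # Print the commands; to execute them, pass each list to
    # subprocess.run(cmd, check=True) with skillet installed.
    for cmd in quick_start_commands("my-skill"):
        print(" ".join(cmd))
```

Keeping the commands as lists rather than one shell string avoids quoting problems if a skill name ever contains spaces.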
Overview
Skillet captures failures, runs systematic evaluations, and iterates on skills with quantitative feedback. → docs/index.md
Getting Started
End-to-end walkthrough from your first capture to a tuned skill. → docs/getting-started.md
Concepts
Skills vs Agents
Skillet evaluates skills (instructions that shape behavior), not agents (the underlying capability). → docs/concepts/skills-vs-agents.md
Capability vs Regression
The same eval format serves two purposes: capability (pass@k, exploratory) during development, regression (pass^k, strict) in CI. → docs/concepts/capability-vs-regression.md
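To make the difference between the two metrics concrete, here is a minimal sketch using the commonly cited definitions: pass@k is the chance that at least one of k trials passes, and pass^k is the chance that all k pass. This README does not say how skillet computes either number, so the function names and the combinatorial estimators below are assumptions for illustration only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Chance that at least one of k trials passes, given c passes in n recorded trials."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # 1 minus the chance that all k drawn trials come from the n - c failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Chance that all k trials pass, given c passes in n recorded trials."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # Chance that all k drawn trials come from the c passes.
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    # 3 passes out of 5 recorded trials, evaluated at k = 2.
    print(f"pass@2 = {pass_at_k(5, 3, 2):.2f}")
    print(f"pass^2 = {pass_hat_k(5, 3, 2):.2f}")
```

The same set of trials looks encouraging under pass@k and poor under pass^k, which is why the exploratory metric suits development and the strict one suits CI.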
Balanced Problem Sets
A good eval suite needs negative cases — prompts where the skill should not trigger — to catch overtriggering. → docs/concepts/balanced-problem-sets.md
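As a rough illustration of why negative cases matter, the sketch below computes an overtrigger rate: the fraction of should-not-trigger prompts on which the skill fired anyway. The `TriggerResult` record, its field names, and the sample prompts are hypothetical; they are not skillet's eval format or output.

```python
from dataclasses import dataclass


@dataclass
class TriggerResult:
    prompt: str
    should_trigger: bool  # hypothetical label: is the skill relevant here?
    triggered: bool       # hypothetical observation: did the skill fire?


def overtrigger_rate(results):
    """Fraction of negative cases on which the skill fired anyway."""
    negatives = [r for r in results if not r.should_trigger]
    if not negatives:
        # Without negative cases, overtriggering is invisible.
        raise ValueError("no negative cases: overtriggering cannot be measured")
    return sum(r.triggered for r in negatives) / len(negatives)


SAMPLE_RESULTS = [
    TriggerResult("write a conventional commit message", True, True),
    TriggerResult("amend the last commit message", True, True),
    TriggerResult("explain what a monad is", False, False),
    TriggerResult("rename this variable", False, True),  # overtrigger
]

if __name__ == "__main__":
    print(overtrigger_rate(SAMPLE_RESULTS))
```

A suite made only of positive cases would raise here, which mirrors the point above: without negatives there is nothing to measure.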
Guides
Capture with /skillet:add
Use /skillet:add in Claude Code to record failures as YAML eval files. → docs/guides/capture-with-slash-command.md
Linting
skillet lint <path> checks a SKILL.md against 14 rules covering naming, frontmatter, body length, and recommended fields. → docs/guides/linting.md
Contributing
Development setup, testing strategy, code style, and PR conventions. → docs/guides/contributing.md
Reference
CLI
skillet ships with eval, create, tune, compare, show, lint, and generate-evals. → docs/reference/cli.md
Eval Format
YAML schema for eval files: required name/prompt/expected, optional domain/setup/teardown. → docs/reference/eval-format.md
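The field list above can be sketched as a small check over an already-parsed eval file. This is not skillet's validator: the function name, the sample values, and the choice to flag unknown fields are assumptions, and the authoritative schema is the one in docs/reference/eval-format.md.

```python
REQUIRED_FIELDS = ("name", "prompt", "expected")
OPTIONAL_FIELDS = ("domain", "setup", "teardown")


def check_eval_fields(eval_case: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields look fine."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in eval_case:
            problems.append(f"missing required field: {field}")
    # Assumption: fields outside the documented set are reported, not ignored.
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    for field in sorted(set(eval_case) - known):
        problems.append(f"unknown field: {field}")
    return problems


# Hypothetical eval case, as it might look after loading a YAML file.
SAMPLE_EVAL = {
    "name": "commit-message-format",
    "prompt": "Write a commit message for the staged changes.",
    "expected": "Uses the conventional commit format.",
    "domain": "git",
}

if __name__ == "__main__":
    print(check_eval_fields(SAMPLE_EVAL))
    print(check_eval_fields({"name": "incomplete"}))
```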
Python API
Programmatic interface: evaluate(), tune(), create_skill(), generate_evals(), show(), lint_skill(). → docs/reference/python-api.md