skill-evolution

Evolve AI agent skills through iterative meta-skill-driven optimization.

Inspired by SkillEvolver and EmbodiSkill, skill-evolution is a framework-agnostic CLI tool that automatically improves AI agent skill documents through a five-stage evolution loop: explore strategies, execute tasks, compare trajectories, patch the skill, and audit the result.

How It Works

          ┌──────────────┐
          │ Initial Skill│
          └──────┬───────┘
                 │
    ┌────────────▼─────────────┐
    │  1. Strategy Explorer    │  Generate K diverse approaches
    └────────────┬─────────────┘
                 │
    ┌────────────▼─────────────┐
    │  2. Task Executor        │  Run each strategy independently
    └────────────┬─────────────┘
                 │
    ┌────────────▼─────────────┐
    │  3. Trajectory Comparator│  Compare success vs failure → delta signals
    └────────────┬─────────────┘
                 │
    ┌────────────▼─────────────┐
    │  4. Skill Patcher        │  Apply targeted patches (not rewrites)
    └────────────┬─────────────┘
                 │
    ┌────────────▼─────────────┐
    │  5. Independent Auditor  │  Check for overfitting, hardcoding, etc.
    └────────────┬─────────────┘
                 │
          ┌──────▼────────┐
          │ Evolved Skill │──── repeat for R rounds
          └───────────────┘
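
The control flow of the loop above can be sketched in Python. This is a minimal illustration, not the package's actual implementation: the names `evolve`, `explore`, `execute`, `compare`, `patch`, and `audit` are hypothetical, and the LLM-backed stages are passed in as plain callables so the loop can be read and run on its own. Discarding a candidate that fails the audit is also an assumption of this sketch.

```python
from typing import Callable, List, Tuple

# Hypothetical record of one run: (strategy description, succeeded?).
Trajectory = Tuple[str, bool]


def evolve(
    skill: str,
    tasks: List[str],
    explore: Callable[[str, str, int], List[str]],
    execute: Callable[[str, str, str], bool],
    compare: Callable[[List[Trajectory]], List[str]],
    patch: Callable[[str, List[str]], str],
    audit: Callable[[str], bool],
    num_strategies: int = 4,
    num_rounds: int = 2,
) -> str:
    """Run R rounds of explore -> execute -> compare -> patch -> audit."""
    for _ in range(num_rounds):
        trajectories: List[Trajectory] = []
        for task in tasks:
            # 1. Generate K diverse strategies for this task.
            for strategy in explore(skill, task, num_strategies):
                # 2. Run each strategy independently and record the outcome.
                trajectories.append((strategy, execute(skill, task, strategy)))
        # 3. Contrast successes with failures to obtain delta signals.
        signals = compare(trajectories)
        if not signals:
            continue  # no contrast this round, so nothing to patch
        # 4. Apply targeted patches to the current skill.
        candidate = patch(skill, signals)
        # 5. Keep the candidate only if the audit accepts it (assumption).
        if audit(candidate):
            skill = candidate
    return skill
```

Passing the stages in as callables keeps the sketch independent of any particular LLM provider; deterministic stubs can stand in for each stage when exercising the loop.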

Key design principles:

  • Contrastive updates: improvement signals come from comparing successful vs failed trajectories, not from self-reflection
  • Targeted patching: modify only what the signals indicate and preserve everything else
  • Skill-aware attribution: distinguish skill defects (fix the body) from execution lapses (reinforce in appendix)
  • Independent audit: a separate LLM instance reviews evolved skills for overfitting
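
To make the first and third principles concrete, here is a small sketch of contrastive signal extraction and skill-aware attribution. It rests on simplifying assumptions that are not taken from the package: a trajectory is reduced to the set of rules the agent followed, a delta signal is any rule followed in every successful run and in no failed run, and attribution is a plain substring check against the skill body. The names `Trajectory`, `delta_signals`, and `attribute` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record of one run: rules followed and the outcome."""
    rules_followed: FrozenSet[str]
    succeeded: bool


def delta_signals(trajectories: List[Trajectory]) -> List[str]:
    """Rules followed in every successful run but in no failed run."""
    wins = [t.rules_followed for t in trajectories if t.succeeded]
    losses = [t.rules_followed for t in trajectories if not t.succeeded]
    if not wins or not losses:
        return []  # no contrast available, so no signal
    in_all_wins = frozenset.intersection(*wins)
    in_any_loss = frozenset.union(*losses)
    return sorted(in_all_wins - in_any_loss)


def attribute(signal: str, skill_body: str) -> Tuple[str, str]:
    """Route a signal to the body (skill defect) or appendix (execution lapse)."""
    if signal in skill_body:
        # The skill already states the rule; the agent skipped it.
        return ("appendix", signal)
    # The skill never states the rule; the body itself is missing it.
    return ("body", signal)
```

When all runs succeed or all fail there is nothing to contrast, so the sketch returns no signals rather than guessing at a cause.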

Quick Start

Install

pip install git+https://github.com/victorzhong0110/skill-evolution.git

Or for development:

git clone https://github.com/victorzhong0110/skill-evolution.git
cd skill-evolution && pip install -e ".[dev]"

Zero-API-key quickstart (Claude Code users)

If you have the claude CLI installed, no API key is needed — the cli provider reuses your existing Claude Code authentication. Try evolving a CLAUDE.md-style guidance file:

skill-evolution evolve examples/claude_md/skill.md examples/claude_md/tasks.txt \
  --provider cli --rounds 1 --strategies 2

See examples/claude_md/ for the walkthrough, including how to point this at your own project's CLAUDE.md.

Evolve a skill (API providers)

# Set your API key
export ANTHROPIC_API_KEY=sk-...
# or for OpenAI:
# export OPENAI_API_KEY=sk-...

# Run evolution (2 rounds, 4 strategies per task)
skill-evolution evolve examples/code_review/skill.md examples/code_review/tasks.txt

# With options
skill-evolution evolve examples/code_review/skill.md examples/code_review/tasks.txt \
  --rounds 3 \
  --strategies 4 \
  --budget 5.0 \
  --provider claude \
  --model claude-sonnet-4-20250514

Audit a skill

skill-evolution audit my-skill.md

View version history

skill-evolution history code-review --workspace .skill-evolution

Rollback

skill-evolution rollback code-review 2 --workspace .skill-evolution

Generate default config

skill-evolution init

Skill Format

Skills are Markdown files with YAML front matter:

---
name: my-skill
version: 0
domain: engineering
tags: [example]
---

# Skill Body

Core rules and knowledge go here.

## Appendix

Reinforcement reminders for rules agents tend to skip.
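
As an illustration of this layout, the sketch below splits a skill document into front matter, body, and appendix using only the standard library. The function name `split_skill` is hypothetical. The sketch assumes the appendix begins at the `## Appendix` heading and keeps front matter values as raw strings; the package itself may parse the file differently, for example with a full YAML parser.

```python
from typing import Dict, Tuple


def split_skill(text: str) -> Tuple[Dict[str, str], str, str]:
    """Split a skill document into (front matter, body, appendix)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("skill must start with a '---' front matter line")
    try:
        # Index of the closing '---' line of the front matter.
        end = next(
            i for i, line in enumerate(lines[1:], start=1)
            if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("front matter is missing its closing '---'") from None

    # Naive 'key: value' parsing; values stay as raw strings.
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()

    rest = "\n".join(lines[end + 1:])
    # Assumption: everything after the '## Appendix' heading is the appendix.
    body, sep, appendix = rest.partition("## Appendix")
    return meta, body.strip(), (appendix.strip() if sep else "")
```

Keeping the body and appendix separate matters for patching: under the attribution principle described earlier, skill defects are fixed in the body while execution lapses are reinforced in the appendix.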

Task Format

Tasks are plain text files, one task per line:

Review this code for SQL injection vulnerabilities: `query = f"SELECT * FROM users WHERE id = {user_id}"`
Analyze this function for performance issues: `def find(items): return [x for x in items if x in other_list]`

Or JSON arrays:

["Task 1 description", "Task 2 description"]
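
A loader that accepts both formats might look like the following sketch. The name `parse_tasks` is hypothetical, and detecting the JSON form by a leading `[` is an assumption of this sketch rather than documented behavior of the tool.

```python
import json
from typing import List


def parse_tasks(text: str) -> List[str]:
    """Parse tasks from a JSON array of strings or one-task-per-line text."""
    stripped = text.strip()
    if stripped.startswith("["):
        try:
            data = json.loads(stripped)
        except json.JSONDecodeError:
            data = None  # not valid JSON; fall back to line mode
        if isinstance(data, list) and all(isinstance(t, str) for t in data):
            return [t for t in data if t.strip()]
    # Line mode: one task per non-empty line.
    return [line.strip() for line in stripped.splitlines() if line.strip()]
```

Falling back to line mode when the JSON does not parse means a plain-text task that happens to start with `[` is still read as a task instead of raising an error.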

Configuration

Generate a config file with skill-evolution init, then edit skill-evolution.yaml:

llm:
  provider: claude          # claude | openai | cli | bridge
  model: claude-sonnet-4-20250514
  temperature: 0.7
evolution:
  num_strategies: 4         # K: strategies per task per round
  num_rounds: 2             # R: evolution rounds
  budget_usd: 10.0          # Max spend (null = unlimited)
  auto_snapshot: true
audit:
  enabled: true
workspace_dir: .skill-evolution
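
The `evolution` settings interact multiplicatively: K strategies per task per round over R rounds means the executor runs tasks × K × R times. The sketch below mirrors those keys in a dataclass to show the arithmetic and the budget cap. The class and function names are hypothetical, the count covers executor runs only (not the explorer, comparator, patcher, or auditor calls), and how the tool actually enforces `budget_usd` is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvolutionConfig:
    """Mirrors the `evolution:` keys above; defaults copy that example."""
    num_strategies: int = 4              # K: strategies per task per round
    num_rounds: int = 2                  # R: evolution rounds
    budget_usd: Optional[float] = 10.0   # None means unlimited
    auto_snapshot: bool = True


def planned_executions(cfg: EvolutionConfig, num_tasks: int) -> int:
    """Executor runs for a full evolution: tasks x K strategies x R rounds."""
    return num_tasks * cfg.num_strategies * cfg.num_rounds


def within_budget(cfg: EvolutionConfig, spent_usd: float) -> bool:
    """True while spending may continue under the configured cap."""
    return cfg.budget_usd is None or spent_usd < cfg.budget_usd
```

With the defaults shown and a two-line task file, this gives 2 × 4 × 2 = 16 executor runs, which is a useful number to have in mind before raising `num_strategies` or `num_rounds`.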

Architecture

src/skill_evolution/
├── cli.py              # CLI commands (evolve, audit, history, rollback, init)
├── config.py           # YAML configuration
├── llm/                # LLM abstraction (Claude + OpenAI compatible)
├── skill/              # Skill schema + version management
├── core/               # Evolution engine
│   ├── explorer.py     # Strategy diversification
│   ├── comparator.py   # Contrastive trajectory analysis
│   ├── patcher.py      # Targeted skill patching
│   ├── auditor.py      # Independent quality audit
│   └── pipeline.py     # Orchestrates the full loop
├── runner/             # Task execution
│   └── executor.py     # Independent agent execution
└── meta_skills/        # Built-in meta-skills (themselves evolvable)
    ├── strategy_generation.md
    ├── trajectory_comparison.md
    ├── skill_audit.md
    └── skill_patch.md

Meta-Skills: The Bootstrap

The four meta-skills in meta_skills/ drive the evolution process itself. They can be evolved using the same pipeline — making the system self-improving:

# Evolve the strategy generation meta-skill using its own pipeline
skill-evolution evolve src/skill_evolution/meta_skills/strategy_generation.md meta_skill_tasks.txt

Citation

If you use this tool in research, please cite the papers that inspired it:

@article{skillevolver2026,
  title={SkillEvolver: Skill Learning as a Meta-Skill},
  author={Zhang, Genrui and Zhu, Erle and Zhou, Jinfeng and Jia, Caiyan and Wang, Hongning},
  journal={arXiv preprint arXiv:2605.10500},
  year={2026}
}

@article{embodiskill2026,
  title={EmbodiSkill: Skill-Aware Reflection for Self-Evolving Embodied Agents},
  author={...},
  journal={arXiv preprint arXiv:2605.10332},
  year={2026}
}

License

MIT
