Distill reusable Skills from AI Agent execution trajectories

xskill

Turn the work your AI agents already do into a reusable Skill library — automatically.


xskill watches the conversations your coding agents have already had, picks out the moves that actually worked, and writes them into Skill files. Next time a similar problem shows up, the agent hits the Skill instead of starting from scratch.

Why xskill

If you use Claude Code (or Codex, Cursor, Trae…) day-to-day, you have probably noticed the same loop:

  • You teach the agent how to fix a class of bug today.
  • Tomorrow it forgets, googles around, and reinvents the same fix.
  • You either retype the playbook every morning, or maintain a prompt library by hand and watch it rot.

xskill closes that loop. Point it at the directory where your agent stores its session history; it quietly distills the useful patterns into a library of Skill files that the agent loads on its next run. You keep your day job; the Skill library grows on its own.

It is opinionated about two things:

  • You do not curate. xskill decides whether a new piece of experience fits an existing Skill or deserves a new one.
  • Only Skills that actually help users are shipped. A new candidate runs side-by-side with the current Skill and the loser is dropped.

Install

pip install xskill

The PyPI package is xskill; the CLI entry point is also xskill. Requires Python 3.11+.

The first time you run xskill it writes an annotated config template to ~/.xskill/config.yaml and asks you to fill it in:

xskill serve
#  Created a config template at ~/.xskill/config.yaml
#  Edit it — fill in llm.api_key and embedding.api_key — then run `xskill serve` again.

# edit ~/.xskill/config.yaml, then:
xskill serve

A minimal config.yaml is just the two endpoints:

skill_dir: ~/.xskill/skill

llm:
  base_url: https://api.deepseek.com
  model:    deepseek-v4-flash
  api_key:  YOUR_KEY

embedding:
  base_url: https://api.deepseek.com
  model:    deepseek-embedding
  api_key:  YOUR_KEY
  dim:      0

Any OpenAI-compatible endpoint works (DeepSeek, OpenAI, Qwen/DashScope, OpenRouter, a local Ollama, …). A missing field raises an error; there are no environment-variable fallbacks. The auto-created ~/.xskill/config.yaml is the full annotated template (canary, watcher, team, sandbox sections all documented inline).
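
The "missing fields raise" behaviour can be pictured with a small validator. This is a minimal sketch, not xskill's actual loader: the REQUIRED table, the function names, and the choice of ValueError are all assumptions made for illustration, and the config is passed in as an already-parsed dict.

```python
# Hypothetical validator; xskill's real loader may check different fields.
REQUIRED = {
    "skill_dir": None,  # top-level scalar
    "llm": ("base_url", "model", "api_key"),
    "embedding": ("base_url", "model", "api_key", "dim"),
}


def missing_fields(config: dict) -> list[str]:
    """Return dotted paths of required fields that are absent or empty."""
    missing = []
    for section, keys in REQUIRED.items():
        value = config.get(section)
        if keys is None:
            if value in (None, ""):
                missing.append(section)
            continue
        if not isinstance(value, dict):
            # The whole section is absent, so every key in it is missing.
            missing.extend(f"{section}.{key}" for key in keys)
            continue
        for key in keys:
            # 0 is a legitimate value (dim: 0), so only None and "" count as empty.
            if value.get(key) in (None, ""):
                missing.append(f"{section}.{key}")
    return missing


def validate(config: dict) -> dict:
    """Raise instead of falling back to environment variables."""
    problems = missing_fields(config)
    if problems:
        raise ValueError("missing config fields: " + ", ".join(problems))
    return config
```

Failing loudly at startup keeps a half-filled template from silently sending requests with an empty key.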

If you use Claude Code, that is it — the daemon detects ~/.claude/projects/ on startup and starts watching your sessions. Otherwise, register the directory your agent writes trajectories to:

xskill registry add /path/to/your/agent/trajectories

Team mode (shared skill library)

One machine is the server; other machines join as thin clients and share its skill library:

# server — prints a join token on startup
xskill serve --server

# client — first time: pass the token; afterwards just `xskill connect`
xskill connect <host:port> --token <token>
xskill connect

xskill serve without --server stays standalone: a single host, and skills never leave the machine. On the server, the full agent pipeline runs (split / cluster / write / canary). A client only collects, redacts, and uploads its trajectories, and holds a working copy of the skills the server picks for it. The canary buckets by client_id for true per-user A/B testing. A client's hand-edits only ever reach an isolated user-staging/<client_id> branch, never the shared main.
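
Bucketing by client_id usually means a stable hash, so that one user always sees the same arm of a given experiment. The sketch below shows that idea under stated assumptions: canary_arm, the arm names, and the candidate_share parameter are hypothetical and are not taken from xskill's code.

```python
import hashlib


def canary_arm(client_id: str, skill_name: str, candidate_share: float = 0.5) -> str:
    """Deterministically assign a client to 'current' or 'candidate' for one Skill."""
    digest = hashlib.sha256(f"{skill_name}:{client_id}".encode("utf-8")).digest()
    # Reduce the hash to an integer bucket in 0..9999.
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return "candidate" if bucket < candidate_share * 10_000 else "current"
```

Including the Skill name in the hashed key means a client that lands in the candidate arm for one Skill is not automatically in the candidate arm for every other Skill.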

Helper CLI

xskill registry add    <abs-path> [--label NAME]
xskill registry remove <abs-path>
xskill registry list
xskill search traj  <query> [--top-k 5]
xskill search skill <query> [--top-k 5]
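
Because the config requires an embedding endpoint, a plausible reading of --top-k is "return the k nearest items by embedding similarity". The sketch below illustrates that reading only; it is an assumption about how search might rank results, and cosine and top_k are hypothetical helpers, not part of the xskill SDK.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], items: list[tuple[str, list[float]]], k: int = 5):
    """Rank (name, vector) pairs by similarity to the query and keep the best k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, score) for score, name in scored[:k]]
```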

Agents inside xskill

A handful of LLM agents do the work, each with a single, narrow job:

| Agent | What it does (one line) |
| --- | --- |
| TaskAgent | Reads a raw trajectory and splits it into small per-intent units (one Atom = one thing the user actually asked for). |
| TaskClusterAgent | For each new Atom, decides: hit an existing Skill, merge into an existing Skill, or open a new Skill. Prefers reuse over creation. |
| SkillEditAgent | When a Skill has accumulated enough relevant Atoms, writes / rewrites its SKILL.md (and any supporting scripts or references) and commits the update. |
| UserEditAbsorbAgent | Watches for hand-edits you make to the installed Skill files and folds those changes back into the Skill library as ground truth. |
| AtomCanary | Runs a current Skill and a new candidate side by side on real traffic and decides, based on per-Atom user-experience scores, which one ships. |
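
The canary decision can be reduced to comparing per-Atom UX scores collected for each arm. The following is a minimal sketch of such a rule, assuming a simple mean comparison with a minimum sample size; pick_winner, min_samples, and margin are hypothetical, and the statistic xskill actually uses is not stated here.

```python
from statistics import mean


def pick_winner(current_scores, candidate_scores, min_samples=5, margin=0.0):
    """Return 'candidate', 'current', or None when there is not enough data yet."""
    if len(current_scores) < min_samples or len(candidate_scores) < min_samples:
        return None  # keep the experiment running
    # The candidate must beat the current Skill by more than `margin` to ship.
    if mean(candidate_scores) > mean(current_scores) + margin:
        return "candidate"
    return "current"
```

Ties go to the current Skill in this sketch, so a candidate has to show an improvement rather than merely match.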

Cross-agent support

The trajectory in / Skill out interfaces are pluggable. The daemon auto-detects which agents you have installed and keeps scanning them as long as it runs — install a new agent later and it gets picked up without a restart.

Status legend: ✅ verified end-to-end · 🟡 implemented, not yet verified end-to-end

| Coding agent | Status | Trajectory ingest (input) | Skill install (output) |
| --- | --- | --- | --- |
| Claude Code | | Native: auto-detects ~/.claude/projects/, bridges each session JSONL into a trajectory and (when a Skill is in evaluation) injects the canary marker. | Native: Skill is symlinked into ~/.claude/skills/<name>/. |
| OpenClaw | | Native: auto-detects ~/.openclaw/agents/, bridges each *.trajectory.jsonl. | Native: Skill is copied into ~/.agents/skills/<name>/ (OpenClaw rejects escape-root symlinks; see docs). |
| Codex CLI | 🟡 | Native: auto-detects ~/.codex/sessions/, bridges each rollout JSONL. | Native: Skill is symlinked into ~/.agents/skills/<name>/ (shared user-scope skills dir). |
| OpenCode | 🟡 | Native: auto-detects ~/.local/share/opencode/opencode.db (SQLite). | Native: Skill is symlinked into ~/.agents/skills/<name>/ (shared with Codex). |
| Cursor | 🟡 | Native: auto-detects ~/.cursor/projects/*/agent-transcripts/. | Native: Skill is symlinked into ~/.cursor/skills/<name>/. |
| Any other agent | | Manual: submit a trajectory in markdown, json, or raw format via the SDK (xskill.adapters.submit_trajectory). | Manual: every Skill is a directory with an Anthropic-style SKILL.md + YAML frontmatter; copy or symlink it into whatever discovery path your agent uses. |

The output format is the same SKILL.md schema Anthropic uses, so any agent that already reads Anthropic Skills can read xskill's library verbatim. A failed install on one agent is logged and skipped — it never blocks the others.
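
For agents that need a manual install, it helps to see what a Skill directory contains. The sketch below writes a SKILL.md with the two frontmatter fields Anthropic-style Skills use, name and description. The helper names are hypothetical, the name pattern check is an assumption, and any further frontmatter fields xskill may emit are not shown.

```python
import json
import re
from pathlib import Path


def render_skill_md(name: str, description: str, body: str) -> str:
    """Build SKILL.md text: YAML frontmatter followed by the markdown body."""
    # Assumed naming rule: lowercase letters, digits, and single hyphens.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        raise ValueError(f"unexpected skill name: {name!r}")
    # A JSON string is also a valid double-quoted YAML scalar, which avoids
    # hand-written escaping of colons or quotes in the description.
    front = f"---\nname: {name}\ndescription: {json.dumps(description)}\n---\n"
    return front + "\n" + body.strip() + "\n"


def write_skill(root: Path, name: str, description: str, body: str) -> Path:
    """Create <root>/<name>/SKILL.md and return its path."""
    skill_dir = root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    path = skill_dir / "SKILL.md"
    path.write_text(render_skill_md(name, description, body), encoding="utf-8")
    return path
```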

Platforms: developed and verified on Linux. Windows is supported with the symlink caveat described under Platforms (a scripts/cursor_setup.ps1 helper is provided). macOS is expected to work (POSIX) but is not yet verified end-to-end.

Editing skills live

Skills install as symlinks (OpenClaw, which receives a copy, is the exception), so editing an installed skill file, whether you do it by hand or let an agent do it, edits xskill's source copy directly. The change is live immediately: the next time your agent loads that skill, it sees your edit.

xskill folds the edit back in on its own. Once the skill has stayed quiet for ~3 minutes (no further changes), the daemon commits your edit to that skill's main branch as the new ground truth. If the skill happened to be mid-canary, the hand edit wins — the staging candidate is dropped, because a deliberate edit outranks an A/B guess.

Platforms

xskill is pure Python (3.11+) and the daemon, watcher and SDK are OS-agnostic in principle. Coverage we can honestly claim today:

| Platform | Status | Notes |
| --- | --- | --- |
| Linux (x86_64) | tested ✅ | Development and CI environment. |
| macOS | should work | Same POSIX surface: symlinks, ~/.claude/ path and git subprocess all behave the same as Linux. Not part of CI yet; please report issues. |
| Windows 10 / 11 | partial ⚠️ | Trajectory ingest and Skill search work, but installing a Skill creates a directory symlink, which requires Developer Mode or running as Administrator on Windows. Without that, the symlink step fails. Not part of CI; community testing welcome. |

If you are on Windows and want to avoid the symlink requirement, set skill_dir directly to your agent's discovery folder in ~/.xskill/config.yaml and skip the auto-install step.
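
If you install Skills into an unsupported agent yourself, a script can try a directory symlink first and fall back to a copy when the platform refuses. This is an illustrative alternative to the skill_dir workaround above, not something xskill is stated to do; install_skill is a hypothetical helper, and note that a copy, unlike a symlink, does not reflect later edits to the source.

```python
import os
import shutil
from pathlib import Path


def install_skill(source: Path, target: Path) -> str:
    """Link source into target, or copy it if symlinks are unavailable.

    Returns "symlink" or "copy" so the caller knows which one happened.
    """
    if target.exists() or target.is_symlink():
        raise FileExistsError(f"refusing to overwrite {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    try:
        # On Windows this needs Developer Mode or Administrator rights.
        os.symlink(source, target, target_is_directory=True)
        return "symlink"
    except OSError:
        shutil.copytree(source, target)
        return "copy"
```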

Concepts

| Term | What it means |
| --- | --- |
| Trajectory | One agent run, typically the transcript of a single session. xskill stores it as traj_*.md. |
| Atom | The smallest "one user intent" slice of a trajectory. One trajectory can produce one or several Atoms. Routing decisions happen at this level. |
| Skill | A reusable, prompt-shaped artifact your agent can load: a SKILL.md file plus optional scripts and references. Each Skill lives in its own versioned directory under ~/.xskill/skill/. |
| Canary | The mechanism that compares an existing Skill with a new candidate version on real traffic and keeps whichever scores better on user experience. |
| Registry | The list of directories xskill is watching. Add a path and the daemon keeps polling it for as long as it runs. |
| UX score | An LLM-as-judge score that grades how well a Skill actually served the user on a given Atom. Used by the canary to pick a winner. |

How xskill compares

We surveyed 10 academic and open-source trajectory-to-skill systems (Hermes, OpenSpace, EvoSkill, AutoSkill, AgentEvolver, MemSkill, EvoAgentX, SE-Agent, SkillRL, GEPA) before building xskill. The full matrix lives at docs/research/related-work-survey.md with path:line evidence per cell.

What xskill takes from prior work:

  • SKILL.md as the cross-agent unit — OpenSpace, EvoSkill and AutoSkill all converged here; xskill follows the same Anthropic frontmatter schema for portability.
  • LLM-as-judge UX scoring — inspired by AutoSkill's per-turn relevant / used signal.
  • Per-Skill versioning — each Skill is its own git repository, so history, diffs and rollbacks are first-class.

What xskill does that none of the surveyed projects do:

  • Real A/B between Skill versions — chat traffic is split, two-sided UX scores decide which version ships, no human in the loop.
  • Symmetric ingestion — both per-session streaming (drop a new transcript, the watcher picks it up) and batch backfill (xskill registry add /archive reindexes an entire history) are first-class.
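
Streaming pickup and batch backfill can share one polling routine: the first pass over a directory returns everything in it, and later passes return only what is new or has grown. The sketch below shows that idea under stated assumptions; scan_once, the size-based change check, and the *.jsonl pattern are hypothetical and do not describe xskill's watcher internals.

```python
from pathlib import Path


def scan_once(directory: Path, seen: dict[str, int], pattern: str = "*.jsonl") -> list[Path]:
    """Return files that are new or whose size changed since the last scan.

    `seen` maps path -> last observed size and is updated in place.
    """
    changed = []
    for path in sorted(directory.rglob(pattern)):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if seen.get(str(path)) != size:
            seen[str(path)] = size
            changed.append(path)
    return changed
```

Tracking size rather than mere existence lets an append-only transcript be picked up again after the session writes more lines to it.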

Roadmap

  • More coding-agent adapters — Trae, Goose, OpenHands, Aider on both ends (trajectory ingest + Skill install)
  • Native MCP server interface (Skills exposed as tools)
  • Web UI for browsing the Skill library, viewing canary stats, manual approve / discard
  • Usage-driven auto-prune (delete Skills that are retrieved often but never actually used)
  • Skill marketplace: import / export portable Skill bundles
  • Multi-tenant Skill libraries (per-team skill_dir)

Have an idea? Open an issue.

Development

git clone https://github.com/SkillNerds/xskill
cd xskill
pip install -e ".[dev]"
pytest -q

Internal design notes live under docs/ (a mix of English and Chinese).

Contributing

PRs welcome — please:

  1. Open an issue describing the problem first.
  2. Add or extend a test (no test, no merge).
  3. Keep public-API additions in xskill/__init__.py minimal — we guard the surface area.

License

MIT (c) 370025263. See LICENSE.


If xskill saves your agents from solving the same problem twice, a star on GitHub helps others find it.
