Generate grounded AI coding assistant skills and repository blueprints from real Python codebases.

code2skill

A condensed English quick reference follows the main documentation.

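code2skill is a CLI for Python repositories. It converts a real code repository into structured project knowledge, AI-consumable Skill documents, and rule files that can be adapted directly for Cursor, Claude Code, Codex, Copilot, and Windsurf.

Its goal is not to "summarize the repository" but to generate high-density context that can be used directly for subsequent coding, review, patch writing, and incremental updates.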

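Core positioning:

Input: a real Python repository
Intermediate artifacts: project-summary.md, skill-blueprint.json, skill-plan.json
Final artifacts: skills/*.md, AGENTS.md, CLAUDE.md, copilot-instructions.md, and other AI rule files
Target tools: Cursor, Claude Code, Codex, GitHub Copilot, Windsurf

If you are looking for any of the following capabilities, this project was designed for them: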

Python repository analysis for AI coding assistants
Generate Cursor rules / Codex AGENTS.md / Claude Code docs from source code
Turn a backend repository into reusable AI skills instead of one-off prompts
Keep AI repo knowledge incrementally updated in CI

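Why it is better suited to AI consumption

It scans structure first, then plans Skills, and finally generates reusable documents, rather than producing one long one-off prompt
Phase 1 does not depend on an LLM; it first distills directories, imports, roles, patterns, rules, and workflows
The output is a set of stable files, not a chat log, so it can be reused, committed, and incrementally updated
Generated results can be written to the locations each IDE or agent expects, rather than copied and pasted by hand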

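What you get

project-summary.md: a project overview for quick human browsing
skill-blueprint.json: the structured repository blueprint produced by Phase 1
skill-plan.json: the LLM-planned list of Skills and the files each one reads
skills/index.md: the Skill index
skills/*.md: the domain rule documents actually consumed by AI coding assistants
AGENTS.md / CLAUDE.md / .cursor/rules/*: the adapted IDE artifacts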

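Typical use cases

Add Cursor / Codex / Claude Code rules to an existing Python backend repository
Automatically rebuild the affected Skills in CI based on the diff
Give a team a set of development conventions derived from real code rather than verbal agreement
Provide AI coding tools with grounded repository context to reduce hallucinations and misjudgments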

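Scope

Currently targets Python repositories only
Phase 1 makes no LLM calls
Python source is structurally extracted with ast
Supported commands: scan, estimate, ci, adapt
Supported LLM providers: openai, claude, qwen
Prompts and Skill output default to English, use no emoji, and mark points with insufficient evidence as [Needs confirmation]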
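
The ast-based extraction mentioned in the list above can be pictured with a minimal sketch. This is not code2skill's actual extractor: the function name extract_skeleton and the shape of the returned dictionary are assumptions made for illustration. It relies only on the standard-library ast module and collects top-level imports, classes, and function signatures without executing the source.

```python
import ast


def extract_skeleton(source: str) -> dict:
    """Return a coarse structural skeleton of one Python module.

    Hypothetical helper for illustration; not code2skill's real API.
    """
    tree = ast.parse(source)
    skeleton = {"imports": [], "classes": [], "functions": []}
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.Import):
            skeleton["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have level > 0; module is None for "from . import x".
            skeleton["imports"].append("." * node.level + (node.module or ""))
        elif isinstance(node, ast.ClassDef):
            methods = [
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            skeleton["classes"].append({"name": node.name, "methods": methods})
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = [arg.arg for arg in node.args.args]
            skeleton["functions"].append({"name": node.name, "args": args})
    return skeleton


SAMPLE = '''
import os
from .models import User

class UserService:
    def get(self, user_id):
        return User(user_id)

def main(argv):
    return os.getcwd()
'''

skeleton = extract_skeleton(SAMPLE)
```

Because the source is only parsed, never imported, a sketch like this works on repositories whose dependencies are not installed.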

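Core features

Structure scanning: directory discovery, filtering, budget trimming, Python skeleton extraction
Structure analysis: import graph, role correction, pattern detection, abstract rule distillation
Skill planning: a single LLM call decides which Skills to generate and which files each Skill reads
Skill generation: produces high-quality Markdown one Skill at a time, with context focused on that Skill
Incremental updates: in CI, rewrites only the affected Skills based on the Git diff
Target adaptation: copies or merges skills/*.md into the locations expected by Cursor, Codex, Claude, and others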
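
The "budget trimming" step named above is not specified further in this document. One plausible reading is a greedy selection under a size budget, sketched below. The function trim_to_budget, the (path, score, size) tuple layout, and the sample scores are all hypothetical; code2skill's real scoring and trimming rules may differ.

```python
def trim_to_budget(files: list[tuple[str, float, int]], budget: int) -> list[str]:
    """Greedily keep the highest-scoring files that still fit the budget.

    `files` holds (path, score, size_in_chars) tuples. Hypothetical helper,
    not code2skill's real API.
    """
    selected = []
    remaining = budget
    # Highest score first; ties are broken by path for deterministic output.
    for path, _score, size in sorted(files, key=lambda f: (-f[1], f[0])):
        if size <= remaining:
            selected.append(path)
            remaining -= size
    return selected


candidates = [
    ("app/core.py", 0.9, 400),
    ("app/api.py", 0.8, 700),
    ("tests/test_api.py", 0.3, 200),
    ("app/utils.py", 0.5, 300),
]
kept = trim_to_budget(candidates, budget=1000)
```

In this example app/api.py scores well but no longer fits after app/core.py is taken, so two smaller files are kept instead.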

30-second quick start

First set the model environment variables:

export QWEN_API_KEY=...
export CODE2SKILL_LLM=qwen
export CODE2SKILL_MODEL=qwen-plus-latest

PowerShell:

$env:QWEN_API_KEY="..."
$env:CODE2SKILL_LLM="qwen"
$env:CODE2SKILL_MODEL="qwen-plus-latest"

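Change into the repository you want to analyze and run:

code2skill scan

repo_path now defaults to the current directory, so there is no need to pass . when running from the repository root.

To run only the structure scan:

code2skill scan --structure-only

If historical state already exists and you want automatic incremental mode:

code2skill ci --mode auto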

Installation

Release version:

pip install code2skill

Development version:

pip install -e ".[dev]"

Command entry points:

code2skill --help
python -m code2skill --help

Common environment variables

These variables allow shorter commands both locally and in CI.

LLM API keys:

export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export QWEN_API_KEY=...

PowerShell:

$env:OPENAI_API_KEY="..."
$env:ANTHROPIC_API_KEY="..."
$env:QWEN_API_KEY="..."

CLI defaults:

export CODE2SKILL_LLM=qwen
export CODE2SKILL_MODEL=qwen-plus-latest
export CODE2SKILL_OUTPUT_DIR=.code2skill
export CODE2SKILL_MAX_SKILLS=6
export CODE2SKILL_BASE_REF=origin/main

PowerShell:

$env:CODE2SKILL_LLM="qwen"
$env:CODE2SKILL_MODEL="qwen-plus-latest"
$env:CODE2SKILL_OUTPUT_DIR=".code2skill"
$env:CODE2SKILL_MAX_SKILLS="6"
$env:CODE2SKILL_BASE_REF="origin/main"
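
How the CLI reads these defaults is not shown in this document. The sketch below illustrates one common pattern, assuming an explicit flag overrides the environment variable. The parser name code2skill-sketch and the value other-model are hypothetical; only the --llm, --model, and --base-ref flags and the variable names come from the examples in this README.

```python
import argparse
import os


def build_parser(env=None) -> argparse.ArgumentParser:
    """Build a parser whose flag defaults come from environment variables.

    Illustrative only; code2skill's real argument handling may differ.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="code2skill-sketch")
    parser.add_argument("--llm", default=env.get("CODE2SKILL_LLM"))
    parser.add_argument("--model", default=env.get("CODE2SKILL_MODEL"))
    parser.add_argument("--base-ref", default=env.get("CODE2SKILL_BASE_REF"))
    return parser


# A plain dict stands in for os.environ so the example is reproducible.
env = {"CODE2SKILL_LLM": "qwen", "CODE2SKILL_MODEL": "qwen-plus-latest"}
args = build_parser(env).parse_args(["--model", "other-model"])
```

Here args.llm falls back to the environment value, args.model takes the explicit flag, and args.base_ref stays unset.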

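Notes:

qwen uses Alibaba's international-site compatible API endpoint by default
qwen reads QWEN_API_KEY and also accepts DASHSCOPE_API_KEY
If the corresponding API key is not configured, the command fails with an error instead of silently degrading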
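
The key lookup described in these notes can be sketched as follows. The function name resolve_qwen_api_key, the error message, and the choice to check QWEN_API_KEY before DASHSCOPE_API_KEY are assumptions; the document only states that both variables are accepted and that a missing key is an error.

```python
import os


def resolve_qwen_api_key(env=None) -> str:
    """Return the first configured key, or fail loudly if none is set.

    Hypothetical helper; the lookup order is an assumption.
    """
    env = os.environ if env is None else env
    for name in ("QWEN_API_KEY", "DASHSCOPE_API_KEY"):
        value = env.get(name, "").strip()
        if value:
            return value
    # No silent fallback: a missing key is reported immediately.
    raise RuntimeError("No API key found: set QWEN_API_KEY or DASHSCOPE_API_KEY")


key = resolve_qwen_api_key({"DASHSCOPE_API_KEY": "example-key"})
```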

Command cheat sheet

Full scan with Skill generation:

code2skill scan --llm qwen --model qwen-plus-latest

Structure scan only:

code2skill scan --structure-only

Automatic incremental run:

code2skill ci --mode auto --base-ref origin/main

Cost estimate only:

code2skill estimate

Merge Skills into the Codex rule file:

code2skill adapt --target codex --source-dir .code2skill/skills

Adapt for all targets:

code2skill adapt --target all --source-dir .code2skill/skills

Workflow

Phase 1: Structure scanning

Input:

Repository path

Output:

project-summary.md
skill-blueprint.json
references/architecture.md
references/code-style.md
references/workflows.md
references/api-usage.md
report.json
state/analysis-state.json

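Main steps:

File discovery and filtering
Coarse scoring and budget trimming
Python AST skeleton extraction
Import graph construction
Priority and role correction based on structural signals
Pattern detection and abstract rule distillation
SkillBlueprint assembly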
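
The import graph step above can be illustrated with a small sketch built on the standard-library ast module. It is a simplification under stated assumptions: build_import_graph is a hypothetical name, modules are passed in as a mapping from dotted name to source text, relative imports are ignored, and "from package import name" is recorded as an edge to the package only. code2skill's own import_graph.py may resolve imports differently.

```python
import ast


def build_import_graph(modules: dict[str, str]) -> dict[str, set[str]]:
    """Map each module to the in-repository modules it imports.

    `modules` maps dotted module names to source text. Hypothetical helper.
    """
    graph: dict[str, set[str]] = {name: set() for name in modules}
    for name, source in modules.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                targets = [node.module]
            else:
                continue
            # Keep only edges that point at modules inside the repository.
            graph[name].update(t for t in targets if t in modules and t != name)
    return graph


modules = {
    "app.api": "import app.services\nimport json\n",
    "app.services": "from app.models import User\n",
    "app.models": "import dataclasses\n",
}
graph = build_import_graph(modules)
```

Third-party and standard-library imports such as json and dataclasses drop out because they are not keys of the input mapping.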

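Phase 2: Skill planning

Input:

skill-blueprint.json

Output:

skill-plan.json

Main steps:

Compress the project profile, directory summaries, dependency clusters, core modules, rules, and workflows
Make one LLM call
Decide which Skills to generate
Select, for each Skill, the set of files most worth reading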

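Phase 3: Skill generation

Input:

skill-plan.json
The full text or skeleton of each file assigned to a Skill

Output:

skills/index.md
skills/*.md

Main steps:

Collect the context files for each Skill
Select the abstract rules most relevant to that Skill
Call the LLM to generate the Skill document
In incremental mode, revise only the affected sections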
Adapt: target format adaptation

Input:

skills/*.md

Output:

Cursor: copied to .cursor/rules/
Claude: merged into CLAUDE.md
Codex: merged into AGENTS.md
Copilot: merged into .github/copilot-instructions.md
Windsurf: merged into .windsurfrules
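
As a rough picture of the "merged into" targets above, the sketch below concatenates Skill files into a single rules file. It is not the adapt command's implementation: merge_skills is a hypothetical name, and skipping index.md, sorting by file name, and separating documents with a blank line are all assumptions.

```python
import tempfile
from pathlib import Path


def merge_skills(source_dir: Path, target_file: Path) -> list[str]:
    """Concatenate Skill documents (except index.md) into one rules file.

    Hypothetical helper; returns the names of the merged files in order.
    """
    skill_files = sorted(p for p in source_dir.glob("*.md") if p.name != "index.md")
    parts = [p.read_text(encoding="utf-8").strip() for p in skill_files]
    target_file.parent.mkdir(parents=True, exist_ok=True)
    target_file.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return [p.name for p in skill_files]


# Demonstrate on a throwaway directory instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skills = root / ".code2skill" / "skills"
    skills.mkdir(parents=True)
    (skills / "index.md").write_text("# Index\n", encoding="utf-8")
    (skills / "api.md").write_text("# API rules\n", encoding="utf-8")
    (skills / "models.md").write_text("# Model rules\n", encoding="utf-8")
    merged = merge_skills(skills, root / "AGENTS.md")
    content = (root / "AGENTS.md").read_text(encoding="utf-8")
```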

Output directory

.code2skill/
  project-summary.md
  skill-blueprint.json
  skill-plan.json
  report.json
  references/
    architecture.md
    code-style.md
    workflows.md
    api-usage.md
  skills/
    index.md
    *.md
  state/
    analysis-state.json

CI / incremental usage recommendations

Treat .code2skill/ as a CI cache or artifact rather than committing it to the repository.

Incremental mode depends on these files:

.code2skill/state/analysis-state.json
.code2skill/skill-plan.json
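Preferably also restore .code2skill/skills/

If these files are missing, or the diff conditions are not met, ci --mode auto automatically falls back to a full run.

Common reasons for automatic fallback to a full run

No historical state exists
A core configuration file was changed, for example pyproject.toml
The number of changed files exceeds --max-incremental-changed-files
The current directory is not a Git repository and no --diff-file was provided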
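
The fallback conditions listed above can be expressed as a small decision function. This is a sketch of the documented rules, not the tool's code: CiContext, choose_mode, and the threshold of 20 are hypothetical, and pyproject.toml is the only core configuration file this document names, so the real list is probably longer.

```python
from dataclasses import dataclass

# Assumption: only the one example named in the documentation.
CORE_CONFIG_FILES = {"pyproject.toml"}


@dataclass
class CiContext:
    has_state: bool          # analysis-state.json and skill-plan.json were restored
    has_diff_source: bool    # a Git repository or a --diff-file is available
    changed_files: list[str]
    max_incremental_changed_files: int


def choose_mode(ctx: CiContext) -> str:
    """Return "incremental" only when every documented condition allows it."""
    if not ctx.has_state or not ctx.has_diff_source:
        return "full"
    if any(path in CORE_CONFIG_FILES for path in ctx.changed_files):
        return "full"
    if len(ctx.changed_files) > ctx.max_incremental_changed_files:
        return "full"
    return "incremental"


small_change = CiContext(True, True, ["app/api.py"], 20)
config_change = CiContext(True, True, ["pyproject.toml"], 20)
first_run = CiContext(False, True, ["app/api.py"], 20)
```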

GitHub Actions example

name: code2skill

on:
  pull_request:

jobs:
  build-skills:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Restore code2skill cache
        uses: actions/cache@v4
        with:
          path: .code2skill
          key: code2skill-${{ runner.os }}-${{ github.ref_name }}-${{ github.sha }}
          restore-keys: |
            code2skill-${{ runner.os }}-${{ github.ref_name }}-

      - name: Install
        run: pip install -e .[dev]

      - name: Run code2skill
        env:
          QWEN_API_KEY: ${{ secrets.QWEN_API_KEY }}
          CODE2SKILL_LLM: qwen
          CODE2SKILL_MODEL: qwen-plus-latest
        run: |
          code2skill ci \
            --mode auto \
            --base-ref origin/${{ github.base_ref }} \
            --head-ref HEAD

      - name: Upload outputs
        uses: actions/upload-artifact@v4
        with:
          name: code2skill-output
          path: .code2skill

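Notes:

fetch-depth: 0 matters; without it the baseline commit may be missing from the local history
restore-keys lets later commits on the same branch reuse historical state
On the first run, when no cache exists, ci --mode auto automatically performs a full run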
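
To see why the full history matters, consider how a list of changed files might be derived from the base and head refs. The sketch below shells out to git diff --name-only with the three-dot form, which compares against the merge base and therefore needs that commit to exist locally. Whether code2skill uses this exact invocation is an assumption, and the helper names are hypothetical.

```python
import subprocess


def parse_name_only(output: str) -> list[str]:
    """Split `git diff --name-only` output into a clean list of paths."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def changed_files(base_ref: str, head_ref: str = "HEAD", cwd: str = ".") -> list[str]:
    """List files changed between the merge base of the two refs and head_ref.

    Raises subprocess.CalledProcessError if git fails, for example when the
    base commit is absent from a shallow clone.
    """
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...{head_ref}"],
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)


# The parser can be exercised without a repository.
paths = parse_name_only("app/api.py\n\napp/models.py\n")
```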

Generated artifacts and Git management

By default, these directories under the repository root are already ignored in .gitignore:

.code2skill/
.code2skill-*/
.pypi-smoke/

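Recommendations:

Write official artifacts to .code2skill/
For local trial runs, manual acceptance checks, and comparisons across models, use names such as .code2skill-qwen-live/ or .code2skill-test/
Do not commit a skills/ directory generated during testing to Git
To review results in a PR, prefer artifacts over committing the generated files

How the project works internally

To understand how code2skill itself works, start with these modules:

src/code2skill/scanner/: file discovery, filtering, budget trimming, priority scoring
src/code2skill/extractors/python_extractor.py: Python AST skeleton extraction
src/code2skill/import_graph.py: in-repository import graph
src/code2skill/pattern_detector.py: pattern detection across files with the same role
src/code2skill/analyzers/skill_blueprint_builder.py: assembles scan results into a SkillBlueprint
src/code2skill/skill_planner.py: generates skill-plan.json
src/code2skill/skill_generator.py: generates and incrementally revises skills/*.md
src/code2skill/core.py: orchestrates scan / estimate / ci

Recommended reading order: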

cli.py
core.py
scanner/ 与 extractors/
analyzers/
skill_planner.py
skill_generator.py
adapt.py

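Release checklist

Recommended checks during development and before a release:

pip install -e ".[dev]"
python -m pytest tests -q
python -m build
python -m twine check dist/code2skill-*.tar.gz dist/code2skill-*.whl

Current limitations

Currently targets Python repositories only
Generated Skills are suitable for assisting coding and review, but should not be treated as absolute fact
Incremental updates depend on historical state files and an available diff
Some impact summaries in report.json are still heuristic; treat skill-plan.json and the generated skills/*.md as authoritative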

English Quick Reference

What It Does

code2skill turns a Python repository into:

a structural blueprint
a skill plan
generated skill markdown files
cached state for incremental CI/CD runs

Quick Start

From the target repo root:

export QWEN_API_KEY=...
export CODE2SKILL_LLM=qwen
export CODE2SKILL_MODEL=qwen-plus-latest
code2skill scan

PowerShell:

$env:QWEN_API_KEY="..."
$env:CODE2SKILL_LLM="qwen"
$env:CODE2SKILL_MODEL="qwen-plus-latest"
code2skill scan

Main Commands

code2skill scan
code2skill scan --structure-only
code2skill ci --mode auto --base-ref origin/main
code2skill estimate
code2skill adapt --target codex --source-dir .code2skill/skills

Incremental CI Requirements

Restore:

.code2skill/state/analysis-state.json
.code2skill/skill-plan.json
preferably .code2skill/skills/

If they are missing, ci --mode auto falls back to a full run.

Release Validation

pip install -e ".[dev]"
python -m pytest tests -q
python -m build
python -m twine check dist/code2skill-*.tar.gz dist/code2skill-*.whl

License

Apache-2.0

Release history

0.1.4

Mar 17, 2026

This version

0.1.3

Mar 17, 2026

0.1.2

Mar 17, 2026

0.1.1

Mar 17, 2026

0.1.0

Mar 16, 2026
