Adaptive skill routing for multi-agent systems

🧶 Skill Weave

Routing that learns. Chains that self-correct. Zero install to try.


What makes it special: A three-stage routing pipeline that shrinks 141 candidate skills to 15 before any LLM call. An online learner that gets smarter every time you use it. A weaver that chains skills into DAGs instead of picking just one.

95.7% accuracy. 81% fewer tokens. 20 tests. Zero required dependencies.


🔥 Why This Exists

Multi-agent systems drown in their own skills.

| Problem | Why It Hurts | How Skill Weave Fixes It |
|---|---|---|
| Keyword routing breaks under overlap | "deploy" vs "ssh-deploy" vs "docker-deploy" all match | 4-dim weighted scoring: semantic × recency × success × cost |
| Static tables rot silently | Add/remove one skill, the whole map breaks | Dynamic registration + online learning from every route |
| Flat LLM routing burns tokens | 141 skills = 141 candidates to rank. Every. Single. Time. | 3-stage cascade: Tree Filter (141→15) → BM25 → LLM Re-rank |

🆚 How It Compares

| Feature | Skill Weave | Keyword Match | LangChain Router | Semantic Kernel |
|---|---|---|---|---|
| Zero-dependency core | ✅ | ✅ | ❌ | ❌ |
| 3-stage cascade pipeline | ✅ | ❌ | ❌ | ❌ |
| Online learning from outcomes | ✅ | ❌ | ❌ | ❌ |
| Multi-skill DAG weaving | ✅ | ❌ | ❌ | ❌ |
| Chinese-English synonym match | ✅ | ❌ | ❌ | ❌ |
| Production-tested (141 skills) | ✅ | n/a | n/a | n/a |
| Token cost per route | 0–2K | 0 | ∞ (flat) | ∞ (flat) |

Bottom line: Keyword matching is fast but brittle. LangChain/SK handle semantics but burn tokens on every call. Skill Weave does both (cascade filtering plus semantic re-rank), with learning on top.


🎮 Try It Now

No install. No API key. 10 seconds.

Open in Colab

Four interactive demos: basic routing → active learning → skill weaving → multi-plan comparison.


📦 Install

# From GitHub (recommended until PyPI listing is complete)
pip install git+https://github.com/Hxh-yaoxing/skill-weave.git

# Coming soon: pip install skill-weave

⚡ 30-Second Quick Start

from skill_weave import SkillRouter

router = SkillRouter()
router.register_skill("deploy",   metadata="deploy to production, handle rollback")
router.register_skill("monitor",  metadata="monitor health metrics, alert on anomalies")
router.register_skill("rollback", metadata="revert failed deployments")

results = router.route("The new deploy broke everything, we need to go back")
for r in results:
    print(f"{r.skill.name}: {r.score:.2f}")
# → rollback: 0.68
# → deploy:   0.54
# → monitor:  0.42

It chose rollback, even though the query never said "rollback". That's semantic routing.


🧠 The Pipeline

flowchart TD
    TASK["💬 Task: 'deploy broke, revert now'"]
    
    TASK --> L1
    
    subgraph L1["L1: Tree Filter (zero token)"]
        T1["'deploy' → infrastructure → 37 matches"]
        T2["'revert' → narrows to 4 candidates"]
        T1 --> T2
    end
    
    L1 --> L2
    
    subgraph L2["L2: BM25 Rank (<50ms)"]
        B1["Statistical scoring over 4 candidates"]
        B2["rollback: 0.68 | deploy: 0.54"]
        B1 --> B2
    end
    
    L2 --> L3
    
    subgraph L3["L3: LLM Re-rank (optional, ~1s)"]
        R1["Semantic understanding over top ~10"]
        R2["Accuracy: 69.6% → 95.7%"]
        R1 --> R2
    end
    
    L3 --> OUTPUT["✅ rollback (score: 0.92)"]

| Stage | What It Does | Token Cost | Latency |
|---|---|---|---|
| L1: Tree Filter | Hierarchy + synonym match → narrows 141→~15 | 0 | <1ms |
| L2: BM25 | Character 2-gram (Chinese) + word-level (English) retrieval | 0 | <50ms |
| L3: LLM Re-rank | Deep semantic reasoning over ~10 candidates | ~2K | ~1s |
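
As a rough illustration of the L2 stage, here is a minimal sketch of mixed tokenization (word-level for English, character 2-grams for Chinese) feeding an Okapi BM25 scorer. It is not the code in advanced.py: the function names, the `k1`/`b` defaults, the IDF variant, and the sample documents are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased English words plus overlapping 2-grams of each CJK run."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for run in re.findall(r"[\u4e00-\u9fff]+", text):
        if len(run) == 1:
            tokens.append(run)
        tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
    return tokens


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (name, score) pairs sorted by descending BM25 score."""
    if not docs:
        return []
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized.values()) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized.values() for term in set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical skill descriptions; the Chinese text means "roll back deployment".
docs = {
    "deploy": "deploy to production, handle rollback",
    "monitor": "monitor health metrics, alert on anomalies",
    "rollback": "revert failed deployments 回滚部署",
}
ranked = bm25_rank("回滚 failed release", docs)
print(ranked[0][0])
```

Character 2-grams let a Chinese query match without a word segmenter, which is one plausible reason to pair them with plain word tokens for English.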

📖 API Reference

SkillRouter: Zero-dependency core

router = SkillRouter(
    alpha=0.45,    # semantic weight
    beta=0.20,     # recency weight
    gamma=0.25,    # success rate weight
    delta=0.10,    # cost weight
)
router.register_skill(name, metadata="...", tags=[...], avg_cost=1.0)
router.unregister_skill(name)
router.route(task, top_k=5, max_cost=None, tags_filter=None)  # → list[RouteResult]
router.record_outcome(skill_name, success=True, cost=1.0)
router.skills  # → dict[str, Skill]
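
To make the four weights concrete, here is a minimal sketch of how they might combine. The comparison table describes the score as semantic × recency × success × cost, while the constructor exposes four weights that sum to 1.0; this sketch assumes a weighted sum and assumes cheaper skills should score higher. The `SkillSignals` class, the normalization to [0, 1], and the sample values are hypothetical, and the actual formula in router.py may differ.

```python
from dataclasses import dataclass


@dataclass
class SkillSignals:
    """Per-skill signals, each assumed to be normalized to [0, 1]."""
    semantic: float  # how well the skill's metadata matches the task
    recency: float   # 1.0 means used very recently
    success: float   # observed success rate
    cost: float      # 1.0 means most expensive


def weighted_score(s, alpha=0.45, beta=0.20, gamma=0.25, delta=0.10):
    """Blend four signals into one score; lower cost raises the score."""
    return (alpha * s.semantic
            + beta * s.recency
            + gamma * s.success
            + delta * (1.0 - s.cost))


# Hypothetical signal values for two skills.
rollback = SkillSignals(semantic=0.8, recency=0.5, success=0.9, cost=0.2)
deploy = SkillSignals(semantic=0.6, recency=0.9, success=0.7, cost=0.5)
print(round(weighted_score(rollback), 3), round(weighted_score(deploy), 3))
```

With the default weights the semantic signal dominates, so a recently used skill cannot outrank a clearly better match on recency alone.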

SkillWeave: Production 3-stage pipeline

sw = SkillWeave(skill_dir="/path/to/skills", llm_rank_fn=my_llm_fn)
sw.route(query, top_k=5, exclude_tier3=True)   # → list[dict]
sw.run_benchmark(queries, verbose=True)        # → {"accuracy": 0.957, ...}
sw.stats                                       # → {"total_skills": 141, ...}
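
The cascade idea behind this class can be sketched in a few lines: run the free stages first and hand only a short list to an optional re-rank callback. Everything below is hypothetical, including the function names, the category layout, and the `rerank_fn(query, shortlist)` signature; this README does not document the real signature of `llm_rank_fn`. A simple word-overlap ranker stands in for BM25, and whitespace splitting would not segment unspaced Chinese text.

```python
def tree_filter(query, categories, synonyms):
    """Stage 1: keep skills whose category shares a keyword with the query."""
    words = {synonyms.get(word, word) for word in query.lower().split()}
    kept = []
    for keywords, skills in categories.values():
        if words & keywords:
            kept.extend(skills)
    return kept


def keyword_rank(query, candidates, descriptions):
    """Stage 2 stand-in: order candidates by words shared with the query."""
    words = set(query.lower().split())

    def overlap(name):
        return len(words & set(descriptions[name].lower().split()))

    return sorted(candidates, key=overlap, reverse=True)


def cascade(query, categories, synonyms, descriptions, rerank_fn=None, top_k=2):
    """Run the cheap stages first; call rerank_fn only on the short list."""
    candidates = tree_filter(query, categories, synonyms)
    shortlist = keyword_rank(query, candidates, descriptions)[:top_k]
    return rerank_fn(query, shortlist) if rerank_fn else shortlist


# Hypothetical category tree: name -> (keywords, skills filed under it).
categories = {
    "infrastructure": ({"deploy", "rollback"}, ["deploy", "rollback", "monitor"]),
    "writing": ({"draft", "essay"}, ["blog-writer"]),
}
synonyms = {"revert": "rollback"}
descriptions = {
    "deploy": "deploy to production",
    "rollback": "revert failed deployments",
    "monitor": "monitor health metrics",
    "blog-writer": "draft blog posts",
}
result = cascade("revert the failed deploy", categories, synonyms, descriptions)
print(result)  # → ['rollback', 'deploy']
```

Because the callback only ever sees `top_k` names, its token cost stays bounded no matter how many skills are registered.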

FeedbackLearner: Online weight adjustment

learner = FeedbackLearner(router)
learner.route(task, explore=True)              # UCB bandit exploration
learner.record(skill_name, task, success=True,
               dimension_contributions={"semantic": 0.9, ...})
learner.stats()                                # weight changes + success rates
learner.reset()                                # restore original weights
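
For readers unfamiliar with UCB exploration, here is a minimal sketch of the textbook UCB1 rule, with each skill treated as a bandit arm. It is not the FeedbackLearner implementation: the class name, the exploration constant, and the simulated environment are assumptions for illustration.

```python
import math


class UCB1:
    """Minimal UCB1 bandit over a fixed set of arm names."""

    def __init__(self, arms, c=math.sqrt(2)):
        self.c = c
        self.pulls = {arm: 0 for arm in arms}
        self.wins = {arm: 0.0 for arm in arms}

    def select(self):
        # Try every arm once before trusting any estimate.
        for arm, count in self.pulls.items():
            if count == 0:
                return arm
        total = sum(self.pulls.values())

        def upper_bound(arm):
            mean = self.wins[arm] / self.pulls[arm]
            bonus = self.c * math.sqrt(math.log(total) / self.pulls[arm])
            return mean + bonus

        return max(self.pulls, key=upper_bound)

    def record(self, arm, success):
        self.pulls[arm] += 1
        self.wins[arm] += 1.0 if success else 0.0


bandit = UCB1(["rollback", "deploy", "monitor"])
# Simulated environment (an assumption): only "rollback" ever succeeds.
for _ in range(200):
    arm = bandit.select()
    bandit.record(arm, success=(arm == "rollback"))
print(bandit.pulls)
```

The bonus term shrinks as an arm is pulled more often, so rarely tried skills still get occasional traffic while the best performer receives most of it.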

WeavePlanner: Multi-skill DAG orchestration

planner = WeavePlanner(router)
planner.register_chain_simple("pipeline", ["fetch", "parse", "store"])
planner.register_chain("ci-cd", ["deploy", "monitor"],
    conditions={1: ("'error' in str(output)", "rollback")})
planner.plan("run the ci-cd pipeline")          # → WeaveChain
planner.plan_deep("complex task", max_depth=3)  # → list[list[str]]
planner.record_chain_outcome("pipeline", True)  # track chain success
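
The conditional entry in `register_chain` reads as "after step 1, if the output contains an error, run rollback". Here is a minimal sketch of that behavior for a linear chain only (no parallel branches). It uses callables where the API above takes a string expression, and `run_chain`, the handlers, and their outputs are hypothetical.

```python
def run_chain(steps, handlers, conditions=None):
    """Run steps in order, passing each output to the next step.

    conditions maps a step index to (predicate, fallback_step). When the
    predicate matches that step's output, the fallback runs and the chain stops.
    """
    conditions = conditions or {}
    trace = []
    output = None
    for index, name in enumerate(steps):
        output = handlers[name](output)
        trace.append(name)
        if index in conditions:
            predicate, fallback = conditions[index]
            if predicate(output):
                output = handlers[fallback](output)
                trace.append(fallback)
                break  # stop the chain once the fallback has run
    return trace, output


# Hypothetical skill handlers; each receives the previous step's output.
handlers = {
    "deploy": lambda _: "deployed build 42",
    "monitor": lambda prev: "error: health check failed",
    "rollback": lambda prev: "rolled back",
}
trace, output = run_chain(
    ["deploy", "monitor"],
    handlers,
    conditions={1: (lambda out: "error" in str(out), "rollback")},
)
print(trace)  # → ['deploy', 'monitor', 'rollback']
```

Passing predicates as functions avoids evaluating strings at runtime, which is a design choice of this sketch and not a statement about how the library evaluates its conditions.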

annotate: Skill metadata management

from skill_weave import annotate_skill, inject_annotations, load_skill_metadata

dims = annotate_skill("path/to/SKILL.md")       # generate 4-dim metadata
inject_annotations("path/to/SKILL.md", dims)    # write into frontmatter
skills = load_skill_metadata("/skill/dir")      # scan all skill metadata

📊 Real-World Performance

Deployed in production routing 141 skills across 63 categories:

           BM25 only           + LLM Re-rank
Accuracy   ████████░░ 69.6%    ██████████ 95.7%
Tokens     0                   ~2K per query
Latency    <50ms               ~1s

23-query benchmark included in the repo (benchmark/queries.json).
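
A quick arithmetic check: on 23 queries, the two reported accuracies are consistent with 16 and 22 correct top-1 answers. That is an inference from the numbers, not something the README states. The sketch below shows the calculation; the `accuracy` helper is hypothetical and takes plain lists, since the schema of benchmark/queries.json is not shown here.

```python
def accuracy(predictions, expected):
    """Fraction of queries whose top-ranked skill matches the expected one."""
    if not expected or len(predictions) != len(expected):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == e for p, e in zip(predictions, expected))
    return hits / len(expected)


total = 23
for correct in (16, 22):
    print(f"{correct}/{total} = {correct / total:.1%}")
# → 16/23 = 69.6%
# → 22/23 = 95.7%
```

With only 23 queries, one answer moves the score by about 4.3 percentage points, so small differences between configurations should be read with care.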


🗺️ Architecture

skill_weave/
├── router.py        SkillRouter - 4-dim weighted scoring (zero-dependency)
├── advanced.py      SkillWeave  - 3-stage pipeline + BM25 + TreeFilter
├── annotate.py      Annotation  - 4-dim metadata generation + injection
├── learner.py       Learning    - UCB bandit + gradient weight adjustment
└── weaver.py        Weaving     - DAG orchestration (chains, parallel, conditional)

benchmark/queries.json     23 real-world routing test cases
notebooks/demo.ipynb       Colab: try before you read
tests/                     20 tests, all passing

🤝 Contributing

Skill Weave is built by FeiMing Studio, a small team of humans and AI agents building together.

We welcome contributions. Before diving in:

  1. Browse the Colab demo: understand what the project does
  2. Read CONTRIBUTING.md: setup, conventions, commit style
  3. Open an issue: discuss before coding large changes
  4. Run the tests: python tests/test_router.py && python tests/test_learner.py

We use conventional commits (feat:, fix:, docs:) and squash-merge to main.

Development Setup

git clone https://github.com/Hxh-yaoxing/skill-weave.git
cd skill-weave
pip install -e ".[dev]"
python tests/test_router.py   # 9 tests
python tests/test_learner.py  # 11 tests

📊 Status & Roadmap

Active development. Core pipeline stable. Used daily in production.

| Version | Date | Highlights |
|---|---|---|
| 0.3.0 | 2026-06-05 | Active learning (UCB), skill weaving (DAG), 20 tests |
| 0.2.0 | 2026-06-05 | 3-stage pipeline, BM25, TreeFilter, annotation, benchmark |
| 0.1.0 | 2026-06-05 | Core SkillRouter with 4-dim weighted scoring |

Up next: v0.4, with async routing and embedding backends. Full changelog →


👥 Authors

| Role | Name |
|---|---|
| Engine & Architecture | Hermes 深蓝 (@Hxh-yaoxing) |
| Creative Direction & Co-creation | 曜行 (He Xuheng) |
| Initial Scaffold | Hermes 楚乔 |
| Infrastructure | FeiMing Studio |

We're real people (and agents) who iterate fast, communicate openly, and ship on weekends. If you open an issue, a human will respond.


📄 License

MIT: use it, fork it, ship it.


FeiMing Studio: where humans and agents build together.
