Architect: taxonomy-driven skill recommendation engine for AI agent builders

Skills Tree

📆 This Week's Highlights: June 15, 2026

🔥 Most Active Skills

  • Readme: 1 PR
  • Xquik Api: 1 PR

The AI Agent Skill OS: Build Smarter Agents, Faster

360 skills across 17 categories. Versioned, benchmarked, and openly evolving.
Stop rediscovering. Start building on what the community has already proven.

50 skills are battle-tested today. 308 are stubs waiting for a real example, real I/O, and real failure modes; see meta/QUALITY-REPORT.md for the full list. PRs that turn a stub into a production-ready entry are the highest-impact contribution you can make.


The Problem

Every AI agent builder rediscovers the same skills from scratch.

Someone learns RAG the hard way. Someone else figures out memory injection at 2am. A third person spends a week benchmarking ReAct vs LATS and never shares the results. A fourth discovers the same failure modes you already hit last month.

That collective knowledge is disappearing into Slack threads, private repos, and Twitter bookmarks.

Skills Tree fixes that.


What This Is

Skills Tree is the shared operating system for AI agent capabilities.

A living, versioned, community-powered index of everything an agent can do. At its best, each entry is documented with working code, real benchmarks, failure modes, and evolution history.

We don't pretend every entry is finished. Battle-tested skills (badged 🟢 verified) are production-ready and copy-paste safe. Yellow / unscanned skills are the community's TODO list: open files, real problem space, and the clearest signal of where contributions are most useful.

It's not a list. It's infrastructure being built in public.


🚀 Start Here: Battle-Tested Skills

If you're new, read these first. Each one ships with runnable code, typed I/O, failure modes, and a model-comparison table.

Agent reasoning loops

  • ReAct: Thought → Action → Observation, the foundation of tool-using agents
  • Chain of Thought: explicit step-by-step reasoning + self-consistency
  • Tree of Thought: branched reasoning with scoring + beam search
  • Reflection / Reflexion: critique → revise loop on top of any output
  • Self-Consistency: sample N chains, majority-vote
  • Planning: typed, DAG-validated plans your executor can run
  • Task Decomposition: break a goal into atomic, runnable subtasks
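
To make the Planning entry concrete, here is a minimal sketch of what "DAG-validated" can mean. It is an illustration under stated assumptions, not the repository's implementation: the `validate_plan` function and the step names are hypothetical, and a plan is assumed to be a mapping from each step to the set of steps it depends on. It relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def validate_plan(steps: dict[str, set[str]]) -> list[str]:
    """Return a valid execution order for the plan, or raise ValueError.

    `steps` maps each step name to the set of steps it depends on
    (hypothetical plan format, chosen for this sketch).
    """
    known = set(steps)
    # Reject dependencies on steps that are not part of the plan.
    for name, deps in steps.items():
        missing = deps - known
        if missing:
            raise ValueError(f"step {name!r} depends on unknown steps: {sorted(missing)}")
    try:
        # static_order() yields each step only after all of its dependencies.
        return list(TopologicalSorter(steps).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"plan contains a cycle: {exc.args[1]}") from exc


plan = {
    "search": set(),
    "summarize": {"search"},
    "cite": {"search", "summarize"},
}
order = validate_plan(plan)
print(order)
```

An executor can then run the steps in the returned order, and a plan with a cycle or a dangling dependency is rejected before anything executes.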

Retrieval & memory

  • RAG: chunk → embed → retrieve → cite, end-to-end with confidence + threshold
  • Vector Store Retrieval: typed top-k cosine search with metadata filtering
  • Embedding Generation: batched, content-hash-cached, Matryoshka-truncatable
  • Memory Injection: top-K user memories per turn
  • Short-Term Memory: token-budgeted rolling window (the foundation for everything else)
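
As an illustration of the Short-Term Memory entry, the following sketch shows one way a token-budgeted rolling window could work. The `RollingWindow` class is hypothetical, and `count_tokens` is a deliberate simplification: it splits on whitespace, where a real agent would use the model's tokenizer.

```python
from collections import deque


def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


class RollingWindow:
    """Keep the most recent messages that fit within a token budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.messages: deque[dict[str, str]] = deque()
        self.total = 0

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        self.total += count_tokens(content)
        # Evict the oldest messages until the window fits again.
        # The newest message is always kept, even if it alone exceeds the budget.
        while self.total > self.max_tokens and len(self.messages) > 1:
            oldest = self.messages.popleft()
            self.total -= count_tokens(oldest["content"])


window = RollingWindow(max_tokens=6)
window.add("user", "my name is Ada")       # 4 tokens
window.add("assistant", "hello Ada")       # 2 tokens, total is 6
window.add("user", "what is my name")      # 4 tokens, evicts the first message
print([m["content"] for m in window.messages])
```

The trade-off is visible in the usage lines: the oldest turn is dropped first, which is why longer-lived facts are usually handled by a separate memory skill.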

Calling LLMs in production

  • Function / Tool Calling: the primitive that turns an LLM into an agent
  • OpenAI API: chat, structured outputs, tools, embeddings, streaming, retry
  • Anthropic API: Claude with tool loop, prompt caching, streaming

Working with text

  • Translation: placeholder-safe MT with glossary + tone
  • Paraphrasing: simplify / formalize / diversify
  • OCR: VLM + classical OCR with confidence-based human-review routing

Code

  • Code Generation: spec → AST-validated source with self-repair on failure
  • Bug Fixing: agentic loop of read → patch → test → repeat until green
  • Code Review: automated critique with severity tiers

Web

  • Web Search: Tavily/Serper/Brave with recency + host allowlist + TTL cache
  • Web Scraping: trafilatura + BS4 fallback, metadata, redirect-safe
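
The TTL cache mentioned in the Web Search entry can be sketched in a few lines. This is an assumption-laden illustration: `TTLCache` and `fake_search` are hypothetical names, and the fetch callable stands in for whichever search provider is used; no real provider API is called here.

```python
import time
from typing import Callable


class TTLCache:
    """Cache results per query and reuse them until they are `ttl_seconds` old."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the cache can be tested without sleeping
        self._store: dict[str, tuple[float, list[str]]] = {}

    def get_or_fetch(self, query: str, fetch: Callable[[str], list[str]]) -> list[str]:
        now = self.clock()
        hit = self._store.get(query)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # entry is still fresh
        results = fetch(query)  # missing or expired: ask the provider again
        self._store[query] = (now, results)
        return results


calls: list[str] = []


def fake_search(query: str) -> list[str]:
    # Hypothetical stand-in for a real search provider call.
    calls.append(query)
    return [f"result for {query}"]


fake_now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
cache.get_or_fetch("agent memory", fake_search)
cache.get_or_fetch("agent memory", fake_search)  # served from the cache
fake_now[0] = 61.0
cache.get_or_fetch("agent memory", fake_search)  # expired, fetched again
print(len(calls))
```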

Security

Action execution

  • File Write: atomic, crash-safe file writes for agents
  • HTTP Request: production HTTP with idempotency, retry-on-idempotent-only, header redaction
  • Dependency Auditor: vulnerability + license + freshness audit
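
For the File Write entry, a common way to get atomic, crash-safe writes is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that pattern with the standard library only; `atomic_write` is a hypothetical helper, not the code shipped in the skill file.

```python
import os
import tempfile
from pathlib import Path


def atomic_write(path: str, data: str, encoding: str = "utf-8") -> None:
    """Write `data` to `path` so readers see either the old file or the new one."""
    target = Path(path)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, target)  # replaces the destination in one step
    except BaseException:
        # Remove the temp file if anything failed before the rename completed.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as workdir:
    out_path = os.path.join(workdir, "notes.txt")
    atomic_write(out_path, "first draft")
    atomic_write(out_path, "second draft")
    final_text = Path(out_path).read_text(encoding="utf-8")
    leftover = os.listdir(workdir)
print(final_text, leftover)
```

If the process dies mid-write, the target still holds its previous contents and only a stray `.tmp` file may remain.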

The full battle-tested set is auto-listed in meta/QUALITY-REPORT.md. The same report names every stub that needs upgrading; those are the highest-impact PRs you can submit.


What's Inside

skills-tree/
│
├── skills/          → 360 atomic skill files (50 battle-tested, 308 stubs awaiting upgrade)
│                     run `python3 tools/check_skill_quality.py` for the live count
├── systems/         → Multi-skill workflows (research agent, code reviewer...)
├── blueprints/      → Copy-paste production architectures
├── benchmarks/      → Head-to-head, reproducible skill comparisons
├── labs/            → Experimental & bleeding-edge capabilities
│
├── docs/            → Interactive web UI (GitHub Pages)
├── i18n/            → Localized READMEs (Arabic, Chinese, Spanish, German, French, Hindi, Japanese, Korean, Portuguese, Russian)
├── meta/            → Schema, glossary, frameworks, roadmap, changelog
└── requirements.txt → Pinned Python deps for CI workflows

🗂️ The 17 Skill Categories

| # | Category | Skills | What It Covers |
|---|----------|--------|----------------|
| 01 | 👁️ Perception | 36 | Text, images, PDFs, code, sensors, databases, screens |
| 02 | 🧠 Reasoning | 39 | Planning, deduction, abduction, causal chains, commonsense |
| 03 | 🗄️ Memory | 19 | Working, episodic, semantic, vector, injection, forgetting |
| 04 | ⚡ Action Execution | 21 | File I/O, HTTP, email, shell, database writes |
| 05 | 💻 Code | 28 | Write, run, debug, review, refactor, test, deploy |
| 06 | 💬 Communication | 15 | Summarize, translate, draft, argue, adapt tone |
| 07 | 🔧 Tool Use | 32 | APIs: GitHub, Slack, Stripe, OpenAI, MCP, A2A |
| 08 | 🎭 Multimodal | 14 | Images, audio, video, VQA, 3D, charts |
| 09 | 🤖 Agentic Patterns | 23 | ReAct, CoT, ToT, MCTS, LATS, RAG, Debate |
| 10 | 🖥️ Computer Use | 20 | Click, type, scroll, OCR, terminal, VM, a11y tree |
| 11 | 🌐 Web | 17 | Search, scrape, crawl, login, fill forms, parse RSS |
| 12 | 📊 Data | 18 | ETL, SQL, embeddings, time series, anomaly detection |
| 13 | 🎨 Creative | 14 | Copywriting, image prompts, SVG, music, scripts |
| 14 | 🔒 Security | 13 | Sandboxing, secret scanning, audit logs, rollback |
| 15 | 🎼 Orchestration | 22 | Multi-agent, state machines, retry, consensus |
| 16 | Domain-Specific | 28 | Medical, legal, finance, DevOps, education, science |
| 17 | 🛠️ Infrastructure | 1 | Dependency auditing & supply-chain tooling (early) |

Counts above reflect skill files on disk and are auto-synced by tools/update_readme_counts.py (run nightly via update-skill-count.yml). If you spot a drift, open an issue.
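
To check a count by hand, a per-category tally can be approximated as follows. This is a rough sketch, not the logic of tools/update_readme_counts.py: `count_skills` is a hypothetical helper, and it assumes the `skills/<category>/<skill>.md` layout seen in paths such as skills/09-agentic-patterns/react.md. Any non-skill Markdown file in a category directory would be counted too.

```python
from collections import Counter
from pathlib import Path


def count_skills(skills_root: str) -> Counter:
    """Count Markdown files per category directory directly under `skills_root`."""
    counts: Counter = Counter()
    root = Path(skills_root)
    if not root.is_dir():
        return counts  # nothing to count if the directory is missing
    # One level of category directories, one Markdown file per skill (assumed layout).
    for md_file in root.glob("*/*.md"):
        counts[md_file.parent.name] += 1
    return counts
```

Calling `count_skills("skills")` from the repository root returns a `Counter` keyed by category directory name; `sum(counts.values())` gives the total to compare against the table.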


A Skill in 60 Seconds

Every skill file is self-contained and production-ready:

# Memory Injection
Category: memory | Level: intermediate | Stability: stable | Version: v2

## Description
Dynamically inject relevant past memories into an agent's system prompt
before each turn, giving the model user context without filling the window.

## Example
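```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-opus-4-5", max_tokens=1024,  # both are required by the API
    system=f"{base_system}\n\n## Memory\n{top_k_memories}",
    messages=[{"role": "user", "content": user_message}],
)
```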

## Benchmarks  → benchmarks/memory/injection-strategies.md
## Related     → working-memory.md · rag.md · vector-store-retrieval.md
## Changelog   → v1 (2025-03) · v2 (2026-04, added retrieval scoring)
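
The example above leaves `top_k_memories` undefined. One way to produce it is to rank stored memories by cosine similarity to the current query and keep the best few. The sketch below assumes embeddings are already available as plain lists of floats; the function names, the toy two-dimensional vectors, and the `min_score` threshold are all illustrative, not taken from the skill file.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_memories(query_vec, memories, k=3, min_score=0.0):
    """Return the texts of the k memories most similar to the query embedding."""
    scored = [(cosine(query_vec, vec), text) for text, vec in memories]
    scored = [item for item in scored if item[0] >= min_score]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_system_prompt(base_system: str, selected: list[str]) -> str:
    # Skip the Memory section entirely when nothing relevant was found.
    if not selected:
        return base_system
    bullets = "\n".join(f"- {memory}" for memory in selected)
    return f"{base_system}\n\n## Memory\n{bullets}"


# Toy data: (memory text, embedding). Real embeddings have many more dimensions.
memories = [
    ("Prefers Python", [1.0, 0.0]),
    ("Lives in Cairo", [0.0, 1.0]),
    ("Uses LangGraph", [0.9, 0.1]),
]
selected = top_k_memories([1.0, 0.0], memories, k=2)
prompt = build_system_prompt("You are a helpful assistant.", selected)
print(prompt)
```

Capping the list at `k` keeps the injected section small, which is the point of the skill: context for the model without filling the window.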

Every skill includes:

  • ✅ What it does and why it matters
  • ✅ Typed inputs/outputs
  • ✅ Runnable Python code (claude-opus-4-5 / gpt-4o)
  • ✅ Frameworks table (LangChain, LangGraph, CrewAI, mem0...)
  • ✅ Failure modes and edge cases
  • ✅ Related skills cross-links
  • ✅ Version history

Skill Versioning: How Evolution Works

Skills are not static files. They evolve as the community learns:

v1 - Initial entry: description + minimal example
v2 - Enriched: better example + failure modes + related skills
v3 - Battle-tested: benchmarks + model comparison + production notes

To upgrade a skill:

  1. Bump the version in frontmatter
  2. Add a changelog entry explaining what improved
  3. Open a PR titled improve: skill-name - v1 → v2

The best versions surface naturally through PR merge frequency and inclusion in Systems + Blueprints.


🗺️ Systems: Multi-Skill Workflows

See how skills combine into real, working agent pipelines:

| System | Skills Used | Use Case |
|--------|-------------|----------|
| Research Agent | Web search + RAG + Summarize + Cite | Deep research automation |
| Coding Agent | Code reading + Write + Debug + Test | End-to-end code generation |
| Code Reviewer | Code reading + Reasoning + Comment gen | Automated PR reviews |
| Data Pipeline Agent | DB reading + ETL + Anomaly detection | Automated data ops |
| Customer Support Bot | Memory injection + Intent + Response gen | Personalized support |
| Computer Use Agent | Screen reading + OCR + Click + Type | Full GUI automation |
| Data Analyst | SQL + Charts + Summarize + Insight gen | Automated data analysis |
| Voice Agent | Audio transcription + NLU + TTS | Real-time voice interaction |

🏗️ Blueprints: Production Architectures

Copy-paste architectures for the most common agent patterns:

| Blueprint | Description |
|-----------|-------------|
| RAG Stack | Embed → store → retrieve → generate, fully wired |
| Multi-Agent Workflow | Sequential orchestration with handoffs |
| Multi-Agent Mesh | N specialists + orchestrator, parallel execution |
| Computer Use Browser | Browser automation via Playwright + vision |
| Human-in-the-Loop | Approval gates, escalation, audit trails |
| Self-Healing Agent | Error detection, retry logic, rollback |
| Memory-First Agent | Profile + episodic + vector memory combined |

📊 Benchmarks: Real Numbers, Reproducible

We test so you don't have to:

| Benchmark | Winner | Margin |
|-----------|--------|--------|
| ReAct vs LATS (HotpotQA) | LATS | +8.3% accuracy |
| RAG retrieval strategies | HyDE | +12% recall |
| Memory injection methods | Top-K semantic | Best cost/quality ratio |
| Function calling comparison | Claude 3.7 | +6% on tool accuracy |

Every benchmark includes methodology, dataset, and reproducible test scripts.


🏆 This Week's Highlights

Auto-updated weekly · Full leaderboard →

🔥 Most Active Skills

  • skills/09-agentic-patterns/react.md: 12 community improvements this month
  • skills/03-memory/memory-injection.md: v2 with retrieval scoring
  • skills/02-reasoning/causal.md: new benchmark comparison added

⚡ Battle-Tested (used in 10+ public projects)

ReAct · Chain of Thought · RAG Pipeline · Memory Injection · Tool Use

🔬 Hot in Labs

  • labs/reasoning/tree-of-agents.md: multi-agent tree search
  • labs/memory/episodic-compression.md: lossy-but-useful memory compression
  • labs/tool-use/adaptive-tool-selection.md: dynamic tool filtering for large registries

🤝 How to Contribute

Four types of contributions, all valued:

| Type | What It Is | PR Title Format |
|------|------------|-----------------|
| New Skill | A capability not yet indexed | feat: add [skill] to [category] |
| Skill Upgrade | Bump v1→v2 with better content | improve: [skill] - v1→v2 |
| Benchmark | Head-to-head with real numbers | benchmark: [skill-a] vs [skill-b] |
| System / Blueprint | Multi-skill workflow or architecture | system: add [name] |

git clone https://github.com/SamoTech/skills-tree.git
cp meta/skill-template.md skills/05-code/my-new-skill.md
# Fill in every section → open a PR

Quality Rules

  • ❌ No generic prompts or vague descriptions
  • ❌ No skills without a working code example
  • ✅ Must solve a real, specific problem
  • ✅ Must be structured and reusable
  • ✅ Must include inputs, outputs, and at least one runnable example

Full guide: CONTRIBUTING.md


Quick Start

# Clone
git clone https://github.com/SamoTech/skills-tree.git

# Find a skill by keyword
grep -r "memory injection" skills/ --include="*.md" -l

# Read a full system end-to-end
cat systems/research-agent.md

# See benchmark results
cat benchmarks/tool-use/function-calling-comparison.md

Or browse the live UI →
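
The keyword search from the Quick Start can also be run from Python, which is handy inside an agent or a script. The sketch below is a rough equivalent of the `grep -r ... -l` command above; `find_skills` is a hypothetical helper, and unlike plain `grep` it matches case-insensitively.

```python
from pathlib import Path


def find_skills(keyword: str, root: str = "skills") -> list[str]:
    """Return paths of Markdown files under `root` whose text contains `keyword`."""
    base = Path(root)
    if not base.is_dir():
        return []  # no skills directory at this location
    needle = keyword.lower()
    matches = []
    for path in sorted(base.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():  # case-insensitive substring match
            matches.append(str(path))
    return matches
```

Run from the repository root, `find_skills("memory injection")` returns the matching file paths in sorted order.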


Who This Is For

🏗️  Agent Builders       → Production skill patterns, ready to use today
🔬  AI Researchers        → Benchmarks, taxonomy, and full capability coverage
📐  System Architects     → Blueprints for multi-agent production systems
🎓  Learners              → Structured path from basic skills → advanced systems
🤝  Contributors          → A community that improves everything together

🗺️ Roadmap

See the full plan: meta/ROADMAP.md

Near-term (v2.x):

  • Skill dependency graph: visual map of how skills relate
  • Skill Paths: curated learning tracks (e.g., "Build a Research Agent in 5 skills")
  • JSON/YAML export of all skill metadata for programmatic use
  • Community skill ratings and upvotes
  • Auto-leaderboard: Top Skills This Week, Most Improved, Battle-Tested

Medium-term (v3.0):

  • CLI: skills-tree search "memory injection" → returns ranked results
  • LangChain Hub / MCP registry integration
  • ✅ Localization: Arabic, Chinese, Spanish READMEs (shipped in v2.1)
  • Automated changelog generation on PR merge

Long-term vision:

  • Skills Tree becomes the canonical reference for AI agent capabilities
  • Every major agent framework links here as the skill index
  • 1000+ skills, all battle-tested, all benchmarked

Vision

AI agents are becoming teammates, not tools.

Skills Tree is the shared foundation they run on: a living OS of capabilities that the community builds, tests, and evolves together.

Every skill added here saves every agent builder who comes after you. Every benchmark run here prevents someone else from wasting a week. Every system documented here becomes a launchpad for the next builder.

This is not a repo. It's infrastructure for the AI-native era.


The AI Agent Skill OS: built by the community, for the community.
