mcp-skills

Dynamic RAG-powered skills for code assistants via Model Context Protocol (MCP)

mcp-skills is a standalone Python application that provides intelligent, context-aware skills to code assistants through hybrid RAG (vector + knowledge graph). Unlike static skills that load at startup, mcp-skills enables runtime skill discovery, automatic recommendations based on your project's toolchain, and dynamic loading optimized for your workflow.

Key Features

  • 🚀 Zero Config: mcp-skills setup handles everything automatically
  • 🧠 Intelligent: Auto-detects your project's toolchain (Python, TypeScript, Rust, Go, etc.)
  • 🔍 Dynamic Discovery: Combines vector similarity with a knowledge graph to surface more relevant skills
  • 📦 Multi-Source: Pulls skills from multiple git repositories
  • ⚡ On-Demand Loading: Skills loaded when needed, not all at startup
  • 🔌 MCP Native: First-class Model Context Protocol integration

Installation

From PyPI

pip install mcp-skillkit

From Source

git clone https://github.com/bobmatnyc/mcp-skills.git
cd mcp-skills
pip install -e .

Local Development (Without Installation)

For development, you can run mcp-skills directly from source without installing:

# Use the development script
./mcp-skills-dev --help
./mcp-skills-dev search "python testing"
./mcp-skills-dev setup --auto

The mcp-skills-dev script:

  • Runs the package from source code (not installed version)
  • Uses local virtual environment if available
  • Sets up PYTHONPATH automatically
  • Passes all arguments through to the CLI

This is useful for:

  • Testing changes without reinstalling
  • Developing new features
  • Debugging with source code
  • Contributing to the project

Note: For production use, install the released package with pip install mcp-skillkit. The pip install -e . form is an editable install intended for development.

First-Run Requirements

Important: On first run, mcp-skills will automatically download a ~90MB sentence-transformer model (all-MiniLM-L6-v2) for semantic search. This happens during the initial mcp-skills setup or when you first run any command that requires indexing.

Requirements:

  • ✅ Active internet connection
  • ✅ ~100MB free disk space
  • ✅ 2-5 minutes for initial download (depending on connection speed)

Model Caching:

  • Models are cached in ~/.cache/huggingface/ for future use
  • Subsequent runs use the cached model (no download required)
  • The cache persists across mcp-skills updates
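To confirm that the model is already cached (for example, before going offline), a short script can look for the model's folder in the hub cache. This is a minimal sketch, assuming the standard Hugging Face hub layout (hub/models--<org>--<name> under the cache root) and that the cache root is HF_HOME when set, otherwise ~/.cache/huggingface. The helper names are illustrative and are not part of mcp-skills.

```python
import os
from pathlib import Path

MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"


def hf_cache_root(env=None):
    """Return the cache root: $HF_HOME if set, else ~/.cache/huggingface."""
    env = os.environ if env is None else env
    return Path(env.get("HF_HOME") or Path.home() / ".cache" / "huggingface")


def model_cache_dir(model_id=MODEL_ID, env=None):
    """Map a model ID to its expected folder in the hub cache (assumed layout)."""
    folder = "models--" + model_id.replace("/", "--")
    return hf_cache_root(env) / "hub" / folder


def is_model_cached(model_id=MODEL_ID, env=None):
    """True if the model folder exists and holds at least one snapshot."""
    snapshots = model_cache_dir(model_id, env) / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


if __name__ == "__main__":
    status = "cached" if is_model_cached() else "not cached"
    print(f"{model_cache_dir()}: {status}")
```

A folder that exists but contains a partial download would still be reported as cached, so treat the result as a quick check rather than a guarantee.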

Quick Start

1. Setup

Run the interactive setup wizard to configure mcp-skills for your project:

mcp-skills setup

Note: The first run will download the embedding model (~90MB) before proceeding with setup. Allow 2-5 minutes for this initial download. Subsequent runs will be much faster.

This will:

  • Download embedding model (first run only)
  • Detect your project's toolchain
  • Clone relevant skill repositories
  • Build vector + knowledge graph indices
  • Configure MCP server integration
  • Validate the setup

2. Start the MCP Server

mcp-skills serve

The server starts and exposes skills to your code assistant via MCP.

3. Use with Claude Code

Skills are automatically available in Claude Code. Try:

  • "What testing skills are available for Python?"
  • "Show me debugging skills"
  • "Recommend skills for my project"

Project Structure

~/.mcp-skills/
├── config.yaml              # User configuration
├── repos/                   # Cloned skill repositories
│   ├── anthropics/skills/
│   ├── obra/superpowers/
│   └── custom-repo/
├── indices/                 # Vector + KG indices
│   ├── vector_store/
│   └── knowledge_graph/
└── metadata.db             # SQLite metadata

Architecture

mcp-skills uses a hybrid RAG approach combining:

Vector Store (ChromaDB):

  • Fast semantic search over skill descriptions
  • Embeddings generated with sentence-transformers
  • Persistent local storage with minimal configuration

Knowledge Graph (NetworkX):

  • Skill relationships and dependencies
  • Category and toolchain associations
  • Related skill discovery
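How the two indices can complement each other is easiest to see on toy data. The sketch below is illustrative only and does not reproduce mcp-skills internals: it uses hand-written 3-dimensional vectors in place of sentence-transformers embeddings, a plain dict in place of a NetworkX graph, and a hypothetical graph_weight parameter to blend the two signals.

```python
import math

# Toy embeddings (stand-ins for real sentence-transformers vectors).
EMBEDDINGS = {
    "pytest-basics": [0.9, 0.1, 0.0],
    "test-driven-development": [0.7, 0.3, 0.1],
    "systematic-debugging": [0.2, 0.9, 0.1],
    "docker-deploy": [0.0, 0.1, 0.9],
}

# Toy relationship graph: skill -> related skills (edges listed both ways).
RELATED = {
    "pytest-basics": {"test-driven-development"},
    "test-driven-development": {"pytest-basics", "systematic-debugging"},
    "systematic-debugging": {"test-driven-development"},
    "docker-deploy": set(),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, top_k=3, graph_weight=0.2):
    """Rank skills by vector similarity, boosted by their best-matching neighbor."""
    vector_scores = {name: cosine(query_vec, vec) for name, vec in EMBEDDINGS.items()}
    combined = {}
    for name, score in vector_scores.items():
        neighbor_best = max((vector_scores[n] for n in RELATED[name]), default=0.0)
        combined[name] = score + graph_weight * neighbor_best
    return sorted(combined, key=combined.get, reverse=True)[:top_k]


if __name__ == "__main__":
    print(hybrid_search([1.0, 0.0, 0.0]))
```

With graph_weight set to 0 the ranking reduces to pure vector search; raising it lets a skill benefit from being related to a strong match even when its own description is a weaker fit.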

Toolchain Detection:

  • Automatic detection of programming languages
  • Framework and build tool identification
  • Intelligent skill recommendations
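One common way to detect a toolchain is to look for well-known marker files at the project root. The following is a minimal sketch of that idea, not the detection logic mcp-skills actually ships; the MARKERS table and the detect_toolchains name are assumptions made for illustration.

```python
from pathlib import Path

# Marker files commonly found at a project root (illustrative, not exhaustive).
MARKERS = {
    "pyproject.toml": "Python",
    "requirements.txt": "Python",
    "package.json": "JavaScript",
    "tsconfig.json": "TypeScript",
    "Cargo.toml": "Rust",
    "go.mod": "Go",
}


def detect_toolchains(project_root):
    """Return the sorted toolchains whose marker files exist in project_root."""
    root = Path(project_root)
    found = {lang for marker, lang in MARKERS.items() if (root / marker).is_file()}
    return sorted(found)


if __name__ == "__main__":
    print(detect_toolchains("."))
```

Framework detection would need to go one step further and inspect the contents of those files (for example, declared dependencies), which this sketch leaves out.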

Configuration

Global Configuration (~/.mcp-skills/config.yaml)

repositories:
  - url: https://github.com/anthropics/skills.git
    priority: 100
    auto_update: true

vector_store:
  backend: chromadb
  embedding_model: all-MiniLM-L6-v2

server:
  transport: stdio
  log_level: info

Project Configuration (.mcp-skills.yaml)

project:
  name: my-project
  toolchain:
    primary: Python
    frameworks: [Flask, SQLAlchemy]

auto_load:
  - systematic-debugging
  - test-driven-development

CLI Commands

# Setup and Configuration
mcp-skills setup                    # Interactive setup wizard
mcp-skills config                   # Show configuration

# Server
mcp-skills serve                    # Start MCP server (stdio)
mcp-skills serve --http             # Start HTTP server
mcp-skills serve --dev              # Development mode (auto-reload)

# Skills Management
mcp-skills search "testing"         # Search skills
mcp-skills list                     # List all skills
mcp-skills info pytest-skill        # Show skill details
mcp-skills recommend                # Get recommendations

# Repositories
mcp-skills repo add <url>           # Add repository
mcp-skills repo list                # List repositories
mcp-skills repo update              # Update all repositories

# Indexing
mcp-skills index                    # Rebuild indices
mcp-skills index --incremental      # Index only new skills

# Utilities
mcp-skills health                   # Health check
mcp-skills stats                    # Usage statistics

MCP Tools

mcp-skills exposes these tools to code assistants:

  • search_skills: Natural language skill search
  • get_skill: Load full skill instructions by ID
  • recommend_skills: Get recommendations for current project
  • list_categories: List all skill categories
  • update_repositories: Pull latest skills from git
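MCP clients invoke tools through JSON-RPC 2.0 requests using the tools/call method. The sketch below builds such a request for search_skills so the wire format is visible. The argument name "query" is an assumption, since the tool's real input schema is not listed here, and build_tool_call is a hypothetical helper. In practice an MCP client library sends these messages for you after the initialization handshake.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "query" is an assumed argument name; the tool's input schema
    # (returned by the MCP tools/list method) is the authority.
    request = build_tool_call("search_skills", {"query": "python testing"})
    print(json.dumps(request, indent=2))
```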

Development

Requirements

  • Python 3.11+
  • Git

Setup Development Environment

git clone https://github.com/bobmatnyc/mcp-skills.git
cd mcp-skills
pip install -e ".[dev]"

Running from Source (Development Mode)

Use the ./mcp-skills-dev script to run commands directly from source without installation:

# Run any CLI command
./mcp-skills-dev --version
./mcp-skills-dev search "debugging"
./mcp-skills-dev serve --dev

# All arguments pass through
./mcp-skills-dev info systematic-debugging

How it works:

  1. Sets PYTHONPATH to include src/ directory
  2. Activates local .venv if present
  3. Runs python -m mcp_skills.cli.main with all arguments
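The three steps above can be sketched in Python as follows. This is an approximation for illustration, not the contents of the actual mcp-skills-dev script: build_dev_command is a hypothetical helper, and the .venv/bin/python path assumes a POSIX virtual environment layout.

```python
import os
import sys
from pathlib import Path


def build_dev_command(repo_root, args, base_env=None):
    """Return (command, environment) for running the CLI from a source checkout."""
    root = Path(repo_root)
    env = dict(os.environ if base_env is None else base_env)

    # 1. Put src/ first on PYTHONPATH so the checkout shadows any installed copy.
    src = str(root / "src")
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = src + os.pathsep + existing if existing else src

    # 2. Prefer the interpreter from a local .venv when one exists.
    venv_python = root / ".venv" / "bin" / "python"
    python = str(venv_python) if venv_python.exists() else sys.executable

    # 3. Run the CLI module, passing every argument through unchanged.
    return [python, "-m", "mcp_skills.cli.main", *args], env


if __name__ == "__main__":
    command, _ = build_dev_command(".", sys.argv[1:])
    print(" ".join(command))
```

The returned pair can be handed to subprocess.run(command, env=environment) to launch the CLI.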

When to use:

  • ✅ Rapid iteration during development
  • ✅ Testing changes without reinstalling
  • ✅ Debugging with source code modifications
  • ❌ Production deployments (use pip install instead)

Installed vs. Source:

# Installed version (from pip install -e .)
mcp-skills search "testing"

# Source version (no installation required)
./mcp-skills-dev search "testing"

Run Tests

make quality

Performance Benchmarks

mcp-skills includes comprehensive performance benchmarks to track and prevent regressions:

# Run all benchmarks (includes slow tests)
make benchmark

# Run fast benchmarks only (skip 10k skill tests)
make benchmark-fast

# Compare current performance with baseline
make benchmark-compare

Benchmark Categories:

  • Indexing Performance: Measure time to index 100, 1000, and 10000 skills
  • Search Performance: Track query latency (p50, p95, p99) for vector and hybrid search
  • Database Performance: Benchmark SQLite operations (lookup, query, batch insert)
  • Memory Usage: Monitor memory consumption during large-scale operations

Baseline Thresholds:

  • Index 100 skills: < 10 seconds
  • Index 1000 skills: < 100 seconds
  • Search query (p50): < 100ms
  • Search query (p95): < 500ms
  • SQLite lookup by ID: < 1ms
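To make the percentile thresholds concrete, the sketch below computes p50 and p95 from a list of latency samples and compares them with the limits listed above. It uses the nearest-rank percentile definition, which is an assumption; the benchmark suite may compute percentiles differently. The function names are illustrative.

```python
import math

# Search thresholds from the list above, in milliseconds.
THRESHOLDS_MS = {50: 100.0, 95: 500.0}


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latencies(samples_ms, thresholds=THRESHOLDS_MS):
    """Return {percentile: (observed, limit, passed)} for each threshold."""
    report = {}
    for pct, limit in thresholds.items():
        observed = percentile(samples_ms, pct)
        report[pct] = (observed, limit, observed < limit)
    return report


if __name__ == "__main__":
    samples = [45.2, 47.8, 52.1, 89.5, 94.3, 105.2]  # made-up sample latencies
    for pct, (observed, limit, ok) in check_latencies(samples).items():
        print(f"p{pct}: {observed}ms (limit {limit}ms) {'OK' if ok else 'FAIL'}")
```

A slow tail affects p95 long before it moves p50, which is why the two percentiles carry separate limits.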

Benchmark Results:

  • Results are saved to .benchmarks/ directory (git-ignored)
  • Use make benchmark-compare to detect performance regressions
  • CI/CD can be configured to fail on significant performance degradation

Example Output:

-------------------------- benchmark: 15 tests --------------------------
Name (time in ms)                    Min      Max     Mean   StdDev
---------------------------------------------------------------------
test_vector_search_latency_100      45.2     52.1    47.8     2.1
test_lookup_by_id_single             0.3      0.8     0.4     0.1
test_hybrid_search_end_to_end       89.5    105.2    94.3     5.2
---------------------------------------------------------------------

Linting and Formatting

make lint-fix

Documentation

Architecture

See docs/architecture/README.md for detailed architecture design.

Skills Collections

See docs/skills/RESOURCES.md for a comprehensive index of skill repositories compatible with mcp-skills, including:

  • Official Anthropic skills
  • Community collections (obra/superpowers, claude-mpm-skills, etc.)
  • Toolchain-specific skills (Python, TypeScript, Rust, Go, Java)
  • Operations & DevOps skills
  • MCP servers that provide skill-like capabilities

Troubleshooting

Model Download Issues

If you encounter problems downloading the embedding model on first run:

1. Check Internet Connection

The model is downloaded from HuggingFace Hub. Verify you can reach:

curl -I https://huggingface.co

2. Manual Model Download

Pre-download the model manually if automatic download fails:

python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"

This downloads the model to ~/.cache/huggingface/ and confirms that it loads.

3. Proxy Configuration

If behind a corporate proxy, configure environment variables:

export HTTP_PROXY=http://proxy.example.com:8080
export HTTPS_PROXY=http://proxy.example.com:8080
export HF_ENDPOINT=https://huggingface.co  # Or your mirror

4. Offline/Air-Gapped Installation

For environments without internet access:

On a machine with internet:

  1. Download the model:

    python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')"
    
  2. Package the model cache:

    cd ~/.cache/huggingface
    tar -czf sentence-transformers-model.tar.gz hub/
    

On the air-gapped machine:

  1. Transfer sentence-transformers-model.tar.gz to the target machine

  2. Extract to the HuggingFace cache directory:

    mkdir -p ~/.cache/huggingface
    cd ~/.cache/huggingface
    tar -xzf /path/to/sentence-transformers-model.tar.gz
    
  3. Install mcp-skills (transfer wheel if needed):

    pip install mcp-skillkit  # Or install from wheel
    
  4. Verify the setup:

    mcp-skills health
    

5. Custom Cache Location

If you need to use a different cache directory:

export HF_HOME=/custom/path/to/cache
export TRANSFORMERS_CACHE=/custom/path/to/cache  # Deprecated in recent transformers releases; HF_HOME alone is usually enough
mcp-skills setup

6. Disk Space Issues

Check available space in the cache directory:

df -h ~/.cache/huggingface

The model requires ~90MB, but allow ~100MB for temporary files during download.
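The same check can be scripted, which helps when the cache directory has not been created yet and df would fail on the missing path. This is a minimal sketch using only the standard library; the function names and the default path are illustrative, and the 100MB figure is taken from the guidance above.

```python
import shutil
from pathlib import Path

REQUIRED_BYTES = 100 * 1024 * 1024  # ~100MB, per the guidance above


def nearest_existing(path):
    """Walk up from path until an existing directory is found."""
    current = Path(path).expanduser()
    while not current.exists():
        current = current.parent
    return current


def has_enough_space(cache_dir="~/.cache/huggingface", required=REQUIRED_BYTES):
    """True if the filesystem holding cache_dir has at least `required` bytes free."""
    free = shutil.disk_usage(nearest_existing(cache_dir)).free
    return free >= required


if __name__ == "__main__":
    print("enough space" if has_enough_space() else "not enough space")
```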

7. Permission Issues

Ensure the cache directory is writable:

mkdir -p ~/.cache/huggingface
chmod 755 ~/.cache/huggingface

Common Issues

"Connection timeout" during model download

  • Check internet connection and firewall settings
  • Try manual download (see step 2 above)
  • Configure proxy if behind corporate network (see step 3 above)

"No space left on device"

  • Check disk space: df -h ~/.cache
  • Clear the HuggingFace cache: rm -rf ~/.cache/huggingface/* (this deletes every cached model, not only old ones; each is re-downloaded on next use)
  • Use custom cache location (see step 5 above)

"Permission denied" on cache directory

  • Fix permissions: chmod 755 ~/.cache/huggingface
  • Or use custom cache location with proper permissions

Slow initial setup

  • First run downloads ~90MB and builds indices
  • Expected time: 2-10 minutes depending on connection speed and number of skills
  • Subsequent runs use cached model and are much faster

Getting Help

If you encounter issues not covered here:

  1. Check GitHub Issues
  2. Review logs: ~/.mcp-skills/logs/
  3. Run health check: mcp-skills health
  4. Open a new issue with:
    • Error message and stack trace
    • Output of mcp-skills --version
    • Operating system and Python version
    • Steps to reproduce

Contributing

Contributions welcome! Please read our contributing guidelines first.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run make quality to ensure tests pass
  5. Submit a pull request

License

MIT License - see LICENSE for details.

Status: ✅ v0.1.0 - Production Ready | Test Coverage: 85-96% | Tests: 48 passing
