
Semantic Search MCP Server

An MCP server that provides semantic code search using local embeddings. Search your codebase with natural language queries like "authentication middleware" or "database connection pooling".

Features

  • Hybrid search: Combines vector similarity (Jina code embeddings) with FTS5 keyword matching using Reciprocal Rank Fusion
  • 165+ languages: Tree-sitter parsing for Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, Ruby, PHP, and more
  • Incremental indexing: File watcher automatically detects additions, modifications, and deletions
  • Respects .gitignore: Honors your project's .gitignore files (including nested ones)
  • Auto-initialization: Model loads and codebase indexes in the background on server startup
  • Zero external APIs: All embeddings generated locally with FastEmbed
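
The hybrid search described above can be sketched in a few lines. This is an illustrative implementation of Reciprocal Rank Fusion with the commonly used constant k = 60, not the package's actual code; `rrf_merge` and its arguments are hypothetical names.

```python
# Illustrative Reciprocal Rank Fusion (RRF): merge two ranked result
# lists (vector similarity and FTS5 keyword hits) by summing the
# reciprocal of each document's rank in every list it appears in.
def rrf_merge(vector_hits, keyword_hits, k=60):
    """Fuse two ranked lists of chunk IDs into one score-ordered list."""
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Documents present in both lists accumulate two terms and rise.
    return sorted(scores, key=scores.get, reverse=True)
```

A chunk that ranks moderately in both lists can outscore one that tops only a single list, which is why RRF is a popular way to combine heterogeneous retrievers without score normalization.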

Installation

uv tool install semantic-search-mcp

Or with pip:

pip install semantic-search-mcp

Or run directly without installing:

uvx semantic-search-mcp

Quick Start

Add to Claude Code

Option A: Project-level config (recommended)

After installing with uv tool install or pip install, create .mcp.json in your project root:

{
  "mcpServers": {
    "semantic-search": {
      "command": "semantic-search-mcp"
    }
  }
}

Option B: CLI

claude mcp add semantic-search -- semantic-search-mcp

Option C: Without installing (ephemeral)

If you prefer not to install, use uvx to run in an ephemeral environment:

{
  "mcpServers": {
    "semantic-search": {
      "command": "uvx",
      "args": ["semantic-search-mcp"]
    }
  }
}

Usage

The server auto-initializes on startup: the model loads and the codebase is indexed in the background, so no manual setup step is required.

Available Tools

| Tool | Description |
| --- | --- |
| `search_code` | Search the codebase with natural language |
| `get_status` | Get server state, progress, and statistics |
| `pause_watcher` | Pause file watching (events are discarded) |
| `resume_watcher` | Resume file watching |
| `reindex` | Start a full reindex (runs in the background) |
| `cancel_indexing` | Cancel a running indexing job |
| `clear_index` | Wipe all indexed data |
| `exclude_paths` | Add paths to the ignore list (session-only) |
| `include_paths` | Remove paths from the exclusion list |

How It Works

Indexing

On startup, the server:

  1. Scans your codebase for supported file types
  2. Parses code into semantic chunks (functions, classes, methods) using Tree-sitter
  3. Generates embeddings for each chunk using Jina's code embedding model
  4. Stores everything in a local SQLite database with vector search support
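
The four steps above can be sketched as a single pass over the tree. This is an illustrative outline, not the server's actual code: `chunk_fn` and `embed_fn` are stand-ins for the Tree-sitter chunker and FastEmbed embedder, and the schema is heavily simplified.

```python
import hashlib
import sqlite3
from pathlib import Path

def index_codebase(root, db_path, chunk_fn, embed_fn,
                   exts=(".py", ".ts", ".go")):
    """Sketch of the indexing pipeline: scan, chunk, embed, store."""
    db = sqlite3.connect(db_path)
    db.execute("""CREATE TABLE IF NOT EXISTS chunks
                  (path TEXT, content TEXT, file_hash TEXT, embedding BLOB)""")
    for path in Path(root).rglob("*"):          # step 1: scan supported files
        if not path.is_file() or path.suffix not in exts:
            continue
        text = path.read_text(errors="ignore")
        file_hash = hashlib.sha256(text.encode()).hexdigest()
        for chunk in chunk_fn(text):            # step 2: semantic chunks
            vec = embed_fn(chunk)               # step 3: local embedding
            db.execute("INSERT INTO chunks VALUES (?, ?, ?, ?)",
                       (str(path), chunk, file_hash, vec))  # step 4: store
    db.commit()
    return db
```

Storing the file hash alongside each chunk is what makes the incremental re-indexing described below cheap: an unchanged hash means the file can be skipped.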

File Watching

The server monitors your codebase for changes in real-time:

| Event | Action |
| --- | --- |
| File created | Parsed, embedded, and added to the index |
| File modified | Re-indexed if the content hash changed |
| File deleted | Removed from the index |

Changes are debounced (default 1s) to batch rapid modifications.
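
The debounce behavior can be sketched as follows. This is an illustrative model of "collapse rapid events into one batch", not the server's internal watcher; the class and method names are hypothetical.

```python
import threading

class Debouncer:
    """Collect filesystem events; flush one batch after a quiet period."""

    def __init__(self, flush, delay=1.0):
        self.flush = flush          # called with the set of changed paths
        self.delay = delay          # quiet period in seconds (default 1s)
        self.pending = set()
        self.timer = None
        self.lock = threading.Lock()

    def on_event(self, path):
        with self.lock:
            self.pending.add(path)
            if self.timer:          # a new event resets the countdown
                self.timer.cancel()
            self.timer = threading.Timer(self.delay, self._fire)
            self.timer.start()

    def _fire(self):
        with self.lock:
            batch, self.pending = self.pending, set()
        self.flush(batch)
```

A burst of saves to several files therefore triggers a single re-index pass instead of one per write.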

What Gets Indexed

Included:

  • Files with code extensions: .py, .js, .ts, .tsx, .jsx, .go, .rs, .java, .c, .cpp, .h, .rb, .php, .swift, .kt, .scala, and more

Excluded:

  • Files matching .gitignore patterns (all .gitignore files in your project are respected)
  • Common non-code directories: node_modules, __pycache__, .venv, build, dist, .git, vendor, etc.
  • Binary files and non-code file types

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `SEMANTIC_SEARCH_DB_PATH` | `.semantic-search/index.db` | Index database location |
| `SEMANTIC_SEARCH_EMBEDDING_MODEL` | `jinaai/jina-embeddings-v2-base-code` | Embedding model |
| `SEMANTIC_SEARCH_MIN_SCORE` | `0.3` | Minimum relevance threshold (0-1) |
| `SEMANTIC_SEARCH_DEBOUNCE_MS` | `1000` | File-watcher debounce in milliseconds |
| `SEMANTIC_SEARCH_BATCH_SIZE` | `50` | Files per batch (reduce if running out of memory) |
| `SEMANTIC_SEARCH_MAX_FILE_SIZE_KB` | `512` | Skip files larger than this (KB) |
| `SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE` | `8` | Texts per embedding call (reduce if OOM) |
| `SEMANTIC_SEARCH_EMBEDDING_THREADS` | `4` | ONNX runtime threads (higher = faster on multi-core) |
| `SEMANTIC_SEARCH_USE_QUANTIZED` | `true` | Use the INT8-quantized model (30-40% faster) |
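
For example, a memory-constrained machine might be tuned by lowering the batch sizes and file-size cap before starting the server (the values below are illustrative, the variable names come from the table above):

```shell
export SEMANTIC_SEARCH_BATCH_SIZE=25
export SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE=4
export SEMANTIC_SEARCH_MAX_FILE_SIZE_KB=256
export SEMANTIC_SEARCH_USE_QUANTIZED=true
```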

Performance

GPU Acceleration

GPU acceleration is auto-detected and used when available:

| Platform | Provider | Installation |
| --- | --- | --- |
| NVIDIA | CUDA | `pip install semantic-search-mcp[gpu]` |
| Apple Silicon | CoreML | Automatic (M1/M2/M3) |
| AMD | ROCm | Install a ROCm-enabled onnxruntime |
| Windows | DirectML | Install a DirectML-enabled onnxruntime |

Alternative Models

For faster indexing (with quality tradeoffs), you can use a lighter model:

| Model | Dimensions | Speed | Best for |
| --- | --- | --- | --- |
| `jinaai/jina-embeddings-v2-base-code` | 768 | Baseline | Code search (default) |
| `BAAI/bge-small-en-v1.5` | 384 | ~10x faster | General text |
| `sentence-transformers/all-MiniLM-L6-v2` | 384 | ~32x faster | Speed priority |

To use an alternative model:

export SEMANTIC_SEARCH_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"

Note: Changing models requires a full reindex (delete .semantic-search/ directory).
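
Per the note above, a model switch looks like this: wipe the index directory (embedding dimensions differ between models), point the server at the new model, then restart it so it reindexes from scratch.

```shell
rm -rf .semantic-search/
export SEMANTIC_SEARCH_EMBEDDING_MODEL="BAAI/bge-small-en-v1.5"
```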

UniXcoder (Experimental)

Microsoft UniXcoder is a code-specific model pre-trained on code + AST + comments. It may provide better semantic understanding of code structure, but is substantially slower (~20x slower than Jina).

| Model | Dimensions | Speed | Languages |
| --- | --- | --- | --- |
| `microsoft/unixcoder-base` | 768 | ~20x slower | 6 (Java, Ruby, Python, PHP, JavaScript, Go) |
| `microsoft/unixcoder-base-nine` | 768 | ~20x slower | 9 (adds C, C++, C#) |

Installation (requires additional dependencies):

pip install semantic-search-mcp[unixcoder]

Usage:

export SEMANTIC_SEARCH_EMBEDDING_MODEL="microsoft/unixcoder-base-nine"

When to use UniXcoder:

  • You prioritize search quality over indexing speed
  • Your codebase is small to medium sized
  • You have GPU acceleration (CUDA or Apple Silicon MPS)

When to avoid UniXcoder:

  • Large codebases (10,000+ files), where indexing can take hours
  • You need fast initial indexing
  • Running on CPU without GPU acceleration

Claude Code Integration

Skills and commands are automatically installed when the MCP server first starts:

  • Skills: ~/.claude/skills/ (AI auto-discovery)
  • Commands: ~/.claude/commands/ (user-invocable slash commands)

To manually reinstall or update:

semantic-search-mcp-install-skills

Available Slash Commands

| Command | Description |
| --- | --- |
| `/semantic-search-search <query>` | Search the codebase with natural language |
| `/semantic-search-status` | Check server status and index stats |
| `/semantic-search-reindex` | Trigger a full codebase reindex |
| `/semantic-search-cancel` | Cancel a running indexing job |
| `/semantic-search-clear` | Wipe all indexed data |
| `/semantic-search-pause` | Pause the file watcher |
| `/semantic-search-resume` | Resume the file watcher |

Requirements

  • Python 3.11+
  • ~700MB disk for embedding model (downloaded on first run, ~150MB with INT8 quantization)
  • ~1GB RAM for embedding model

License

MIT
