Semantic Search MCP Server
An MCP server that provides semantic code search using local embeddings. Search your codebase with natural language queries like "authentication middleware" or "database connection pooling".
Features
- Hybrid search: Combines vector similarity (Jina code embeddings) with FTS5 keyword matching using Reciprocal Rank Fusion (see the sketch after this list)
- 165+ languages: Tree-sitter parsing for Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, Ruby, PHP, and more
- Incremental indexing: File watcher automatically detects additions, modifications, and deletions
- Respects .gitignore: Honors your project's .gitignore files (including nested ones)
- Auto-initialization: Model loads and codebase indexes in the background on server startup
- Zero external APIs: All embeddings generated locally with FastEmbed
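To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over a vector-similarity ranking and a keyword ranking. The function and the chunk identifiers are illustrative, not part of the package; k = 60 is the constant commonly used for RRF.

```python
# Minimal Reciprocal Rank Fusion sketch: merge a vector-similarity ranking
# and an FTS5 keyword ranking into a single score per chunk id.
# Illustrative only; the chunk ids below are made up.
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

vector_hits = ["auth.py::login", "middleware.py::require_auth", "db.py::connect"]
keyword_hits = ["middleware.py::require_auth", "auth.py::login"]
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```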
Installation
uv tool install semantic-search-mcp
Or with pip:
pip install semantic-search-mcp
Or run directly without installing:
uvx semantic-search-mcp
Quick Start
Add to Claude Code
Option A: Project-level config (recommended)
After installing with uv tool install or pip install, create .mcp.json in your project root:
{
"mcpServers": {
"semantic-search": {
"command": "semantic-search-mcp"
}
}
}
Option B: CLI
claude mcp add semantic-search -- semantic-search-mcp
Option C: Without installing (ephemeral)
If you prefer not to install, use uvx to run in an ephemeral environment:
{
"mcpServers": {
"semantic-search": {
"command": "uvx",
"args": ["semantic-search-mcp"]
}
}
}
Use
The server auto-initializes on startup: the embedding model loads and the codebase is indexed in the background, so no manual setup step is required before your first search.
Available Tools
| Tool | Description |
|---|---|
| search_code | Search codebase with natural language |
| get_status | Get server state, progress, and statistics |
| pause_watcher | Pause file watching (events discarded) |
| resume_watcher | Resume file watching |
| reindex | Start full reindex (runs in background) |
| cancel_indexing | Cancel running indexing job |
| clear_index | Wipe all indexed data |
| exclude_paths | Add paths to ignore (session-only) |
| include_paths | Remove paths from exclusion list |
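As a rough illustration of how a client drives these tools, here is a minimal sketch using the official MCP Python SDK over stdio. The `query` argument name passed to search_code is an assumption; inspect the tool schema reported by the server for the exact parameters.

```python
# Minimal sketch of calling the server's tools via the MCP Python SDK (pip install mcp).
# The "query" argument name is an assumption; check the tool schema for exact parameters.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    params = StdioServerParameters(command="semantic-search-mcp")
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            status = await session.call_tool("get_status", arguments={})
            print(status)
            results = await session.call_tool("search_code", arguments={"query": "authentication middleware"})
            print(results)

asyncio.run(main())
```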
How It Works
Indexing
On startup, the server:
- Scans your codebase for supported file types
- Parses code into semantic chunks (functions, classes, methods) using Tree-sitter
- Generates embeddings for each chunk using Jina's code embedding model
- Stores everything in a local SQLite database with vector search support
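In simplified form, the pipeline looks roughly like the sketch below. It uses Python's built-in ast module as a stand-in for Tree-sitter chunking and a plain SQLite table instead of the server's vector-search-enabled schema; the file path is hypothetical.

```python
# Rough sketch of the index pipeline: chunk -> embed -> store.
# ast is a stand-in for Tree-sitter; the real server stores vectors in a
# vector-search-enabled SQLite database at .semantic-search/index.db.
import ast
import sqlite3

from fastembed import TextEmbedding  # pip install fastembed

def chunk_python_file(path: str) -> list[str]:
    """Extract top-level functions and classes as semantic chunks."""
    source = open(path, encoding="utf-8").read()
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]

model = TextEmbedding(model_name="jinaai/jina-embeddings-v2-base-code")
chunks = chunk_python_file("app/auth.py")       # hypothetical file
vectors = list(model.embed(chunks))              # one numpy vector per chunk

db = sqlite3.connect("index.db")                 # simplified local path
db.execute("CREATE TABLE IF NOT EXISTS chunks (text TEXT, embedding BLOB)")
for text, vec in zip(chunks, vectors):
    db.execute("INSERT INTO chunks VALUES (?, ?)", (text, vec.tobytes()))
db.commit()
```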
File Watching
The server monitors your codebase for changes in real-time:
| Event | Action |
|---|---|
| File created | Parsed, embedded, and added to index |
| File modified | Re-indexed if content hash changed |
| File deleted | Removed from index |
Changes are debounced (default 1s) to batch rapid modifications.
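A debounced watcher can be sketched as follows using the watchdog library; this is an illustration of the idea, not the server's actual implementation, and the 1-second window mirrors the default above.

```python
# Sketch of debounced file watching with watchdog (pip install watchdog).
# Events are batched and flushed once no new event has arrived for 1 second.
import threading
import time
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

class DebouncedHandler(FileSystemEventHandler):
    def __init__(self, debounce_s: float = 1.0) -> None:
        self.debounce_s = debounce_s
        self.pending: set[str] = set()
        self.lock = threading.Lock()
        self.timer: threading.Timer | None = None

    def on_any_event(self, event) -> None:
        if event.is_directory:
            return
        with self.lock:
            self.pending.add(event.src_path)
            if self.timer:
                self.timer.cancel()
            self.timer = threading.Timer(self.debounce_s, self.flush)
            self.timer.start()

    def flush(self) -> None:
        with self.lock:
            batch, self.pending = self.pending, set()
        print(f"re-indexing {len(batch)} changed file(s): {sorted(batch)}")

observer = Observer()
observer.schedule(DebouncedHandler(), path=".", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()
```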
What Gets Indexed
Included:
- Files with code extensions: .py, .js, .ts, .tsx, .jsx, .go, .rs, .java, .c, .cpp, .h, .rb, .php, .swift, .kt, .scala, and more
Excluded:
- Files matching .gitignore patterns (all .gitignore files in your project are respected)
- Common non-code directories: node_modules, __pycache__, .venv, build, dist, .git, vendor, etc.
- Binary files and non-code file types
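For a rough idea of how this kind of filtering can be done, here is a sketch using the pathspec library; the library choice and the single top-level .gitignore are assumptions, and the extension/directory sets below are trimmed for brevity.

```python
# Sketch of .gitignore-aware file filtering with pathspec (pip install pathspec).
# The server's actual implementation (including nested .gitignore handling) may differ.
from pathlib import Path
import pathspec

CODE_EXTENSIONS = {".py", ".js", ".ts", ".tsx", ".jsx", ".go", ".rs", ".java", ".c", ".cpp", ".h", ".rb", ".php"}
EXCLUDED_DIRS = {"node_modules", "__pycache__", ".venv", "build", "dist", ".git", "vendor"}

def candidate_files(root: str) -> list[Path]:
    root_path = Path(root)
    gitignore = root_path / ".gitignore"
    lines = gitignore.read_text().splitlines() if gitignore.exists() else []
    spec = pathspec.PathSpec.from_lines("gitwildmatch", lines)
    files = []
    for path in root_path.rglob("*"):
        rel = path.relative_to(root_path)
        if not path.is_file() or path.suffix not in CODE_EXTENSIONS:
            continue
        if any(part in EXCLUDED_DIRS for part in rel.parts):
            continue
        if spec.match_file(str(rel)):
            continue
        files.append(rel)
    return files

print(candidate_files("."))
```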
Configuration
Environment variables:
| Variable | Default | Description |
|---|---|---|
| SEMANTIC_SEARCH_DB_PATH | .semantic-search/index.db | Index database location |
| SEMANTIC_SEARCH_EMBEDDING_MODEL | jinaai/jina-embeddings-v2-base-code | Embedding model |
| SEMANTIC_SEARCH_MIN_SCORE | 0.3 | Minimum relevance threshold (0-1) |
| SEMANTIC_SEARCH_DEBOUNCE_MS | 1000 | File watcher debounce in milliseconds |
| SEMANTIC_SEARCH_BATCH_SIZE | 50 | Files per batch (reduce if running out of memory) |
| SEMANTIC_SEARCH_MAX_FILE_SIZE_KB | 512 | Skip files larger than this (KB) |
| SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE | 8 | Texts per embedding call (reduce if OOM) |
| SEMANTIC_SEARCH_EMBEDDING_THREADS | 4 | ONNX Runtime threads (higher = faster on multi-core) |
| SEMANTIC_SEARCH_USE_QUANTIZED | true | Use INT8 quantized model (30-40% faster) |
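As a rough sketch of how such configuration might be read from the environment, consider the snippet below. The dataclass is purely illustrative and is not part of the package's API; only the variable names and defaults mirror the table above.

```python
# Illustrative sketch: read the environment variables above into a config object.
# This dataclass is not part of semantic-search-mcp; it only mirrors the table.
import os
from dataclasses import dataclass

@dataclass
class SearchConfig:
    db_path: str = os.getenv("SEMANTIC_SEARCH_DB_PATH", ".semantic-search/index.db")
    embedding_model: str = os.getenv("SEMANTIC_SEARCH_EMBEDDING_MODEL", "jinaai/jina-embeddings-v2-base-code")
    min_score: float = float(os.getenv("SEMANTIC_SEARCH_MIN_SCORE", "0.3"))
    debounce_ms: int = int(os.getenv("SEMANTIC_SEARCH_DEBOUNCE_MS", "1000"))
    batch_size: int = int(os.getenv("SEMANTIC_SEARCH_BATCH_SIZE", "50"))
    max_file_size_kb: int = int(os.getenv("SEMANTIC_SEARCH_MAX_FILE_SIZE_KB", "512"))
    embedding_batch_size: int = int(os.getenv("SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE", "8"))
    embedding_threads: int = int(os.getenv("SEMANTIC_SEARCH_EMBEDDING_THREADS", "4"))
    use_quantized: bool = os.getenv("SEMANTIC_SEARCH_USE_QUANTIZED", "true").lower() == "true"

print(SearchConfig())
```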
Performance
GPU Acceleration
GPU acceleration is auto-detected and used when available:
| Platform | Provider | Installation |
|---|---|---|
| NVIDIA | CUDA | pip install semantic-search-mcp[gpu] |
| Apple Silicon | CoreML | Automatic (M1/M2/M3) |
| AMD | ROCm | Install ROCm-enabled onnxruntime |
| Windows | DirectML | Install DirectML-enabled onnxruntime |
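You can check which ONNX Runtime execution providers are actually available on your machine with a quick snippet; this queries the underlying runtime directly and is not a tool exposed by this package.

```python
# Check which ONNX Runtime execution providers are available locally.
# CUDAExecutionProvider, CoreMLExecutionProvider, ROCMExecutionProvider, or
# DmlExecutionProvider indicate that GPU acceleration can be used.
import onnxruntime as ort

print(ort.get_available_providers())
# e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider'] on an NVIDIA machine
```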
Alternative Models
For faster indexing (with quality tradeoffs), you can use a lighter model:
| Model | Dimensions | Speed | Best For |
|---|---|---|---|
| jinaai/jina-embeddings-v2-base-code | 768 | Baseline | Code search (default) |
| BAAI/bge-small-en-v1.5 | 384 | ~10x faster | General text |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | ~32x faster | Speed priority |
To use an alternative model:
export SEMANTIC_SEARCH_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"
Note: Changing models requires a full reindex (delete the .semantic-search/ directory).
UniXcoder (Experimental)
Microsoft UniXcoder is a code-specific model pre-trained on code + AST + comments. It may provide better semantic understanding of code structure, but is substantially slower (~20x slower than Jina).
| Model | Dimensions | Speed | Languages |
|---|---|---|---|
| microsoft/unixcoder-base | 768 | ~20x slower | 6 (java, ruby, python, php, js, go) |
| microsoft/unixcoder-base-nine | 768 | ~20x slower | 9 (+ c, c++, c#) |
Installation (requires additional dependencies):
pip install semantic-search-mcp[unixcoder]
Usage:
export SEMANTIC_SEARCH_EMBEDDING_MODEL="microsoft/unixcoder-base-nine"
When to use UniXcoder:
- You prioritize search quality over indexing speed
- Your codebase is small to medium sized
- You have GPU acceleration (CUDA or Apple Silicon MPS)
When to avoid UniXcoder:
- Large codebases (10,000+ files) - indexing will take hours
- You need fast initial indexing
- Running on CPU without GPU acceleration
Claude Code Integration
Skills and commands are automatically installed when the MCP server first starts:
- Skills → ~/.claude/skills/ (AI auto-discovery)
- Commands → ~/.claude/commands/ (user-invocable slash commands)
To manually reinstall or update:
semantic-search-mcp-install-skills
Available Slash Commands
| Command | Description |
|---|---|
| /semantic-search-search <query> | Search codebase with natural language |
| /semantic-search-status | Check server status and index stats |
| /semantic-search-reindex | Trigger full codebase reindex |
| /semantic-search-cancel | Cancel running indexing job |
| /semantic-search-clear | Wipe all indexed data |
| /semantic-search-pause | Pause file watcher |
| /semantic-search-resume | Resume file watcher |
Requirements
- Python 3.11+
- ~700MB disk for embedding model (downloaded on first run, ~150MB with INT8 quantization)
- ~1GB RAM for embedding model