
Semantic Search MCP Server

An MCP server that provides semantic code search using local embeddings. Search your codebase with natural language queries like "authentication middleware" or "database connection pooling".

Features

  • Hybrid search: Combines vector similarity (Jina code embeddings) with FTS5 keyword matching using Reciprocal Rank Fusion
  • 165+ languages: Tree-sitter parsing for Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, Ruby, PHP, and more
  • Incremental indexing: File watcher automatically detects additions, modifications, and deletions
  • Respects .gitignore: Honors your project's .gitignore files (including nested ones)
  • Auto-initialization: Model loads and codebase indexes in the background on server startup
  • Zero external APIs: All embeddings generated locally with FastEmbed
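
The hybrid search described above can be sketched in a few lines. This is an illustrative implementation of Reciprocal Rank Fusion with the commonly used constant k = 60, not the package's actual code; `rrf_merge` and its arguments are hypothetical names.

```python
# Illustrative Reciprocal Rank Fusion (RRF): merge two ranked result
# lists (vector similarity and FTS5 keyword hits) by summing the
# reciprocal of each document's rank in every list it appears in.
def rrf_merge(vector_hits, keyword_hits, k=60):
    """Fuse two ranked lists of chunk IDs into one score-ordered list."""
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Documents present in both lists accumulate two terms and rise.
    return sorted(scores, key=scores.get, reverse=True)
```

A chunk that ranks moderately in both lists can outscore one that tops only a single list, which is why RRF is a popular way to combine heterogeneous retrievers without score normalization.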

Installation

uv tool install semantic-search-mcp

Or with pip:

pip install semantic-search-mcp

Or run directly without installing:

uvx semantic-search-mcp

Quick Start

Add to Claude Code

Option A: Project-level config (recommended)

After installing with uv tool install or pip install, create .mcp.json in your project root:

{
  "mcpServers": {
    "semantic-search": {
      "command": "semantic-search-mcp"
    }
  }
}

Option B: CLI

claude mcp add semantic-search -- semantic-search-mcp

Option C: Without installing (ephemeral)

If you prefer not to install, use uvx to run in an ephemeral environment:

{
  "mcpServers": {
    "semantic-search": {
      "command": "uvx",
      "args": ["semantic-search-mcp"]
    }
  }
}

Usage

The server auto-initializes on startup: the model loads and the codebase is indexed in the background, so no manual setup step is required.

Available Tools

| Tool | Description |
| --- | --- |
| `search_code` | Search the codebase with natural language |
| `get_status` | Get server state, progress, and statistics |
| `pause_watcher` | Pause file watching (events are discarded) |
| `resume_watcher` | Resume file watching |
| `reindex` | Start a full reindex (runs in the background) |
| `cancel_indexing` | Cancel a running indexing job |
| `clear_index` | Wipe all indexed data |
| `exclude_paths` | Add paths to the ignore list (session-only) |
| `include_paths` | Remove paths from the exclusion list |

How It Works

Indexing

On startup, the server:

  1. Scans your codebase for supported file types
  2. Parses code into semantic chunks (functions, classes, methods) using Tree-sitter
  3. Generates embeddings for each chunk using Jina's code embedding model
  4. Stores everything in a local SQLite database with vector search support
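
The four steps above can be sketched as a single pass over the tree. This is an illustrative outline, not the server's actual code: `chunk_fn` and `embed_fn` are stand-ins for the Tree-sitter chunker and FastEmbed embedder, and the schema is heavily simplified.

```python
import hashlib
import sqlite3
from pathlib import Path

def index_codebase(root, db_path, chunk_fn, embed_fn,
                   exts=(".py", ".ts", ".go")):
    """Sketch of the indexing pipeline: scan, chunk, embed, store."""
    db = sqlite3.connect(db_path)
    db.execute("""CREATE TABLE IF NOT EXISTS chunks
                  (path TEXT, content TEXT, file_hash TEXT, embedding BLOB)""")
    for path in Path(root).rglob("*"):          # step 1: scan supported files
        if not path.is_file() or path.suffix not in exts:
            continue
        text = path.read_text(errors="ignore")
        file_hash = hashlib.sha256(text.encode()).hexdigest()
        for chunk in chunk_fn(text):            # step 2: semantic chunks
            vec = embed_fn(chunk)               # step 3: local embedding
            db.execute("INSERT INTO chunks VALUES (?, ?, ?, ?)",
                       (str(path), chunk, file_hash, vec))  # step 4: store
    db.commit()
    return db
```

Storing the file hash alongside each chunk is what makes the incremental re-indexing described below cheap: an unchanged hash means the file can be skipped.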

File Watching

The server monitors your codebase for changes in real-time:

| Event | Action |
| --- | --- |
| File created | Parsed, embedded, and added to the index |
| File modified | Re-indexed if the content hash changed |
| File deleted | Removed from the index |

Changes are debounced (default 1s) to batch rapid modifications.
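
The debounce behavior can be sketched as follows. This is an illustrative model of "collapse rapid events into one batch", not the server's internal watcher; the class and method names are hypothetical.

```python
import threading

class Debouncer:
    """Collect filesystem events; flush one batch after a quiet period."""

    def __init__(self, flush, delay=1.0):
        self.flush = flush          # called with the set of changed paths
        self.delay = delay          # quiet period in seconds (default 1s)
        self.pending = set()
        self.timer = None
        self.lock = threading.Lock()

    def on_event(self, path):
        with self.lock:
            self.pending.add(path)
            if self.timer:          # a new event resets the countdown
                self.timer.cancel()
            self.timer = threading.Timer(self.delay, self._fire)
            self.timer.start()

    def _fire(self):
        with self.lock:
            batch, self.pending = self.pending, set()
        self.flush(batch)
```

A burst of saves to several files therefore triggers a single re-index pass instead of one per write.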

What Gets Indexed

Included:

  • Files with code extensions: .py, .js, .ts, .tsx, .jsx, .go, .rs, .java, .c, .cpp, .h, .rb, .php, .swift, .kt, .scala, and more

Excluded:

  • Files matching .gitignore patterns (all .gitignore files in your project are respected)
  • Common non-code directories: node_modules, __pycache__, .venv, build, dist, .git, vendor, etc.
  • Binary files and non-code file types

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `SEMANTIC_SEARCH_DB_PATH` | `.semantic-search/index.db` | Index database location |
| `SEMANTIC_SEARCH_EMBEDDING_MODEL` | `jinaai/jina-embeddings-v2-base-code` | Embedding model |
| `SEMANTIC_SEARCH_MIN_SCORE` | `0.3` | Minimum relevance threshold (0-1) |
| `SEMANTIC_SEARCH_DEBOUNCE_MS` | `1000` | File-watcher debounce in milliseconds |
| `SEMANTIC_SEARCH_BATCH_SIZE` | `50` | Files per batch (reduce if running out of memory) |
| `SEMANTIC_SEARCH_MAX_FILE_SIZE_KB` | `512` | Skip files larger than this (KB) |
| `SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE` | `8` | Texts per embedding call (reduce if OOM) |
| `SEMANTIC_SEARCH_EMBEDDING_THREADS` | `4` | ONNX runtime threads (higher = faster on multi-core) |
| `SEMANTIC_SEARCH_USE_QUANTIZED` | `true` | Use the INT8-quantized model (30-40% faster) |
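
For example, a memory-constrained machine might be tuned by lowering the batch sizes and file-size cap before starting the server (the values below are illustrative, the variable names come from the table above):

```shell
export SEMANTIC_SEARCH_BATCH_SIZE=25
export SEMANTIC_SEARCH_EMBEDDING_BATCH_SIZE=4
export SEMANTIC_SEARCH_MAX_FILE_SIZE_KB=256
export SEMANTIC_SEARCH_USE_QUANTIZED=true
```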

Performance

GPU Acceleration

GPU acceleration is auto-detected and used when available:

| Platform | Provider | Installation |
| --- | --- | --- |
| NVIDIA | CUDA | `pip install semantic-search-mcp[gpu]` |
| Apple Silicon | CoreML | Automatic (M1/M2/M3) |
| AMD | ROCm | Install a ROCm-enabled onnxruntime |
| Windows | DirectML | Install a DirectML-enabled onnxruntime |

Alternative Models

For faster indexing (with quality tradeoffs), you can use a lighter model:

| Model | Dimensions | Speed | Best for |
| --- | --- | --- | --- |
| `jinaai/jina-embeddings-v2-base-code` | 768 | Baseline | Code search (default) |
| `BAAI/bge-small-en-v1.5` | 384 | ~10x faster | General text |
| `sentence-transformers/all-MiniLM-L6-v2` | 384 | ~32x faster | Speed priority |

To use an alternative model:

export SEMANTIC_SEARCH_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-v2"

Note: Changing models requires a full reindex (delete .semantic-search/ directory).
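
Per the note above, a model switch looks like this: wipe the index directory (embedding dimensions differ between models), point the server at the new model, then restart it so it reindexes from scratch.

```shell
rm -rf .semantic-search/
export SEMANTIC_SEARCH_EMBEDDING_MODEL="BAAI/bge-small-en-v1.5"
```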

UniXcoder (Experimental)

Microsoft UniXcoder is a code-specific model pre-trained on code + AST + comments. It may provide better semantic understanding of code structure, but is substantially slower (~20x slower than Jina).

| Model | Dimensions | Speed | Languages |
| --- | --- | --- | --- |
| `microsoft/unixcoder-base` | 768 | ~20x slower | 6 (Java, Ruby, Python, PHP, JavaScript, Go) |
| `microsoft/unixcoder-base-nine` | 768 | ~20x slower | 9 (adds C, C++, C#) |

Installation (requires additional dependencies):

pip install semantic-search-mcp[unixcoder]

Usage:

export SEMANTIC_SEARCH_EMBEDDING_MODEL="microsoft/unixcoder-base-nine"

When to use UniXcoder:

  • You prioritize search quality over indexing speed
  • Your codebase is small to medium sized
  • You have GPU acceleration (CUDA or Apple Silicon MPS)

When to avoid UniXcoder:

  • Large codebases (10,000+ files), where indexing can take hours
  • You need fast initial indexing
  • Running on CPU without GPU acceleration

Claude Code Integration

Skills and commands are automatically installed when the MCP server first starts:

  • Skills: ~/.claude/skills/ (AI auto-discovery)
  • Commands: ~/.claude/commands/ (user-invocable slash commands)

To manually reinstall or update:

semantic-search-mcp-install-skills

Available Slash Commands

| Command | Description |
| --- | --- |
| `/semantic-search-search <query>` | Search the codebase with natural language |
| `/semantic-search-status` | Check server status and index stats |
| `/semantic-search-reindex` | Trigger a full codebase reindex |
| `/semantic-search-cancel` | Cancel a running indexing job |
| `/semantic-search-clear` | Wipe all indexed data |
| `/semantic-search-pause` | Pause the file watcher |
| `/semantic-search-resume` | Resume the file watcher |

Requirements

  • Python 3.11+
  • ~700MB disk for embedding model (downloaded on first run, ~150MB with INT8 quantization)
  • ~1GB RAM for embedding model

License

MIT
