🔍 Source-MCP
A Model Context Protocol (MCP) server for semantic search and Retrieval-Augmented Generation (RAG) over local codebases and documents.
📖 Overview
Source-MCP leverages the Model Context Protocol to provide AI assistants (like Claude, Gemini, and others) with direct access to local files through semantic search.
Instead of manually copy-pasting code or documentation into your prompts, Source-MCP automatically indexes your local repository, generates vector embeddings, and enables the AI to semantically search and retrieve only the most relevant files.
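The retrieval idea behind this pipeline can be sketched with plain cosine similarity over embedding vectors; this is an illustrative, dependency-free sketch of the ranking step, not the project's actual code:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Score how semantically close two embedding vectors are (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec: list[float], docs: list[tuple[str, list[float]]], k: int = 3) -> list[str]:
    """Rank indexed documents by similarity to the query embedding and keep the best k."""
    scored = sorted(docs, key=lambda d: cosine_similarity(query_vec, d[1]), reverse=True)
    return [name for name, _ in scored[:k]]
```

In the real server the vector database performs this ranking at scale; the sketch only shows why "retrieve only the most relevant files" reduces to a nearest-neighbor search in embedding space.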
✨ Key Features
- Dual Embedding Support:
  - OpenAI: uses the robust `text-embedding-3-small` model (1536 dimensions) for high-quality, enterprise-grade embeddings.
  - FastEmbed (local): uses `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2` (384 dimensions). Runs entirely locally, requires no API keys, and supports multilingual queries.
- Smart Incremental Indexing: Uses file fingerprints (modified time + size) to only index new or modified files, ensuring lightning-fast startup times.
- Auto-Migration: Automatically detects embedding dimension changes (e.g., switching from OpenAI to FastEmbed) and safely recreates the vector index.
- Web Dashboard (Port 8000):
  - Live Logs: view real-time indexing and search activity with auto-scroll.
  - Reindex Base: force-wipe the vector DB and manifest for a completely fresh full scan.
  - Search Debugging: a dedicated endpoint (`/api/search/debug?q=...`) to test raw semantic search scores.
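The incremental-indexing idea can be sketched as follows. The fingerprint format and function names here are illustrative assumptions, not the project's actual implementation; only the "modified time + size" scheme comes from the feature description above:

```python
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Fingerprint a file by modified time + size, avoiding content hashing."""
    st = path.stat()
    return f"{st.st_mtime_ns}:{st.st_size}"

def files_to_reindex(root: Path, manifest: dict[str, str]) -> list[Path]:
    """Return only files whose fingerprint is new or changed, updating the manifest."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        fp = fingerprint(path)
        if manifest.get(str(path)) != fp:
            changed.append(path)
            manifest[str(path)] = fp
    return changed
```

On a warm start, files whose fingerprints already match the persisted manifest are skipped entirely, which is what keeps startup fast.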
🤔 Why local embeddings and zvec?
We use zvec, a lightweight, high-performance vector database maintained by Alibaba. zvec is embedded directly into the Python process, eliminating the need to set up or run external vector servers (like Pinecone, Milvus, or Qdrant). Combined with FastEmbed, this allows Source-MCP to build the entire semantic search pipeline fully offline, quickly, and entirely on your local machine.
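The auto-migration feature mentioned earlier (detecting an embedding-dimension change and recreating the index) reduces to a simple comparison. This is a hedged sketch; the function name and the way the stored dimension is obtained are assumptions, while the dimensions themselves come from the provider descriptions above:

```python
from typing import Optional

# Dimensions produced by the two supported providers (from the feature list above)
EXPECTED_DIMS = {
    "openai": 1536,    # text-embedding-3-small
    "fastembed": 384,  # paraphrase-multilingual-MiniLM-L12-v2
}

def needs_migration(stored_dim: Optional[int], provider: str) -> bool:
    """True when the existing index was built with a different embedding
    dimension than the configured provider produces, so the vector index
    must be dropped and rebuilt from scratch."""
    if stored_dim is None:  # no index yet: nothing to migrate
        return False
    return stored_dim != EXPECTED_DIMS[provider]
```

Switching `EMBEDDING_PROVIDER` from `openai` to `fastembed` therefore triggers a full rebuild, since 1536-dimensional vectors cannot be searched against a 384-dimensional query.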
🚀 Installation & Setup
- Prerequisites: ensure you have Python 3.10+ and `uv` installed.

- Clone the repository:

  ```shell
  git clone https://github.com/AlexShimmy/source-mcp.git
  cd source-mcp
  ```

- Install dependencies:

  ```shell
  # uv will automatically handle virtual environment creation and dependencies
  uv sync
  ```
⚙️ Configuration
Create a `.env` file in the root directory (you can copy `.env.example` if available):

```shell
# Choose your provider: "openai" or "fastembed"
EMBEDDING_PROVIDER=openai

# Required ONLY if using OpenAI
OPENAI_API_KEY=sk-your-openai-api-key

# Optional: path to store the vector database (defaults to `.source-mcp/zvec_db` in the index dir)
ZVEC_PATH=./zvec_db

# Optional: which directory to index (defaults to the current directory)
SOURCE_MCP_INDEX_DIR=/path/to/your/project

# Optional: port for the web dashboard (defaults to 8000)
WEB_PORT=8000
```
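Resolving these variables with the documented defaults can be sketched like this. The `Settings` dataclass and `load_settings` helper are hypothetical names, and the fallback value for `EMBEDDING_PROVIDER` is an assumption (the source does not state which provider is the default); the other defaults come from the comments above:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass
class Settings:
    embedding_provider: str
    openai_api_key: Optional[str]
    zvec_path: str
    index_dir: str
    web_port: int

def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve settings from environment variables, applying the documented defaults."""
    return Settings(
        embedding_provider=env.get("EMBEDDING_PROVIDER", "fastembed"),  # default is assumed
        openai_api_key=env.get("OPENAI_API_KEY"),  # required only for the openai provider
        zvec_path=env.get("ZVEC_PATH", ".source-mcp/zvec_db"),
        index_dir=env.get("SOURCE_MCP_INDEX_DIR", "."),
        web_port=int(env.get("WEB_PORT", "8000")),
    )
```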
🖱️ Usage
Running Manually (Terminal & Dashboard)
To start the MCP server manually and access the web dashboard:

```shell
uv run python -m src.main --path .
```

- The MCP protocol will listen on `stdio`.
- The Web Dashboard will be available at http://localhost:8000.
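With the server running, the search-debugging endpoint mentioned in the feature list can be exercised with any HTTP client (e.g. `curl`). This small sketch only shows how the query string is encoded into the documented URL; the helper name is hypothetical:

```python
from urllib.parse import urlencode

def debug_search_url(query: str, base: str = "http://localhost:8000") -> str:
    """Build the dashboard's search-debugging URL for a given query string."""
    return f"{base}/api/search/debug?{urlencode({'q': query})}"
```

The endpoint returns raw semantic search scores, which is useful for checking whether a query actually ranks the files you expect near the top.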
🔌 MCP Configuration
The config is the same for all clients (Claude Desktop, Cursor, VS Code / Cline, etc.):
```json
{
  "mcpServers": {
    "source-mcp": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/source-mcp", "run", "python", "-m", "src.main"]
    }
  }
}
```
All other settings (such as `SOURCE_MCP_INDEX_DIR`, `EMBEDDING_PROVIDER`, or `OPENAI_API_KEY`) should be configured via the `.env` file in the root directory of Source-MCP.
🧪 Testing
The project uses pytest for unit and end-to-end tests. To run the test suite:
```shell
uv run python -m pytest tests/ -v
```
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
File details

Details for the file source_mcp-0.1.3b5.tar.gz.

File metadata

- Download URL: source_mcp-0.1.3b5.tar.gz
- Size: 120.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.17

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7f4d7592e69a2489a30da5be7cdc77277e51f4cf17d3bfe753165562d09cce07 |
| MD5 | 79e8a7deeeb2fed4c38a4e5d10075593 |
| BLAKE2b-256 | 1a3d903b7121d9dceff7d8f1986ed6b4801a62efc5c69d6c93647c93a48204a3 |
File details

Details for the file source_mcp-0.1.3b5-py3-none-any.whl.

File metadata

- Download URL: source_mcp-0.1.3b5-py3-none-any.whl
- Size: 23.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.17

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 16ba0b46a1f3f4b7e5ecde640e5468e145fac00eb773722d95f2f4eb4c3c77a9 |
| MD5 | b6402d57246ff7b7f8a28bf21516802b |
| BLAKE2b-256 | 31cdb4934365d0aada8eec544bb64743d3ee8ccf3c6d66017e3b7d136e90551d |