A language-aware semantic code search MCP server with intelligent filtering and 9.3x better dependency analysis
This project has been archived.
The maintainers of this project have marked this project as archived. No new releases are expected.
SemanticScout
A language-aware semantic code search MCP server with intelligent filtering and enhanced dependency analysis.
🚀 Features
- 🧠 Semantic Search: Natural language understanding of code functionality
- 🔍 Symbol Resolution: Precise function/class/variable lookup with AST parsing
- 📊 Dependency Tracking: Advanced dependency graph analysis (9.3x better than v3.1.4)
- 🎯 Context Expansion: Intelligent code context retrieval with multiple expansion levels
- ⚡ GPU Acceleration: 5-10x faster embedding generation with CUDA support
- 🔧 Language Support: Python, JavaScript, TypeScript, Java, C#, Go, Rust, and more
- 🎛️ Query Intent Tracking: Meta-learning system for search optimization
- 🚀 High Performance: Optimized indexing with parallel processing
📋 Prerequisites
- Python 3.12 or higher (required)
- 4GB+ RAM recommended
- SSD storage for best performance
- Optional: NVIDIA GPU with CUDA support for 5-10x acceleration
- Optional: Ollama with the nomic-embed-text model (alternative embedding provider)
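To confirm the Python requirement before installing, a small check along these lines may help. This is only a sketch: the `meets_minimum` helper is a hypothetical name, not part of SemanticScout.

```python
import sys

MINIMUM = (3, 12)  # minimum version taken from the prerequisites list


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); patch releases do not matter here.
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```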
🛠️ Installation
Python 3.12 Setup
macOS (Homebrew):
brew install python@3.12
Windows:
- Download from python.org
- Or use winget:
winget install Python.Python.3.12
Linux (Ubuntu/Debian):
sudo apt update
sudo apt install python3.12 python3.12-venv python3-pip
Linux (CentOS/RHEL/Fedora):
sudo dnf install python3.12 python3.12-pip
SemanticScout Installation
- Install SemanticScout:
# Basic installation (uses sentence-transformers by default)
pip install semanticscout
# With GPU support (recommended for 5-10x performance boost)
# The quotes keep shells such as zsh from expanding the brackets
pip install "semanticscout[gpu]"
- Verify Installation:
python -c "import semanticscout; print('✅ SemanticScout installed')"
- Optional: Install Ollama (alternative embedding provider):
# Download from https://ollama.ai
ollama pull nomic-embed-text
🎮 GPU Support (Recommended for Performance)
SemanticScout uses sentence-transformers by default and supports GPU acceleration for 5-10x faster embedding generation.
Two ways to enable GPU support:
Option 1: Install with GPU extras (easiest):
pip install "semanticscout[gpu]"
Option 2: Manual PyTorch GPU installation:
# For CUDA 12.4 (recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
# For CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Verify GPU Support:
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU devices: {torch.cuda.device_count()}')"
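The one-liner above fails with an ImportError when PyTorch is not installed. A slightly more defensive variant might look like the sketch below; `cuda_report` is a hypothetical helper name, and it only uses `torch.cuda.is_available()` and `torch.cuda.device_count()`, the same calls as the command above.

```python
import importlib.util


def cuda_report():
    """Summarize CUDA availability without failing when PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "gpu_devices": 0}

    import torch  # imported lazily so the script still runs when PyTorch is missing

    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "gpu_devices": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```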
Performance Comparison:
- CPU: ~100-200 embeddings/second
- GPU: ~1000-2000 embeddings/second (RTX 3090/4090)
- Large codebases: 5-10x faster indexing
- Auto-detection: GPU used automatically if available, graceful CPU fallback
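To get a feel for what these throughput figures mean in practice, the arithmetic can be sketched as follows. The 100,000-chunk codebase is a hypothetical size, and the estimate covers embedding generation only, not parsing or vector storage.

```python
def indexing_seconds(num_chunks, embeddings_per_second):
    """Time spent on embedding generation alone, in seconds."""
    if embeddings_per_second <= 0:
        raise ValueError("embeddings_per_second must be positive")
    return num_chunks / embeddings_per_second


# Throughput ranges quoted above as (low, high), in embeddings per second.
RATES = {"cpu": (100, 200), "gpu": (1000, 2000)}

num_chunks = 100_000  # hypothetical codebase size after chunking

for device, (low, high) in RATES.items():
    slowest = indexing_seconds(num_chunks, low)
    fastest = indexing_seconds(num_chunks, high)
    print(f"{device}: {fastest:.0f}-{slowest:.0f} s")
# cpu: 500-1000 s
# gpu: 50-100 s
```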
⚙️ Configuration
Configure SemanticScout in your MCP client (e.g., Claude Desktop's claude_desktop_config.json):
{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "sentence-transformers",
        "DEVICE_PREFERENCE": "auto",
        "GPU_BATCH_SIZE": "64",
        "SEMANTICSCOUT_ENABLE_ENHANCEMENTS": "true"
      }
    }
  }
}
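If the client config already lists other MCP servers, the entry can be merged in programmatically instead of pasted by hand. The sketch below works on an in-memory dict; the `add_semanticscout` helper and the `other-server` entry are hypothetical, and reading or writing the actual config file (whose location depends on the client) is left out.

```python
import json


def add_semanticscout(config, env):
    """Return a copy of an MCP client config with a semanticscout server entry."""
    merged = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    servers = merged.setdefault("mcpServers", {})
    servers["semanticscout"] = {
        "command": "uvx",
        "args": ["--python", "3.12", "semanticscout@latest"],
        "env": dict(env),  # MCP env values are strings, as in the config above
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
updated = add_semanticscout(
    existing,
    {"EMBEDDING_PROVIDER": "sentence-transformers", "DEVICE_PREFERENCE": "auto"},
)
print(json.dumps(updated, indent=2))
```

The input dict is left unchanged, so the result can be inspected before anything is written back to disk.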
For Ollama (alternative embedding provider):
{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "nomic-embed-text"
      }
    }
  }
}
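Before pointing SemanticScout at Ollama, it can be worth checking that the server answers embedding requests for the configured model. The sketch below builds such a request against Ollama's `/api/embeddings` endpoint using the base URL and model from the config above. Whether SemanticScout itself uses this endpoint is an assumption, and `build_embedding_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, text):
    """Build (but do not send) a POST request for Ollama's embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_embedding_request(
    "http://localhost:11434", "nomic-embed-text", "def login(user): ..."
)
print(request.get_method(), request.full_url)

# To actually send it (requires a running Ollama server with the model pulled):
# with urllib.request.urlopen(request, timeout=30) as response:
#     vector = json.load(response)["embedding"]
#     print(len(vector))
```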
GPU Configuration Options
# Auto-detect GPU (default)
DEVICE_PREFERENCE=auto
# Force GPU usage
DEVICE_PREFERENCE=cuda
# Force CPU usage
DEVICE_PREFERENCE=cpu
# Customize batch sizes
GPU_BATCH_SIZE=128
CPU_BATCH_SIZE=32
# Enable GPU memory monitoring
GPU_MEMORY_MONITORING=true
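How these settings might combine can be sketched as below. This is not SemanticScout's implementation: the helper names, the default batch sizes, and the choice to raise an error when `cuda` is forced without a GPU are all assumptions made for illustration.

```python
# Assumed fallback batch sizes; the real defaults are not documented here.
ASSUMED_DEFAULT_BATCH = {"cuda": 64, "cpu": 32}


def resolve_device(env, cuda_available):
    """Map DEVICE_PREFERENCE to a device name, falling back to CPU under "auto"."""
    preference = env.get("DEVICE_PREFERENCE", "auto").strip().lower()
    if preference == "cpu":
        return "cpu"
    if preference == "cuda":
        if not cuda_available:
            raise RuntimeError("DEVICE_PREFERENCE=cuda but no CUDA device was found")
        return "cuda"
    if preference == "auto":
        return "cuda" if cuda_available else "cpu"
    raise ValueError(f"unknown DEVICE_PREFERENCE: {preference!r}")


def resolve_batch_size(env, device):
    """Pick GPU_BATCH_SIZE or CPU_BATCH_SIZE for the chosen device."""
    key = "GPU_BATCH_SIZE" if device == "cuda" else "CPU_BATCH_SIZE"
    return int(env.get(key, ASSUMED_DEFAULT_BATCH[device]))


env = {"DEVICE_PREFERENCE": "auto", "GPU_BATCH_SIZE": "128"}
device = resolve_device(env, cuda_available=False)
print(device, resolve_batch_size(env, device))  # cpu 32
```

In a real setup `env` would be `os.environ` and `cuda_available` would come from `torch.cuda.is_available()`.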
🚀 Quick Start
- Index your codebase:
# Index current directory
index_codebase()
# Index specific path
index_codebase(path="/path/to/your/project")
- Search your code:
# Natural language search
search_code("authentication functions", collection_name="your_project")
# Find specific symbols
find_symbol("UserController", collection_name="your_project")
- Check GPU status:
get_gpu_status()
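The calls above are MCP tools, so an MCP client invokes them through the protocol rather than as local Python functions. As a rough sketch of what goes over the wire, assuming the standard MCP `tools/call` JSON-RPC method: the `tool_call` helper is hypothetical, and so is the `query` argument name, since the Quick Start passes the search text positionally without naming that parameter.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 `tools/call` request as used by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


requests = [
    tool_call("index_codebase", path="/path/to/your/project"),
    # "query" is an assumed parameter name, not confirmed by the docs above.
    tool_call("search_code", query="authentication functions",
              collection_name="your_project"),
    tool_call("get_gpu_status"),
]

for request in requests:
    print(json.dumps(request))
```

In normal use the MCP client (for example Claude Desktop) builds and sends these messages itself.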
📚 Documentation
- User Guide - Complete usage guide
- API Reference - Detailed API documentation
- Configuration - Advanced configuration options
- Performance Tuning - Optimization tips
- Architecture - Technical architecture details
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Built with FastMCP framework
- Powered by ChromaDB vector database
- Uses tree-sitter for AST parsing
- GPU acceleration via PyTorch
- Embeddings by Ollama and sentence-transformers
Made with ❤️ for developers who want smarter code search
Download files
File details
Details for the file semanticscout-3.4.1.tar.gz.
File metadata
- Download URL: semanticscout-3.4.1.tar.gz
- Upload date:
- Size: 193.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eeb4fadc83ec9e90e4d8941f24e94038a128e56f5bd204968919555b5ac51ff2 |
| MD5 | 904308de1a3490a959ac0ade8ded19f5 |
| BLAKE2b-256 | fe5483102c46a46524419958b02956da896aada007f626e78facce0392193821 |
File details
Details for the file semanticscout-3.4.1-py3-none-any.whl.
File metadata
- Download URL: semanticscout-3.4.1-py3-none-any.whl
- Upload date:
- Size: 184.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87767b80cfbb7c0ca7238dade21554aadc89a9eea0c8c2b3544f17798316fd72 |
| MD5 | bfaf48d5db46a51a5f7edfbf821eafb7 |
| BLAKE2b-256 | 39f6d13d533671e2e97f024712db2ab71e0664791359fdc87e05a259e96a20fe |
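A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the file is assumed to be in the current directory under its published name.

```python
import hashlib
import hmac

# SHA256 digests copied from the file details above.
EXPECTED = {
    "semanticscout-3.4.1.tar.gz":
        "eeb4fadc83ec9e90e4d8941f24e94038a128e56f5bd204968919555b5ac51ff2",
    "semanticscout-3.4.1-py3-none-any.whl":
        "87767b80cfbb7c0ca7238dade21554aadc89a9eea0c8c2b3544f17798316fd72",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's SHA256 digest with the expected hex string."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Example (run next to the downloaded file):
# name = "semanticscout-3.4.1.tar.gz"
# print(name, "OK" if matches(name, EXPECTED[name]) else "MISMATCH")
```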