This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

SemanticScout

A language-aware semantic code search MCP server with intelligent filtering and enhanced dependency analysis.

Python 3.12+ · License: MIT · GPU Support

🚀 Features

  • 🧠 Semantic Search: Natural language understanding of code functionality
  • 🔍 Symbol Resolution: Precise function/class/variable lookup with AST parsing
  • 📊 Dependency Tracking: Advanced dependency graph analysis (9.3x better than v3.1.4)
  • 🎯 Context Expansion: Intelligent code context retrieval with multiple expansion levels
  • ⚡ GPU Acceleration: 5-10x faster embedding generation with CUDA support
  • 🔧 Language Support: Python, JavaScript, TypeScript, Java, C#, Go, Rust, and more
  • 🎛️ Query Intent Tracking: Meta-learning system for search optimization
  • 🚀 High Performance: Optimized indexing with parallel processing

📋 Prerequisites

  • Python 3.12 or higher (required)
  • 4GB+ RAM recommended
  • SSD storage for best performance
  • Optional: NVIDIA GPU with CUDA support for 5-10x acceleration
  • Optional: Ollama with nomic-embed-text model (alternative embedding provider)

🛠️ Installation

Python 3.12 Setup

macOS (Homebrew):

brew install python@3.12

Windows:

  • Download from python.org
  • Or use winget: winget install Python.Python.3.12

Linux (Ubuntu/Debian):

sudo apt update
sudo apt install python3.12 python3.12-venv python3-pip

Linux (CentOS/RHEL/Fedora):

sudo dnf install python3.12 python3.12-pip

SemanticScout Installation

  1. Install SemanticScout:

    # Basic installation (uses sentence-transformers by default)
    pip install semanticscout
    
    # With GPU support (recommended for 5-10x performance boost)
    pip install "semanticscout[gpu]"
    
  2. Verify Installation:

    python -c "import semanticscout; print('✅ SemanticScout installed')"
    
  3. Optional: Install Ollama (alternative embedding provider):

    # Download from https://ollama.ai
    ollama pull nomic-embed-text
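
The verification step above only confirms that the import works. A slightly fuller check, sketched below with the standard library only, also confirms the Python 3.12 requirement from the Prerequisites. The function name `check_environment` is hypothetical and not part of SemanticScout.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 12)  # minimum version stated in the Prerequisites


def check_environment(package="semanticscout"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    try:
        # Looks up the installed distribution's metadata without importing it.
        metadata.version(package)
    except metadata.PackageNotFoundError:
        problems.append(f"{package} is not installed in this environment")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("OK" if not issues else "\n".join(issues))
```

Run it with the same interpreter that your MCP client will launch, since a package installed for one Python version is not visible to another.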
    

🎮 GPU Support (Recommended for Performance)

SemanticScout uses sentence-transformers by default and supports GPU acceleration for 5-10x faster embedding generation.

Two ways to enable GPU support:

Option 1: Install with GPU extras (easiest):

pip install "semanticscout[gpu]"

Option 2: Manual PyTorch GPU installation:

# For CUDA 12.4 (recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# For CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Verify GPU Support:

python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU devices: {torch.cuda.device_count()}')"

Performance Comparison:

  • CPU: ~100-200 embeddings/second
  • GPU: ~1000-2000 embeddings/second (RTX 3090/4090)
  • Large codebases: 5-10x faster indexing
  • Auto-detection: GPU used automatically if available, graceful CPU fallback
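
The auto-detection behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not SemanticScout's actual code: the function name `resolve_device` is hypothetical, and raising an error when `cuda` is forced without a GPU is an assumed policy.

```python
def resolve_device(preference="auto"):
    """Map a DEVICE_PREFERENCE-style value ('auto', 'cuda', 'cpu') to 'cuda' or 'cpu'."""
    preference = preference.strip().lower()
    if preference not in {"auto", "cuda", "cpu"}:
        raise ValueError(f"unsupported device preference: {preference!r}")
    if preference == "cpu":
        return "cpu"

    try:
        import torch  # optional dependency; absent on CPU-only installs

        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False

    if has_cuda:
        return "cuda"
    if preference == "cuda":
        # Assumed policy: fail loudly when the GPU was explicitly requested.
        raise RuntimeError("CUDA was requested but no CUDA device is available")
    return "cpu"  # graceful fallback for 'auto'


print(resolve_device("auto"))
```

With `auto`, the sketch never fails: it returns `cuda` only when PyTorch is installed and reports a usable device, and `cpu` otherwise.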

⚙️ Configuration

Configure SemanticScout in your MCP client (e.g., Claude Desktop's claude_desktop_config.json):

{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "sentence-transformers",
        "DEVICE_PREFERENCE": "auto",
        "GPU_BATCH_SIZE": "64",
        "SEMANTICSCOUT_ENABLE_ENHANCEMENTS": "true"
      }
    }
  }
}

For Ollama (alternative embedding provider):

{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "nomic-embed-text"
      }
    }
  }
}
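
If the client config already lists other MCP servers, editing the JSON by hand risks dropping them. The sketch below merges the entry shown above into an existing file and leaves every other key untouched. The helper name `add_server` is hypothetical, and the path to the config file is left to the caller.

```python
import json
from pathlib import Path

# Entry copied from the sentence-transformers example above (trimmed to two env keys).
SERVER_ENTRY = {
    "command": "uvx",
    "args": ["--python", "3.12", "semanticscout@latest"],
    "env": {
        "EMBEDDING_PROVIDER": "sentence-transformers",
        "DEVICE_PREFERENCE": "auto",
    },
}


def add_server(config_path, name="semanticscout", entry=None):
    """Insert or replace one entry under "mcpServers", preserving all other content."""
    config_path = Path(config_path)
    entry = dict(SERVER_ENTRY) if entry is None else entry
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_server("claude_desktop_config.json")` a second time simply overwrites the same entry, so the operation is safe to repeat. Restart the MCP client afterwards so it rereads the file.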

GPU Configuration Options

# Auto-detect GPU (default)
DEVICE_PREFERENCE=auto

# Force GPU usage
DEVICE_PREFERENCE=cuda

# Force CPU usage
DEVICE_PREFERENCE=cpu

# Customize batch sizes
GPU_BATCH_SIZE=128
CPU_BATCH_SIZE=32

# Enable GPU memory monitoring
GPU_MEMORY_MONITORING=true
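
These options are environment variables, so they go in the `env` block of the MCP client config. As an illustration of how such values might be read and validated, here is a minimal sketch. The `GpuSettings` class and `load_gpu_settings` function are hypothetical, and every default other than `DEVICE_PREFERENCE=auto` is an assumption taken from the examples in this README rather than from the server's source.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSettings:
    device_preference: str = "auto"   # documented default
    gpu_batch_size: int = 64          # assumed default (value from the config example)
    cpu_batch_size: int = 32          # assumed default (value from the options above)
    memory_monitoring: bool = False   # assumed default


def _positive_int(env, key, default):
    raw = env.get(key)
    if raw is None:
        return default
    value = int(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{key} must be a positive integer, got {raw!r}")
    return value


def load_gpu_settings(env=None):
    """Build GpuSettings from a mapping of environment variables (os.environ by default)."""
    env = os.environ if env is None else env
    device = env.get("DEVICE_PREFERENCE", "auto").strip().lower()
    if device not in {"auto", "cuda", "cpu"}:
        raise ValueError(f"DEVICE_PREFERENCE must be auto, cuda or cpu, got {device!r}")
    monitoring = env.get("GPU_MEMORY_MONITORING", "false").strip().lower()
    return GpuSettings(
        device_preference=device,
        gpu_batch_size=_positive_int(env, "GPU_BATCH_SIZE", 64),
        cpu_batch_size=_positive_int(env, "CPU_BATCH_SIZE", 32),
        memory_monitoring=monitoring in {"1", "true", "yes", "on"},
    )
```

Note that JSON `env` values must be strings (`"64"`, not `64`), which is why the sketch parses every value from text.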

🚀 Quick Start

The calls below are tools exposed by the MCP server. Invoke them from your MCP client (for example, by asking the assistant to run them) rather than from a Python shell.

  1. Index your codebase:

    # Index current directory
    index_codebase()
    
    # Index specific path
    index_codebase(path="/path/to/your/project")
    
  2. Search your code:

    # Natural language search
    search_code("authentication functions", collection_name="your_project")
    
    # Find specific symbols
    find_symbol("UserController", collection_name="your_project")
    
  3. Check GPU status:

    get_gpu_status()
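
Under the hood, an MCP client sends each of these tool invocations to the server as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such messages with the standard library to show what travels over the wire. The helper `tool_call` is hypothetical, and the argument name `query` is an assumption, since the examples above pass the search text positionally.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def tool_call(name, **arguments):
    """Serialize one MCP tool invocation as a JSON-RPC 2.0 request string."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


print(tool_call("index_codebase", path="/path/to/your/project"))
# "query" is an assumed parameter name for the search text.
print(tool_call("search_code", query="authentication functions",
                collection_name="your_project"))
print(tool_call("get_gpu_status"))
```

In practice the client handles this framing for you. The sketch is mainly useful when debugging a server by inspecting or replaying its traffic.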
    

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ for developers who want smarter code search

