
A language-aware semantic code search MCP server with intelligent filtering and 9.3x better dependency analysis

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.


SemanticScout

A language-aware semantic code search MCP server with intelligent filtering and enhanced dependency analysis.

Python 3.12+ | License: MIT | GPU Support

🚀 Features

  • 🧠 Semantic Search: Natural language understanding of code functionality
  • 🔍 Symbol Resolution: Precise function/class/variable lookup with AST parsing
  • 📊 Dependency Tracking: Advanced dependency graph analysis (9.3x better than v3.1.4)
  • 🎯 Context Expansion: Intelligent code context retrieval with multiple expansion levels
  • ⚡ GPU Acceleration: 5-10x faster embedding generation with CUDA support
  • 🔧 Language Support: Python, JavaScript, TypeScript, Java, C#, Go, Rust, and more
  • 🎛️ Query Intent Tracking: Meta-learning system for search optimization
  • 🚀 High Performance: Optimized indexing with parallel processing

📋 Prerequisites

  • Python 3.12 or higher (required)
  • 4GB+ RAM recommended
  • SSD storage for best performance
  • Optional: NVIDIA GPU with CUDA support for 5-10x acceleration
  • Optional: Ollama with nomic-embed-text model (alternative embedding provider)

🛠️ Installation

Python 3.12 Setup

macOS (Homebrew):

brew install python@3.12

Windows:

  • Download from python.org
  • Or use winget: winget install Python.Python.3.12

Linux (Ubuntu/Debian):

sudo apt update
sudo apt install python3.12 python3.12-venv python3-pip

Linux (CentOS/RHEL/Fedora):

sudo dnf install python3.12 python3.12-pip

SemanticScout Installation

  1. Install SemanticScout:

    # Basic installation (uses sentence-transformers by default)
    pip install semanticscout
    
    # With GPU support (recommended for 5-10x performance boost)
    pip install "semanticscout[gpu]"
    
  2. Verify Installation:

    python -c "import semanticscout; print('✅ SemanticScout installed')"
    
  3. Optional: Install Ollama (alternative embedding provider):

    # Download from https://ollama.ai
    ollama pull nomic-embed-text
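
Before configuring an MCP client, it can help to confirm the Python version and the package installation in one step. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of SemanticScout.

```python
import sys
from importlib import metadata


def check_environment(package: str = "semanticscout", min_python: tuple = (3, 12)) -> dict:
    """Report whether the interpreter and package meet the stated prerequisites."""
    try:
        installed_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed_version = None  # package is not installed in this environment
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        "package_version": installed_version,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `package_version` of `None` means pip installed the package into a different interpreter than the one running the check.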
    

🎮 GPU Support (Recommended for Performance)

SemanticScout uses sentence-transformers by default and supports GPU acceleration for 5-10x faster embedding generation.

Two ways to enable GPU support:

Option 1: Install with GPU extras (easiest):

pip install "semanticscout[gpu]"

Option 2: Manual PyTorch GPU installation:

# For CUDA 12.4 (recommended)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# For CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Verify GPU Support:

python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU devices: {torch.cuda.device_count()}')"

Performance Comparison:

  • CPU: ~100-200 embeddings/second
  • GPU: ~1000-2000 embeddings/second (RTX 3090/4090)
  • Large codebases: 5-10x faster indexing
  • Auto-detection: GPU used automatically if available, graceful CPU fallback
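
To see what these rates mean for a given project, embedding time can be estimated as chunk count divided by throughput. The sketch below does that arithmetic with the figures quoted above. The 100,000-chunk workload is a made-up example, and real indexing also includes parsing and storage overhead that this estimate ignores.

```python
def embedding_time_range(num_chunks: int, low_rate: float, high_rate: float) -> tuple[float, float]:
    """Return (best_case_seconds, worst_case_seconds) for embedding num_chunks."""
    if low_rate <= 0 or high_rate < low_rate:
        raise ValueError("rates must be positive with low_rate <= high_rate")
    return num_chunks / high_rate, num_chunks / low_rate


# Throughput ranges quoted in the comparison above (embeddings per second).
CPU_RATES = (100, 200)
GPU_RATES = (1000, 2000)

if __name__ == "__main__":
    chunks = 100_000  # hypothetical workload size
    for label, (low, high) in (("CPU", CPU_RATES), ("GPU", GPU_RATES)):
        best, worst = embedding_time_range(chunks, low, high)
        print(f"{label}: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

For this hypothetical workload the estimate comes to roughly 8 to 17 minutes on CPU and under 2 minutes on GPU.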

⚙️ Configuration

Configure SemanticScout in your MCP client (e.g., Claude Desktop's claude_desktop_config.json):

{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "sentence-transformers",
        "DEVICE_PREFERENCE": "auto",
        "GPU_BATCH_SIZE": "64",
        "SEMANTICSCOUT_ENABLE_ENHANCEMENTS": "true"
      }
    }
  }
}

For Ollama (alternative embedding provider):

{
  "mcpServers": {
    "semanticscout": {
      "command": "uvx",
      "args": ["--python", "3.12", "semanticscout@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "ollama",
        "OLLAMA_BASE_URL": "http://localhost:11434",
        "OLLAMA_MODEL": "nomic-embed-text"
      }
    }
  }
}
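
Editing the client configuration by hand risks overwriting other server entries. As a sketch, the following merges the sentence-transformers entry shown above into an existing configuration using only the standard library. The helper names `add_semanticscout` and `update_config_file` are illustrative assumptions, and the location of claude_desktop_config.json depends on your platform.

```python
import json
from pathlib import Path

# Server entry copied from the sentence-transformers configuration above.
SEMANTICSCOUT_ENTRY = {
    "command": "uvx",
    "args": ["--python", "3.12", "semanticscout@latest"],
    "env": {
        "EMBEDDING_PROVIDER": "sentence-transformers",
        "DEVICE_PREFERENCE": "auto",
        "GPU_BATCH_SIZE": "64",
        "SEMANTICSCOUT_ENABLE_ENHANCEMENTS": "true",
    },
}


def add_semanticscout(config: dict, entry: dict = SEMANTICSCOUT_ENTRY) -> dict:
    """Return a copy of config with the semanticscout server added or replaced."""
    merged = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    merged.setdefault("mcpServers", {})["semanticscout"] = json.loads(json.dumps(entry))
    return merged


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at path, creating it if missing."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_semanticscout(config), indent=2) + "\n", encoding="utf-8")
```

Back up the existing file before calling `update_config_file` on it, and restart the MCP client afterwards so it picks up the change.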

GPU Configuration Options

# Auto-detect GPU (default)
DEVICE_PREFERENCE=auto

# Force GPU usage
DEVICE_PREFERENCE=cuda

# Force CPU usage
DEVICE_PREFERENCE=cpu

# Customize batch sizes
GPU_BATCH_SIZE=128
CPU_BATCH_SIZE=32

# Enable GPU memory monitoring
GPU_MEMORY_MONITORING=true

🚀 Quick Start

  1. Index your codebase:

    # Index current directory
    index_codebase()
    
    # Index specific path
    index_codebase(path="/path/to/your/project")
    
  2. Search your code:

    # Natural language search
    search_code("authentication functions", collection_name="your_project")
    
    # Find specific symbols
    find_symbol("UserController", collection_name="your_project")
    
  3. Check GPU status:

    get_gpu_status()
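
The calls above are MCP tools rather than Python functions: an MCP client invokes them by sending JSON-RPC 2.0 `tools/call` requests to the server. As a rough illustration, the sketch below builds those request payloads for some of the Quick Start calls. The tool names and the `path` and `collection_name` arguments come from the examples above; the `query` argument name, the helper name, and the request ids are assumptions. In practice your MCP client constructs and sends these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per session


def tool_call_request(tool_name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    requests = [
        tool_call_request("index_codebase", path="/path/to/your/project"),
        # "query" is an assumed parameter name; check the tool's schema.
        tool_call_request("search_code", query="authentication functions", collection_name="your_project"),
        tool_call_request("get_gpu_status"),
    ]
    for request in requests:
        print(json.dumps(request, indent=2))
```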
    

📚 Documentation

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments


Made with ❤️ for developers who want smarter code search
