Lightning-fast semantic code search and indexing with DuckDB vector operations

Turboprop 🚀

Lightning-fast semantic code search with AI embeddings

Transform your codebase into a searchable knowledge base using natural language queries. Perfect for AI-assisted development with Claude Code and other AI coding assistants.

✨ What Makes Turboprop Special

🔍 Hybrid Search - Combines semantic understanding with exact text matching using advanced fusion algorithms
🎯 Smart Ranking - Multi-factor ranking considers file type, recency, construct type, and semantic relevance
🧠 Structured Results - Rich metadata with confidence scores, match explanations, and IDE navigation links
🏆 Lightning Fast - DuckDB vector operations deliver sub-second search across massive codebases
🔄 Live Updates - Watch mode with intelligent debouncing keeps your index fresh as you code
🤖 Claude Code Enhanced - Advanced MCP integration with structured responses and query analysis
🔒 Safe Concurrent Access - Advanced file locking prevents corruption during multi-process operations
📁 Git-Aware - Respects .gitignore and only indexes what matters
💻 Beautiful CLI - Rich terminal interface with progress indicators and helpful guidance

🆕 Enhanced Search Capabilities

Turboprop's enhanced search system provides sophisticated code discovery that goes far beyond simple semantic similarity:

🔄 Hybrid Search Modes

AUTO Mode (Recommended) - Automatically chooses the best search strategy for your query

turboprop search "JWT authentication middleware" --mode auto

HYBRID Mode - Combines semantic understanding with exact text matching

turboprop search "error handling patterns" --mode hybrid --explain

SEMANTIC Mode - Pure conceptual search for finding similar functionality

turboprop search "user input validation logic" --mode semantic

TEXT Mode - Fast exact text matching for specific syntax

turboprop search "def authenticate(" --mode text
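To make the idea behind HYBRID mode concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a semantic ranking with a text ranking. This README does not spell out Turboprop's fusion algorithm; the `TURBOPROP_RRF_K` setting in the configuration section hints at RRF, but the function name, inputs, and file names below are illustrative assumptions, not Turboprop's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranking.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. A larger k flattens the advantage of
    top-ranked items.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a semantic pass and a text pass.
semantic_hits = ["auth.py", "tokens.py", "session.py"]
text_hits = ["tokens.py", "middleware.py", "auth.py"]
fused = reciprocal_rank_fusion([semantic_hits, text_hits], k=80)
```

Files that appear in both lists rise to the top, which is the behavior you would expect from a hybrid search.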

📊 Rich Search Results

Every result includes comprehensive metadata:

# Example enhanced result
2. src/auth/validators.py:28-35 (confidence: 0.92) 🐍 python [function]
   def validate_jwt_token(token: str) -> bool:
       """Validates JWT token signature and expiration."""
       try:
           payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
   
   ✨ Why this matches: Strong semantic match for JWT validation logic
   💡 Navigate: vscode://file/src/auth/validators.py:28
   🕒 Modified: 2 days ago

Rich metadata includes:

  • 🎯 Confidence scores - How well results match your query (0.0-1.0)
  • 🏷️ Language detection - Automatic programming language identification
  • 🧩 Construct types - Functions, classes, methods, constants, etc.
  • 💭 Match explanations - Clear reasons why each result was selected
  • 🔗 IDE navigation - Direct links to VS Code, PyCharm, and other editors
  • ⏰ File recency - Git-based modification timestamps
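If you post-process results in your own scripts, the metadata above could be modeled roughly as follows. This is only a sketch: the class and field names are hypothetical and do not come from Turboprop's API. The link format mirrors the `vscode://file/...` example shown in the sample result.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Hypothetical container for one search hit (names are assumptions)."""

    path: str
    start_line: int
    end_line: int
    confidence: float  # 0.0-1.0, as described above
    language: str
    construct_type: str
    explanation: str

    def vscode_link(self) -> str:
        # Same shape as the "Navigate" line in the example result.
        return f"vscode://file/{self.path}:{self.start_line}"

    def header(self) -> str:
        return (
            f"{self.path}:{self.start_line}-{self.end_line} "
            f"(confidence: {self.confidence:.2f}) "
            f"{self.language} [{self.construct_type}]"
        )


result = SearchResult(
    path="src/auth/validators.py",
    start_line=28,
    end_line=35,
    confidence=0.92,
    language="python",
    construct_type="function",
    explanation="Strong semantic match for JWT validation logic",
)
```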

🎯 Advanced Ranking System

Multi-factor ranking considers multiple relevance signals:

  • Semantic similarity (40%) - How well the meaning matches
  • File type relevance (20%) - Language and file type matching
  • Construct type match (15%) - Code structure alignment
  • File recency (15%) - Recently modified files boost
  • File size optimization (10%) - Prefers appropriately-sized files
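One plausible reading of these percentages is a weighted sum of normalized signals. The sketch below assumes each signal has already been scaled to the 0.0-1.0 range; the dictionary keys and the function name are made up for illustration, and the actual ranking code may combine the signals differently.

```python
# Weights taken from the list above; they add up to 1.0.
WEIGHTS = {
    "semantic_similarity": 0.40,
    "file_type_relevance": 0.20,
    "construct_type_match": 0.15,
    "file_recency": 0.15,
    "file_size": 0.10,
}


def combined_score(signals):
    """Weighted sum of per-result signals; missing signals count as 0.0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical signals for a recently touched Python function.
example = combined_score(
    {
        "semantic_similarity": 0.9,
        "file_type_relevance": 1.0,
        "construct_type_match": 1.0,
        "file_recency": 0.5,
        "file_size": 0.8,
    }
)
```

With these inputs the semantic signal contributes 0.36 of the total, so a strong semantic match still dominates the other factors.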

🔧 Powerful Configuration

Fine-tune search behavior with comprehensive configuration:

# Environment variables
export TURBOPROP_SEARCH_MODE=hybrid
export TURBOPROP_MAX_FILE_SIZE_MB=2.0
export TURBOPROP_SNIPPET_CONTEXT_LINES=5
export TURBOPROP_RRF_K=80

# Or use a configuration file (create the directory first if it does not exist yet)
mkdir -p .turboprop
echo '{"search": {"mode": "hybrid", "max_results": 15}}' > .turboprop/config.json
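When both sources are present, a tool has to decide which one wins. The sketch below shows one common convention (defaults, then the config file, then environment variables). The precedence order, the default values, and the `load_search_config` helper are assumptions made for this example; only the variable name and the file path come from the snippet above.

```python
import json
import os
from pathlib import Path

# Assumed defaults for illustration only.
DEFAULTS = {"mode": "auto", "max_results": 5}


def load_search_config(repo_root=".", environ=None):
    """Merge defaults, .turboprop/config.json, and environment overrides."""
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)

    config_path = Path(repo_root) / ".turboprop" / "config.json"
    if config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        config.update(data.get("search", {}))

    # Assumption: environment variables take precedence over the file.
    if "TURBOPROP_SEARCH_MODE" in environ:
        config["mode"] = environ["TURBOPROP_SEARCH_MODE"]
    return config
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.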

🧠 MCP Tool Search System

Turboprop also includes an MCP Tool Search System that helps Claude Code and other AI agents discover and select tools. Instead of having to know in advance which tools exist, you describe the task in natural language to find tools, compare them, and get recommendations for the task at hand.

🔍 Key Capabilities

🔍 Natural Language Tool Discovery

# Find tools by functionality
search_mcp_tools("read configuration files safely")
search_mcp_tools("execute shell commands with timeout support")
search_mcp_tools("web scraping tools for JSON APIs")

🎯 Intelligent Tool Recommendations

# Get personalized recommendations based on task context
recommend_tools_for_task(
    "process CSV files and generate reports",
    context="performance critical, large files",
    complexity_preference="balanced"
)

⚖️ Tool Comparison & Decision Support

# Compare multiple tools to understand trade-offs
compare_mcp_tools(["read", "write", "edit"])
find_tool_alternatives("bash", context_filter="beginner-friendly")

🎯 Search Strategies

The system supports multiple search modes optimized for different scenarios:

  • Hybrid Search (Recommended) - Combines semantic understanding with keyword matching
  • Semantic Search - Pure conceptual search for finding tools by purpose
  • Keyword Search - Fast exact term matching for specific tool names

🧪 Claude Code Integration

With the tool search system, Claude Code can surface suggestions such as:

🔍 Tool Search Suggestion: For this file processing task, consider using 'read'
instead of 'bash cat' for better error handling and parameter validation.

Confidence: High (0.91)
Reasoning: Direct file reading, built-in error handling, type safety

Claude Code can now:

  • Proactively suggest optimal tools during conversations
  • Select tools in a context-aware way, based on your specific requirements
  • Learn from patterns to improve future recommendations
  • Explain reasoning behind tool choices

📚 Complete Documentation

The tool search system includes comprehensive documentation.

🚀 Getting Started with Tool Search

The tool search system activates automatically with Claude Code. Try these enhanced prompts:

Tool Discovery:

  • "What tools are available for processing JSON data?"
  • "Find tools that can execute shell commands safely"
  • "Search for web scraping tools with error handling"

Task-Based Recommendations:

  • "What's the best tool for reading large configuration files?"
  • "Recommend tools for automating deployment tasks"
  • "I need to parse CSV files - what should I use?"

Tool Comparison:

  • "Compare the 'read' and 'bash' tools for file operations"
  • "What are the alternatives to the 'write' tool?"
  • "Help me choose between different web request tools"

🚀 Quick Start with Claude Code

MCP Installation

Add this to your Claude Code MCP configuration:

{
  "mcpServers": {
    "turboprop": {
      "command": "uvx",
      "args": ["turboprop@latest", "mcp", "--repository", ".", "--auto-index"],
      "env": {}
    }
  }
}

Sample Claude Code Prompts

Once installed, try these prompts with Claude Code:

Search for specific patterns:

  • "Use the turboprop tools to find JWT authentication code in this repository"
  • "Search the codebase for error handling middleware patterns"
  • "Find React components that handle form validation"
  • "Look for database connection setup code"

Index management:

  • "Index this repository with turboprop and then search for API route handlers"
  • "Check the turboprop index status and tell me what files are indexed"
  • "Reindex this codebase and search for logging implementations"

Development workflow:

  • "Use turboprop to find code similar to what I'm working on and explain the patterns"
  • "Search for examples of how authentication is implemented in this project"
  • "Find all places where JSON parsing happens and show me the different approaches"

Available MCP Tools

Claude Code can use these tools automatically:

  • index_repository - Build searchable index from your codebase
  • search_code - Perform semantic search with natural language
  • get_index_status - Check index health and file counts
  • watch_repository - Monitor for changes (auto-enabled by default)
  • list_indexed_files - Show what files are in the index

Custom Slash Commands

Turboprop includes custom slash commands for Claude Code. Add these to your .claude/ directory:

/search [query] - Quick semantic search

/search JWT authentication
/search error handling patterns  
/search React form components

/index [path] - Index a repository

/index .
/index /path/to/project

/status - Check index status

/status

⚙️ Standalone CLI Usage

Installation

# Install globally
pip install turboprop

# Or run it without a permanent install via uv (recommended)
uvx turboprop

Core Commands

Index your codebase:

turboprop index .                    # Index current directory  
turboprop index ~/my-project         # Index specific project
turboprop index . --max-mb 2.0      # Allow larger files
turboprop index . --force-all        # Force reprocessing

Search with natural language:

turboprop search "JWT authentication"              # Find auth code
turboprop search "parse JSON response"             # JSON parsing logic  
turboprop search "error handling middleware"       # Error patterns
turboprop search "React component for forms"       # Form components

Watch for live updates:

turboprop watch .                    # Monitor current directory
turboprop watch . --debounce-sec 3.0 # Faster updates

Start MCP server:

turboprop mcp --repository . --auto-index    # Full auto mode
turboprop mcp --repository . --no-auto-watch # Manual updates only

🔧 Advanced Features

Concurrent Access Protection

Turboprop uses advanced file locking to prevent database corruption:

  • Process-safe indexing - Multiple processes can safely access the same repository
  • Atomic operations - Index updates are completed fully or rolled back
  • Deadlock prevention - Smart lock ordering prevents system hangs
  • Graceful recovery - Automatic cleanup of stale locks on restart
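This README does not describe how the locking is implemented. As a rough illustration of the general technique, the sketch below uses an exclusive lock file with a timeout and stale-lock cleanup. The helper name, its parameters, and the five-minute staleness threshold are all assumptions, and a production implementation would need to handle more edge cases (for example, two processes racing to remove the same stale lock).

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def repository_lock(lock_path, timeout=10.0, poll_interval=0.1, stale_after=300.0):
    """Hold an exclusive lock file for the duration of the with-block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            try:
                age = time.time() - os.path.getmtime(lock_path)
            except FileNotFoundError:
                continue  # The holder released the lock; retry at once.
            if age > stale_after:
                # Treat an old lock as left behind by a crashed process.
                try:
                    os.remove(lock_path)
                except FileNotFoundError:
                    pass
                continue
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)
    try:
        # Record the owner's PID to help with debugging.
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield
    finally:
        os.remove(lock_path)
```

Usage would look like `with repository_lock(".turboprop/index.lock"): ...`, where the lock path is again a made-up example.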

Performance Optimization

Smart file filtering:

  • Respects .gitignore automatically
  • Configurable file size limits (--max-mb)
  • Skips binary and generated files
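A filter along these lines can be approximated in a few lines. The sketch below is an assumption about how such checks are commonly written, not Turboprop's code: it treats a megabyte as 1024 * 1024 bytes and uses a NUL byte in the first 8 KiB as a binary-file heuristic. The `should_index` name is hypothetical.

```python
def should_index(data: bytes, max_mb: float = 1.0) -> bool:
    """Return True if file content looks like indexable text within the size limit."""
    if len(data) > max_mb * 1024 * 1024:
        return False
    # Heuristic: text files rarely contain NUL bytes near the start.
    return b"\0" not in data[:8192]
```

The default of 1.0 mirrors the `--max-mb` default documented under Configuration Options.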

Efficient indexing:

  • Parallel processing with worker pools
  • Incremental updates (only changed files)
  • Memory-efficient batch processing
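Incremental updates generally come down to comparing what is on disk with what the index already knows. Here is a small sketch of that comparison using SHA-256 content hashes; the function names and the dictionary-based inputs are assumptions for illustration, not the indexer's real interface.

```python
import hashlib


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def plan_incremental_update(indexed, current):
    """Decide which files need work.

    indexed: mapping of path -> hash stored in the index
    current: mapping of path -> file content (bytes) found on disk
    Returns (paths to embed again, paths to drop from the index).
    """
    to_embed = [
        path for path, data in current.items()
        if indexed.get(path) != content_hash(data)
    ]
    to_delete = [path for path in indexed if path not in current]
    return sorted(to_embed), sorted(to_delete)
```

Unchanged files are skipped entirely, which is what keeps re-indexing cheap on large repositories.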

Fast search:

  • Native DuckDB vector operations
  • 384-dimension embeddings for accuracy
  • Cosine similarity ranking
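Turboprop runs this comparison inside DuckDB, but the underlying math is simple. The pure-Python sketch below shows cosine similarity and a top-k selection over a tiny in-memory index; it is meant to explain the ranking idea, not to reproduce the actual query. The function names and the toy two-dimensional vectors are assumptions (real embeddings have 384 dimensions).

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Avoid division by zero for empty or all-zero vectors.
    return dot / (norm_a * norm_b)


def top_k(query, indexed, k=5):
    """Return the k best (score, path) pairs, highest similarity first."""
    scored = [(cosine_similarity(query, vec), path) for path, vec in indexed.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:k]
```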

Configuration Options

File size limits:

--max-mb 1.0    # Default: 1MB max file size
--max-mb 5.0    # Allow larger files
--max-mb 0.1    # Strict limit for huge repos

Watch mode timing:

--debounce-sec 5.0   # Default: 5 second debounce
--debounce-sec 1.0   # Faster updates  
--debounce-sec 10.0  # Less CPU usage
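Debouncing means waiting for a quiet period before reacting, so that a burst of saves triggers one re-index instead of many. The following sketch illustrates the idea with an injectable clock. The `Debouncer` class is hypothetical and is not how watch mode is necessarily implemented.

```python
import time


class Debouncer:
    """Collect changed paths and release them after a quiet period."""

    def __init__(self, delay_sec=5.0, clock=time.monotonic):
        self.delay_sec = delay_sec
        self._clock = clock
        self._pending = set()
        self._last_event = None

    def record(self, path):
        # Every new event restarts the quiet-period timer.
        self._pending.add(path)
        self._last_event = self._clock()

    def flush_if_quiet(self):
        """Return the pending paths if no event arrived for delay_sec, else []."""
        if not self._pending:
            return []
        if self._clock() - self._last_event < self.delay_sec:
            return []
        batch = sorted(self._pending)
        self._pending.clear()
        return batch
```

A shorter delay reacts faster but re-indexes more often, which matches the trade-off noted in the comments above.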

Search results:

--k 5     # Default: 5 results
--k 10    # More results
--k 1     # Just the best match

💡 Search Query Tips

Effective Query Patterns

Be descriptive and specific:

  • ✅ "JWT token validation middleware"
  • ❌ "auth"

Ask conceptual questions:

  • ✅ "how to handle database connection errors"
  • ❌ "try catch db"

Combine multiple concepts:

  • ✅ "React form validation with custom hooks"
  • ❌ "react forms"

Use domain-specific language:

  • ✅ "OAuth2 authorization flow implementation"
  • ❌ "login stuff"

Example Queries by Use Case

Authentication & Security:

  • "JWT token validation and refresh logic"
  • "password hashing and salt generation"
  • "OAuth2 provider integration code"
  • "session management middleware"

API & Data:

  • "REST API error handling patterns"
  • "JSON schema validation logic"
  • "database query optimization"
  • "caching layer implementation"

Frontend & UI:

  • "React component state management"
  • "form validation with error messages"
  • "responsive design utility classes"
  • "event handler patterns"

🏗️ Architecture & Technical Details

Database Schema

CREATE TABLE code_files (
  id VARCHAR PRIMARY KEY,        -- SHA-256 hash of path + content
  path VARCHAR,                  -- Absolute file path  
  content TEXT,                  -- Full file content
  embedding DOUBLE[384]          -- 384-dimension vector embeddings
);

ML Model

  • Model: SentenceTransformer "all-MiniLM-L6-v2"
  • Dimensions: 384 (balanced accuracy/speed)
  • Similarity: Cosine similarity via DuckDB vector operations

File System

  • Index location: .turboprop/code_index.duckdb in each repository
  • Git integration: Uses git ls-files for file discovery
  • Ignore handling: Respects .gitignore automatically
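For reference, file discovery through `git ls-files` can be sketched as below. The exact flags Turboprop passes are not documented here, so the combination used in this example (tracked files plus untracked files that are not ignored) is an assumption, as are the function names.

```python
import subprocess


def parse_ls_files(output: bytes):
    """Split NUL-separated `git ls-files -z` output into a list of paths."""
    return [
        entry.decode("utf-8", errors="surrogateescape")
        for entry in output.split(b"\0")
        if entry
    ]


def discover_files(repo_root="."):
    """List tracked and untracked-but-not-ignored files in a Git repository."""
    result = subprocess.run(
        ["git", "ls-files", "-z", "--cached", "--others", "--exclude-standard"],
        cwd=repo_root,
        check=True,
        capture_output=True,
    )
    return parse_ls_files(result.stdout)
```

The `-z` flag avoids quoting problems with paths that contain spaces or non-ASCII characters.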

📚 Comprehensive Documentation

Turboprop includes extensive documentation to help you master enhanced search capabilities:

📖 User Guides

🔧 Developer Resources

🚀 Quick References

  • Search Modes: AUTO (recommended), HYBRID, SEMANTIC, TEXT
  • Result Metadata: Confidence scores, match explanations, IDE navigation, language detection
  • Configuration: Environment variables, config files, response detail levels
  • MCP Integration: Structured responses, query analysis, Claude Code prompts

💡 Learning Path

  1. Start Here: Read the Enhanced Search Guide for effective query techniques
  2. Upgrading: Follow the Migration Guide if coming from basic search
  3. Deep Dive: Explore the API Documentation for programmatic usage
  4. Contributing: Check the Architecture Guide for development setup

🤝 Contributing

Key areas for contribution:

  • Language-specific improvements (better syntax highlighting, smart parsing)
  • Performance optimizations for enormous codebases
  • IDE/editor plugin development
  • Advanced search features (regex filters, file type limits)
  • Better error recovery and user guidance

📄 License

MIT License - use freely in your projects!


Ready to supercharge your code exploration with Claude Code? 🚀✨

Turboprop: Because finding code should be as smooth as flying.
