Lightning-fast semantic code search and indexing with DuckDB vector operations

Turboprop 🚀

A lightning-fast, aviation-inspired semantic code search & indexing system with MCP integration.

Turboprop transforms your codebase into a searchable knowledge base using AI embeddings and vector search. Instead of grepping for exact text matches, ask questions in natural language and find conceptually similar code across your entire repository.

✨ Key Features

  • 🔍 Semantic Search: Find code by meaning, not just keywords ("JWT authentication" finds auth logic)
  • 🐆 Lightning Fast: DuckDB + HNSWLib for sub-second search across large codebases
  • 🔄 Live Updates: Watch mode with intelligent debouncing - your index stays fresh as you code
  • 🚀 HTTP API: RESTful FastAPI server for integration with tools and IDEs
  • 🤖 MCP Ready: Perfect integration with Claude and other AI coding assistants
  • 📁 Git-Aware: Respects .gitignore and only indexes what matters
  • 💬 Rich CLI: Beautiful command-line interface with helpful guidance and examples

🎯 Use Cases

  • "Find similar functions": Search for "parse JSON response" and discover all JSON parsing code
  • Documentation queries: "error handling patterns" reveals how your team handles exceptions
  • Architectural exploration: "database connection setup" shows all DB-related initialization
  • Code review assistance: Quickly find existing implementations before writing new code
  • AI pair programming: Let Claude explore your codebase through natural language queries
  • Onboarding new developers: Help them explore the codebase through semantic search

🚀 Quick Start

1. Installation

Option A: Using uvx (Recommended - Like npx for Python)

# No installation needed! Run directly:
uvx turboprop-mcp --repository /path/to/your/repo

# Or install globally for repeated use:
pip install turboprop

Option B: Development Setup

# Clone and setup for development
git clone https://github.com/glamp/turboprop.git
cd turboprop

# Install with uv (recommended - fastest)
uv sync

# OR install with pip + virtual environment
python3 -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .

2. Index Your Code

# Using installed version
turboprop index .

# Or run directly without installation
uvx turboprop index .
# OR if you have activated the virtual environment:
# python code_index.py index .

# Example output:
# 🚀 Turboprop - Semantic Code Search
# ========================================
# ⚡ Initializing database and AI model...
# 🔍 Scanning repository: .
# 📏 Max file size: 1.0 MB
# ✨ Found 127 code files to index
# 🧠 Generating embeddings and storing in database...
# 🔧 Building search index...
# 🎉 Index ready with 127 embeddings!

3. Search Your Code

# Search using natural language with beautiful results
uv run python code_index.py search "JWT authentication" --k 5

# Example output:
# 🔎 Searching for: "JWT authentication"
# 📊 Returning top 5 results...
#
# 🎯 Found 3 relevant results:
#
# ┌─ 1. src/auth/middleware.py
# │  📈 Similarity: 94.2%
# │
# │  def verify_jwt_token(token: str):
# │      """Verify JWT token and extract user claims"""
# └──────────────────────────────────────────────────

⚙️ CLI Usage

python code_index.py index /path/to/repo --max-mb 1.0
python code_index.py search "terms" --k 5
python code_index.py watch /path/to/repo --max-mb 1.0 --debounce-sec 5.0

🤖 MCP Server (Claude Integration)

Quick Start with uvx (Recommended)

# Start MCP server directly (like npx)
uvx turboprop-mcp --repository /path/to/your/repo --auto-index

# For continuous use
pip install turboprop
turboprop-mcp --repository /path/to/your/repo

Claude Desktop Configuration

Add to your Claude Desktop MCP configuration:

{
  "mcpServers": {
    "turboprop": {
      "command": "uvx",
      "args": ["turboprop-mcp", "--repository", "/absolute/path/to/your/repo", "--auto-index"]
    }
  }
}
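
The same entry can also be generated from a script, which may help when the repository path differs per machine. The following is a minimal sketch: the helper names turboprop_mcp_entry and merge_mcp_server are hypothetical, the command and arguments are copied from the configuration above, and the location of the Claude Desktop configuration file is not stated here, so the sketch only builds the JSON and does not write it anywhere.

```python
import json


def turboprop_mcp_entry(repo_path: str, auto_index: bool = True) -> dict:
    """Build the "turboprop" server entry shown in the configuration above."""
    args = ["turboprop-mcp", "--repository", repo_path]
    if auto_index:
        args.append("--auto-index")
    return {"command": "uvx", "args": args}


def merge_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {}}  # stand-in for an already loaded config file
    updated = merge_mcp_server(
        existing, "turboprop", turboprop_mcp_entry("/absolute/path/to/your/repo")
    )
    print(json.dumps(updated, indent=2))
```

Merging into a copy leaves any other configured servers untouched and avoids mutating the loaded configuration in place.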

Features Available in Claude

  • 🔍 Semantic Code Search: Ask Claude to find code using natural language
  • 📁 Repository Indexing: Automatically index and watch your codebase
  • 🔄 Real-time Updates: Index stays fresh as you code
  • 💬 Natural Queries: "Find JWT authentication code", "Show error handling patterns"

HTTP API Server

Run the HTTP API server:

uvicorn server:app --reload  # or: uv run uvicorn server:app --reload

🧠 Optimized for Claude Code

Add .claude/code-index.commands.md for slash commands.


That's it. Easy as pie. 🍰🚀

💻 Enhanced CLI Experience

Turboprop now features a beautiful, user-friendly CLI with:

Rich Help System

uv run python code_index.py --help     # Main help with examples
uv run python code_index.py index --help    # Detailed command help
uv run python code_index.py search --help   # Search-specific guidance

Visual Feedback

  • 🚀 Branded startup messages
  • ⚡ Progress indicators with emojis
  • 📊 Rich search result formatting
  • 💡 Helpful tips and suggestions
  • ❌ Clear error messages with solutions

Smart Error Handling

  • Missing index detection with guidance
  • File size limit recommendations
  • Search result explanations
  • Recovery suggestions

🔗 MCP Integration (Claude & AI Assistants)

Quick Setup for Claude

  1. Start the MCP server:

    uv run uvicorn server:app --host localhost --port 8000
    
  2. Index your repository:

    uv run python code_index.py index /path/to/your/repo
    
  3. Use in Claude with slash commands:

    • /search JWT authentication - Find auth-related code
    • /search React form validation - Find form components
    • /status - Check index health

Available MCP Tools

  • index_repository - Build searchable index from code repository
  • search_code - Search code using natural language queries
  • get_index_status - Check current state of the code index
  • watch_repository - Monitor repository for changes
  • list_indexed_files - Show files currently in the index

MCP Configuration

The repository includes ready-to-use MCP configuration:

  • .mcp/config.json - Server configuration
  • .claude/turboprop-commands.md - Claude slash commands

⚙️ Complete CLI Reference

index - Build Search Index

uv run python code_index.py index <repository_path> [options]

Options:
  --max-mb FLOAT    Maximum file size in MB to index (default: 1.0)

Examples:
  uv run python code_index.py index .                    # Index current directory
  uv run python code_index.py index ~/my-project         # Index specific project
  uv run python code_index.py index . --max-mb 2.0      # Allow larger files
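
To illustrate what a size limit such as --max-mb implies, here is a minimal sketch of filtering files by size. It is not Turboprop's scanner: the function name files_within_limit is hypothetical, the conversion of 1 MB to 1024 * 1024 bytes is an assumption, and the sketch ignores .gitignore rules and file-type filtering.

```python
from pathlib import Path


def files_within_limit(root: str, max_mb: float = 1.0) -> list[Path]:
    """Return files under root whose size is at most max_mb megabytes."""
    limit_bytes = int(max_mb * 1024 * 1024)  # assumption: 1 MB = 1024 * 1024 bytes
    selected = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories; keep only regular files at or below the limit.
        if path.is_file() and path.stat().st_size <= limit_bytes:
            selected.append(path)
    return selected
```

Running it with different limits before indexing gives a rough idea of how many files a higher --max-mb would add.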

search - Semantic Code Search

uv run python code_index.py search "<query>" [options]

Options:
  --k INTEGER      Number of results to return (default: 5)

Query Examples:
  "JWT authentication"              → Find auth-related code
  "parse JSON response"             → Discover JSON parsing logic
  "error handling middleware"       → Locate error handling patterns
  "database connection setup"       → Find DB initialization code
  "React component for forms"       → Find form-related components

watch - Live Index Updates

uv run python code_index.py watch <repository_path> [options]

Options:
  --max-mb FLOAT        Maximum file size in MB (default: 1.0)
  --debounce-sec FLOAT  Seconds to wait before processing changes (default: 5.0)

Example:
  uv run python code_index.py watch . --debounce-sec 3.0  # Faster updates
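
The idea behind --debounce-sec can be sketched as follows: changes are collected, and processing only starts once no new change has arrived for the configured number of seconds. This is an illustrative sketch, not Turboprop's watcher; the Debouncer class and its methods are hypothetical names, and the clock is injectable so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Optional


class Debouncer:
    """Collect changed paths and release them after a quiet period."""

    def __init__(self, delay_sec: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.delay_sec = delay_sec
        self._clock = clock
        self._pending: set[str] = set()
        self._last_event: Optional[float] = None

    def record(self, path: str) -> None:
        """Note a change; each new event restarts the quiet-period timer."""
        self._pending.add(path)
        self._last_event = self._clock()

    def flush_if_quiet(self) -> list[str]:
        """Return the pending paths once no event arrived for delay_sec."""
        if self._last_event is None:
            return []
        if self._clock() - self._last_event < self.delay_sec:
            return []
        ready = sorted(self._pending)
        self._pending.clear()
        self._last_event = None
        return ready
```

With this design a burst of saves to the same file results in a single re-index of that file, which is why a higher delay costs less CPU and a lower one reacts sooner.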

🌐 HTTP API Server

Start the Server

# Development mode with auto-reload (using uv)
uv run uvicorn server:app --reload --host 0.0.0.0 --port 8000

# Production mode
uv run uvicorn server:app --host 0.0.0.0 --port 8000

# OR with activated virtual environment
uvicorn server:app --reload --host 0.0.0.0 --port 8000

API Endpoints

POST /index - Build Index

curl -X POST "http://localhost:8000/index" \
  -H "Content-Type: application/json" \
  -d '{"repo": "/path/to/repository", "max_mb": 1.0}'

# Response: {"status": "indexed", "files": 1247}
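
The same call can be made from Python with only the standard library. This is a sketch under the assumption that the server accepts exactly the JSON body shown in the curl example; build_index_request and send are hypothetical helper names, and send needs a running server.

```python
import json
import urllib.request


def build_index_request(base_url: str, repo: str,
                        max_mb: float = 1.0) -> urllib.request.Request:
    """Prepare (but do not send) the POST /index request shown above."""
    payload = json.dumps({"repo": repo, "max_mb": max_mb}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/index",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body; needs a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the first part checkable offline; indexing a large repository may take longer than the default timeout used here.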

GET /search - Search Code

curl "http://localhost:8000/search?query=JWT%20authentication&k=5"

# Response:
[
  {
    "path": "/src/auth/middleware.py",
    "snippet": "def verify_jwt_token(token: str):\n    \"\"\"Verify JWT token and extract claims...",
    "distance": 0.234
  }
]
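
A small client-side sketch for this endpoint: one helper builds the percent-encoded URL used in the curl example, another parses a response of the shape shown above. The helper names are hypothetical, and ordering by ascending distance assumes that a smaller distance means a closer match.

```python
import json
from urllib.parse import quote, urlencode


def build_search_url(base_url: str, query: str, k: int = 5) -> str:
    """Build the GET /search URL, percent-encoding the query text."""
    params = urlencode({"query": query, "k": k}, quote_via=quote)
    return f"{base_url.rstrip('/')}/search?{params}"


def closest_first(response_text: str) -> list[tuple[str, float]]:
    """Parse a /search response into (path, distance) pairs, smallest distance first."""
    results = json.loads(response_text)
    pairs = [(item["path"], float(item["distance"])) for item in results]
    return sorted(pairs, key=lambda pair: pair[1])


print(build_search_url("http://localhost:8000", "JWT authentication", 5))
# prints: http://localhost:8000/search?query=JWT%20authentication&k=5
```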

GET /status - Index Status

curl "http://localhost:8000/status"

# Response:
{
  "files_indexed": 1247,
  "database_size_mb": 125.6,
  "search_index_ready": true,
  "last_updated": "Recent",
  "embedding_dimensions": 384,
  "model_name": "all-MiniLM-L6-v2"
}
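
The status fields allow a rough back-of-the-envelope estimate of how much space the raw vectors occupy. The sketch below assumes one embedding per indexed file and 4-byte (float32) values; neither is confirmed here, and raw_vector_bytes is a hypothetical helper.

```python
def raw_vector_bytes(files_indexed: int, embedding_dimensions: int,
                     bytes_per_value: int = 4) -> int:
    """Estimate bytes used by the raw embedding vectors alone.

    Assumptions: one embedding per indexed file, 4-byte (float32) values.
    """
    return files_indexed * embedding_dimensions * bytes_per_value


status = {"files_indexed": 1247, "embedding_dimensions": 384}
total = raw_vector_bytes(status["files_indexed"], status["embedding_dimensions"])
print(f"{total} bytes, about {total / 1_000_000:.2f} MB")
# prints: 1915392 bytes, about 1.92 MB
```

Under these assumptions the vectors account for only a small part of the database_size_mb value in the sample response, so most of the reported size would come from other stored data.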

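Interactive API Documentation

FastAPI serves interactive documentation by default, so with the server running on port 8000 it should be available at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc), unless the application disables those routes.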

💡 Pro Tips & Best Practices

Search Query Optimization

  • Use descriptive phrases: "authentication middleware" vs just "auth"
  • Ask conceptual questions: "how to handle errors" vs "try catch"
  • Combine multiple concepts: "JWT token validation middleware"
  • Be domain-specific: "React form validation" vs "form validation"

Performance Tuning

  • File size limit: Adjust --max-mb based on your repository size and available memory
  • Debounce timing: Lower --debounce-sec for faster updates, higher for less CPU usage
  • Search results: Increase --k to see more results, decrease for faster queries

File Management

  • Index files: code_index.duckdb (database) and hnsw_code.idx (search index)
  • Location: Created in the current working directory
  • Cleanup: Delete these files to reset the index completely
  • Backup: Copy these files to backup your index
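
The backup and cleanup steps above can be scripted. This is a minimal sketch using the two file names listed above; backup_index and reset_index are hypothetical helpers, not part of Turboprop. Copying while an indexer or watcher is writing may capture an inconsistent state, so it is safer to stop those first.

```python
import shutil
from pathlib import Path

INDEX_FILES = ("code_index.duckdb", "hnsw_code.idx")


def backup_index(work_dir: str, backup_dir: str) -> list[Path]:
    """Copy whichever index files exist in work_dir into backup_dir."""
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in INDEX_FILES:
        source = Path(work_dir) / name
        if source.is_file():
            copied.append(Path(shutil.copy2(source, target / name)))
    return copied


def reset_index(work_dir: str) -> list[str]:
    """Delete the index files in work_dir and return the names removed."""
    removed = []
    for name in INDEX_FILES:
        path = Path(work_dir) / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```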

🏗️ Architecture Overview

┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  Git Repository  │───▶│  File Scanner    │───▶│  Code Files      │
│  (.gitignore     │    │  (scan_repo)     │    │  (.py, .js, etc) │
│   aware)         │    └──────────────────┘    └──────────────────┘
└──────────────────┘                                     │
                                                         ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  Search Results  │◀───│  HNSW Index      │◀───│  Embeddings      │
│  (ranked by      │    │  (fast vector    │    │  (ML model:      │
│   similarity)    │    │   search)        │    │   all-MiniLM)    │
└──────────────────┘    └──────────────────┘    └──────────────────┘
                                 ▲                       │
                                 │                       ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  User Query      │───▶│  Query Embedding │    │  DuckDB Storage  │
│  ("parse JSON")  │    │  (same model)    │    │  (persistent)    │
└──────────────────┘    └──────────────────┘    └──────────────────┘
                                 ▲
                                 │
                        ┌──────────────────┐
                        │  FastAPI Server  │ ←── MCP Integration
                        │  (HTTP API)      │     Claude, etc.
                        └──────────────────┘
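
The data flow in the diagram can be illustrated with a toy, dependency-free sketch. It is not Turboprop's implementation: a normalized word-count vector stands in for the all-MiniLM embedding model, a brute-force scan stands in for the HNSW index, and an in-memory dict stands in for DuckDB. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str, vocabulary: list[str]) -> list[float]:
    """Toy embedding: L2-normalized word counts over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    vector = [float(counts[word]) for word in vocabulary]
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector


def build_index(files: dict[str, str]) -> tuple[list[str], dict[str, list[float]]]:
    """Scan, embed, store: returns the vocabulary and one vector per file."""
    vocabulary = sorted({w for content in files.values() for w in tokenize(content)})
    vectors = {path: embed(content, vocabulary) for path, content in files.items()}
    return vocabulary, vectors


def search(vocabulary: list[str], vectors: dict[str, list[float]],
           query: str, k: int = 5) -> list[tuple[str, float]]:
    """Embed the query the same way and rank files by cosine similarity."""
    query_vector = embed(query, vocabulary)
    scored = [
        (path, sum(a * b for a, b in zip(vector, query_vector)))
        for path, vector in vectors.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


files = {
    "auth.py": "def verify_jwt_token(token): verify jwt token and extract claims",
    "db.py": "def connect(): open database connection pool",
}
vocabulary, vectors = build_index(files)
print(search(vocabulary, vectors, "jwt token", k=1)[0][0])  # prints: auth.py
```

The key point the sketch shares with the diagram is that files and queries pass through the same embedding function, so similarity between their vectors is meaningful. A real embedding model additionally matches related wording that shares no literal words, which word counts cannot do.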

🤝 Contributing

We welcome contributions! Key areas for improvement:

  • Additional programming language support
  • Performance optimizations for large repositories
  • IDE/editor integrations
  • Advanced search features (regex, file filters, etc.)
  • Better error handling and user feedback
  • Enhanced MCP tool capabilities

📄 License

MIT License - feel free to use this in your projects!


Ready to supercharge your code exploration? Get started in 60 seconds! 🚀✨

Turboprop: Because finding code should be as smooth as flying.
