
LeIndex: AI-powered code search and indexing system with MCP integration


LeIndex


AI-Powered Code Search That Actually Understands Your Code

Lightning-fast semantic code search with no external services to set up. Find code by meaning, not just by matching text.


[Figure: LeIndex Architecture. The LeIndex experience - powerful, fast, and beautiful]


✨ What Makes LeIndex Special?

LeIndex isn't just another code search tool. It's your intelligent code companion that understands what you're looking for, not just where it might be typed.

Imagine searching for "authentication flow" and finding not just files containing those words, but the actual authentication logic, login handlers, session management, and security patterns - even if they're named completely differently. That's the magic of semantic search! 🎯


🚀 Quick Start (You'll Be Searching in Under 2 Minutes. It's Easier Than Making Coffee!)

One-Click Installation

The easiest way to get started:

Requirements

  • Python 3.10 or higher
  • 4GB RAM minimum (8GB+ for large codebases)
  • About 1GB disk space

Linux/Unix:

curl -sSL https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install.sh | bash

macOS:

curl -sSL https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install_macos.sh | bash

Windows:

irm https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install.ps1 | iex

That's it. The installer will:

  • ✅ Install LeIndex MCP server
  • ✅ Detect your AI tools (Claude Code, Cursor, etc.)
  • ✅ Configure integrations automatically
  • ✅ Install optional skills for enhanced workflows

Manual installation? See below ↓


# Install LeIndex - seriously, that's it
pip install leindex

# Index your codebase (no Docker, no databases, no headache)
leindex init /path/to/your/project
leindex index /path/to/your/project

# Search like a wizard
leindex-search "authentication logic"

# Or use it via MCP in Claude, Cursor, or your favorite AI assistant
# LeIndex MCP server does the heavy lifting automatically!

OR

PIP Install

pip install leindex

That's literally it. No Docker. No databases. No configuration files (unless you want them). Just works. ✨

Verify It's Alive

leindex --version
# Output: LeIndex 2.0.2 - Ready to search! 🚀

Install from Source (For the Adventurous)

git clone https://github.com/scooter-lacroix/leindex.git
cd leindex
pip install -e .

Boom! You're now searching your codebase at the speed of thought. 🎉


🎯 Why Developers Love LeIndex

🔥 Zero External Services, Zero Drama

  • No Docker - Your laptop will thank you
  • No PostgreSQL - No database setup nightmares
  • No Elasticsearch - No Java memory leaks
  • No RabbitMQ - No message queue complexity
  • Just one Python package - pip install and you're done

⚡ Blazing Fast Performance

  • LEANN vector search - Find similar code in milliseconds
  • Tantivy full-text search - A Lucene-inspired search engine written in Rust
  • Hybrid scoring - Best of both worlds: semantic + lexical
  • Handles 100K+ files - Scale from side projects to monorepos
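
The hybrid scoring idea above can be sketched as a weighted blend of a semantic (vector) score and a lexical (full-text) score. This is only an illustration, not LeIndex's actual ranking formula: the function names, the min-max normalization, and the `alpha` weight are all assumptions made for this example.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw scores to [0, 1] so the two signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every hit as equally strong.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(
    semantic: Dict[str, float], lexical: Dict[str, float], alpha: float = 0.5
) -> Dict[str, float]:
    """Blend the two signals; alpha is the (assumed) weight of the semantic score."""
    sem = min_max_normalize(semantic)
    lex = min_max_normalize(lexical)
    docs = set(sem) | set(lex)
    # A document missing from one result list contributes 0 for that signal.
    return {d: alpha * sem.get(d, 0.0) + (1 - alpha) * lex.get(d, 0.0) for d in docs}


def rank(
    semantic: Dict[str, float], lexical: Dict[str, float], alpha: float = 0.5
) -> List[str]:
    """Return document names, best first (ties broken alphabetically)."""
    combined = hybrid_scores(semantic, lexical, alpha)
    return sorted(combined, key=lambda d: (-combined[d], d))


# Made-up scores for illustration only.
semantic_hits = {"auth/login.py": 0.9, "auth/session.py": 0.7, "utils/strings.py": 0.1}
lexical_hits = {"auth/login.py": 2.0, "docs/auth.md": 8.0}
print(rank(semantic_hits, lexical_hits, alpha=0.7))
```

With a higher `alpha`, files that are conceptually close to the query outrank files that merely repeat the query words; lowering it favors exact text matches.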

🧠 Semantic Understanding

  • CodeRankEmbed embeddings - Understands code meaning and intent
  • Finds by concept - Search "error handling" and find try/except, error types, logging, and recovery patterns
  • Smart symbol search - Jump to definitions and references instantly
  • Regex power - For when you need precise pattern matching
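
To make "finds by concept" a little more concrete, here is a toy sketch of embedding-based retrieval: code snippets and the query are mapped to vectors, and the closest vectors win, regardless of what the functions are called. The three-dimensional vectors below are hand-written stand-ins, not real CodeRankEmbed output, and the helper names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], index: Dict[str, Sequence[float]], k: int = 2) -> List[str]:
    """Return the k symbol names whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Toy "embeddings": the first axis loosely stands for "authentication-ness".
toy_index = {
    "sign_in": [0.9, 0.1, 0.0],
    "token_check": [0.8, 0.3, 0.1],
    "render_chart": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # pretend embedding of "authentication flow"
print(top_k(query, toy_index))
```

Neither `sign_in` nor `token_check` contains the word "authentication", yet both rank above `render_chart` because their vectors point in a similar direction to the query.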

🏠 Privacy-First & Self-Hosted

  • Your code stays yours - Nothing leaves your machine
  • Works offline - No internet required after installation
  • No telemetry - We don't track your searches
  • Enterprise-ready - Deploy on your own infrastructure

🤖 MCP-Native Design

  • First-class MCP support - Built from the ground up for Model Context Protocol
  • AI assistant ready - Works seamlessly with Claude, Cursor, Windsurf, and more
  • Token efficient - Saves ~200 tokens per session (no hook overhead!)
  • Optional skill integration - For complex multi-project workflows

🎪 The LeIndex Magic Show

🔍 Search That Reads Your Mind

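from leindex import LeIndex

# Index the project first (the path is a placeholder)
indexer = LeIndex("/path/to/your/project")
indexer.index()

# Search semantically
results = indexer.search("authentication flow")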

# Get results that actually make sense:
# - Login handlers (even if named 'sign_in')
# - Session management (even if called 'user_state')
# - JWT verification (even if labeled 'token_check')
# - Password hashing (even if in 'crypto_utils')

📊 The Secret Sauce (Technology Stack)

| Component | Technology | Superpower |
|---|---|---|
| Vector Search | LEANN | Storage-efficient semantic similarity |
| Code Brain | CodeRankEmbed | Understands code meaning & intent |
| Text Search | Tantivy | Lucene-inspired search engine written in Rust (fast!) |
| Metadata | SQLite | Reliable ACID-compliant storage |
| Analytics | DuckDB | In-process analytical queries |
| Async Engine | asyncio | Built-in Python async (no RabbitMQ needed!) |
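
The "Async Engine" row can be illustrated with a small asyncio worker pool: a bounded in-process queue feeds a fixed number of workers, which is the kind of job a message broker would otherwise do. This is a generic sketch of the pattern, not LeIndex's internal code; `index_file` is a placeholder, and the `worker_count` and `max_queue_size` parameters are named after the config keys shown later only for readability.

```python
import asyncio
from typing import List


async def index_file(path: str) -> str:
    """Placeholder for real indexing work (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return path.upper()


async def worker(queue: asyncio.Queue, results: List[str]) -> None:
    """Pull paths off the queue until cancelled."""
    while True:
        path = await queue.get()
        try:
            results.append(await index_file(path))
        finally:
            queue.task_done()


async def index_all(paths: List[str], worker_count: int = 4, max_queue_size: int = 10000) -> List[str]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
    results: List[str] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(worker_count)]
    for path in paths:
        await queue.put(path)  # blocks when the queue is full (backpressure)
    await queue.join()  # wait until every queued path has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    print(sorted(asyncio.run(index_all(["a.py", "b.py", "c.py"], worker_count=2))))
```

The bounded queue is the design point worth noting: when producers outpace the workers, `queue.put` waits instead of letting memory grow without limit.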

🏗️ Architecture That Makes Sense

┌─────────────────────────────────────────────────────────┐
│              The LeIndex Experience                     │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────────┐     ┌─────────────┐     ┌───────────┐ │
│  │   MCP Server │◀───▶│ Core Engine │◀───▶│  LEANN    │ │
│  │  (Your AI    │     │ (The Brains)│     │ (Vectors) │ │
│  │   Assistant) │     │             │     │           │ │
│  └──────────────┘     └─────────────┘     └───────────┘ │
│         │                   │                    │      │
│         │                   ▼                    ▼      │
│         │            ┌──────────────┐     ┌───────────┐ │
│         │            │ Query Router │◀───▶│ Tantivy   │ │
│         │            │  (Traffic    │     │(Full-Text)│ │
│         │            │   Cop)       │     │           │ │
│         │            └──────────────┘     └───────────┘ │
│         │                   │                    │      │
│         ▼                   ▼                    ▼      │
│  ┌──────────────┐    ┌──────────────┐    ┌───────────┐  │
│  │ CLI Tools    │    │ Data Access  │    │  SQLite   │  │
│  │ (Power User) │    │    Layer     │    │ (Metadata)│  │
│  └──────────────┘    └──────────────┘    └───────────┘  │
│                              │                          │
│                              ▼                          │
│                       ┌──────────────┐                  │
│                       │   DuckDB     │                  │
│                       │ (Analytics)  │                  │
│                       └──────────────┘                  │
└─────────────────────────────────────────────────────────┘

💡 Everything runs locally. No cloud. No external services. Just speed.

🎯 Usage: Let's Search Some Code!

🤖 MCP Integration (The Cool Way)

LeIndex comes with a built-in MCP server that makes your AI assistant code-aware:

Available MCP Superpowers:

  • manage_project - Set up and manage indexing for your projects
  • search_content - Search code with semantic + full-text powers
  • get_diagnostics - Get project stats and health checks

Configuration in your MCP client (Claude, Cursor, etc.):

{
  "mcpServers": {
    "leindex": {
      "command": "leindex",
      "args": ["mcp"],
      "env": {}
    }
  }
}
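
If your MCP client already has a config file with other servers in it, the entry above needs to be merged in rather than pasted over the file. A minimal sketch of that merge follows. The helper names are hypothetical, and the config file location is deliberately left as a parameter because it differs between clients.

```python
import copy
import json
from pathlib import Path

# The server entry exactly as shown in the JSON snippet above.
LEINDEX_ENTRY = {"command": "leindex", "args": ["mcp"], "env": {}}


def add_leindex_server(config: dict) -> dict:
    """Return a copy of an MCP client config with the leindex entry added."""
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["leindex"] = copy.deepcopy(LEINDEX_ENTRY)
    return updated


def update_config_file(path: Path) -> None:
    """Merge the entry into the JSON file at `path`, creating it if missing."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_leindex_server(config), indent=2) + "\n", encoding="utf-8")


existing = {"mcpServers": {"other-tool": {"command": "other", "args": []}}}
print(json.dumps(add_leindex_server(existing), indent=2))
```

Existing server entries are preserved, and the input dictionary is not modified, so the function is safe to call before deciding whether to write anything to disk.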

Start the MCP server:

leindex mcp

Now your AI assistant can search your codebase like a pro! 🎉

When to use what:

| Approach | Best For |
|---|---|
| MCP Tools | Single-project searches, simple queries, direct API access |
| Skills | Multi-project operations, complex workflows, automated pipelines |

🐍 Python API (For the Coders)

from leindex import LeIndex

# Initialize and index
indexer = LeIndex("~/my-awesome-project")
indexer.index()

# Search semantically - it understands meaning!
results = indexer.search("authentication flow")

# Filter like a boss
results = indexer.search(
    query="database connection",
    file_patterns=["*.py"],           # Only Python files
    exclude_patterns=["test_*.py"]     # But not tests
)

# Access the good stuff
for result in results:
    print(f"{result.file}:{result.line}")
    print(result.content)
    print(f"Relevance Score: {result.score}")

🔧 CLI Tools (For the Terminal Lovers)

# Initialize indexing for a project
leindex init /path/to/project

# Run the indexing (it's fast, we promise)
leindex index /path/to/project

# Search from terminal
leindex-search "authentication logic"

# Search with filters
leindex-search "database" --ext py --exclude 'test_*'   # quote the glob so the shell doesn't expand it

⚙️ Configuration (Optional but Powerful)

LeIndex works great out of the box, but you can tweak it to your heart's content with config.yaml:

# Data Access Layer (The Engine Room)
dal_settings:
  backend_type: "sqlite_duckdb"    # The good stuff
  db_path: "./data/leindex.db"     # Where metadata lives
  duckdb_db_path: "./data/leindex.db.duckdb"  # Analytics heaven

# Vector Store (Semantic Search Magic)
vector_store:
  backend_type: "leann"            # Storage-efficient vectors
  index_path: "./leann_index"      # Where vectors chill
  embedding_model: "nomic-ai/CodeRankEmbed"  # Code brain
  embedding_dim: 768               # Vector dimensions

# Async Processing (Speed Demon)
async_processing:
  enabled: true
  worker_count: 4                  # Parallel indexing
  max_queue_size: 10000            # Queue buffer

# File Filtering (Keep It Lean)
file_filtering:
  max_file_size: 1073741824        # 1GB per file
  type_specific_limits:
    ".py": 1073741824              # Python files up to 1GB
    ".json": 104857600             # JSON files up to 100MB

# Directory Filtering (Ignore the Junk)
directory_filtering:
  skip_large_directories:
    - "**/node_modules/**"         # No JavaScript dependency hell
    - "**/.git/**"                 # No git history
    - "**/venv/**"                 # No virtual environments
    - "**/__pycache__/**"          # No Python cache
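
To show how the file and directory filtering settings could be read, here is a small sketch that applies them to a candidate path. It is one interpretation of the config above, not LeIndex's implementation. It assumes limits are inclusive and that a type-specific limit overrides the global one. It also simplifies the `**/name/**` glob patterns to a check on directory names.

```python
from pathlib import PurePosixPath

# Values copied from the config.yaml example above.
MAX_FILE_SIZE = 1073741824  # 1 GiB
TYPE_SPECIFIC_LIMITS = {".py": 1073741824, ".json": 104857600}  # 1 GiB, 100 MiB

# Simplification (assumption): "**/node_modules/**" is treated as
# "any path with a directory named node_modules", and so on.
SKIPPED_DIR_NAMES = {"node_modules", ".git", "venv", "__pycache__"}


def within_size_limit(path: str, size_bytes: int) -> bool:
    """True if the file is small enough to index under the limits above."""
    suffix = PurePosixPath(path).suffix
    limit = TYPE_SPECIFIC_LIMITS.get(suffix, MAX_FILE_SIZE)
    return size_bytes <= limit


def in_skipped_directory(path: str) -> bool:
    """True if any parent directory of the file is on the skip list."""
    parents = PurePosixPath(path).parts[:-1]
    return any(part in SKIPPED_DIR_NAMES for part in parents)


def should_index(path: str, size_bytes: int) -> bool:
    return not in_skipped_directory(path) and within_size_limit(path, size_bytes)


MIB = 1024 * 1024
print(should_index("src/app.py", 2 * MIB))
print(should_index("web/node_modules/react/index.js", 1 * MIB))
print(should_index("fixtures/dump.json", 200 * MIB))
```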

📊 Performance Stats (We're Not Slow)

| Metric | LeIndex | Typical Code Search | Difference |
|---|---|---|---|
| Indexing Speed | ~10K files/min | ~500 files/min | 20x faster |
| Search Latency (p50) | ~50ms | ~500ms | 10x faster |
| Search Latency (p99) | ~200ms | ~5s | 25x faster |
| Max Scalability | 100K+ files | 10K files | 10x more |
| Memory Usage | <4GB | >8GB | 2x less |

Benchmarks on a 100K-file Python codebase, standard hardware. Your mileage may vary, but it'll still be fast!
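
The "Difference" figures follow directly from the two measured columns. The short check below recomputes them from the approximate numbers quoted in the table; it adds no new measurements, and the helper name is made up for the example.

```python
def speedup(leindex_value: float, baseline_value: float, higher_is_better: bool) -> float:
    """Ratio by which LeIndex beats the baseline for one metric."""
    if higher_is_better:
        return leindex_value / baseline_value
    return baseline_value / leindex_value


# (metric, LeIndex, typical code search, higher_is_better), values from the table.
rows = [
    ("Indexing speed (files/min)", 10_000, 500, True),
    ("Search latency p50 (ms)", 50, 500, False),
    ("Search latency p99 (ms)", 200, 5_000, False),
    ("Max scalability (files)", 100_000, 10_000, True),
]
for name, ours, theirs, higher_is_better in rows:
    print(f"{name}: {speedup(ours, theirs, higher_is_better):.0f}x")
# Prints 20x, 10x, 25x and 10x, matching the table.
```

The memory row is stated as bounds (<4GB versus >8GB), so "2x less" there is a minimum ratio rather than an exact one.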


🆚 The Evolution of LeIndex

LeIndex is a complete reimagining of the code indexing experience:

  • ✅ CLI streamlined - Simple leindex commands
  • ✅ Environment unified - LEINDEX_* environment variables
  • ✅ Revolutionary stack - No external services
  • ✅ Lightweight architecture - A Python package built on LEANN + Tantivy + SQLite + DuckDB

What we gained:

  • ✅ Simplicity
  • ✅ Speed
  • ✅ Token efficiency (~200 tokens/session saved)
  • ✅ Pure MCP architecture
  • ✅ Developer happiness

📚 Documentation That Doesn't Suck


🧪 Development (For the Curious)

Project Structure

leindex/
├── src/leindex/              # The magic happens here
│   ├── dal/                  # Data Access Layer
│   ├── storage/              # Storage backends
│   ├── search/               # Search engines
│   ├── core_engine/          # Core indexing & search
│   ├── config_manager.py     # Config wizardry
│   ├── project_settings.py   # Project settings
│   ├── constants.py          # Shared constants
│   └── server.py             # MCP server
├── tests/                    # Test suite
├── config.yaml               # Configuration
└── pyproject.toml            # Project metadata

Running Tests

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run with coverage (because we care)
pytest --cov=leindex tests/

🤝 Contributing (Join the Party!)

We love contributions! Whether it's bug fixes, new features, documentation improvements, or just spreading the word - it's all appreciated.

Please see CONTRIBUTING.md for guidelines. We promise we're friendly! 😊


📜 License

MIT License - see LICENSE for details. Use it anywhere, modify it, share it. Go wild!


🙏 Acknowledgments (Standing on Giants)

LeIndex is built on amazing open-source projects:

Massive thanks to all the contributors! 🎉


💬 Support & Community


Built with ❤️ for developers who love their code

⭐ Star us on GitHub - it makes us smile!

Ready to search smarter? Install LeIndex now 🚀

