LeIndex: AI-powered code search and indexing system with MCP integration
LeIndex
AI-Powered Code Search That Actually Understands Your Code
Lightning-fast semantic code search with zero dependencies. Find code by meaning, not just by matching text.
✨ What Makes LeIndex Special?
LeIndex isn't just another code search tool. It's your intelligent code companion that understands what you're looking for, not just where it might be typed.
Imagine searching for "authentication flow" and finding not just files containing those words, but the actual authentication logic, login handlers, session management, and security patterns - even if they're named completely differently. That's the magic of semantic search!
Quick Start (You'll Be Searching in Under 2 Minutes. It's Easier Than Making Coffee!)
One-Click Installation
The easiest way to get started:
Requirements
- Python 3.10 or higher
- 4GB RAM minimum (8GB+ for large codebases)
- About 1GB disk space
Linux/Unix:
curl -sSL https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install.sh | bash
macOS:
curl -sSL https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install_macos.sh | bash
Windows:
irm https://raw.githubusercontent.com/scooter-lacroix/LeIndex/master/install.ps1 | iex
That's it. The installer will:
- Install the LeIndex MCP server
- Detect your AI tools (Claude Code, Cursor, etc.)
- Configure integrations automatically
- Install optional skills for enhanced workflows
Manual installation? See below.
# Install LeIndex - seriously, that's it
pip install leindex
# Index your codebase (no Docker, no databases, no headache)
leindex init /path/to/your/project
leindex index /path/to/your/project
# Search like a wizard
leindex-search "authentication logic"
# Or use it via MCP in Claude, Cursor, or your favorite AI assistant
# LeIndex MCP server does the heavy lifting automatically!
OR
PIP Install
pip install leindex
That's literally it. No Docker. No databases. No configuration files (unless you want them). It just works. ✨
Verify It's Alive
leindex --version
# Output: LeIndex 2.0.2 - Ready to search!
Install from Source (For the Adventurous)
git clone https://github.com/scooter-lacroix/leindex.git
cd leindex
pip install -e .
Boom! You're now searching your codebase at the speed of thought.
Why Developers Love LeIndex
Zero Dependencies, Zero Drama
- No Docker - Your laptop will thank you
- No PostgreSQL - No database setup nightmares
- No Elasticsearch - No Java memory leaks
- No RabbitMQ - No message queue complexity
- Just pure Python magic - pip install and you're done
⚡ Blazing Fast Performance
- LEANN vector search - Find similar code in milliseconds
- Tantivy full-text search - Rust-powered, Lucene-inspired goodness
- Hybrid scoring - Best of both worlds: semantic + lexical
- Handles 100K+ files - Scale from side projects to monorepos
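One way to picture the hybrid scoring mentioned above is a weighted blend of normalized semantic and lexical scores. The sketch below is illustrative only: the function names, the min-max normalization, and the 0.6/0.4 weighting are assumptions, not LeIndex's actual formula.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(
    semantic: Dict[str, float],
    lexical: Dict[str, float],
    semantic_weight: float = 0.6,  # hypothetical weight, chosen for illustration
) -> List[Tuple[str, float]]:
    """Blend two score sets; a document missing from one side scores 0 there."""
    sem = min_max_normalize(semantic)
    lex = min_max_normalize(lexical)
    combined = {
        doc: semantic_weight * sem.get(doc, 0.0)
        + (1.0 - semantic_weight) * lex.get(doc, 0.0)
        for doc in set(sem) | set(lex)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores: cosine similarities on one side, BM25-style scores on the other.
semantic_scores = {"auth/login.py": 0.91, "auth/session.py": 0.84, "utils/strings.py": 0.12}
lexical_scores = {"auth/login.py": 7.2, "docs/auth.md": 9.5}
ranking = hybrid_rank(semantic_scores, lexical_scores)
for path, score in ranking:
    print(f"{score:.3f}  {path}")
```

Normalizing first matters because vector similarities and full-text scores live on different scales; without it, one side would dominate the blend.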
Semantic Understanding
- CodeRankEmbed embeddings - Understands code meaning and intent
- Finds by concept - Search "error handling" and find try/except, error types, logging, and recovery patterns
- Smart symbol search - Jump to definitions and references instantly
- Regex power - For when you need precise pattern matching
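To make "finds by concept" concrete, here is a toy sketch of embedding-based ranking using cosine similarity. The three-dimensional vectors are invented for illustration (a real embedding model such as CodeRankEmbed produces much longer vectors), and treating cosine similarity as the comparison metric is an assumption about how such a search could work, not a statement of LeIndex's internals.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings: the first axis loosely stands for "authentication".
toy_embeddings = {
    "def sign_in(user, password)": [0.9, 0.1, 0.0],
    "def token_check(jwt)": [0.8, 0.3, 0.1],
    "def render_chart(data)": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]  # stand-in for the query "authentication flow"

ranked = sorted(
    toy_embeddings,
    key=lambda snippet: cosine_similarity(query_vector, toy_embeddings[snippet]),
    reverse=True,
)
for snippet in ranked:
    print(snippet)
```

Neither sign_in nor token_check contains the word "authentication", yet both rank above the unrelated charting function because their vectors point in a similar direction to the query.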
Privacy-First & Self-Hosted
- Your code stays yours - Nothing leaves your machine
- Works offline - No internet required after installation
- No telemetry - We don't track your searches
- Enterprise-ready - Deploy on your own infrastructure
MCP-Native Design
- First-class MCP support - Built from the ground up for Model Context Protocol
- AI assistant ready - Works seamlessly with Claude, Cursor, Windsurf, and more
- Token efficient - Saves ~200 tokens per session (no hook overhead!)
- Optional skill integration - For complex multi-project workflows
The LeIndex Magic Show
Search That Reads Your Mind
from leindex import LeIndex

# Set up the indexer (same calls as in the Python API section)
indexer = LeIndex("~/my-awesome-project")
indexer.index()

# Search semantically
results = indexer.search("authentication flow")
# Get results that actually make sense:
# - Login handlers (even if named 'sign_in')
# - Session management (even if called 'user_state')
# - JWT verification (even if labeled 'token_check')
# - Password hashing (even if in 'crypto_utils')
The Secret Sauce (Technology Stack)
| Component | Technology | Superpower |
|---|---|---|
| Vector Search | LEANN | Storage-efficient semantic similarity |
| Code Brain | CodeRankEmbed | Understands code meaning & intent |
| Text Search | Tantivy | Rust-based, Lucene-inspired (fast!) |
| Metadata | SQLite | Reliable ACID-compliant storage |
| Analytics | DuckDB | In-process analytical queries |
| Async Engine | asyncio | Built-in Python async (no RabbitMQ needed!) |
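The "no RabbitMQ needed" point can be illustrated with a bounded asyncio queue feeding a small pool of workers. This is a minimal sketch of that general pattern, not LeIndex's code: index_file, worker, and index_all are hypothetical names, and the defaults simply mirror the worker_count and max_queue_size values shown in the configuration section.

```python
import asyncio
from typing import List


async def index_file(path: str) -> str:
    """Stand-in for real per-file work (parsing, embedding, writing to the index)."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return path.upper()


async def worker(queue: asyncio.Queue, results: List[str]) -> None:
    """Pull paths off the queue until cancelled."""
    while True:
        path = await queue.get()
        try:
            results.append(await index_file(path))
        finally:
            queue.task_done()


async def index_all(paths: List[str], worker_count: int = 4, max_queue_size: int = 10000) -> List[str]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
    results: List[str] = []
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(worker_count)]
    for path in paths:
        await queue.put(path)  # waits when the queue is full, giving backpressure
    await queue.join()  # block until every queued path has been processed
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


indexed = asyncio.run(index_all(["a.py", "b.py", "c.py"]))
print(sorted(indexed))
```

The bounded queue is the design choice worth noting: a producer that walks a huge repository cannot run arbitrarily far ahead of the workers, which keeps memory use predictable.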
Architecture That Makes Sense
┌───────────────────────────────────────────────────────────┐
│                  The LeIndex Experience                   │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────┐     │
│  │  MCP Server  │──▶│ Core Engine  │──▶│   LEANN    │     │
│  │  (Your AI    │   │ (The Brains) │   │ (Vectors)  │     │
│  │  Assistant)  │   │              │   │            │     │
│  └──────────────┘   └──────────────┘   └────────────┘     │
│         │                   │                 │           │
│         │                   ▼                 ▼           │
│         │           ┌──────────────┐   ┌────────────┐     │
│         │           │ Query Router │──▶│  Tantivy   │     │
│         │           │ (Traffic     │   │ (Full-Text)│     │
│         │           │  Cop)        │   │            │     │
│         │           └──────────────┘   └────────────┘     │
│         │                   │                 │           │
│         ▼                   ▼                 ▼           │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────┐     │
│  │  CLI Tools   │   │ Data Access  │   │   SQLite   │     │
│  │ (Power User) │   │    Layer     │   │ (Metadata) │     │
│  └──────────────┘   └──────────────┘   └────────────┘     │
│                             │                             │
│                             ▼                             │
│                     ┌──────────────┐                      │
│                     │    DuckDB    │                      │
│                     │ (Analytics)  │                      │
│                     └──────────────┘                      │
└───────────────────────────────────────────────────────────┘
Everything runs locally. No cloud. No dependencies. Just speed.
Usage: Let's Search Some Code!
MCP Integration (The Cool Way)
LeIndex comes with a built-in MCP server that makes your AI assistant code-aware:
Available MCP Superpowers:
- manage_project - Set up and manage indexing for your projects
- search_content - Search code with semantic + full-text powers
- get_diagnostics - Get project stats and health checks
Configuration in your MCP client (Claude, Cursor, etc.):
{
  "mcpServers": {
    "leindex": {
      "command": "leindex",
      "args": ["mcp"],
      "env": {}
    }
  }
}
Start the MCP server:
leindex mcp
Now your AI assistant can search your codebase like a pro!
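If your MCP client already has a config file with other servers in it, you may prefer to merge the LeIndex entry rather than paste over the file. The helper below is a hedged sketch: add_leindex_server and the mcp_config.json file name are hypothetical, and where your client actually stores its config is something to check in that client's documentation. Only the entry itself comes from the configuration shown above.

```python
import json
import tempfile
from pathlib import Path


def add_leindex_server(config_path: Path) -> dict:
    """Add the LeIndex entry to an MCP client config, keeping any existing servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["leindex"] = {"command": "leindex", "args": ["mcp"], "env": {}}
    config_path.write_text(json.dumps(config, indent=2))
    return config


# Demonstrate on a throwaway file so no real client config is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mcp_config.json"  # hypothetical file name
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-tool"}}}))
    merged = add_leindex_server(path)
    written = json.loads(path.read_text())

print(json.dumps(merged, indent=2))
```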
When to use what:
| Approach | Best For |
|---|---|
| MCP Tools | Single-project searches, simple queries, direct API access |
| Skills | Multi-project operations, complex workflows, automated pipelines |
Python API (For the Coders)
from leindex import LeIndex

# Initialize and index
indexer = LeIndex("~/my-awesome-project")
indexer.index()

# Search semantically - it understands meaning!
results = indexer.search("authentication flow")

# Filter like a boss
results = indexer.search(
    query="database connection",
    file_patterns=["*.py"],          # Only Python files
    exclude_patterns=["test_*.py"],  # But not tests
)

# Access the good stuff
for result in results:
    print(f"{result.file}:{result.line}")
    print(result.content)
    print(f"Relevance Score: {result.score}")
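For intuition about the file_patterns and exclude_patterns arguments, here is a standalone sketch of glob-style include/exclude filtering with the standard library. It assumes the patterns are matched against the file name only; matches_filters is a hypothetical helper, and LeIndex's real matching rules may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional, Sequence


def matches_filters(
    path: str,
    file_patterns: Optional[Sequence[str]] = None,
    exclude_patterns: Optional[Sequence[str]] = None,
) -> bool:
    """Keep a path if it matches any include pattern and no exclude pattern."""
    name = PurePosixPath(path).name  # assumption: patterns apply to the file name
    if file_patterns and not any(fnmatchcase(name, p) for p in file_patterns):
        return False
    if exclude_patterns and any(fnmatchcase(name, p) for p in exclude_patterns):
        return False
    return True


paths = ["db/connection.py", "db/test_connection.py", "web/app.js"]
kept = [p for p in paths if matches_filters(p, ["*.py"], ["test_*.py"])]
print(kept)
```

Excludes are checked after includes, so a file has to pass both tests to be kept.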
CLI Tools (For the Terminal Lovers)
# Initialize indexing for a project
leindex init /path/to/project
# Run the indexing (it's fast, we promise)
leindex index /path/to/project
# Search from terminal
leindex-search "authentication logic"
# Search with filters
leindex-search "database" --ext py --exclude "test_*"
Configuration (Optional but Powerful)
LeIndex works great out of the box, but you can tweak it to your heart's content with config.yaml:
# Data Access Layer (The Engine Room)
dal_settings:
  backend_type: "sqlite_duckdb"               # The good stuff
  db_path: "./data/leindex.db"                # Where metadata lives
  duckdb_db_path: "./data/leindex.db.duckdb"  # Analytics heaven

# Vector Store (Semantic Search Magic)
vector_store:
  backend_type: "leann"                      # Storage-efficient vectors
  index_path: "./leann_index"                # Where vectors chill
  embedding_model: "nomic-ai/CodeRankEmbed"  # Code brain
  embedding_dim: 768                         # Vector dimensions

# Async Processing (Speed Demon)
async_processing:
  enabled: true
  worker_count: 4        # Parallel indexing
  max_queue_size: 10000  # Queue buffer

# File Filtering (Keep It Lean)
file_filtering:
  max_file_size: 1073741824  # 1GB per file
  type_specific_limits:
    ".py": 1073741824        # Python files up to 1GB
    ".json": 104857600       # JSON files up to 100MB

# Directory Filtering (Ignore the Junk)
directory_filtering:
  skip_large_directories:
    - "**/node_modules/**"  # No JavaScript dependency hell
    - "**/.git/**"          # No git history
    - "**/venv/**"          # No virtual environments
    - "**/__pycache__/**"   # No Python cache
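To show how the file and directory filtering settings above could interact, here is a small sketch that applies the same limits to candidate paths. The dictionary mirrors the values from the configuration; the helper names and the reading of "**/name/**" as "any directory component called name" are assumptions made for illustration, not LeIndex's implementation.

```python
from pathlib import PurePosixPath

# Values copied from the file_filtering and directory_filtering sections above.
FILE_FILTERING = {
    "max_file_size": 1073741824,
    "type_specific_limits": {".py": 1073741824, ".json": 104857600},
}
SKIP_DIRECTORIES = ["**/node_modules/**", "**/.git/**", "**/venv/**", "**/__pycache__/**"]


def size_limit_for(path: str) -> int:
    """Use the extension-specific limit if one exists, else the global limit."""
    suffix = PurePosixPath(path).suffix
    return FILE_FILTERING["type_specific_limits"].get(suffix, FILE_FILTERING["max_file_size"])


def in_skipped_directory(path: str) -> bool:
    """Assumption: '**/name/**' skips any path with a directory component 'name'."""
    skipped = {pattern.strip("*/") for pattern in SKIP_DIRECTORIES}
    return any(part in skipped for part in PurePosixPath(path).parent.parts)


def should_index(path: str, size_bytes: int) -> bool:
    return not in_skipped_directory(path) and size_bytes <= size_limit_for(path)


print(should_index("src/app.py", 2048))
print(should_index("web/node_modules/lib/index.js", 10))
print(should_index("data/big.json", 200 * 1024 * 1024))
```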
Performance Stats (We're Not Slow)
| Metric | LeIndex | Typical Code Search | Difference |
|---|---|---|---|
| Indexing Speed | ~10K files/min | ~500 files/min | 20x faster |
| Search Latency (p50) | ~50ms | ~500ms | 10x faster |
| Search Latency (p99) | ~200ms | ~5s | 25x faster |
| Max Scalability | 100K+ files | 10K files | 10x more |
| Memory Usage | <4GB | >8GB | 2x less |
Benchmarks were run on a 100K-file Python codebase with standard hardware. Your mileage may vary, but it'll still be fast!
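The "Difference" column follows directly from the other two columns. As a quick sanity check, the snippet below recomputes each ratio from the table's figures, treating the approximate and bounded values (~, <, >) as exact numbers for the sake of the arithmetic.

```python
# (LeIndex value, typical value) pairs copied from the table above.
files_per_minute = (10_000, 500)
p50_latency_ms = (50, 500)
p99_latency_ms = (200, 5_000)
max_files = (100_000, 10_000)
memory_gb = (4, 8)

# Higher is better: divide LeIndex by typical.
indexing_speedup = files_per_minute[0] / files_per_minute[1]
scale_factor = max_files[0] / max_files[1]

# Lower is better: divide typical by LeIndex.
p50_speedup = p50_latency_ms[1] / p50_latency_ms[0]
p99_speedup = p99_latency_ms[1] / p99_latency_ms[0]
memory_saving = memory_gb[1] / memory_gb[0]

print(indexing_speedup, p50_speedup, p99_speedup, scale_factor, memory_saving)
```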
The Evolution of LeIndex
LeIndex is a complete reimagining of the code indexing experience:
- CLI streamlined - Simple leindex commands
- Environment unified - LEINDEX_* environment variables
- Revolutionary stack - No external dependencies
- Lightweight architecture - Pure Python with LEANN + Tantivy + SQLite + DuckDB
What we gained:
- Simplicity
- Speed
- Token efficiency (~200 tokens/session saved)
- Pure MCP architecture
- Developer happiness
Documentation That Doesn't Suck
- Installation Guide - Detailed setup instructions
- MCP Configuration - MCP server setup and examples
- Architecture Deep Dive - System design and internals
- API Reference - Complete API documentation
- Migration Guide - Upgrading from code-indexer
- Contributing - Join the fun!
Development (For the Curious)
Project Structure
leindex/
├── src/leindex/             # The magic happens here
│   ├── dal/                 # Data Access Layer
│   ├── storage/             # Storage backends
│   ├── search/              # Search engines
│   ├── core_engine/         # Core indexing & search
│   ├── config_manager.py    # Config wizardry
│   ├── project_settings.py  # Project settings
│   ├── constants.py         # Shared constants
│   └── server.py            # MCP server
├── tests/                   # Test suite
├── config.yaml              # Configuration
└── pyproject.toml           # Project metadata
Running Tests
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run with coverage (because we care)
pytest --cov=leindex tests/
Contributing (Join the Party!)
We love contributions! Whether it's bug fixes, new features, documentation improvements, or just spreading the word - it's all appreciated.
Please see CONTRIBUTING.md for guidelines. We promise we're friendly!
License
MIT License - see LICENSE for details. Use it anywhere, modify it, share it. Go wild!
Acknowledgments (Standing on Giants)
LeIndex is built on amazing open-source projects:
- LEANN - Storage-efficient vector search
- Tantivy - Full-text search engine written in Rust (Lucene-inspired)
- DuckDB - Fast analytical database
- SQLite - Embedded relational database
- CodeRankEmbed - Code embeddings
- Model Context Protocol - AI integration
Massive thanks to all the contributors!
Support & Community
- GitHub Issues: https://github.com/scooter-lacroix/leindex/issues
- Documentation: https://github.com/scooter-lacroix/leindex
- Star us on GitHub - It helps more people discover LeIndex! โญ
Built with ❤️ for developers who love their code
⭐ Star us on GitHub - it makes us smile!
Ready to search smarter? Install LeIndex now.