Lightweight semantic code search engine — hybrid vector + FTS + AST graph + regex fusion + MCP server

codexlens-search

Semantic code search engine with MCP server for Claude Code.

2-stage vector search + FTS + RRF fusion + reranking — install once, configure API keys, ready to use.

Quick Start (Claude Code MCP)

Add to your project .mcp.json:

{
  "mcpServers": {
    "codexlens": {
      "command": "uvx",
      "args": ["--from", "codexlens-search[mcp]", "codexlens-mcp"],
      "env": {
        "CODEXLENS_EMBED_API_URL": "https://api.openai.com/v1",
        "CODEXLENS_EMBED_API_KEY": "${OPENAI_API_KEY}",
        "CODEXLENS_EMBED_API_MODEL": "text-embedding-3-small",
        "CODEXLENS_EMBED_DIM": "1536"
      }
    }
  }
}

That's it. Claude Code will auto-discover the tools, including index_project and search_code.

Install

# Standard install (includes vector search + API clients)
uv pip install codexlens-search

# With MCP server for Claude Code
# (quotes keep shells such as zsh from treating the brackets as a glob)
uv pip install "codexlens-search[mcp]"

Optional extras:

Extra       Description
mcp         MCP server (codexlens-mcp command)
gpu         GPU-accelerated embedding (onnxruntime-gpu)
faiss-cpu   FAISS ANN backend
watcher     File watcher for auto-indexing

MCP Tools

Tool              Description
search_code       Semantic search with hybrid fusion + reranking
index_project     Build or rebuild the search index
index_status      Show index statistics
index_update      Incremental sync (only changed files)
find_files        Glob file discovery
list_models       List models with cache status
download_models   Download local fastembed models

MCP Configuration Examples

API Embedding Only (simplest)

{
  "mcpServers": {
    "codexlens": {
      "command": "uvx",
      "args": ["--from", "codexlens-search[mcp]", "codexlens-mcp"],
      "env": {
        "CODEXLENS_EMBED_API_URL": "https://api.openai.com/v1",
        "CODEXLENS_EMBED_API_KEY": "${OPENAI_API_KEY}",
        "CODEXLENS_EMBED_API_MODEL": "text-embedding-3-small",
        "CODEXLENS_EMBED_DIM": "1536"
      }
    }
  }
}

API Embedding + API Reranker (best quality)

{
  "mcpServers": {
    "codexlens": {
      "command": "uvx",
      "args": ["--from", "codexlens-search[mcp]", "codexlens-mcp"],
      "env": {
        "CODEXLENS_EMBED_API_URL": "https://api.openai.com/v1",
        "CODEXLENS_EMBED_API_KEY": "${OPENAI_API_KEY}",
        "CODEXLENS_EMBED_API_MODEL": "text-embedding-3-small",
        "CODEXLENS_EMBED_DIM": "1536",
        "CODEXLENS_RERANKER_API_URL": "https://api.jina.ai/v1",
        "CODEXLENS_RERANKER_API_KEY": "${JINA_API_KEY}",
        "CODEXLENS_RERANKER_API_MODEL": "jina-reranker-v2-base-multilingual"
      }
    }
  }
}

Multi-Endpoint Load Balancing

{
  "mcpServers": {
    "codexlens": {
      "command": "uvx",
      "args": ["--from", "codexlens-search[mcp]", "codexlens-mcp"],
      "env": {
        "CODEXLENS_EMBED_API_ENDPOINTS": "https://api1.example.com/v1|sk-key1|model,https://api2.example.com/v1|sk-key2|model",
        "CODEXLENS_EMBED_DIM": "1536"
      }
    }
  }
}

Format: url|key|model,url|key|model,...
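The endpoint string can be checked before it goes into .mcp.json. The following is a minimal sketch of splitting the url|key|model format, assuming entries are comma-separated and fields contain no commas or pipes. The names Endpoint and parse_endpoints are hypothetical and are not part of the codexlens-search API; how the package itself parses or balances across endpoints is not shown here.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    """One embedding endpoint (hypothetical helper type, not from codexlens-search)."""
    url: str
    key: str
    model: str


def parse_endpoints(raw: str) -> list[Endpoint]:
    """Split 'url|key|model,url|key|model,...' into Endpoint tuples."""
    endpoints = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        parts = [part.strip() for part in entry.split("|")]
        if len(parts) != 3:
            raise ValueError(f"expected url|key|model, got {entry!r}")
        endpoints.append(Endpoint(*parts))
    return endpoints


if __name__ == "__main__":
    value = (
        "https://api1.example.com/v1|sk-key1|model,"
        "https://api2.example.com/v1|sk-key2|model"
    )
    for endpoint in parse_endpoints(value):
        print(endpoint.url, endpoint.model)
```

A malformed entry (for example a missing model field) raises ValueError, which is easier to diagnose than a failed embedding request at index time.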

Local Models (Offline, No API)

uv pip install "codexlens-search[mcp]"
codexlens-search download-models

Then point .mcp.json at the installed command, with no API variables set:

{
  "mcpServers": {
    "codexlens": {
      "command": "codexlens-mcp",
      "env": {}
    }
  }
}

Pre-installed (no uvx)

{
  "mcpServers": {
    "codexlens": {
      "command": "codexlens-mcp",
      "env": {
        "CODEXLENS_EMBED_API_URL": "https://api.openai.com/v1",
        "CODEXLENS_EMBED_API_KEY": "${OPENAI_API_KEY}",
        "CODEXLENS_EMBED_API_MODEL": "text-embedding-3-small",
        "CODEXLENS_EMBED_DIM": "1536"
      }
    }
  }
}

CLI

codexlens-search --db-path .codexlens sync --root ./src
codexlens-search --db-path .codexlens search -q "auth handler" -k 10
codexlens-search --db-path .codexlens status
codexlens-search list-models
codexlens-search download-models

Environment Variables

Embedding

Variable                        Description                          Example
CODEXLENS_EMBED_API_URL         Embedding API base URL               https://api.openai.com/v1
CODEXLENS_EMBED_API_KEY         API key                              sk-xxx
CODEXLENS_EMBED_API_MODEL       Model name                           text-embedding-3-small
CODEXLENS_EMBED_API_ENDPOINTS   Multi-endpoint: url|key|model,...    See above
CODEXLENS_EMBED_DIM             Vector dimension                     1536

Reranker

Variable                       Description             Example
CODEXLENS_RERANKER_API_URL     Reranker API base URL   https://api.jina.ai/v1
CODEXLENS_RERANKER_API_KEY     API key                 jina-xxx
CODEXLENS_RERANKER_API_MODEL   Model name              jina-reranker-v2-base-multilingual

Tuning

Variable                     Default   Description
CODEXLENS_BINARY_TOP_K       200       Binary coarse search candidates
CODEXLENS_ANN_TOP_K          50        ANN fine search candidates
CODEXLENS_FTS_TOP_K          50        FTS results per method
CODEXLENS_FUSION_K           60        RRF fusion k parameter
CODEXLENS_RERANKER_TOP_K     20        Results to rerank
CODEXLENS_EMBED_BATCH_SIZE   32        Max texts per API batch (auto-splits on 413)
CODEXLENS_EMBED_MAX_TOKENS   8192      Max tokens per text (truncate if exceeded, 0=no limit)
CODEXLENS_INDEX_WORKERS      2         Parallel indexing workers
CODEXLENS_MAX_FILE_SIZE      1000000   Max file size in bytes
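
To give a feel for what CODEXLENS_FUSION_K controls, here is a minimal sketch of textbook Reciprocal Rank Fusion, where each document scores the sum of 1 / (k + rank) over every ranked list it appears in. This assumes the standard formulation with 1-based ranks; the function name rrf_fuse is hypothetical and the package's own fusion code may differ in detail.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Merge several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every ID it contains,
    with rank starting at 1. A larger k flattens the gap between
    top-ranked and lower-ranked hits.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]  # illustrative IDs from the vector search
    fts_hits = ["b", "d"]          # illustrative IDs from full-text search
    for doc_id, score in rrf_fuse([vector_hits, fts_hits], k=60):
        print(doc_id, round(score, 5))
```

In this toy input, "b" ranks first because it appears in both lists, even though it is not the top hit of the vector search.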

Architecture

Query → [Embedder] → query vector
         ├→ [BinaryStore] → candidates (Hamming)
         │     └→ [ANNIndex] → ranked IDs (cosine)
         ├→ [FTS exact] → exact matches
         └→ [FTS fuzzy] → fuzzy matches
              └→ [RRF Fusion] → merged ranking
                    └→ [Reranker] → final top-k
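
The coarse-then-fine vector path in the diagram can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: vectors are binarized by sign, the coarse stage ranks by Hamming distance over those bits, and the fine stage uses exact cosine similarity over the surviving candidates in place of a real ANN index. All function names are hypothetical.

```python
import math


def binarize(vec: list[float]) -> int:
    """Pack sign bits into one integer: bit is 1 where the value is positive."""
    bits = 0
    for value in vec:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two packed codes."""
    return bin(a ^ b).count("1")


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(
    query: list[float],
    corpus: dict[str, list[float]],
    coarse_k: int,
    fine_k: int,
) -> list[str]:
    """Coarse filter by Hamming distance, then rank survivors by cosine."""
    q_bits = binarize(query)
    # Stage 1: cheap bitwise comparison keeps the coarse_k closest codes.
    coarse = sorted(
        corpus, key=lambda doc_id: hamming(q_bits, binarize(corpus[doc_id]))
    )[:coarse_k]
    # Stage 2: full-precision similarity on the reduced candidate set.
    fine = sorted(coarse, key=lambda doc_id: cosine(query, corpus[doc_id]), reverse=True)
    return fine[:fine_k]


if __name__ == "__main__":
    corpus = {
        "a": [0.9, 0.8, -0.7],
        "b": [1.0, 0.1, -0.1],
        "c": [-1.0, -1.0, 1.0],
        "d": [1.0, -0.5, -1.0],
    }
    print(two_stage_search([1.0, 1.0, -1.0], corpus, coarse_k=2, fine_k=2))
```

In this sketch coarse_k and fine_k play the roles that CODEXLENS_BINARY_TOP_K and CODEXLENS_ANN_TOP_K appear to play in the real pipeline: the first bounds how many candidates reach the more expensive comparison, the second bounds how many are passed on to fusion.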

Development

git clone https://github.com/catlog22/codexlens-search.git
cd codexlens-search
uv pip install -e ".[dev]"
pytest

License

MIT
