MCP server for indexing and querying codebases using CocoIndex
Lightweight MCP for code that just works
A super lightweight, effective embedded MCP server (AST-based) that understands and searches your codebase, and just works! Built on CocoIndex, a Rust-based, ultra-performant data transformation engine. No black box. Works with Claude, Codex, Cursor, or any coding agent.
- Instant token savings of 70%.
- 1-minute setup: just run claude mcp add or codex mcp add.
🌟 Please help star CocoIndex if you like this project!
Get Started - zero config, let's go!!
Requires Python 3 (pip3 comes pre-installed with Python).
pip3 install -U cocoindex-code
Claude
claude mcp add cocoindex-code -- cocoindex-code
Codex
codex mcp add cocoindex-code -- cocoindex-code
OpenCode
opencode mcp add
Enter MCP server name: cocoindex-code
Select MCP server type: local
Enter command to run: cocoindex-code
Or use opencode.json:
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"cocoindex-code": {
"type": "local",
"command": [
"cocoindex-code"
]
}
}
}
Optionally, you can run cocoindex-code index to create or update the index. Without running it, the MCP server will automatically build and keep the index up-to-date in the background.
When Is the MCP Triggered?
Once configured, your coding agent (Claude Code, Codex, Cursor, etc.) automatically decides when semantic code search is helpful — especially for finding code by description, exploring unfamiliar codebases, fuzzy/conceptual matches, or locating implementations without knowing exact names.
You can also nudge the agent explicitly, e.g. "Use the cocoindex-code MCP to find how user sessions are managed." For persistent instructions, add guidance to your project's AGENTS.md or CLAUDE.md:
Use the cocoindex-code MCP server for semantic code search when:
- Searching for code by meaning or description rather than exact text
- Exploring unfamiliar parts of the codebase
- Looking for implementations without knowing exact names
- Finding similar code patterns or related functionality
Features
- Semantic Code Search: Find relevant code using natural language queries when grep doesn't work well, and save tokens immediately.
- Fast Incremental Updates: ⚡ Built on an ultra-performant Rust indexing engine. Only changed files are re-indexed, so updates are fast.
- Multi-Language Support: Python, JavaScript/TypeScript, Rust, Go, Java, C/C++, C#, SQL, Shell
- Embedded: Portable and just works, no database setup required!
- Flexible Embeddings: By default, no API key is required thanks to local SentenceTransformers, so it is totally free! You can also switch to any of 100+ cloud providers.
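One way to picture "only re-indexes changed files" is to compare content fingerprints between runs. The sketch below is an illustration only and does not describe CocoIndex's internal mechanism; `fingerprint` and `changed_files` are hypothetical names introduced here.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a stable content hash for a file body."""
    return hashlib.sha256(content).hexdigest()


def changed_files(previous: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """List paths that are new or whose content differs from the previous run.

    `previous` maps path -> fingerprint recorded at the last indexing run;
    `current` maps path -> file content seen now.
    """
    return sorted(
        path
        for path, body in current.items()
        if previous.get(path) != fingerprint(body)
    )
```

Only the paths returned by `changed_files` would need to be chunked and embedded again; everything else can reuse the stored vectors.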
Configuration
| Variable | Description | Default |
|---|---|---|
| COCOINDEX_CODE_ROOT_PATH | Root path of the codebase | Auto-discovered (see below) |
| COCOINDEX_CODE_EMBEDDING_MODEL | Embedding model (see below) | sbert/sentence-transformers/all-MiniLM-L6-v2 |
| COCOINDEX_CODE_BATCH_SIZE | Max batch size for local embedding model | 16 |
| COCOINDEX_CODE_EXTRA_EXTENSIONS | Additional file extensions to index (comma-separated, e.g. "inc:php,yaml,toml"; use ext:lang to override language detection) | (none) |
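To make the variables and their defaults concrete, here is a minimal sketch of how they could be read and how an "ext:lang" list might be parsed. It is an assumption-laden illustration, not the project's actual loader: `load_settings` and `parse_extra_extensions` are hypothetical helpers, and the leading-dot normalization is my own choice.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL = "sbert/sentence-transformers/all-MiniLM-L6-v2"


def parse_extra_extensions(raw: str) -> dict:
    """Parse e.g. "inc:php,yaml,toml" into {".inc": "php", ".yaml": None, ".toml": None}.

    A value of None means "no language override given for this extension".
    """
    result = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        ext, _, lang = item.partition(":")
        result["." + ext.lstrip(".")] = lang or None
    return result


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "root_path": env.get("COCOINDEX_CODE_ROOT_PATH"),  # None -> auto-discover
        "embedding_model": env.get("COCOINDEX_CODE_EMBEDDING_MODEL", DEFAULT_MODEL),
        "batch_size": int(env.get("COCOINDEX_CODE_BATCH_SIZE", "16")),
        "extra_extensions": parse_extra_extensions(
            env.get("COCOINDEX_CODE_EXTRA_EXTENSIONS", "")
        ),
    }
```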
Root Path Discovery
If COCOINDEX_CODE_ROOT_PATH is not set, the codebase root is discovered by:
- Finding the nearest parent directory containing .cocoindex_code/
- Finding the nearest parent directory containing .git/
- Falling back to the current working directory
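The discovery steps above can be sketched roughly as follows. This is my reading of the rules, not the project's code: `discover_root` is a hypothetical name, I assume the starting directory itself counts as a candidate, and I assume the .cocoindex_code/ marker is searched through all parents before .git/ is considered.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def discover_root(start: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the codebase root for a process started in `start`."""
    env = os.environ if env is None else env

    # An explicit setting always wins.
    explicit = env.get("COCOINDEX_CODE_ROOT_PATH")
    if explicit:
        return Path(explicit)

    start = start.resolve()
    candidates = [start, *start.parents]  # nearest directory first

    # Try each marker in priority order, walking upward from `start`.
    for marker in (".cocoindex_code", ".git"):
        for directory in candidates:
            if (directory / marker).is_dir():
                return directory

    # No marker found anywhere: fall back to the starting directory.
    return start
```

Calling `discover_root(Path.cwd())` from inside a nested package directory would, under these assumptions, return the enclosing repository root.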
Embedding model
By default, this project uses a local SentenceTransformers model (sentence-transformers/all-MiniLM-L6-v2). No API key is required, and it is completely free!
A code-specific embedding model can give better semantic understanding and better results. This project supports all models on Ollama and 100+ cloud providers.
Set COCOINDEX_CODE_EMBEDDING_MODEL to any LiteLLM-supported model, along with the provider's API key:
Ollama (Local)
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=ollama/nomic-embed-text \
-- cocoindex-code
Set OLLAMA_API_BASE if your Ollama server is not at http://localhost:11434.
OpenAI
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=text-embedding-3-small \
-e OPENAI_API_KEY=your-api-key \
-- cocoindex-code
Azure OpenAI
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=azure/your-deployment-name \
-e AZURE_API_KEY=your-api-key \
-e AZURE_API_BASE=https://your-resource.openai.azure.com \
-e AZURE_API_VERSION=2024-06-01 \
-- cocoindex-code
Gemini
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=gemini/text-embedding-004 \
-e GEMINI_API_KEY=your-api-key \
-- cocoindex-code
Mistral
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=mistral/mistral-embed \
-e MISTRAL_API_KEY=your-api-key \
-- cocoindex-code
Voyage (Code-Optimized)
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=voyage/voyage-code-3 \
-e VOYAGE_API_KEY=your-api-key \
-- cocoindex-code
Cohere
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=cohere/embed-english-v3.0 \
-e COHERE_API_KEY=your-api-key \
-- cocoindex-code
AWS Bedrock
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \
-e AWS_ACCESS_KEY_ID=your-access-key \
-e AWS_SECRET_ACCESS_KEY=your-secret-key \
-e AWS_REGION_NAME=us-east-1 \
-- cocoindex-code
Nebius
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=nebius/BAAI/bge-en-icl \
-e NEBIUS_API_KEY=your-api-key \
-- cocoindex-code
Any model supported by LiteLLM works — see the full list of embedding providers.
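All of the provider commands above share one shape: a model name plus provider credentials passed with -e, followed by -- cocoindex-code. If you script your setup, a small helper can assemble that argument list. This is a convenience sketch only; `claude_mcp_add_command` is a hypothetical name, and it uses nothing beyond the flags already shown in the examples above.

```python
import shlex


def claude_mcp_add_command(env: dict[str, str]) -> list[str]:
    """Build the argv for `claude mcp add cocoindex-code` with -e KEY=VALUE pairs."""
    argv = ["claude", "mcp", "add", "cocoindex-code"]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    argv += ["--", "cocoindex-code"]
    return argv


if __name__ == "__main__":
    command = claude_mcp_add_command(
        {"COCOINDEX_CODE_EMBEDDING_MODEL": "ollama/nomic-embed-text"}
    )
    # Print a shell-safe version of the command instead of running it.
    print(shlex.join(command))
```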
GPU-optimised local model
If you have a GPU, nomic-ai/CodeRankEmbed delivers significantly better code retrieval than the default model. It has 137M parameters, requires ~1 GB of VRAM, and has an 8192-token context window.
claude mcp add cocoindex-code \
-e COCOINDEX_CODE_EMBEDDING_MODEL=sbert/nomic-ai/CodeRankEmbed \
-e COCOINDEX_CODE_BATCH_SIZE=16 \
-- cocoindex-code
Note: Switching models requires re-indexing your codebase (the vector dimensions differ).
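The reason behind this note is that similarity is only defined between vectors of the same length, so vectors stored by one model cannot be compared with query vectors from another. A minimal illustration in plain Python follows; the function is a generic cosine similarity written for this example, not the project's search code, and the tiny vectors are made up.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; refuses vectors of different dimensions."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    stored = [0.1, 0.9]        # vector written by the old model (2 dims here)
    query = [0.2, 0.7, 0.1]    # vector produced by the new model (3 dims here)
    try:
        cosine_similarity(stored, query)
    except ValueError as err:
        print(f"re-index needed: {err}")
```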
MCP Tools
search
Search the codebase using semantic similarity.
search(
    query: str,                  # Natural language query or code snippet
    limit: int = 10,             # Maximum results (1-100)
    offset: int = 0,             # Pagination offset
    refresh_index: bool = True   # Refresh index before querying
)
The refresh_index parameter controls whether the index is refreshed before searching:
- True (default): Refreshes the index to include any recent changes
- False: Skips the refresh for faster consecutive queries
Returns matching code chunks with:
- File path
- Language
- Code content
- Line numbers (start/end)
- Similarity score
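The limit, offset, and refresh_index parameters combine naturally into a paging loop: refresh once on the first page, then skip the refresh for the following pages. The sketch below assumes a `search` callable that accepts the documented parameters as keyword arguments and returns a list of results; how you obtain such a callable depends on your MCP client and is not shown. `iter_results` is a hypothetical helper.

```python
from typing import Callable, Iterator


def iter_results(
    search: Callable[..., list],
    query: str,
    page_size: int = 10,
    max_results: int = 100,
) -> Iterator[dict]:
    """Yield results page by page, refreshing the index only for the first page."""
    offset = 0
    while offset < max_results:
        limit = min(page_size, max_results - offset)
        page = search(
            query=query,
            limit=limit,
            offset=offset,
            refresh_index=(offset == 0),  # later pages reuse the fresh index
        )
        yield from page
        if len(page) < limit:  # a short page means there is nothing left
            break
        offset += len(page)
```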
Supported Languages
| Language | Aliases | File Extensions |
|---|---|---|
| c | | .c |
| cpp | c++ | .cpp, .cc, .cxx, .h, .hpp |
| csharp | csharp, cs | .cs |
| css | | .css, .scss |
| dtd | | .dtd |
| fortran | f, f90, f95, f03 | .f, .f90, .f95, .f03 |
| go | golang | .go |
| html | | .html, .htm |
| java | | .java |
| javascript | js | .js |
| json | | .json |
| kotlin | | .kt, .kts |
| markdown | md | .md, .mdx |
| pascal | pas, dpr, delphi | .pas, .dpr |
| php | | .php |
| python | | .py |
| r | | .r |
| ruby | | .rb |
| rust | rs | .rs |
| scala | | .scala |
| solidity | | .sol |
| sql | | .sql |
| swift | | .swift |
| toml | | .toml |
| tsx | | .tsx |
| typescript | ts | .ts |
| xml | | .xml |
| yaml | | .yaml, .yml |
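Language detection by file extension, including an ext:lang override from COCOINDEX_CODE_EXTRA_EXTENSIONS, might look like the sketch below. It covers only a handful of rows from the table, `detect_language` is a hypothetical name, and the case-insensitive matching is my assumption rather than documented behavior.

```python
from pathlib import PurePath
from typing import Optional

# A small subset of the table above, for illustration only.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".rs": "rust",
    ".go": "go",
    ".js": "javascript",
    ".ts": "typescript",
    ".tsx": "tsx",
    ".md": "markdown",
    ".mdx": "markdown",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def detect_language(path: str, overrides: Optional[dict] = None) -> Optional[str]:
    """Map a file path to a language name, or None if the extension is unknown.

    `overrides` maps extensions to languages, e.g. {".inc": "php"}.
    """
    suffix = PurePath(path).suffix.lower()
    if overrides and suffix in overrides:
        return overrides[suffix]
    return EXTENSION_TO_LANGUAGE.get(suffix)
```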
Common generated directories are automatically excluded:
- __pycache__/
- node_modules/
- target/
- dist/
- vendor/ (Go vendored dependencies, matched by domain-based child paths)
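As a rough sketch of this kind of exclusion filter: the first four directory names are matched anywhere in the path, while vendor/ is excluded only when its child directory looks like a domain (for example vendor/github.com/...). That domain check is my interpretation of "domain-based child paths", and `is_excluded` is a hypothetical helper, not the project's implementation.

```python
from pathlib import PurePosixPath

EXCLUDED_DIRS = {"__pycache__", "node_modules", "target", "dist"}


def is_excluded(path: str) -> bool:
    """Return True if a relative POSIX-style file path lies in an excluded directory."""
    dirs = PurePosixPath(path).parts[:-1]  # drop the file name itself
    for i, part in enumerate(dirs):
        if part in EXCLUDED_DIRS:
            return True
        # Assumed rule: vendor/ counts only when its child looks like a domain.
        if part == "vendor" and i + 1 < len(dirs) and "." in dirs[i + 1]:
            return True
    return False
```

Under this reading, a first-party directory that merely happens to be called vendor, such as src/vendor/models.py, would still be indexed.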
Troubleshooting
sqlite3.Connection object has no attribute enable_load_extension
Some Python installations (e.g. the one pre-installed on macOS) ship with a SQLite library that doesn't enable extensions.
macOS fix: Install Python through Homebrew:
brew install python3
Then re-install cocoindex-code with the Homebrew Python:
pip3 install -U cocoindex-code
Large codebase / Enterprise
CocoIndex is an ultra-efficient indexing engine that also works on large codebases at scale on XXX G for enterprises. In enterprise scenarios, sharing an index with teammates is much more efficient when there are large repos or many repos. We also have advanced features, such as branch deduplication, designed for enterprise users.
If you need help with remote setup, please email our maintainer at linghua@cocoindex.io. Happy to help!
License
Apache-2.0