Local-first code search via MCP/CLI
Project description
Coco[-S]earch is a local-first hybrid semantic code search tool. It combines vector similarity and keyword matching (via RRF fusion) to find code by meaning, not just text. Powered by CocoIndex for indexing, Tree-sitter for syntax-aware chunking and symbol extraction, PostgreSQL with pgvector for storage, and Ollama for local embeddings. No external APIs: everything runs on your machine.
Available as a CLI, MCP server, or interactive REPL. Incremental indexing, .gitignore-aware. Supports 31+ languages with symbol-level filtering for 14+, plus domain-specific grammars for structured config files.
Table of Contents
- Disclaimer
- Quick Start
- Features
- Interfaces
- Where MCP Wins
- Useful Documentation
- Components
- How Search Works
- Supported Languages
- Supported Grammars
- Configuration
- Testing
- Troubleshooting
Disclaimer
This project was originally built for personal use: a solo experiment in local-first, privacy-focused code search to accelerate self-onboarding to new codebases and explore spec-driven development. Initially scaffolded with GSD and refined by hand. Ships with a CLI, MCP tools, dashboards (TUI/WEB), a status API, reusable Claude SKILLS, and a Claude Code plugin for one-command setup.
Quick Start
- Services:
# 1. Clone this repository and start infrastructure:
git clone https://github.com/VioletCranberry/coco-s.git && cd coco-s
# Docker volumes are bind-mounted to ./docker_data/ inside the repository,
# so infrastructure must be started from the cloned repo directory.
docker compose up -d
# 2. Verify services are ready.
uvx cocosearch config check
- Indexing your projects:
# 3.1 Use WEB Dashboard:
uvx cocosearch dashboard
# 3.2 Use CLI:
uvx cocosearch index .
# 3.3 Use AI and MCP - see below.
- Register with your AI assistant (pick one):
Option A -- Plugin (recommended):
claude plugin marketplace add VioletCranberry/coco-s
claude plugin install cocosearch@cocosearch
# All 7 skills + MCP server configured automatically
Option B -- Manual MCP registration:
claude mcp add --scope user cocosearch -- \
  uvx cocosearch mcp --project-from-cwd
Note: The MCP server automatically opens a web dashboard in your browser on a random port. Set COCOSEARCH_DASHBOARD_PORT=8080 to pin it to a fixed port, or COCOSEARCH_NO_DASHBOARD=1 to disable it.
Install skills manually (for development):
mkdir -p .claude/skills
for skill in cocosearch-onboarding cocosearch-refactoring cocosearch-debugging cocosearch-quickstart cocosearch-explore cocosearch-new-feature cocosearch-subway; do
  ln -sfn "../../skills/$skill" ".claude/skills/$skill"
done
Features
- Hybrid search -- combines semantic similarity and keyword matching via RRF fusion to find code by meaning and by text.
- Symbol filtering -- narrow results to functions, classes, methods, or interfaces; match symbol names with glob patterns.
- Context expansion -- results automatically expand to enclosing function/class boundaries using Tree-sitter, so you see complete units of code.
- Query caching -- exact and semantic cache for fast repeated queries (0.95 cosine threshold).
- Parse health tracking -- per-language parse status, failure details, and staleness warnings when the index drifts from your branch.
- Privacy-first -- everything runs locally. No external API calls, no telemetry.
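The two-level query cache described above (exact match first, then a semantic fallback at a 0.95 cosine threshold) could be sketched roughly as follows. This is an illustration only, not CocoSearch's actual code: the `QueryCache` class, its method names, the query normalization, and the default size of 128 are all assumptions.

```python
import hashlib
import math
from collections import OrderedDict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class QueryCache:
    """Hypothetical LRU cache: exact hash lookup, then semantic similarity fallback."""

    def __init__(self, max_size=128, threshold=0.95):
        self.max_size = max_size      # assumed default, not taken from the project
        self.threshold = threshold    # 0.95 cosine threshold from the feature list
        self._entries = OrderedDict() # key -> (embedding, results)

    @staticmethod
    def _key(query):
        # Assumption: queries are normalized (trimmed, lowercased) before hashing.
        return hashlib.sha256(query.strip().lower().encode("utf-8")).hexdigest()

    def get(self, query, embedding):
        key = self._key(query)
        if key not in self._entries:
            # Semantic fallback: reuse results of a sufficiently similar cached query.
            key = next(
                (k for k, (emb, _) in self._entries.items()
                 if cosine(embedding, emb) >= self.threshold),
                None,
            )
        if key is None:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][1]

    def put(self, query, embedding, results):
        key = self._key(query)
        self._entries[key] = (embedding, results)
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used entry
```

The linear scan in the fallback keeps the sketch short; a real cache with many entries would likely want an index over the embeddings.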
Interfaces
Search your code four ways -- pick what fits your workflow:
| Interface | Best for | How to start |
|---|---|---|
| CLI | One-off searches, scripting, CI | cocosearch search "auth flow" |
| Interactive REPL | Exploratory sessions: tweak filters, switch indexes, iterate on queries without restarting | cocosearch search --interactive |
| Web Dashboard | Visual search + index management in the browser: filters, syntax-highlighted results, charts, dark/light theme | cocosearch dashboard |
| MCP Server | AI assistant integration (Claude Code, Claude Desktop, OpenCode) | cocosearch mcp --project-from-cwd |
CLI
# Index a project
uvx cocosearch index /path/to/project
# Search with natural language
uvx cocosearch search "authentication flow" --pretty
# Serve CocoSearch WEB dashboard
uvx cocosearch dashboard
# Start interactive REPL
uvx cocosearch search --interactive
# View index stats with parse health
uvx cocosearch stats --pretty
❯ uv run cocosearch stats --pretty
Index: cocosearch
Source: GIT/personal/coco-s
Branch: main (0b6050b) · up to date
Status: Indexed
Files: 192 | Chunks: 2,023 | Size: 15.0 MB
Created: 2026-02-09 18:30
Last Updated: 2026-02-14 12:36 (0 days ago)
Language Distribution
| Language | Files | Chunks |
|---|---|---|
| py | 162 | 1648 |
| md | 22 | 267 |
| html | 1 | 100 |
| json | 3 | 3 |
| toml | 1 | 2 |
| yaml | 2 | 2 |
| docker-comp… | 1 | 1 |
Grammar Distribution
| Grammar | Base Language | Files | Chunks | Recognition % |
|---|---|---|---|---|
| docker-compose | yaml | 1 | 1 | 100.0% |
Symbol Statistics
| Type | Count |
|---|---|
| function | 927 |
| class | 229 |
Parse health: 100.0% clean (162/162 files)
Parse Status by Language
| Language | Files | OK | Partial | Error | No Grammar |
|---|---|---|---|---|---|
| python | 162 | 162 | 0 | 0 | 0 |
# View index stats with parse health live
uvx cocosearch stats --live
# List all indexes
uvx cocosearch list --pretty
| Name | Table | Branch | Status |
|---|---|---|---|
| cocosearch | codeindex_cocosearch__cocosearch_chunks | main (ed00733) | Indexed |
For the full list of commands and flags, see CLI Reference.
Web Dashboard
cocosearch dashboard opens a browser UI at http://localhost:8080 with:
- Code search -- natural language queries with language, symbol type, and hybrid search filters. Results show syntax-highlighted snippets, score badges, match type, and symbol metadata.
- Index management -- create, reindex (incremental or fresh), and delete indexes from the browser.
- Observability -- language distribution charts, parse health breakdown, staleness warnings, storage metrics.
Interactive REPL
cocosearch search --interactive starts a persistent search session:
cocosearch> authentication middleware
[results...]
cocosearch> :lang python
Language filter: python
cocosearch> error handling in views
[results filtered to Python...]
cocosearch> :index other-project
Switched to index: other-project
Settings persist across queries: change :limit, :lang, :context, or :index without restarting. Supports command history (up/down arrows) and inline filters (lang:python directly in queries).
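As a rough illustration of the inline filters mentioned above, the sketch below pulls a `lang:python` token out of a query string before it is searched. The function name `split_inline_filters` and the regular expression are hypothetical; only the `lang:` key is confirmed by this README, so the sketch handles nothing else.

```python
import re

# Only the "lang" inline filter is documented; other keys would be assumptions.
INLINE_FILTER = re.compile(r"\b(lang):(\S+)")


def split_inline_filters(raw_query):
    """Return (query_text, filters) with inline filter tokens removed from the text."""
    filters = {}

    def _capture(match):
        filters[match.group(1)] = match.group(2)
        return ""  # drop the filter token from the query text

    text = INLINE_FILTER.sub(_capture, raw_query)
    return " ".join(text.split()), filters


print(split_inline_filters("error handling lang:python in views"))
```

Keeping the filter out of the text that gets embedded avoids skewing the semantic match with a token that is not part of the question.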
Where MCP wins
For codebases of meaningful size, CocoSearch reduces the number of MCP tool calls needed to find relevant code, often from 5-15 iterative grep/read cycles down to 1-2 semantic searches. This means fewer round-trips, less irrelevant content in the context window, and lower token consumption for exploratory and intent-based queries.
- Exploratory/semantic queries: "how does authentication work", "where is error handling done", "find the caching logic".
- Native approach: Claude does 5-15 iterative grep/glob/read cycles, each adding results to context. Lots of trial-and-error, irrelevant matches, and full-file reads.
- CocoSearch: 1 search_code call returns ranked, pre-chunked results with smart context expansion to function/class boundaries. Dramatically fewer tokens in context.
- Identifier search with fuzzy intent: "find the function that handles user signup".
- Native grep requires Claude to guess the exact name (grep "signup", grep "register", grep "create_user"...). Each miss costs a round-trip + tokens.
- CocoSearch's hybrid RRF (vector + keyword) handles this in 1 call.
- Filtered searches: language/symbol type/symbol name filtering is built-in. Native tools require Claude to manually assemble glob patterns and filter results.
Useful Documentation
- How It Works
- Architecture Overview
- Search Features
- Dogfooding
- MCP Configuration
- MCP Tools Reference
- CLI Reference
- Retrieval Logic
- Adding Languages
Components
- Ollama -- runs the embedding model (nomic-embed-text) locally.
- PostgreSQL + pgvector -- stores code chunks and their vector embeddings for similarity search.
- CocoSearch -- CLI and MCP server that coordinates indexing and search.
Available MCP Tools
- index_codebase -- index a directory for semantic search
- search_code -- search indexed code with natural language queries
- list_indexes -- list all available indexes
- index_stats -- get statistics and parse health for an index
- clear_index -- remove an index from the database
Available Skills
- cocosearch-quickstart (SKILL.md): Use when setting up CocoSearch for the first time or indexing a new project. Guides through infrastructure check, indexing, and verification in under 2 minutes.
- cocosearch-debugging (SKILL.md): Use when debugging an error, unexpected behavior, or tracing how code flows through a system. Guides root cause analysis using CocoSearch semantic and symbol search.
- cocosearch-onboarding (SKILL.md): Use when onboarding to a new or unfamiliar codebase. Guides you through understanding architecture, key modules, and code patterns step-by-step using CocoSearch.
- cocosearch-refactoring (SKILL.md): Use when planning a refactoring, extracting code into a new module, renaming across the codebase, or splitting a large file. Guides impact analysis and safe step-by-step execution using CocoSearch.
- cocosearch-new-feature (SKILL.md): Use when adding new functionality, such as a new command, endpoint, module, handler, or capability. Guides placement, pattern matching, and integration using CocoSearch.
- cocosearch-explore (SKILL.md): Use for codebase exploration: answering questions about how code works, tracing flows, or researching a topic. Autonomous mode for subagent/plan mode research; interactive mode for user-facing "how does X work?" explanations.
- cocosearch-subway (SKILL.md): Use when the user wants to visualize codebase structure as an interactive London Underground-style subway map. AI-generated visualization using CocoSearch tools for exploration.
How Search Works
Query: "authentication flow"
                       │
            ┌──────────▼─────────┐
            │   Query Analysis   │  Detect identifiers
            │  (camelCase, etc.) │  -> auto-enable hybrid
            └──────────┬─────────┘
                       │
            ┌──────────▼─────────┐
            │  Ollama Embedding  │  nomic-embed-text
            │   768-dim vector   │  (runs locally)
            └──────────┬─────────┘
                       │
           ┌───────────┴───────────┐
           │                       │
┌──────────▼─────────┐  ┌──────────▼─────────┐
│  Vector Similarity │  │   Keyword Search   │
│  (pgvector cosine) │  │   (tsvector FTS)   │
└──────────┬─────────┘  └──────────┬─────────┘
           │                       │
           └───────────┬───────────┘
                       │
            ┌──────────▼─────────┐
            │     RRF Fusion     │  Reciprocal Rank Fusion
            │  + Definition 2x   │  merges both ranked lists
            └──────────┬─────────┘
                       │
            ┌──────────▼─────────┐
            │  Symbol & Language │  --symbol-type function
            │      Filtering     │  --language python
            └──────────┬─────────┘
                       │
            ┌──────────▼─────────┐
            │  Context Expansion │  Expand to enclosing
            │    (Tree-sitter)   │  function/class boundaries
            └──────────┬─────────┘
                       │
            ┌──────────▼─────────┐
            │     Query Cache    │  Exact hash + semantic
            │    (LRU + 0.95)    │  similarity fallback
            └──────────┬─────────┘
                       │
                       ▼
                Ranked Results
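The fusion step in the pipeline above can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion, not CocoSearch's implementation: the function name `rrf_fuse` is hypothetical, the constant `k = 60` is the value conventionally used for RRF rather than one stated here, and reading "Definition 2x" as doubling the fused score of chunks that contain a symbol definition is an assumption.

```python
from collections import defaultdict

RRF_K = 60  # assumption: the conventional RRF constant, not confirmed for CocoSearch


def rrf_fuse(vector_ranked, keyword_ranked, definitions=frozenset(), k=RRF_K):
    """Merge two ranked lists of chunk ids into one list, best first."""
    scores = defaultdict(float)
    for ranked in (vector_ranked, keyword_ranked):
        for rank, chunk_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank); chunks found by both add up.
            scores[chunk_id] += 1.0 / (k + rank)
    for chunk_id in scores:
        if chunk_id in definitions:
            scores[chunk_id] *= 2.0  # assumed meaning of the "Definition 2x" boost
    # Sort by descending score, breaking ties by chunk id for stable output.
    return sorted(scores, key=lambda c: (-scores[c], c))


print(rrf_fuse(["a", "b", "c"], ["b", "d"]))
```

Because RRF works on ranks rather than raw scores, the cosine distances from pgvector and the full-text ranks never need to be put on a common scale.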
Supported Languages
CocoSearch indexes 31 programming languages. Symbol-aware languages support --symbol-type and --symbol-name filtering.
| Language | Extensions |
|---|---|
| C | .c, .h |
| C++ | .cpp, .cc, .cxx, .hpp, .hxx |
| C# | .cs |
| CSS | .css, .scss |
| DTD | .dtd |
| Fortran | .f, .f90, .f95, .f03 |
| Go | .go |
| Groovy | .groovy, .gradle |
| HTML | .html, .htm |
| Java | .java |
| JavaScript | .js, .mjs, .cjs, .jsx |
| JSON | .json |
| Kotlin | .kt, .kts |
| Markdown | .md, .mdx |
| Pascal | .pas, .dpr |
| PHP | .php |
| Python | .py, .pyw, .pyi |
| R | .r, .R |
| Ruby | .rb |
| Rust | .rs |
| Scala | .scala |
| Solidity | .sol |
| SQL | .sql |
| Swift | .swift |
| TOML | .toml |
| TypeScript | .ts, .tsx, .mts, .cts |
| XML | .xml |
| YAML | .yaml, .yml |
| Bash | .sh, .bash, .zsh |
| Dockerfile | Dockerfile |
| HCL | .tf, .hcl, .tfvars |
How chunking works
Chunking strategy depends on the language:
- Tree-sitter chunking (~20 languages): CocoIndex's SplitRecursively uses Tree-sitter internally to split at syntax-aware boundaries (function/class edges). Covers Python, JavaScript, TypeScript, Go, Rust, Java, C, C++, C#, Ruby, PHP, and others in CocoIndex's built-in list.
- Custom handler chunking (6 languages): HCL, Dockerfile, Bash, Go Template, Scala, and Groovy use regex-based CustomLanguageSpec separators tuned for their syntax, because no Tree-sitter grammar is available for these in CocoIndex.
- Text fallback: Languages not recognized by either tier (Markdown, JSON, YAML, TOML, etc.) are split on blank lines and whitespace boundaries.
Independently of chunking, CocoSearch runs its own Tree-sitter queries (.scm files in src/cocosearch/indexer/queries/) to extract symbol metadata: function, class, method, and interface names and signatures. This powers --symbol-type and --symbol-name filtering. Symbol extraction is available for 14 languages.
In short: CocoIndex's Tree-sitter tells you where to cut; the .scm files tell you what's inside each piece.
See Adding Languages for details on how these tiers work and how to add new languages or grammars.
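To make the text fallback tier more concrete, here is a simplified sketch of splitting on blank lines and packing the pieces into size-limited chunks. It is not CocoIndex's SplitRecursively: the function name `fallback_chunks` is hypothetical, the finer whitespace-level splitting and chunk overlap are left out, and a single block larger than the limit is kept whole.

```python
import re


def fallback_chunks(text, chunk_size=1000):
    """Greedily pack blank-line-separated blocks into chunks of at most chunk_size bytes."""
    chunks, current = [], ""
    for block in re.split(r"\n\s*\n", text.strip()):
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate.encode("utf-8")) > chunk_size:
            chunks.append(current)  # adding this block would overflow: close the chunk
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(fallback_chunks("aaaa\n\nbbbb\n\ncccc", chunk_size=10))
```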
Supported Grammars
Beyond language-level support, CocoSearch recognizes grammars: domain-specific schemas within a base language. A language is matched by file extension (e.g., .yaml -> YAML), while a grammar is matched by file path and content patterns (e.g., .github/workflows/ci.yml containing on: + jobs: -> GitHub Actions). Grammars provide structured chunking and richer metadata compared to generic text chunking.
| Grammar | File Format | Path Patterns |
|---|---|---|
| docker-compose | yaml | docker-compose*.yml, docker-compose*.yaml, compose*.yml, compose*.yaml |
| github-actions | yaml | .github/workflows/*.yml, .github/workflows/*.yaml |
| gitlab-ci | yaml | .gitlab-ci.yml |
| helm-template | gotmpl | **/templates/*.yaml, **/templates/**/*.yaml, **/templates/*.yml, **/templates/**/*.yml |
| helm-values | yaml | **/values.yaml, **/values-*.yaml |
| kubernetes | yaml | *.yaml, *.yml |
How grammar matching works
Priority: Grammar match > Language match > TextHandler fallback.
A grammar is matched by file path patterns and optionally by content patterns. For example, a YAML file at .github/workflows/ci.yml containing on: + jobs: is recognized as GitHub Actions, not generic YAML. This enables structured chunking by job/step and richer metadata extraction (job names, service names, stages).
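The priority order above (grammar, then language, then text fallback) might look roughly like this in code. This is a sketch under stated assumptions: the `GRAMMARS` and `LANGUAGES` tables, the `classify` function, and the use of `fnmatch` are illustrative and not the project's API. Only the GitHub Actions content markers (on: and jobs:) come from this README; the docker-compose entry is shown with path patterns only.

```python
import os
from fnmatch import fnmatch

# Hypothetical registry; path patterns are taken from the grammar table.
GRAMMARS = [
    {
        "name": "github-actions",
        "paths": [".github/workflows/*.yml", ".github/workflows/*.yaml"],
        "content": ["on:", "jobs:"],
    },
    {
        "name": "docker-compose",
        "paths": ["docker-compose*.yml", "docker-compose*.yaml",
                  "compose*.yml", "compose*.yaml"],
        "content": [],  # content patterns are optional
    },
]

# Hypothetical subset of the extension-to-language mapping.
LANGUAGES = {".yaml": "yaml", ".yml": "yaml", ".py": "python"}


def classify(path, text):
    """Return (tier, name) using grammar > language > text fallback priority."""
    name = os.path.basename(path)
    for grammar in GRAMMARS:
        path_hit = any(fnmatch(path, p) or fnmatch(name, p) for p in grammar["paths"])
        content_hit = all(marker in text for marker in grammar["content"])
        if path_hit and content_hit:
            return ("grammar", grammar["name"])
    ext = os.path.splitext(name)[1].lower()
    if ext in LANGUAGES:
        return ("language", LANGUAGES[ext])
    return ("text", "text")


print(classify(".github/workflows/ci.yml", "on: push\njobs:\n  build: {}"))
```

A workflow file that lacks the expected content markers falls through to plain YAML, which is why both the path and the content are checked.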
Configuration
Create cocosearch.yaml in your project root to customize indexing:
indexing:
  # See also https://cocoindex.io/docs/ops/functions#supported-languages
  include_patterns:
    - "*.py"
    - "*.js"
    - "*.ts"
    - "*.go"
    - "*.rs"
  exclude_patterns:
    - "*_test.go"
    - "*.min.js"
  chunk_size: 1000 # bytes
  chunk_overlap: 300 # bytes
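To show how the include and exclude patterns from this configuration interact, here is a small sketch. It assumes patterns are matched against the file name with shell-style globbing and that an exclude match wins over an include match; the function name `should_index` is hypothetical and the real matching rules may differ.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns copied from the sample cocosearch.yaml above.
INCLUDE = ["*.py", "*.js", "*.ts", "*.go", "*.rs"]
EXCLUDE = ["*_test.go", "*.min.js"]


def should_index(path, include=INCLUDE, exclude=EXCLUDE):
    """Return True if the file would be indexed under the assumed rules."""
    name = PurePosixPath(path).name
    if any(fnmatch(name, pattern) for pattern in exclude):
        return False  # assumption: exclusion takes priority over inclusion
    return any(fnmatch(name, pattern) for pattern in include)


for candidate in ["src/app.py", "pkg/server_test.go", "static/app.min.js", "README.md"]:
    print(candidate, should_index(candidate))
```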
Testing
Tests use pytest. All tests are unit tests, fully mocked, and require no infrastructure. Markers are auto-applied based on directory -- no need to add them manually.
uv run pytest # Run all unit tests
uv run pytest tests/unit/search/test_cache.py -v # Single file
uv run pytest -k "test_rrf_double_match" -v # Single test by name
uv run pytest tests/unit/handlers/ -v # Handler tests
Troubleshooting
Dashboard shows "Indexing" but CLI shows "Indexed"
The web dashboard and CLI now share a status sync mechanism: when the dashboard detects a live indexing thread, it corrects the database status so both interfaces agree. If you still see a discrepancy, check whether indexing is genuinely running (CPU usage, docker stats for Ollama activity).
Index appears stuck in "Indexing" status
After 1 hour with no progress updates, the status auto-recovers to "Indexed". You can also run cocosearch index . again to force a fresh index, which will reset the status.
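The auto-recovery rule described above amounts to a staleness check on the last progress timestamp. The sketch below illustrates that idea only; the function name `effective_status` and its arguments are hypothetical, and how CocoSearch actually stores status and progress is not shown in this README.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=1)  # the 1-hour window mentioned above


def effective_status(status, last_progress, now):
    """Treat an 'Indexing' status with no recent progress as recovered to 'Indexed'."""
    if status == "Indexing" and now - last_progress > STALE_AFTER:
        return "Indexed"
    return status


now = datetime(2026, 2, 14, 12, 0)
print(effective_status("Indexing", now - timedelta(hours=2), now))
```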
High CPU after indexing appears complete
Ollama may still be processing embeddings in its queue. Check with docker stats or ps aux | grep ollama. CocoIndex may also perform background cleanup after the main indexing loop finishes.