Lynx

A 100% local MCP server for semantic code search — AST-aware chunking, hybrid BM25 + dense retrieval, and an optional code knowledge graph. Works with any MCP client (Claude Code, Cursor, Windsurf, Antigravity, ...).

Python 3.10+

Your AI assistant greps file names and guesses. Lynx gives it real retrieval over your code, your library docs, and your PDFs — without a single byte leaving your machine.

AST-aware indexing — tree-sitter parses 13+ languages and indexes whole functions/classes, not arbitrary text windows.
Hybrid retrieval — dense embeddings + code-tokenized BM25, fused with RRF; optional cross-encoder reranker.
Code knowledge graph (opt-in) — who-calls-what, inheritance, imports: ask "what breaks if I change this?" and get the actual blast radius.
Multi-source — index codebases, public docs sites (fetched once, on demand; JS-rendered SPAs supported via optional headless Chromium), and PDFs side by side.
Live index — a file watcher re-indexes saves in ~2s. No manual rebuild ritual.
Web manager UI — lynx manager ui gives you guided setup, a query playground, diagnostics, and client config snippets.
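
To make "whole functions/classes, not arbitrary text windows" concrete, here is a minimal sketch of AST-aware chunking. It is not Lynx's implementation: Lynx uses tree-sitter across many languages, while this stand-in uses Python's standard-library ast module and handles Python source only. The chunk fields (symbol, kind, start_line, ...) are illustrative assumptions, not Lynx's schema.

```python
import ast


def chunk_python_source(source: str, path: str = "<memory>") -> list[dict]:
    """Return one chunk per top-level function or class in `source`."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            start, end = node.lineno, node.end_lineno
            chunks.append(
                {
                    "symbol": node.name,
                    "kind": type(node).__name__,
                    "file": path,
                    "start_line": start,
                    "end_line": end,
                    # The chunk boundary follows the syntax tree, so a
                    # definition is never cut in half.
                    "text": "\n".join(lines[start - 1:end]),
                }
            )
    return chunks


SAMPLE = '''import math

def clamp(value, low, high):
    return max(low, min(value, high))

class Camera:
    def zoom(self, factor):
        self.scale = clamp(factor, 0.1, 10.0)
'''

chunks = chunk_python_source(SAMPLE, "camera.py")
for chunk in chunks:
    print(chunk["symbol"], chunk["kind"], chunk["start_line"], chunk["end_line"])
```

Each chunk carries its symbol name and line range, which is the kind of metadata a file + line + symbol citation needs.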

(Named after Lynceus, the Argonaut whose sharp eyes could find anything hidden.)

Quickstart

# 1. Install the CLI (isolated, no venv ritual)
pipx install lynx-mcp
#    or: uv tool install lynx-mcp

# 2. Create a config pointing at your project
lynx manager init

# 3. Build the index (downloads the ~130MB embedding model on first run)
lynx build

Then register Lynx in your MCP client (Claude Code shown; see the full guide for Cursor, Antigravity, and generic stdio clients — or let lynx manager ui generate the snippet for you):

{
  "mcpServers": {
    "lynx": {
      "command": "lynx",
      "args": ["serve", "--config", "/absolute/path/to/config.json"]
    }
  }
}
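
If you would rather generate that entry than hand-edit it, the sketch below builds the same JSON with an absolute config path. The helper name mcp_client_entry is hypothetical, and where your client stores its MCP configuration is not covered here; the sketch only prints the snippet.

```python
import json
from pathlib import Path


def mcp_client_entry(config_path: str) -> dict:
    """Build the mcpServers entry shown above for a given config file."""
    # Resolve to an absolute path, matching the "/absolute/path/to/config.json"
    # form used in the example entry.
    absolute = Path(config_path).expanduser().resolve()
    return {
        "mcpServers": {
            "lynx": {
                "command": "lynx",
                "args": ["serve", "--config", str(absolute)],
            }
        }
    }


snippet = mcp_client_entry("config.json")
print(json.dumps(snippet, indent=2))
```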

Prefer zero terminal? There are double-click installers for macOS and Windows.

The tools your AI gets

The tool set is fixed — it does not grow with the number of sources, so your client's tool list (and context window) stays small. Tools take a source argument where relevant.

Tool	What it answers
`search(query, source?)`	Primary hybrid search. Omit `source` to search every source at once (RRF-fused).
`deep_search(queries, source?)`	Escalation: tries multiple query phrasings until one passes a quality threshold.
`graph_query(operation, symbol?)`	Structural questions over the code graph. Operations: `callers`, `callees`, `subclasses`, `superclasses`, `imports`, `neighbors`, `shortest_path`, `overview`, `surprising_connections`, `status`.
`find_definition(symbol)`	Where is X defined? (AST-precise when the graph is on, BM25 fallback otherwise.)
`find_usages(symbol)`	Every use of X — calls and non-call references (generics, decorators, docs).
`find_tests_for(symbol)`	Are there tests for X?
`find_similar(snippet)`	Does code like this already exist?
`search_diff(query, base?)`	Search only the files changed vs a base branch — built for code review.
`feedback(trying_to_do, tried, stuck)`	The agent files a report when the index couldn't answer — stored 100% locally, your signal for tuning sources.
`list_sources` / `get_rag_status` / `update_source_index`	Introspection and maintenance.
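
For orientation, this is roughly what a client sends when the agent uses one of these tools. MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call. The argument names query and source come from the table above; the source name django-docs and the helper tool_call_request are hypothetical, and your MCP client normally builds this message for you.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call_request(tool: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Omitting `source` asks for a search across every source.
all_sources = tool_call_request("search", query="where do we clamp the camera zoom?")

# Passing `source` narrows the search ("django-docs" is a made-up source name).
one_source = tool_call_request("search", query="signal dispatch", source="django-docs")

print(json.dumps(all_sources, indent=2))
```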

All retrieval tools carry MCP readOnlyHint annotations (clients can auto-approve them), and the server ships its usage playbook in the MCP handshake (instructions + a lynx://guide resource) — your agent knows how to query well without any rules-file setup.
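
A client can act on those annotations with very little logic. The sketch below filters a tool listing down to the tools marked read-only. The listing is a hand-written stand-in for a tools/list response: which tools lack the hint here is an assumption made for illustration, not a statement about what Lynx reports.

```python
def auto_approvable(tools: list[dict]) -> list[str]:
    """Return the names of tools explicitly annotated as read-only."""
    return [
        tool["name"]
        for tool in tools
        if tool.get("annotations", {}).get("readOnlyHint") is True
    ]


# Illustrative listing only; the annotation values are assumed.
listed_tools = [
    {"name": "search", "annotations": {"readOnlyHint": True}},
    {"name": "find_definition", "annotations": {"readOnlyHint": True}},
    {"name": "update_source_index", "annotations": {"readOnlyHint": False}},
    {"name": "feedback"},
]

print(auto_approvable(listed_tools))
```

Treating a missing annotation as "not read-only" is the conservative choice: a tool has to opt in before it is approved without a prompt.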

How it works

your code ──► tree-sitter AST chunker ──► bge-small embeddings ──► ChromaDB
                                     └──► code-tokenized BM25 ─┐
query ───────────────────────────────────► RRF fusion ◄────────┘ ──► (optional reranker) ──► results
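
The "code-tokenized" BM25 branch in the diagram implies a tokenizer that understands identifiers. Lynx's actual tokenizer is not described here; the following is one plausible sketch that splits snake_case and camelCase names into lowercase word tokens before lexical scoring.

```python
import re

# Runs of letters or runs of digits; underscores, dots, etc. act as separators.
_WORD = re.compile(r"[A-Za-z]+|\d+")
# Split a letter run at camelCase boundaries, keeping acronyms together.
_CAMEL = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+")


def code_tokens(text: str) -> list[str]:
    """Tokenize source text or a query into lowercase identifier parts."""
    tokens = []
    for run in _WORD.findall(text):
        if run.isdigit():
            tokens.append(run)
        else:
            tokens.extend(part.lower() for part in _CAMEL.findall(run))
    return tokens


print(code_tokens("parseHTTPResponse"))
print(code_tokens("max_retry_count2"))
print(code_tokens("TaxCalculator.ComputeVat"))
```

With this kind of splitting, a query word such as "tax" can match the identifier TaxCalculator lexically, which a whitespace tokenizer would miss.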

Everything runs locally: HuggingFace models are downloaded once, then Lynx switches to offline mode. No telemetry, no cloud index, no code upload. The only network access is the model download and the explicit webdoc fetch step you trigger yourself.
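
The RRF fusion step in the diagram merges the dense and BM25 rankings by rank position rather than by raw score, so the two score scales never need to be normalized. Below is a minimal sketch of reciprocal rank fusion. The constant k = 60 is the value commonly used in the RRF literature; the value Lynx uses is not stated here, and the chunk identifiers are invented.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented chunk ids, ordered best-first by each retriever.
dense = ["zoom.py:clamp_zoom", "camera.py:Camera", "math.py:clamp"]
bm25 = ["math.py:clamp", "zoom.py:clamp_zoom", "input.py:on_scroll"]

fused = rrf_fuse([dense, bm25])
for doc_id, score in fused:
    print(f"{score:.5f}  {doc_id}")
```

A chunk that both retrievers rank highly rises to the top, while a chunk found by only one retriever still makes the list with a lower score.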

Why not just let the agent grep?

Grep is great when you know the identifier. It fails when you (or the agent) know the behavior: "where do we clamp the camera zoom?" matches nothing literal. Agentic grep also burns tokens — every wrong file the agent opens is context spent. Lynx answers behavioral queries in one tool call with file + line + symbol citations, and the graph layer answers structural questions (callers, inheritance) that grep fundamentally cannot — polymorphic dispatch leaves no textual trace.

Honest counterpoint: on a small repo that fits in the agent's context, built-in tools are fine. Lynx pays off on large codebases, on framework docs your model's training data has gone stale on, and on repeated sessions where re-exploring from scratch is waste.

Benchmarks (reproducible)

Lynx vs agentic grep: 58% fewer tokens to answer; 4 vs 101 tool calls to map a class hierarchy

On the django/ package of Django 5.2 (883 files, ~158k lines), 20 behavioral questions with known ground-truth files — full methodology, per-task results, and an intentionally strong grep baseline in benchmarks/RESULTS.md:

Metric	Agentic grep	Lynx
median tokens to answer (tool output + required follow-up read)	4,150	1,725
tool round-trips before the code is in context	2+	1 (chunks included, with symbol + file:line + score)
hit@1 / MRR	45% / 0.64	55% / 0.67
"what inherits from `Field`?" — full descendant tree (100 classes)	101 grep rounds	4 graph calls, same recall, file:line per edge

The ranking quality is comparable (Django's docstring-rich code is grep's best case — we say so in the report). The structural difference is not: every tool round-trip is a full model inference over the growing context, and class-relation questions force grep into one round per discovered class while graph_query reads resolved inheritance edges.
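
For readers unfamiliar with the metrics in the table, the sketch below shows how hit@1, MRR, and the token reduction are computed. The rank list is a toy example, not benchmark data; only the two median token counts (4,150 and 1,725) are taken from the table. The benchmark's own scoring lives in benchmarks/run_benchmark.py and may differ in detail.

```python
from typing import Optional


def hit_at_1(ranks: list[Optional[int]]) -> float:
    """Fraction of questions whose first result is a ground-truth file."""
    return sum(1 for rank in ranks if rank == 1) / len(ranks)


def mean_reciprocal_rank(ranks: list[Optional[int]]) -> float:
    """Average of 1/rank of the first correct result; a miss (None) counts as 0."""
    return sum(1.0 / rank for rank in ranks if rank is not None) / len(ranks)


def reduction(baseline: float, candidate: float) -> float:
    """Relative saving of `candidate` compared with `baseline`."""
    return (baseline - candidate) / baseline


# Toy data: rank of the first correct file per question, None if never found.
toy_ranks = [1, 2, None, 1]
print(hit_at_1(toy_ranks), mean_reciprocal_rank(toy_ranks))

# Median tokens to answer, from the table: 4,150 (grep) vs 1,725 (Lynx).
print(f"{reduction(4150, 1725):.0%}")
```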

# reproduce
git clone --depth 1 --branch 5.2 https://github.com/django/django.git benchmarks/_target/django
python benchmarks/run_benchmark.py && python benchmarks/structural_demo.py
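
The 101-round figure in the table follows from how a descendant tree has to be discovered without resolved edges: every class found needs its own "who inherits from this?" lookup. The sketch below counts those lookups on a toy edge list (a handful of Django field classes, hand-written here rather than extracted from the benchmark).

```python
from collections import deque


def descendants(root: str, subclasses_of: dict[str, list[str]]) -> tuple[list[str], int]:
    """Breadth-first walk; returns (descendants found, lookups performed)."""
    found: list[str] = []
    visited = {root}
    queue = deque([root])
    lookups = 0
    while queue:
        cls = queue.popleft()
        lookups += 1  # one "what inherits from cls?" round
        for child in subclasses_of.get(cls, []):
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found, lookups


# Toy hierarchy for illustration only.
TOY_EDGES = {
    "Field": ["CharField", "IntegerField"],
    "CharField": ["EmailField", "SlugField"],
    "IntegerField": ["BigIntegerField"],
}

found, rounds = descendants("Field", TOY_EDGES)
print(found, rounds)
```

The walk always performs one lookup per descendant plus one for the root, so 100 descendants cost 101 rounds when each lookup is a separate grep. How graph_query batches the same traversal into 4 calls is described in benchmarks/RESULTS.md, not reproduced here.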

Lynx + Coral: SQL that joins your code to your live tools

Coral is a local SQL engine over your live tools — GitHub, Sentry, Linear, Datadog. Point it at Lynx's source spec and Lynx becomes a SQL schema too, so one query can start from what's happening — an error, a ticket, a PR — and Lynx tells you where it lives in your code.

The move that makes it powerful: you don't type the search query — the join takes it from each row of the other table, so live data is matched to code automatically. Your code never leaves the machine; only the live-data side hits an API. Two-command setup in docs/CORAL.md.

Example 1 — Prep a stack of code reviews at once. You maintain a repo with a dozen open PRs and vague titles ("fix flaky retries", "tweak checkout"). Before you assign reviewers, you want to know which part of the codebase each one is actually about.

SELECT p.number, p.title, h.file, h.symbol, h.score
FROM github.pulls p
CROSS JOIN lynx.search(q => p.title) h    -- for each open PR, find the code its title is about
WHERE p.owner = 'your-org' AND p.repo = 'your-repo' AND p.state = 'open'
ORDER BY p.number;

Coral hands each open PR's title to Lynx, which returns the code area it most likely concerns. The match is semantic, so a PR titled "fix flaky retries" lands on PaymentWebhook.ScheduleRetry even though the title never names that symbol. Now you can route each review to whoever owns that area, or spot two PRs converging on the same file, without opening a single diff. Without it: open every PR, read the description and the diff, and build that map in your head.

Example 2 — A whole triage queue in one query. Monday morning: 30 unresolved Sentry issues, nobody's triaged them. Where does each one live in the code?

SELECT i.title, h.file, h.symbol, h.score
FROM sentry.issues i
CROSS JOIN lynx.search(q => i.title) h    -- for each issue, search the code with its own title
WHERE i.status = 'unresolved';

Here's the trick: for every issue, Coral takes its title (i.title) and feeds it into Lynx as the search query — one semantic code lookup per incident, all in a single statement. You get a correlation table — incident → most-likely code location — that you never had to build. Without it: open each issue, read it, switch to the editor, hunt for the code. Thirty times.

Example 3 — A new hire, a ticket, an unfamiliar repo. You just joined the team and get assigned "checkout shows the wrong tax." You've never opened this codebase and have no idea where tax is computed.

SELECT t.identifier, h.file, h.symbol
FROM linear.issues t
CROSS JOIN lynx.search(q => t.title) h    -- search the code using the ticket's own title
WHERE t.assignee = 'me' AND t.state = 'started'
LIMIT 5;

Same move as Example 2, but driven by your tickets: Coral hands each ticket's title to Lynx, which finds the matching code semantically, so "checkout shows the wrong tax" lands on TaxCalculator.ComputeVat even though the ticket never mentions VAT or names the symbol:

ticket	file	symbol
LIN-482	`CheckoutTotals.cs`	`TaxCalculator.ComputeVat`

You go from "where do I even start?" to "it's in TaxCalculator" in one query — instead of pinging a teammate or reading half the repo to get oriented.

In SQL terms: lynx.sources is a table (your indexed sources); lynx.search(q => '…') is a ranked search function — add source => '…' or top_k => N to narrow it. Full setup and JOIN syntax: docs/CORAL.md.
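
If the join-driven search in the three examples feels abstract, it can be pictured as a loop: for every row of the live-data table, call the search function with that row's title and attach the hits. The sketch below emulates that in plain Python. It assumes the join behaves like a lateral table function (a row with no hits drops out); stub_search is a keyword-overlap stand-in for Lynx, and the ticket LIN-500 and the file name PaymentWebhook.cs are invented.

```python
from typing import Callable

# Tiny stand-in index; a real search would be semantic, not keyword overlap.
INDEX = [
    {"file": "CheckoutTotals.cs", "symbol": "TaxCalculator.ComputeVat",
     "keywords": {"tax", "vat", "checkout"}},
    {"file": "PaymentWebhook.cs", "symbol": "PaymentWebhook.ScheduleRetry",
     "keywords": {"retry", "retries", "webhook"}},
]


def stub_search(query: str) -> list[dict]:
    """Rank index entries by how many query words they share."""
    words = set(query.lower().split())
    hits = []
    for entry in INDEX:
        score = len(words & entry["keywords"])
        if score:
            hits.append({"file": entry["file"], "symbol": entry["symbol"], "score": score})
    return sorted(hits, key=lambda hit: -hit["score"])


def cross_join_search(rows: list[dict], search: Callable[[str], list[dict]],
                      text_column: str = "title", top_k: int = 1) -> list[dict]:
    """For each row, run the search on its text column and join the top hits."""
    joined = []
    for row in rows:
        for hit in search(row[text_column])[:top_k]:
            joined.append({**row, **hit})
    return joined


tickets = [
    {"identifier": "LIN-482", "title": "checkout shows the wrong tax"},
    {"identifier": "LIN-500", "title": "update the README"},
]

joined = cross_join_search(tickets, stub_search)
for row in joined:
    print(row["identifier"], row["file"], row["symbol"])
```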

Documentation


Document	What it covers
Full guide	Configuration, all source types (codebase / webdoc / PDF), retrieval internals, troubleshooting
Manager UI	Guided setup, playground, diagnostics
Use Lynx from Coral	SQL over your code search: `SELECT ... FROM lynx.search` joined with live GitHub/Sentry data
config.example.json	Annotated example configuration

Status

Actively developed by one author; APIs may still move before 1.x stabilizes. Issues and PRs welcome — the test suite runs with pytest and CI must stay green.

License

Apache 2.0

Release history

Version	Released
1.7.4	Jun 27, 2026
1.7.3	Jun 22, 2026
1.7.2	Jun 21, 2026
1.7.1	Jun 21, 2026
1.7.0	Jun 19, 2026
1.6.0	Jun 19, 2026
1.5.1	Jun 15, 2026
1.5.0	Jun 15, 2026
1.4.1	Jun 14, 2026
1.4.0 (this version)	Jun 14, 2026
1.3.1	Jun 14, 2026
1.3.0	Jun 12, 2026
1.2.0	Jun 12, 2026
1.1.2	Jun 12, 2026
1.1.1	Jun 12, 2026
1.1.0	Jun 12, 2026

Download files

File	Type	Size	Uploaded	Tags
lynx_mcp-1.4.0.tar.gz	Source distribution	421.3 kB	Jun 14, 2026	Source
lynx_mcp-1.4.0-py3-none-any.whl	Built distribution	363.5 kB	Jun 14, 2026	Python 3

Both files were uploaded via twine/6.2.0 CPython/3.14.3, without Trusted Publishing.

Hashes for lynx_mcp-1.4.0.tar.gz

Algorithm	Hash digest
SHA256	`4e016a5eaeb30d921347b5f7ff350768d1ff827fcd37ea7d8a002184848a33f1`
MD5	`7cee189e58f54f02695cbb5ca9764a3b`
BLAKE2b-256	`0ef9acf2350fd5e5e50617a87752e738543711adc94ee86f02007de34115dc3d`

Hashes for lynx_mcp-1.4.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`edf2560917f67e5aebab8c2965d5f36c237c8be48d4e70b700ba79e49f281c1b`
MD5	`0c980c47961dfc829e09cd3a6fe746a5`
BLAKE2b-256	`3fd3c6ed8c971ec9e959b364d828220b01e4f2bc4cabff2dcf0a54349d606948`
