Local Python docs MCP server, accelerated with Rust

pydocs-mcp

Local, version-aware code & docs search for your AI coding agent — over the exact library versions installed on your machine.

Your AI assistant thinks you're on requests 2.28. You actually have 2.31. It calls a kwarg that was renamed two versions ago, your test fails, and you lose twenty minutes. The fix isn't a smarter prompt — it's giving the AI docs that match your lockfile, not the average of every StackOverflow answer it ever read.

pydocs-mcp indexes your project plus every installed dependency, right on your machine, in seconds. Your agent connects over MCP and gets answers grounded in your code — fully offline.

What you get

Matched to your install. Searches the exact versions sitting in your site-packages, so your agent stops inventing APIs from some older release.
Private & offline. Everything runs locally — no API keys, no uploads, no rate limits, no per-query fees.
Three ways to find code. Keyword, meaning, and LLM reasoning (see How it works) — on their own or fused into one ranked answer.
Knows how your code connects. Ask "what calls this?", "what does it call?", or "what does this class inherit?" — across your project and every dependency.
Lean, not bloated. Minimal dependencies: no PyTorch, no FAISS. A small local ONNX embedder plus the Rust TurboQuant vector store (turbovec), which packs embeddings ~16× smaller than float32 (a 1536-dim vector drops from 6,144 to 384 bytes, so 10M such vectors take about 3.8 GB instead of about 61 GB) and benchmarks faster than FAISS FastScan. The on-disk index stays tiny and search stays quick.
Cheap to keep current. Edit a doc and only the changed chunks are re-embedded — partial re-ingestion, not a full rebuild — while unchanged packages are skipped in under 100 ms. A Rust core does the heavy lifting.
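To make "matched to your install" concrete, here is a minimal sketch of reading the versions that are actually installed, using only the standard library's importlib.metadata. It illustrates the idea and is not pydocs-mcp's own indexing code; the function name installed_versions is hypothetical.

```python
from importlib import metadata


def installed_versions() -> dict[str, str]:
    """Map each installed distribution's name to the version found on disk."""
    versions: dict[str, str] = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            versions[name.lower()] = version
    return versions


if __name__ == "__main__":
    # Print the first few name==version pairs from the current environment.
    for name, version in sorted(installed_versions().items())[:10]:
        print(f"{name}=={version}")
```

Whatever this prints for requests in your environment is the version an index built on your machine would describe, rather than whichever release a model happened to see in training.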

How it works

Three steps, all on your machine:

1. Index: pydocs-mcp scans your project and installed deps into a local SQLite database (code chunks, metadata, and a graph of how everything references everything else) plus a compact TurboQuant .tq vector file for meaning-based search. Re-running is cheap: unchanged packages are skipped, and when a file does change, only its changed chunks are re-embedded.
2. Search: each query can use three complementary modes and fuse them into one ranked list:
   - Keyword: instant, exact matches for names, error strings, and signatures.
   - Meaning: dense embeddings find the right code even when your words differ from the docs', via a small model that runs locally.
   - Reasoning: for broad or structural questions, an LLM walks your code's map (titles + summaries, no embeddings) to pick the best spots.
3. Answer: results flow back to your agent through two simple tools: search (find by relevance) and lookup (jump to a known name, or trace its callers, callees, and inheritance).

The only call that ever leaves your machine is the optional reasoning mode — and only if you turn it on with your own key.
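The "only changed chunks are re-embedded" behaviour from the Index step can be pictured with content hashes. The sketch below is one plausible way to plan such an update and is an assumption, not the project's actual change-detection logic; chunk_hash and plan_reembedding are hypothetical names.

```python
import hashlib


def chunk_hash(text: str) -> str:
    """Stable fingerprint of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reembedding(
    old_hashes: dict[str, str], new_chunks: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare stored hashes with freshly chunked text.

    Returns (chunk ids to embed, chunk ids to delete, hashes to store).
    """
    new_hashes = {cid: chunk_hash(text) for cid, text in new_chunks.items()}
    # New or modified chunks need a fresh embedding.
    to_embed = sorted(cid for cid, h in new_hashes.items() if old_hashes.get(cid) != h)
    # Chunks that vanished from the file should be dropped from the index.
    to_delete = sorted(cid for cid in old_hashes if cid not in new_hashes)
    return to_embed, to_delete, new_hashes
```

With this shape, an unchanged file yields two empty lists, so the embedder is never invoked for it.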

Quick start

pip install -e .                  # pure Python, works everywhere
# …or with the Rust core for speed:
pip install maturin && maturin develop --release

Linux needs OpenBLAS for the vector store (macOS and Windows already ship it):

sudo apt-get install -y libopenblas-pthread-dev

Then index your project and start the server:

pydocs-mcp serve .                            # index project + deps, serve over MCP (stdio)
pydocs-mcp serve . --gpu                      # …same, with CUDA-accelerated embeddings
pydocs-mcp search "batch inference"           # the same search, from the CLI
pydocs-mcp lookup requests.auth.HTTPBasicAuth --show inherits

Embeddings run on CPU by default. Add --gpu to serve / index (or the benchmark runner) to move all embedder inference — FastEmbed, the sentence_transformers provider, and PyLate — onto CUDA. It's a latency knob only: no YAML change, no re-index, identical results. Needs the matching GPU runtime — see INSTALL.md.

Live re-indexing (optional)

To keep the index fresh while you edit code, install the watch extras and pick one of two modes. Both debounce edits to .py, .md, and .ipynb files into a single reindex.

pip install 'pydocs-mcp[watch]'
pydocs-mcp serve . --watch   # MCP server + watcher (for AI clients)
pydocs-mcp watch .            # watcher only (no MCP server; index stays fresh for CLI `search` / `lookup`)

Both modes share the same YAML tunables: debounce, file extensions, and ignored paths live under serve.watch.* in your pydocs-mcp.yaml (see DOCUMENTATION.md).
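Debouncing here means that a burst of saves collapses into one reindex once the editor goes quiet. The sketch below shows that idea with explicit timestamps so it is easy to reason about; the Debouncer class, its field names, and the 0.5 second quiet period are illustrative assumptions and do not reflect the real serve.watch.* settings or defaults.

```python
from dataclasses import dataclass, field
from typing import Optional

# Extensions named in the text above; everything else is ignored.
WATCHED_SUFFIXES = (".py", ".md", ".ipynb")


@dataclass
class Debouncer:
    quiet_seconds: float
    pending: set[str] = field(default_factory=set)
    last_event: Optional[float] = None

    def on_event(self, path: str, now: float) -> None:
        """Record an edit; each new edit restarts the quiet period."""
        if path.endswith(WATCHED_SUFFIXES):
            self.pending.add(path)
            self.last_event = now

    def poll(self, now: float) -> list[str]:
        """Return one batch of paths once the quiet period has elapsed, else []."""
        if self.last_event is None or now - self.last_event < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        self.last_event = None
        return batch
```

A watcher loop would call on_event from its file-system callback and poll on a timer, triggering a single reindex whenever poll returns a non-empty batch.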

Point Claude Code, Cursor, or Continue.dev at it over stdio — copy-paste client configs are in DOCUMENTATION.md, and install troubleshooting (including the libopenblas fallback) is in INSTALL.md.

How it compares

pydocs-mcp, Context7, and Neuledge Context all feed docs to an AI agent over MCP, but optimize for different things. They aren't mutually exclusive — an agent can mount all three and route by intent.

| | pydocs-mcp | Context7 | Neuledge Context |
|---|---|---|---|
| Deployment | Local stdio MCP server | Hosted MCP (`mcp.context7.com`) | Local stdio MCP server |
| Doc source | Your installed Python deps + your own project, indexed in place | Curated community docs hosted by Upstash | Community registry (~100+ libraries), pulled then queried locally |
| Version match | Exactly what's in your `site-packages`, automatically | Library + version chosen in the prompt | Latest from the registry |
| Languages | Python | Multi-language | Multi-language (~100+ libraries) |
| Retrieval | Keyword (BM25) + dense embeddings + LLM tree reasoning, fused via RRF or weighted scores | Not publicly documented | BM25 over SQLite FTS5 |
| Code-structure queries | Reference graph: `lookup(show=callers\|callees\|inherits)` | None (doc retrieval only) | None (doc retrieval only) |
| Indexes your code | Yes, under the `__project__` package | No | No |
| Privacy | Fully offline with the default embedder (zero network calls) | Queries hit Upstash; OAuth + API key | Local once packages are downloaded |
| Dependencies | Lean: no PyTorch, no FAISS (Rust TurboQuant store + small ONNX embedder) | Hosted service (nothing to install) | Local service |
| Cost | $0, OSS (MIT); no keys, limits, or fees | Free tier (rate-limited) + paid plans | $0, OSS (Apache-2.0) |

In short: choose pydocs-mcp for offline, version-matched Python retrieval where you also navigate code structure; Context7 for hosted, multi-language docs; Neuledge for a local-first multi-language registry.

Benchmarked, not hand-waved

pydocs-mcp ships a real benchmark harness that scores retrieval quality on public benchmarks (RepoQA, DS-1000) and head-to-head against Context7 and Neuledge — with confidence intervals and plots. See benchmarks/README.md.

Retrieval methods & R&D

Each method below is a named step under python/pydocs_mcp/retrieval/steps/, addressable from YAML. The default chunk_search.yaml composes BM25 + single-vector dense fused via RRF; everything else is opt-in via a preset swap (--config), with no behavioral change for default installs.

Keyword — BM25 over SQLite FTS5

Full-text search with porter stemming and the unicode61 tokenizer. Free, instant, and the baseline that every other method composes with through the fusion steps below.
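For readers unfamiliar with FTS5, the following sketch shows what a porter-stemmed, unicode61-tokenized table with BM25 ranking looks like from Python's built-in sqlite3 module. The table name, column names, and sample rows are made up for illustration and are not pydocs-mcp's actual schema; it also assumes your SQLite build includes FTS5, which most do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'porter unicode61' wraps the unicode61 tokenizer with porter stemming.
conn.execute(
    "CREATE VIRTUAL TABLE chunks USING fts5(name, body, tokenize = 'porter unicode61')"
)
conn.executemany(
    "INSERT INTO chunks (name, body) VALUES (?, ?)",
    [
        ("requests.get", "Sends a GET request and returns a Response object."),
        ("requests.auth.HTTPBasicAuth", "Attaches HTTP Basic Authentication to the given Request object."),
        ("json.loads", "Deserialize a JSON document to a Python object."),
    ],
)


def keyword_search(query: str, limit: int = 5) -> list[str]:
    """Return chunk names ranked by BM25 (lower bm25() value means a better match)."""
    rows = conn.execute(
        "SELECT name FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT ?",
        (query, limit),
    ).fetchall()
    return [name for (name,) in rows]
```

Because of stemming, a query such as "sending requests" matches the row whose text says "Sends a GET request", even though neither word appears in that exact form.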

Single-vector dense — FastEmbed + TurboQuant

Embedder. FastEmbed with BAAI/bge-small-en-v1.5 by default — runs on CPU via ONNX, no PyTorch, no torch download. OpenAI text-embedding-3-small is the optional alternative for users with an API key. Pass --gpu to run the on-device embedders (FastEmbed / sentence_transformers) on CUDA instead — same vectors, lower latency.
Bigger on-device model — the sentence_transformers provider. For stronger dense recall without an API key, switch to Qwen/Qwen3-Embedding-0.6B served via sentence-transformers (torch). It is GPU-reliable — torch frees CUDA memory between sequential index-builds — and the weights download at runtime on first use. Install the extra (pip install 'pydocs-mcp[sentence-transformers]', ~1-5 GB with torch), then set it in your YAML:
```
embedding:
  provider: sentence_transformers
  model_name: Qwen/Qwen3-Embedding-0.6B
  dim: 1024
  # Optional. Token cap (attention is O(seq^2) — the OOM guard). Omit to
  # use the embedder's own default (2048).
  max_seq_length: 2048
  # Optional. L2-normalize output (default true).
  normalize: true
  # Optional. Named asymmetric query prompt; omit to use the model's own.
  query_prompt_name: query
```
The default remains bge-small; the sentence_transformers provider is opt-in.
Vector store. TurboQuant (turbovec): Online Vector Quantization with near-optimal distortion. ~16× smaller than float32 (a 1536-dim vector drops from 6,144 to 384 bytes, so 10M such vectors take about 3.8 GB instead of about 61 GB) and faster than FAISS FastScan at the same recall. Persists as a .tq sidecar next to the SQLite DB.
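The storage figures are simple arithmetic, worked through below for 1536-dim vectors specifically. This only checks the per-vector numbers quoted above and derives corpus totals from them; it ignores any index overhead, and a corpus with a different embedding dimension scales proportionally.

```python
DIM = 1536             # dimensions per vector, as in the example above
FLOAT32_BYTES = 4      # bytes per float32 component
PACKED_BYTES = 384     # quantized size per vector quoted above
N_DOCS = 10_000_000

raw_bytes = DIM * FLOAT32_BYTES         # 6144 bytes per float32 vector
ratio = raw_bytes / PACKED_BYTES        # 16.0x smaller
bits_per_dim = PACKED_BYTES * 8 / DIM   # 2.0 bits per dimension


def corpus_gb(bytes_per_vector: int, n_docs: int = N_DOCS) -> float:
    """Raw vector payload for a corpus, in decimal gigabytes."""
    return bytes_per_vector * n_docs / 1e9


if __name__ == "__main__":
    print(f"float32: {corpus_gb(raw_bytes):.2f} GB, packed: {corpus_gb(PACKED_BYTES):.2f} GB")
```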

Late-interaction (multi-vector / MaxSim) — opt-in

The flagship R&D backend. One vector per token instead of one pooled vector per chunk; queries score via ColBERT's MaxSim — for each query token, take the maximum cosine to any document token, then sum. Higher recall on long, structurally distant queries (often the hard cases for single-vector retrievers).
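The MaxSim rule described above fits in a few lines. This is a plain-Python sketch of the scoring formula only, with tiny hand-made token vectors; real engines such as PLAID compute the same quantity over compressed indexes, and nothing here reflects fast-plaid's API.

```python
import math

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def maxsim(query_tokens: list[Vector], doc_tokens: list[Vector]) -> float:
    """For each query token take the best-matching document token, then sum."""
    return sum(max(cosine(q, d) for d in doc_tokens) for q in query_tokens)


query = [[1.0, 0.0], [0.0, 1.0]]
doc_full = [[1.0, 0.0], [0.0, 1.0]]   # covers both query tokens
doc_partial = [[1.0, 0.0]]            # covers only the first one
```

Here doc_full scores higher than doc_partial because every query token finds a close match, which is the property that helps on queries whose parts are answered in different places of a chunk.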

Method. ColBERT late interaction (Khattab & Zaharia, SIGIR 2020).
Engine. PLAID (Santhanam et al., CIKM 2022) via fast-plaid — a Rust-backed IVF + residual-decompression engine. Persists as a per-project directory sidecar at ~/.pydocs-mcp/{slug}.plaid/.
Embedder. PyLate (arXiv:2508.03555) with the default model lightonai/LateOn-Code — late-interaction trained on code.
Lighter-weight model — lightonai/LateOn-Code-edge. For a smaller per-token footprint, point the same PyLate path at lightonai/LateOn-Code-edge (48-dim token vectors instead of LateOn-Code's 128) in your YAML:
```
late_interaction:
  enabled: true
  provider: pylate
  model_name: lightonai/LateOn-Code-edge
  embedding_dim: 48
  document_length: 2048
  query_length: 256
```
The default stays LateOn-Code; LateOn-Code-edge is opt-in.
SQLite + fast-plaid coupling. A chunk_multi_vector_ids mapping table bridges SQLite's chunk_id to fast-plaid's plaid_doc_id. The shipped FilterAdapter Protocol pushes metadata filters down to SQLite, then the result chunk-id list is passed as subset= to fast-plaid's MaxSim search — so MaxSim is always bounded to the SQLite-eligible candidates and the two engines stay in their own id spaces.
Enable. pip install 'pydocs-mcp[late-interaction]', set late_interaction.enabled: true in your YAML, then point --config at the shipped chunk_search_late_interaction.yaml preset.
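The SQLite + fast-plaid coupling described above follows a filter-then-score pattern, sketched here with an in-memory stand-in for the multi-vector engine. Only the chunk_multi_vector_ids table name and the chunk_id / plaid_doc_id columns come from the text; the chunks schema, the TOKEN_VECTORS dictionary, and every function name are hypothetical, and no real fast-plaid call is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE chunks (chunk_id INTEGER PRIMARY KEY, package TEXT, title TEXT);
    CREATE TABLE chunk_multi_vector_ids (
        chunk_id INTEGER PRIMARY KEY REFERENCES chunks(chunk_id),
        plaid_doc_id INTEGER UNIQUE NOT NULL
    );
    INSERT INTO chunks VALUES
        (10, 'requests', 'Session.send'),
        (11, 'requests', 'HTTPBasicAuth'),
        (12, 'numpy', 'ndarray.reshape');
    INSERT INTO chunk_multi_vector_ids VALUES (10, 0), (11, 1), (12, 2);
    """
)

# Stand-in for the multi-vector index: plaid_doc_id -> unit-length token vectors.
TOKEN_VECTORS = {0: [[1.0, 0.0]], 1: [[0.0, 1.0]], 2: [[1.0, 0.0], [0.0, 1.0]]}


def eligible_doc_ids(package: str) -> list[int]:
    """Push the metadata filter down to SQLite and translate to engine ids."""
    rows = conn.execute(
        "SELECT m.plaid_doc_id FROM chunks AS c "
        "JOIN chunk_multi_vector_ids AS m ON m.chunk_id = c.chunk_id "
        "WHERE c.package = ? ORDER BY m.plaid_doc_id",
        (package,),
    ).fetchall()
    return [doc_id for (doc_id,) in rows]


def maxsim(query: list[list[float]], doc: list[list[float]]) -> float:
    """MaxSim with dot products (equal to cosine for unit-length vectors)."""
    return sum(max(sum(q * d for q, d in zip(qv, dv)) for dv in doc) for qv in query)


def search(query: list[list[float]], subset: list[int], top_k: int = 5) -> list[int]:
    """Score only the documents in subset, best first."""
    scored = sorted(((maxsim(query, TOKEN_VECTORS[i]), i) for i in subset),
                    key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def to_chunk_ids(doc_ids: list[int]) -> list[int]:
    """Map engine ids back into SQLite's chunk_id space."""
    sql = "SELECT chunk_id FROM chunk_multi_vector_ids WHERE plaid_doc_id = ?"
    return [conn.execute(sql, (doc_id,)).fetchone()[0] for doc_id in doc_ids]


hits = to_chunk_ids(search([[1.0, 0.0]], subset=eligible_doc_ids("requests")))
```

The numpy chunk (id 12) would match the query just as well as chunk 10, but it never reaches the scorer because the SQLite filter removed it first.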

Hybrid fusion

Reciprocal Rank Fusion (RRF), from Cormack, Clarke & Buettcher, SIGIR 2009. Rank-only: each document scores the sum of 1 / (k + rank) over the ranked lists it appears in, with k=60 by default. The workhorse for combining BM25 + dense, or BM25 + late-interaction.
Weighted Score Interpolation (WSI) — score-space α · score_a + (1 − α) · score_b with min-max normalization, for cases where the score distributions are well-calibrated and rank isn't enough. α is tunable from YAML.
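Both fusion rules are short enough to write out. The sketch below implements the two formulas as stated, on toy inputs; function names are hypothetical, and it assumes that scores passed to wsi are "higher is better" (SQLite's bm25() would need its sign flipped first).

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every list a doc appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant or empty input maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def wsi(scores_a: dict[str, float], scores_b: dict[str, float],
        alpha: float = 0.5) -> list[tuple[str, float]]:
    """Weighted Score Interpolation: alpha * a + (1 - alpha) * b after min-max."""
    a, b = min_max(scores_a), min_max(scores_b)
    fused = {doc_id: alpha * a.get(doc_id, 0.0) + (1 - alpha) * b.get(doc_id, 0.0)
             for doc_id in set(a) | set(b)}
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


bm25_ranking = ["a", "b", "c"]
dense_ranking = ["b", "c", "a"]
fused_by_rank = [doc_id for doc_id, _ in rrf([bm25_ranking, dense_ranking])]
```

RRF needs only the order of results, which is why it is a safe default when the two retrievers produce scores on unrelated scales; WSI uses the score magnitudes, so it depends on the normalization being meaningful.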

LLM tree reasoning — opt-in

A vectorless mode for broad, structural questions ("walk me through the request lifecycle"). Instead of embedding text, an LLM walks the code map — module / class titles plus short summaries — and picks the best spots itself. Inspired by PageIndex (VectifyAI)'s reasoning-over-tree-of-contents approach.

Three shipped presets under python/pydocs_mcp/pipelines/: tree_only.yaml, chunk_search_with_tree_reasoning_parallel.yaml (run alongside chunk search, fuse via WSI), and chunk_search_with_tree_reasoning_after.yaml (use chunk search as the candidate pool, let the LLM re-rank). Provider / model / temperature / max_tokens are tuned under the llm: section of YAML; any OpenAI-compatible endpoint works.
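The tree walk can be sketched without any model at all by making the "pick the best spots" decision a pluggable function. In the sketch below, Node, walk, and keyword_chooser are hypothetical names, and the keyword-overlap chooser is a deliberately crude stand-in for the LLM call, used only so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    title: str
    summary: str
    children: list["Node"] = field(default_factory=list)


# A chooser sees the question and a node's children, and returns child indexes to descend into.
Chooser = Callable[[str, list[Node]], list[int]]


def walk(question: str, node: Node, choose: Chooser,
         depth: int = 0, max_depth: int = 4) -> list[Node]:
    """Descend from node, following only the children the chooser selects."""
    if not node.children or depth >= max_depth:
        return [node]
    results: list[Node] = []
    for index in choose(question, node.children):
        results.extend(walk(question, node.children[index], choose, depth + 1, max_depth))
    return results


def keyword_chooser(question: str, children: list[Node]) -> list[int]:
    """Stand-in for an LLM: pick children sharing any word with the question."""
    words = set(question.lower().split())
    return [i for i, child in enumerate(children)
            if words & set(f"{child.title} {child.summary}".lower().split())]


root = Node("project", "top-level package", [
    Node("sessions", "request lifecycle and connection pooling",
         [Node("Session.send", "dispatch a prepared request")]),
    Node("auth", "authentication handlers",
         [Node("HTTPBasicAuth", "basic credentials")]),
])
picked = [n.title for n in walk("request lifecycle", root, keyword_chooser)]
```

Swapping keyword_chooser for a function that prompts a model with the child titles and summaries gives the vectorless behaviour described above, with the tree bounding how much text the model ever sees.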

Code reference graph

Beyond embeddings, pydocs-mcp captures a graph of how code references code during indexing: CALLS, IMPORTS, INHERITS, and optional MENTIONS (backtick-quoted dotted names in markdown). The same surface answers an AI's "what calls this?" / "what does this extend?" questions through the lookup(show=…) MCP tool:

pydocs-mcp lookup requests.auth.HTTPBasicAuth --show inherits
pydocs-mcp lookup my_module.Parser.parse --show callers

Capture is on by default and tunable under reference_graph: in YAML (toggle, kinds-to-emit, output bounds).
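To show what CALLS and INHERITS edges look like, here is a small sketch that extracts them from source text with the standard ast module. It records unresolved names only and is not pydocs-mcp's extractor; extract_edges, callers, and inherits are hypothetical helpers.

```python
import ast

SOURCE = '''
class Base:
    pass

class Parser(Base):
    def parse(self, text):
        return tokenize(text)

def tokenize(text):
    return text.split()

def main():
    return Parser().parse("a b")
'''


def extract_edges(source: str) -> list[tuple[str, str, str]]:
    """Return (kind, source_name, target_name) edges; targets are bare names."""
    edges: list[tuple[str, str, str]] = []

    class Visitor(ast.NodeVisitor):
        def __init__(self) -> None:
            self.scope: list[str] = []  # enclosing class/function names

        def visit_ClassDef(self, node: ast.ClassDef) -> None:
            for base in node.bases:
                if isinstance(base, ast.Name):
                    edges.append(("INHERITS", node.name, base.id))
            self.scope.append(node.name)
            self.generic_visit(node)
            self.scope.pop()

        def visit_FunctionDef(self, node: ast.FunctionDef) -> None:
            self.scope.append(node.name)
            self.generic_visit(node)
            self.scope.pop()

        def visit_Call(self, node: ast.Call) -> None:
            func = node.func
            callee = (func.id if isinstance(func, ast.Name)
                      else func.attr if isinstance(func, ast.Attribute) else None)
            if callee and self.scope:
                edges.append(("CALLS", ".".join(self.scope), callee))
            self.generic_visit(node)

    Visitor().visit(ast.parse(source))
    return edges


def callers(edges: list[tuple[str, str, str]], name: str) -> list[str]:
    return [src for kind, src, dst in edges if kind == "CALLS" and dst == name]


def inherits(edges: list[tuple[str, str, str]], name: str) -> list[str]:
    return [dst for kind, src, dst in edges if kind == "INHERITS" and src == name]


EDGES = extract_edges(SOURCE)
```

Once edges are stored as rows like these, "what calls this?" becomes a lookup over the target column and "what does it call?" a lookup over the source column.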

Learn more

DOCUMENTATION.md — how it works in depth: retrieval pipeline, reference graph, cache, configuration, database schema, and the full CLI reference.
EXTENSIONS.md — extend it: new vector-store backends, pipeline steps, and fusion strategies.
benchmarks/README.md — the evaluation harness.
INSTALL.md — installation & troubleshooting.
CLAUDE.md — architecture & contributor guide.

Sources & references

Benchmarks

RepoQA — Evaluating Long Context Code Understanding · arXiv:2406.06025 (2024)
DS-1000 — A Natural and Reliable Benchmark for Data Science Code Generation · arXiv:2211.11501 (2023)
CodeRAG-Bench — Can Retrieval Augment Code Generation? · arXiv:2406.14497 (2024)

Vectors & retrieval

TurboQuant — Online Vector Quantization with Near-optimal Distortion Rate · arXiv:2504.19874 (Google Research, 2025); implemented by turbovec
FAISS — the similarity-search library used as the speed/storage baseline above
FastEmbed with BAAI/bge-small-en-v1.5 — the default on-device embedder for the single-vector dense mode
PyLate with lightonai/LateOn-Code — the default model for the opt-in late-interaction (multi-vector / MaxSim) mode · PyLate: Flexible Training and Retrieval for Late Interaction Models · arXiv:2508.03555 (LightOn, 2025)
ColBERT — Efficient and Effective Passage Search via Contextualized Late Interaction over BERT · arXiv:2004.12832 (Khattab & Zaharia, SIGIR 2020) — the late-interaction architecture
PLAID — An Efficient Engine for Late Interaction Retrieval · arXiv:2205.09707 (Santhanam et al., CIKM 2022) — implemented by fast-plaid, the engine pydocs-mcp uses for MaxSim scoring
Reciprocal Rank Fusion — Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods · Cormack, Clarke & Buettcher, SIGIR 2009 — the rank-fusion baseline (k=60)
PageIndex — inspiration for the LLM tree-reasoning mode

Protocol & comparable tools

Model Context Protocol — the MCP standard
Context7 · Neuledge Context

License: MIT.

Project details

Maintainers

msobroza

This version

0.3.0

Jun 10, 2026

Download files

Source Distribution

pydocs_mcp-0.3.0.tar.gz (7.3 MB)

Uploaded Jun 10, 2026 · Source

Built Distributions

pydocs_mcp-0.3.0-cp311-abi3-win_amd64.whl (1.2 MB)

Uploaded Jun 10, 2026 · CPython 3.11+, Windows x86-64

pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.4 MB)

Uploaded Jun 10, 2026 · CPython 3.11+, manylinux: glibc 2.17+ x86-64

pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.4 MB)

Uploaded Jun 10, 2026 · CPython 3.11+, manylinux: glibc 2.17+ ARM64

pydocs_mcp-0.3.0-cp311-abi3-macosx_11_0_arm64.whl (1.3 MB)

Uploaded Jun 10, 2026 · CPython 3.11+, macOS 11.0+ ARM64

File details

Details for the file pydocs_mcp-0.3.0.tar.gz.

File metadata

Download URL: pydocs_mcp-0.3.0.tar.gz
Upload date: Jun 10, 2026
Size: 7.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pydocs_mcp-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`6b7e54757829118762b9be8dec1ab5b10151344c4635e4aaa471a3234320dd7b`
MD5	`885ae9952575cc33d129a1bf037226b5`
BLAKE2b-256	`011812388d4e97c618559aab7fe02869083978c55297222cab7377a9c7f1c1de`


File details

Details for the file pydocs_mcp-0.3.0-cp311-abi3-win_amd64.whl.

File metadata

Download URL: pydocs_mcp-0.3.0-cp311-abi3-win_amd64.whl
Upload date: Jun 10, 2026
Size: 1.2 MB
Tags: CPython 3.11+, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pydocs_mcp-0.3.0-cp311-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`8eeb541fdf765494c0839238d7640c551ecdcee0c22c629a06b9d5f8623f2a4f`
MD5	`a761ece2df5af63237e152b2a0bd965f`
BLAKE2b-256	`7573e4acd45ac2e87b59cb5d3d50ff8a351fe8290c97a9f8d9fd25db134ec3b6`


File details

Details for the file pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Jun 10, 2026
Size: 1.4 MB
Tags: CPython 3.11+, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`62f57c3356da50c42711fd322d69f96677f813ccf9da55f7347d69337f70506b`
MD5	`4502dfe817f6a583966d5205651b8324`
BLAKE2b-256	`d99ddb863526a392a5965a0df997cef2d385c9fdf8b02af93a49e90204b1c27f`


File details

Details for the file pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

Download URL: pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Upload date: Jun 10, 2026
Size: 1.4 MB
Tags: CPython 3.11+, manylinux: glibc 2.17+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pydocs_mcp-0.3.0-cp311-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm	Hash digest
SHA256	`045d3820bb061c44464df566e57149f3832e71ff36ad11cc16a0c6f301cd5df1`
MD5	`9b082b6a4fb4b5903f89518701fcf7f1`
BLAKE2b-256	`4a0f3bf502ecb25ce59acf217d0f985132e327b5448cb056292c5cbd35fd1ba5`


File details

Details for the file pydocs_mcp-0.3.0-cp311-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: pydocs_mcp-0.3.0-cp311-abi3-macosx_11_0_arm64.whl
Upload date: Jun 10, 2026
Size: 1.3 MB
Tags: CPython 3.11+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19 {"installer":{"name":"uv","version":"0.11.19","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for pydocs_mcp-0.3.0-cp311-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`8bcc301a60768e4edec3f2383ce1bb5b32b48b4dccad4645aeb548b762ac36c0`
MD5	`e48f80aa940e1b1413a50eceb069e1a4`
BLAKE2b-256	`1ca89c650bbcc1574a62ac1ebc5c22bb15e02cc3f4e30166c419fda806b5101e`
