
Markdown-FastRAG-MCP


A semantic search engine for markdown documents. An MCP server with non-blocking background indexing, multi-provider embeddings (Gemini, OpenAI, Vertex AI, Voyage), and Milvus / Zilliz Cloud vector storage — designed for multi-agent concurrent access.

This project is a fork of Zackriya-Solutions/MCP-Markdown-RAG, heavily extended for production multi-agent use. The original project is licensed under Apache 2.0.

Ask "what are the tradeoffs of microservices?" and find your notes about service boundaries, distributed systems, and API design — even if none of them mention "microservices."

```mermaid
graph LR
    A["Claude Code"] --> M["Milvus Standalone<br/>(Docker)"]
    B["Codex"] --> M
    C["Copilot"] --> M
    D["Antigravity"] --> M
    M --> V["Shared Document Index"]
```

Quick Start

```bash
pip install markdown-fastrag-mcp
```

Add to your MCP host config:

```json
{
  "mcpServers": {
    "markdown-rag": {
      "command": "uvx",
      "args": ["markdown-fastrag-mcp"],
      "env": {
        "EMBEDDING_PROVIDER": "gemini",
        "GEMINI_API_KEY": "${GEMINI_API_KEY}",
        "MILVUS_ADDRESS": "http://localhost:19530"
      }
    }
  }
}
```

Tip: Omit `MILVUS_ADDRESS` for local-only use (defaults to SQLite-based Milvus Lite).

Features

  • Semantic matching — finds conceptually related content, not just keyword hits
  • Multi-provider embeddings — Gemini, OpenAI, Vertex AI, Voyage, or local models
  • Async background indexing — non-blocking `index_documents` returns instantly with a `job_id`; poll with `get_index_status`
  • Event-loop-safe threading — all sync I/O runs in worker threads via `asyncio.to_thread`
  • Smart incremental indexing — mtime/size fast path skips unchanged files without reading them
  • 3-way delta scan — classifies files as new/modified/deleted in one walk; new files skip the Milvus delete
  • Smart chunk merging — small chunks below `MIN_CHUNK_TOKENS` are merged with siblings; parent header context is injected
  • Empty chunk filtering — frontmatter-only and structural-only chunks (headers/separators with no prose) are dropped at indexing time and filtered at search time
  • Short chunk drop — final chunks below `MIN_FINAL_TOKENS` (default 150) are dropped, with per-chunk stderr logging
  • Reconciliation sweep — after each index run, queries all Milvus paths and deletes orphan vectors whose source files no longer exist on disk
  • Search dedup — per-file result limiting prevents a single document from dominating results
  • Scoped search & pruning — `scope_path` filters results to subdirectories; pruning never wipes unrelated data
  • Batch embedding & insert — concurrent batches with 429 retry; chunked Milvus inserts stay under the gRPC 64 MB limit
  • Shell reindex CLI — `reindex.py` for large-scale indexing with real-time progress logs
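The incremental-indexing features above can be illustrated with a minimal sketch. This is not the server's actual code: the function name `delta_scan` and the `seen` map of `path -> (mtime, size)` signatures are assumptions made for illustration, but the single-walk three-way classification and the mtime/size fast path work like this:

```python
import os

def delta_scan(root: str, seen: dict[str, tuple[float, int]]):
    """Classify markdown files as new / modified / deleted in one walk.

    `seen` maps path -> (mtime, size) recorded by the previous run.
    Unchanged files take the mtime/size fast path: they are never
    opened or hashed.
    """
    new, modified = [], []
    on_disk = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".md"):
                continue
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            sig = (st.st_mtime, st.st_size)
            on_disk.add(path)
            if path not in seen:
                new.append(path)       # never indexed: no Milvus delete needed
            elif seen[path] != sig:
                modified.append(path)  # re-chunk, re-embed, delete old vectors
            # else: unchanged -> skipped entirely
    deleted = [p for p in seen if p not in on_disk]
    return new, modified, deleted
```

Note how files absent from `seen` go straight to `new`, which is what lets the indexer skip the Milvus delete for them.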

📚 Documentation

| Document | Description |
|---|---|
| Embedding Providers | All 6 providers: setup, auth, tuning, rate limiting |
| Milvus / Zilliz Setup | Lite vs Standalone vs Zilliz Cloud, Docker Compose, troubleshooting |
| Indexing Architecture | Non-blocking flow, `to_thread`, 3-way delta, reconciliation sweep |
| Optimization | Chunk merging, header injection, batch insert, search dedup |

Tools

| Tool | Description |
|---|---|
| `index_documents` | Start a background index job; returns `job_id` instantly |
| `get_index_status` | Poll job status (running / succeeded / failed) |
| `search_documents` | Semantic search with relevance scores and file paths |
| `clear_index` | Reset the vector database and tracking state |
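The non-blocking `index_documents` / `get_index_status` pair can be sketched like this. It is a simplified illustration, not the server's implementation: `_index_sync` stands in for the real blocking scan/chunk/embed/insert pipeline, and the in-memory `_jobs` registry is an assumption.

```python
import asyncio
import itertools

_jobs: dict[str, dict] = {}
_ids = itertools.count(1)

def _index_sync(path: str) -> int:
    # Placeholder for the blocking pipeline; returns chunks written.
    return 42

async def index_documents(path: str) -> dict:
    """Start indexing in a worker thread; return immediately with a job_id."""
    job_id = f"job-{next(_ids)}"
    _jobs[job_id] = {"status": "running"}

    async def _run():
        try:
            # asyncio.to_thread keeps the event loop free while sync I/O runs.
            chunks = await asyncio.to_thread(_index_sync, path)
            _jobs[job_id] = {"status": "succeeded", "chunks": chunks}
        except Exception as e:
            _jobs[job_id] = {"status": "failed", "error": str(e)}

    _jobs[job_id]["task"] = asyncio.create_task(_run())
    return {"job_id": job_id}

async def get_index_status(job_id: str) -> dict:
    return _jobs.get(job_id, {"status": "unknown"})
```

A caller polls `get_index_status` with the returned `job_id` until the status leaves `running`.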

How It Works

```mermaid
flowchart LR
    A["📁 Markdown Files"] -->|"walk + filter"| B["🔍 Delta Scan<br/>mtime/size"]
    B -->|changed| C["✂️ Chunk + Merge"]
    B -->|unchanged| SKIP["⏭️ Skip"]
    B -->|deleted| PRUNE["🗑️ Prune"]
    C --> D["🧠 Embed"]
    D -->|"batch insert"| E["💾 Milvus"]

    F["🔎 Query"] --> D
    D -->|"k×5"| G["📊 Dedup + Top-K"]

    style A fill:#2d3748,color:#e2e8f0
    style D fill:#553c9a,color:#e9d8fd
    style E fill:#2a4365,color:#bee3f8
    style G fill:#22543d,color:#c6f6d5
    style PRUNE fill:#742a2a,color:#fed7d7
```
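The `k×5` → "Dedup + Top-K" step of the query path can be sketched as follows. This is illustrative only: the function name `dedup_top_k` and the `(score, path, text)` hit shape are assumptions, with `max_per_file` mirroring the `DEDUP_MAX_PER_FILE` setting.

```python
def dedup_top_k(hits, k: int, max_per_file: int = 1):
    """Given oversampled hits (roughly k x 5, sorted by score descending),
    cap results per source file so one document cannot dominate,
    then truncate to the top k."""
    per_file: dict[str, int] = {}
    out = []
    for score, path, text in hits:
        if max_per_file and per_file.get(path, 0) >= max_per_file:
            continue  # this file already hit its quota
        per_file[path] = per_file.get(path, 0) + 1
        out.append((score, path, text))
        if len(out) == k:
            break
    return out
```

With `max_per_file=0` the cap is disabled and the first `k` hits pass through unchanged.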

Configuration

Core

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_PROVIDER` | `local` | `gemini`, `openai`, `openai-compatible`, `vertex`, `voyage` |
| `EMBEDDING_DIM` | `768` | Vector dimension |
| `MILVUS_ADDRESS` | `.db/milvus_markdown.db` | Milvus address or local file path |
| `MARKDOWN_WORKSPACE` | (unset) | Lock the workspace root |

Indexing

| Variable | Default | Description |
|---|---|---|
| `MARKDOWN_CHUNK_SIZE` | `2048` | Token chunk size |
| `MARKDOWN_CHUNK_OVERLAP` | `100` | Token overlap between chunks |
| `MIN_CHUNK_TOKENS` | `300` | Small-chunk merge threshold |
| `MIN_FINAL_TOKENS` | `150` | Drop final chunks below this token count |
| `DEDUP_MAX_PER_FILE` | `1` | Max results per file (`0` = off) |
| `EMBEDDING_BATCH_SIZE` | `250` | Texts per API call |
| `EMBEDDING_CONCURRENT_BATCHES` | `4` | Parallel batches |
| `EMBEDDING_BATCH_DELAY_MS` | `0` | Delay (ms) between batch waves |
| `MILVUS_INSERT_BATCH` | `5000` | Rows per Milvus insert (gRPC 64 MB limit) |

Tip: Defaults work well for most vaults. Adjust `MIN_CHUNK_TOKENS` / `MIN_FINAL_TOKENS` if short notes are being dropped unexpectedly. Changes require a force reindex (`reindex.py --force`).
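To make the two thresholds concrete, here is a rough sketch of a merge-then-drop pass. It is illustrative only: the real indexer uses a proper tokenizer and injects parent header context, both of which this whitespace word-count approximation omits.

```python
def merge_chunks(chunks: list[str], min_tokens: int = 300, min_final: int = 150):
    """Greedily merge undersized chunks into their previous sibling
    (MIN_CHUNK_TOKENS), then drop any final chunk still below the hard
    floor (MIN_FINAL_TOKENS). Token counts approximated by word count."""
    tokens = lambda s: len(s.split())
    merged: list[str] = []
    for chunk in chunks:
        if merged and tokens(merged[-1]) < min_tokens:
            merged[-1] = merged[-1] + "\n\n" + chunk  # absorb into previous sibling
        else:
            merged.append(chunk)
    return [c for c in merged if tokens(c) >= min_final]
```

A lone short note falls through both stages and is dropped, which is why lowering `MIN_FINAL_TOKENS` is the fix when short notes vanish from the index.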

See Embedding Providers for full auth and tuning options.
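The batching settings above interact roughly as in this sketch, which embeds texts batch by batch and retries on rate limiting with exponential backoff. It is illustrative: the real server runs batches concurrently, and `embed_batch` / `RateLimitError` are stand-ins for a provider client call and its HTTP 429 error.

```python
import time

class RateLimitError(Exception):
    """Stand-in for a provider HTTP 429 response."""

def embed_all(texts, embed_batch, batch_size=250, max_retries=3, backoff=1.0):
    """Embed texts in fixed-size batches (EMBEDDING_BATCH_SIZE); retry a
    batch with exponential backoff when the provider rate-limits it."""
    vectors = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        for attempt in range(max_retries + 1):
            try:
                vectors.extend(embed_batch(batch))
                break
            except RateLimitError:
                if attempt == max_retries:
                    raise  # out of retries: surface the 429
                time.sleep(backoff * 2 ** attempt)
    return vectors
```

The same fixed-size batching idea applies on the write side, where rows are split into `MILVUS_INSERT_BATCH`-sized inserts to stay under the gRPC message limit.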

Performance

| Metric | Result |
|---|---|
| Unchanged files — hash computations | 0 (mtime/size fast path) |
| Changed file — embed + insert | ~3 seconds |
| No changes — full scan | instant |
| Full reindex (1300 files, 23K chunks) | ~7–8 minutes |

License

Apache 2.0 — see LICENSE for full text.

This project is a fork of MCP-Markdown-RAG by Zackriya Solutions. Original project is licensed under Apache 2.0; this fork maintains the same license.

Key additions over upstream:

  • Multi-provider embeddings (Gemini, Vertex AI, OpenAI, Voyage)
  • Milvus vector store replacing Qdrant
  • Non-blocking background indexing with `asyncio.to_thread`
  • 3-way delta scan (new/modified/deleted)
  • Smart chunk merging with parent header injection
  • Empty chunk filtering (frontmatter-only / structural-only drop)
  • Short chunk drop (final chunks below 150 tokens, with per-chunk logging)
  • Reconciliation sweep (Milvus↔disk orphan vector cleanup)
  • Scoped search & pruning, batch embedding, shell CLI
  • VS Code Copilot MCP compatibility (dummy params for zero-required-arg tools)
