Generic markdown vault MCP server with FTS5 + semantic search

markdown-vault-mcp

A generic markdown collection MCP server with FTS5 full-text search, semantic vector search, frontmatter-aware indexing, and incremental reindexing.

Point it at a directory of Markdown files (an Obsidian vault, a docs folder, a Zettelkasten) and it exposes search, read, write, and edit tools over the Model Context Protocol.

Features

  • Full-text search — SQLite FTS5 with BM25 scoring, porter stemming
  • Semantic search — cosine similarity over embedding vectors (Ollama, OpenAI, or Sentence Transformers)
  • Hybrid search — Reciprocal Rank Fusion combining FTS5 and vector results
  • Frontmatter-aware — indexes YAML frontmatter fields, supports required field enforcement
  • Incremental reindexing — hash-based change detection, only re-processes modified files
  • Write operations — create, edit, delete, rename documents with automatic index updates
  • Git integration — optional auto-commit and push on every write via GIT_ASKPASS
  • MCP tools — 13 tools including search, read, write, edit, delete, rename, and admin operations
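
To make the hybrid search feature concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of document paths. It is an illustration of the general technique, not this project's implementation: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample paths are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used constant and an
    assumption here, not a value taken from this project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a full-text query and a vector query.
fts_hits = ["notes/a.md", "notes/b.md", "notes/c.md"]
vector_hits = ["notes/b.md", "notes/d.md", "notes/a.md"]

fused = reciprocal_rank_fusion([fts_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why RRF needs no score normalization between BM25 and cosine similarity.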

Installation

From PyPI

pip install markdown-vault-mcp

With optional dependencies:

# Quote the extras so shells such as zsh do not expand the brackets.
pip install "markdown-vault-mcp[mcp]"            # FastMCP server
pip install "markdown-vault-mcp[embeddings-api]" # Ollama/OpenAI embeddings via HTTP
pip install "markdown-vault-mcp[all]"            # MCP + API embeddings (lightweight)
pip install "markdown-vault-mcp[all-local]"      # + sentence-transformers (large, GPU)

From source

git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
pip install -e ".[all,dev]"

Docker

The Docker image uses the [all] extra (MCP + API embeddings) and does not include sentence-transformers or PyTorch, keeping it lightweight. Use Ollama or OpenAI for embeddings.

docker pull ghcr.io/pvliesdonk/markdown-vault-mcp:latest

Quick Start

As a library

from pathlib import Path
from markdown_vault_mcp import Collection

collection = Collection(source_dir=Path("/path/to/vault"))
results = collection.search("query text", limit=10)

As an MCP server

Set the required environment variable and start the server:

export MARKDOWN_VAULT_MCP_SOURCE_DIR=/path/to/vault
markdown-vault-mcp serve

With Docker Compose

  1. Copy an example env file:

    cp examples/obsidian-readonly.env .env
    
  2. Edit .env to set MARKDOWN_VAULT_MCP_SOURCE_DIR to the absolute path of your vault on the host.

  3. Start the service:

    docker compose up -d
    
  4. Check the logs:

    docker compose logs -f markdown-vault-mcp
    

Example env files

| File | Description |
| --- | --- |
| examples/obsidian-readonly.env | Obsidian vault, read-only, Ollama embeddings |
| examples/obsidian-readwrite.env | Obsidian vault, read-write with git auto-commit |
| examples/ifcraftcorpus.env | Strict frontmatter enforcement, read-only corpus |

For reverse proxy (Traefik) and authentication (mcp-auth-proxy) setup, see docs/deployment.md.

Configuration

All configuration is via environment variables. Most use the MARKDOWN_VAULT_MCP_ prefix; EMBEDDING_PROVIDER, OLLAMA_HOST, and OPENAI_API_KEY are not prefixed.

| Variable | Default | Required | Description |
| --- | --- | --- | --- |
| MARKDOWN_VAULT_MCP_SERVER_NAME | markdown-vault-mcp | No | MCP server name shown to clients (useful for multi-instance setups) |
| MARKDOWN_VAULT_MCP_INSTRUCTIONS | generic description | No | System-level instructions injected into LLM context |
| MARKDOWN_VAULT_MCP_SOURCE_DIR | | Yes | Path to the markdown vault directory |
| MARKDOWN_VAULT_MCP_READ_ONLY | true | No | Set to false to enable write operations |
| MARKDOWN_VAULT_MCP_INDEX_PATH | in-memory | No | Path to the SQLite FTS5 index file (set for persistence across restarts) |
| MARKDOWN_VAULT_MCP_EMBEDDINGS_PATH | disabled | No | Path to the numpy embeddings file (required to enable semantic search) |
| MARKDOWN_VAULT_MCP_STATE_PATH | {SOURCE_DIR}/.markdown_vault_mcp/state.json | No | Path to the change-tracking state file |
| MARKDOWN_VAULT_MCP_INDEXED_FIELDS | | No | Comma-separated frontmatter fields to index in FTS5 |
| MARKDOWN_VAULT_MCP_REQUIRED_FIELDS | | No | Comma-separated frontmatter fields required on every document |
| MARKDOWN_VAULT_MCP_EXCLUDE | | No | Comma-separated glob patterns to exclude (e.g. .obsidian/**,.trash/**) |
| MARKDOWN_VAULT_MCP_GIT_TOKEN | | No | GitHub PAT for auto-commit and push on writes (via GIT_ASKPASS) |
| MARKDOWN_VAULT_MCP_GIT_PUSH_DELAY_S | 30 | No | Seconds of idle before pushing (0 = push only on shutdown) |
| MARKDOWN_VAULT_MCP_OLLAMA_MODEL | nomic-embed-text | No | Ollama embedding model name |
| MARKDOWN_VAULT_MCP_OLLAMA_CPU_ONLY | false | No | Force Ollama to use CPU only |
| EMBEDDING_PROVIDER | auto-detect | No | Embedding provider: ollama, openai, or sentence-transformers (not prefixed) |
| OLLAMA_HOST | http://localhost:11434 | No | Ollama server URL (not prefixed) |
| OPENAI_API_KEY | | No | OpenAI API key for OpenAI embedding provider (not prefixed) |
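
The state file at MARKDOWN_VAULT_MCP_STATE_PATH supports the hash-based change detection listed under Features. The sketch below shows one way such detection could work. It assumes the state is a JSON object mapping relative paths to SHA-256 digests; the real state format is not documented here, and the helper names `file_digest` and `detect_changes` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def detect_changes(source_dir: Path, state_path: Path):
    """Compare current Markdown files against a saved state file.

    Returns (changed, deleted, new_state): paths that are new or whose
    digest differs, paths present in the old state but gone from disk,
    and the mapping to persist for the next run.
    Assumption: the state file is a JSON object of {relative_path: digest}.
    """
    old_state = json.loads(state_path.read_text()) if state_path.exists() else {}
    new_state = {
        p.relative_to(source_dir).as_posix(): file_digest(p)
        for p in sorted(source_dir.rglob("*.md"))
    }
    changed = [rel for rel, digest in new_state.items() if old_state.get(rel) != digest]
    deleted = [rel for rel in old_state if rel not in new_state]
    return changed, deleted, new_state
```

Only the paths in `changed` would need re-chunking and re-indexing, and those in `deleted` would be dropped from the index, so an unchanged vault costs one hashing pass and nothing more.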

MCP Tools

| Tool | Description |
| --- | --- |
| search | Hybrid full-text + semantic search with optional frontmatter filters |
| read | Read a document's content by relative path |
| write | Create or overwrite a document (with optional frontmatter) |
| edit | Replace a unique text span in a document |
| delete | Delete a document and its index entries |
| rename | Rename/move a document, updating all index entries |
| list_documents | List all indexed document paths (with optional folder and glob pattern filter) |
| list_folders | List all folder paths in the vault |
| list_tags | List all unique frontmatter tag values |
| reindex | Force a full reindex of the vault |
| stats | Get collection statistics (document count, chunk count, etc.) |
| build_embeddings | Build or rebuild vector embeddings for semantic search |
| embeddings_status | Check embedding provider and index status |

Write tools (write, edit, delete, rename) are only available when MARKDOWN_VAULT_MCP_READ_ONLY=false.
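
The edit tool is described as replacing a unique text span. As a rough illustration of that contract, the sketch below rejects an edit when the target text is missing or occurs more than once. The function `replace_unique` is hypothetical and is not the server's actual code; the real tool's error behavior may differ.

```python
def replace_unique(text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once in `text`."""
    if not old:
        raise ValueError("old text must not be empty")
    first = text.find(old)
    if first == -1:
        raise ValueError("old text not found")
    # Search again from the next character so overlapping matches count too.
    if text.find(old, first + 1) != -1:
        raise ValueError("old text is ambiguous: it occurs more than once")
    return text[:first] + new + text[first + len(old):]


doc = "# Title\n\nStatus: draft\n"
updated = replace_unique(doc, "Status: draft", "Status: final")
print(updated)
```

Requiring a unique match means a caller has to supply enough surrounding context to pin down the span, which avoids silently editing the wrong occurrence.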

Development

git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
uv pip install -e ".[all,dev]"

# Run tests
uv run python -m pytest tests/ -x -q

# Lint and format
ruff check src/ tests/
ruff format src/ tests/

# Type check
mypy src/

License

MIT
