Generic markdown vault MCP server with FTS5 + semantic search

markdown-vault-mcp

A generic MCP server for Markdown collections, with FTS5 full-text search, semantic vector search, frontmatter-aware indexing, and incremental reindexing.

Point it at a directory of Markdown files (an Obsidian vault, a docs folder, a Zettelkasten) and it exposes search, read, write, and edit tools over the Model Context Protocol.

Features

  • Full-text search — SQLite FTS5 with BM25 scoring, porter stemming
  • Semantic search — cosine similarity over embedding vectors (Ollama, OpenAI, or Sentence Transformers)
  • Hybrid search — Reciprocal Rank Fusion combining FTS5 and vector results
  • Frontmatter-aware — indexes YAML frontmatter fields, supports required field enforcement
  • Incremental reindexing — hash-based change detection, only re-processes modified files
  • Write operations — create, edit, delete, rename documents with automatic index updates
  • Git integration — optional auto-commit and push on every write via GIT_ASKPASS
  • MCP tools — 13 tools including search, read, write, edit, delete, rename, and admin operations
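
Hybrid search merges the FTS5 and vector result lists with Reciprocal Rank Fusion. The following is a minimal, self-contained sketch of that technique, not this package's implementation: the function name, the constant k=60 (a commonly used default for RRF), and the sample paths are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank starts
    at 1); the scores are summed across lists. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
fts_hits = ["notes/a.md", "notes/b.md", "notes/c.md"]
vector_hits = ["notes/b.md", "notes/d.md", "notes/a.md"]

fused = reciprocal_rank_fusion([fts_hits, vector_hits])
print(fused)
```

Because only ranks are used, the BM25 scores and cosine similarities never need to be put on a common scale, which is the usual reason for choosing RRF over score averaging.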

Installation

From PyPI

pip install markdown-vault-mcp

With optional dependencies:

pip install "markdown-vault-mcp[mcp]"            # FastMCP server
pip install "markdown-vault-mcp[embeddings-api]" # Ollama/OpenAI embeddings via HTTP
pip install "markdown-vault-mcp[all]"            # MCP + API embeddings (lightweight)
pip install "markdown-vault-mcp[all-local]"      # + sentence-transformers (large, GPU)

From source

git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
pip install -e ".[all,dev]"

Docker

The Docker image uses the [all] extra (MCP + API embeddings) and does not include sentence-transformers or PyTorch, keeping it lightweight. Use Ollama or OpenAI for embeddings.

docker pull ghcr.io/pvliesdonk/markdown-vault-mcp:latest

Quick Start

As a library

from pathlib import Path
from markdown_vault_mcp import Collection

collection = Collection(source_dir=Path("/path/to/vault"))
results = collection.search("query text", limit=10)

As an MCP server

Set the required environment variable and start the server:

export MARKDOWN_VAULT_MCP_SOURCE_DIR=/path/to/vault
markdown-vault-mcp serve

With Docker Compose

  1. Copy an example env file:

    cp examples/obsidian-readonly.env .env
    
  2. Edit .env to set MARKDOWN_VAULT_MCP_SOURCE_DIR to the absolute path of your vault on the host.

  3. Start the service:

    docker compose up -d
    
  4. Check the logs:

    docker compose logs -f markdown-vault-mcp
    

Example env files

| File | Description |
|------|-------------|
| examples/obsidian-readonly.env | Obsidian vault, read-only, Ollama embeddings |
| examples/obsidian-readwrite.env | Obsidian vault, read-write with git auto-commit |
| examples/ifcraftcorpus.env | Strict frontmatter enforcement, read-only corpus |

For reverse proxy (Traefik) and authentication (mcp-auth-proxy) setup, see docs/deployment.md.

Configuration

All configuration is via environment variables with the MARKDOWN_VAULT_MCP_ prefix.

| Variable | Default | Required | Description |
|----------|---------|----------|-------------|
| MARKDOWN_VAULT_MCP_SERVER_NAME | markdown-vault-mcp | No | MCP server name shown to clients (useful for multi-instance setups) |
| MARKDOWN_VAULT_MCP_INSTRUCTIONS | generic description | No | System-level instructions injected into LLM context |
| MARKDOWN_VAULT_MCP_SOURCE_DIR | (none) | Yes | Path to the markdown vault directory |
| MARKDOWN_VAULT_MCP_READ_ONLY | true | No | Set to false to enable write operations |
| MARKDOWN_VAULT_MCP_INDEX_PATH | in-memory | No | Path to the SQLite FTS5 index file (set for persistence across restarts) |
| MARKDOWN_VAULT_MCP_EMBEDDINGS_PATH | disabled | No | Path to the numpy embeddings file (required to enable semantic search) |
| MARKDOWN_VAULT_MCP_STATE_PATH | {SOURCE_DIR}/.markdown_vault_mcp/state.json | No | Path to the change-tracking state file |
| MARKDOWN_VAULT_MCP_INDEXED_FIELDS | (none) | No | Comma-separated frontmatter fields to index in FTS5 |
| MARKDOWN_VAULT_MCP_REQUIRED_FIELDS | (none) | No | Comma-separated frontmatter fields required on every document |
| MARKDOWN_VAULT_MCP_EXCLUDE | (none) | No | Comma-separated glob patterns to exclude (e.g. .obsidian/**,.trash/**) |
| MARKDOWN_VAULT_MCP_GIT_TOKEN | (none) | No | GitHub PAT for auto-commit and push on writes (via GIT_ASKPASS) |
| MARKDOWN_VAULT_MCP_GIT_PUSH_DELAY_S | 30 | No | Seconds of idle before pushing (0 = push only on shutdown) |
| MARKDOWN_VAULT_MCP_OLLAMA_MODEL | nomic-embed-text | No | Ollama embedding model name |
| MARKDOWN_VAULT_MCP_OLLAMA_CPU_ONLY | false | No | Force Ollama to use CPU only |
| EMBEDDING_PROVIDER | auto-detect | No | Embedding provider: ollama, openai, or sentence-transformers (not prefixed) |
| OLLAMA_HOST | http://localhost:11434 | No | Ollama server URL (not prefixed) |
| OPENAI_API_KEY | (none) | No | OpenAI API key for OpenAI embedding provider (not prefixed) |
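
To make the table concrete, here is a rough sketch of how a handful of these variables could be read and defaulted in Python. It is illustrative only: VaultSettings, load_settings, and the parsing rules (for example, treating any READ_ONLY value other than "false" as read-only) are assumptions, not the package's actual loader. Only the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Tuple

PREFIX = "MARKDOWN_VAULT_MCP_"


def _split_csv(value: str) -> Tuple[str, ...]:
    """Split a comma-separated value, dropping blanks and stray whitespace."""
    return tuple(item.strip() for item in value.split(",") if item.strip())


@dataclass(frozen=True)
class VaultSettings:  # hypothetical container, not part of the package
    source_dir: Path
    read_only: bool
    index_path: Optional[Path]  # None stands for the in-memory default
    exclude: Tuple[str, ...]
    git_push_delay_s: float


def load_settings(env: Mapping[str, str] = os.environ) -> VaultSettings:
    source = env.get(PREFIX + "SOURCE_DIR")
    if not source:
        raise ValueError(PREFIX + "SOURCE_DIR is required")
    index = env.get(PREFIX + "INDEX_PATH")
    return VaultSettings(
        source_dir=Path(source),
        # Assumption: only an explicit "false" enables writes.
        read_only=env.get(PREFIX + "READ_ONLY", "true").strip().lower() != "false",
        index_path=Path(index) if index else None,
        exclude=_split_csv(env.get(PREFIX + "EXCLUDE", "")),
        git_push_delay_s=float(env.get(PREFIX + "GIT_PUSH_DELAY_S", "30")),
    )


# Usage with an explicit mapping, so the real environment is left untouched.
settings = load_settings({
    "MARKDOWN_VAULT_MCP_SOURCE_DIR": "/path/to/vault",
    "MARKDOWN_VAULT_MCP_EXCLUDE": ".obsidian/**,.trash/**",
})
print(settings)
```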

MCP Tools

| Tool | Description |
|------|-------------|
| search | Hybrid full-text + semantic search with optional frontmatter filters |
| read | Read a document's content by relative path |
| write | Create or overwrite a document (with optional frontmatter) |
| edit | Replace a unique text span in a document |
| delete | Delete a document and its index entries |
| rename | Rename/move a document, updating all index entries |
| list_documents | List all indexed document paths (with optional folder and glob pattern filter) |
| list_folders | List all folder paths in the vault |
| list_tags | List all unique frontmatter tag values |
| reindex | Force a full reindex of the vault |
| stats | Get collection statistics (document count, chunk count, etc.) |
| build_embeddings | Build or rebuild vector embeddings for semantic search |
| embeddings_status | Check embedding provider and index status |
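
The reindex tool forces a full pass. The incremental path listed under Features instead relies on hash-based change detection. The sketch below shows the general idea under stated assumptions: SHA-256 as the hash, a plain path-to-digest mapping as the stored state, and the helper names scan and diff_state are all hypothetical, since the format of the real state file is not described here.

```python
import hashlib
import tempfile
from pathlib import Path


def scan(source_dir: Path) -> dict:
    """Map each Markdown file's relative path to a SHA-256 digest of its bytes."""
    return {
        path.relative_to(source_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(source_dir.rglob("*.md"))
        if path.is_file()
    }


def diff_state(previous: dict, current: dict):
    """Return (added, modified, removed) paths between two scans."""
    added = sorted(current.keys() - previous.keys())
    removed = sorted(previous.keys() - current.keys())
    modified = sorted(
        path for path in current.keys() & previous.keys()
        if current[path] != previous[path]
    )
    return added, modified, removed


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "a.md").write_text("alpha", encoding="utf-8")
    (vault / "b.md").write_text("beta", encoding="utf-8")
    before = scan(vault)

    (vault / "a.md").write_text("alpha, edited", encoding="utf-8")
    (vault / "b.md").unlink()
    (vault / "c.md").write_text("gamma", encoding="utf-8")
    after = scan(vault)

added, modified, removed = diff_state(before, after)
print(added, modified, removed)
```

Only the added and modified paths would need re-chunking and re-embedding; removed paths only need their index entries dropped.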

Write tools (write, edit, delete, rename) are only available when MARKDOWN_VAULT_MCP_READ_ONLY=false.
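
The edit tool is described as replacing a unique text span. One plausible reading of that contract, sketched here with a hypothetical helper (replace_unique is not the package's API), is that the edit is rejected when the target text is missing or matches more than once:

```python
def replace_unique(text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if `old` occurs exactly once in `text`."""
    if not old:
        raise ValueError("old text must not be empty")
    first = text.find(old)
    if first == -1:
        raise ValueError("old text not found")
    # Searching again from first + 1 also catches overlapping matches.
    if text.find(old, first + 1) != -1:
        raise ValueError("old text matches more than once; add surrounding context")
    return text[:first] + new + text[first + len(old):]


doc = "# Title\n\nstatus: draft\n"
updated = replace_unique(doc, "status: draft", "status: final")
print(updated)
```

Requiring uniqueness means a caller that supplies too little context gets an error instead of a silent edit in the wrong place.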

Development

git clone https://github.com/pvliesdonk/markdown-vault-mcp.git
cd markdown-vault-mcp
uv pip install -e ".[all,dev]"

# Run tests
uv run python -m pytest tests/ -x -q

# Lint and format
ruff check src/ tests/
ruff format src/ tests/

# Type check
mypy src/

License

MIT
