mcp-fts5-starter

Drop-in MCP server template with SQLite FTS5 search backend. ~300 lines, no vector DB, no embedding API, runs on a Pi.


The problem

You want to expose a corpus of notes, docs, or clippings to Claude (or any MCP client) as a search tool. Most tutorials reach for a vector DB, an embedding API, and a 500 MB Docker image just to search a few thousand Markdown files. For a small-to-medium corpus running on a single machine, that's overkill.

mcp-fts5-starter is the boring, dependable option:

  • SQLite FTS5 for full-text search — built into Python's sqlite3, no service to run
  • MCP server scaffold with a few example tools (search, list, read)
  • One-file ingest script that walks a directory of markdown files, parses frontmatter, and indexes them
  • No embeddings, no vectors, no GPU — and no API bill

Drop the template into a new repo, point it at a folder, and you have a working MCP server in under 10 minutes.
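
To make the bullet points above concrete, here is a minimal sketch of the same idea using only the standard library: walk a folder of markdown files, read a flat frontmatter block, and index everything into an FTS5 table. It is not the template's actual code. The table name docs, the column layout, and the helper names parse_frontmatter, build_index, and search are assumptions made for illustration, and the frontmatter parser only understands simple key: value lines.

```python
import sqlite3
import tempfile
from pathlib import Path


def parse_frontmatter(text):
    """Split a markdown file into (metadata, body).

    Only handles flat `key: value` lines between two `---` lines;
    nested YAML is out of scope for this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing `---`: treat the whole file as body.
    return {}, text


def build_index(corpus_dir, conn):
    """Index every *.md file under corpus_dir; return the number indexed."""
    # Hypothetical schema: `path` is stored but not searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs "
        "USING fts5(path UNINDEXED, title, tags, body)"
    )
    count = 0
    for md in sorted(Path(corpus_dir).rglob("*.md")):
        meta, body = parse_frontmatter(md.read_text(encoding="utf-8"))
        rel = md.relative_to(corpus_dir).as_posix()
        conn.execute(
            "INSERT INTO docs(path, title, tags, body) VALUES (?, ?, ?, ?)",
            (rel, meta.get("title", md.stem), meta.get("tags", ""), body),
        )
        count += 1
    conn.commit()
    return count


def search(conn, query, limit=5):
    """Return (title, path) pairs, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT title, path FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        note = Path(tmp) / "hello.md"
        note.write_text("---\ntitle: Hello\n---\nFull text search in SQLite.\n",
                        encoding="utf-8")
        conn = sqlite3.connect(":memory:")
        print(build_index(tmp, conn), "doc(s) indexed")
        print(search(conn, "sqlite"))
```

The query string is passed to MATCH unchanged, so it follows FTS5 query syntax; user input containing characters such as hyphens or quotes would need escaping before it reaches search.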

When to use this (and when not to)

Use this if your corpus is:

  • Small-to-medium (up to ~100k documents)
  • Mostly text (markdown, code, prose) where keyword + tag matching is enough
  • Running on a single machine, Pi, or laptop
  • Something you want to set up once and forget

Don't use this if you need:

  • True semantic search across rephrased queries — pair this with embeddings, or use a different tool
  • Multi-tenant search across millions of docs — use a real search backend (Elastic, Meilisearch, Qdrant)
  • Memory decay / TTL on entries — see forget-rag (which also uses FTS5 but for a different purpose)
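
If you do pair keyword search with embeddings, one common way to merge the two result lists is reciprocal rank fusion. The sketch below is a generic illustration, not something this template is known to ship: the function name, the constant k=60 (a conventional default), and the example document IDs are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for stability.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a.md", "b.md", "c.md"]   # e.g. from an FTS5 query
    semantic_hits = ["b.md", "d.md", "a.md"]  # e.g. from an embedding index
    print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

Because it only looks at ranks, this approach does not require the BM25 scores and the embedding similarities to be on comparable scales.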

Sibling projects

Repo | Angle
mcp-fts5-starter (this) | MCP server deployment template: how to wire FTS5 + MCP together
forget-rag | RAG library with memory decay: three-tier forgetting on top of FTS5

Both use SQLite FTS5 under the hood but solve different problems. Need a starter? Use this repo. Need decay logic? Use forget-rag.

Quick demo

The repo ships with a small synthetic corpus under data/sample/ and a one-shot script that builds an index and runs a few representative queries against it:

git clone https://github.com/zx22413/mcp-fts5-starter
cd mcp-fts5-starter
uv sync                          # or: pip install -e .
python scripts/build-sample.py

Sample output:

Rebuilding index at data/sample/index.db
  indexed 7 doc(s): 7 written, 0 failed

Query: 'BM25 weights'
  - BM25 ranking                concepts/bm25.md
  - Why not just use a vector   notes/why-not-vector-db.md

Query: 'hybrid search'
  - Reciprocal rank fusion      concepts/rrf.md
  - Why not just use a vector   notes/why-not-vector-db.md

Query: 'tokenizer' [doc_type=notes]
  - Tokenization trade-offs     notes/tokenization-tradeoffs.md
  - Why not just use a vector   notes/why-not-vector-db.md
  - Incremental indexing        notes/incremental-indexing.md
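
Queries like the ones above can be approximated with plain FTS5 SQL. The following is a rough sketch, assuming a hypothetical docs table with title, body, doc_type, and path columns; the project's real schema and weights may differ. It shows two things: per-column BM25 weights (here a title hit counts ten times as much as a body hit) and an optional doc_type filter applied alongside MATCH.

```python
import sqlite3


def make_index(rows):
    """Build an in-memory FTS5 table from (title, body, doc_type, path) rows."""
    conn = sqlite3.connect(":memory:")
    # doc_type and path are stored for filtering/display but not tokenized.
    conn.execute(
        "CREATE VIRTUAL TABLE docs "
        "USING fts5(title, body, doc_type UNINDEXED, path UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO docs(title, body, doc_type, path) VALUES (?, ?, ?, ?)", rows
    )
    return conn


def search(conn, query, doc_type=None, limit=10):
    """Return (title, path) pairs ordered by weighted BM25, best first."""
    sql = "SELECT title, path FROM docs WHERE docs MATCH ?"
    params = [query]
    if doc_type is not None:
        sql += " AND doc_type = ?"
        params.append(doc_type)
    # bm25() returns lower (more negative) values for better matches, so
    # ascending order puts the best hit first. Weights follow column order:
    # title=10.0, body=1.0; the remaining columns keep the default weight.
    sql += " ORDER BY bm25(docs, 10.0, 1.0) LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    # Invented stand-in rows, not the contents of data/sample/.
    conn = make_index([
        ("Tokenization trade-offs", "Choosing a tokenizer affects recall.",
         "notes", "notes/tokenization-tradeoffs.md"),
        ("BM25 ranking", "The tokenizer runs before BM25 scores anything.",
         "concepts", "concepts/bm25.md"),
    ])
    for title, path in search(conn, "tokenizer", doc_type="notes"):
        print(f"  - {title:<28}{path}")
```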

To launch the MCP server against the same corpus (e.g. for use from Claude Code), point at the directory and the index file:

MCP_FTS5_CORPUS=data/sample MCP_FTS5_DB=data/sample/index.db \
  mcp-fts5-starter serve

Architecture

See docs/architecture.md for the design pillars (FTS5-first, embeddings opt-in, generic schema/tools, incremental sync), what didn't survive extraction from the upstream project, and a comparison table for when BM25 / hybrid / hosted vector DB each makes sense.
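
As a rough illustration of the incremental sync pillar, the sketch below re-indexes only files whose content hash has changed and drops rows for files that have disappeared. This is one plausible approach, not necessarily what docs/architecture.md describes: the seen bookkeeping table, the docs schema, and the sync function are hypothetical names.

```python
import hashlib
import sqlite3
from pathlib import Path


def sync(corpus_dir, conn):
    """Bring the index in line with corpus_dir; return counts of what changed."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs USING fts5(path UNINDEXED, body)"
    )
    # Side table remembering the SHA-256 of each file as last indexed.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen(path TEXT PRIMARY KEY, digest TEXT NOT NULL)"
    )
    known = dict(conn.execute("SELECT path, digest FROM seen"))
    stats = {"written": 0, "skipped": 0, "removed": 0}
    present = set()

    for md in sorted(Path(corpus_dir).rglob("*.md")):
        rel = md.relative_to(corpus_dir).as_posix()
        data = md.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        present.add(rel)
        if known.get(rel) == digest:
            stats["skipped"] += 1  # unchanged since the last sync
            continue
        # New or modified file: replace its row and record the new digest.
        conn.execute("DELETE FROM docs WHERE path = ?", (rel,))
        conn.execute(
            "INSERT INTO docs(path, body) VALUES (?, ?)", (rel, data.decode("utf-8"))
        )
        conn.execute(
            "INSERT OR REPLACE INTO seen(path, digest) VALUES (?, ?)", (rel, digest)
        )
        stats["written"] += 1

    # Files that were indexed before but no longer exist on disk.
    for rel in set(known) - present:
        conn.execute("DELETE FROM docs WHERE path = ?", (rel,))
        conn.execute("DELETE FROM seen WHERE path = ?", (rel,))
        stats["removed"] += 1

    conn.commit()
    return stats
```

Hashing file contents costs a full read of every file on each sync; comparing modification times first would be cheaper but is less reliable across copies and checkouts.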

Examples

  • examples/claude-code/ — drop-in .mcp.json for Claude Code, plus how-to and troubleshooting. Same shape works for Claude Desktop.
  • examples/raw-jsonrpc/ — talk to the server using bare JSON-RPC over stdio (no MCP SDK). Useful when writing a custom client or debugging a transport-level issue.
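
For a sense of what bare JSON-RPC over stdio involves, here is a small sketch that builds the usual MCP message sequence (initialize, the initialized notification, tools/list, tools/call) and can pipe it to the server process. It is not a copy of examples/raw-jsonrpc/. The search tool name comes from this README, but its query argument, the client name, and the helper functions are assumptions; the protocol version string is one published MCP revision and may not be the one this server negotiates.

```python
import json
import os
import subprocess


def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (expects a reply with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """Build a JSON-RPC 2.0 notification (no id, so no reply)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def encode(msg):
    """Serialize one message as a single newline-terminated line."""
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


def handshake_and_search(query):
    """Messages for: handshake, list tools, then one search call."""
    return [
        request(1, "initialize", {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "raw-client", "version": "0.0.0"},
        }),
        notification("notifications/initialized"),
        request(2, "tools/list"),
        # Assumption: the search tool takes a `query` argument.
        request(3, "tools/call", {"name": "search", "arguments": {"query": query}}),
    ]


def talk_to_server(messages, env_overrides):
    """Send messages to `mcp-fts5-starter serve` and collect matching replies."""
    proc = subprocess.Popen(
        ["mcp-fts5-starter", "serve"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env={**os.environ, **env_overrides},
    )
    replies = []
    try:
        for msg in messages:
            proc.stdin.write(encode(msg))
            proc.stdin.flush()
            if "id" not in msg:
                continue  # notifications get no reply
            while True:
                line = proc.stdout.readline()
                if not line:
                    raise RuntimeError("server closed stdout")
                reply = json.loads(line)
                if reply.get("id") == msg["id"]:
                    replies.append(reply)
                    break
    finally:
        proc.stdin.close()
        proc.terminate()
        proc.wait()
    return replies
```

With the package installed, a call such as talk_to_server(handshake_and_search("BM25 weights"), {"MCP_FTS5_CORPUS": "data/sample", "MCP_FTS5_DB": "data/sample/index.db"}) would exercise the sample corpus from the quick demo.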

Status

🚧 Alpha. Core indexer + 4 MCP tools + sample corpus + architecture doc are in. MCP-client config example, CI, and the v0.1.0 release are next; see the roadmap below.

Roadmap to v0.1

  1. Initial scaffold
  2. Generic MCP tool layer (search, list, read, index)
  3. Generic FTS5 schema with BM25 tuning notes
  4. Sample corpus + one-command demo (scripts/build-sample.py)
  5. Architecture doc: docs/architecture.md
  6. examples/: Claude Code config + raw JSON-RPC over stdio
  7. CI workflows (test on push/PR × py3.11/3.12/3.13; publish on release via OIDC)
  8. v0.1.0 release + blog post

License

MIT — see LICENSE.
