Drop-in MCP server template with SQLite FTS5 search backend. ~300 lines, no vector DB, no embedding API, runs on a Pi.
mcp-fts5-starter
The problem
You want to expose a corpus of notes, docs, or clippings to Claude (or any MCP client) as a search tool. Most tutorials reach for a vector DB, an embedding API, and a 500MB Docker image to retrieve a few thousand markdown files. For a small-to-medium corpus running on a single machine, that's overkill.
mcp-fts5-starter is the boring, dependable option:
- SQLite FTS5 for full-text search: built into Python's sqlite3, no service to run
- MCP server scaffold with a few example tools (search, list, read)
- One-file ingest script that walks a directory of markdown files, parses frontmatter, and indexes them
- No embeddings, no vectors, no GPU — and no API bill
Drop the template into a new repo, point it at a folder, and you have a working MCP server in under 10 minutes.
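To make the FTS5 piece concrete, here is a minimal sketch of an index and a BM25-ranked query using only the standard library. The table name, column names, and the build_index/search helpers are assumptions made for illustration, not the template's actual schema, and the sketch assumes your Python's SQLite build includes FTS5 (most do).

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 index from (path, title, body) tuples.

    The schema here is hypothetical; the template's real schema may differ.
    """
    conn = sqlite3.connect(":memory:")
    # path is stored but not tokenized, so it never matches a query term.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(path UNINDEXED, title, body)")
    conn.executemany("INSERT INTO docs (path, title, body) VALUES (?, ?, ?)", docs)
    conn.commit()
    return conn


def search(conn, query, limit=5):
    """Return (path, title) pairs ordered by BM25, best match first."""
    # bm25() yields smaller values for better matches, so ascending order
    # puts the most relevant rows first.
    rows = conn.execute(
        "SELECT path, title FROM docs WHERE docs MATCH ? "
        "ORDER BY bm25(docs) LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = build_index([
        ("concepts/bm25.md", "BM25 ranking", "BM25 weights control column importance."),
        ("notes/tokens.md", "Tokenization", "The tokenizer splits text into terms."),
    ])
    print(search(conn, "BM25 weights"))
```

The query string is passed to MATCH as FTS5 query syntax, so terms are implicitly ANDed. Input containing characters such as hyphens or quotes may need escaping before it reaches MATCH.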
When to use this (and when not to)
Use this if your corpus is:
- Small-to-medium (up to ~100k documents)
- Mostly text (markdown, code, prose) where keyword + tag matching is enough
- Running on a single machine, Pi, or laptop
- Something you want to set up once and forget
Don't use this if you need:
- True semantic search across rephrased queries — pair this with embeddings, or use a different tool
- Multi-tenant search across millions of docs — use a real search backend (Elastic, Meilisearch, Qdrant)
- Memory decay / TTL on entries — see forget-rag (which also uses FTS5 but for a different purpose)
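If you do end up pairing keyword search with embeddings, one common way to merge the two result lists is reciprocal rank fusion (RRF). The sketch below is a generic illustration of that technique, not something this template is known to ship; the function name and the conventional constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) per list it appears in
    (rank starts at 1); scores are summed across lists.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]  # e.g. from FTS5 / BM25
    vector_hits = ["b", "d", "a"]   # e.g. from an embedding index
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF only needs rank positions, so BM25 scores and embedding similarities never have to be put on a common scale.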
Sibling projects
| Repo | Angle |
|---|---|
| mcp-fts5-starter (this) | MCP server deployment template: how to wire FTS5 + MCP together |
| forget-rag | RAG library with memory decay: three-tier forgetting on top of FTS5 |
Both use SQLite FTS5 under the hood, but they solve different problems. If you need a starter template, use this repo. If you need decay logic, use forget-rag.
Quick demo
The repo ships with a small synthetic corpus under data/sample/ and a
one-shot script that builds an index and runs a few representative
queries against it:
git clone https://github.com/zx22413/mcp-fts5-starter
cd mcp-fts5-starter
uv sync # or: pip install -e .
python scripts/build-sample.py
Sample output:
Rebuilding index at data/sample/index.db
indexed 7 doc(s): 7 written, 0 failed
Query: 'BM25 weights'
- BM25 ranking concepts/bm25.md
- Why not just use a vector notes/why-not-vector-db.md
Query: 'hybrid search'
- Reciprocal rank fusion concepts/rrf.md
- Why not just use a vector notes/why-not-vector-db.md
Query: 'tokenizer' [doc_type=notes]
- Tokenization trade-offs notes/tokenization-tradeoffs.md
- Why not just use a vector notes/why-not-vector-db.md
- Incremental indexing notes/incremental-indexing.md
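The flow the demo exercises (walk a directory, parse frontmatter, index, then query with an optional doc_type filter) can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of scripts/build-sample.py: the schema, the function names, and the idea that doc_type comes from a frontmatter key are all hypothetical, and the frontmatter parser handles only flat key: value pairs.

```python
import sqlite3
from pathlib import Path


def parse_frontmatter(text):
    """Split markdown text into (metadata dict, body).

    Only flat `key: value` lines between two `---` delimiters are handled;
    a real ingest script would likely use a YAML parser.
    """
    if not text.startswith("---\n"):
        return {}, text
    header, sep, body = text[4:].partition("\n---\n")
    if not sep:  # no closing delimiter: treat the whole file as body
        return {}, text
    meta = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return meta, body


def ingest(corpus_dir, conn):
    """Index every .md file under corpus_dir; return (written, failed)."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS docs "
        "USING fts5(path UNINDEXED, doc_type UNINDEXED, title, body)"
    )
    root = Path(corpus_dir)
    written = failed = 0
    for path in sorted(root.rglob("*.md")):
        try:
            meta, body = parse_frontmatter(path.read_text(encoding="utf-8"))
        except (OSError, UnicodeDecodeError):
            failed += 1
            continue
        conn.execute(
            "INSERT INTO docs (path, doc_type, title, body) VALUES (?, ?, ?, ?)",
            (
                path.relative_to(root).as_posix(),
                meta.get("doc_type", ""),
                meta.get("title", path.stem),
                body,
            ),
        )
        written += 1
    conn.commit()
    return written, failed


def search(conn, query, doc_type=None):
    """Return (title, path) rows, optionally restricted to one doc_type."""
    sql = "SELECT title, path FROM docs WHERE docs MATCH ?"
    params = [query]
    if doc_type is not None:
        sql += " AND doc_type = ?"
        params.append(doc_type)
    return conn.execute(sql + " ORDER BY rank", params).fetchall()
```

Calling ingest("data/sample", sqlite3.connect("index.db")) and then search(conn, "tokenizer", doc_type="notes") would mirror the filtered query shown in the sample output, assuming the sample files carry that frontmatter key.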
To launch the MCP server against the same corpus (e.g. for use from Claude Code), point at the directory and the index file:
MCP_FTS5_CORPUS=data/sample MCP_FTS5_DB=data/sample/index.db \
mcp-fts5-starter serve
For a hosted deployment, swap stdio for sse or streamable-http:
mcp-fts5-starter serve --transport sse --host 0.0.0.0 --port 8765
Architecture & benchmarks
- docs/architecture.md: design pillars (FTS5-first, embeddings opt-in, generic schema/tools, incremental sync), what didn't survive extraction from the upstream project, and a comparison table for when BM25, hybrid, or a hosted vector DB each makes sense.
- docs/benchmark.md: reproducible benchmark at 100 / 1,000 / 10,000 docs, plus the perf bug it surfaced.
Examples
- examples/claude-code/: drop-in .mcp.json for Claude Code, plus how-to and troubleshooting. The same shape works for Claude Desktop.
- examples/raw-jsonrpc/: talk to the server using bare JSON-RPC over stdio (no MCP SDK). Useful when writing a custom client or debugging a transport-level issue.
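For a sense of what bare JSON-RPC over stdio involves, the sketch below builds the messages a hand-rolled client would typically send: an initialize request, the initialized notification, then tools/list and tools/call. It is not taken from examples/raw-jsonrpc/. The protocol version string, the client name, and the query argument name of the search tool are assumptions; check the server's tools/list response for the real input schema.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def jsonrpc_notification(method):
    """Serialize a JSON-RPC 2.0 notification (no id, so no reply expected)."""
    return json.dumps({"jsonrpc": "2.0", "method": method})


def handshake_and_search(query):
    """Return the messages, in order, to write to the server's stdin."""
    return [
        jsonrpc_request(1, "initialize", {
            "protocolVersion": "2024-11-05",  # assumed; the server may negotiate
            "capabilities": {},
            "clientInfo": {"name": "raw-client", "version": "0.0.1"},
        }),
        jsonrpc_notification("notifications/initialized"),
        jsonrpc_request(2, "tools/list"),
        # "query" is a hypothetical argument name for the search tool.
        jsonrpc_request(3, "tools/call", {
            "name": "search",
            "arguments": {"query": query},
        }),
    ]


if __name__ == "__main__":
    for line in handshake_and_search("BM25 weights"):
        print(line)
```

With the stdio transport each message travels as one line of JSON, so a client can write these strings followed by a newline to the stdin of a process started with mcp-fts5-starter serve and read one JSON reply per line from its stdout.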
Status
✅ v0.1.0 shipped (PyPI · GitHub Release · launch post).
Roadmap to v0.1
1. Initial scaffold
2. Generic MCP tool layer (search, list, read, index)
3. Generic FTS5 schema with BM25 tuning notes
4. Sample corpus + one-command demo (scripts/build-sample.py)
5. Architecture doc: docs/architecture.md
6. examples/: Claude Code config + raw JSON-RPC over stdio
7. CI workflows (test on push/PR × py3.11/3.12/3.13; publish on release via OIDC)
8. v0.1.0 release (PyPI) + launch post
License
MIT — see LICENSE.