Lightweight agent-native knowledge base. SQLite + FTS5 + vec0 + MCP server.
kb-mcp
An agent-native knowledge base.
pip install kb-mcp — give any LLM agent a structured, queryable, local-first second brain.
The problem
Knowledge bases for humans (Notion, Obsidian) and for search engines (Elasticsearch, vector DBs) leave a gap: LLM agents need a knowledge layer that speaks their protocol and assumes the reader is a model, not a person.
kb-mcp fills it.
| | Obsidian / Notion | Vector DBs (Chroma / LanceDB) | kb-mcp |
|---|---|---|---|
| Reader-optimised for | Humans | Embeddings | LLM agents |
| Protocol | Web UI | SDK | MCP (stdio) |
| Schema | Free-form | Free-form | Typed (project / decision / lesson / ...) |
| Default storage | Cloud / proprietary | Local files | SQLite + FTS5 |
| Setup | Sign up | pip install + configure | pip install and go |
Features
- 🧠 Agent-native. Every document is reachable from any MCP client via 12 tools, 4 Resources, and 2 Prompts.
- 📐 Schema-first. Six built-in document types (`project`, `decision`, `lesson`, `glossary`, `person`, `faq`), extensible via Python subclassing.
- 🔍 Full-text search. Three modes: lexical (BM25), fuzzy (trigram), and hybrid (combined). Optional semantic search with `sqlite-vec`.
- 📜 Version history. Every create/update/delete is recorded. Restore any previous version, diff between versions, or recover soft-deleted docs.
- 🔗 Typed links. Documents reference other documents; backlinks are automatic.
- 📝 Markdown friendly. Round-trip import/export with frontmatter. Aliases let agents reference the same doc from different contexts.
- 🗄️ Multi-vault. Isolated knowledge bases per project or team. Switch between vaults with `kb vault switch`.
- 🪶 Zero deps by default. SQLite ships with Python. `pip install kb-mcp` and you're done.
- 🔒 Local-first. Your data stays on your machine. No cloud, no telemetry, no phone-home.
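The hybrid mode in the list above combines lexical and fuzzy results. This README does not say how the two rankings are merged, so the sketch below uses reciprocal rank fusion (RRF), a common technique, purely as an assumption. The function name, the constant `k=60`, and the sample IDs are illustrative and not part of kb-mcp.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    with rank starting at 1.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative result lists from a lexical and a fuzzy search.
lexical = ["dec/use-sqlite-fts5", "glossary/fts5", "proj/kb-mcp"]
fuzzy = ["glossary/fts5", "faq/why-sqlite", "dec/use-sqlite-fts5"]
fused = reciprocal_rank_fusion([lexical, fuzzy])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the behaviour a combined mode is usually expected to have.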
Quickstart
pip install kb-mcp
kb init
kb add --type project --title "kb-mcp" --tags kb,mcp,open-source --body "Agent-native knowledge base."
kb search "mcp server"
# Expose to any MCP client
kb serve
That's it. Five commands, zero config files.
👉 Full walkthrough: docs/quickstart.md
Vault quickstart (multi-space)
# Create isolated knowledge bases
kb vault create work --desc "Work projects"
kb vault create personal --desc "Personal learning"
# Switch between them
kb vault switch work
kb add --type project --title "My Work App" --body "..."
kb vault switch personal
kb add --type lesson --title "Rust tutorial notes" --body "..."
# Each vault has its own SQLite database
kb vault list
Team sync (Git)
Share a team vault via any Git remote:
# Member A: set up
kb vault create team --desc "Team shared knowledge"
kb vault init-git # git init + .gitignore
kb vault commit -m "Initial KB" # export → git commit
kb vault push origin main # push to remote
# Member B: clone and start using
git clone <remote-url> ~/.local/share/kb-mcp-custom/
KB_MCP_HOME=~/.local/share/kb-mcp-custom kb vault list
KB_MCP_HOME=~/.local/share/kb-mcp-custom kb vault pull # pull → import
# Daily workflow:
# Make changes...
kb add --type decision --title "Use SQLite FTS5"
# Share:
kb vault commit -m "Add ADR about FTS5"
kb vault push
# Get teammates' changes:
kb vault pull
Both push and pull accept optional <remote> <branch> arguments:
kb vault push origin main. When omitted, defaults are origin and main.
The sync is text-based: the vault's Markdown files go to a md/ subdirectory
under Git, while the binary .db stays local and .gitignored.
First-time import from a Git repo
If you have an existing Git repository with md/ (exported Markdown files) and
want to import it into a local vault, there are three approaches:
① Use --sync-dir (recommended for ongoing sync)
# Clone the repo first
git clone <remote-url> ~/my-vault-repo
# Create a vault pointing at the clone
kb vault create my-vault --desc "Team KB"
kb vault init-git --sync-dir ~/my-vault-repo
# Import the Markdown files into SQLite
kb vault pull
This links the vault to the git clone so future kb vault commit / push / pull
all work without extra arguments.
② Direct kb import (one-shot, no git link)
kb vault create my-vault
kb import ~/my-vault-repo/md/
Fastest for a one-off import, but subsequent kb vault commit won't know
where to export to unless you also run kb vault init-git --sync-dir.
③ Override KB_MCP_HOME (isolated vault directory)
git clone <remote-url> ~/.local/share/kb-mcp-custom/
KB_MCP_HOME=~/.local/share/kb-mcp-custom kb vault list
KB_MCP_HOME=~/.local/share/kb-mcp-custom kb vault pull
Puts everything under a custom directory — useful for side-by-side vaults or testing.
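As a rough illustration of what an environment override like KB_MCP_HOME implies, the sketch below resolves a data directory from the environment and falls back to a default. The default path and the function name `resolve_home` are hypothetical; this README does not state where kb-mcp stores data when the variable is unset.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Hypothetical default; the real default location is not stated in this README.
DEFAULT_HOME = "~/.local/share/kb-mcp"


def resolve_home(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the data directory, preferring KB_MCP_HOME when it is set."""
    env = os.environ if env is None else env
    raw = env.get("KB_MCP_HOME") or DEFAULT_HOME
    return Path(raw).expanduser()


print(resolve_home({"KB_MCP_HOME": "/tmp/kb-test"}))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.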
kb import — bulk-import Markdown files
Import an entire directory of Markdown files into the vault at once.
kb import <directory> [--dry-run] [--json]
- `<directory>`: path to a directory of `.md` files (searched recursively)
- `--dry-run`: parse & validate every file without writing anything
- `--json`: output the import report as JSON
Frontmatter format
Every .md file can begin with a YAML frontmatter block (between --- delimiters).
The body is everything after the closing ---.
---
type: decision
title: Use SQLite FTS5 over Elasticsearch
tags:
- architecture
- database
created_at: 2025-01-15T10:00:00Z
updated_at: 2025-01-20T14:30:00Z
---
# Body text goes here
Any valid Markdown.
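The delimiter rule above (frontmatter between `---` lines, body after the closing one) can be sketched in a few lines. This is a minimal illustration, not kb-mcp's parser: the function name `split_frontmatter` is hypothetical, and turning the frontmatter text into fields would still need a YAML parser.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a Markdown document into (frontmatter, body).

    Returns an empty frontmatter string when the file does not start
    with a '---' line or the block is never closed.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return frontmatter, body
    # No closing delimiter: treat the whole file as body.
    return "", text


doc = """---
type: decision
title: Use SQLite FTS5 over Elasticsearch
---
# Body text goes here
Any valid Markdown.
"""
frontmatter, body = split_frontmatter(doc)
print(frontmatter)
print(body)
```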
Required fields
| Field | Type | Description |
|---|---|---|
| `type` | `str` | Document type. One of project, decision, lesson, glossary, person, faq, or a custom type of your own. |
| `title` | `str` | Document title. Non-empty, max 512 characters. |
Optional fields
| Field | Type | Default | Description |
|---|---|---|---|
| `tags` | `list[str]` | `[]` | Tags for filtering and grouping. Each tag: lowercase, alphanumeric + `_`/`-`. Max 64 tags. |
| `source` | `str` | (auto) | Overridden by the file's relative path during import. Usually you don't set this. |
| `created_at` | `str` | now UTC | ISO-8601 datetime. `2025-01-15T10:00:00Z` or `2025-01-15T10:00:00+00:00`. |
| `updated_at` | `str` | now UTC | Same format as `created_at`. |
| `links` | `list[dict]` | `[]` | Outgoing document links. Each entry has `to` (target doc ID) and optional `rel` (default `"relates-to"`). |
Note: Unknown frontmatter keys do not raise an error; they are silently ignored.
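The field rules above translate into straightforward checks. The sketch below is one way to express them, assuming the frontmatter has already been parsed into a dict. The function name and error wording are illustrative and may differ from what kb-mcp reports.

```python
import re

# Rules taken from the field tables above.
TAG_PATTERN = re.compile(r"[a-z0-9_-]+")
MAX_TITLE_LENGTH = 512
MAX_TAGS = 64


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the fields look valid."""
    errors: list[str] = []
    for field in ("type", "title"):
        if not meta.get(field):
            errors.append(f"frontmatter missing required field '{field}'")
    title = str(meta.get("title") or "")
    if len(title) > MAX_TITLE_LENGTH:
        errors.append(f"title exceeds {MAX_TITLE_LENGTH} characters")
    tags = meta.get("tags") or []
    if len(tags) > MAX_TAGS:
        errors.append(f"more than {MAX_TAGS} tags")
    for tag in tags:
        if not TAG_PATTERN.fullmatch(str(tag)):
            errors.append(f"invalid tag '{tag}'")
    return errors


print(validate_frontmatter({"type": "decision", "title": "Use FTS5", "tags": ["database"]}))
print(validate_frontmatter({"title": "No type", "tags": ["Bad Tag"]}))
```

The type value itself is not checked against the built-in list, because custom types are allowed.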
Examples per document type
Project — a repository or initiative overview:
---
type: project
title: kb-mcp
tags:
- mcp
- knowledge-base
- python
---
Agent-native knowledge base. SQLite + FTS5 + vec0 + MCP server.
Decision — an Architecture Decision Record (ADR):
---
type: decision
title: Use SQLite FTS5 over Elasticsearch
tags:
- architecture
- database
created_at: 2025-01-15T10:00:00Z
links:
- to: proj/kb-mcp
rel: governs
- to: glossary/fts5
---
## Context
We need full-text search that works offline with zero setup.
## Decision
Use SQLite FTS5 — it ships with Python, requires no external process,
and handles our scale.
## Consequences
+ No infra to manage
- No distributed search
Lesson — a post-mortem or lesson learned:
---
type: lesson
title: Don't call last_insert_rowid() across connections
tags:
- sqlite
- bug
created_at: 2025-02-01T08:30:00Z
---
`last_insert_rowid()` is connection-scoped, not transaction-scoped.
If you INSERT on connection A but call `last_insert_rowid()` on
connection B, you get 0 — silently.
Glossary — a term definition:
---
type: glossary
title: FTS5
tags:
- sqlite
- search
---
SQLite's full-text search engine. Supports BM25 ranking, prefix queries,
and incremental merge. Ships as a compile-time option in the standard
`sqlite3` module.
Person — profile of a person the agent should know about:
---
type: person
title: Zhang Bei
tags:
- team
- maintainer
---
Owner of kb-mcp. Uses Hermes agent framework. Active in the MCP community.
FAQ — a frequently asked question:
---
type: faq
title: Why SQLite?
tags:
- faq
- architecture
---
**Q:** Why SQLite instead of PostgreSQL or a vector DB?
**A:** SQLite ships with Python — zero deps. For a local-first agent
knowledge base, it's fast enough, and FTS5 + vec0 cover search.
Import behavior
- Recursive walk: all `.md` files (excluding hidden files/dirs) are found.
- Frontmatter parsed: each file is read and its YAML frontmatter extracted.
- Document constructed: `type` + `title` are validated (required); missing fields raise errors collected per-file.
- Source-based dedup: if a document with the same `source` path already exists, it is updated in-place (preserving `id` and `created_at`). Otherwise a new document is inserted.
- Links created: if a file's frontmatter has a `links` field, each link is created via `store.link()`. Failed links (e.g. target doc not found) are reported as errors.
- Report generated: a summary showing inserted / updated / skipped / error counts.
$ kb import ./docs/
Imported 12 files: 8 inserted, 3 updated, 1 error
Errors:
./docs/broken.md: frontmatter missing required field 'type'
Use --dry-run to see what would happen before making changes:
$ kb import ./docs/ --dry-run
Would import 12 files: 8 inserted, 3 updated, 0 errors
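The source-based dedup step described under Import behavior can be sketched with the standard `sqlite3` module. The table layout and the `import_doc` helper below are hypothetical, since kb-mcp's real schema is not shown in this README. The point is only that a matching `source` updates the row while keeping `id` and `created_at`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema, for illustration only.
conn.execute(
    "CREATE TABLE docs (id TEXT PRIMARY KEY, source TEXT UNIQUE, "
    "title TEXT, created_at TEXT, updated_at TEXT)"
)


def import_doc(doc_id: str, source: str, title: str, now: str) -> str:
    """Insert a document, or update the existing one with the same source path."""
    row = conn.execute("SELECT id FROM docs WHERE source = ?", (source,)).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO docs VALUES (?, ?, ?, ?, ?)",
            (doc_id, source, title, now, now),
        )
        return "inserted"
    # Same source: keep id and created_at, refresh the mutable fields.
    conn.execute(
        "UPDATE docs SET title = ?, updated_at = ? WHERE source = ?",
        (title, now, source),
    )
    return "updated"


first = import_doc("dec/use-fts5", "adr/fts5.md", "Use FTS5", "2025-01-15T10:00:00Z")
second = import_doc("dec/renamed", "adr/fts5.md", "Use SQLite FTS5", "2025-01-20T14:30:00Z")
rows = conn.execute("SELECT id, title, created_at, updated_at FROM docs").fetchall()
print(first, second, rows)
```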
Export
The reverse — dump every document in the vault as .md files:
kb export <directory> [--force]
- Each document becomes a `<slug>.md` file (based on its ID).
- Collisions get a numeric suffix (`kb-mcp-2.md`).
- Pre-existing files are not overwritten unless `--force` is passed.
- After export, each document's `source` field is updated in the DB so a subsequent `kb import` of the same directory matches correctly.
- Outgoing document links (from `store.outlinks()`) are included in each file's frontmatter as a `links` list, so import-export round-trips preserve document relationships.
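The collision rule in the list above (a numeric suffix such as `kb-mcp-2.md`) can be sketched as follows. The slug rule and both function names are assumptions made for illustration. The README only says the slug is based on the document ID.

```python
import re


def slugify(doc_id: str) -> str:
    """Turn an ID into a filesystem-safe slug (hypothetical rule)."""
    return re.sub(r"[^a-z0-9]+", "-", doc_id.lower()).strip("-")


def unique_filename(doc_id: str, taken: set[str]) -> str:
    """Return '<slug>.md', adding -2, -3, ... when the name is already taken."""
    slug = slugify(doc_id)
    candidate = f"{slug}.md"
    suffix = 2
    while candidate in taken:
        candidate = f"{slug}-{suffix}.md"
        suffix += 1
    taken.add(candidate)
    return candidate


# Three illustrative IDs that collapse to the same slug.
taken: set[str] = set()
names = [unique_filename(doc_id, taken) for doc_id in ("kb-mcp", "kb_mcp", "KB MCP")]
print(names)
```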
Document types
| Type | ID prefix | Purpose | Example |
|---|---|---|---|
| `project` | `proj` | Repo / initiative background | kb-mcp, micro-app-fork |
| `decision` | `dec` | Architecture Decision Record (ADR) | "Use SQLite FTS5 over Elasticsearch" |
| `lesson` | `lesson` | Post-mortem / lessons learned | "Don't last_insert_rowid() across multi-INSERT batches" |
| `glossary` | `glossary` | Term definitions | FTS5, MCP, ADR |
| `person` | `person` | People the agent should recognise | "Zhang Bei, owner, uses Hermes" |
| `faq` | `faq` | Frequently asked questions | "Why SQLite?" |
Every document gets a stable ID auto-generated from its type and title
(e.g. dec/use-sqlite-fts5-over-elasticsearch). IDs are permanent:
once a document is created, its type, id, and created_at fields are immutable.
Subclass kb_mcp.schema.Document to add your own document type.
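To make the ID scheme concrete, here is a small sketch that builds an ID from a type and a title. It assumes IDs take the form `<prefix>/<slug>` using the ID prefix column of the table above. The slug rule and the `make_id` helper are illustrative, not kb-mcp's actual implementation.

```python
import re

# ID prefixes as listed in the document types table.
ID_PREFIXES = {
    "project": "proj",
    "decision": "dec",
    "lesson": "lesson",
    "glossary": "glossary",
    "person": "person",
    "faq": "faq",
}


def make_id(doc_type: str, title: str) -> str:
    """Build '<prefix>/<slug>' from a type and title (slug rule is an assumption)."""
    # Unknown (custom) types fall back to the type name as prefix.
    prefix = ID_PREFIXES.get(doc_type, doc_type)
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{prefix}/{slug}"


print(make_id("decision", "Use SQLite FTS5 over Elasticsearch"))
print(make_id("faq", "Why SQLite?"))
```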
MCP integration
Add to ~/.config/claude_desktop_config.json (or any MCP client):
{
"mcpServers": {
"kb": {
"command": "kb",
"args": ["serve"]
}
}
}
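If the client config already lists other servers, the entry above has to be merged rather than pasted over the file. A minimal sketch of that merge, using only the standard library, is shown below. The helper name `add_kb_server` is hypothetical, and the demo writes to a temporary directory rather than a real config.

```python
import json
import tempfile
from pathlib import Path


def add_kb_server(config_path: Path) -> dict:
    """Add the kb server entry to an MCP client config, keeping other entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["kb"] = {"command": "kb", "args": ["serve"]}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    path.write_text('{"mcpServers": {"other": {"command": "other-server"}}}')
    merged = add_kb_server(path)
print(sorted(merged["mcpServers"]))
```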
The agent sees 12 tools:
| Tool | Description | Example |
|---|---|---|
| `kb_search` | Full-text search (lexical / fuzzy / hybrid) | `kb_search("FTS5 search")` |
| `kb_get` | Fetch document by id (also resolves aliases) | `kb_get("dec/use-sqlite-fts5")` |
| `kb_add` | Create a new document | `kb_add(type="lesson", title="…", body="…")` |
| `kb_update` | Patch fields on an existing document | `kb_update(id="…", title="New title")` |
| `kb_delete` | Soft-delete a document | `kb_delete("proj/kb-mcp")` |
| `kb_list` | Browse documents with type/tag filters | `kb_list(type="decision")` |
| `kb_link` | Create a typed edge between documents | `kb_link(from="dec/…", to="proj/…", rel="governs")` |
| `kb_unlink` | Remove typed edges | `kb_unlink(from="dec/…", to="proj/…")` |
| `kb_history` | View document version history | `kb_history("doc/…")` |
| `kb_restore` | Restore to a previous version | `kb_restore("doc/…", version=3)` |
| `kb_diff` | Field-level diff between versions | `kb_diff("doc/…", v1=1, v2=3)` |
| `kb_restore_deleted` | Restore a soft-deleted document | `kb_restore_deleted("doc/…")` |
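On the wire, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for `kb_search`. The argument names `query` and `mode` are assumptions, since the table above only shows positional examples. A real session also begins with an initialize handshake, which an MCP client library normally handles.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" and "mode" are assumed argument names, not documented above.
request = tool_call_request(1, "kb_search", {"query": "FTS5 search", "mode": "hybrid"})
print(request)
```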
... plus 4 Resources — each returns a structured view:
| Resource URI | Returns |
|---|---|
| `kb://doc/{id}` | Full document with body, metadata, and backlinks |
| `kb://search/{query}` | Search results with relevance scores |
| `kb://links/{id}` | All typed edges for a document (outgoing + incoming) |
| `kb://doctor` | Health check report (integrity, missing refs, schema stats) |
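The "outgoing + incoming" view returned for links follows naturally from storing each typed edge once and querying it in both directions. The sketch below shows the idea with a hypothetical `links` table and sample edges. It is not kb-mcp's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical edge table: one row per typed link.
conn.execute("CREATE TABLE links (src TEXT, dst TEXT, rel TEXT)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?)",
    [
        ("dec/use-sqlite-fts5", "proj/kb-mcp", "governs"),
        ("dec/use-sqlite-fts5", "glossary/fts5", "relates-to"),
        ("faq/why-sqlite", "proj/kb-mcp", "relates-to"),
    ],
)


def links_for(doc_id: str) -> dict[str, list[tuple[str, str]]]:
    """Return outgoing links and backlinks for one document."""
    outgoing = conn.execute(
        "SELECT dst, rel FROM links WHERE src = ? ORDER BY dst", (doc_id,)
    ).fetchall()
    # Backlinks need no extra bookkeeping: they are the same rows read by target.
    incoming = conn.execute(
        "SELECT src, rel FROM links WHERE dst = ? ORDER BY src", (doc_id,)
    ).fetchall()
    return {"outgoing": outgoing, "incoming": incoming}


project_links = links_for("proj/kb-mcp")
print(project_links)
```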
... and 2 Prompts for richer agent interaction:
| Prompt | Purpose |
|---|---|
| `new-doc` | Guided multi-step doc creation (walks the agent through type/title/body/tags) |
| `search-expert` | Expert search strategist that picks the best search mode for the query |
You can also serve a specific vault:
kb serve --vault project-x
Development
git clone https://github.com/your-org/kb-mcp
cd kb-mcp
pip install -e ".[dev]"
pytest # unit + E2E (real SQLite temp file, no mocks)
ruff check .
mypy src/
👉 Spec: docs/requirements.md · Architecture: docs/architecture.md · CLI reference: docs/cli-reference.md
Roadmap
| Version | Scope | Status |
|---|---|---|
| v0.1.0 | CLI + MCP server + SQLite/FTS5 + 6 doc types + Markdown I/O | ✅ shipped |
| v0.2.0 | Fuzzy/trigram search, semantic search (sqlite-vec), tools/CLI completion | ✅ shipped |
| v0.3.0 | MCP Resources & Prompts, version history (restore/diff), aliases | ✅ shipped |
| v0.4.0 | Multi-vault (isolated knowledge bases), vault CLI + MCP vault selection | ✅ shipped |
| v0.5.0 | Local embedding models, CJK tokenizer, knowledge graph visualization | exploring |
| v0.6.0 | Plugin system, external sync (Notion/GitHub), LLM-native enhancements | exploring |
| v1.0.0 | Postgres backend, multi-user auth, hosted mode | exploring |
Status
Beta. The API and storage format have been stable since v0.3.0. Pin minor versions
(kb-mcp>=0.3,<0.5) in production if you prefer conservative upgrades.
Contributing
Issues and PRs welcome. See CONTRIBUTING.md.
By participating, you agree to abide by the Code of Conduct.
License
MIT — do what you want, just keep the copyright notice.