smalt-mcp
MCP server wrapping the Smalt's storage surface (read / write / link / claim / search) for ParkviewLab's CoGrind project. Thinnest viable wrapper around markdown + LanceDB; no agentic logic. Single-writer to a given Smalt.
To cobalt-grinding what deco-assaying is to tree-sitter: a clean MCP-shaped wrapper around a deterministic capability.
Status
Storage substrate complete. The full storage-substrate surface is wired up: 17 tools across three permission tiers, with auto-indexer-trigger on writes and hybrid (FTS + vector + alias, RRF-fused) search. Track A of CoGrind's M2.7 plan — see cobalt-grinding/docs/plan.md for the full design.
The scientific-method surface (proposals / experiments / gaps) is not part of smalt-mcp — it lives in a separate MCP server, ebony-enriching, the lab-notebook substrate. Cobalt-grinding's cognitive systems read from both substrates and orchestrate cross-substrate writes.
Run
Same five-mode pattern as deco-assaying. Pick whichever fits.
| Mode | When to use |
|---|---|
| 1. uvx (one-off) | Try it once, no install. |
| 2. uv tool install (pinned daemon) | Run it occasionally, want it on $PATH. |
| 3. macOS LaunchAgent | Persistent daemon on a Mac. |
| 4. Linux systemd user unit | Persistent daemon on Linux. |
| 5. Docker / docker compose | Container deployment. |
In every mode the server listens on PORT (default 35833). Sanity-check:
curl http://127.0.0.1:35833/health
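The same sanity check can be scripted. The sketch below is illustrative only: `is_healthy` and `fetch_health` are hypothetical helper names, and it assumes the `/health` body is a JSON object whose `ok` field is a boolean, alongside the `version` and `uptime_seconds` keys listed under Endpoints.

```python
import json
import urllib.request


def is_healthy(payload: dict) -> bool:
    """Return True when a /health payload has the documented keys and ok is True."""
    expected_keys = {"ok", "version", "uptime_seconds"}
    # Assumption: "ok" is a JSON boolean; adjust if the server reports it differently.
    return expected_keys <= payload.keys() and payload["ok"] is True


def fetch_health(base_url: str = "http://127.0.0.1:35833", timeout: float = 2.0) -> dict:
    """GET /health and decode the JSON body (assumes the body is a JSON object)."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)
```

With a server running locally, `is_healthy(fetch_health())` should mirror what the `curl` call shows.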
From source (current; until first release)
git clone https://github.com/ParkviewLab/smalt-mcp.git
cd smalt-mcp
uv sync
SMALT_DIR=~/Documents/Smalt uv run python -m smalt_mcp
Docker (after first release)
docker pull ghcr.io/parkviewlab/smalt-mcp:latest
docker run --rm \
-p 35833:35833 \
-e SMALT_SCOPE=read_only \
-v smalt-data:/data \
ghcr.io/parkviewlab/smalt-mcp:latest
Or use docker-compose.yml.
Endpoints
- POST /sse: MCP Streamable HTTP transport. Tools.
- GET /health: liveness probe ({ok, version, uptime_seconds}).
- GET /admin/version: server identity + scope + configured Smalt path.
- GET /docs: OpenAPI / Swagger UI for the HTTP routes.
HTTP responses are gzipped when the client sends Accept-Encoding: gzip.
MCP tools
Three permission tiers controlled by SMALT_SCOPE. A caller at tier N sees and may call any tool whose required scope is ≤ N.
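The tier rule can be pictured as an ordered list of scopes plus a lookup from each scope to the tools that require it. This is a minimal sketch of that rule, not the server's actual implementation; `SCOPE_ORDER`, `TOOLS_BY_REQUIRED_SCOPE`, and `visible_tools` are names invented for illustration, while the tool names come from the lists in this section.

```python
# Scopes in ascending order of privilege; the index is the tier number.
SCOPE_ORDER = ["read_only", "read_write", "remove_destructive"]

# Tools grouped by the scope they require.
TOOLS_BY_REQUIRED_SCOPE = {
    "read_only": [
        "status", "list_pages", "read_page", "find_by_alias",
        "incoming_links", "traverse", "search", "list_domains",
    ],
    "read_write": [
        "bootstrap", "write_page", "write_pages", "add_link", "add_claim",
    ],
    "remove_destructive": [
        "remove_page", "update_claim", "remove_claim", "remove_link",
    ],
}


def visible_tools(caller_scope: str) -> list[str]:
    """Return every tool whose required tier is at or below the caller's tier."""
    if caller_scope not in SCOPE_ORDER:
        raise ValueError(f"unknown scope: {caller_scope!r}")
    caller_tier = SCOPE_ORDER.index(caller_scope)
    visible: list[str] = []
    for tier, scope in enumerate(SCOPE_ORDER):
        if tier <= caller_tier:
            visible.extend(TOOLS_BY_REQUIRED_SCOPE[scope])
    return visible
```

Under this model a `read_write` caller sees the 8 read tools plus the 5 write tools, and none of the destructive ones.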
read_only (8 tools):
- status: Smalt path, existence, LanceDB tables, page count, single-writer mutex state, embedding provider.
- list_pages: indexed pages, filtered by type / prefix.
- read_page: full page (frontmatter + body); falls back to alias lookup on miss.
- find_by_alias: every page whose aliases list contains the given alias.
- incoming_links: "what links to this page" (the inverse of traverse).
- traverse: outgoing edges from a page; optional label filter.
- search: hybrid FTS + vector + alias, RRF-fused; every hit carries id, aliases, title, type, snippet, score.
- list_domains: ConceptPages flagged is_domain: true.
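For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique that the search tool's description refers to: each ranker contributes `1 / (k + rank)` for every page it returns, and pages are ordered by the summed score. It is a generic illustration, not smalt-mcp's code; the function name, the `k = 60` constant (a commonly used default), and the tie-break by id are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of page ids with reciprocal rank fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each ranker adds 1 / (k + 1).
        for rank, page_id in enumerate(ranking, start=1):
            scores[page_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda page_id: (-scores[page_id], page_id))


# Hypothetical result lists from the three rankers.
fts_hits = ["a", "b", "c"]
vector_hits = ["b", "a", "d"]
alias_hits = ["b"]
fused = rrf_fuse([fts_hits, vector_hits, alias_hits])
```

A page that several rankers agree on (here `b`) rises above pages that only one ranker returns, without any need to normalize the rankers' raw scores against each other.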
read_write (+5 tools):
- bootstrap: initialize the canonical layout + LanceDB tables; idempotent.
- write_page: create (always-mangle: caller-id becomes slug-prefix + 22-char UUID4 suffix; original id preserved in aliases) or update (requires existing canonical id). Runs the incremental indexer.
- write_pages: batch of writes; validate-all-then-act; single indexer pass at the end.
- add_link: append an outgoing link to a page's links_out; duplicate detection.
- add_claim: append a Claim to a page's claims; duplicate-id detection.
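To make the create-time id mangling concrete, here is one way a slug prefix and a 22-character UUID4 suffix could be produced. Everything about the encoding is an assumption made for illustration: the README does not say how the suffix is derived, how the slug is normalized, or which separator is used. The sketch relies on the fact that the 16 bytes of a UUID encode to exactly 22 characters in unpadded URL-safe base64.

```python
import base64
import re
import uuid


def mangle_id(caller_id: str) -> tuple[str, list[str]]:
    """Return (canonical_id, aliases) for a newly created page (hypothetical scheme)."""
    # Assumed slug rule: lowercase, runs of non-alphanumerics collapsed to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", caller_id.lower()).strip("-")
    # 16 random bytes -> 24 base64 chars including "==" padding -> 22 once stripped.
    suffix = base64.urlsafe_b64encode(uuid.uuid4().bytes).rstrip(b"=").decode("ascii")
    # The caller's original id survives as an alias, so lookups by it still resolve.
    return f"{slug}-{suffix}", [caller_id]
```

Because the suffix is random, two creates with the same caller id yield different canonical ids, which is why an update has to name the canonical id rather than the original one.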
remove_destructive (+4 tools):
- remove_page: cascading delete (file + pages row + embeddings row + outgoing + incoming links + claims).
- update_claim: replace one claim by id; new_claim.id must equal claim_id.
- remove_claim: remove one claim by id.
- remove_link: remove edges by (from_id, to_id, label?); omit label to drop every edge between the pair.
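The optional-label behavior of remove_link is easy to misread, so a small in-memory model may help. This is a sketch of the matching rule only; the `Edge` type and `remove_edges` function are hypothetical and say nothing about how the server stores links.

```python
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    from_id: str
    to_id: str
    label: str


def remove_edges(
    edges: list[Edge], from_id: str, to_id: str, label: Optional[str] = None
) -> list[Edge]:
    """Return the edges that remain after removing matches for (from_id, to_id, label?)."""

    def matches(edge: Edge) -> bool:
        if edge.from_id != from_id or edge.to_id != to_id:
            return False
        # No label given: every edge between the pair matches.
        return label is None or edge.label == label

    return [edge for edge in edges if not matches(edge)]


edges = [
    Edge("p1", "p2", "cites"),
    Edge("p1", "p2", "extends"),
    Edge("p1", "p3", "cites"),
]
```

Passing a label removes only that edge; omitting it clears both `p1 -> p2` edges while leaving `p1 -> p3` untouched.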
For the proposal / experiment / gap surface (writing hypotheses, recording experiment runs, queueing knowledge gaps), use ebony-enriching — the lab-notebook substrate. Both servers are independent: cobalt-grinding's cognitive systems orchestrate any cross-substrate flow.
Configuration
| Env var | Default | Purpose |
|---|---|---|
| PORT | 35833 | HTTP listen port. |
| HOST | 0.0.0.0 | HTTP bind address. |
| SMALT_DIR | ~/Documents/Smalt | Path to the Smalt this server wraps. Call the bootstrap MCP tool once to initialize. |
| SMALT_SCOPE | read_write | read_only, read_write, or remove_destructive. Tiered: caller at tier N sees every tool whose required scope is ≤ N. |
| EMBEDDING_PROVIDER | fastembed | Embedding backend. fastembed is the only one wired up; voyage / openai are placeholders. |
| EMBEDDING_MODEL | BAAI/bge-small-en-v1.5 | Model name passed to the provider. |
| EMBEDDING_DIM | 384 | Must match the model. |
| SMALT_INTERNAL_TOKEN | (unset) | Reserved for future per-client scope routing; not yet enforced. |
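As a reading aid, the table can be expressed as a small settings loader. The variable names and defaults are taken from the table; the `Settings` class, `load_settings` function, and the validation behavior are illustrative assumptions rather than the server's own configuration code.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

VALID_SCOPES = ("read_only", "read_write", "remove_destructive")


@dataclass(frozen=True)
class Settings:
    port: int
    host: str
    smalt_dir: Path
    scope: str
    embedding_provider: str
    embedding_model: str
    embedding_dim: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    scope = env.get("SMALT_SCOPE", "read_write")
    if scope not in VALID_SCOPES:
        raise ValueError(f"SMALT_SCOPE must be one of {VALID_SCOPES}, got {scope!r}")
    return Settings(
        port=int(env.get("PORT", "35833")),
        host=env.get("HOST", "0.0.0.0"),
        smalt_dir=Path(env.get("SMALT_DIR", "~/Documents/Smalt")).expanduser(),
        scope=scope,
        embedding_provider=env.get("EMBEDDING_PROVIDER", "fastembed"),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
        embedding_dim=int(env.get("EMBEDDING_DIM", "384")),
    )
```

Calling `load_settings()` with no argument reads `os.environ`; passing a plain dict makes the defaults easy to inspect without touching the process environment.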
Operations: backup and restore
The Smalt is a directory of markdown files (plus a rebuildable LanceDB index). Use Restic directly against SMALT_DIR for backup — there's no dedicated backup endpoint on the server, and it's intentional: Restic's content-defined chunk-level deduplication needs to see raw file content. Pointing it at the live directory gives real per-file dedup, real incremental snapshots, and a restore-as-directory-tree workflow that's strictly better than any server-side archive-export endpoint would deliver.
Backup
restic backup "$SMALT_DIR" --exclude "index/lance"
Excluding index/lance/ is the recommended default — the LanceDB store is rebuildable from the markdown in pages/. Including it would roughly double snapshot size with bytes that the indexer can regenerate post-restore.
You can run this against a live smalt-mcp (the server only holds the corpus mutex briefly, during the commit phase of a write). For a strictly point-in-time snapshot, stop the server first.
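If the backup runs from a scheduled Python job rather than a shell script, the same command can be assembled as an argument list. This is a thin sketch around the command shown above; `restic_backup_command` and `run_backup` are hypothetical helper names, and restic still needs its repository and password supplied in the usual way (for example via the `RESTIC_REPOSITORY` environment variable).

```python
import subprocess
from pathlib import Path
from typing import Union


def restic_backup_command(smalt_dir: Union[str, Path], exclude_index: bool = True) -> list[str]:
    """Build the restic argv for backing up a Smalt directory."""
    command = ["restic", "backup", str(smalt_dir)]
    if exclude_index:
        # The LanceDB index is rebuildable from the markdown, so skip it by default.
        command += ["--exclude", "index/lance"]
    return command


def run_backup(smalt_dir: Union[str, Path]) -> None:
    """Run the backup and raise CalledProcessError if restic exits non-zero."""
    subprocess.run(restic_backup_command(smalt_dir), check=True)
```

Passing the arguments as a list avoids shell quoting problems when `SMALT_DIR` contains spaces.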
Restore
# 1. Stop smalt-mcp (otherwise it could race the restore).
# 2. Restore from the latest snapshot to a staging dir.
restic restore latest --target /staging
# 3. Move the restored Smalt into place.
mv /staging/<path-restic-recorded>/Smalt "$SMALT_DIR"
# 4. Start smalt-mcp pointing at the restored SMALT_DIR.
SMALT_DIR="$SMALT_DIR" uv run python -m smalt_mcp # or your usual run mode
Then trigger an index rebuild via the MCP bootstrap tool. bootstrap is idempotent — it'll detect the restored markdown and rebuild the LanceDB index from it (since we excluded index/lance/ from the backup). A planned future tool (reindex_all) will be the cleaner explicit version of this for the restore use case; until it ships, bootstrap is the right call.
Remote Smalts (running on a host where Restic can't reach the filesystem)
Mount SMALT_DIR locally via SSHFS (or equivalent), then restic backup against the mount. Same per-file dedup as the local case, with one extra hop. If even SSHFS isn't possible (very restricted deployment), the prior approach of building a tar.gz server-side and piping to restic backup --stdin is technically possible but defeats Restic's dedup — every snapshot becomes one opaque binary blob. Not recommended.
Why no /admin/backup endpoint
Earlier iterations of smalt-mcp briefly shipped a GET /admin/backup streaming tar.gz endpoint. It was removed in v0.12.0 after we realized the Restic-native pattern is strictly better for the common case. The endpoint design (streaming tar.gz via stdlib tarfile, best-effort consistency, scope-filtered downloads) was sound; the question was whether to ship a half-good answer (opaque blob, zero dedup) or the right answer (Restic against the filesystem). We chose the latter.
Releasing
Tag-driven via the release workflow on push of a v* tag. Use the ParkviewLab/dev-tools helpers — they enforce the SSOT-tag-CI loop (pyproject.toml is the only place the version lives; CI verifies the pushed tag matches before publishing).
git bump patch # 0.1.5 → 0.1.6, committed
git release # annotated tag v0.1.6 from pyproject.toml
git push --follow-tags # CI fires
Don't have the helpers? Install once: git clone https://github.com/ParkviewLab/dev-tools.git ~/dev-tools && cd ~/dev-tools && ./install.sh.
License
MIT. See LICENSE.