Hybrid semantic search CLI + daemon for markdown vaults. Four-channel RRF fusion (BM25 + vector + knowledge graph + title), native CJK support, fully local.
SeekLink
Hybrid semantic search for your markdown vault.
SeekLink searches your personal knowledge base using four channels in parallel — keyword matching, semantic similarity, knowledge graph, and title/alias lookup — then fuses the results for high-recall, high-precision retrieval. Optional cross-encoder reranking via MLX gives an extra precision boost on Apple Silicon.
Built for people who take notes seriously and want an AI that understands their knowledge structure, not just their text.
What it does
You: seeklink search "agent memory systems" --vault ~/notes
→ 8 related notes across topics, ranked by relevance
You: seeklink search "记忆保持力" --vault ~/notes --title-weight 0.5
→ surfaces raw log entries alongside polished articles
You: seeklink daemon --vault ~/notes
→ resident mode: first search ~2s, every search after ~10ms
Architecture
Query: "agent memory systems"
  │
  ├── BM25 (FTS5 + jieba) ──── keyword match ──────── weight 1.0
  ├── Vector (jina-v2-zh) ──── semantic similarity ── weight 1.0
  ├── Indegree ─────────────── well-linked = quality ─ weight 0.3
  └── Title/Alias (FTS5) ──── exact name match ────── weight 1.5
        │
        └── RRF Fusion → top candidates
              │
              └── [optional] Qwen3-Reranker-0.6B (MLX) → precision boost
                    │
                    └── ranked results
Four-channel Reciprocal Rank Fusion, with optional cross-encoder reranking for Apple Silicon. Everything runs locally — no API keys, no cloud.
Requirements
- Python 3.11+
- ~330 MB disk for the embedding model (downloaded on first run)
- ~700 MB disk for the reranker model (if enabled; Apple Silicon recommended)
Install
uv tool install seeklink
# or
pip install seeklink
Quick start
# Search your vault
seeklink search "machine learning" --vault /path/to/vault
# Check index health
seeklink status --vault /path/to/vault
# Index a new or changed file
seeklink index path/to/note.md --vault /path/to/vault
# Full vault rebuild
seeklink index --vault /path/to/vault
# Start the resident daemon (keeps models in memory for fast queries)
seeklink daemon --vault /path/to/vault
CLI reference
seeklink search
seeklink search "query" --vault PATH [options]
Options:
  --top-k N          Number of results (default: 10)
  --tags TAG [TAG]   Filter by tags (AND semantics)
  --folder PREFIX    Filter by folder (e.g. "notes/")
  --title-weight F   Override title channel weight (default: 1.5).
                     Raise toward 3.0 for "find the article" queries;
                     lower toward 0.5 for "surface raw moments" queries.
seeklink daemon
Starts a Unix-socket daemon that keeps the embedding model (and reranker, if enabled) resident in memory. First query after startup takes ~2s (model warmup); subsequent queries return in ~10ms without reranker or ~2s with reranker.
The daemon auto-spawns when needed, so you don't have to start it manually. It never exits on its own; stop it with the kill command or by restarting your machine.
seeklink daemon --vault PATH
seeklink index
seeklink index [PATH] --vault VAULT
Without PATH: full vault re-index (detects unchanged files via content hash).
With PATH: index a single file.
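Content-hash change detection can be sketched roughly as follows. This is an illustration, not SeekLink's actual code: the choice of SHA-256, the function names, and the `stored` mapping (standing in for whatever the index records) are all assumptions.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes (algorithm is assumed)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_reindex(vault: Path, stored: dict[str, str]) -> list[str]:
    """List markdown files whose current hash differs from the stored one.

    `stored` maps vault-relative paths to previously recorded hashes;
    it is a hypothetical stand-in for the real index.
    """
    changed = []
    for path in sorted(vault.rglob("*.md")):
        rel = path.relative_to(vault).as_posix()
        # New files have no stored hash, so they are reported as changed too.
        if stored.get(rel) != content_hash(path):
            changed.append(rel)
    return changed
```

Hashing content rather than comparing timestamps means a file that was touched but not edited is skipped during a full re-index.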
seeklink status
seeklink status --vault PATH
Shows index stats and freshness warnings. If files have changed since last index, prints a warning to stderr.
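A freshness check of this kind might look like the sketch below. It assumes the index records one modification time per file; the `indexed` mapping, the function name, and the report keys are hypothetical and not taken from SeekLink's source.

```python
from pathlib import Path


def freshness_report(vault: Path, indexed: dict[str, float]) -> dict[str, list[str]]:
    """Compare on-disk mtimes with the mtimes recorded at index time.

    `indexed` maps vault-relative paths to recorded mtimes (assumed shape).
    Scanning disk against the index finds new and stale files; scanning
    the index against disk finds deleted ones.
    """
    on_disk = {
        p.relative_to(vault).as_posix(): p.stat().st_mtime
        for p in vault.rglob("*.md")
    }
    return {
        "new": sorted(set(on_disk) - set(indexed)),
        "stale": sorted(
            f for f in on_disk if f in indexed and on_disk[f] > indexed[f]
        ),
        "deleted": sorted(set(indexed) - set(on_disk)),
    }
```

A caller could print a warning to stderr whenever any of the three lists is non-empty.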
How search works
SeekLink runs four search channels in parallel and merges results with Reciprocal Rank Fusion:
- BM25 (FTS5 + jieba): keyword match on chunk content. Handles CJK natively via jieba tokenization.
- Vector (jina-embeddings-v2-base-zh): semantic similarity. Finds conceptually related notes even when they use different words or languages.
- Indegree: notes that many other notes link to rank higher — a lightweight quality signal from your knowledge graph.
- Title/Alias (FTS5): matches against note titles and aliases frontmatter. Weight 1.5 gives a modest boost without overwhelming content matches.
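The fusion step can be sketched as weighted Reciprocal Rank Fusion. This is a minimal illustration rather than SeekLink's implementation: the function name, the smoothing constant k=60 (a common RRF choice), and the toy rankings are assumptions. Only the channel weights come from the Architecture section.

```python
from collections import defaultdict

# Channel weights as listed in the Architecture section.
WEIGHTS = {"bm25": 1.0, "vector": 1.0, "indegree": 0.3, "title": 1.5}


def weighted_rrf(rankings, weights=WEIGHTS, k=60):
    """Fuse per-channel rankings (best first) into one ranked list.

    k=60 is an assumed smoothing constant, not a documented SeekLink value.
    """
    scores = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        w = weights[channel]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each channel contributes weight / (k + rank) for a note it returned.
            scores[doc_id] += w / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy rankings (hypothetical note names).
rankings = {
    "bm25": ["a.md", "b.md", "c.md"],
    "vector": ["b.md", "c.md", "a.md"],
    "indegree": ["c.md", "a.md", "b.md"],
    "title": ["b.md"],
}
fused = weighted_rrf(rankings)  # b.md first: it is the only title match
```

Because RRF uses ranks rather than raw scores, channels with very different score scales (BM25 scores, cosine similarities, link counts) can be combined without normalization.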
Why title weight is 1.5 (not higher)
Many personal knowledge bases contain a mix of titled articles (permanent notes, literature reviews) and untitled process notes (daily logs, journal entries, quick captures). A high title weight systematically buries untitled content — even when it's the most relevant result for the query. The default of 1.5 keeps title matching useful for precise [[alias]] lookups while letting content-based matches compete on their own merits. Override with --title-weight per query if needed.
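A small worked example shows why the weight matters. It assumes an RRF constant of k=60 and hypothetical ranks: an untitled log entry that tops both content channels, and a titled article that only appears in the title channel.

```python
def rrf_score(ranks, weights, k=60):
    """Sum weight / (k + rank) over the channels where a note appears."""
    return sum(weights[ch] / (k + r) for ch, r in ranks.items())


# Hypothetical ranks for two notes.
log_ranks = {"bm25": 1, "vector": 1}   # untitled log entry, best content match
article_ranks = {"title": 1}           # titled article, title match only


def log_wins(title_weight):
    """True if the log entry outranks the article at this title weight."""
    weights = {"bm25": 1.0, "vector": 1.0, "title": title_weight}
    return rrf_score(log_ranks, weights) > rrf_score(article_ranks, weights)


at_default = log_wins(1.5)  # log entry ranks first
at_old = log_wins(3.0)      # article ranks first
```

Under these assumptions the log entry scores 2/61 ≈ 0.033, while the article scores 1.5/61 ≈ 0.025 at the default weight and 3.0/61 ≈ 0.049 at the old one, so the ordering flips between the two settings.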
Optional: cross-encoder reranking
When enabled (default on Apple Silicon), the top-20 RRF candidates are re-scored by Qwen3-Reranker-0.6B running on MLX (Metal GPU). This reads each (query, passage) pair with full cross-attention — more accurate than vector similarity alone, at the cost of ~1-2s per query.
Disable with: export SEEKLINK_RERANKER_MODEL=""
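The reranking stage can be pictured as re-sorting the head of the fused list with a pairwise scorer. The sketch below uses a toy word-overlap scorer in place of the real cross-encoder, and it assumes that candidates beyond the top 20 keep their fused order. Neither the function names nor that tail behavior are confirmed by SeekLink's documentation.

```python
from typing import Callable, Sequence


def rerank(
    query: str,
    candidates: Sequence[tuple[str, str]],
    score_pair: Callable[[str, str], float],
    top_n: int = 20,
) -> list[str]:
    """Re-score the first `top_n` (note_id, passage) pairs and re-sort them.

    Assumption: candidates past `top_n` follow in their original fused order.
    """
    head, tail = candidates[:top_n], candidates[top_n:]
    scored = [(score_pair(query, passage), note_id) for note_id, passage in head]
    # Stable sort by descending score, so ties keep their fused order.
    scored.sort(key=lambda pair: -pair[0])
    return [note_id for _, note_id in scored] + [note_id for note_id, _ in tail]


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: share of query words found in the passage."""
    words = query.lower().split()
    return sum(w in passage.lower() for w in words) / max(len(words), 1)
```

A real cross-encoder would replace `overlap_score` with a model call that reads the query and passage together, which is where the extra per-query latency comes from.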
Frontmatter
SeekLink works with any markdown file — no special formatting required.
If your notes have YAML frontmatter, SeekLink uses it for extra features:
---
tags: [ai, machine-learning]
aliases: [ML, Machine Learning]
---
- Tags enable filtered search: seeklink search "query" --tags ai
- Aliases are searchable and used for wikilink resolution: [[ML]] resolves to the note with that alias
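The two frontmatter features can be sketched over already-parsed metadata. The note paths, the dictionary layout, the exact-match comparison, and the order of resolution (file name first, then alias) are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical parsed frontmatter, keyed by vault-relative path.
notes = {
    "notes/machine-learning.md": {
        "tags": ["ai", "machine-learning"],
        "aliases": ["ML", "Machine Learning"],
    },
    "notes/cooking.md": {"tags": ["food"], "aliases": []},
}


def filter_by_tags(notes, required):
    """Keep notes that carry every required tag (AND semantics)."""
    return [p for p, meta in notes.items() if set(required) <= set(meta["tags"])]


def resolve_wikilink(notes, target):
    """Resolve [[target]] by file name, then by alias; None if nothing matches."""
    for path in notes:
        if PurePosixPath(path).stem == target:
            return path
    for path, meta in notes.items():
        if target in meta["aliases"]:
            return path
    return None
```

With this data, `resolve_wikilink(notes, "ML")` returns the machine-learning note, and `filter_by_tags(notes, ["ai", "food"])` returns nothing because no note has both tags.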
How it stores data
Everything lives in .seeklink/seeklink.db inside your vault — a single SQLite database with:
- FTS5 full-text index (jieba-tokenized for CJK)
- sqlite-vec for 768-dim vector similarity search
- A wikilink graph (parsed from [[links]] in your notes)
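Building the wikilink graph and the indegree signal could look roughly like this. The regular expression, the handling of `|label` and `#heading` suffixes, and the choice to count each linking note once and ignore self-links are assumptions, not SeekLink's documented behavior.

```python
import re
from collections import Counter

# Matches [[target]], [[target|label]] and [[target#heading]]; captures the target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(text: str) -> list[str]:
    """Return link targets in order of appearance."""
    return [target.strip() for target in WIKILINK.findall(text)]


def indegree(vault_texts: dict[str, str]) -> Counter:
    """Count how many distinct notes link to each target.

    `vault_texts` maps note names to their markdown text (hypothetical shape).
    """
    counts = Counter()
    for source, text in vault_texts.items():
        for target in set(extract_links(text)):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts
```

Ranking notes by these counts, highest first, gives the kind of ordering the Indegree channel feeds into fusion.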
Notes are chunked (~400 tokens), embedded with jina-embeddings-v2-base-zh, and indexed incrementally. Delete .seeklink/ to rebuild from scratch.
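Chunking to a token budget can be approximated as below. This sketch counts whitespace-separated words as tokens, which is only a rough proxy: the real tokenizer will count differently, and whitespace splitting does not segment CJK text. The function name and the paragraph-packing strategy are assumptions.

```python
def chunk_text(text: str, max_tokens: int = 400) -> list[str]:
    """Greedily pack paragraphs into chunks of at most `max_tokens` words."""
    chunks, current = [], []
    for paragraph in text.split("\n\n"):
        words = paragraph.split()
        # Flush the current chunk if this paragraph would overflow it.
        if current and len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
        # Split a paragraph that is itself longer than the budget.
        while len(current) > max_tokens:
            chunks.append(" ".join(current[:max_tokens]))
            current = current[max_tokens:]
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each resulting chunk would then be embedded and stored as its own row, so a search hit can point at the relevant passage rather than the whole note.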
Configuration
| Variable | Default | Description |
|---|---|---|
| SEEKLINK_VAULT | . | Path to vault root |
| SEEKLINK_EMBEDDER_MODEL | jinaai/jina-embeddings-v2-base-zh | Embedding model (fastembed-supported) |
| SEEKLINK_RERANKER_MODEL | mlx-community/Qwen3-Reranker-0.6B-mxfp8 | Reranker model (set to "" to disable) |
What changed in v0.2
- CLI-first: MCP server removed. All interaction via seeklink search/index/status/daemon.
- Daemon mode: Unix-socket resident server with auto-spawn. Models stay loaded for fast queries.
- Reranker: Qwen3-Reranker-0.6B via MLX on Apple Silicon. Optional, default enabled.
- Freshness check: bidirectional mtime scan replaces the file watcher. Warns on stale/new/deleted files.
- Title weight 1.5: down from 3.0, so log entries and journal notes compete fairly with titled permanent notes.
- Leaner deps: mcp and watchfiles removed. 4 runtime dependencies instead of 6.
Contributing
git clone https://github.com/simonsysun/seeklink
cd seeklink
uv sync --dev
uv run python -m pytest tests/ -q
License
MIT