Local-first second-brain CLI for Obsidian vaults: hybrid search, typed knowledge graph, MCP server.
smolbren
Local-first second-brain CLI for Obsidian vaults.
Turns a folder of Markdown into a queryable knowledge base:
- Hybrid search — vector (sqlite-vec) + keyword (FTS5/BM25), fused with Reciprocal Rank Fusion
- Typed knowledge graph — self-wired from frontmatter relations, wikilinks, and regex patterns
- Embedding cache keyed by content hash — renaming or moving chunks costs zero embedding calls
- Watch mode — re-ingests on file changes with per-file debouncing
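The content-hash cache can be illustrated with a minimal sketch. It assumes a SHA-256 hash of the chunk text and an in-memory dict; `EmbeddingCache` and `fake_embed` are hypothetical names used only for illustration, not smolbren's actual code.

```python
import hashlib
from typing import Callable


class EmbeddingCache:
    """Cache vectors by (content hash, model) so file paths never matter."""

    def __init__(self, embed_fn: Callable[[str], list[float]], model: str) -> None:
        self._embed_fn = embed_fn
        self._model = model
        self._vectors: dict[tuple[str, str], list[float]] = {}
        self.calls = 0  # how many times the embedding function actually ran

    def get(self, text: str) -> list[float]:
        # Assumption: SHA-256 over the UTF-8 chunk text is the content hash.
        key = (hashlib.sha256(text.encode("utf-8")).hexdigest(), self._model)
        if key not in self._vectors:
            self.calls += 1
            self._vectors[key] = self._embed_fn(text)
        return self._vectors[key]


def fake_embed(text: str) -> list[float]:
    # Stand-in for a real embedding call; returns a tiny deterministic vector.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


cache = EmbeddingCache(fake_embed, model="nomic-embed-text")
chunk = "## On-call\nJane is on call this week."
first = cache.get(chunk)   # chunk seen for the first time
second = cache.get(chunk)  # same text after its file was moved or renamed
```

Because the key never includes a file path, the second lookup is served from the cache and `cache.calls` stays at 1.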
Install
Requires Python 3.12+ and a running Ollama instance with the embedding model pulled:
ollama pull nomic-embed-text
From PyPI (once published):
uv tool install smolbren
From source:
git clone https://github.com/junaidrahim/smolbren && cd smolbren
uv sync
uv run smolbren --help
Quickstart
cd /path/to/vault
smolbren init # writes .smolbren/config.toml + db
smolbren ingest # parse → chunk → upsert → embed
smolbren search "who's on call?"
smolbren graph neighbors people/jane --depth 2
smolbren stats
Run smolbren ingest --watch to keep the index live while you edit.
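Per-file debouncing in watch mode can be sketched as follows. This is an illustration under stated assumptions, not smolbren's implementation: `PerFileDebouncer`, the 0.5 s delay, and the explicit `now` argument (used here to keep the example deterministic) are all hypothetical.

```python
class PerFileDebouncer:
    """Collect change events and release each path only after it has been quiet."""

    def __init__(self, delay: float) -> None:
        self._delay = delay
        self._last_event: dict[str, float] = {}

    def touch(self, path: str, now: float) -> None:
        # Each new event for a path restarts that path's quiet period only.
        self._last_event[path] = now

    def due(self, now: float) -> list[str]:
        # Return (and forget) paths that have been quiet for at least `delay`.
        ready = sorted(p for p, t in self._last_event.items() if now - t >= self._delay)
        for path in ready:
            del self._last_event[path]
        return ready


debouncer = PerFileDebouncer(delay=0.5)
debouncer.touch("people/jane.md", now=0.0)
debouncer.touch("projects/atlas.md", now=0.1)
debouncer.touch("people/jane.md", now=0.3)  # rapid second save resets jane's timer

ready_early = debouncer.due(now=0.7)  # only atlas.md has been quiet long enough
ready_late = debouncer.due(now=1.0)   # jane.md is released once its own timer expires
```

Tracking one timestamp per path means a burst of saves to one note delays only that note's re-ingest, not the rest of the vault.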
Commands
| Command | What it does |
|---|---|
| `init` | Scaffold `.smolbren/` and write the default config |
| `ingest [--watch] [--no-embed]` | Parse, chunk, upsert, embed. `--watch` stays running with debounced re-ingest. |
| `embed` | Embed any chunks that don't yet have a vector (cache-aware) |
| `search QUERY [--mode hybrid\|vector\|keyword] [--top-k N]` | Semantic / keyword / hybrid search |
| `stats` | Page / chunk / edge counts and type distribution |
| `graph neighbors SLUG [--type T] [--depth N] [--direction out\|in\|both]` | BFS reachable neighbors |
| `graph path SRC DST` | Shortest path between two slugs |
| `graph stats [--top N]` | Node/edge counts, components, top-degree nodes |
Every command accepts --vault PATH (defaults to $SMOLBREN_VAULT, then cwd) and --json for machine-readable output.
How it works
- Ingest: Markdown is parsed with frontmatter, chunked by H2 with a token-window overlap, and stored in SQLite (WAL). Code fences are stripped before edge extraction so `[[fake]]` doesn't pollute the graph.
- Embed: Chunks without a vector are embedded via Ollama (`nomic-embed-text`, 768d, L2-normalized). The cache is keyed on `(content_hash, model)` so chunks that move between files don't re-hit the model.
- Search: Vector via sqlite-vec; keyword via FTS5 with BM25. Hybrid overfetches both branches, fuses with RRF (`score = weight / (k + rank)`), then applies a multiplicative backlink boost (`score × (1 + boost·log(1+backlinks))`) before slicing to top-k. The boost coefficient lives in config; set it to `0` to disable.
- Graph: Edges come from frontmatter relations (allow-listed types: `works_on`, `owns`, `member_of`, `reports_to`, `depends_on`, `attended`, `decided_in`, `references`, `mentions`), wikilinks (as `mentions`), and a small set of regex patterns (`X works at [[Y]]`, `X owns [[Y]]`, `attended [[Z]]`, `depends on [[W]]`, `decided in [[V]]`). The graph is loaded into a NetworkX `MultiDiGraph` and cached process-locally; the cache invalidates on any edge mutation via a DB-side version counter.
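The hybrid scoring described under Search can be sketched in a few lines. This is a minimal illustration, assuming 1-based ranks and a natural logarithm (neither is stated above); `hybrid_rank` and the sample document ids are hypothetical.

```python
import math


def hybrid_rank(vector_hits: list[str], keyword_hits: list[str],
                backlinks: dict[str, int], *, k: int = 60,
                weights: tuple[float, float] = (1.0, 1.0),
                boost: float = 0.15, top_k: int = 3) -> list[tuple[str, float]]:
    """Fuse two ranked lists with RRF, apply a backlink boost, keep top_k."""
    scores: dict[str, float] = {}
    for weight, hits in zip(weights, (vector_hits, keyword_hits)):
        for rank, doc_id in enumerate(hits, start=1):  # assumption: 1-based ranks
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)

    # Multiplicative boost: score * (1 + boost * log(1 + backlinks)).
    boosted = {
        doc_id: score * (1.0 + boost * math.log(1 + backlinks.get(doc_id, 0)))
        for doc_id, score in scores.items()
    }
    # Sort by score descending; break ties by id so the output is deterministic.
    ranked = sorted(boosted.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


vector_hits = ["oncall", "runbook", "jane"]
keyword_hits = ["oncall", "jane", "pager"]
backlinks = {"jane": 4, "oncall": 0}

results = hybrid_rank(vector_hits, keyword_hits, backlinks)
no_boost = hybrid_rank(vector_hits, keyword_hits, backlinks, boost=0.0)
```

In this sample, `oncall` wins on pure RRF because it is ranked first in both branches, but the four backlinks on `jane` lift it to the top once the boost is applied. Passing `boost=0.0` reduces the result to plain RRF.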
Config
.smolbren/config.toml:
[embeddings]
model = "nomic-embed-text"
ollama_url = "http://localhost:11434"
[chunking]
strategy = "h2"
max_chunk_tokens = 512
overlap_tokens = 50
[search]
rrf_k = 60
backlink_boost = 0.15
hybrid_weights = [1.0, 1.0]
[ignore]
patterns = ["**/.obsidian/**", "**/node_modules/**"]
Development
uv sync
uv run pytest
uv run ruff check
uv run mypy
Release process
Releases are fully automated. Every push to main is parsed for
Conventional Commits; if any
commit since the last tag is feat:, fix:, perf:, or contains
BREAKING CHANGE, python-semantic-release
bumps the version, commits the bump back to main (with [skip ci]),
tags it, creates a GitHub Release, and publishes the wheel + sdist to
PyPI via the pypi trusted-publisher environment.
Bump rules:
| Prefix | Bump |
|---|---|
| `feat:` | minor |
| `fix:`, `perf:` | patch |
| any commit body with `BREAKING CHANGE:` | major |
| `chore:`, `docs:`, `ci:`, `style:`, `test:`, `refactor:`, `build:` | none |
Non-conforming commit messages are ignored (no version bump). Reference
implementation: .github/workflows/publish.yaml and the
[tool.semantic_release] block in pyproject.toml.
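The bump rules can be approximated with a small classifier. This sketch covers only the rules listed above and is not python-semantic-release's own parser; `bump_for`, `release_bump`, and the optional `(scope)` handling are assumptions for illustration.

```python
import re

# Commit header such as "feat: ..." or "fix(search): ..." (scope is optional).
_HEADER = re.compile(r"^(?P<type>[a-z]+)(?:\([^)]*\))?: ")
_BUMP_BY_TYPE = {"feat": "minor", "fix": "patch", "perf": "patch"}
_ORDER = ["none", "patch", "minor", "major"]


def bump_for(message: str) -> str:
    """Return the bump level implied by a single commit message."""
    if "BREAKING CHANGE" in message:
        return "major"
    match = _HEADER.match(message)
    if match is None:
        return "none"  # non-conforming messages are ignored
    return _BUMP_BY_TYPE.get(match["type"], "none")


def release_bump(messages: list[str]) -> str:
    """Return the highest bump level across all commits since the last tag."""
    return max((bump_for(m) for m in messages), key=_ORDER.index, default="none")


commits = ["docs: fix typo", "fix(search): handle empty query", "feat: add graph path"]
level = release_bump(commits)
```

For the sample commits the result is a minor bump, since `feat:` outranks `fix:`, and `docs:` contributes nothing.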
File details
Details for the file smolbren-1.0.0.tar.gz.
File metadata
- Download URL: smolbren-1.0.0.tar.gz
- Size: 27.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7c47e240faec354476be5e64d47632dcbd2f914c3a5d9d4cfd845d2ed1a80dd |
| MD5 | 20c6dadd721e24580e17e54810a1bab3 |
| BLAKE2b-256 | 4e5d2fc78fc1cea1427312e8fc888c6af73fbe535a19e817bb471a1377d83c53 |
File details
Details for the file smolbren-1.0.0-py3-none-any.whl.
File metadata
- Download URL: smolbren-1.0.0-py3-none-any.whl
- Size: 32.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.11.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6848c3e2ced2134aa489783c545c209be9b571c0b53038b6878183b8c4c3233c |
| MD5 | 33f2987b793e92fd7d7264753a5192dc |
| BLAKE2b-256 | a97e677c0f255fa59ad7e2de4c5573b78b7fb93f068dc9be9eaa18f715c01070 |