Local semantic search over your browser bookmarks — on-device embeddings, no cloud.
mindmark
Your bookmarks, finally searchable. Ask in natural language; mindmark remembers what you saved.
100% local. No cloud, no API keys, nothing leaves your machine. Embeddings run on-device via fastembed (ONNX, ~130 MB one-time model download).
What it does
- mindmark index <file>: parse an exported Netscape bookmarks HTML file, embed every bookmark locally, and store vectors in SQLite.
- mindmark find "natural-language query": semantic search over titles, folder paths, domains, and URL slugs. Returns top-K with cosine-similarity scores.
- mindmark open "query": open the top result in your default browser.
- mindmark stats: show index size, model used, top domains, and top folders.
Works offline after the first index run (model cached locally by fastembed).
Install
Requires Python 3.9+.
Option 1 — pipx (recommended, isolated + globally on PATH)
pipx install mindmark
Option 2 — regular pip / venv
python -m venv .venv
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install .
Option 3 — editable install for development
pip install -e ".[dev]"  # quotes keep zsh from globbing the brackets
Quick start
1. Export your bookmarks
- Edge: edge://favorites → ⋯ → Export favorites → save HTML
- Chrome: chrome://bookmarks → ⋮ → Export bookmarks → save HTML
- Firefox: Ctrl+Shift+O → Import and Backup → Export Bookmarks to HTML
2. Build the index
mindmark index ~/Downloads/bookmarks.html
First run downloads the embedding model (~130 MB) to ~/.cache/fastembed (or %LOCALAPPDATA%\fastembed on Windows). Subsequent runs are offline.
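To give a feel for what the index step reads from the export, here is a minimal sketch of a Netscape bookmarks parser built on the standard library's html.parser. It is not mindmark's actual code: the class name BookmarkParser, the helper embed_text, and the record fields are hypothetical, and the text layout simply follows the format described under "How it works".

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class BookmarkParser(HTMLParser):
    """Collect title, URL, folder path and domain for every <A> tag."""

    def __init__(self):
        super().__init__()
        self.dl_stack = []      # one entry per open <DL>; None if it has no folder name
        self.bookmarks = []
        self._pending = None    # text of the most recent <H3>, waiting for its <DL>
        self._in_h3 = False
        self._href = None       # set while inside an <A> tag
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3, self._text = True, []
        elif tag == "a":
            self._href, self._text = dict(attrs).get("href") or "", []
        elif tag == "dl":
            # The <DL> that follows an <H3> holds that folder's contents.
            self.dl_stack.append(self._pending)
            self._pending = None

    def handle_endtag(self, tag):
        if tag == "h3":
            self._pending, self._in_h3 = "".join(self._text).strip(), False
        elif tag == "a" and self._href is not None:
            self.bookmarks.append({
                "title": "".join(self._text).strip(),
                "url": self._href,
                "folder": "/".join(name for name in self.dl_stack if name),
                "domain": urlparse(self._href).netloc,
            })
            self._href = None
        elif tag == "dl" and self.dl_stack:
            self.dl_stack.pop()

    def handle_data(self, data):
        if self._in_h3 or self._href is not None:
            self._text.append(data)


def embed_text(bm):
    """Build the string that would be embedded for one bookmark (assumed layout)."""
    words = [w for w in re.split(r"[^A-Za-z0-9]+", urlparse(bm["url"]).path) if w]
    return (f"{bm['title']} | folder: {bm['folder']} | "
            f"domain: {bm['domain']} | path: {' '.join(words)}")


# Hypothetical miniature export used only for illustration.
SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<DL><p>
  <DT><H3>Dev</H3>
  <DL><p>
    <DT><H3>Tools</H3>
    <DL><p>
      <DT><A HREF="https://github.com/docs/setup-guide">Setup guide</A>
    </DL><p>
  </DL><p>
  <DT><A HREF="https://example.com/">Example</A>
</DL><p>
"""

parser = BookmarkParser()
parser.feed(SAMPLE)
parser.close()
for bm in parser.bookmarks:
    print(embed_text(bm))
```

Tracking one stack entry per open <DL> keeps the folder path correct even when a list has no <H3> heading of its own, such as the top-level list.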
3. Search in natural language
mindmark find "python async tutorial"
mindmark find "react hooks best practices" -k 5
mindmark find "helm chart examples" --domain github.com
mindmark find "docker compose setup" --folder devops
4. Open a result directly
mindmark open "k8s cheat sheet" # opens the best match
mindmark find "docker setup" --open 2 # opens result #2 from the list
Tip: alias it to something even shorter.
alias mm='mindmark open'
mm "docker setup"
5. JSON for scripting / fzf / Alfred / Raycast
mindmark find "istio service mesh" --json | jq '.[].url'
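The same output can be consumed from Python. The sketch below assumes only what the jq example implies, namely that --json prints a list of objects with a url key. The score field and the sample payload are hypothetical stand-ins, not documented output.

```python
import json


def urls_from_results(json_text, min_score=None):
    """Return the url of each result in a `mindmark find --json` payload."""
    urls = []
    for item in json.loads(json_text):
        # "score" is an assumed field name; items without it are always kept.
        score = item.get("score")
        if min_score is not None and score is not None and score < min_score:
            continue
        urls.append(item["url"])
    return urls


# Hypothetical payload standing in for real CLI output.
sample = (
    '[{"url": "https://istio.io/latest/docs/", "score": 0.81},'
    ' {"url": "https://example.com/mesh", "score": 0.42}]'
)
print(urls_from_results(sample, min_score=0.5))
```

In a real script the payload would come from something like subprocess.run(["mindmark", "find", query, "--json"], capture_output=True, text=True, check=True).stdout.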
Re-indexing
Just rerun mindmark index <file>. It clears and rebuilds the bookmarks table. The model download is cached, so re-indexing 800 bookmarks takes seconds after the first time.
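A clear-and-rebuild step like this can be sketched with the standard sqlite3 module. The table schema and the function name rebuild_bookmarks below are assumptions for illustration, not mindmark's real schema.

```python
import sqlite3


def rebuild_bookmarks(conn, rows):
    """Replace the whole bookmarks table with `rows` in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "CREATE TABLE IF NOT EXISTS bookmarks ("
            "id INTEGER PRIMARY KEY, title TEXT, url TEXT, vector BLOB)"
        )
        conn.execute("DELETE FROM bookmarks")
        conn.executemany(
            "INSERT INTO bookmarks (title, url, vector) VALUES (?, ?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
rebuild_bookmarks(conn, [("A", "https://a.example/", b""), ("B", "https://b.example/", b"")])
rebuild_bookmarks(conn, [("C", "https://c.example/", b"")])  # old rows are gone
print(conn.execute("SELECT COUNT(*) FROM bookmarks").fetchone()[0])
```

Doing the delete and the inserts inside one transaction means a failed re-index leaves the previous index in place instead of an empty table.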
Storage layout
| What | Where | Notes |
|---|---|---|
| SQLite index | ~/.mindmark/index.db | override with --db or MINDMARK_DB |
| Alternate home dir | ~/.mindmark/ | override with MINDMARK_HOME |
| Embedding model | ~/.cache/fastembed/ | managed by fastembed |
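One plausible way to combine these overrides is sketched below. The precedence order (--db first, then MINDMARK_DB, then MINDMARK_HOME, then the default) is an assumption, as is the function name resolve_db_path. Check the CLI itself if the order matters to you.

```python
import os
from pathlib import Path


def resolve_db_path(cli_db=None, env=None):
    """Pick the index location; the precedence order here is assumed."""
    env = os.environ if env is None else env
    if cli_db:                                  # --db on the command line
        return Path(cli_db).expanduser()
    if env.get("MINDMARK_DB"):                  # explicit database file
        return Path(env["MINDMARK_DB"]).expanduser()
    home = env.get("MINDMARK_HOME")             # alternate home directory
    base = Path(home).expanduser() if home else Path.home() / ".mindmark"
    return base / "index.db"


print(resolve_db_path(env={"MINDMARK_HOME": "/data/mm"}))
```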
Filters
- --domain github.com: only results whose domain contains github.com
- --folder work/kusto: only results inside folder paths containing work/kusto
- -k 20: return top 20 instead of top 10
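The "contains" semantics of these filters amount to substring checks applied before the top-K cut. A minimal sketch follows, assuming case-insensitive matching and hypothetical record fields (domain, folder, score). None of these names come from mindmark itself.

```python
def matches_filters(bookmark, domain=None, folder=None):
    """True if the bookmark passes both optional substring filters."""
    if domain and domain.lower() not in bookmark["domain"].lower():
        return False
    if folder and folder.lower() not in bookmark["folder"].lower():
        return False
    return True


def filter_top_k(scored, k=10, domain=None, folder=None):
    """Keep matching bookmarks, best score first, at most k of them."""
    kept = [b for b in scored if matches_filters(b, domain, folder)]
    kept.sort(key=lambda b: b["score"], reverse=True)
    return kept[:k]


# Hypothetical scored results for illustration.
results = [
    {"title": "Helm charts", "domain": "github.com", "folder": "Work/Kusto", "score": 0.7},
    {"title": "Helm docs", "domain": "helm.sh", "folder": "DevOps", "score": 0.9},
    {"title": "Gist", "domain": "gist.github.com", "folder": "Misc", "score": 0.4},
]
print([b["title"] for b in filter_top_k(results, k=20, domain="github.com")])
```

Because the check is a substring match, github.com also matches gist.github.com.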
Swap the embedding model
mindmark index bookmarks.html --model BAAI/bge-small-en-v1.5 # default, 384-dim
mindmark index bookmarks.html --model sentence-transformers/all-MiniLM-L6-v2
mindmark index bookmarks.html --model BAAI/bge-base-en-v1.5 # 768-dim, higher quality, slower
Switching models triggers a full re-embed automatically. See the fastembed supported models list.
How it works
- Parse: a small stateful tokenizer reads the Netscape bookmarks HTML and extracts every <A> with its ancestor <H3> folder stack, so each bookmark knows its full folder path.
- Embed: each bookmark becomes the string "title | folder: Dev/Tools | domain: github.com | path: docs setup guide" and is passed through a BGE/MiniLM ONNX model. Vectors are L2-normalized so cosine similarity = dot product.
- Store: vectors live as float32 BLOBs in a single SQLite file. For 800–10,000 bookmarks this is dramatically simpler than a vector DB and still sub-millisecond.
- Search: encode the query, take the dot product against all vectors, return the top-K.
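The store and search steps can be illustrated with the standard library alone. This is a sketch under stated assumptions rather than mindmark's implementation: the table layout and function names are hypothetical, the 3-dimensional toy vectors stand in for real 384-dimensional embeddings, and a real implementation would more likely use NumPy for the dot products.

```python
import sqlite3
from array import array
from math import sqrt


def normalize(vec):
    """L2-normalize so that a dot product equals cosine similarity."""
    norm = sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def to_blob(vec):
    # Typecode "f" is a C float (4 bytes on common platforms), i.e. float32.
    return array("f", vec).tobytes()


def from_blob(blob):
    out = array("f")
    out.frombytes(blob)
    return out


def search(conn, query_vec, k=10):
    """Brute-force scan: score every stored vector, return the top k."""
    q = normalize(query_vec)
    scored = []
    for title, blob in conn.execute("SELECT title, vector FROM bookmarks"):
        score = sum(a * b for a, b in zip(q, from_blob(blob)))
        scored.append((score, title))
    scored.sort(reverse=True)
    return scored[:k]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (title TEXT, vector BLOB)")
toy = {  # made-up vectors, not real embeddings
    "docker docs": [1.0, 0.1, 0.0],
    "react hooks": [0.0, 1.0, 0.2],
    "helm charts": [0.9, 0.0, 0.4],
}
conn.executemany(
    "INSERT INTO bookmarks VALUES (?, ?)",
    [(title, to_blob(normalize(vec))) for title, vec in toy.items()],
)
for score, title in search(conn, [1.0, 0.0, 0.0], k=2):
    print(f"{score:.3f}  {title}")
```

Normalizing once at index time is what lets the search loop skip the division by vector norms that cosine similarity would otherwise need.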
Publishing to PyPI
First-time setup
- Create an account at pypi.org
- Generate an API token at pypi.org/manage/account/token/
- Install the build tools:
pip install build twine
Test on TestPyPI first (recommended)
python -m build
python -m twine upload --repository testpypi dist/*
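# verify it works (TestPyPI may not host dependencies such as fastembed, so fall back to PyPI for them):
pipx install --index-url https://test.pypi.org/simple/ --pip-args="--extra-index-url https://pypi.org/simple/" mindmark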
Publish to PyPI
python -m build
python -m twine upload dist/*
Twine will prompt for your API token (use __token__ as the username). After uploading, anyone can install with:
pipx install mindmark
Alternative distribution methods
GitHub release + pipx
python -m build
gh release create v0.1.0 dist/*
# Users install:
pipx install https://github.com/sukanth/mindmark/releases/download/v0.1.0/mindmark-0.1.0-py3-none-any.whl
Standalone executable (no Python required)
pip install pyinstaller
pyinstaller --onefile -n mindmark -p src src/mindmark/__main__.py
# dist/mindmark (macOS/Linux) or dist/mindmark.exe (Windows)
Ship the binary in a GitHub release. First launch still downloads the ONNX model (~130 MB).
Docker
FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
ENTRYPOINT ["mindmark"]
docker build -t mindmark .
docker run --rm -v $HOME/.mindmark:/root/.mindmark \
-v $HOME/Downloads:/downloads mindmark \
index /downloads/bookmarks.html
Tests
pip install -e ".[dev]"
pytest -q
License
MIT