
Local semantic search over your browser bookmarks — on-device embeddings, no cloud.


mindmark

Your bookmarks, finally searchable. Ask in natural language; mindmark remembers what you saved.

100% local. No cloud, no API keys, nothing leaves your machine. Embeddings run on-device via fastembed (ONNX, ~130 MB one-time model download).



What it does

  • mindmark index <file> — parse an exported Netscape bookmarks HTML file, embed every bookmark locally, and store vectors in SQLite.
  • mindmark find "natural-language query" — semantic search over titles, folder paths, domains, and URL slugs. Returns top-K with cosine-similarity scores.
  • mindmark open "query" — open the top result in your default browser.
  • mindmark stats — show index size, model used, top domains, and top folders.

Works offline after the first index run (model cached locally by fastembed).


Install

Requires Python 3.9+.

Option 1 — pipx (recommended, isolated + globally on PATH)

pipx install mindmark

Option 2 — regular pip / venv

python -m venv .venv
# Windows: .venv\Scripts\activate
# macOS/Linux: source .venv/bin/activate
pip install mindmark

Option 3 — editable install for development

pip install -e ".[dev]"

Quick start

1. Export your bookmarks

  • Edge: edge://favorites → Export favorites → save HTML
  • Chrome: chrome://bookmarks → Export bookmarks → save HTML
  • Firefox: Ctrl+Shift+O → Import and Backup → Export Bookmarks to HTML

2. Build the index

mindmark index ~/Downloads/bookmarks.html

First run downloads the embedding model (~130 MB) to ~/.cache/fastembed (or %LOCALAPPDATA%\fastembed on Windows). Subsequent runs are offline.

3. Search in natural language


mindmark find "python async tutorial"
mindmark find "react hooks best practices" -k 5
mindmark find "helm chart examples" --domain github.com
mindmark find "docker compose setup" --folder devops

4. Open a result directly

mindmark open "k8s cheat sheet"           # opens the best match
mindmark find "docker setup" --open 2     # opens result #2 from the list

Tip: alias it to something even shorter.

alias mm='mindmark open'
mm "docker setup"

5. JSON for scripting / fzf / Alfred / Raycast

mindmark find "istio service mesh" --json | jq '.[].url'

Re-indexing

Just rerun mindmark index <file>. It clears and rebuilds the bookmarks table. The model download is cached, so re-indexing 800 bookmarks takes seconds after the first time.
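The clear-and-rebuild step can be sketched as a single SQLite transaction, so a failed re-index never leaves a half-empty table. This is illustrative only: `rebuild_index`, the table schema, and the `embed` callable are assumptions, not mindmark's actual code.

```python
import sqlite3

def rebuild_index(db_path, bookmarks, embed):
    """Clear and repopulate the bookmarks table in one transaction.

    `bookmarks` yields (title, url, folder) tuples and `embed` maps a
    text string to a float32 NumPy vector (both hypothetical names).
    """
    con = sqlite3.connect(db_path)
    try:
        with con:  # commits on success, rolls back on any error
            con.execute(
                "CREATE TABLE IF NOT EXISTS bookmarks "
                "(title TEXT, url TEXT, folder TEXT, vec BLOB)"
            )
            con.execute("DELETE FROM bookmarks")  # drop the old index
            for title, url, folder in bookmarks:
                vec = embed(f"{title} | folder: {folder}")
                con.execute(
                    "INSERT INTO bookmarks VALUES (?, ?, ?, ?)",
                    (title, url, folder, vec.tobytes()),
                )
    finally:
        con.close()
```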

Storage layout

What             Where                  Notes
SQLite index     ~/.mindmark/index.db   override with --db or MINDMARK_DB
Home directory   ~/.mindmark/           override with MINDMARK_HOME
Embedding model  ~/.cache/fastembed/    managed by fastembed

Filters

  • --domain github.com — only results whose domain contains github.com
  • --folder work/kusto — only results inside folder paths containing work/kusto
  • -k 20 — return top 20 instead of top 10
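Both filters behave as substring matches, which can be sketched as a post-filter over result rows. The `apply_filters` helper and the row shape are hypothetical, invented for illustration:

```python
from urllib.parse import urlparse

def apply_filters(rows, domain=None, folder=None):
    """Keep rows whose URL domain / folder path contains the given
    substring, mirroring the --domain and --folder flags (sketch)."""
    out = []
    for row in rows:  # assumed shape: dict with "url" and "folder" keys
        if domain and domain not in urlparse(row["url"]).netloc:
            continue
        if folder and folder.lower() not in row["folder"].lower():
            continue
        out.append(row)
    return out
```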

Swap the embedding model

mindmark index bookmarks.html --model BAAI/bge-small-en-v1.5       # default, 384-dim
mindmark index bookmarks.html --model sentence-transformers/all-MiniLM-L6-v2
mindmark index bookmarks.html --model BAAI/bge-base-en-v1.5        # 768-dim, higher quality, slower

Switching models triggers a full re-embed automatically. See the fastembed supported models list.
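One way such automatic re-embedding can be detected is to record the model name alongside the index and compare it on each run. This is a sketch under stated assumptions: the `meta` table and the `model_changed` helper are guesses at an implementation, not mindmark's actual schema.

```python
import sqlite3

def model_changed(con, model_name):
    """Return True when the index was built with a different embedding
    model, meaning every stored vector must be re-embedded (sketch)."""
    con.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT)"
    )
    row = con.execute("SELECT value FROM meta WHERE key = 'model'").fetchone()
    # remember the model used for this run (SQLite UPSERT, 3.24+)
    con.execute(
        "INSERT INTO meta VALUES ('model', ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value",
        (model_name,),
    )
    return row is not None and row[0] != model_name
```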


How it works

  1. Parse — a small stateful tokenizer reads the Netscape bookmarks HTML and extracts every <A> with its ancestor <H3> folder stack, so each bookmark knows its full folder path.
  2. Embed — each bookmark becomes the string title | folder: Dev/Tools | domain: github.com | path: docs setup guide and is passed through a BGE/MiniLM ONNX model. Vectors are L2-normalized so cosine similarity = dot product.
  3. Store — vectors live as float32 BLOBs in a single SQLite file. For 800–10,000 bookmarks this is dramatically simpler than a vector DB and still sub-millisecond.
  4. Search — encode the query, take the dot product against all vectors, return the top-K.
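Step 1 can be sketched with the standard library's HTMLParser: push a folder name for each <H3>, pop when its enclosing <DL> closes, and record each <A> with the current stack. This is illustrative only; mindmark's actual tokenizer may differ.

```python
from html.parser import HTMLParser

class BookmarkParser(HTMLParser):
    """Walk a Netscape bookmarks file, recording each <A> together
    with the names of its ancestor <H3> folders (minimal sketch)."""

    def __init__(self):
        super().__init__()
        self.stack = []      # ancestor folder names
        self.bookmarks = []  # (folder_path, url, title)
        self._mode = None    # "h3" or "a" while inside that tag
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._mode, self._text = "h3", []
        elif tag == "a":
            self._mode, self._text = "a", []
            self._href = dict(attrs).get("href", "")

    def handle_data(self, data):
        if self._mode:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._mode == "h3":
            self.stack.append("".join(self._text).strip())
            self._mode = None
        elif tag == "a" and self._mode == "a":
            self.bookmarks.append(
                ("/".join(self.stack), self._href, "".join(self._text).strip())
            )
            self._mode = None
        elif tag == "dl" and self.stack:
            self.stack.pop()  # a folder's contents just closed
```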
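Because the stored vectors are L2-normalized (step 2), step 4's ranking reduces to one matrix-vector product. A minimal NumPy sketch, with `top_k` and its signature invented for illustration; the (n, d) float32 matrix could be rebuilt from the SQLite BLOBs via np.frombuffer(blob, dtype=np.float32):

```python
import numpy as np

def top_k(query_vec, vectors, k=10):
    """Rank stored unit-norm embeddings against a query vector.

    `vectors` is an (n, d) float32 matrix of L2-normalized embeddings,
    so cosine similarity equals a plain dot product.
    """
    q = query_vec / np.linalg.norm(query_vec)  # normalize the query too
    scores = vectors @ q                       # cosine == dot product
    order = np.argsort(-scores)[:k]            # best matches first
    return [(int(i), float(scores[i])) for i in order]
```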

Publishing to PyPI

First-time setup

  1. Create an account at pypi.org
  2. Generate an API token at pypi.org/manage/account/token/
  3. Install the build tools:
pip install build twine

Test on TestPyPI first (recommended)

python -m build
python -m twine upload --repository testpypi dist/*
# verify it works:
pipx install --index-url https://test.pypi.org/simple/ mindmark

Publish to PyPI

python -m build
python -m twine upload dist/*

Twine will prompt for your API token (use __token__ as the username). After uploading, anyone can install with:

pipx install mindmark

Alternative distribution methods

GitHub release + pipx

python -m build
gh release create v0.1.0 dist/*

# Users install:
pipx install https://github.com/sukanth/mindmark/releases/download/v0.1.0/mindmark-0.1.0-py3-none-any.whl

Standalone executable (no Python required)

pip install pyinstaller
pyinstaller --onefile -n mindmark -p src src/mindmark/__main__.py
# dist/mindmark (macOS/Linux) or dist/mindmark.exe (Windows)

Ship the binary in a GitHub release. First launch still downloads the ONNX model (~130 MB).

Docker

FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
ENTRYPOINT ["mindmark"]

docker build -t mindmark .
docker run --rm -v $HOME/.mindmark:/root/.mindmark \
    -v $HOME/Downloads:/downloads mindmark \
    index /downloads/bookmarks.html

Contributing

Contributions are welcome! See CONTRIBUTING.md for setup instructions and guidelines.

Tests

pip install -e ".[dev]"
pytest -q

License

MIT — see LICENSE.
