Local semantic search over your browser bookmarks: on-device embeddings, no cloud.

🔖 mindmark

Your bookmarks, finally searchable.
Ask in natural language: mindmark remembers what you saved.

100% local · No cloud · No API keys · Nothing leaves your machine

✨ Features

| Command | What it does |
| --- | --- |
| mindmark index <file> | Parse an exported bookmarks HTML file, embed every bookmark locally, store vectors in SQLite |
| mindmark find "query" | Semantic search over titles, folders, domains, and URL slugs; returns top-K with similarity scores |
| mindmark open "query" | Search and open the best match in your default browser |
| mindmark stats | Show index size, model info, top domains, and top folders |

🔌 Works offline after the first run. Embeddings run on-device via fastembed (ONNX Runtime, ~130 MB one-time model download).


📋 Prerequisites

| Requirement | Details |
| --- | --- |
| Python 3.9+ | python.org/downloads. On Windows, check "Add Python to PATH" during setup |
| pip | Bundled with Python. Verify with pip --version or pip3 --version |
| Internet | Needed only once to download the embedding model (~130 MB). Everything after that is offline |

💡 Windows tip: Python PATH

If you installed Python from the Microsoft Store, python and pip are already on your PATH.
If you installed from python.org, make sure you checked "Add Python to PATH" during setup.


📦 Install

Recommended: pipx (isolated + globally on PATH)

pipx install mindmark
Don't have pipx?
pip install --user pipx && pipx ensurepath    # then restart your terminal

Or on macOS with Homebrew: brew install pipx

Alternative: pip with a virtual environment

macOS / Linux:

python3 -m venv .venv && source .venv/bin/activate
pip install mindmark

Windows (PowerShell):

python -m venv .venv; .venv\Scripts\Activate.ps1
pip install mindmark

Windows (Command Prompt):

python -m venv .venv && .venv\Scripts\activate.bat
pip install mindmark
Editable install for development
git clone https://github.com/sukanth/mindmark.git
cd mindmark
pip install -e ".[dev]"    # quoted so zsh does not treat the brackets as a glob

⚡ Quick Start

1️⃣ Export your bookmarks

| Browser | How |
| --- | --- |
| Edge | edge://favorites → ⋯ → Export favorites → save as HTML |
| Chrome | chrome://bookmarks → ⋮ → Export bookmarks → save as HTML |
| Firefox | Ctrl+Shift+O (Cmd+Shift+O on macOS) → Import and Backup → Export Bookmarks to HTML |
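
All three browsers export the Netscape bookmark format, in which folders appear as H3 headings followed by a nested DL list and links appear as A tags. To give a feel for what the indexer has to read, here is a minimal sketch that walks such a file with Python's standard html.parser and records each link with its folder path. The class name, function name, and sample data are illustrative assumptions, not mindmark's actual parser.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (title, url, folder_path) tuples from Netscape bookmark HTML."""

    def __init__(self):
        super().__init__()
        self.folders = []     # stack of currently open folder names
        self.pending = None   # folder name read from <H3>, waiting for its <DL>
        self.dl_stack = []    # for each open <DL>: did it open a folder?
        self.capture = None   # "h3" or "a" while collecting element text
        self.text = []
        self.href = None
        self.bookmarks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "h3":
            self.capture, self.text = "h3", []
        elif tag == "a":
            self.capture, self.text = "a", []
            self.href = dict(attrs).get("href")
        elif tag == "dl":
            opened_folder = self.pending is not None
            if opened_folder:
                self.folders.append(self.pending)
                self.pending = None
            self.dl_stack.append(opened_folder)

    def handle_data(self, data):
        if self.capture:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self.capture == "h3":
            self.pending = "".join(self.text).strip()
            self.capture = None
        elif tag == "a" and self.capture == "a":
            if self.href:
                title = "".join(self.text).strip()
                self.bookmarks.append((title, self.href, "/".join(self.folders)))
            self.capture = None
        elif tag == "dl" and self.dl_stack:
            if self.dl_stack.pop():
                self.folders.pop()


def parse_bookmarks(html_text):
    parser = BookmarkParser()
    parser.feed(html_text)
    return parser.bookmarks


# Made-up sample in the shape browsers export.
SAMPLE = """<!DOCTYPE NETSCAPE-Bookmark-file-1>
<TITLE>Bookmarks</TITLE>
<H1>Bookmarks</H1>
<DL><p>
    <DT><H3>Dev</H3>
    <DL><p>
        <DT><A HREF="https://docs.python.org/3/library/asyncio.html">asyncio docs</A>
        <DT><H3>Ops</H3>
        <DL><p>
            <DT><A HREF="https://kubernetes.io/docs/reference/kubectl/">kubectl reference</A>
        </DL><p>
    </DL><p>
    <DT><A HREF="https://example.com/">Example</A>
</DL><p>
"""

for title, url, folder in parse_bookmarks(SAMPLE):
    print(f"{folder or '(root)'}: {title} -> {url}")
```

The folder stack is what lets a later search filter on a path such as Dev/Ops rather than on a single folder name.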

2️⃣ Build the index

# macOS / Linux
mindmark index ~/Downloads/bookmarks.html

# Windows (PowerShell)
mindmark index "$env:USERPROFILE\Downloads\bookmarks.html"

First run downloads the embedding model (~130 MB) and caches it locally. Every run after that is instant and fully offline.

3️⃣ Search in natural language

mindmark find "python async tutorial"
mindmark find "react hooks best practices" -k 5
mindmark find "helm chart examples" --domain github.com
mindmark find "docker compose setup" --folder devops

4️⃣ Open a result directly

mindmark open "k8s cheat sheet"           # opens the best match
mindmark find "docker setup" --open 2     # opens result #2 from the list

💡 Tip: create a short alias

macOS / Linux: add to ~/.bashrc or ~/.zshrc:

alias mm='mindmark open'
mm "docker setup"

Windows: add to your PowerShell $PROFILE:

Set-Alias mm mindmark
mm open "docker setup"

5️⃣ JSON output for scripting

Pipe results into fzf, jq, Alfred, Raycast, PowerToys Run, or any tool that accepts JSON:

# macOS / Linux
mindmark find "istio service mesh" --json | jq '.[].url'

# Windows (PowerShell)
mindmark find "istio service mesh" --json | ConvertFrom-Json | ForEach-Object { $_.url }
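
The same JSON can be consumed from a Python script. The sketch below assumes only what the jq example implies: the output is a JSON array of objects that each carry a url key. The title and score fields in the sample payload, and the helper names extract_urls and find_urls, are illustrative assumptions.

```python
import json
import subprocess


def extract_urls(json_text):
    """Return the url field of each result in a JSON array of result objects."""
    results = json.loads(json_text)
    return [item["url"] for item in results if "url" in item]


def find_urls(query):
    """Run `mindmark find <query> --json` and return the result URLs.

    Requires mindmark on PATH and an existing index; not called in this sketch.
    """
    proc = subprocess.run(
        ["mindmark", "find", query, "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return extract_urls(proc.stdout)


# Hypothetical payload: only the "url" key is implied by the jq example above;
# "title" and "score" are made up for illustration.
sample = '[{"title": "Istio docs", "url": "https://istio.io/latest/docs/", "score": 0.83}]'
print(extract_urls(sample))
```

Keeping the parsing in its own function means it can be tested without mindmark installed.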

📖 Usage

Filters

Narrow down results without changing your query:

mindmark find "useful tools" --domain github.com     # only github.com results
mindmark find "useful tools" --folder work/kusto      # only bookmarks in matching folders
mindmark find "useful tools" -k 20                    # return top 20 instead of 10
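
For intuition, the filters can be thought of as predicates applied to ranked results. The sketch below is an assumption about the semantics, not mindmark's implementation: it treats --domain as a match on the host or any of its subdomains, treats --folder as a case-insensitive substring of the folder path, and applies the top-K cut after filtering. The function names and the result dictionaries are hypothetical.

```python
from urllib.parse import urlparse


def matches_domain(url, domain):
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def matches_folder(folder_path, needle):
    """True if the needle occurs anywhere in the folder path, ignoring case."""
    return needle.lower() in folder_path.lower()


def apply_filters(results, domain=None, folder=None, k=10):
    """Keep results that pass every given filter, then cut to the top k."""
    kept = [
        r for r in results
        if (domain is None or matches_domain(r["url"], domain))
        and (folder is None or matches_folder(r["folder"], folder))
    ]
    return kept[:k]


# Made-up results, already sorted by similarity.
ranked = [
    {"url": "https://github.com/helm/charts", "folder": "Work/Kusto"},
    {"url": "https://gist.github.com/x", "folder": "Personal"},
    {"url": "https://notgithub.com/y", "folder": "work/kusto/queries"},
]
print([r["url"] for r in apply_filters(ranked, domain="github.com")])
print([r["url"] for r in apply_filters(ranked, folder="work/kusto")])
```

Matching on the parsed host rather than on the raw URL string keeps a site like notgithub.com out of a github.com filter.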

Re-indexing

Just rerun mindmark index <file>. It clears and rebuilds the index. The model is cached, so re-indexing 800+ bookmarks takes only seconds.

Swap the embedding model

mindmark index bookmarks.html --model BAAI/bge-small-en-v1.5              # default, 384-dim
mindmark index bookmarks.html --model sentence-transformers/all-MiniLM-L6-v2
mindmark index bookmarks.html --model BAAI/bge-base-en-v1.5               # 768-dim, higher quality

Switching models triggers a full re-embed automatically. See the fastembed supported models list.


🧠 How It Works

Bookmarks HTML                                  "python async tutorial"
      │                                                  │
      ▼                                                  ▼
  ┌────────┐    ┌──────────┐    ┌──────────┐     ┌──────────┐
  │ Parse  │───▶│  Embed   │───▶│  Store   │     │  Embed   │
  │  HTML  │    │ (ONNX)   │    │ (SQLite) │◀────│  query   │
  └────────┘    └──────────┘    └──────────┘     └──────────┘
                                      │                │
                                      ▼                ▼
                                ┌──────────────────────────┐
                                │  Dot-product similarity  │
                                │   → top-K results        │
                                └──────────────────────────┘

  1. Parse: A stateful tokenizer reads the Netscape bookmarks HTML and extracts every link with its full folder path.
  2. Embed: Each bookmark becomes a rich text string (title | folder | domain | path) and is passed through a BGE/MiniLM ONNX model. Vectors are L2-normalized.
  3. Store: Vectors live as float32 blobs in a single SQLite file. For 800–10,000 bookmarks this is simpler than a vector DB and still sub-millisecond.
  4. Search: Encode the query, compute dot products against all vectors, return the top-K. Because the vectors are L2-normalized, each dot product equals the cosine similarity.
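
The store and search steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not mindmark's code: the table schema, the function names, and the exact " | " text layout are hypothetical, and hand-written 3-dimensional vectors stand in for real model embeddings (the default model produces 384 dimensions).

```python
import heapq
import math
import sqlite3
from array import array
from urllib.parse import urlparse


def bookmark_text(title, folder, url):
    """Join title, folder, domain, and URL path into one string to embed."""
    parts = urlparse(url)
    return " | ".join([title, folder, parts.netloc, parts.path])


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def to_blob(vec):
    """Pack floats as a float32 byte string suitable for a BLOB column."""
    return array("f", vec).tobytes()


def from_blob(blob):
    """Unpack a float32 byte string back into an array of floats."""
    vec = array("f")
    vec.frombytes(blob)
    return vec


def search(conn, query_vec, k=10):
    """Brute-force scan: dot product against every stored vector, keep top k."""
    q = l2_normalize(query_vec)
    scored = []
    for url, blob in conn.execute("SELECT url, vec FROM bookmarks"):
        score = sum(a * b for a, b in zip(q, from_blob(blob)))
        scored.append((score, url))
    return heapq.nlargest(k, scored)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmarks (url TEXT PRIMARY KEY, vec BLOB)")

# Toy vectors standing in for model output.
toy = {
    "https://docs.python.org/3/library/asyncio.html": [0.9, 0.1, 0.0],
    "https://react.dev/reference/react/hooks": [0.0, 1.0, 0.2],
    "https://helm.sh/docs/": [0.1, 0.0, 0.9],
}
for url, vec in toy.items():
    conn.execute(
        "INSERT INTO bookmarks VALUES (?, ?)", (url, to_blob(l2_normalize(vec)))
    )

top = search(conn, [1.0, 0.0, 0.0], k=2)
for score, url in top:
    print(f"{score:.3f}  {url}")
```

A full scan like this is linear in the number of bookmarks, which is why a plain SQLite table is enough at the scale of a personal bookmark collection.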

๐Ÿ—‚๏ธ Storage Layout

| What | macOS / Linux | Windows | Override |
| --- | --- | --- | --- |
| Index database | ~/.mindmark/index.db | %LOCALAPPDATA%\mindmark\index.db | --db flag or MINDMARK_DB env var |
| Home directory | ~/.mindmark/ | %LOCALAPPDATA%\mindmark\ | MINDMARK_HOME env var |
| Embedding model | ~/.cache/fastembed/ | %LOCALAPPDATA%\fastembed\ | Managed by fastembed |
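
To make the override columns concrete, here is a sketch of how a database path could be resolved from the table above. The precedence shown (the --db flag first, then MINDMARK_DB, then MINDMARK_HOME, then the platform default) is an assumption, since the table lists the overrides but not their order. The function names are hypothetical.

```python
import os
import sys
from pathlib import Path


def default_home(platform=sys.platform, env=os.environ):
    """Return the mindmark home directory, honoring MINDMARK_HOME if set."""
    if "MINDMARK_HOME" in env:
        return Path(env["MINDMARK_HOME"])
    if platform.startswith("win"):
        base = env.get("LOCALAPPDATA") or str(Path.home() / "AppData" / "Local")
        return Path(base) / "mindmark"
    return Path.home() / ".mindmark"


def resolve_db_path(db_flag=None, platform=sys.platform, env=os.environ):
    """Pick the index database path (assumed order: flag, env var, home dir)."""
    if db_flag:
        return Path(db_flag)
    if "MINDMARK_DB" in env:
        return Path(env["MINDMARK_DB"])
    return default_home(platform, env) / "index.db"


print(resolve_db_path(env={}, platform="linux"))
print(resolve_db_path(env={"MINDMARK_HOME": "/tmp/mm"}, platform="linux"))
```

Passing the platform and environment as parameters keeps the lookup easy to exercise without touching the real environment.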

๐Ÿ—‘๏ธ Uninstall

pipx uninstall mindmark    # if installed with pipx
pip uninstall mindmark      # if installed with pip
Remove stored data (optional)

The index and cached model are stored outside the package:

macOS / Linux:

rm -rf ~/.mindmark              # index database
rm -rf ~/.cache/fastembed        # cached embedding model (~130 MB)

Windows (PowerShell):

Remove-Item -Recurse "$env:LOCALAPPDATA\mindmark"     # index database
Remove-Item -Recurse "$env:LOCALAPPDATA\fastembed"     # cached embedding model

If you set a custom MINDMARK_HOME, remove that directory instead.


๐Ÿ› ๏ธ Development

Contributions are welcome! See CONTRIBUTING.md for full details.

git clone https://github.com/sukanth/mindmark.git
cd mindmark
pip install -e ".[dev]"    # quoted so zsh does not treat the brackets as a glob
pytest -q
Publishing to PyPI

First-time setup

  1. Create an account at pypi.org
  2. Generate an API token at pypi.org/manage/account/token/
  3. Install build tools: pip install build twine

Test on TestPyPI first (recommended)

python -m build
python -m twine upload --repository testpypi dist/*
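# TestPyPI may not host the dependencies, so let pip fall back to PyPI for them
pipx install --index-url https://test.pypi.org/simple/ --pip-args="--extra-index-url https://pypi.org/simple/" mindmark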

Publish to PyPI

python -m build
python -m twine upload dist/*

Use __token__ as the username when prompted.

Alternative distribution methods

GitHub release

python -m build
gh release create v0.1.0 dist/*
# Users install:
pipx install https://github.com/sukanth/mindmark/releases/download/v0.1.0/mindmark-0.1.0-py3-none-any.whl

Standalone executable (no Python required)

pip install pyinstaller
pyinstaller --onefile -n mindmark -p src src/mindmark/__main__.py
# Creates: dist/mindmark (macOS/Linux) or dist/mindmark.exe (Windows)

Docker

FROM python:3.11-slim
WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .
ENTRYPOINT ["mindmark"]
docker build -t mindmark .
docker run --rm -v $HOME/.mindmark:/root/.mindmark \
    -v $HOME/Downloads:/downloads mindmark \
    index /downloads/bookmarks.html

📄 License

MIT. See LICENSE.
