Private semantic search engine for your local files

DeskSearch

Search your files by meaning, not just keywords. 100% local, zero cloud.

What is DeskSearch?

A private search engine for your local files. It combines keyword matching (BM25) with semantic search (dense vectors) so you can find files by what they mean, not just what they say — and nothing ever leaves your machine.

Install

Desktop App (recommended)

Download the latest release for your platform:

  • macOS: .dmg (Apple Silicon)
  • Windows: .exe installer
  • Linux: .AppImage

👉 Download from Releases

Just open the app — no Python or terminal needed. It handles everything automatically.

pip (for CLI / developer use)

pip install desksearch
desksearch

On first run, the setup wizard detects your document folders, indexes them, and opens a web UI at localhost:3777.

Features

  • Desktop app — standalone app with system tray, global hotkey (Cmd+Shift+Space / Ctrl+Shift+Space), and built-in folder browser
  • Hybrid search — BM25 (via tantivy) + dense retrieval (FAISS), merged with Reciprocal Rank Fusion
  • Fully private — everything runs locally. No cloud, no API keys, no data leaves your machine
  • Fast — keyword search in <10ms, semantic search in <200ms
  • 30+ file formats — PDF, DOCX, Markdown, HTML, Jupyter notebooks, source code, CSV, LaTeX, and more
  • Web UI — clean dark-mode interface with folder browser, file explorer, and real-time indexing
  • CLI — search, index, and manage everything from the terminal
  • Background daemon — watches folders for changes and re-indexes automatically
  • Plugins — extend with custom parsers, rerankers, or data connectors
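
To make the hybrid search bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of file names. It is an illustration, not DeskSearch's own code: the function name `reciprocal_rank_fusion`, the sample file names, and the constant `k=60` (a commonly used default) are assumptions, since this README does not state how the fusion is parameterized.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed value, not taken from
    DeskSearch.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the keyword index and the vector index.
bm25_hits = ["notes.md", "paper.pdf", "todo.txt"]
dense_hits = ["paper.pdf", "slides.pptx", "notes.md"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
```

A file that ranks well in both lists (here `paper.pdf`) rises above one that tops only a single list, which is why the fusion needs no score normalization between BM25 and vector similarity.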

How It Works

  Your Files (PDF, DOCX, Markdown, Code, ...)
                    │
                    ▼
        ┌───────────────────────┐
        │ Parse → Chunk → Embed │
        └─────┬───────────┬─────┘
              │           │
              ▼           ▼
        ┌─────────┐ ┌──────────┐
        │  BM25   │ │  Dense   │
        │(tantivy)│ │ (FAISS)  │
        └────┬────┘ └────┬─────┘
             │           │
             └─────┬─────┘
                   ▼
          Reciprocal Rank Fusion
                   │
                   ▼
          Ranked Results + Snippets

The pipeline has four stages:

  1. Parse — extract text from 30+ formats
  2. Chunk — split into overlapping passages (512 chars, 64 overlap)
  3. Embed — generate 384-dim vectors with all-MiniLM-L6-v2 (runs locally)
  4. Search — query both indexes in parallel, fuse results with RRF
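
The chunking step can be sketched as follows, assuming passages are cut purely by character count with the 512/64 values listed above. The helper `chunk_text` is hypothetical; the real chunker may respect word or sentence boundaries, which this README does not specify.

```python
def chunk_text(text, chunk_size=512, overlap=64):
    """Split text into passages of at most chunk_size characters.

    Consecutive passages share `overlap` characters, so each new passage
    starts chunk_size - overlap characters after the previous one.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final passage already reaches the end of the text
    return chunks


passages = chunk_text("x" * 1000)
print([len(p) for p in passages])  # [512, 512, 104]
```

The overlap means a sentence that straddles a boundary still appears whole in at least one passage, at the cost of indexing some text twice.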

CLI Usage

desksearch                          # Start web UI (setup wizard on first run)
desksearch search "your query"      # Search from terminal
desksearch search "ML papers" -n 5  # Limit results
desksearch index ~/Documents        # Index a folder
desksearch status                   # Show index stats
desksearch daemon start             # Run in background with file watcher
desksearch config                   # View/edit configuration
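
For scripting, the search command can also be assembled from Python. This sketch only builds the argument list from the subcommand and `-n` flag shown above; the helper name `build_search_command` is hypothetical, and the format of the CLI's output is not documented here, so parsing it is left out.

```python
import shlex


def build_search_command(query, limit=None):
    """Build the argument list for `desksearch search`."""
    command = ["desksearch", "search", query]
    if limit is not None:
        command += ["-n", str(limit)]
    return command


command = build_search_command("ML papers", limit=5)
print(shlex.join(command))  # desksearch search 'ML papers' -n 5
```

Passing the list to `subprocess.run(command, check=True)` would run the search, provided `desksearch` is installed and on your PATH. Using a list rather than a shell string avoids quoting problems with queries that contain spaces.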

Configuration

Config lives at ~/.desksearch/config.json:

Setting           Default                 Description
index_paths       ~/Documents, ~/Desktop  Folders to index
embedding_model   all-MiniLM-L6-v2        Embedding model
chunk_size        512                     Characters per chunk
port              3777                    Web UI port
max_file_size_mb  50                      Skip files larger than this
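
If you want to read these settings from your own scripts, a sketch like the one below may help. It assumes the file is a flat JSON object keyed by the setting names in the table and that `index_paths` is a list of strings; neither is confirmed by this README, and `load_config` is a hypothetical helper, not part of the package.

```python
import json
from pathlib import Path

# Defaults copied from the settings table; the list form of index_paths
# is an assumption about how the file stores multiple folders.
DEFAULTS = {
    "index_paths": ["~/Documents", "~/Desktop"],
    "embedding_model": "all-MiniLM-L6-v2",
    "chunk_size": 512,
    "port": 3777,
    "max_file_size_mb": 50,
}


def load_config(path=Path("~/.desksearch/config.json")):
    """Return the documented defaults overlaid with values from the file.

    A missing file yields the defaults unchanged.
    """
    config_path = Path(path).expanduser()
    if not config_path.exists():
        return dict(DEFAULTS)
    with config_path.open(encoding="utf-8") as handle:
        overrides = json.load(handle)
    return {**DEFAULTS, **overrides}
```

Calling `load_config()["port"]` then gives the port the web UI is expected to use, falling back to 3777 when the file does not set it.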

Plugins

Three extension points: parsers (new file formats), search (rerankers), and connectors (external data sources).

Drop a .py file into ~/.desksearch/plugins/:

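from pathlib import Path

from desksearch.plugins.base import BaseParserPlugin


class EpubParser(BaseParserPlugin):
    """Example parser plugin for .epub files."""

    name = "epub-parser"
    extensions = [".epub"]

    def parse(self, file_path: Path) -> str:
        """Return the plain text extracted from file_path."""
        text = ""  # placeholder so the example runs; replace with real extraction
        ...
        return text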
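
The body of `parse` is left open in the example. One possible way to fill it, using only the standard library, is sketched below. It relies on the fact that an EPUB is a ZIP archive containing (X)HTML files. The function `extract_epub_text` and the class `_TextCollector` are hypothetical names, and the approach is deliberately naive: files are read in name order rather than the book's reading order, and text inside `<title>`, `<style>`, or `<script>` elements is not filtered out.

```python
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def extract_epub_text(file_path: Path) -> str:
    """Return the text of every (X)HTML file inside an EPUB archive."""
    parts = []
    with zipfile.ZipFile(file_path) as book:
        for name in sorted(book.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _TextCollector()
                collector.feed(book.read(name).decode("utf-8", errors="replace"))
                collector.close()  # flush any text still buffered by the parser
                parts.extend(collector.parts)
    return "\n".join(parts)
```

Inside the plugin, `parse` could then simply return `extract_epub_text(file_path)`.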

Performance

Measured on a MacBook Pro M1 with 10,000 indexed files (~2GB):

Metric           Value
BM25 search      <10ms
Semantic search  <200ms
Indexing         ~500 files/min
Memory (idle)    ~80MB
Cold start       ~1.5s

Building from Source

Desktop App

Requires Python 3.10+ and Node.js 18+.

# 1. Install Python dependencies
pip install -e ".[dev]"

# 2. Bundle the backend
pip install pyinstaller
pyinstaller desksearch.spec  # or see scripts/build-app.sh

# 3. Build the Electron app
cd electron && npm install && npm run dist:mac  # or dist:win / dist:linux

Development

pip install -e ".[dev]"
pytest

# Frontend dev (hot reload)
cd src/ui && npm install && npm run dev

License

MIT


Built by Shuai Wang
