Private semantic search engine for your local files

DeskSearch

Search your files by meaning, not just keywords. 100% local, zero cloud.

What is DeskSearch?

A private search engine for your local files. It combines keyword matching (BM25) with semantic search (dense vectors) so you can find files by what they mean, not just what they say — and nothing ever leaves your machine.

Install

Desktop App (recommended)

Download the latest release for your platform:

  • macOS: .dmg (Apple Silicon)
  • Windows: .exe installer
  • Linux: .AppImage

👉 Download from Releases

Just open the app — no Python or terminal needed. It handles everything automatically.

pip (for CLI / developer use)

pip install desksearch
desksearch

On first run, the setup wizard detects your document folders, indexes them, and opens a web UI at localhost:3777.

Features

  • Desktop app — standalone app with system tray, global hotkey (Cmd+Shift+Space / Ctrl+Shift+Space), and built-in folder browser
  • Hybrid search — BM25 (via tantivy) + dense retrieval (FAISS), merged with Reciprocal Rank Fusion
  • Fully private — everything runs locally. No cloud, no API keys, no data leaves your machine
  • Fast — keyword search in <10ms, semantic search in <200ms
  • 30+ file formats — PDF, DOCX, Markdown, HTML, Jupyter notebooks, source code, CSV, LaTeX, and more
  • Web UI — clean dark-mode interface with folder browser, file explorer, and real-time indexing
  • CLI — search, index, and manage everything from the terminal
  • Background daemon — watches folders for changes and re-indexes automatically
  • Plugins — extend with custom parsers, rerankers, or data connectors
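
The hybrid search feature merges the keyword and dense result lists with Reciprocal Rank Fusion (RRF). As a rough illustration of the standard RRF formula, where each document scores the sum of 1 / (k + rank) over the lists it appears in, here is a minimal sketch. The function name, the constant k = 60 (a commonly used default), and the sample file names are assumptions for illustration and are not taken from DeskSearch's code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two indexes.
bm25_hits = ["notes.md", "paper.pdf", "todo.txt"]
dense_hits = ["paper.pdf", "slides.pptx", "notes.md"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because RRF only looks at ranks, it does not need the BM25 and vector-similarity scores to be on a comparable scale, which is one reason it is a popular choice for hybrid search.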

How It Works

  Your Files (PDF, DOCX, Markdown, Code, ...)
                    │
                    ▼
        ┌───────────────────────┐
        │ Parse → Chunk → Embed │
        └─────┬───────────┬─────┘
              │           │
              ▼           ▼
        ┌─────────┐ ┌──────────┐
        │  BM25   │ │  Dense   │
        │(tantivy)│ │ (FAISS)  │
        └────┬────┘ └────┬─────┘
             │           │
             └─────┬─────┘
                   ▼
          Reciprocal Rank Fusion
                   │
                   ▼
          Ranked Results + Snippets

  1. Parse — extract text from 30+ formats
  2. Chunk — split into overlapping passages (512 chars, 64 overlap)
  3. Embed — generate 384-dim vectors with all-MiniLM-L6-v2 (runs locally)
  4. Search — query both indexes in parallel, fuse results with RRF
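
To make the chunking step concrete, the sketch below splits text into 512-character windows that overlap by 64 characters. It assumes plain fixed-size character windows; whether DeskSearch also respects sentence or paragraph boundaries is not stated here, and `chunk_text` is a hypothetical name.

```python
def chunk_text(text, chunk_size=512, overlap=64):
    """Split text into passages of chunk_size characters that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


# A 1,000-character sample made of repeating lowercase letters.
sample = "".join(chr(97 + i % 26) for i in range(1000))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.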

CLI Usage

desksearch                          # Start web UI (setup wizard on first run)
desksearch search "your query"      # Search from terminal
desksearch search "ML papers" -n 5  # Limit results
desksearch index ~/Documents        # Index a folder
desksearch status                   # Show index stats
desksearch daemon start             # Run in background with file watcher
desksearch config                   # View/edit configuration
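
If you want to call the CLI from a script, a thin wrapper around `subprocess` is one option. The sketch below uses only the `search` subcommand and the `-n` flag shown above; `build_search_command` and `run_search` are hypothetical helper names, and since the output format of `desksearch search` is not documented here, the output is returned as raw text.

```python
import subprocess


def build_search_command(query, limit=None):
    """Build the argv list for `desksearch search`, optionally with -n."""
    command = ["desksearch", "search", query]
    if limit is not None:
        command += ["-n", str(limit)]
    return command


def run_search(query, limit=None):
    """Run the search and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_search_command(query, limit),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the query contains spaces or quotes.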

Configuration

Config lives at ~/.desksearch/config.json:

Setting           Default                 Description
index_paths       ~/Documents, ~/Desktop  Folders to index
embedding_model   all-MiniLM-L6-v2        Embedding model
chunk_size        512                     Characters per chunk
port              3777                    Web UI port
max_file_size_mb  50                      Skip files larger than this (in MB)
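
A script that needs the same settings could read the file along these lines. This is a sketch only: it assumes config.json is a flat JSON object keyed by the setting names listed above, and `load_config` and `DEFAULTS` are hypothetical names, not part of the DeskSearch API.

```python
import json
from pathlib import Path

# Defaults copied from the settings listed above; the flat JSON layout is an assumption.
DEFAULTS = {
    "index_paths": ["~/Documents", "~/Desktop"],
    "embedding_model": "all-MiniLM-L6-v2",
    "chunk_size": 512,
    "port": 3777,
    "max_file_size_mb": 50,
}


def load_config(path=None):
    """Return the documented defaults overlaid with values from the file."""
    if path is None:
        path = Path("~/.desksearch/config.json").expanduser()
    path = Path(path)
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

Calling `load_config()` with no argument reads the default location; any key missing from the file falls back to the documented default.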

Plugins

Three extension points: parsers (new file formats), search (rerankers), and connectors (external data sources).

Drop a .py file into ~/.desksearch/plugins/:

from pathlib import Path

from desksearch.plugins.base import BaseParserPlugin


class EpubParser(BaseParserPlugin):
    name = "epub-parser"
    extensions = [".epub"]  # file extensions handled by this parser

    def parse(self, file_path: Path) -> str:
        # Extract the plain text of file_path into `text` (elided here).
        ...
        return text
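
The body of `parse` is left open in the example above. As one possible way to fill it in, the sketch below pulls text out of an EPUB using only the standard library, relying on the fact that an EPUB is a ZIP archive of (X)HTML files. `extract_epub_text` is a hypothetical helper, not part of DeskSearch. It reads content files in alphabetical order rather than the book's declared reading order, and it does not filter out `<style>` or `<script>` contents.

```python
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collect the text nodes of an (X)HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def extract_epub_text(file_path: Path) -> str:
    """Concatenate the text of every (X)HTML file inside an EPUB archive."""
    pages = []
    with zipfile.ZipFile(file_path) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _TextCollector()
                collector.feed(archive.read(name).decode("utf-8", errors="replace"))
                collector.close()  # flush any text still buffered by the parser
                pages.append(" ".join(collector.parts))
    return "\n\n".join(pages)
```

Inside a plugin, `parse` could then simply return `extract_epub_text(file_path)`.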

Performance

Tested on a MacBook Pro M1 with 10,000 files (~2 GB):

Metric           Value
BM25 search      <10 ms
Semantic search  <200 ms
Indexing         ~500 files/min
Memory (idle)    ~80 MB
Cold start       ~1.5 s

Building from Source

Desktop App

Requires Python 3.10+ and Node.js 18+.

# 1. Install Python dependencies
pip install -e ".[dev]"

# 2. Bundle the backend
pip install pyinstaller
pyinstaller desksearch.spec  # or see scripts/build-app.sh

# 3. Build the Electron app
cd electron && npm install && npm run dist:mac  # or dist:win / dist:linux

Development

pip install -e ".[dev]"
pytest

# Frontend dev (hot reload)
cd src/ui && npm install && npm run dev

License

MIT


Built by Shuai Wang

