Private semantic search engine for your local files

DeskSearch

Search your files by meaning, not just keywords. 100% local, zero cloud.


What is DeskSearch?

A private search engine for your local files. It combines keyword matching (BM25) with semantic search (dense vectors) so you can find files by what they mean, not just what they say — and nothing ever leaves your machine.

Install

Desktop App (recommended)

Download the latest release for your platform:

  • macOS: .dmg (Apple Silicon)
  • Windows: .exe installer
  • Linux: .AppImage

👉 Download from Releases

Just open the app — no Python or terminal needed. It handles everything automatically.

pip (for CLI / developer use)

pip install desksearch
desksearch

On first run, the setup wizard detects your document folders, indexes them, and opens a web UI at localhost:3777.

Features

  • Desktop app — standalone app with system tray, global hotkey (Cmd+Shift+Space / Ctrl+Shift+Space), and built-in folder browser
  • Hybrid search — BM25 (via tantivy) + dense retrieval (FAISS), merged with Reciprocal Rank Fusion
  • Fully private — everything runs locally. No cloud, no API keys, no data leaves your machine
  • Fast — keyword search in <10ms, semantic search in <200ms
  • 30+ file formats — PDF, DOCX, Markdown, HTML, Jupyter notebooks, source code, CSV, LaTeX, and more
  • Web UI — clean dark-mode interface with folder browser, file explorer, and real-time indexing
  • CLI — search, index, and manage everything from the terminal
  • Background daemon — watches folders for changes and re-indexes automatically
  • Plugins — extend with custom parsers, rerankers, or data connectors

How It Works

  Your Files (PDF, DOCX, Markdown, Code, ...)
                    │
                    ▼
        ┌───────────────────────┐
        │ Parse → Chunk → Embed │
        └─────┬───────────┬─────┘
              │           │
              ▼           ▼
        ┌─────────┐ ┌──────────┐
        │  BM25   │ │  Dense   │
        │(tantivy)│ │ (FAISS)  │
        └────┬────┘ └────┬─────┘
             │           │
             └─────┬─────┘
                   ▼
          Reciprocal Rank Fusion
                   │
                   ▼
          Ranked Results + Snippets

  1. Parse — extract text from 30+ formats
  2. Chunk — split into overlapping passages (512 chars, 64 overlap)
  3. Embed — generate 384-dim vectors with all-MiniLM-L6-v2 (runs locally)
  4. Search — query both indexes in parallel, fuse results with RRF
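
Steps 2 and 4 above can be sketched in a few lines of plain Python. This is an illustrative sketch, not DeskSearch's actual code: the function names `chunk_text` and `reciprocal_rank_fusion` are hypothetical, and the RRF constant `k=60` is the value conventionally used for Reciprocal Rank Fusion, assumed here because the README does not state which value DeskSearch uses.

```python
from collections import defaultdict


def chunk_text(text: str, size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into passages of `size` characters.

    Each passage starts `size - overlap` characters after the previous one,
    so consecutive passages share `overlap` characters.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last passage already reaches the end of the text
    return chunks


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Every list contributes 1 / (k + rank) to a document's score, with rank
    starting at 1. k=60 is an assumed, conventional default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    passages = chunk_text("x" * 1000)
    print([len(p) for p in passages])

    bm25_hits = ["a", "b", "c"]   # hypothetical keyword ranking
    dense_hits = ["c", "a", "d"]  # hypothetical semantic ranking
    print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

Because RRF only looks at ranks, it needs no score normalization between the BM25 and dense indexes, which is why it suits merging two retrievers whose raw scores are not comparable.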

CLI Usage

desksearch                          # Start web UI (setup wizard on first run)
desksearch search "your query"      # Search from terminal
desksearch search "ML papers" -n 5  # Limit results
desksearch index ~/Documents        # Index a folder
desksearch status                   # Show index stats
desksearch daemon start             # Run in background with file watcher
desksearch config                   # View/edit configuration
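
The same commands can be driven from a script. The sketch below wraps `desksearch search` with Python's `subprocess` module; it uses only the subcommand and the `-n` flag shown above. The helper names `build_search_command` and `run_search` are hypothetical, and since the output format of the CLI is not documented here, the wrapper returns the raw text without trying to parse it.

```python
import shutil
import subprocess
from typing import Optional


def build_search_command(query: str, limit: Optional[int] = None) -> list[str]:
    """Build the argument list for `desksearch search`, optionally limiting results."""
    cmd = ["desksearch", "search", query]
    if limit is not None:
        cmd += ["-n", str(limit)]
    return cmd


def run_search(query: str, limit: Optional[int] = None) -> str:
    """Run the search and return whatever the CLI prints to stdout."""
    if shutil.which("desksearch") is None:
        raise RuntimeError("desksearch is not on PATH; install it with `pip install desksearch`")
    result = subprocess.run(
        build_search_command(query, limit),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    print(build_search_command("ML papers", limit=5))
```

Passing the query as a separate list element, rather than building a shell string, avoids quoting problems with queries that contain spaces or quotes.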

Configuration

Config lives at ~/.desksearch/config.json:

Setting            Default                  Description
index_paths        ~/Documents, ~/Desktop   Folders to index
embedding_model    all-MiniLM-L6-v2         Embedding model
chunk_size         512                      Characters per chunk
port               3777                     Web UI port
max_file_size_mb   50                       Skip files larger than this
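
To read these settings from another tool, a loader along the following lines may be enough. It is a sketch under assumptions: the keys and defaults come from the table above, but the exact JSON shape is a guess (in particular, `index_paths` is assumed to be a list of strings), and `load_config` is a hypothetical helper, not part of DeskSearch's API.

```python
import json
from pathlib import Path

# Defaults copied from the settings table; the JSON types are assumptions.
DEFAULTS = {
    "index_paths": ["~/Documents", "~/Desktop"],
    "embedding_model": "all-MiniLM-L6-v2",
    "chunk_size": 512,
    "port": 3777,
    "max_file_size_mb": 50,
}

DEFAULT_CONFIG_PATH = Path.home() / ".desksearch" / "config.json"


def load_config(path: Path = DEFAULT_CONFIG_PATH) -> dict:
    """Return the defaults, overridden by any keys present in the config file."""
    config = dict(DEFAULTS)
    if path.exists():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config


if __name__ == "__main__":
    cfg = load_config()
    print(cfg["port"], cfg["chunk_size"])
```

Merging over a defaults dictionary means a config file that sets only one key, such as `port`, still yields a complete configuration.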

Plugins

Three extension points: parsers (new file formats), search (rerankers), and connectors (external data sources).

Drop a .py file into ~/.desksearch/plugins/:

from desksearch.plugins.base import BaseParserPlugin
from pathlib import Path

class EpubParser(BaseParserPlugin):
    name = "epub-parser"
    extensions = [".epub"]

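    def parse(self, file_path: Path) -> str:
        # Extract the file's plain text here and bind it to `text`
        # (the extraction logic is left out of this example).
        ...
        return text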
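
The elided body of `parse` has to turn a file into plain text. As one possible illustration, the sketch below extracts text from an EPUB using only the standard library, relying on the fact that an EPUB is a ZIP archive of XHTML documents. `extract_epub_text` is a hypothetical helper that a `parse` method could call; it is not part of DeskSearch. It reads chapters in file-name order rather than the reading order declared in the EPUB's package manifest, which a production parser would want to respect.

```python
import zipfile
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_epub_text(file_path: Path) -> str:
    """Concatenate the text of every (X)HTML document inside the archive."""
    texts = []
    with zipfile.ZipFile(file_path) as archive:
        # Assumption: file-name order approximates reading order.
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                collector = _TextCollector()
                collector.feed(archive.read(name).decode("utf-8", errors="replace"))
                collector.close()
                texts.append(" ".join(collector.parts))
    return "\n\n".join(t for t in texts if t)
```

Inside the plugin, `parse` could then simply return `extract_epub_text(file_path)`.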

Performance

Measured on a MacBook Pro M1 with an index of 10,000 files (~2 GB):

Metric            Value
BM25 search       <10ms
Semantic search   <200ms
Indexing          ~500 files/min
Memory (idle)     ~80MB
Cold start        ~1.5s

Building from Source

Desktop App

Requires Python 3.10+ and Node.js 18+.

# 1. Install Python dependencies
pip install -e ".[dev]"

# 2. Bundle the backend
pip install pyinstaller
pyinstaller desksearch.spec  # or see scripts/build-app.sh

# 3. Build the Electron app
cd electron && npm install && npm run dist:mac  # or dist:win / dist:linux

Development

pip install -e ".[dev]"
pytest

# Frontend dev (hot reload)
cd src/ui && npm install && npm run dev

License

MIT


Built by Shuai Wang

Download files

Source Distribution

desksearch-0.1.3.tar.gz (789.2 kB)

Uploaded Source

Built Distribution

desksearch-0.1.3-py3-none-any.whl (132.8 kB)

Uploaded Python 3

File details

Details for the file desksearch-0.1.3.tar.gz.

File metadata

  • Download URL: desksearch-0.1.3.tar.gz
  • Upload date:
  • Size: 789.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for desksearch-0.1.3.tar.gz
Algorithm     Hash digest
SHA256        8c409f217e52dac48b19d41093dd413d03f565ea697bbc748e53d93cafcbed02
MD5           788b49304721ab2992ad62341b5d009b
BLAKE2b-256   aa726167d78aa74c9c51d9823133195c7368b993fb6c1a43bec031f490afbe2b

File details

Details for the file desksearch-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: desksearch-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 132.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for desksearch-0.1.3-py3-none-any.whl
Algorithm     Hash digest
SHA256        ffe17f58b73997da32a361813b04c21d856190b2839cf65d9a4f965ca153346a
MD5           05f0b9f24501af4ffa229a39bcabae23
BLAKE2b-256   921f2355c9eb51d8661e1d3d140d1964b3dded09a67deb6ddf2e5ed4467ba912
