Private semantic search engine for your local files
DeskSearch
Search your files by meaning, not just keywords. 100% local, zero cloud.
What is DeskSearch?
A private search engine for your local files. It combines keyword matching (BM25) with semantic search (dense vectors) so you can find files by what they mean, not just what they say — and nothing ever leaves your machine.
Install
Desktop App (recommended)
Download the latest release for your platform:
- macOS — .dmg (Apple Silicon)
- Windows — .exe installer
- Linux — .AppImage
Just open the app — no Python or terminal needed. It handles everything automatically.
pip (for CLI / developer use)
pip install desksearch
desksearch
On first run, the setup wizard detects your document folders, indexes them, and opens a web UI at localhost:3777.
Features
- Desktop app — standalone app with system tray, global hotkey (Cmd+Shift+Space / Ctrl+Shift+Space), and built-in folder browser
- Hybrid search — BM25 (via tantivy) + dense retrieval (FAISS), merged with Reciprocal Rank Fusion
- Fully private — everything runs locally. No cloud, no API keys, no data leaves your machine
- Fast — keyword search in <10ms, semantic search in <200ms
- 30+ file formats — PDF, DOCX, Markdown, HTML, Jupyter notebooks, source code, CSV, LaTeX, and more
- Web UI — clean dark-mode interface with folder browser, file explorer, and real-time indexing
- CLI — search, index, and manage everything from the terminal
- Background daemon — watches folders for changes and re-indexes automatically
- Plugins — extend with custom parsers, rerankers, or data connectors
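To illustrate how two ranked lists can be merged with Reciprocal Rank Fusion, here is a minimal sketch. It is not DeskSearch's own code: the function name, the sample file names, and the constant k=60 (a commonly used default for RRF) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the keyword index and the vector index.
bm25_hits = ["a.pdf", "b.md", "c.txt"]
dense_hits = ["b.md", "d.docx", "a.pdf"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because RRF only looks at ranks, it does not need the BM25 and cosine scores to be on a comparable scale, which is what makes it a convenient way to combine a keyword index with a vector index.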
How It Works
Your Files (PDF, DOCX, Markdown, Code, ...)
            │
            ▼
┌───────────────────────┐
│ Parse → Chunk → Embed │
└─────┬───────────┬─────┘
      │           │
      ▼           ▼
 ┌─────────┐ ┌──────────┐
 │  BM25   │ │  Dense   │
 │(tantivy)│ │ (FAISS)  │
 └────┬────┘ └────┬─────┘
      │           │
      └─────┬─────┘
            ▼
  Reciprocal Rank Fusion
            │
            ▼
 Ranked Results + Snippets
- Parse — extract text from 30+ formats
- Chunk — split into overlapping passages (512 chars, 64 overlap)
- Embed — generate 384-dim vectors with all-MiniLM-L6-v2 (runs locally)
- Search — query both indexes in parallel, fuse results with RRF
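The chunking step can be sketched as a sliding window over the extracted text. This is only an illustration using the sizes listed above (512 characters, 64 overlap); the function name is hypothetical, and the real chunker may split on sentence or paragraph boundaries instead of fixed offsets.

```python
def chunk_text(text, chunk_size=512, overlap=64):
    """Split text into fixed-size character windows that overlap.

    Consecutive chunks share `overlap` characters, so a sentence cut at
    one chunk boundary still appears intact in the neighbouring chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# A 1,000-character document yields windows starting at 0, 448 and 896.
sample = "".join(chr(97 + i % 26) for i in range(1000))
pieces = chunk_text(sample)
print([len(p) for p in pieces])
```

Each chunk would then be embedded separately, so a search hit can point at the passage that matched and not only at the file.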
CLI Usage
desksearch # Start web UI (setup wizard on first run)
desksearch search "your query" # Search from terminal
desksearch search "ML papers" -n 5 # Limit results
desksearch index ~/Documents # Index a folder
desksearch status # Show index stats
desksearch daemon start # Run in background with file watcher
desksearch config # View/edit configuration
Configuration
Config lives at ~/.desksearch/config.json:
| Setting | Default | Description |
|---|---|---|
| index_paths | ~/Documents, ~/Desktop | Folders to index |
| embedding_model | all-MiniLM-L6-v2 | Embedding model |
| chunk_size | 512 | Characters per chunk |
| port | 3777 | Web UI port |
| max_file_size_mb | 50 | Skip files larger than this |
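As a rough sketch of how these settings might be read, the snippet below merges a config.json over the defaults from the table. The key names and default values come from the table; the JSON value types (for example, index_paths as a list of strings), the helper names, and the interpretation of a megabyte as 1024 * 1024 bytes are assumptions, not DeskSearch's documented behavior.

```python
import json
from pathlib import Path

# Defaults copied from the settings table; the value types are assumed.
DEFAULTS = {
    "index_paths": ["~/Documents", "~/Desktop"],
    "embedding_model": "all-MiniLM-L6-v2",
    "chunk_size": 512,
    "port": 3777,
    "max_file_size_mb": 50,
}


def load_config(path="~/.desksearch/config.json"):
    """Return the defaults, overridden by any keys found in the file."""
    config = dict(DEFAULTS)
    config_file = Path(path).expanduser()
    if config_file.is_file():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    return config


def is_too_large(size_bytes, config):
    """True if a file of this size exceeds max_file_size_mb (assumed MiB)."""
    return size_bytes > config["max_file_size_mb"] * 1024 * 1024
```

Calling load_config() with no arguments reads the default location, and a missing file simply yields the defaults.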
Plugins
Three extension points: parsers (new file formats), search (rerankers), and connectors (external data sources).
Drop a .py file into ~/.desksearch/plugins/:
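from pathlib import Path

from desksearch.plugins.base import BaseParserPlugin


class EpubParser(BaseParserPlugin):
    # Plugin identifier and the file extensions this parser handles
    name = "epub-parser"
    extensions = [".epub"]

    def parse(self, file_path: Path) -> str:
        # Extract the book's plain text into `text` here
        ...
        return text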
Performance
Tested on a MacBook Pro M1 with 10,000 files (~2 GB):
| Metric | Value |
|---|---|
| BM25 search | <10ms |
| Semantic search | <200ms |
| Indexing | ~500 files/min |
| Memory (idle) | ~80MB |
| Cold start | ~1.5s |
Building from Source
Desktop App
Requires Python 3.10+ and Node.js 18+.
# 1. Install Python dependencies
pip install -e ".[dev]"
# 2. Bundle the backend
pip install pyinstaller
pyinstaller desksearch.spec # or see scripts/build-app.sh
# 3. Build the Electron app
cd electron && npm install && npm run dist:mac # or dist:win / dist:linux
Development
pip install -e ".[dev]"
pytest
# Frontend dev (hot reload)
cd src/ui && npm install && npm run dev
License
Built by Shuai Wang
File details
Details for the file desksearch-0.1.4.tar.gz.
File metadata
- Download URL: desksearch-0.1.4.tar.gz
- Upload date:
- Size: 789.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c9fe28c7bd725420c2e4718310dae64bc8796b417d392c8cf28cdbd3d2e0dbab |
| MD5 | a84cde8206c95d3ee077f2be3779f24c |
| BLAKE2b-256 | c5be9993c42ff5d65cb8a996187194f82ced55e7a0dcf190925f07b8a64e299c |
File details
Details for the file desksearch-0.1.4-py3-none-any.whl.
File metadata
- Download URL: desksearch-0.1.4-py3-none-any.whl
- Upload date:
- Size: 132.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 65c5268c180aad2ff5348e7836744500092c172aeabda3cde7f1bf32be832262 |
| MD5 | 95e63708701837316f9495c717ff5403 |
| BLAKE2b-256 | 050087fad4cbaa59a04ea78bab48429ff4c47b92dc5b2a1eeb52d6b8300a6e99 |