Local, offline voice dictation for Linux, macOS, and Windows — hold a key, speak, release
YazSes
Local, offline voice dictation for Linux, macOS, and Windows. Hold a key, speak, release — the transcribed text appears in whatever app is focused. No cloud, no GPU.
Hold the dictation key (>0.5s) → speak → release → text appears
Powered by faster-whisper (CPU/int8). Works in browsers, terminals, IDEs, chat apps — anywhere the OS lets keystrokes reach the focused window.
Supported platforms
| OS | Hotkey default | Install | Status |
|---|---|---|---|
| Linux | Space | apt / snap / PPA / pipx / .deb / installer | Stable |
| macOS | Right Option | .dmg (Homebrew Cask coming) | Developer preview (unsigned) |
| Windows | Right Ctrl | .exe installer (winget coming) | Developer preview (unsigned) |
Why Right Ctrl on Windows, not Right Alt? On many international layouts Right Alt acts as AltGr, which is used to type @, €, {}, [], \, ~, etc. Hijacking it would break normal typing. Right Ctrl is rarely used for typing, so it's the safer default. Every platform's hotkey is configurable in config.toml.
Quick install
One-line install on every major OS:
# macOS — via Homebrew tap
brew tap novafabric/yazses && brew install --cask yazses
# Windows — via winget (pending PR review at microsoft/winget-pkgs#371427)
winget install NovaFabric.YazSes
# Linux — via the apt repo
bash <(curl -fsSL https://raw.githubusercontent.com/novafabric/yazses/main/install.sh)
# Cross-platform fallback — pip
pipx install yazses
After install:
| OS | What's left |
|---|---|
| macOS | Right-click → Open the first time (unsigned dev preview); grant Accessibility + Microphone when prompted; hold Right Option to dictate. |
| Windows | If SmartScreen warns, click More info → Run anyway (unsigned dev preview); hold Right Ctrl to dictate. |
| Linux | sudo usermod -aG input "$USER" then re-login; systemctl --user enable --now yazses.service; hold Space to dictate. |
Full per-OS guides: docs/macos-install.md, docs/windows-install.md. Status of every distribution channel lives in docs/distribution-status.md.
Other channels
If a one-liner above doesn't fit your environment, pick from the platform sections below.
macOS — alternatives
# Direct .dmg download (no Homebrew needed)
# https://github.com/novafabric/yazses/releases/latest
# Open the .dmg, drag YazSes.app into /Applications, right-click → Open the first time.
Windows — alternatives
# Direct .exe download
# https://github.com/novafabric/yazses/releases/latest
# Click "More info → Run anyway" if SmartScreen warns.
Linux — alternatives
# APT repo (Debian/Ubuntu)
curl -fsSL https://novafabric.github.io/yazses/apt/KEY.gpg \
| sudo gpg --dearmor --yes -o /usr/share/keyrings/yazses.gpg
echo "deb [signed-by=/usr/share/keyrings/yazses.gpg] https://novafabric.github.io/yazses/apt ./" \
| sudo tee /etc/apt/sources.list.d/yazses.list
sudo apt update && sudo apt install yazses
# Launchpad PPA (Ubuntu)
sudo add-apt-repository ppa:novafabric/yazses
sudo apt update && sudo apt install yazses
# Snap (works on most distros after `snapd` is installed)
sudo snap install yazses --classic
# AUR (Arch / Manjaro / EndeavourOS)
yay -S yazses # any AUR helper
# .deb download
# https://github.com/novafabric/yazses/releases/latest
sudo apt install ./yazses_*.deb
# pipx (any Linux)
sudo apt install libportaudio2 xdotool xclip pipx
pipx install yazses
Optional extras
v0.4.0 introduces three opt-in feature groups. Install only what you need:
| Extra | What it enables | Dependencies installed |
|---|---|---|
| yazses[slm] | SLM intent routing: natural phrasing for voice commands | llama-cpp-python + GGUF model |
| yazses[lsp] | LSP code context injection: better identifier accuracy | pygls, pynvim |
| yazses[emg] | EMG silent speech backend: dictate without speaking aloud | pyserial |
| yazses[all] | All optional extras | all of the above |
pip install "yazses[slm]" # SLM routing only
pip install "yazses[lsp]" # LSP context only
pip install "yazses[emg]" # EMG backend only
pip install "yazses[all]" # everything
Each extra requires additional setup described in the Configuration section below.
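To check from Python which extras are present in the current environment, one option is to probe for the top-level modules each extra pulls in. This is a minimal sketch: the module names are the import names of the packages listed in the table (`llama_cpp` for llama-cpp-python, `serial` for pyserial), while the `EXTRA_MODULES` mapping and the `available_extras` helper are illustrative and not part of YazSes.

```python
from importlib.util import find_spec

# Import names for the dependencies listed in the extras table.
EXTRA_MODULES = {
    "slm": ["llama_cpp"],
    "lsp": ["pygls", "pynvim"],
    "emg": ["serial"],
}


def available_extras(probe=find_spec):
    """Return {extra: True/False} depending on whether all its modules resolve."""
    return {
        extra: all(probe(name) is not None for name in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"yazses[{extra}]: {'installed' if ok else 'missing'}")
```

The `probe` parameter exists only so the check can be exercised without installing anything.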
Usage
YazSes runs silently in the background. The same CLI works on every platform.
| Command | What it does |
|---|---|
| Hold the hotkey, speak, release | Transcribe and inject text into focused app |
| yazses status | Daemon state, model, hotkey, backend, uptime |
| yazses start / stop | Manage the daemon |
| yazses doctor | Per-platform prerequisite check |
| yazses inject "hello" | Type text without recording (debug) |
| yazses remote <host> | Forward voice typing to a remote SSH host |
| yazses remote --stop | Disconnect active remote session |
| yazses enroll | Calibration wizard for VAD / silence settings |
On macOS and Windows the YazSes tray icon changes color to reflect state (idle / recording / transcribing / remote / error).
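The hold-to-dictate behaviour can be pictured as a small timing check around the hotkey. The sketch below is an illustration only: `HoldDetector` is a hypothetical class, timestamps are passed in by the caller, and treating a short tap as an ordinary key press is an assumption about how the threshold is used rather than documented YazSes behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldDetector:
    """Decides whether a key press counts as a dictation hold.

    Hypothetical helper: timestamps are passed in (in seconds) so the
    logic can be exercised without a real keyboard hook.
    """

    hold_threshold_ms: int = 500  # same default as [hotkey] hold_threshold_ms
    _pressed_at: Optional[float] = None

    def press(self, now: float) -> None:
        # Ignore auto-repeat events while the key is already down.
        if self._pressed_at is None:
            self._pressed_at = now

    def is_recording(self, now: float) -> bool:
        # Recording starts once the key has been held past the threshold.
        if self._pressed_at is None:
            return False
        return (now - self._pressed_at) * 1000 >= self.hold_threshold_ms

    def release(self, now: float) -> str:
        # Assumption: a short tap falls through as a normal key press,
        # while a long hold ends the recording and triggers transcription.
        held = self.is_recording(now)
        self._pressed_at = None
        return "transcribe" if held else "passthrough"
```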
Voice commands (v0.4.0)
Speak natural commands while [commands] enabled = true (default). v0.4.0 adds a Tier 2 SLM routing layer (requires yazses[slm]) that handles natural, varied phrasing — you no longer need to say the exact canonical form:
| Say (examples) | Action |
|---|---|
| "undo" / "undo 3 times" | Ctrl+Z (×N) |
| "save file" / "save this" / "save it" | Ctrl+S |
| "delete 2 words" | Ctrl+Backspace ×2 |
| "delete 3 lines" | Delete 3 lines |
| "go to line 42" | Ctrl+G → "42" → Enter |
| "comment selection" | Ctrl+/ |
| "copy" / "paste" | Ctrl+C / Ctrl+V |
| "scratch that" / "delete that" | Remove text back to last sentence |
| "close this tab" / "close the current tab" | Ctrl+W (SLM Tier 2) |
| "zoom in" / "make this bigger" | Ctrl++ (SLM Tier 2) |
Without yazses[slm], the Tier 1 regex grammar handles a fixed set of canonical phrases. With it, the SLM layer catches anything the regex misses, at the cost of ~50–200 ms additional latency per utterance.
Everything that does not match a command intent is typed verbatim.
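The two-tier routing described above can be sketched as a regex pass followed by an optional model call with a confidence cut-off. This is a simplified illustration, not the YazSes grammar: the patterns cover only a few phrases from the table, the action labels are made up, and the `slm` callable is a stand-in for whatever the Tier 2 layer actually does.

```python
import re
from typing import Callable, Optional, Tuple

# Tier 1: a few canonical phrases from the table above, as regexes.
# The action strings are illustrative labels, not YazSes internals.
GRAMMAR = [
    (re.compile(r"^undo(?: (\d+) times?)?$"), "ctrl+z"),
    (re.compile(r"^delete (\d+) words?$"), "ctrl+backspace"),
    (re.compile(r"^go to line (\d+)$"), "goto_line"),
    (re.compile(r"^save file$"), "ctrl+s"),
]

Intent = Tuple[str, int]  # (action, repeat count or line number)


def tier1(text: str) -> Optional[Intent]:
    """Match the utterance against the fixed regex grammar."""
    normalized = text.strip().lower().rstrip(".!?")
    for pattern, action in GRAMMAR:
        match = pattern.match(normalized)
        if match:
            number = match.group(1) if match.groups() else None
            return action, int(number) if number else 1
    return None


def route(
    text: str,
    slm: Optional[Callable[[str], Tuple[str, float]]] = None,
    threshold: float = 0.75,  # mirrors slm_confidence_threshold
) -> Tuple[str, object]:
    """Return ("command", intent) or ("text", original utterance)."""
    intent = tier1(text)
    if intent is not None:
        return "command", intent
    if slm is not None:
        action, confidence = slm(text)  # Tier 2 stand-in
        if confidence >= threshold:
            return "command", (action, 1)
    return "text", text  # everything else is typed verbatim
```

With no `slm` callable supplied, `route("close this tab")` falls through to verbatim text, which matches the behaviour described for installs without yazses[slm].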
Configuration
config.toml lives in the platform's standard config dir:
| OS | Path |
|---|---|
| Linux | ~/.config/yazses/config.toml |
| macOS | ~/Library/Application Support/yazses/config.toml |
| Windows | %APPDATA%\yazses\config.toml |
[stt]
model = "tiny.en" # tiny.en (fast) | base.en (more accurate, slower)
[hotkey]
# "auto" → Space (Linux) / right_option (macOS) / right_ctrl (Windows).
key = "auto"
hold_threshold_ms = 500
[audio]
sample_rate = 16000
max_record_seconds = 90
[tray]
enabled = "auto" # default true on macOS/Windows, false on Linux v0
[general]
log_level = "INFO"
# --- v0.3.0 additions (all optional — defaults shown) ---
[commands]
enabled = true # voice command grammar (undo, save, go to line N, …)
profile = "auto" # "auto" | "vscode" | "vim" | "default"
[filters.disfluency]
enabled = true # remove filler words, repeated phrases, "scratch that"
[accessibility]
vad_threshold = 0.01 # silence threshold — run `yazses enroll` to calibrate
min_silence_ms = 500 # minimum silence to end a recording
pre_speech_padding_ms = 200 # prepend ring-buffer audio to catch voice onset
[streaming]
enabled = true # emit stable partial transcripts while you speak
partial_interval_ms = 300
[remote]
default_host = "" # SSH host for `yazses remote`
ssh_port = 22
agent_port = 9875
key_file = "" # path to SSH private key (optional)
# --- v0.4.0 additions (all optional — defaults shown) ---
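[commands]
# existing fields above…
# Merge the keys below into that same [commands] table: TOML rejects a table that is declared twice.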
# Tier 2 SLM routing (optional; requires `pip install yazses[slm]`)
# Download a GGUF model separately — TinyLlama (~700 MB) or Phi-3-mini (~2.2 GB).
slm_model_path = "" # e.g. ~/.cache/yazses/models/tinyllama.gguf
slm_confidence_threshold = 0.75 # fall back to verbatim text below this score
# LSP code context injection (optional; requires `pip install yazses[lsp]`)
# Connects to Neovim or VS Code via LSP and feeds the active file's language,
# scope, and identifier list into Whisper's initial_prompt — significantly
# improves transcription accuracy for code identifiers spoken aloud.
lsp_enabled = false
lsp_editor = "auto" # auto | neovim | vscode
[emg]
# EMG silent speech backend (optional; requires `pip install yazses[emg]` + device)
# Supported devices: YESP-protocol USB serial EMG headphones/wristbands.
# When active, replaces the hotkey-hold trigger — muscle signals start/stop capture.
device_port = "" # e.g. /dev/ttyUSB0, COM3
baud_rate = 115200
mode = "command" # command | full_text
# Map EMG gesture labels to voice-command strings (processed by the same
# grammar/SLM pipeline as spoken commands):
# [emg.command_map]
# save = "save file"
# undo = "undo"
How it works
                   ┌─────────────────────┐
                   │ EMG backend         │ ← v0.4.0 (optional)
                   │ (YESP USB serial)   │
                   └──────────┬──────────┘
                              │ (alternative trigger)
┌──────────────┐   ┌──────────▼───────┐   ┌──────────────────────────────┐
│ Hotkey hook  │──▶│ Audio (16kHz     │──▶│ faster-whisper (CPU / int8)  │
│ (per-OS API) │   │ PortAudio)       │   │                              │◀──┐
└──────────────┘   └──────────────────┘   └──────────────┬───────────────┘   │
                                                         │                   │
                                          ┌──────────────▼───────────────┐   │
                                          │ LspContextProvider           │───┘
                                          │ (injects initial_prompt)     │ ← v0.4.0 (optional)
                                          └──────────────────────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │ disfluency filter            │ ← v0.3.0
                                          │ clean_text                   │
                                          └──────────────┬───────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │ Tier 1: grammar classifier   │ ← v0.3.0
                                          │ (regex, zero latency)        │
                                          └──────────────┬───────────────┘
                                                         │ (unmatched intents)
                                          ┌──────────────▼───────────────┐
                                          │ Tier 2: SLM router           │ ← v0.4.0 (optional)
                                          │ (llama-cpp-python + GGUF)    │
                                          └──────────────┬───────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │ Text injector                │
                                          │ (local or SSH remote)        │ ← v0.3.0
                                          └──────────────────────────────┘
│
└─────────── daemon process ──────────────────────────────────────
                             ▲
          JSON-RPC over Unix socket / named pipe
                             │
                   ┌─────────┴─────────┐
                   │    CLI / tray     │
                   └───────────────────┘
Every platform-specific surface (keyboard hook, text injection, autostart, IPC, paths, permissions, tray) lives behind a single Protocol-based abstraction in src/yazses/platform/. Adding a fourth platform is a matter of writing one more sub-package.
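As a rough picture of what a Protocol-based abstraction looks like, the sketch below defines one surface with `typing.Protocol` and a stand-in backend. The names `TextInjector`, `type_text`, `RecordingInjector`, and `deliver` are hypothetical; the actual interfaces in src/yazses/platform/ may be shaped differently.

```python
from typing import Protocol


class TextInjector(Protocol):
    """One of the per-platform surfaces; the method name is illustrative."""

    def type_text(self, text: str) -> None: ...


class RecordingInjector:
    """Stand-in backend that records text instead of sending keystrokes."""

    def __init__(self) -> None:
        self.typed: list[str] = []

    def type_text(self, text: str) -> None:
        self.typed.append(text)


def deliver(transcript: str, injector: TextInjector) -> None:
    # Platform-neutral code depends only on the Protocol, so a new OS
    # backend just needs a class with a matching type_text method.
    if transcript:
        injector.type_text(transcript)
```

Because Protocols are structural, `RecordingInjector` satisfies `TextInjector` without inheriting from it, which is what keeps each platform sub-package independent of the others.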
Remote voice forwarding (v0.3.0)
Local machine                             Remote machine
─────────────────────────────────────     ──────────────────────
microphone → daemon → transcript ──SSH──▶ yazses-agent → injector
                                  tunnel  (types into remote app)
Start: yazses remote user@remote-host
Stop: yazses remote --stop
Only the transcript text travels over SSH — audio never leaves the local machine.
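To make the "text only" point concrete, here is one way transcript messages could be framed on a byte stream such as an SSH-forwarded port. The wire format shown (one JSON object per line with `type` and `text` fields) is purely hypothetical and is not the protocol yazses-agent uses; it only illustrates that a few bytes of text per utterance are all that needs to cross the tunnel.

```python
import json


def encode_transcript(text: str) -> bytes:
    """Frame one transcript as a single JSON line (hypothetical wire format)."""
    return json.dumps({"type": "transcript", "text": text}).encode("utf-8") + b"\n"


def decode_transcripts(buffer: bytes) -> tuple[list[str], bytes]:
    """Split a receive buffer into complete transcripts plus leftover bytes."""
    # Everything after the last newline is an incomplete frame to keep.
    *lines, rest = buffer.split(b"\n")
    texts = [json.loads(line)["text"] for line in lines if line]
    return texts, rest
```

The receiver keeps `rest` and prepends it to the next chunk, so a message split across two reads is still decoded once its newline arrives.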
LSP code context injection (v0.4.0)
When lsp_enabled = true, YazSes connects to the running Neovim or VS Code LSP server and queries the active buffer for its language, current scope, and visible symbol names. This list is passed to faster-whisper as initial_prompt, biasing the model toward the identifiers actually present in the file. In practice this eliminates most transcription errors on camelCase and snake_case names spoken aloud.
Requires pip install yazses[lsp] and a running editor with an active LSP session.
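The prompt-building step might look like the sketch below, which turns editor context into a short string. The function name, the sentence layout, and the 400-character cap are assumptions for illustration; only the idea of passing the result as initial_prompt comes from the description above.

```python
def build_initial_prompt(
    language: str, scope: str, symbols: list[str], max_chars: int = 400
) -> str:
    """Assemble a short context string to pass as Whisper's initial_prompt.

    The layout and the default cap are illustrative assumptions; the prompt
    is kept short because Whisper only uses a limited amount of prompt text.
    """
    header = f"{language} code, in {scope}. Identifiers: "
    picked: list[str] = []
    used = len(header)
    for name in dict.fromkeys(symbols):  # de-duplicate, keep order
        cost = len(name) + 2  # name plus ", " separator
        if used + cost > max_chars:
            break
        picked.append(name)
        used += cost
    return header + ", ".join(picked)
```

With faster-whisper, the resulting string would be handed over as `model.transcribe(audio, initial_prompt=prompt)`.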
EMG silent speech (v0.4.0)
When an EMG device is configured, muscle-signal onset/offset replaces the hotkey-hold trigger. Audio is captured normally; the user does not need to speak aloud — the EMG envelope alone gates recording. This is useful in open-plan offices or wherever speaking is impractical.
Supported protocol: YESP (USB CDC serial). Hardware examples: YESP-1 EMG headband, compatible wristbands. The mode = "full_text" setting attempts continuous dictation; mode = "command" maps gesture labels via [emg.command_map].
Requires pip install yazses[emg] and a compatible device.
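Gating on a muscle-signal envelope is commonly done with two thresholds so that small fluctuations do not toggle recording. The sketch below shows that hysteresis idea on a list of envelope values; it swaps in a generic technique, and the function name and both threshold values are illustrative rather than YESP or YazSes defaults.

```python
from typing import Iterable, Iterator, Tuple


def gate_segments(
    envelope: Iterable[float],
    on_threshold: float = 0.6,
    off_threshold: float = 0.3,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample index pairs where the gate is open.

    Uses hysteresis: the gate opens when the envelope rises to on_threshold
    and closes only when it falls below off_threshold.
    """
    start = None
    index = -1
    for index, level in enumerate(envelope):
        if start is None and level >= on_threshold:
            start = index
        elif start is not None and level < off_threshold:
            yield start, index
            start = None
    if start is not None:
        yield start, index + 1  # envelope ended while the gate was open
```

Each yielded pair marks a span during which capture would be active, playing the role that key-down and key-up play for the hotkey trigger.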
Build from source
git clone https://github.com/novafabric/yazses
cd yazses
uv sync
uv run pytest tests/ -v # 246 tests across all platforms
Platform-specific installers:
# macOS — produces dist/YazSes-<v>.dmg
./scripts/build-macos.sh
# Windows — produces dist/YazSes-<v>-windows-x64.exe
./scripts/build-windows.ps1
# Linux .deb
./scripts/build-deb.sh
CI builds the unsigned .dmg and .exe on every PR that touches the relevant code paths.
Troubleshooting
- yazses doctor: first stop. Tells you what's missing on the current OS.
- macOS: see docs/macos-install.md for Gatekeeper / Accessibility / Microphone.
- Windows: see docs/windows-install.md for SmartScreen / antivirus / privacy.
- Linux: confirm you're in the input group; check journalctl --user -u yazses.service -f.
License
Apache 2.0 — see LICENSE.