Local, offline voice dictation for Linux, macOS, and Windows — hold a key, speak, release

YazSes

Local, offline voice dictation for Linux, macOS, and Windows. Hold a key, speak, release — the transcribed text appears in whatever app is focused. No cloud, no GPU.

Hold the dictation key (>0.5s) → speak → release → text appears

Powered by faster-whisper (CPU/int8). Works in browsers, terminals, IDEs, chat apps — anywhere the OS lets keystrokes reach the focused window.

Supported platforms

OS	Hotkey default	Install	Status
Linux	`Space`	apt / snap / PPA / pipx / .deb / installer	Stable
macOS	`Right Option`	`.dmg` (Homebrew Cask coming)	Developer preview (unsigned)
Windows	`Right Ctrl`	`.exe` installer (winget coming)	Developer preview (unsigned)

Why Right Ctrl on Windows, not Right Alt? On many international layouts Right Alt acts as AltGr — used to type @, €, {}, [], \, ~, etc. Hijacking it would break normal typing. Right Ctrl is rarely used for typing, so it's the safer default. Every platform's hotkey is configurable in config.toml.
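The hold-to-dictate rule above (hold longer than 0.5 s) can be sketched as a small tap-versus-hold classifier. This is an illustration under assumptions, not YazSes's actual hook: the `HoldDetector` name, the millisecond timestamps, and the choice to classify at key release are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldDetector:
    """Classifies a key press as a tap or a dictation hold (illustrative only)."""

    hold_threshold_ms: int = 500  # mirrors the >0.5 s hold described above
    _pressed_at_ms: Optional[float] = None

    def key_down(self, now_ms: float) -> None:
        # Ignore auto-repeat events: only the first key-down starts the timer.
        if self._pressed_at_ms is None:
            self._pressed_at_ms = now_ms

    def key_up(self, now_ms: float) -> str:
        # Returns "dictate" for a long hold, "tap" otherwise.
        if self._pressed_at_ms is None:
            return "tap"
        held_ms = now_ms - self._pressed_at_ms
        self._pressed_at_ms = None
        return "dictate" if held_ms > self.hold_threshold_ms else "tap"
```

A real hook would likely start recording as soon as the threshold is crossed rather than waiting for release; the sketch only shows how a short press can stay an ordinary keystroke.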

Quick install

One-line install on every major OS:

# macOS  — via Homebrew tap
brew tap novafabric/yazses && brew install --cask yazses

# Windows  — via winget (pending PR review at microsoft/winget-pkgs#371427)
winget install NovaFabric.YazSes

# Linux  — via the apt repo
bash <(curl -fsSL https://raw.githubusercontent.com/novafabric/yazses/main/install.sh)

# Cross-platform fallback via pipx
pipx install yazses

After install:

OS	What's left
macOS	Right-click → Open the first time (unsigned dev preview); grant Accessibility + Microphone when prompted; hold Right Option to dictate.
Windows	If SmartScreen warns, click More info → Run anyway (unsigned dev preview); hold Right Ctrl to dictate.
Linux	`sudo usermod -aG input "$USER"` then re-login; `systemctl --user enable --now yazses.service`; hold Space to dictate.

Full per-OS guides: docs/macos-install.md, docs/windows-install.md. Status of every distribution channel lives in docs/distribution-status.md.

Other channels

If a one-liner above doesn't fit your environment, pick from the platform sections below.

macOS — alternatives

# Direct .dmg download (no Homebrew needed)
# https://github.com/novafabric/yazses/releases/latest
# Open the .dmg, drag YazSes.app into /Applications, right-click → Open the first time.

Windows — alternatives

# Direct .exe download
# https://github.com/novafabric/yazses/releases/latest
# Click "More info → Run anyway" if SmartScreen warns.

Linux — alternatives

# APT repo (Debian/Ubuntu)
curl -fsSL https://novafabric.github.io/yazses/apt/KEY.gpg \
  | sudo gpg --dearmor --yes -o /usr/share/keyrings/yazses.gpg
echo "deb [signed-by=/usr/share/keyrings/yazses.gpg] https://novafabric.github.io/yazses/apt ./" \
  | sudo tee /etc/apt/sources.list.d/yazses.list
sudo apt update && sudo apt install yazses

# Launchpad PPA (Ubuntu)
sudo add-apt-repository ppa:novafabric/yazses
sudo apt update && sudo apt install yazses

# Snap (works on most distros after `snapd` is installed)
sudo snap install yazses --classic

# AUR (Arch / Manjaro / EndeavourOS)
yay -S yazses          # any AUR helper

# .deb download
# https://github.com/novafabric/yazses/releases/latest
sudo apt install ./yazses_*.deb

# pipx (any Linux; the apt line below assumes Debian/Ubuntu, use your distro's package manager elsewhere)
sudo apt install libportaudio2 xdotool xclip pipx
pipx install yazses

Optional extras

v0.4.0 introduces three opt-in feature groups. Install only what you need:

Extra	What it enables	Dependencies installed
`yazses[slm]`	SLM intent routing — natural phrasing for voice commands	llama-cpp-python + GGUF model
`yazses[lsp]`	LSP code context injection — better identifier accuracy	pygls, pynvim
`yazses[emg]`	EMG silent speech backend — dictate without speaking aloud	pyserial
`yazses[all]`	All optional extras	all of the above

pip install "yazses[slm]"        # SLM routing only
pip install "yazses[lsp]"        # LSP context only
pip install "yazses[emg]"        # EMG backend only
pip install "yazses[all]"        # everything

Each extra requires additional setup described in the Configuration section below.
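To see which extras are usable in the current environment, one option is to probe for the modules each extra is documented to install. The sketch below is not part of the YazSes CLI; the `EXTRA_MODULES` mapping and function names are assumptions, and the import names (`llama_cpp` for llama-cpp-python, `serial` for pyserial) are the ones those packages normally expose.

```python
from importlib.util import find_spec

# Import names of the packages each extra is documented to pull in.
# The mapping itself is an assumption for illustration, not YazSes's API.
EXTRA_MODULES = {
    "slm": ["llama_cpp"],        # llama-cpp-python
    "lsp": ["pygls", "pynvim"],
    "emg": ["serial"],           # pyserial
}


def module_available(name: str) -> bool:
    """True if a top-level module can be imported in this environment."""
    return find_spec(name) is not None


def available_extras() -> dict:
    """Map each extra to whether all of its modules are importable."""
    return {
        extra: all(module_available(m) for m in modules)
        for extra, modules in EXTRA_MODULES.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"yazses[{extra}]: {'installed' if ok else 'missing'}")
```

This only checks that the modules can be found; it does not verify that a GGUF model or an EMG device is present.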

Usage

YazSes runs silently in the background. The same CLI works on every platform.

Command	What it does
Hold the hotkey, speak, release	Transcribe and inject text into focused app
`yazses status`	Daemon state, model, hotkey, backend, uptime
`yazses start` / `stop`	Manage the daemon
`yazses doctor`	Per-platform prerequisite check
`yazses inject "hello"`	Type text without recording (debug)
`yazses remote <host>`	Forward voice typing to a remote SSH host
`yazses remote --stop`	Disconnect active remote session
`yazses enroll`	Calibration wizard for VAD / silence settings

On macOS and Windows the YazSes tray icon changes color to reflect state (idle / recording / transcribing / remote / error).

Voice commands (v0.4.0)

Speak natural commands while [commands] enabled = true (default). v0.4.0 adds a Tier 2 SLM routing layer (requires yazses[slm]) that handles natural, varied phrasing — you no longer need to say the exact canonical form:

Say (examples)	Action
"undo" / "undo 3 times"	Ctrl+Z (×N)
"save file" / "save this" / "save it"	Ctrl+S
"delete 2 words"	Ctrl+Backspace ×2
"delete 3 lines"	Delete 3 lines
"go to line 42"	Ctrl+G → "42" → Enter
"comment selection"	Ctrl+/
"copy" / "paste"	Ctrl+C / Ctrl+V
"scratch that" / "delete that"	Remove text back to last sentence
"close this tab" / "close the current tab"	Ctrl+W (SLM Tier 2)
"zoom in" / "make this bigger"	Ctrl++ (SLM Tier 2)

Without yazses[slm], only the Tier 1 regex grammar runs, and it recognizes a fixed set of canonical phrases. With it, utterances the regex does not match are passed to the SLM layer, at the cost of roughly 50–200 ms of additional latency per utterance.

Everything that does not match a command intent is typed verbatim.
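A Tier 1 grammar of this kind can be sketched as an ordered list of regex rules with a verbatim fallback. The phrases and key chords below come from the table above; the rule set, the action strings, and the `classify` / `route` names are hypothetical, and the sketch assumes numbers arrive as digits ("3", not "three").

```python
import re
from typing import List, Optional, Tuple

# Each rule: compiled pattern -> builder that turns the match into key actions.
# Phrases and key names come from the table above; the rule set is illustrative.
_NUMBER = r"(\d+)"

RULES = [
    (re.compile(rf"^undo(?: {_NUMBER} times)?$"), lambda m: ["ctrl+z"] * int(m.group(1) or 1)),
    (re.compile(r"^save (?:file|this|it)$"), lambda m: ["ctrl+s"]),
    (re.compile(rf"^delete {_NUMBER} words?$"), lambda m: ["ctrl+backspace"] * int(m.group(1))),
    (re.compile(rf"^go to line {_NUMBER}$"), lambda m: ["ctrl+g", m.group(1), "enter"]),
    (re.compile(r"^copy$"), lambda m: ["ctrl+c"]),
    (re.compile(r"^paste$"), lambda m: ["ctrl+v"]),
]


def classify(utterance: str) -> Optional[List[str]]:
    """Return key actions for a canonical phrase, or None if nothing matches."""
    text = utterance.strip().lower().rstrip(".!?")
    for pattern, build in RULES:
        match = pattern.match(text)
        if match:
            return build(match)
    return None


def route(utterance: str) -> Tuple[str, object]:
    """Commands become key actions; everything else is typed verbatim."""
    actions = classify(utterance)
    return ("keys", actions) if actions is not None else ("text", utterance)
```

Phrases such as "close this tab" fall through `classify` and would be the ones handed to the optional Tier 2 layer.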

Configuration

config.toml lives in the platform's standard config dir:

OS	Path
Linux	`~/.config/yazses/config.toml`
macOS	`~/Library/Application Support/yazses/config.toml`
Windows	`%APPDATA%\yazses\config.toml`
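For scripts that need to locate the file, the table above translates into a small resolver. This is a sketch of the documented paths only; the `config_path` function is hypothetical, and whether YazSes also honors overrides such as `XDG_CONFIG_HOME` is not stated here, so the sketch does not assume it.

```python
import os
import sys
from pathlib import Path, PurePath, PurePosixPath, PureWindowsPath
from typing import Mapping, Optional


def config_path(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
    home: Optional[str] = None,
) -> PurePath:
    """Return the documented config.toml location for the given platform."""
    env = os.environ if env is None else env
    home = str(Path.home()) if home is None else home
    if platform.startswith("win"):
        # %APPDATA%\yazses\config.toml
        return PureWindowsPath(env["APPDATA"]) / "yazses" / "config.toml"
    if platform == "darwin":
        return PurePosixPath(home) / "Library" / "Application Support" / "yazses" / "config.toml"
    # Linux uses ~/.config as documented in the table.
    return PurePosixPath(home) / ".config" / "yazses" / "config.toml"
```

Passing `platform`, `env`, and `home` explicitly keeps the function testable on any OS; with no arguments it resolves the path for the machine it runs on.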

[stt]
model = "tiny.en"   # tiny.en (fast) | base.en (more accurate, slower)

[hotkey]
# "auto" → Space (Linux) / right_option (macOS) / right_ctrl (Windows).
key = "auto"
hold_threshold_ms = 500

[audio]
sample_rate = 16000
max_record_seconds = 90

[tray]
enabled = "auto"   # default true on macOS/Windows, false on Linux v0

[general]
log_level = "INFO"

# --- v0.3.0 additions (all optional — defaults shown) ---

[commands]
enabled = true          # voice command grammar (undo, save, go to line N, …)
profile = "auto"        # "auto" | "vscode" | "vim" | "default"

[filters.disfluency]
enabled = true          # remove filler words, repeated phrases, "scratch that"

[accessibility]
vad_threshold = 0.01    # silence threshold — run `yazses enroll` to calibrate
min_silence_ms = 500    # minimum silence to end a recording
pre_speech_padding_ms = 200   # prepend ring-buffer audio to catch voice onset

[streaming]
enabled = true          # emit stable partial transcripts while you speak
partial_interval_ms = 300

[remote]
default_host = ""       # SSH host for `yazses remote`
ssh_port = 22
agent_port = 9875
key_file = ""           # path to SSH private key (optional)

# --- v0.4.0 additions (all optional — defaults shown) ---

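[commands]
# Add the keys below to the [commands] table declared earlier; TOML does not
# allow the same table to be defined twice in one file.
# existing fields above…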

# Tier 2 SLM routing (optional; requires `pip install yazses[slm]`)
# Download a GGUF model separately — TinyLlama (~700 MB) or Phi-3-mini (~2.2 GB).
slm_model_path = ""              # e.g. ~/.cache/yazses/models/tinyllama.gguf
slm_confidence_threshold = 0.75  # fall back to verbatim text below this score

# LSP code context injection (optional; requires `pip install yazses[lsp]`)
# Connects to Neovim or VS Code via LSP and feeds the active file's language,
# scope, and identifier list into Whisper's initial_prompt — significantly
# improves transcription accuracy for code identifiers spoken aloud.
lsp_enabled = false
lsp_editor = "auto"              # auto | neovim | vscode

[emg]
# EMG silent speech backend (optional; requires `pip install yazses[emg]` + device)
# Supported devices: YESP-protocol USB serial EMG headphones/wristbands.
# When active, replaces the hotkey-hold trigger — muscle signals start/stop capture.
device_port = ""                 # e.g. /dev/ttyUSB0, COM3
baud_rate = 115200
mode = "command"                 # command | full_text

# Map EMG gesture labels to voice-command strings (processed by the same
# grammar/SLM pipeline as spoken commands):
# [emg.command_map]
# save = "save file"
# undo = "undo"

How it works

                   ┌─────────────────────┐
                   │  EMG backend        │  ← v0.4.0 (optional)
                   │  (YESP USB serial)  │
                   └──────────┬──────────┘
                              │ (alternative trigger)
┌──────────────┐   ┌──────────▼───────┐   ┌──────────────────────────────┐
│ Hotkey hook  │──▶│ Audio (16kHz     │──▶│ faster-whisper (CPU / int8)  │
│ (per-OS API) │   │  PortAudio)      │   │                              │◀──┐
└──────────────┘   └──────────────────┘   └──────────────┬───────────────┘   │
                                                         │                   │
                                          ┌──────────────▼───────────────┐   │
                                          │  LspContextProvider          │───┘
                                          │  (injects initial_prompt)    │  ← v0.4.0 (optional)
                                          └──────────────────────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │  disfluency filter           │  ← v0.3.0
                                          │  clean_text                  │
                                          └──────────────┬───────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │  Tier 1: grammar classifier  │  ← v0.3.0
                                          │  (regex, zero latency)       │
                                          └──────────────┬───────────────┘
                                                         │ (unmatched intents)
                                          ┌──────────────▼───────────────┐
                                          │  Tier 2: SLM router          │  ← v0.4.0 (optional)
                                          │  (llama-cpp-python + GGUF)   │
                                          └──────────────┬───────────────┘
                                                         │
                                          ┌──────────────▼───────────────┐
                                          │  Text injector               │
                                          │  (local or SSH remote)       │  ← v0.3.0
                                          └──────────────────────────────┘
       │
       └─────────── daemon process ──────────────────────────────────────
                          ▲
              JSON-RPC over Unix socket / named pipe
                          │
                ┌─────────┴─────────┐
                │     CLI / tray    │
                └───────────────────┘
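The CLI-to-daemon hop in the diagram uses JSON-RPC. The sketch below shows what encoding a request and decoding a response could look like under JSON-RPC 2.0; the newline-delimited framing, the `status` method name, and both helper functions are assumptions rather than the daemon's documented wire format.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # monotonically increasing request ids


def make_request(method: str, params: Optional[Dict[str, Any]] = None) -> bytes:
    """Encode a JSON-RPC 2.0 request as one newline-terminated line."""
    payload: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        payload["params"] = params
    return (json.dumps(payload) + "\n").encode("utf-8")


def parse_response(line: bytes) -> Any:
    """Return the result of a JSON-RPC 2.0 response, raising on an error object."""
    message = json.loads(line.decode("utf-8"))
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"RPC error {err.get('code')}: {err.get('message')}")
    return message["result"]
```

The bytes from `make_request("status")` would be written to the Unix socket or named pipe, and the reply line passed to `parse_response`.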

Every platform-specific surface (keyboard hook, text injection, autostart, IPC, paths, permissions, tray) lives behind a single Protocol-based abstraction in src/yazses/platform/. Adding another platform is a matter of writing one more sub-package.
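The idea of a Protocol-based abstraction can be illustrated with Python's `typing.Protocol`. The `TextInjector` interface, the `RecordingInjector` test double, and `deliver` below are hypothetical stand-ins; the real interfaces in src/yazses/platform/ may be shaped differently.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class TextInjector(Protocol):
    """Hypothetical interface: type text into the focused window."""

    def inject(self, text: str) -> None: ...


class RecordingInjector:
    """Test double that records text instead of sending keystrokes."""

    def __init__(self) -> None:
        self.typed: List[str] = []

    def inject(self, text: str) -> None:
        self.typed.append(text)


def deliver(transcript: str, injector: TextInjector) -> None:
    # Platform-neutral code depends only on the Protocol, never on an OS module.
    if transcript:
        injector.inject(transcript)
```

Because Protocols use structural typing, a per-OS class only needs a matching `inject` method; it does not have to inherit from anything.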

Remote voice forwarding (v0.3.0)

Local machine                             Remote machine
─────────────────────────────────────     ──────────────────────
microphone → daemon → transcript ──SSH──▶ yazses-agent → injector
                                  tunnel       (types into remote app)

Start: yazses remote user@remote-host
Stop: yazses remote --stop

Only the transcript text travels over SSH — audio never leaves the local machine.
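One plausible way to carry the transcript is a standard SSH local port forward to the agent. The sketch below builds such a command from the `[remote]` settings; treating the tunnel as `ssh -L` to `agent_port` is an assumption about how `yazses remote` works, and `ssh_tunnel_argv` is a hypothetical helper.

```python
from typing import List, Optional


def ssh_tunnel_argv(
    host: str,
    ssh_port: int = 22,
    agent_port: int = 9875,
    key_file: Optional[str] = None,
) -> List[str]:
    """Build an `ssh` command that forwards a local port to the remote agent."""
    argv = [
        "ssh",
        "-N",                                          # no remote command, tunnel only
        "-L", f"{agent_port}:127.0.0.1:{agent_port}",  # local port -> remote loopback
        "-p", str(ssh_port),
    ]
    if key_file:
        argv += ["-i", key_file]                       # optional private key
    argv.append(host)
    return argv
```

The resulting list can be passed to `subprocess.Popen`; returning a list rather than a shell string avoids quoting problems with unusual host names or key paths.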

LSP code context injection (v0.4.0)

When lsp_enabled = true, YazSes connects to the running Neovim or VS Code LSP server and queries the active buffer for its language, current scope, and visible symbol names. This context is passed to faster-whisper as initial_prompt, biasing the model toward the identifiers actually present in the file. In practice this eliminates most transcription errors on camelCase and snake_case names spoken aloud.

Requires pip install yazses[lsp] and a running editor with an active LSP session.
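Turning editor context into an initial_prompt string might look like the sketch below. The prompt wording, the 800-character cap, and the `build_initial_prompt` name are assumptions; Whisper only conditions on a limited prompt window, so some cap is needed, but the value here is arbitrary.

```python
from typing import Iterable, List


def build_initial_prompt(
    language: str,
    scope: str,
    symbols: Iterable[str],
    max_chars: int = 800,
) -> str:
    """Join editor context into a short prompt, dropping symbols that do not fit."""
    header = f"{language} code, inside {scope}."
    kept: List[str] = []
    for name in dict.fromkeys(symbols):  # de-duplicate, keep first-seen order
        candidate = f"{header} Identifiers: {', '.join(kept + [name])}."
        if len(candidate) > max_chars:
            break
        kept.append(name)
    if not kept:
        return header
    return f"{header} Identifiers: {', '.join(kept)}."
```

The returned string is what would be handed to faster-whisper's `transcribe(..., initial_prompt=...)`; listing identifiers in their written form is what nudges the decoder toward `sampleRate` over "sample rate".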

EMG silent speech (v0.4.0)

When an EMG device is configured, muscle-signal onset/offset replaces the hotkey-hold trigger. Audio is captured normally; the user does not need to speak aloud — the EMG envelope alone gates recording. This is useful in open-plan offices or wherever speaking is impractical.

Supported protocol: YESP (USB CDC serial). Hardware examples: YESP-1 EMG headband, compatible wristbands. The mode = "full_text" setting attempts continuous dictation; mode = "command" maps gesture labels via [emg.command_map].

Requires pip install yazses[emg] and a compatible device.
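Gating on an EMG envelope is commonly done by rectifying the signal, smoothing it, and applying two thresholds so the gate does not chatter. The sketch below shows that generic technique, not the YESP protocol or YazSes's backend: the thresholds, the smoothing factor, the assumption that samples are normalized to roughly [-1, 1], and the `gate_recording` name are all illustrative.

```python
from typing import Iterable, List, Tuple


def gate_recording(
    samples: Iterable[float],
    on_threshold: float = 0.30,
    off_threshold: float = 0.15,
    smoothing: float = 0.2,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample-index pairs where the smoothed envelope is active."""
    envelope = 0.0
    active_since = None
    segments: List[Tuple[int, int]] = []
    index = -1
    for index, sample in enumerate(samples):
        # Rectify, then smooth with an exponential moving average.
        envelope += smoothing * (abs(sample) - envelope)
        if active_since is None and envelope >= on_threshold:
            active_since = index            # onset: start capturing
        elif active_since is not None and envelope <= off_threshold:
            segments.append((active_since, index))
            active_since = None             # offset: stop capturing
    if active_since is not None:
        # Signal ended while still active; close the segment at the end.
        segments.append((active_since, index + 1))
    return segments
```

Using a lower off-threshold than on-threshold (hysteresis) keeps a brief dip in muscle activity from splitting one utterance into several recordings.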

Build from source

git clone https://github.com/novafabric/yazses
cd yazses
uv sync
uv run pytest tests/ -v   # 246 tests across all platforms

Platform-specific installers:

# macOS — produces dist/YazSes-<v>.dmg
./scripts/build-macos.sh

# Windows — produces dist/YazSes-<v>-windows-x64.exe
./scripts/build-windows.ps1

# Linux .deb
./scripts/build-deb.sh

CI builds the unsigned .dmg and .exe on every PR that touches the relevant code paths.

Troubleshooting

yazses doctor — first stop. Tells you what's missing on the current OS.
macOS: see docs/macos-install.md for Gatekeeper / Accessibility / Microphone.
Windows: see docs/windows-install.md for SmartScreen / antivirus / privacy.
Linux: confirm you're in the input group; check journalctl --user -u yazses.service -f.

License

Apache 2.0 — see LICENSE.

Project details

Maintainers

mskazemi

This version

0.4.1

May 17, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

yazses-0.4.1.tar.gz (253.7 kB view details)

Uploaded May 17, 2026 Source

Built Distribution

yazses-0.4.1-py3-none-any.whl (103.5 kB view details)

Uploaded May 17, 2026 Python 3

File details

Details for the file yazses-0.4.1.tar.gz.

File metadata

Download URL: yazses-0.4.1.tar.gz
Upload date: May 17, 2026
Size: 253.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yazses-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`4a3e92ecf117512ecfc7e8d7eb3de72f483ba31b972e7e9b702faca298d6f5e8`
MD5	`4738d476f7e9fe5fed549def1644afbd`
BLAKE2b-256	`31168ef62619420beed7542224139121ea734073cf7985fea39f94a1cce3f4b5`

Provenance

The following attestation bundles were made for yazses-0.4.1.tar.gz:

Publisher: release.yml on novafabric/yazses

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: yazses-0.4.1.tar.gz
- Subject digest: 4a3e92ecf117512ecfc7e8d7eb3de72f483ba31b972e7e9b702faca298d6f5e8
- Sigstore transparency entry: 1563952461
- Sigstore integration time: May 17, 2026
Source repository:
- Permalink: novafabric/yazses@584247c24b70ed22fcdd19b1b53ffd784f206bfa
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/novafabric
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@584247c24b70ed22fcdd19b1b53ffd784f206bfa
- Trigger Event: push

File details

Details for the file yazses-0.4.1-py3-none-any.whl.

File metadata

Download URL: yazses-0.4.1-py3-none-any.whl
Upload date: May 17, 2026
Size: 103.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for yazses-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e35b32e6d1de3ba9adf4caedd424a9d1fcbf4537d39c23ebcf68a872b1cfdbd8`
MD5	`4aae5859a85642ccd2232885e5ff8dfc`
BLAKE2b-256	`8a859a34b5758f424bfac29717f82d618f9b7264e053ea74aa5feb4c2b4aa4e1`

Provenance

The following attestation bundles were made for yazses-0.4.1-py3-none-any.whl:

Publisher: release.yml on novafabric/yazses

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: yazses-0.4.1-py3-none-any.whl
- Subject digest: e35b32e6d1de3ba9adf4caedd424a9d1fcbf4537d39c23ebcf68a872b1cfdbd8
- Sigstore transparency entry: 1563952462
- Sigstore integration time: May 17, 2026
Source repository:
- Permalink: novafabric/yazses@584247c24b70ed22fcdd19b1b53ffd784f206bfa
- Branch / Tag: refs/tags/v0.4.2
- Owner: https://github.com/novafabric
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@584247c24b70ed22fcdd19b1b53ffd784f206bfa
- Trigger Event: push
