A local, private Linux text-to-speech tool. Select text in any app, press a hotkey, hear it read by Kokoro-82M on your GPU.

Maintainers

Gusfromdk

Project description

Lexaloud

A local, private text-to-speech tool for Linux. Select text, press a hotkey, hear it read by a neural voice running on your own machine.

How it works

1. Select text in any application
2. Press a global hotkey (e.g., Ctrl+0)
3. Hear it spoken sentence by sentence, with pause / skip / rewind controls

Lexaloud runs a small daemon on your machine that synthesizes speech using Kokoro-82M, an open-weights neural voice model. Nothing leaves your computer — no cloud API, no account, no telemetry.

To hear what Kokoro sounds like before installing, try the live demo on Hugging Face.

Features

Global hotkey on any desktop — works on GNOME, KDE Plasma, Sway, Hyprland, XFCE, Cinnamon, and any window manager that supports custom keybindings. GNOME is the primary tested path with integrated tray + hotkey UI; other desktops bind the same CLI commands manually. See docs/hotkeys/.
MPRIS2 / media keys — desktop media keys, GNOME's top-bar media indicator, KDE's media widget, Bluetooth headphone buttons, and playerctl all control Lexaloud playback with zero configuration. Uses dbus-fast (optional dependency).
Floating overlay — an always-on-top sentence caption bar (off by default). Enable via [advanced] overlay = true in config.toml or the control window's Settings tab. Supports both gtk-layer-shell (wlroots/KWin) and X11/GNOME Wayland fallback.
XDG GlobalShortcuts portal — Wayland-native global hotkey binding on KDE Plasma 6+, Sway, and Hyprland via the org.freedesktop.portal.GlobalShortcuts portal. GNOME does not support this portal and continues using the gsettings path.
GPU-accelerated neural TTS — Kokoro-82M via kokoro-onnx on onnxruntime-gpu with NVIDIA CUDA. CPU fallback runs at ~10x real-time, which is fine for reading along.
Sentence-granularity streaming with bounded backpressure and cooperative cancellation. Pause, skip, rewind, or stop mid-article without audio clipping.
12 built-in voices — American and British, male and female, from warm to serious. The control window lets you preview and switch voices; see the full list in docs/models.md.
GTK3 tray indicator + control window — visible on any desktop that supports AppIndicator (GNOME with the ubuntu-appindicators extension, KDE, Budgie, etc.). Voice, speed, and hotkey settings. The CLI works without the tray on minimal setups.
Privacy-first — see the Privacy section.
Open-source — MIT-licensed code, Apache-2.0-licensed model weights. See THIRD_PARTY_LICENSES.md.
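
As a rough illustration of the "bounded backpressure and cooperative cancellation" feature listed above, the sketch below uses a size-limited asyncio queue between a synthesizer and a player. It is not Lexaloud's actual code: the function names, the placeholder "audio" strings, and the `stop_after` parameter are all assumptions made for the example.

```python
import asyncio


async def produce(sentences, queue):
    """Synthesize one sentence at a time; put() waits while the queue is full."""
    for sentence in sentences:
        chunk = f"audio({sentence})"  # placeholder for real synthesis
        await queue.put(chunk)        # the bounded queue provides backpressure
    await queue.put(None)             # sentinel: no more audio


async def run_pipeline(sentences, stop_after=None, maxsize=2):
    """Play chunks in order; optionally stop early after `stop_after` chunks."""
    queue = asyncio.Queue(maxsize=maxsize)
    producer = asyncio.create_task(produce(sentences, queue))
    played = []
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        played.append(chunk)          # stand-in for writing to the audio sink
        if stop_after is not None and len(played) >= stop_after:
            break                     # simulates the user pressing stop
    producer.cancel()                 # the producer is cancelled at an await point
    await asyncio.gather(producer, return_exceptions=True)
    return played
```

Because the queue holds at most `maxsize` chunks, synthesis can never run far ahead of playback, and stopping only has to discard a couple of pending chunks.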

Requirements

Requirement	Details
OS	Linux only. Tier 1: Ubuntu 24.04, Debian 13. Tier 2: Fedora 41, Arch, Mint, Pop!_OS. Not supported: Windows, macOS.
Init system	systemd (for the `--user` daemon unit). Non-systemd distros (Artix, Void) can run `lexaloud daemon` manually.
Python	3.11 or newer
GPU (optional)	NVIDIA with CUDA 12-compatible driver. AMD ROCm and Intel Arc are not yet supported — the daemon falls back to CPU, which runs at ~10x real-time and is fine for reading along.
Audio	PipeWire, PulseAudio, or bare ALSA (via PortAudio/`libportaudio2`). Most desktop Linux distros ship PipeWire by default.
Disk	~400 MB for model weights (downloaded once on first setup)
Desktop (optional)	GNOME for the integrated tray + hotkey UI. KDE, Sway, XFCE, Cinnamon, and others work via manual hotkey binding — see `docs/hotkeys/`. The CLI works headless.

Install

Ubuntu / Debian (Tier 1)

sudo apt install python3-venv wl-clipboard xclip libportaudio2 libnotify-bin \
                 python3-gi gir1.2-gtk-3.0 gir1.2-ayatanaappindicator3-0.1

git clone https://github.com/Gustavjiversen01/lexaloud.git
cd lexaloud
./scripts/install.sh

~/.local/share/lexaloud/venv/bin/lexaloud setup
systemctl --user daemon-reload
systemctl --user enable --now lexaloud.service

Then bind a hotkey — see docs/hotkeys/gnome.md or the walkthrough lexaloud setup prints.

Full walkthrough: docs/install/ubuntu-debian.md

Fedora (Tier 2)

sudo dnf install python3 python3-pip python3-gobject gtk3 \
                 wl-clipboard xclip portaudio libnotify

Then the same git clone → ./scripts/install.sh → lexaloud setup → systemctl flow. Full walkthrough: docs/install/fedora.md

Arch / Manjaro (Tier 2)

sudo pacman -S python python-gobject gtk3 wl-clipboard xclip portaudio libnotify

Then git clone → ./scripts/install.sh → lexaloud setup → systemctl. Full walkthrough: docs/install/arch.md

Other distros

The installer auto-detects your distro via /etc/os-release and prints the right package names if any are missing. For distros not in the table, file a PR against docs/install/.
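
The detection step can be pictured roughly as follows. This is a Python sketch under stated assumptions, not the installer's real logic (scripts/install.sh is a shell script): the function names and the ID_LIKE fallback are illustrative, and the mapping covers only the distro families named in the install sections above.

```python
def parse_os_release(text):
    """Parse KEY=value lines from os-release content into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


# Package managers for the distro families named in the install sections.
PACKAGE_MANAGERS = {
    "ubuntu": "apt", "debian": "apt",
    "fedora": "dnf",
    "arch": "pacman", "manjaro": "pacman",
}


def package_manager(info):
    """Match ID first, then each entry of ID_LIKE; None if nothing is known."""
    candidates = [info.get("ID", "")] + info.get("ID_LIKE", "").split()
    for name in candidates:
        if name in PACKAGE_MANAGERS:
            return PACKAGE_MANAGERS[name]
    return None
```

On a real system the input would come from reading /etc/os-release; derivatives such as Mint typically resolve through their ID_LIKE field.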

GPU backend

The installer detects NVIDIA via nvidia-smi and picks the right lockfile automatically. To force a backend:

./scripts/install.sh --backend cuda12   # NVIDIA GPU
./scripts/install.sh --backend cpu      # CPU only (AMD, Intel, or no GPU)
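
The selection rule described above could be sketched like this in Python. It assumes that "nvidia-smi is on PATH" is the whole detection test, which may be simpler than what the installer really does; the function name and the injectable `which` parameter are hypothetical.

```python
import shutil


def choose_backend(forced=None, which=shutil.which):
    """Return "cuda12" or "cpu", honoring an explicit --backend style override."""
    if forced is not None:
        if forced not in ("cuda12", "cpu"):
            raise ValueError(f"unknown backend: {forced!r}")
        return forced
    # Assumption: finding nvidia-smi on PATH is treated as "NVIDIA GPU present".
    return "cuda12" if which("nvidia-smi") else "cpu"
```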

Wayland users: read this

On GNOME Wayland (the default on Ubuntu 24.04), speak-selection may return empty for some apps (VS Code, Obsidian, Slack) because Electron apps don't always publish to the PRIMARY selection. The reliable workflow is:

1. Ctrl+C to copy the selection to the clipboard
2. Press your speak-clipboard hotkey

Both commands are in the CLI — bind whichever suits your workflow, or bind both to different keys. Details in docs/gotchas.md.
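
If you would rather bind a single key, one option is a small wrapper that tries the PRIMARY selection first and falls back to the clipboard when the selection comes back empty (exit code 2, per the CLI section). The wrapper below is a sketch, not something Lexaloud ships; it assumes `lexaloud` is on your PATH, and the `run` parameter exists only so the logic can be exercised without a daemon.

```python
import subprocess

EMPTY_SELECTION = 2  # exit code documented in the CLI section


def speak(run=subprocess.run):
    """Try PRIMARY first; fall back to CLIPBOARD when the selection is empty."""
    result = run(["lexaloud", "speak-selection"])
    if result.returncode == EMPTY_SELECTION:
        result = run(["lexaloud", "speak-clipboard"])
    return result.returncode
```

The fallback only helps if the text was copied with Ctrl+C beforehand; otherwise the clipboard may hold something unrelated.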

Not via `pip install`

pip install lexaloud does not give you a working installation. The TTS runtime requires a specific install sequence for kokoro-onnx and onnxruntime-gpu that pip cannot express in one command (the two packages share an internal directory and silently break each other if both are installed normally; see docs/design-rationale.md for the full story). scripts/install.sh is the only supported install path.

CLI

lexaloud speak-selection      # capture PRIMARY selection, speak it
lexaloud speak-clipboard      # capture CLIPBOARD (after Ctrl+C), speak it
lexaloud pause                # pause at the next sentence boundary
lexaloud resume
lexaloud toggle               # pause if speaking, resume if paused
lexaloud skip                 # skip the current sentence
lexaloud back                 # rewind one sentence
lexaloud stop                 # stop and clear the queue
lexaloud status               # daemon state as JSON
lexaloud download-models      # fetch model weights (~340 MB, once)
lexaloud setup                # first-time configuration walkthrough
lexaloud bug-report           # system diagnostics for filing issues
lexaloud daemon               # run the daemon (normally via systemd)

Exit codes: 0 success, 1 error, 2 empty selection, 3 daemon down, 4 oversized payload, 5 capture tool missing/timeout.
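
For scripts that wrap the CLI, the exit codes above map naturally onto an enum. The sketch below is illustrative: the member names and hint strings are this example's own wording, not identifiers from the Lexaloud source.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    SUCCESS = 0
    ERROR = 1
    EMPTY_SELECTION = 2
    DAEMON_DOWN = 3
    OVERSIZED_PAYLOAD = 4
    CAPTURE_TOOL_FAILED = 5  # capture tool missing or timed out


# Illustrative hints only; adjust to taste.
HINTS = {
    ExitCode.EMPTY_SELECTION: "Nothing selected; try Ctrl+C, then speak-clipboard.",
    ExitCode.DAEMON_DOWN: "Start it: systemctl --user start lexaloud.service",
    ExitCode.CAPTURE_TOOL_FAILED: "Check that wl-clipboard or xclip is installed.",
}


def describe(returncode):
    """Turn a lexaloud exit status into a short human-readable message."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        return f"unknown exit code {returncode}"
    name = code.name.lower().replace("_", " ")
    hint = HINTS.get(code)
    return f"{name}: {hint}" if hint else name
```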

Full reference: docs/cli-reference.md

Privacy

Lexaloud performs no telemetry. No text, metadata, or usage statistics are transmitted anywhere. The only outbound network calls are the one-time model downloads on first setup, fetched over HTTPS from the kokoro-onnx GitHub releases page and SHA256-verified against pins in src/lexaloud/models.py.
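
Checksum pinning of this kind generally looks like the sketch below. The function names are assumptions for illustration and are not taken from src/lexaloud/models.py; the pinned digest would come from wherever the project stores it.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Raise ValueError if the file's SHA256 does not match the pinned digest."""
    actual = sha256_of(path)
    if actual != expected_hex.lower():
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
```

The same helper works for checking a downloaded release file against the SHA256 digest published alongside it.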

The daemon listens on a Unix domain socket at $XDG_RUNTIME_DIR/lexaloud/lexaloud.sock (mode 0700 enforced by systemd's RuntimeDirectoryMode=). Only processes running as your user can reach it. There is no open TCP port.
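
To show what "HTTP over a Unix socket" means in practice, here is a minimal standard-library client sketch. The socket path follows the description above; the class name is made up for the example, and no daemon route is assumed because the endpoints are not documented in this README.

```python
import http.client
import os
import socket


def socket_path(env=os.environ):
    """Build the daemon socket path from XDG_RUNTIME_DIR."""
    return os.path.join(env["XDG_RUNTIME_DIR"], "lexaloud", "lexaloud.sock")


class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that dials a Unix domain socket instead of TCP."""

    def __init__(self, path, timeout=5.0):
        # "localhost" only fills the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self._path = path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self._path)
        self.sock = sock
```

A request would then be issued with the usual `conn.request(...)` and `conn.getresponse()` calls against whatever routes the daemon actually exposes.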

Selection text is never written to disk. Log entries that mention a sentence replace the content with a SHA-1 fingerprint + length, so journalctl never contains readable user text.
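
A redaction helper in this spirit might look like the following. The output format (`sha1=... len=...`) and the function name are assumptions for illustration; only the "SHA-1 fingerprint plus length" idea comes from the description above.

```python
import hashlib


def redact(sentence):
    """Return a log-safe token: SHA-1 hex digest plus character length."""
    digest = hashlib.sha1(sentence.encode("utf-8")).hexdigest()
    return f"sha1={digest} len={len(sentence)}"
```

The fingerprint lets two log lines about the same sentence be correlated without the text itself ever appearing in the journal.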

Known limitations (v0.3.0)

NVIDIA only for GPU acceleration — AMD ROCm and Intel Arc are not supported. CPU fallback works on any x86_64 Linux.
No karaoke word-level highlighting — deferred (Kokoro doesn't expose word timings).
No browser extension — deferred.
Sentence-level pause granularity — the last ~100 ms of the current sub-chunk may play out after pressing pause.
GNOME Wayland primary-selection gaps — some Electron apps don't publish to PRIMARY. Workaround: use speak-clipboard + Ctrl+C. See docs/gotchas.md.
GlobalShortcuts portal not supported on GNOME — GNOME 46/47 does not implement the XDG GlobalShortcuts portal. GNOME users continue using the gsettings-based hotkey path.

Full list: ROADMAP.md

Architecture

A FastAPI daemon (systemd --user) owns the TTS provider and audio sink. A thin CLI sends HTTP requests over the Unix socket. A GTK3 tray indicator polls daemon state for visual feedback.

Component diagram + data-flow walkthrough: docs/architecture.md. Design decisions: docs/design-rationale.md.

Tests

# Set up a dev environment (one-time)
python3 -m venv .venv && source .venv/bin/activate
pip install -e .[test]

# Run the suite
python -m pytest tests/ --ignore=tests/test_real_kokoro_smoke.py -q

206 tests, ~2.5 seconds. No GPU or audio device required — tests use FakeProvider + NullSink + ASGITransport.

There is also an optional integration test that uses the real Kokoro model and sounddevice (1 extra test, 207 total):

LEXALOUD_REAL_TTS=1 python -m pytest tests/test_real_kokoro_smoke.py -s

Contributing

See CONTRIBUTING.md. Pull requests should be signed off with git commit -s (DCO).

Please read CODE_OF_CONDUCT.md before participating.

Security vulnerabilities: use GitHub private vulnerability reporting rather than public issues. See SECURITY.md.

Acknowledgments

Kokoro-82M by hexgrad — the open-weights neural TTS model.
kokoro-onnx by thewh1teagle — the ONNX wrapper.
ONNX Runtime + NVIDIA CUDA for GPU-accelerated inference from Python.
phonemizer-fork, pysbd, and sounddevice.
The GNOME and freedesktop.org communities for GTK, libnotify, systemd-user, and AppIndicator.

Significant portions of this codebase were developed in collaboration with Claude (Anthropic) via Claude Code. Code review and final editorial decisions are the author's.

License

MIT. See LICENSE for the full text and THIRD_PARTY_LICENSES.md for runtime dependency disclosures (the TTS stack includes GPL-3.0 dynamic dependencies via phonemizer-fork → espeakng-loader → espeak-ng).

Release history

Version	Released
0.3.0 (this version)	Apr 13, 2026
0.2.1	Apr 13, 2026
0.2.0	Apr 12, 2026
0.1.1	Apr 12, 2026

Download files

Source Distribution

lexaloud-0.3.0.tar.gz (230.1 kB)

Uploaded Apr 13, 2026 Source

Built Distribution

lexaloud-0.3.0-py3-none-any.whl (89.6 kB)

Uploaded Apr 13, 2026 Python 3

File details

Details for the file lexaloud-0.3.0.tar.gz.

File metadata

Download URL: lexaloud-0.3.0.tar.gz
Upload date: Apr 13, 2026
Size: 230.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lexaloud-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`3ae9b9efc3bd30895a21183c84b8e2579c589328707b65c6d75c24de2327a63a`
MD5	`87880ef4b385513c49c9afc4112f2c0b`
BLAKE2b-256	`8b16e58734144de0acaf744823f0797c582ee97b009c1e44f6aa9b3e5aa14544`

Provenance

The following attestation bundles were made for lexaloud-0.3.0.tar.gz:

Publisher: release.yml on Gustavjiversen01/lexaloud

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lexaloud-0.3.0.tar.gz
- Subject digest: 3ae9b9efc3bd30895a21183c84b8e2579c589328707b65c6d75c24de2327a63a
- Sigstore transparency entry: 1287315586
- Sigstore integration time: Apr 13, 2026
Source repository:
- Permalink: Gustavjiversen01/lexaloud@15174d85368c7ae3da538d9e5c87c789a2eebcb6
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/Gustavjiversen01
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@15174d85368c7ae3da538d9e5c87c789a2eebcb6
- Trigger Event: push

File details

Details for the file lexaloud-0.3.0-py3-none-any.whl.

File metadata

Download URL: lexaloud-0.3.0-py3-none-any.whl
Upload date: Apr 13, 2026
Size: 89.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for lexaloud-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fd19c3bb611738aa6de6ed53f15c431f7fdb26b95b4b3a6ab27c889250ca1ec6`
MD5	`47b379bcdf4cacc936df99ae506e0bb5`
BLAKE2b-256	`56ab128687d3a90502f2e1eec81372afeb54e744bc50915e4e86df0daeee3f9e`

Provenance

The following attestation bundles were made for lexaloud-0.3.0-py3-none-any.whl:

Publisher: release.yml on Gustavjiversen01/lexaloud

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: lexaloud-0.3.0-py3-none-any.whl
- Subject digest: fd19c3bb611738aa6de6ed53f15c431f7fdb26b95b4b3a6ab27c889250ca1ec6
- Sigstore transparency entry: 1287315632
- Sigstore integration time: Apr 13, 2026
Source repository:
- Permalink: Gustavjiversen01/lexaloud@15174d85368c7ae3da538d9e5c87c789a2eebcb6
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/Gustavjiversen01
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@15174d85368c7ae3da538d9e5c87c789a2eebcb6
- Trigger Event: push
