Voice dictation for Ubuntu/Wayland — hotkey-driven speech-to-text that types anywhere

fala

fala means "speak" in Portuguese.

Voice dictation for Ubuntu/Wayland. Press a hotkey to start recording, press it again to stop — your speech is transcribed via the OpenAI API and typed wherever your cursor is.

Works system-wide: terminals, browsers, text editors, anything.

See docs/architecture.md for a full diagram of how the components connect.

How it works

  1. Press your hotkey → recording starts (parecord)
  2. Speak
  3. Press hotkey again → recording stops, audio is sent to OpenAI for transcription, text is typed at the cursor via ydotool
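
The three steps above can be sketched as a small toggle object. This is a minimal single-process sketch, not fala's actual implementation: the class name `DictationToggle`, the temporary WAV path, and the injected `transcribe` callable are assumptions made for illustration. Only `parecord` and `ydotool` come from the description above.

```python
from __future__ import annotations

import subprocess
import tempfile
from pathlib import Path
from typing import Any, Callable


def record_command(wav_path: Path) -> list[str]:
    """Command line that records the default PulseAudio source to a WAV file."""
    return ["parecord", "--file-format=wav", str(wav_path)]


def type_command(text: str) -> list[str]:
    """Command line that asks ydotool to type text into the focused window."""
    return ["ydotool", "type", text]


class DictationToggle:
    """Hypothetical toggle: the first call starts recording, the next one stops and types."""

    def __init__(
        self,
        transcribe: Callable[[Path], str],
        wav_path: Path | None = None,
        spawn: Callable[..., Any] = subprocess.Popen,
        run: Callable[..., Any] = subprocess.run,
    ) -> None:
        self._transcribe = transcribe
        # Assumed location; the real tool may store recordings elsewhere.
        self._wav_path = wav_path or Path(tempfile.gettempdir()) / "fala-sketch.wav"
        self._spawn = spawn
        self._run = run
        self._recorder: Any = None

    def toggle(self) -> str | None:
        """Start recording, or stop, transcribe, and type the result."""
        if self._recorder is None:
            self._recorder = self._spawn(record_command(self._wav_path))
            return None
        # Stop the recorder and wait so the WAV file is fully written.
        self._recorder.terminate()
        self._recorder.wait()
        self._recorder = None
        text = self._transcribe(self._wav_path)
        if text:
            self._run(type_command(text), check=True)
        return text
```

A real hotkey launches a fresh process on every press, so the "am I recording?" state would have to live outside the process (for example in a PID file). The sketch keeps it in memory only to show the order of operations.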

System requirements

  • Python 3.11+
  • uv

Install the required system packages:

sudo apt install -y ffmpeg portaudio19-dev pulseaudio-utils

ydotool injects input through the kernel's uinput device, so your user needs access to it. Add your user to the input group and create a udev rule that grants that group access to uinput:

sudo usermod -aG input $USER
echo 'KERNEL=="uinput", MODE="0660", GROUP="input"' \
    | sudo tee /etc/udev/rules.d/60-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger

Then log out and back in for the group change to take effect.
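
To confirm that the group change and udev rule took effect, a quick check along these lines may help. It is a sketch that is not part of fala: the function name `uinput_access_report` is made up for illustration, and it assumes the device node is `/dev/uinput`.

```python
import grp
import os


def uinput_access_report(device: str = "/dev/uinput", group: str = "input") -> dict[str, bool]:
    """Report whether the current process looks able to write to the uinput device."""
    try:
        gid = grp.getgrnam(group).gr_gid
        # os.getgroups() reflects the groups of the running session, which is
        # why a fresh login is needed after usermod.
        in_group = gid in os.getgroups() or gid == os.getegid()
    except KeyError:
        # The group does not exist on this system.
        in_group = False
    return {
        "device_exists": os.path.exists(device),
        "in_group": in_group,
        "device_writable": os.access(device, os.W_OK),
    }


if __name__ == "__main__":
    for check, ok in uinput_access_report().items():
        print(f"{check}: {'ok' if ok else 'MISSING'}")
```

If `in_group` is still false after running `usermod`, the session most likely predates the group change.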

ydotool daemon

The Ubuntu apt package for ydotool is too old to include ydotoold. Build and install the current version from source:

bash scripts/install_ydotool.sh

This clones, builds, installs, and starts the daemon as a systemd user service. Safe to re-run.

Installation

Build and install fala as a global CLI tool using a wheel:

uv build
uv tool install --force dist/*.whl

This makes fala available in your PATH everywhere, no virtualenv activation needed.

To update after pulling changes:

uv build
uv tool install --force dist/*.whl

API key setup

Store your OpenAI API key in the GNOME Keyring (encrypted at rest, unlocked automatically at login):

fala setup

You will be prompted to enter your key. It is stored securely and never written to disk in plaintext.

Alternatively, set it as an environment variable (e.g. for CI or headless use):

export OPENAI_API_KEY="sk-..."

The env var takes precedence over the keyring if both are set.
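
The precedence rule can be illustrated with a small resolver. This is a sketch under assumptions, not fala's code: the function name `resolve_api_key` is hypothetical, and the keyring lookup is passed in as a callable because the service and entry names fala uses are not documented here.

```python
from __future__ import annotations

import os
from typing import Callable, Mapping


def resolve_api_key(
    keyring_lookup: Callable[[], str | None],
    env: Mapping[str, str] | None = None,
) -> str | None:
    """Return the API key, preferring OPENAI_API_KEY over the keyring."""
    env = os.environ if env is None else env
    from_env = env.get("OPENAI_API_KEY")
    # Assumption: an empty variable is treated the same as an unset one.
    if from_env:
        return from_env
    return keyring_lookup()


# With the third-party `keyring` package, the lookup could look like this
# (the service and entry names below are placeholders, not fala's real ones):
#   import keyring
#   key = resolve_api_key(lambda: keyring.get_password("example-service", "example-entry"))
```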

Hotkey setup

Install/update helper scripts directly:

fala install-scripts

Or via guided setup:

fala setup

Register a GNOME keyboard shortcut. The script will ask you to physically press the key combination you want:

bash scripts/setup_keybinding.sh

To remove the shortcut:

bash scripts/setup_keybinding.sh --revert

fala setup is idempotent and safe to re-run.

CLI usage

Record from microphone (press Enter to start, press Enter to stop):

fala record

Transcribe an existing audio file:

fala transcribe path/to/audio.wav
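
For comparison, a direct call to the OpenAI transcription endpoint from Python might look like the sketch below. It assumes the official `openai` package (version 1 or later) is installed and that `OPENAI_API_KEY` is set; the model name `whisper-1` is an assumption, since this README does not say which model fala uses.

```python
from pathlib import Path


def transcribe_file(audio_path: str, model: str = "whisper-1") -> str:
    """Send one audio file to the OpenAI transcription API and return the text."""
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such audio file: {path}")

    # Imported lazily so the path check above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with path.open("rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```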

Live streaming transcription to terminal:

fala stream

By default, fala stream prints transcript deltas directly in the terminal and redirects noisy stderr output (for example ALSA/JACK warnings) to ~/.config/fala/stream.log. Use --type-into-active-window to also inject text into the focused GUI input via ydotool.
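
ALSA and JACK warnings are written by native libraries straight to file descriptor 2, so replacing `sys.stderr` in Python is not enough to hide them. One common way to redirect them is at the descriptor level, sketched below. This shows the general technique and is not necessarily how fala does it; the context manager name `stderr_to_file` is made up.

```python
import os
import sys
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def stderr_to_file(log_path: Path):
    """Temporarily point file descriptor 2 at log_path, then restore it."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    sys.stderr.flush()
    saved_fd = os.dup(2)  # keep a handle on the original stderr
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.dup2(log_fd, 2)  # native libraries now write to the log file
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved_fd, 2)  # restore the original stderr
        os.close(saved_fd)
        os.close(log_fd)


if __name__ == "__main__":
    with stderr_to_file(Path.home() / ".config" / "fala" / "stream.log"):
        os.write(2, b"this line goes to the log, not the terminal\n")
```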

Create a high-clarity test fixture recording and generate a pytest integration test with the expected transcript baked into the test:

fala create-test-fixture clear_en_sentence --text "The quick brown fox jumps over the lazy dog."

Run integration tests (including realtime streaming integration with local WAV fixture):

just test-integration

Manual equivalent for all integration tests:

RUN_STREAMING_INTEGRATION=1 uv run pytest -m integration

Notes:

  • The integration test uses tests/fixtures/clear_en_sentence.wav.
  • It requires provider API keys to be available (OPENAI_API_KEY and/or ASSEMBLYAI_API_KEY); the just command loads both from keyring automatically if stored via fala setup.

Show usage analytics from your transcription log:

fala stats
fala stats --weeks 16
fala stats --typing-wpm 45 --speaking-wpm 130

fala stats prints rich tables with weekly usage, average characters per transcription, provider mix, streaks, and activity trends. It also estimates time saved by speaking vs typing.
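
The time-saved estimate can be reasoned about with simple arithmetic: convert characters to words, then compare how long those words take to type versus to speak. The function below is one plausible formula, offered as an assumption; the exact calculation fala stats uses is not described here.

```python
def minutes_saved(
    characters: int,
    typing_wpm: float,
    speaking_wpm: float,
    avg_characters_per_word: float = 5.0,
) -> float:
    """Estimate minutes saved by dictating instead of typing the same text."""
    words = characters / avg_characters_per_word
    typing_minutes = words / typing_wpm
    speaking_minutes = words / speaking_wpm
    return typing_minutes - speaking_minutes


if __name__ == "__main__":
    # 6000 characters at 5 characters per word is 1200 words:
    # 30 minutes typed at 40 wpm versus 10 minutes spoken at 120 wpm.
    print(minutes_saved(6000, typing_wpm=40, speaking_wpm=120))
```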

Set defaults once (persisted in ~/.config/fala/config.toml):

fala config --typing-wpm 40 --speaking-wpm 120 --avg-characters-per-word 5
fala config --assemblyai-model u3-rt-pro

These defaults are used by fala stats unless you override them with command flags.

Download files

Source Distribution

fala-0.0.1.tar.gz (34.8 kB)

Uploaded Source

Built Distribution

fala-0.0.1-py3-none-any.whl (38.8 kB)

Uploaded Python 3

File details

Details for the file fala-0.0.1.tar.gz.

File metadata

  • Download URL: fala-0.0.1.tar.gz
  • Size: 34.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fala-0.0.1.tar.gz
Algorithm Hash digest
SHA256 471737ffd24c288c51c287f82927b5332784743e1176bb6df68221b7240ec5d7
MD5 758ca9e9959bc1851ecf10c44b0ce521
BLAKE2b-256 36e36c024539dbb34a272e3728d6811b165d8c89b7bba90c498cbc59df695f94

Provenance

The following attestation bundles were made for fala-0.0.1.tar.gz:

Publisher: publish.yml on ivarurdalen/fala

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file fala-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: fala-0.0.1-py3-none-any.whl
  • Size: 38.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for fala-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4aad95841647d64c23a0c08e9572d9f17370ab19262bc3b989937bf29153b409
MD5 6ac5a4113f13a2323ce820e5dc743bd0
BLAKE2b-256 5a6b9b16f0e7b28ce52860908515f473a11c629f387cb76fce413c4558da1183

Provenance

The following attestation bundles were made for fala-0.0.1-py3-none-any.whl:

Publisher: publish.yml on ivarurdalen/fala
