
Dead simple speech-to-text

Project description

Quickstart

This project is a simple voice dictation (speech-to-text) tool that runs completely on device. It uses openai-whisper models for speech recognition and can optionally post-process the transcribed text with a local LLM (currently any Ollama model).

Simply install, run and hold the hotkey to speak. The transcribed text will be pasted into the active window. Say 'help' to view voice commands.

Installation

pip install sttpy --extra-index-url https://download.pytorch.org/whl/cu124
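
The `--extra-index-url` points pip at CUDA 12.4 PyTorch wheels. If you have no NVIDIA GPU, a plain PyPI install should also work, with Whisper inference running on CPU (assumption: the default CPU build of torch is sufficient):

```shell
# CPU-only install; Whisper will run on the CPU (slower, but no CUDA needed)
pip install sttpy
```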

Usage

Usage: stt [OPTIONS]

  Voice dictation (speech-to-text) completely on device.

Options:
  --stt TEXT         Whisper model name (tiny.en, base.en, turbo, ...)
  --hotkey TEXT      Hotkey to hold while speaking
  --debug            Enable debug mode
  --post-processing  Enable LLM post-processing of transcribed text
  --type-mode        Use keystrokes instead of pasting text
  --help             Show this message and exit.

Examples

Run with debug mode enabled, using the openai-whisper turbo model, and the hold-to-speak hotkey f12:

stt --debug --stt turbo --hotkey f12

Prompt for a hotkey to hold while speaking:

stt --hotkey prompt
Enter the hotkey you want to use followed by 'escape':
Hotkey: space. Press escape to confirm.
Hotkey: ctrl+space. Press escape to confirm.
Hotkey confirmed: ctrl+space
Hotkey: ctrl+space
2025-01-16 22:29:00 - INFO - Loading whisper model 'tiny.en' on cuda...
2025-01-16 22:29:00 - INFO - Press and hold 'ctrl+space' to speak

Commands

There are a few commands built into the voice dictation interface: just hold the hotkey and say 'help' to view them.

Post-processing

Note: Post-processing is not enabled by default, since it adds latency and is still under development. To enable it, use the --post-processing flag. You will need a local Ollama server running with the model llama3.2:3b-instruct-q5_K_M available.
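
For context, post-processing through Ollama boils down to a single HTTP call to its generate endpoint. The sketch below is illustrative only: the function names and prompt wording are assumptions, not sttpy's actual code. Make the model available first with `ollama pull llama3.2:3b-instruct-q5_K_M`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(text: str, model: str = "llama3.2:3b-instruct-q5_K_M") -> dict:
    """Build an Ollama /api/generate request body (prompt wording is illustrative)."""
    return {
        "model": model,
        "prompt": "Fix punctuation and casing; output only the corrected text:\n" + text,
        "stream": False,  # ask for one complete response instead of a token stream
    }

def post_process(text: str) -> str:
    """Send a raw transcript to the local Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```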

Project details


Download files

Download the file for your platform.

Source Distribution

sttpy-0.1.1.tar.gz (162.3 kB)

Uploaded Source

Built Distribution


sttpy-0.1.1-py3-none-any.whl (11.5 kB)

Uploaded Python 3

File details

Details for the file sttpy-0.1.1.tar.gz.

File metadata

  • Download URL: sttpy-0.1.1.tar.gz
  • Upload date:
  • Size: 162.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.1.tar.gz

  • SHA256: 491df21570bbd8a4e7da68194f0d1abb69b2089d1f5e4a949f2685b26867b0c1
  • MD5: c038a3a624b249b6f365941302c1e298
  • BLAKE2b-256: ab7f258440f680e71f9887ee69c4953ad891702942ddfee0fb900975e54644f9

See more details on using hashes here.
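
To check a downloaded file against the digests above, a small Python sketch (the filename is assumed to sit in the current directory):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the published digest for the sdist:
EXPECTED = "491df21570bbd8a4e7da68194f0d1abb69b2089d1f5e4a949f2685b26867b0c1"
# assert sha256_of("sttpy-0.1.1.tar.gz") == EXPECTED
```

pip can also enforce digests itself via hash-checking mode (`pip install --require-hashes -r requirements.txt`).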

File details

Details for the file sttpy-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: sttpy-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 11.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.1-py3-none-any.whl

  • SHA256: 54bad55cbf85ac43d8172f0344d0087c12dec68771ea74c9685f4ed7ac65e356
  • MD5: 22d92b3baf4435262ec185af2a15abf5
  • BLAKE2b-256: 1f39b67f6713d49992f248d6c162fe24ff9d30d3761561a2143f4767298f26d3

