
Dead simple speech-to-text

Project description

Quickstart

This project is a simple voice dictation (speech-to-text) tool that runs completely on device. It uses openai-whisper models for speech recognition and can optionally use a local LLM to post-process the transcribed text (currently supporting all Ollama models).

Simply install, run, and hold the hotkey to speak. The transcribed text is pasted into the active window. Say 'help' to view the available voice commands.

Installation

# CPU only
pip install sttpy

# If you have a GPU and want to use the CUDA version
pip install sttpy[cuda] --extra-index-url https://download.pytorch.org/whl/cu124
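If you are unsure whether the [cuda] extra will actually be usable on your machine, a quick check like the one below can help before reinstalling. This is a sketch, not part of sttpy: the helper name torch_cuda_available is illustrative, and it only assumes that the CUDA install pulls in PyTorch (which the cu124 index URL implies).

```python
# Hedged sketch: check whether a CUDA-capable PyTorch is importable.
# torch_cuda_available is an illustrative helper, not a sttpy function.
import importlib.util

def torch_cuda_available():
    """Return True only if torch is installed and sees a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False  # no torch at all: the CPU-only install is the safe choice
    import torch
    return torch.cuda.is_available()
```

If this returns False, the plain `pip install sttpy` (CPU-only) install is the safer choice.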

Usage

Usage: stt [OPTIONS]

  Voice dictation (speech-to-text) completely on device.

Options:
  --stt TEXT         Whisper model name (tiny.en, base.en, turbo, ...)
  --hotkey TEXT      Hotkey to hold while speaking
  --debug            Enable debug mode
  --post-processing  Enable LLM post-processing of transcribed text
  --type-mode        Use keystrokes instead of pasting text
  --help             Show this message and exit.

Examples

Run with debug mode enabled, using the openai-whisper turbo model, and the hold-to-speak hotkey f12:

stt --debug --stt turbo --hotkey f12

Prompt for a hotkey to hold while speaking:

stt --hotkey prompt
Enter the hotkey you want to use followed by 'escape':
Hotkey: space. Press escape to confirm.
Hotkey: ctrl+space. Press escape to confirm.
Hotkey confirmed: ctrl+space
Hotkey: ctrl+space
2025-01-16 22:29:00 - INFO - Loading whisper model 'tiny.en' on cuda...
2025-01-16 22:29:00 - INFO - Press and hold 'ctrl+space' to speak

Commands

There are a few commands built into the voice dictation interface. Hold the hotkey and say 'help' to list them.

Post-processing

Note: Post-processing is not enabled by default since it adds latency and is still under development. To enable it, use the --post-processing flag. You will need a local ollama server running and the model llama3.2:3b-instruct-q5_K_M available.
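As a rough illustration of what such a post-processing step can look like, the sketch below sends a transcript to a local Ollama server. Only the endpoint, payload fields, and model name come from Ollama's documented /api/generate API and the note above; the prompt wording and the function names build_request and post_process are assumptions for illustration, not sttpy's actual implementation.

```python
# Hedged sketch of LLM post-processing against a local Ollama server.
# Endpoint and payload shape follow Ollama's /api/generate API; the prompt
# and helper names are illustrative only.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.2:3b-instruct-q5_K_M"

def build_request(transcript):
    """Build the JSON payload asking the model to clean up a transcript."""
    payload = {
        "model": MODEL,
        "prompt": f"Fix punctuation and casing, change nothing else:\n{transcript}",
        "stream": False,  # ask for a single JSON response, not a stream
    }
    return json.dumps(payload).encode()

def post_process(transcript):
    """Send the transcript to the local Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(transcript),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling post_process requires the Ollama server to be running and the model pulled, e.g. via `ollama pull llama3.2:3b-instruct-q5_K_M`.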

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • sttpy-0.1.2.tar.gz (181.4 kB, uploaded as Source)

Built Distribution

  • sttpy-0.1.2-py3-none-any.whl (11.6 kB, uploaded for Python 3)

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file sttpy-0.1.2.tar.gz.

File metadata

  • Download URL: sttpy-0.1.2.tar.gz
  • Upload date:
  • Size: 181.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.2.tar.gz

  • SHA256: 36282a8617e4801d94a6875176e58b8fdf988da15a758be3695f4c462b508ccb
  • MD5: 91b3cee125610122a4fb409c8dde6dfb
  • BLAKE2b-256: 3ae9156157a6608be6c2340ee80fef216556789bd9cefa37b9031dab78b4224f

See more details on using hashes here.
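Checking a downloaded file against the published digest can be done with the standard library alone. The snippet below is a minimal sketch: the helper name verify_sha256 is illustrative, and the filename and digest in the usage comment are the ones listed above.

```python
# Minimal sketch: verify a downloaded file against its published SHA256 digest.
# verify_sha256 is an illustrative helper name, not part of sttpy.
import hashlib

def verify_sha256(path, expected_hex, chunk_size=8192):
    """Stream the file in chunks and compare its SHA256 to the expected hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex.lower()

# Usage, after downloading the sdist from PyPI:
# verify_sha256("sttpy-0.1.2.tar.gz",
#               "36282a8617e4801d94a6875176e58b8fdf988da15a758be3695f4c462b508ccb")
```

Streaming in chunks keeps memory use constant regardless of file size, which matters more for large wheels than for this small package.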

File details

Details for the file sttpy-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: sttpy-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 11.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.2-py3-none-any.whl

  • SHA256: 48dceac6abf53db34156a7e16fa005d08fd92abf1b0bd39e011d31786f664118
  • MD5: db9e45add7e4f14983d9b13b8add6784
  • BLAKE2b-256: 43100f869b7253ef5987ad194ea96f7d080026b026f6ad5298320704354198c1

