Dead simple speech-to-text

Quickstart

This project is a simple voice dictation (speech-to-text) tool that runs completely on device. It uses openai-whisper models for speech recognition and can optionally use a local LLM for post-processing of the transcribed text (currently supporting all Ollama models).

Simply install, run, and hold the hotkey to speak. The transcribed text is pasted into the active window. Say 'help' to view the available voice commands.

Installation

# CPU only
pip install sttpy

# If you have a GPU and want to use the CUDA version
pip install sttpy[cuda] --extra-index-url https://download.pytorch.org/whl/cu124
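
After installing the CUDA extra, you can check whether the GPU will actually be used. This is a minimal sketch assuming the Whisper backend runs on PyTorch (the wheel index above is PyTorch's); it is not part of sttpy itself:

```python
# Report which device Whisper would load on. Assumes a PyTorch backend,
# which is what the CUDA wheel index above provides.
import importlib.util

if importlib.util.find_spec("torch") is not None:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
else:
    # torch not installed yet: only CPU mode is possible
    device = "cpu"

print(f"Whisper would load on: {device}")
```

If this prints "cpu" despite a CUDA-capable GPU, the CPU-only torch wheel is likely installed; reinstall with the --extra-index-url shown above.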

Usage

Usage: stt [OPTIONS]

  Voice dictation (speech-to-text) completely on device.

Options:
  --stt TEXT           Whisper model name (tiny.en, base.en, turbo, ...)
  --hotkey TEXT        Hotkey to hold while speaking
  --debug              Enable debug mode
  --post-processing    Enable LLM post-processing of transcribed text
  --type-mode          Use keystrokes instead of pasting text
  --paste-delay FLOAT  Delay between copying to clipboard and pasting
  --help               Show this message and exit.

Examples

Run with debug mode enabled, using the openai-whisper turbo model, and the hold-to-speak hotkey f12:

stt --debug --stt turbo --hotkey f12

Prompt for a hotkey to hold while speaking:

stt --hotkey prompt
Enter the hotkey you want to use followed by 'escape':
Hotkey: space. Press escape to confirm.
Hotkey: ctrl+space. Press escape to confirm.
Hotkey confirmed: ctrl+space
Hotkey: ctrl+space
2025-01-16 22:29:00 - INFO - Loading whisper model 'tiny.en' on cuda...
2025-01-16 22:29:00 - INFO - Press and hold 'ctrl+space' to speak

Commands

A few commands are built into the voice dictation interface. Hold the hotkey and say 'help' to list them.

Post-processing

Note: Post-processing is disabled by default, since it adds latency and is still under development. To enable it, use the --post-processing flag. You will need a local Ollama server running with the model llama3.2:3b-instruct-q5_K_M available.
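
Before enabling the flag, you can sanity-check that the server is reachable and the model has been pulled. A minimal sketch using Ollama's /api/tags endpoint on its default port 11434; the helper name ollama_has_model is illustrative, not part of sttpy:

```python
# Check that a local Ollama server is up and a given model is pulled.
# Assumes Ollama's default port 11434 and its /api/tags listing endpoint.
import json
import urllib.request
from urllib.error import URLError

MODEL = "llama3.2:3b-instruct-q5_K_M"

def ollama_has_model(model: str, host: str = "http://localhost:11434") -> bool:
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            tags = json.load(resp)
    except (URLError, OSError):
        return False  # server not running or unreachable
    return any(m.get("name") == model for m in tags.get("models", []))

if not ollama_has_model(MODEL):
    print(f"Run 'ollama pull {MODEL}' and make sure the Ollama server is up.")
```

If the model is missing, `ollama pull llama3.2:3b-instruct-q5_K_M` downloads it.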

Download files

Download the file for your platform.

Source Distribution

sttpy-0.1.3.tar.gz (181.0 kB)

Uploaded Source

Built Distribution

sttpy-0.1.3-py3-none-any.whl (11.7 kB)

Uploaded Python 3

File details

Details for the file sttpy-0.1.3.tar.gz.

File metadata

  • Download URL: sttpy-0.1.3.tar.gz
  • Upload date:
  • Size: 181.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.3.tar.gz

Algorithm    Hash digest
SHA256       5774e386c0786f6f6ae8ae469a891e39781afb6923538a3693bbb4d8d045926f
MD5          4600a5bff5be6a6904b895f69cefb175
BLAKE2b-256  10b1c1c3808ac03ea09f4fc5269ce27471acf20fb63ef6b639773bc17276b407
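
To verify a manually downloaded file against the digests above, you can hash it locally with Python's standard library. A minimal sketch; the helper name sha256_of is illustrative:

```python
# Compute the hex SHA-256 digest of a file, reading in chunks so large
# archives don't need to fit in memory at once.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare sha256_of("sttpy-0.1.3.tar.gz") against the SHA256 row above;
# a mismatch means the download is corrupt or has been tampered with.
```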


File details

Details for the file sttpy-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: sttpy-0.1.3-py3-none-any.whl
  • Upload date:
  • Size: 11.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       2f1ae3d79cde439a13ed17ef5137e0eeb824980aa517aea8b77609eb20811044
MD5          d16c6089da0d7a41808c616ff24de50f
BLAKE2b-256  400b86f256711d8c0778b8bb979f4a3aeacebea51a80a1b832a2bea2bf3d0d36

