Dead simple speech-to-text

Project description

Quickstart

This project is a simple voice dictation (speech-to-text) tool that runs completely on device. It uses openai-whisper models for speech recognition and can optionally post-process the transcribed text with a local LLM (all Ollama models are currently supported).

Simply install, run, and hold the hotkey to speak. The transcribed text is pasted into the active window. Say 'help' to view the available voice commands.

Installation

pip install sttpy --extra-index-url https://download.pytorch.org/whl/cu124

Usage

Usage: stt [OPTIONS]

  Voice dictation (speech-to-text) completely on device.

Options:
  --stt TEXT         Whisper model name (tiny.en, base.en, turbo, ...)
  --hotkey TEXT      Hotkey to hold while speaking
  --debug            Enable debug mode
  --post-processing  Enable LLM post-processing of transcribed text
  --type-mode        Use keystrokes instead of pasting text
  --help             Show this message and exit.
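
The option handling above can be sketched roughly in Python. This is an illustrative reconstruction with argparse, not the tool's actual parser; the defaults shown are assumptions (the session below suggests tiny.en and ctrl+space, but they are not documented as defaults):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the help text above; the defaults here are guesses,
    # not values taken from the real tool.
    parser = argparse.ArgumentParser(
        prog="stt",
        description="Voice dictation (speech-to-text) completely on device.",
    )
    parser.add_argument("--stt", default="tiny.en",
                        help="Whisper model name (tiny.en, base.en, turbo, ...)")
    parser.add_argument("--hotkey", default="ctrl+space",
                        help="Hotkey to hold while speaking")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug mode")
    parser.add_argument("--post-processing", action="store_true",
                        help="Enable LLM post-processing of transcribed text")
    parser.add_argument("--type-mode", action="store_true",
                        help="Use keystrokes instead of pasting text")
    return parser

args = build_parser().parse_args(["--debug", "--stt", "turbo", "--hotkey", "f12"])
print(args.stt, args.hotkey, args.debug)  # → turbo f12 True
```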

Examples

Run with debug mode enabled, using the openai-whisper turbo model, and the hold-to-speak hotkey f12:

stt --debug --stt turbo --hotkey f12

Prompt for a hotkey to hold while speaking:

stt --hotkey prompt
Enter the hotkey you want to use followed by 'escape':
Hotkey: space. Press escape to confirm.
Hotkey: ctrl+space. Press escape to confirm.
Hotkey confirmed: ctrl+space
Hotkey: ctrl+space
2025-01-16 22:29:00 - INFO - Loading whisper model 'tiny.en' on cuda...
2025-01-16 22:29:00 - INFO - Press and hold 'ctrl+space' to speak
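
The prompt loop in the session above can be modeled as a small pure function: track which keys are held, report the current combination, and let escape confirm the last one shown. This is a hypothetical sketch of that behavior, not the tool's actual implementation:

```python
def confirm_hotkey(events):
    """Reduce a stream of (key, is_press) events to a confirmed hotkey.

    Each press updates the displayed combination (e.g. "ctrl+space");
    pressing 'escape' confirms the last combination shown. Illustrative
    helper only, mirroring the prompt session above.
    """
    held, combo = [], None
    for key, is_press in events:
        if is_press and key == "escape":
            return combo  # escape confirms the last combination shown
        if is_press:
            if key not in held:
                held.append(key)
            combo = "+".join(held)  # what the prompt would display
        elif key in held:
            held.remove(key)
    return None  # stream ended without confirmation

combo = confirm_hotkey([
    ("space", True), ("space", False),  # Hotkey: space
    ("ctrl", True), ("space", True),    # Hotkey: ctrl+space
    ("escape", True),                   # confirm
])
print(combo)  # → ctrl+space
```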

Commands

There are a few commands built into the voice dictation interface. Hold the hotkey and say 'help' to list them.

Post-processing

Note: Post-processing is disabled by default, since it adds latency and is still under development. To enable it, use the --post-processing flag. You will need a local ollama server running with the model llama3.2:3b-instruct-q5_K_M available.
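
As a sketch of what such a post-processing call could look like: ollama's standard HTTP API exposes POST /api/generate on port 11434 with model, prompt, and stream fields. The prompt wording below is an assumption for illustration, not the tool's actual prompt:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # ollama's default port

def build_request(transcript: str) -> urllib.request.Request:
    """Build (but do not send) a post-processing request for a transcript."""
    payload = {
        "model": "llama3.2:3b-instruct-q5_K_M",
        # Illustrative prompt; the tool's real prompt is not documented here.
        "prompt": f"Fix punctuation and casing, change nothing else:\n{transcript}",
        "stream": False,  # ask for one complete JSON response
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),  # data= makes this a POST
        headers={"Content-Type": "application/json"},
    )

req = build_request("hello world this is a test")
print(req.full_url)  # → http://localhost:11434/api/generate
```

Sending it with urllib.request.urlopen(req) returns a JSON body whose "response" field holds the cleaned-up text.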

Project details


Download files

Download the file for your platform.

Source Distribution

sttpy-0.1.0.tar.gz (135.0 kB)

Built Distribution

sttpy-0.1.0-py3-none-any.whl (11.5 kB)

File details

Details for the file sttpy-0.1.0.tar.gz.

File metadata

  • Download URL: sttpy-0.1.0.tar.gz
  • Size: 135.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.0.tar.gz

  • SHA256: 656b1b1f89bf5a3d963db5198a756ecba19afefc6f5f3e04de5fd25edc91c3bf
  • MD5: 0cb5d30794cd3b4d4315482875c0f751
  • BLAKE2b-256: cd60d7d485c3c174e1e5653b41d9bca92194246128bcfc4971ae0d662ef991e4

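
A downloaded file can be checked against the published hashes before installing. A minimal sketch with Python's hashlib, where the expected digest is the SHA256 value from the table above:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# SHA256 published above for the source distribution:
expected = "656b1b1f89bf5a3d963db5198a756ecba19afefc6f5f3e04de5fd25edc91c3bf"
# Uncomment once the file has been downloaded:
# assert sha256_of("sttpy-0.1.0.tar.gz") == expected, "hash mismatch"
```

On the command line, sha256sum sttpy-0.1.0.tar.gz prints the same digest for comparison.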

File details

Details for the file sttpy-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sttpy-0.1.0-py3-none-any.whl
  • Size: 11.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.5.20

File hashes

Hashes for sttpy-0.1.0-py3-none-any.whl

  • SHA256: 930ad2c526168addd0694d23dfd4b5aa8821116f287a8fd994ba6821057396b4
  • MD5: cdc1dc2fb52e1667c6e3a9cda1c35060
  • BLAKE2b-256: 982c55f3c6f4af954e8a0fc45b50a3f1e2d8b75bf9f565132a3abe5f72bd58d0
