
Dead simple speech-to-text

Quickstart

This project is a simple voice dictation (speech-to-text) tool that runs completely on device. It uses openai-whisper models for speech recognition and can optionally post-process the transcribed text with a local LLM (currently supporting all Ollama models).

Simply install, run, and hold the hotkey while you speak. The transcribed text is pasted into the active window. Say 'help' to see the available voice commands.

Installation

# CPU only
pip install sttpy

# If you have a GPU and want to use the CUDA version
pip install sttpy[cuda] --extra-index-url https://download.pytorch.org/whl/cu124
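Once installed, a quick sanity check (assuming the stt entry point landed on your PATH) is to print the CLI help:

```shell
# Confirm the entry point is available and list all options
stt --help
```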

Usage

Usage: stt [OPTIONS]

  Voice dictation (speech-to-text) completely on device.

Options:
  --stt TEXT           Whisper model name (tiny.en, base.en, turbo, ...)
  --hotkey TEXT        Hotkey to hold while speaking
  --debug              Enable debug mode
  --post-processing    Enable LLM post-processing of transcribed text
  --type-mode          Use keystrokes instead of pasting text
  --paste-delay FLOAT  Delay between copying to clipboard and pasting
  --help               Show this message and exit.

Examples

Run with debug mode enabled, using the openai-whisper turbo model, and the hold-to-speak hotkey f12:

stt --debug --stt turbo --hotkey f12

Prompt for a hotkey to hold while speaking:

stt --hotkey prompt
Enter the hotkey you want to use followed by 'escape':
Hotkey: space. Press escape to confirm.
Hotkey: ctrl+space. Press escape to confirm.
Hotkey confirmed: ctrl+space
Hotkey: ctrl+space
2025-01-16 22:29:00 - INFO - Loading whisper model 'tiny.en' on cuda...
2025-01-16 22:29:00 - INFO - Press and hold 'ctrl+space' to speak
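Some applications refuse programmatic pasting; the remaining flags cover those cases. A sketch using the documented options (the delay value is illustrative, and the unit is assumed to be seconds):

```shell
# Simulate keystrokes instead of pasting from the clipboard
stt --type-mode

# Or keep pasting, but wait longer between copying and pasting
stt --paste-delay 0.5
```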

Commands

There are a few commands built into the voice dictation interface.

Hold the hotkey and say 'help' to view them.

Post-processing

Note: Post-processing is disabled by default because it adds latency and is still under development. To enable it, pass the --post-processing flag. You will need a local Ollama server running with the model llama3.2:3b-instruct-q5_K_M available.
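A minimal setup sketch, assuming the ollama CLI is installed and its server is running locally:

```shell
# Make the expected model available to the local Ollama server
ollama pull llama3.2:3b-instruct-q5_K_M

# Start dictation with LLM post-processing enabled
stt --post-processing
```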

