GOOBITS STT - Pure speech-to-text engine with multiple operation modes

🎙️ Goobits STT

Goobits STT is a pure speech-to-text engine with multiple operation modes and advanced text formatting. It provides real-time transcription, a WebSocket server mode, and text processing with internationalization support. It is built on Whisper models for accurate transcription across languages and use cases.

📦 Installation

# Install globally with pipx (recommended)
pipx install .                     # Install globally, isolated environment
pipx install ".[dev]"              # Install with development dependencies (quoted so the shell does not expand the brackets)

# Or with pip for development
pip install -e ".[dev]"            # Install editable with dev dependencies
stt --version                      # Verify installation
stt --listen-once                  # Test basic functionality

🎯 Basic Usage

stt --listen-once                  # Single utterance with VAD
stt --conversation                 # Always listening mode
stt --tap-to-talk=f8              # Tap F8 to start/stop recording
stt --hold-to-talk=space          # Hold spacebar to record
stt --server --port=8769          # Run WebSocket server

⚙️ Configuration

# Edit main configuration
nano config.json

# Configure Whisper model
stt --model large-v3-turbo --language en

# Audio settings
stt --device "USB Audio" --sample-rate 16000

# Output formats
stt --format json | jq -r '.text'
stt --format text --no-formatting
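The `jq` filter above implies that `--format json` produces a JSON object with a `text` field. The same extraction can be sketched in Python. This is a minimal sketch under two assumptions that are not confirmed by the project: that each result arrives as one JSON object per line, and that `--format json` can be combined with the listening modes. The helper name `extract_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def extract_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the `text` field of every JSON object found in `lines`.

    Assumption for this sketch: one JSON object per line. Lines that are
    empty, not valid JSON, or lack a `text` field are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a log line mixed into the output
        if isinstance(record, dict) and "text" in record:
            yield str(record["text"])
```

To use it in a pipeline, iterate over `sys.stdin` in a small script (for example `for text in extract_text(sys.stdin): print(text)`) and pipe the `stt` output into that script.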

🎤 Operation Modes

# Quick transcription
stt --listen-once | llm-process

# Interactive conversation
stt --conversation | tts-speak

# Hotkey control
stt --tap-to-talk=f8              # Toggle recording with F8
stt --hold-to-talk=ctrl+space     # Push-to-talk mode

# Server mode for remote clients
stt --server --host 0.0.0.0 --port 8769

🚀 Performance Optimization

# GPU acceleration (if available)
stt --model base --device cuda

# CPU optimization
stt --model tiny --device cpu

# Model selection by speed/quality
stt --model tiny      # Fastest, lower quality
stt --model base      # Balanced (default)
stt --model large-v3-turbo  # Best quality

🎭 Text Formatting Features

# Advanced entity detection
stt --listen-once  # "Call me at 555-123-4567" → "Call me at (555) 123-4567"
stt --listen-once  # "Go to github dot com" → "Go to github.com"
stt --listen-once  # "Three point one four" → "3.14"

# Multilingual support
stt --language es  # Spanish formatting rules
stt --language en  # English formatting (default)

# Disable formatting
stt --no-formatting  # Raw transcription output
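To make the kind of transformation shown above concrete, here is a toy Python sketch that handles two of the listed examples (spoken decimals and spoken domain names) with regular expressions. It is not the project's formatter, which according to the tech stack relies on spaCy and pyparsing. The function name `format_spoken`, the digit vocabulary, and the short list of top-level domains are all assumptions made for illustration.

```python
import re

# Hypothetical digit vocabulary, used only by this sketch.
_DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
_DIGIT_WORD = "|".join(_DIGITS)

# "<digit> point <digit> [<digit> ...]", e.g. "three point one four".
_DECIMAL = re.compile(
    rf"\b({_DIGIT_WORD}) point ((?:{_DIGIT_WORD})(?: (?:{_DIGIT_WORD}))*)\b",
    re.IGNORECASE,
)

# "<name> dot <tld>" for a small, assumed set of top-level domains.
_DOMAIN = re.compile(r"\b([a-z0-9-]+) dot (com|org|net|io)\b", re.IGNORECASE)


def _decimal_to_digits(match: re.Match) -> str:
    """Convert one spoken decimal match into its numeric form."""
    whole = _DIGITS[match.group(1).lower()]
    fraction = "".join(_DIGITS[word.lower()] for word in match.group(2).split())
    return f"{whole}.{fraction}"


def format_spoken(text: str) -> str:
    """Rewrite spoken decimals and spoken domain names in `text`."""
    text = _DECIMAL.sub(_decimal_to_digits, text)
    text = _DOMAIN.sub(lambda m: f"{m.group(1)}.{m.group(2)}", text)
    return text
```

A rule-based pass like this is easy to reason about but brittle; it cannot tell "dot com" in a URL from the same words in ordinary speech, which is one reason a real formatter would use entity detection instead.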

🔧 Server Deployment

# Basic server
stt --server

# Production on port 443 (these flags only set host and port; SSL/TLS certificate setup is not shown here)
stt --server --port 443 --host 0.0.0.0

# Docker deployment
docker run -p 8080:8080 -p 8769:8769 sttservice/transcribe
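After starting the server, it can be useful to confirm that the port is accepting connections before pointing clients at it. The sketch below does only a plain TCP connect using the standard library; it does not perform a WebSocket handshake and says nothing about the server's message protocol, which is not documented here. The helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection resolves the host and connects; the context
        # manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

For the default port used in the examples above, a call would look like `port_is_open("127.0.0.1", 8769)`.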

🎯 Testing & Development

# Run test suite
pytest                             # All tests
pytest tests/text_formatting/     # Specific module
pytest -v -n auto                 # Parallel with verbose output

# Code quality
ruff check src/ tests/             # Linting
black src/ tests/ stt.py          # Formatting
mypy src/ stt.py                  # Type checking

# Test with real audio
pytest tests/__fixtures__/audio/

🔧 Model Comparison

| Model | Speed | Quality | Memory | Best For |
|---|---|---|---|---|
| tiny | ⚡ Fastest | 🌟 Basic | 💾 39MB | Real-time, low resources |
| base | 🔥 Fast | 🌟🌟 Good | 💾 74MB | General use (default) |
| small | ⚡ Quick | 🌟🌟🌟 Better | 💾 244MB | Accuracy balance |
| medium | 🔥 Moderate | 🌟🌟🌟🌟 Great | 💾 769MB | High accuracy |
| large-v3-turbo | 🔥 Fast | 🏆 Best | 💾 1550MB | Production quality |

Choose based on your speed/accuracy requirements and available system resources.
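As a rough illustration of choosing by available resources, the sketch below picks the largest model from the comparison table whose listed Memory figure fits a given budget. The figures are copied from the table as-is; actual memory use at runtime may differ, and the helper name `pick_model` is hypothetical.

```python
from typing import Optional

# (model name, Memory figure in MB) copied from the comparison table,
# ordered from smallest to largest.
MODEL_SIZES_MB = [
    ("tiny", 39),
    ("base", 74),
    ("small", 244),
    ("medium", 769),
    ("large-v3-turbo", 1550),
]


def pick_model(budget_mb: int) -> Optional[str]:
    """Return the largest listed model that fits `budget_mb`, or None."""
    best = None
    for name, size_mb in MODEL_SIZES_MB:
        if size_mb <= budget_mb:
            best = name  # later entries are larger, so keep overwriting
    return best
```

The returned name can then be passed to `stt --model`. This considers memory only; speed requirements may still argue for a smaller model.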

🎙️ Audio Features

  • Real-time streaming: Opus audio encoding for efficient transmission
  • Voice Activity Detection: Automatic speech detection and silence handling
  • Multiple input devices: Support for various microphones and audio interfaces
  • Hotkey integration: System-wide keyboard shortcuts for hands-free operation
  • Background operation: Run as daemon with minimal resource usage
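The Voice Activity Detection item in the list above can be illustrated with a simple energy-based detector. This is a generic textbook technique, not the detector Goobits STT uses (the project does not state which one it uses). The threshold, frame size, and function names below are assumptions; the 320-sample frame corresponds to 20 ms at the 16000 Hz sample rate used in the configuration examples.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(
    samples: Sequence[float],
    frame_size: int = 320,        # 20 ms at 16 kHz
    threshold: float = 0.02,      # hypothetical energy threshold
    max_silence_frames: int = 5,  # silent frames tolerated inside a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of segments judged to be speech."""
    segments: List[Tuple[int, int]] = []
    start = None          # start index of the segment currently open, if any
    silence = 0           # consecutive silent frames seen in the open segment
    last_voiced_end = 0   # end index of the most recent voiced frame
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        if frame_rms(frame) >= threshold:
            if start is None:
                start = offset
            silence = 0
            last_voiced_end = offset + len(frame)
        elif start is not None:
            silence += 1
            if silence > max_silence_frames:
                # Enough trailing silence: close the segment at the last voiced frame.
                segments.append((start, last_voiced_end))
                start = None
    if start is not None:
        segments.append((start, last_voiced_end))
    return segments
```

A fixed energy threshold is sensitive to background noise and microphone gain, so treat this only as a way to see the start/stop logic behind single-utterance capture.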

🛠️ Tech Stack

Core Technologies

  • 🧠 AI/ML: OpenAI Whisper (faster-whisper), CTranslate2, PyTorch
  • 🎙️ Audio: OpusLib, NumPy, custom pipe-based audio capture
  • ⌨️ System: pynput for global hotkeys, cross-platform support

Text Processing

  • 📝 NLP: spaCy, deepmultilingualpunctuation
  • 🌍 i18n: Multi-language entity detection and formatting
  • 🔧 Parsing: pyparsing for complex text transformations
  • 📊 Output: JSON/text formatting with rich entity support

Development & Testing

  • 🧪 Testing: pytest with asyncio, xdist, custom plugins
  • 📊 Quality: ruff (linting), black (formatting), mypy (typing)
  • 🔐 Security: bandit for security analysis
  • 📦 Build: setuptools, pyproject.toml configuration

Deployment

  • ๐Ÿณ Containerization: Docker with CUDA 12.1 support
  • ๐Ÿ–ฅ๏ธ Interface: FastAPI admin dashboard (Docker), responsive web UI
  • ๐Ÿ”’ Security: JWT authentication, RSA+AES encryption (Docker)
  • ๐Ÿ“ˆ Monitoring: Structured logging, health checks
  • โ˜๏ธ Cloud: Ready for production deployment with SSL/TLS

Download files

Source Distribution

goobits_stt-1.0.0.tar.gz (143.9 kB)

Uploaded Source

Built Distribution

goobits_stt-1.0.0-py3-none-any.whl (161.5 kB)

Uploaded Python 3

File details

Details for the file goobits_stt-1.0.0.tar.gz.

File metadata

  • Download URL: goobits_stt-1.0.0.tar.gz
  • Upload date:
  • Size: 143.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for goobits_stt-1.0.0.tar.gz
Algorithm Hash digest
SHA256 67756f1282532f9a2e29208e210dbbe3f8f3756b3f53f8bccc4c745179031cb4
MD5 b66f3663e9f0d81fa9fac4a6269897cc
BLAKE2b-256 5eb5d3529f95e6fbc3d74c8f730e1bec77e43de7a022fb4ebe83c63ab36d5a8d
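A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do that; the function names are hypothetical and the local path of the downloaded file is whatever you saved it as.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for goobits_stt-1.0.0.tar.gz.
EXPECTED_SHA256 = "67756f1282532f9a2e29208e210dbbe3f8f3756b3f53f8bccc4c745179031cb4"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with `expected`, ignoring letter case."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected(Path("goobits_stt-1.0.0.tar.gz"))` should return `True` for an intact copy of the source distribution.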

File details

Details for the file goobits_stt-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: goobits_stt-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 161.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for goobits_stt-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fcfbe0da5e8fc548ab9e3bc4139ac35eea210e09c3a413b4a274f6cc6f7ca857
MD5 9e0513543b069630e1d8d4cdf02982ec
BLAKE2b-256 9954d1aa492e45fa3952e0746bb14a9469a584bf60c6270df8cade1e3d11e625
