Speech to Text (s2t): Record audio, run Whisper, export formats, and copy transcript to clipboard.

s2t

Record audio from your microphone, run Whisper to transcribe it, export common formats, and copy the .txt transcript to your clipboard.

Install

  • From local checkout:
    • Editable: pip install -e .
    • Standard: pip install .

Requirements: Python 3.11+. No mandatory external binaries. ffmpeg is optional (only for MP3 encoding/decoding).
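A quick way to verify these requirements before recording is sketched below. The helper names (`python_ok`, `ffmpeg_available`, `effective_recording_format`) are hypothetical and are not part of s2t. The last helper only mirrors the MP3 fallback described for -f under Usage; it is an illustration of that rule, not the tool's actual code.

```python
import shutil
import sys


def python_ok(minimum: tuple[int, int] = (3, 11)) -> bool:
    """Return True if the running interpreter meets the stated minimum version."""
    return sys.version_info[:2] >= minimum


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def effective_recording_format(requested: str, have_ffmpeg: bool) -> str:
    """Mirror the documented rule: MP3 needs ffmpeg, otherwise fall back to FLAC."""
    if requested not in {"flac", "wav", "mp3"}:
        raise ValueError(f"unsupported recording format: {requested!r}")
    if requested == "mp3" and not have_ffmpeg:
        return "flac"
    return requested


if __name__ == "__main__":
    have_ffmpeg = ffmpeg_available()
    print("Python 3.11+:", python_ok())
    print("ffmpeg on PATH:", have_ffmpeg)
    print("An mp3 request resolves to:", effective_recording_format("mp3", have_ffmpeg))
```

If ffmpeg is reported as missing, FLAC and WAV recording should be unaffected, since ffmpeg is only needed for MP3.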

System requirements (Linux)

  • Some environments need system libraries for audio I/O:
    • Debian/Ubuntu: sudo apt-get install libportaudio2 libsndfile1
    • Fedora/RHEL: sudo dnf install portaudio libsndfile
  • Optional for MP3: ffmpeg (Debian/Ubuntu: sudo apt-get install ffmpeg; Homebrew: brew install ffmpeg).
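To check whether the audio libraries above can be located, a minimal sketch using the standard library's `ctypes.util.find_library` might look like this. The function name `missing_audio_libraries` is hypothetical. A "not found" result is only a hint: some Python audio packages bundle their own copies of these libraries, so recording may still work.

```python
from ctypes.util import find_library


def missing_audio_libraries(names=("portaudio", "sndfile")) -> list[str]:
    """Return the library names that the system's library lookup cannot locate."""
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    missing = missing_audio_libraries()
    if missing:
        print("Not found by the system lookup:", ", ".join(missing))
    else:
        print("PortAudio and libsndfile were found.")
```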

Usage

  • Start interactive recording and transcribe:
    • s2t
  • Options:
    • Language: -l de (long: --lang de)
    • Model: -m large-v3 (long: --model large-v3)
    • Sample rate: -r 48000 (long: --rate 48000)
    • Channels: -c 2 (long: --channels 2)
    • Output dir: -o transcripts (long: --outdir transcripts) — default is transcripts/ if omitted
    • Translate to English: -t (long: --translate). You can still pass --lang as an input-language hint.
    • List available models and exit: -L (long: --list-models)
    • Recording format: -f flac|wav|mp3 (long: --recording-format), default flac. MP3 requires ffmpeg; if absent, it falls back to FLAC with a warning.
    • Prompt mode (spoken prompt): -p (long: --prompt). Speak your prompt first, then press SPACE to use it as the prompt and continue with your main content. If you press ENTER instead, no prompt is used: the spoken audio is transcribed as regular content and the session ends.
    • Keep chunk files: --keep-chunks — by default, per‑chunk audio and per‑chunk Whisper outputs are deleted after the final merge.
    • Open transcript for editing: -e (long: --edit) — opens the generated .txt in your shell editor ($VISUAL/$EDITOR).
  • Examples:
    • Transcribe in German using large-v3: s2t -l de -m large-v3
    • Translate any input to English: s2t -t
    • Write outputs under transcripts/: s2t -o transcripts
    • List local model names: s2t -L

Outputs are written into a timestamped folder under the chosen output directory (default is transcripts/), e.g. transcripts/2025-01-31T14-22-05+0200/, containing:

  • Per‑chunk outputs: chunk_####.flac/.wav plus chunk_####.txt/.srt/.vtt/.tsv/.json (deleted by default unless --keep-chunks)
  • Final outputs: recording.flac/.wav (and recording.mp3 if requested and ffmpeg available), plus recording.txt/.srt/.vtt/.tsv/.json
  • The clipboard receives the combined .txt, with blank lines between chunks.

Makefile (optional)

  • Setup venv + dev deps: make setup
  • Lint/format/test: make lint, make format, make test; combined gate: make check
  • Build sdist/wheel: make build (runs check first)
  • Publish to PyPI/TestPyPI: make publish, make publish-test (run after build)
  • Run CLI: make record ARGS='-l de -t -o transcripts'
  • List models: make list-models
  • Show package version: make version

Notes on models

  • The local openai-whisper CLI supports models such as tiny, base, small, medium, large-v1, large-v2, and large-v3, plus English-only .en variants of tiny, base, small, and medium.
  • The name turbo is only available in newer openai-whisper releases, so with an older local install -m turbo may fail. In that case, choose a supported local model instead (s2t -L lists them).
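A script can reject an unexpected model name before starting a recording. The sketch below hard-codes the names listed above. It assumes the .en variants exist only for the four smaller models, and the helper name is hypothetical. The output of s2t -L remains the authoritative list for a given install.

```python
# Names taken from the list above; .en variants assumed only for the smaller models.
SMALL_MODELS = ("tiny", "base", "small", "medium")
LARGE_MODELS = ("large-v1", "large-v2", "large-v3")

LOCAL_MODELS = (
    frozenset(SMALL_MODELS)
    | frozenset(f"{name}.en" for name in SMALL_MODELS)
    | frozenset(LARGE_MODELS)
)


def is_listed_local_model(name: str) -> bool:
    """Return True if name is one of the local model names listed in this README."""
    return name in LOCAL_MODELS


if __name__ == "__main__":
    for candidate in ("large-v3", "tiny.en", "turbo"):
        print(candidate, "->", is_listed_local_model(candidate))
```

The name turbo is deliberately left out of the set, in line with the note above.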

Development & Release

  • For developer setup and contribution guidelines, see CONTRIBUTING.md.
  • For the release process, see docs/RELEASING.md.
