Speech to Text (s2t): Record audio, run Whisper, export formats, and copy transcript to clipboard.
s2t
Record audio from your microphone, run Whisper to transcribe it, export common formats, and copy the .txt transcript to your clipboard.
Install
- From local checkout:
  - Editable: pip install -e .
  - Standard: pip install .
Requirements: Python 3.11+. No mandatory external binaries. ffmpeg is optional (only for MP3 encoding/decoding).
System requirements (Linux)
- Some environments need system libraries for audio I/O:
  - Debian/Ubuntu: sudo apt-get install libportaudio2 libsndfile1
  - Fedora/RHEL: sudo dnf install portaudio libsndfile
- Optional for MP3: ffmpeg (sudo apt-get install ffmpeg or brew install ffmpeg).
Usage
- Start interactive recording and transcribe: s2t
- Short options:
  - Language: -l de (long: --lang de)
  - Model: -m large-v3 (long: --model large-v3)
  - Sample rate: -r 48000 (long: --rate 48000)
  - Channels: -c 2 (long: --channels 2)
  - Output dir: -o transcripts (long: --outdir transcripts); default is transcripts/ if omitted
  - Translate to English: -t (long: --translate). You may still provide --lang as an input-language hint if you want.
  - List available models and exit: -L (long: --list-models)
  - Recording format: -f flac|wav|mp3 (long: --recording-format), default flac. MP3 requires ffmpeg; if absent, it falls back to FLAC with a warning.
  - Auto-split on silence: --silence-sec 1.0 (default 1.0; 0 disables). When continuous silence of at least this many seconds is detected, the current chunk is ended automatically.
  - Minimum chunk length for auto-split: --min-chunk-sec 5.0 (default 5.0). Prevents very short chunks and avoids splitting early in a sentence.
  - Prompt mode (spoken prompt): -p (long: --prompt). Speak your prompt first, then press SPACE to use it as the prompt and continue with your main content. If you press ENTER instead of SPACE, no prompt is used; the spoken audio is transcribed as normal payload and the session ends.
  - Keep chunk files: --keep-chunks. By default, per-chunk audio and per-chunk Whisper outputs are deleted after the final merge.
  - Open transcript for editing: -e (long: --edit). Opens the generated .txt in your shell editor ($VISUAL/$EDITOR).
- Examples:
  - Transcribe in German using large-v3: s2t -l de -m large-v3
  - Translate any input to English: s2t -t
  - Write outputs under transcripts/: s2t -o transcripts
  - List local model names: s2t -L
Outputs are written into a timestamped folder under the chosen output directory (default is transcripts/), e.g. transcripts/2025-01-31T14-22-05+0200/, containing:
- Per-chunk outputs: chunk_####.flac/.wav plus chunk_####.txt/.srt/.vtt/.tsv/.json (deleted by default unless --keep-chunks)
- Final outputs: recording.flac/.wav (and recording.mp3 if requested and ffmpeg available), plus recording.txt/.srt/.vtt/.tsv/.json
- Clipboard mirrors the combined .txt with blank lines between chunks.
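As an illustration of the layout above, the following sketch builds a folder name in the same shape as the example (2025-01-31T14-22-05+0200) and joins chunk transcripts with blank lines. The helper names session_dir and join_chunks are hypothetical, and skipping empty chunks is an assumption; s2t may do this differently internally.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def session_dir(outdir: str, now: datetime) -> Path:
    """Build <outdir>/<timestamp>; `now` must be timezone-aware for %z to emit an offset."""
    return Path(outdir) / now.strftime("%Y-%m-%dT%H-%M-%S%z")


def join_chunks(chunk_texts: list[str]) -> str:
    """Join per-chunk transcripts with one blank line between them.

    Assumption: chunks that are empty after stripping are skipped.
    """
    parts = [text.strip() for text in chunk_texts if text.strip()]
    return "\n\n".join(parts)


# Fixed example time with a +02:00 offset, matching the folder name in the text.
example_time = datetime(2025, 1, 31, 14, 22, 5, tzinfo=timezone(timedelta(hours=2)))
print(session_dir("transcripts", example_time).name)
print(join_chunks(["First chunk.", "Second chunk."]))
```

For the current local time, datetime.now().astimezone() yields a timezone-aware value that can be passed as now.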
Auto-splitting details
- SPACE always splits immediately; ENTER finishes the recording.
- With --silence-sec > 0, chunks end automatically after detected continuous silence of that many seconds.
- Auto-split only triggers once the current chunk has at least --min-chunk-sec seconds and after speech has been detected (to ignore leading silence). A short internal cooldown avoids duplicate splits.
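The rules above can be sketched as a small state machine fed one audio block at a time. This is a minimal illustration under stated assumptions, not the s2t implementation: the class SplitState and its field names are hypothetical, silence detection itself is left to the caller (is_silent), the speech flag is assumed to reset per chunk, and the internal cooldown is omitted.

```python
from dataclasses import dataclass


@dataclass
class SplitState:
    """Hypothetical tracker for the auto-split decision."""

    silence_sec: float = 1.0    # mirrors --silence-sec; 0 disables auto-split
    min_chunk_sec: float = 5.0  # mirrors --min-chunk-sec
    chunk_elapsed: float = 0.0  # seconds recorded in the current chunk
    silence_run: float = 0.0    # seconds of continuous silence so far
    speech_seen: bool = False   # True once speech was detected in this chunk

    def feed(self, block_sec: float, is_silent: bool) -> bool:
        """Account for one audio block; return True if the chunk should end now."""
        self.chunk_elapsed += block_sec
        if is_silent:
            self.silence_run += block_sec
        else:
            self.silence_run = 0.0
            self.speech_seen = True
        should_split = (
            self.silence_sec > 0
            and self.speech_seen
            and self.chunk_elapsed >= self.min_chunk_sec
            and self.silence_run >= self.silence_sec
        )
        if should_split:
            # Start a fresh chunk; leading silence is ignored again (assumption).
            self.chunk_elapsed = 0.0
            self.silence_run = 0.0
            self.speech_seen = False
        return should_split
```

Feeding 0.5-second blocks, two seconds of leading silence never trigger a split; after speech has been seen and the chunk has reached five seconds, the second consecutive silent block does.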
Makefile (optional)
- Setup venv + dev deps: make setup
- Lint/format/test: make lint, make format, make test; combined gate: make check
- Build sdist/wheel: make build (runs check first)
- Publish to PyPI/TestPyPI: make publish, make publish-test (run after build)
- Run CLI: make record ARGS='-l de -t -o transcripts'
- List models: make list-models
- Show package version: make version
Notes on models
- The local openai-whisper CLI supports models like: tiny, base, small, medium, large-v1, large-v2, large-v3 and their .en variants.
- The name turbo refers to OpenAI’s hosted model family and is not provided by the local whisper CLI. If you pass -m turbo, the command may fail; choose a supported local model instead.
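A script that wraps s2t could check the model name up front instead of waiting for the command to fail. The sketch below hard-codes the names listed above; check_model is a hypothetical helper, and limiting the .en variants to the four smaller sizes is an assumption. The authoritative list for a given installation is whatever s2t -L prints.

```python
# Model names taken from the list above; not queried from the whisper package.
BASE_MODELS = ("tiny", "base", "small", "medium", "large-v1", "large-v2", "large-v3")
# Assumption: English-only ".en" variants exist for the four smaller sizes only.
EN_MODELS = tuple(f"{name}.en" for name in ("tiny", "base", "small", "medium"))
LOCAL_MODELS = frozenset(BASE_MODELS + EN_MODELS)


def check_model(name: str) -> str:
    """Return `name` unchanged if it is a known local model, else raise ValueError."""
    if name not in LOCAL_MODELS:
        choices = ", ".join(sorted(LOCAL_MODELS))
        raise ValueError(f"unknown local model {name!r}; choose one of: {choices}")
    return name
```

With this list, check_model("large-v3") passes through, while check_model("turbo") raises a ValueError naming the accepted choices.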
Development & Release
- For developer setup and contribution guidelines, see CONTRIBUTING.md.
- For the release process, see docs/RELEASING.md.
File details
Details for the file s2t-0.1.7.tar.gz.
File metadata
- Download URL: s2t-0.1.7.tar.gz
- Upload date:
- Size: 30.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bf270df90479463c6a7e78a757a01c90f4187cb8d3c22e03d813772cdfb6cec4 |
| MD5 | 3ee4d69e5cafe34de4ef5a8f3aa5b24c |
| BLAKE2b-256 | d39237eb8713314c855f10eed220cd8c0a8782a538196d1a27d4004f1ddfa9b3 |
File details
Details for the file s2t-0.1.7-py3-none-any.whl.
File metadata
- Download URL: s2t-0.1.7-py3-none-any.whl
- Upload date:
- Size: 24.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e0012dac053d20b1c475f060bdc33a037991b904488f6f42312851822cc9eaee |
| MD5 | b929e582bb371d35703c73926bf1e6f3 |
| BLAKE2b-256 | 3ea78901a06fd07536e5f4b211c72a39db84868df34ca62845ea0de7af7e2952 |