Speech to Text (s2t): Record audio, run Whisper, export formats, and copy transcript to clipboard.

s2t

Record audio from your microphone, run Whisper to transcribe it, export common formats, and copy the .txt transcript to your clipboard.

Install

From local checkout:
- Editable: pip install -e .
- Standard: pip install .

Requirements: Python 3.11–3.12. No mandatory external binaries. ffmpeg is optional (only for MP3 encoding/decoding).

System requirements (Linux)

Some environments need system libraries for audio I/O:
- Debian/Ubuntu: sudo apt-get install libportaudio2 libsndfile1
- Fedora/RHEL: sudo dnf install portaudio libsndfile
Optional for MP3: ffmpeg (Debian/Ubuntu: sudo apt-get install ffmpeg; macOS with Homebrew: brew install ffmpeg).
Optional backends:
- faster-whisper (CTranslate2): pip install faster-whisper (GPU via CUDA on NVIDIA; CPU works well with int8).
- whisper.cpp (Metal/CPU): pip install whispercpp (requires local gguf models; GPU support on Apple hardware is experimental and varies by build).
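To see which of these backends are importable in the current environment, a quick check with the standard library is enough. This is a sketch under assumptions: the import names `whisper`, `faster_whisper`, and `whispercpp` are the usual module names for these packages, and the helper `installed_backends` is hypothetical, not part of s2t.

```python
from importlib.util import find_spec

# Keys are the --backend values; module names are assumed import names.
BACKEND_MODULES = {
    "whisper": "whisper",
    "faster": "faster_whisper",
    "whispercpp": "whispercpp",
}


def installed_backends() -> dict[str, bool]:
    """Map each --backend value to whether its module can be found."""
    return {
        backend: find_spec(module) is not None
        for backend, module in BACKEND_MODULES.items()
    }
```

Calling `installed_backends()` returns a dictionary such as `{"whisper": True, "faster": False, "whispercpp": False}`, depending on what is installed. `find_spec` only locates a module without importing it, so the check stays fast even for heavy packages.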

Usage

Start interactive recording and transcribe:
- s2t
Options:
- Language: -l de (long: --lang de)
- Model: -m large-v3 (long: --model large-v3)
- Backend: --backend whisper|faster|whispercpp (default: whisper)
- Device: --device auto|cpu|cuda|mps (default: auto)
- Sample rate: -r 48000 (long: --rate 48000)
- Channels: -c 2 (long: --channels 2)
- Output dir: -o transcripts (long: --outdir transcripts) — default is transcripts/ if omitted
- Translate to English: -t (long: --translate). You may still provide --lang as an input-language hint.
- List available models and exit: -L (long: --list-models)
- Recording format: -f flac|wav|mp3 (long: --recording-format), default flac. MP3 requires ffmpeg; if absent, it falls back to FLAC with a warning.
- Auto-split on silence: --silence-sec 1.0 (default 1.0; 0 disables). When continuous silence ≥ this many seconds is detected, the current chunk is ended automatically.
- Minimum chunk length for auto-split: --min-chunk-sec 5.0 (default 5.0). Prevents very short chunks and avoids splitting early in a sentence.
- Observation window (for block-based splitting): --buffer-sec 30.0 (default 30.0). Planned use: cutting at the longest pause within each window.
- Prompt mode (spoken prompt): -p (long: --prompt). Speak your prompt first, then press SPACE to use it as the prompt and continue with your main content. If you press ENTER instead of SPACE, no prompt is used; the spoken audio is transcribed as normal payload and the session ends.
- Keep chunk files: --keep-chunks — by default, per‑chunk audio and per‑chunk Whisper outputs are deleted after the final merge.
- Open transcript for editing: -e (long: --edit) — opens the generated .txt in your shell editor ($VISUAL/$EDITOR).
Examples:
- Transcribe in German using large-v3: s2t -l de -m large-v3
- Translate any input to English: s2t -t
- Write outputs under transcripts/: s2t -o transcripts
- List local model names: s2t -L

Outputs are written into a timestamped folder under the chosen output directory (default is transcripts/), e.g. transcripts/2025-01-31T14-22-05+0200/, containing:

- Per‑chunk outputs: chunk_####.flac/.wav plus chunk_####.txt/.srt/.vtt/.tsv/.json (deleted by default unless --keep-chunks)
- Final outputs: recording.flac/.wav (and recording.mp3 if requested and ffmpeg available), plus recording.txt/.srt/.vtt/.tsv/.json

The clipboard receives the combined .txt, with blank lines between chunks.
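The folder name and the combined transcript described above can be reproduced in a few lines. This is a sketch, not s2t's implementation: `session_dir_name` and `merge_chunk_texts` are hypothetical helpers, and the `strftime` pattern is inferred from the example folder name.

```python
from datetime import datetime, timedelta, timezone


def session_dir_name(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2025-01-31T14-22-05+0200."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Hyphens replace the colons of ISO 8601 so the name is safe on all filesystems.
    return moment.strftime("%Y-%m-%dT%H-%M-%S%z")


def merge_chunk_texts(chunks: list[str]) -> str:
    """Join non-empty chunk transcripts with one blank line between them."""
    return "\n\n".join(text.strip() for text in chunks if text.strip())


example = datetime(2025, 1, 31, 14, 22, 5, tzinfo=timezone(timedelta(hours=2)))
print(session_dir_name(example))  # 2025-01-31T14-22-05+0200
```

For the current local time, `datetime.now().astimezone()` yields a timezone-aware value that can be passed to `session_dir_name`.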

Auto-splitting details

- SPACE always splits immediately; ENTER finishes the recording.
- With --silence-sec > 0, chunks end automatically after detected continuous silence of that many seconds.
- Auto-split only triggers once the current chunk has at least --min-chunk-sec seconds and after speech has been detected (to ignore leading silence). A short internal cooldown avoids duplicate splits.
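The rules above amount to a small state machine fed one audio frame at a time. The sketch below illustrates that logic under stated assumptions: the class `SilenceSplitter`, the RMS `threshold` value, and the per-frame interface are all hypothetical, and duplicate splits are avoided here by requiring fresh speech after each split rather than by a timed cooldown.

```python
from dataclasses import dataclass


@dataclass
class SilenceSplitter:
    silence_sec: float = 1.0    # mirrors --silence-sec; 0 disables auto-split
    min_chunk_sec: float = 5.0  # mirrors --min-chunk-sec
    threshold: float = 0.01     # assumed RMS level separating speech from silence

    chunk_sec: float = 0.0      # length of the current chunk so far
    silent_sec: float = 0.0     # length of the current run of silence
    heard_speech: bool = False  # whether this chunk contains any speech yet

    def feed(self, rms: float, frame_sec: float) -> bool:
        """Consume one frame; return True when the current chunk should end."""
        self.chunk_sec += frame_sec
        if rms >= self.threshold:
            self.heard_speech = True
            self.silent_sec = 0.0
            return False
        self.silent_sec += frame_sec
        should_split = (
            self.silence_sec > 0
            and self.heard_speech
            and self.chunk_sec >= self.min_chunk_sec
            and self.silent_sec >= self.silence_sec
        )
        if should_split:
            # Start a fresh chunk; it cannot split again until speech is heard.
            self.chunk_sec = 0.0
            self.silent_sec = 0.0
            self.heard_speech = False
        return should_split
```

A recorder would call `feed` once per captured block with that block's RMS level and duration, and close the current chunk whenever it returns `True`. Leading silence never triggers a split because `heard_speech` starts out false.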

Makefile (optional)

- Setup venv + dev deps: make setup
- Lint/format/test: make lint, make format, make test; combined gate: make check
- Build sdist/wheel: make build (runs check first)
- Publish to PyPI/TestPyPI: make publish, make publish-test (run after build)
- Run CLI: make record ARGS='-l de -t -o transcripts'
- List models: make list-models
- Show package version: make version

Notes on models

The local openai-whisper CLI supports models like: tiny, base, small, medium (each also available as an English-only .en variant), and large-v1, large-v2, large-v3 (multilingual only).
Whether the name turbo is accepted depends on the installed openai-whisper version: releases from v20240930 onward include a local turbo model, older ones do not. If -m turbo fails, run s2t -L and choose one of the listed models instead.
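A wrapper script can catch an unsupported model name before starting a recording. The following is a sketch with a hypothetical helper, `require_model`; the `KNOWN_MODELS` list is an assumption copied from the multilingual names above and should be replaced with whatever `s2t -L` reports for your installation.

```python
from difflib import get_close_matches

# Assumed example list; substitute the names printed by `s2t -L`.
KNOWN_MODELS = ["tiny", "base", "small", "medium", "large-v1", "large-v2", "large-v3"]


def require_model(name: str, available: list[str]) -> str:
    """Return name if it is available, otherwise raise with close alternatives."""
    if name in available:
        return name
    # Suggest up to three similarly spelled names to help with typos.
    hints = get_close_matches(name, available, n=3, cutoff=0.5)
    suffix = f" (did you mean: {', '.join(hints)}?)" if hints else ""
    raise ValueError(f"model {name!r} is not available locally{suffix}")
```

Calling `require_model("large-v3", KNOWN_MODELS)` returns the name unchanged, while a name missing from the list raises `ValueError` before any audio is recorded.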

Development & Release

For developer setup and contribution guidelines, see CONTRIBUTING.md.
For the release process, see docs/RELEASING.md.

Release history

- 0.2.8: Apr 12, 2026
- 0.2.7.post1.dev1 (pre-release): Nov 29, 2025
- 0.2.7.post1.dev0 (pre-release): Nov 29, 2025
- 0.2.7: Nov 29, 2025
- 0.2.6: Nov 28, 2025
- 0.2.5: Nov 26, 2025
- 0.2.4: Nov 25, 2025
- 0.2.3: Nov 25, 2025
- 0.2.2: Nov 12, 2025
- 0.2.1: Nov 11, 2025
- 0.2.0 (this version): Nov 11, 2025
- 0.1.17: Oct 25, 2025
- 0.1.15: Oct 22, 2025
- 0.1.13.post1.dev0 (pre-release): Oct 22, 2025
- 0.1.13: Oct 17, 2025
- 0.1.12: Oct 14, 2025
- 0.1.11: Oct 14, 2025
- 0.1.10: Oct 14, 2025
- 0.1.9: Oct 14, 2025
- 0.1.8: Oct 14, 2025
- 0.1.7: Oct 14, 2025
- 0.1.6.post1.dev0 (pre-release): Oct 14, 2025
- 0.1.5: Oct 13, 2025
- 0.1.4: Oct 13, 2025
- 0.1.3.post1.dev1 (pre-release): Oct 13, 2025
- 0.1.3: Oct 13, 2025
- 0.1.2: Oct 13, 2025
- 0.1.1: Oct 13, 2025
- 0.1.0.post1.dev2 (pre-release): Oct 12, 2025

Download files

Source Distribution

s2t-0.2.0.tar.gz (35.1 kB)

Uploaded Nov 11, 2025 Source

Built Distribution

s2t-0.2.0-py3-none-any.whl (28.2 kB)

Uploaded Nov 11, 2025 Python 3

File details

Details for the file s2t-0.2.0.tar.gz.

File metadata

Download URL: s2t-0.2.0.tar.gz
Upload date: Nov 11, 2025
Size: 35.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for s2t-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`2fe676431182b06c4890d96f395ca399d6e749221cd02e05a775b89e4e2c7545`
MD5	`649ef3742a36ffd29b39ae4d632c2435`
BLAKE2b-256	`d896839df962fd609ff248e46ea145c6f2b830e7787c6b05ebe1ad7cdd9aa7b9`
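A downloaded archive can be checked against the SHA256 digest above with the standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_digest` are illustrative, and the file is assumed to sit in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for s2t-0.2.0.tar.gz.
EXPECTED_SHA256 = "2fe676431182b06c4890d96f395ca399d6e749221cd02e05a775b89e4e2c7545"


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in blocks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while block := handle.read(block_size):
            digest.update(block)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with an expected hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest(Path("s2t-0.2.0.tar.gz"), EXPECTED_SHA256)` returns `True` only when the local file is byte-for-byte identical to the published one.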

File details

Details for the file s2t-0.2.0-py3-none-any.whl.

File metadata

Download URL: s2t-0.2.0-py3-none-any.whl
Upload date: Nov 11, 2025
Size: 28.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for s2t-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`19b530b7f4fb58624f6fab554b8f321d6f4a2990a227e2a007c271be697560e9`
MD5	`e66858d90661ce7a08f92cf4017d5086`
BLAKE2b-256	`fb7383a2a27ac02b85ee764de5563fff6d4a00af0decae0f2ed1b59743a136b6`
