audio-transcribe-cli
Local CLI for audio/video transcription using the Soniox API. Generates SRT subtitles and a self-contained HTML editor — no server, no database, no cloud storage required.
Prerequisites
- Python 3.11+
- ffmpeg available on PATH
- A Soniox API key
Install
cd audio-transcribe-cli
pip install .
For an editable / development install:
pip install -e .
Environment Setup
cp .env.example .env
# Edit .env and set SONIOX_API_KEY
Or export directly:
export SONIOX_API_KEY=your_key_here
Usage
transcribe — Transcribe an audio or video file
atcli transcribe path/to/recording.mp4
With reference PDF documents (used to build Soniox context for better accuracy):
atcli transcribe recording.mp4 --context slides.pdf --context notes.pdf
With free-text hints and explicit language hints:
atcli transcribe recording.mp4 --hints "Technical meeting about Kubernetes" --language en,ja
Specify output path and skip the HTML viewer:
atcli transcribe recording.mp4 --output /tmp/output.srt --no-view
Use LLM-powered keyword extraction from reference docs:
atcli transcribe recording.mp4 \
--context reference.pdf \
--llm-api-key $OPENAI_API_KEY \
--llm-endpoint https://api.openai.com/v1
Options:
| Flag | Default | Description |
|---|---|---|
| --context / -c | (none) | PDF reference doc(s); repeatable |
| --hints | "" | Free text added to the Soniox context |
| --language | auto | Comma-separated language hints (en,ja,vi) |
| --output / -o | <input>.srt | SRT output path |
| --view / --no-view | --view | Open HTML viewer after transcription |
| --api-key | $SONIOX_API_KEY | Soniox API key |
| --llm-api-key | $OPENAI_API_KEY | LLM API key for keyword extraction |
| --llm-endpoint | $OPENAI_ENDPOINT | LLM endpoint URL |
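As a rough illustration of how the defaults listed above could be resolved, here is a minimal sketch. The helper name `resolve_options` and its return shape are hypothetical and not part of atcli. It also assumes that `<input>.srt` means the input path with its extension replaced by `.srt`, which the README does not confirm.

```python
import os
from pathlib import Path


def resolve_options(input_path, output=None, api_key=None, language=None, env=None):
    """Hypothetical helper: apply the documented defaults to raw CLI values."""
    env = os.environ if env is None else env
    src = Path(input_path)
    # --language is comma-separated; no value means automatic detection.
    languages = [c.strip() for c in language.split(",") if c.strip()] if language else []
    return {
        # Assumption: the default output swaps the input extension for .srt.
        "output": Path(output) if output else src.with_suffix(".srt"),
        # --api-key falls back to the SONIOX_API_KEY environment variable.
        "api_key": api_key or env.get("SONIOX_API_KEY"),
        "languages": languages,
    }
```

An explicit flag always wins over the environment variable here, which matches the usual CLI convention but is an assumption about atcli's behavior.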
view — Open an existing SRT in the HTML viewer
atcli view path/to/subtitles.srt
This generates subtitles_viewer.html next to the SRT and opens it in your browser. Use the file picker in the viewer to load the corresponding media file.
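The naming rule described above (subtitles.srt becomes subtitles_viewer.html in the same directory) can be sketched in a few lines. The function names are hypothetical and only illustrate the convention; they are not atcli's actual API.

```python
from pathlib import Path


def viewer_path_for(srt_path):
    """Return the viewer path next to the SRT: <stem>_viewer.html."""
    srt = Path(srt_path)
    return srt.with_name(f"{srt.stem}_viewer.html")


def viewer_uri_for(srt_path):
    """Build a file:// URI that a browser can open without a server."""
    # as_uri() requires an absolute path, so resolve first.
    return viewer_path_for(srt_path).resolve().as_uri()
```

Passing the resulting URI to `webbrowser.open` from the standard library would open the viewer in the default browser.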
HTML Viewer Features
- Two-panel layout: editable subtitle table (left) + media player (right)
- Click any row to seek the player to that position
- Current subtitle highlighted during playback with on-video overlay
- Editable index, start/end timing, and text fields
- Add / delete subtitle rows
- Export SRT button: downloads the current edited state as a .srt file
- Fully self-contained (no internet required, opens via file://)
How It Works
- Extracts audio stream if input is a video file (ffmpeg, no re-encoding)
- Builds a Soniox context dict from PDF reference docs (keyword extraction)
- Splits audio into 30–60 s segments at silence points
- Uploads and transcribes each segment via the Soniox async API
- Merges all tokens with correct time offsets, groups into subtitle blocks
- Resolves <split:N> tags for long sentences into separate cues
- Writes the final SRT file
- Generates a self-contained HTML viewer and opens it in your browser
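The merge and SRT-writing steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in token_assembler.py or subtitle_generator.py: the token shape (`text`, `start_ms`, `end_ms`), the pause-based grouping rule, and the `max_gap_ms` threshold are all assumed for the example.

```python
def format_timestamp(ms):
    """Render milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(int(ms), 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def merge_segments(segments):
    """Shift each segment's tokens by that segment's start offset.

    `segments` is a list of (offset_ms, tokens) pairs. Token times are
    relative to their own segment, so the offset restores global time.
    """
    merged = []
    for offset_ms, tokens in segments:
        for tok in tokens:
            merged.append({
                "text": tok["text"],
                "start_ms": tok["start_ms"] + offset_ms,
                "end_ms": tok["end_ms"] + offset_ms,
            })
    return sorted(merged, key=lambda t: t["start_ms"])


def group_into_cues(tokens, max_gap_ms=800):
    """Assumed rule: start a new cue when the pause exceeds max_gap_ms."""
    cues = []
    for tok in tokens:
        if cues and tok["start_ms"] - cues[-1]["end_ms"] <= max_gap_ms:
            cues[-1]["text"] += tok["text"]
            cues[-1]["end_ms"] = tok["end_ms"]
        else:
            cues.append(dict(tok))
    return cues


def to_srt(cues):
    """Serialize cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        start = format_timestamp(cue["start_ms"])
        end = format_timestamp(cue["end_ms"])
        blocks.append(f"{index}\n{start} --> {end}\n{cue['text'].strip()}\n")
    return "\n".join(blocks)
```

A real assembler would likely also cap cue length and line width; the sketch only shows how per-segment offsets keep timings correct after splitting.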
Project Structure
src/atcli/
├── cli.py # Typer entry point
├── core/
│ ├── soniox_client.py # Soniox HTTP client (upload, transcribe, cleanup)
│ ├── audio_segmenter.py # ffmpeg-based silence-aware segmentation
│ ├── context_extractor.py # PDF → keywords → Soniox context dict
│ ├── token_assembler.py # Soniox tokens → subtitle blocks
│ ├── subtitle_generator.py # blocks → SRT/VTT text + split-tag resolver
│ └── html_viewer.py # Self-contained HTML editor generator
└── utils/
└── logger.py # Rich-based logger
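To give a feel for what silence-aware segmentation in a module like audio_segmenter.py might involve, here is a minimal sketch of choosing cut points for 30 to 60 s segments. The function name, its parameters, and the "latest silence in the window" policy are assumptions for illustration; the silence midpoints are taken as given (they could, for example, be derived from ffmpeg's silencedetect filter output).

```python
def choose_cut_points(silences, duration, min_len=30.0, max_len=60.0):
    """Pick cut times so segments stay within min_len..max_len where possible.

    `silences` is a sorted list of silence midpoints in seconds and
    `duration` is the total audio length in seconds. When no silence falls
    inside the allowed window, a hard cut at max_len is used instead.
    The final segment may be shorter than min_len.
    """
    cuts = []
    start = 0.0
    while duration - start > max_len:
        window = [s for s in silences if start + min_len <= s <= start + max_len]
        # Prefer the latest silence in the window to keep segments long.
        cut = window[-1] if window else start + max_len
        cuts.append(cut)
        start = cut
    return cuts
```

Cutting at silences rather than at fixed intervals avoids splitting words across segment boundaries, which would otherwise hurt transcription at the seams.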