Local CLI for audio/video transcription using Soniox API — generates SRT subtitles with an HTML viewer, no server required

audio-transcribe-cli

Local CLI for audio/video transcription using the Soniox API. Generates SRT subtitles and a self-contained HTML editor — no server, no database, no cloud storage required.

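Prerequisites

  • Python 3 with pip
  • ffmpeg (used to extract and segment audio)
  • A Soniox API key (see Environment Setup)
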
Install

cd audio-transcribe-cli
pip install .

For an editable / development install:

pip install -e .

Environment Setup

cp .env.example .env
# Edit .env and set SONIOX_API_KEY

Or export directly:

export SONIOX_API_KEY=your_key_here
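
As a rough illustration of how the key might be resolved at run time, the sketch below prefers an explicit --api-key value and falls back to the SONIOX_API_KEY environment variable. The function name resolve_api_key and the error message are hypothetical, and the sketch assumes any .env file has already been loaded into the environment; the actual CLI may behave differently.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Return the Soniox API key from the flag if given, else from the environment.

    Hypothetical helper: the precedence (flag over environment) is an assumption.
    """
    env = os.environ if env is None else env
    key = (cli_value or env.get("SONIOX_API_KEY", "")).strip()
    if not key:
        raise RuntimeError("SONIOX_API_KEY is not set; export it or pass --api-key")
    return key
```

Failing early with a clear message avoids uploading audio only to be rejected by the API later.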

Usage

transcribe — Transcribe an audio or video file

atcli transcribe path/to/recording.mp4

With reference PDF documents (used to build Soniox context for better accuracy):

atcli transcribe recording.mp4 --context slides.pdf --context notes.pdf

With free-text hints and explicit language hints:

atcli transcribe recording.mp4 --hints "Technical meeting about Kubernetes" --language en,ja

Specify output path and skip the HTML viewer:

atcli transcribe recording.mp4 --output /tmp/output.srt --no-view

Use LLM-powered keyword extraction from reference docs:

atcli transcribe recording.mp4 \
  --context reference.pdf \
  --llm-api-key "$OPENAI_API_KEY" \
  --llm-endpoint https://api.openai.com/v1

Options:

Flag                Default            Description
--context / -c                         PDF reference doc(s); repeatable
--hints             ""                 Free-text added to Soniox context
--language          auto               Comma-separated language hints (en,ja,vi)
--output / -o       <input>.srt        SRT output path
--view / --no-view  --view             Open HTML viewer after transcription
--api-key           $SONIOX_API_KEY    Soniox API key
--llm-api-key       $OPENAI_API_KEY    LLM API key for keyword extraction
--llm-endpoint      $OPENAI_ENDPOINT   LLM endpoint URL
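
To make two of these defaults concrete, here is a small sketch of how --language and --output values could be interpreted. Both function names are hypothetical. Reading <input>.srt as "the input path with its extension replaced by .srt" is an assumption, as is treating auto as "no explicit hints".

```python
from pathlib import Path


def parse_language_hints(value):
    """Turn "en,ja,vi" into ["en", "ja", "vi"]; "auto" yields no hints (assumed)."""
    if value.strip().lower() == "auto":
        return []
    return [code.strip() for code in value.split(",") if code.strip()]


def default_output_path(input_path):
    """Assumed meaning of <input>.srt: same directory and stem, .srt suffix."""
    return Path(input_path).with_suffix(".srt")
```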

view — Open an existing SRT in the HTML viewer

atcli view path/to/subtitles.srt

This generates subtitles_viewer.html next to the SRT and opens it in your browser. Use the file picker in the viewer to load the corresponding media file.
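
The naming rule above (subtitles.srt produces subtitles_viewer.html in the same directory) can be sketched in a few lines. The helper names are hypothetical, and opening the file through the standard webbrowser module is an assumption about how the CLI launches the browser.

```python
import webbrowser
from pathlib import Path


def viewer_path_for(srt_path):
    """Return <stem>_viewer.html next to the given SRT file."""
    srt = Path(srt_path)
    return srt.with_name(f"{srt.stem}_viewer.html")


def open_viewer(srt_path):
    """Open the generated viewer via a file:// URL (assumed launch mechanism)."""
    html = viewer_path_for(srt_path).resolve()  # as_uri() needs an absolute path
    return webbrowser.open(html.as_uri())
```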

HTML Viewer Features

  • Two-panel layout: editable subtitle table (left) + media player (right)
  • Click any row to seek the player to that position
  • Current subtitle highlighted during playback with on-video overlay
  • Editable index, start/end timing, and text fields
  • Add / delete subtitle rows
  • Export SRT button — downloads the current edited state as a .srt file
  • Fully self-contained (no internet required, opens via file://)
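
For reference, the SRT text that an export produces has a simple shape: a cue index, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, the cue text, and a blank line. The sketch below shows that serialization in Python; the viewer itself runs in the browser, so this is an illustration of the format and not the viewer's code. The function names are hypothetical, and renumbering cues sequentially on export is an assumption.

```python
def format_timestamp(ms):
    """Format milliseconds as an SRT timestamp, e.g. 01:02:03,004."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Serialize (start_ms, end_ms, text) tuples into SRT text.

    Cues are renumbered from 1 in the order given (an assumption).
    """
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(blocks)
```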

How It Works

  1. Extracts the audio stream with ffmpeg if the input is a video file (stream copy, no re-encoding)
  2. Builds a Soniox context dict from PDF reference docs (keyword extraction)
  3. Splits audio into 30–60 s segments at silence points
  4. Uploads and transcribes each segment via the Soniox async API
  5. Merges all tokens with correct time offsets, groups into subtitle blocks
  6. Resolves <split:N> tags, splitting long sentences into separate cues
  7. Writes the final SRT file
  8. Generates a self-contained HTML viewer and opens it in your browser
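
Step 5 is the part where timing is easiest to get wrong: each segment is transcribed on its own clock, so its tokens have to be shifted by the segment's start offset before they are grouped. The following is a minimal sketch of that idea, not the project's token_assembler. The token fields (text, start_ms, end_ms), the leading-space convention for token text, and the grouping thresholds max_gap_ms and max_chars are all assumptions.

```python
def merge_tokens(segments):
    """Shift per-segment token times onto the full recording's timeline.

    segments: list of (offset_ms, tokens); each token is a dict with
    "text", "start_ms" and "end_ms" relative to its segment (assumed shape).
    """
    merged = []
    for offset_ms, tokens in sorted(segments, key=lambda seg: seg[0]):
        for tok in tokens:
            merged.append({
                "text": tok["text"],
                "start_ms": tok["start_ms"] + offset_ms,
                "end_ms": tok["end_ms"] + offset_ms,
            })
    return merged


def group_into_blocks(tokens, max_gap_ms=700, max_chars=80):
    """Group tokens into subtitle blocks, breaking on long pauses or long text."""
    blocks = []
    current = None
    for tok in tokens:
        starts_new = (
            current is None
            or tok["start_ms"] - current["end_ms"] > max_gap_ms
            or len(current["text"]) + len(tok["text"]) > max_chars
        )
        if starts_new:
            if current is not None:
                blocks.append(current)
            current = {
                "start_ms": tok["start_ms"],
                "end_ms": tok["end_ms"],
                "text": tok["text"].lstrip(),
            }
        else:
            current["text"] += tok["text"]
            current["end_ms"] = tok["end_ms"]
    if current is not None:
        blocks.append(current)
    return blocks
```

Sorting segments by offset before merging means the result is correct even if segments finish transcribing out of order.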

Project Structure

src/atcli/
├── cli.py                    # Typer entry point
├── core/
│   ├── soniox_client.py      # Soniox HTTP client (upload, transcribe, cleanup)
│   ├── audio_segmenter.py    # ffmpeg-based silence-aware segmentation
│   ├── context_extractor.py  # PDF → keywords → Soniox context dict
│   ├── token_assembler.py    # Soniox tokens → subtitle blocks
│   ├── subtitle_generator.py # blocks → SRT/VTT text + split-tag resolver
│   └── html_viewer.py        # Self-contained HTML editor generator
└── utils/
    └── logger.py             # Rich-based logger
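
The silence-aware segmentation handled by audio_segmenter.py (step 3 under How It Works) comes down to picking cut points so that each segment is 30–60 s long and ends in a pause. Here is one way that selection could work, given a sorted list of silence midpoints in seconds. How the silences are detected is left out; ffmpeg's silencedetect filter is one option, though whether the project uses it is an assumption. The function name and the "latest silence in the window" policy are likewise hypothetical.

```python
def choose_cut_points(silences, duration, min_len=30.0, max_len=60.0):
    """Pick cut times so that segments are between min_len and max_len seconds.

    silences: sorted midpoints (seconds) of detected silent gaps.
    Returns cut times; segments are the spans between 0, the cuts, and duration.
    """
    cuts = []
    start = 0.0
    while duration - start > max_len:
        window = [s for s in silences if start + min_len <= s <= start + max_len]
        # Prefer the latest silence in the window to keep segments long;
        # if there is none, fall back to a hard cut at max_len.
        cut = window[-1] if window else start + max_len
        cuts.append(cut)
        start = cut
    return cuts
```

With this policy the final segment can be shorter than min_len, which is harmless for transcription but worth knowing when estimating the number of uploads.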

