Local CLI for audio/video transcription using Soniox API — generates SRT subtitles with an HTML viewer, no server required

Maintainers

toanpv0639

Project description

audio-transcribe-cli

Local CLI for audio/video transcription using the Soniox API. Generates SRT subtitles and a self-contained HTML editor — no server, no database, no cloud storage required.

Prerequisites

- Python 3.11+
- ffmpeg available on PATH
- A Soniox API key

Install

cd audio-transcribe-cli
pip install .

For an editable / development install:

pip install -e .

Environment Setup

cp .env.example .env
# Edit .env and set SONIOX_API_KEY

Or export directly:

export SONIOX_API_KEY=your_key_here
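
As a rough illustration of how the key might be resolved at run time, the sketch below prefers an explicit `--api-key` value and falls back to the `SONIOX_API_KEY` environment variable. The function name `resolve_api_key` and the error message are hypothetical and are not taken from the package source; whether the tool also reads `.env` automatically is not shown here.

```python
import os


def resolve_api_key(cli_value=None, env=None):
    """Return the Soniox key from the flag if given, else from SONIOX_API_KEY.

    Hypothetical helper: the real CLI may resolve the key differently.
    """
    env = os.environ if env is None else env
    key = (cli_value or env.get("SONIOX_API_KEY", "")).strip()
    if not key:
        # Fail early with a readable message instead of a later HTTP error.
        raise SystemExit("SONIOX_API_KEY is not set; export it or pass --api-key")
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.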

Usage

`transcribe` — Transcribe an audio or video file

atcli transcribe path/to/recording.mp4

With reference PDF documents (used to build Soniox context for better accuracy):

atcli transcribe recording.mp4 --context slides.pdf --context notes.pdf
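
To give a feel for what "building context" from reference documents can mean, here is a minimal frequency-based sketch. It assumes the PDF text has already been extracted to a string, and it only returns a list of candidate terms. The function name `extract_terms`, the stopword list, and the thresholds are illustrative assumptions; the package's actual extractor and the shape of the Soniox context it produces are not documented here.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real extractor would use a fuller one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "are", "was"}


def extract_terms(text, limit=20, min_count=2):
    """Return the most frequent words (3+ characters) that occur at least min_count times."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9-]{2,}", text)
    counts = Counter(w.lower() for w in words if w.lower() not in STOPWORDS)
    return [term for term, n in counts.most_common(limit) if n >= min_count]
```

Lowercasing keeps the counting simple but discards the casing of proper nouns, which a production extractor would probably want to preserve.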

With free-text hints and explicit language hints:

atcli transcribe recording.mp4 --hints "Technical meeting about Kubernetes" --language en,ja

Specify output path and skip the HTML viewer:

atcli transcribe recording.mp4 --output /tmp/output.srt --no-view

Use LLM-powered keyword extraction from reference docs:

atcli transcribe recording.mp4 \
  --context reference.pdf \
  --llm-api-key $OPENAI_API_KEY \
  --llm-endpoint https://api.openai.com/v1

Options:

| Flag | Default | Description |
| --- | --- | --- |
| `--context / -c` | (none) | PDF reference doc(s); repeatable |
| `--hints` | `""` | Free-text added to Soniox context |
| `--language` | auto | Comma-separated language hints (`en,ja,vi`) |
| `--output / -o` | `<input>.srt` | SRT output path |
| `--view / --no-view` | `--view` | Open HTML viewer after transcription |
| `--api-key` | `$SONIOX_API_KEY` | Soniox API key |
| `--llm-api-key` | `$OPENAI_API_KEY` | LLM API key for keyword extraction |
| `--llm-endpoint` | `$OPENAI_ENDPOINT` | LLM endpoint URL |
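
The option semantics above (a repeatable flag, a comma-separated list, a paired on/off switch) can be sketched with the standard library. The project itself uses Typer, so this `argparse` version is only a stand-in to show the behaviour; `build_parser` and `parse_languages` are hypothetical names, and the default output path is left unresolved because its exact naming rule is not spelled out here.

```python
import argparse
import os


def parse_languages(value):
    """Turn 'en,ja' into ['en', 'ja'], ignoring blanks."""
    return [code.strip() for code in value.split(",") if code.strip()]


def build_parser():
    parser = argparse.ArgumentParser(prog="atcli")
    commands = parser.add_subparsers(dest="command", required=True)

    t = commands.add_parser("transcribe")
    t.add_argument("input")
    # action="append" makes the flag repeatable: -c a.pdf -c b.pdf
    t.add_argument("--context", "-c", action="append", default=[])
    t.add_argument("--hints", default="")
    t.add_argument("--language", type=parse_languages, default=None)  # None = auto
    t.add_argument("--output", "-o", default=None)  # default resolved later
    # BooleanOptionalAction generates both --view and --no-view
    t.add_argument("--view", action=argparse.BooleanOptionalAction, default=True)
    t.add_argument("--api-key", default=os.environ.get("SONIOX_API_KEY"))
    t.add_argument("--llm-api-key", default=os.environ.get("OPENAI_API_KEY"))
    t.add_argument("--llm-endpoint", default=os.environ.get("OPENAI_ENDPOINT"))

    v = commands.add_parser("view")
    v.add_argument("srt")
    return parser
```

For example, `build_parser().parse_args(["transcribe", "rec.mp4", "-c", "a.pdf", "--no-view"])` yields a namespace with `context == ["a.pdf"]` and `view` set to `False`.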

`view` — Open an existing SRT in the HTML viewer

atcli view path/to/subtitles.srt

This generates subtitles_viewer.html next to the SRT and opens it in your browser. Use the file picker in the viewer to load the corresponding media file.
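
A small sketch of that naming rule, inferred from the single `subtitles.srt` to `subtitles_viewer.html` example above; the helper names are hypothetical and the real command may derive the path differently.

```python
import webbrowser
from pathlib import Path


def viewer_path_for(srt_path):
    """Place '<stem>_viewer.html' next to the given SRT file (assumed rule)."""
    srt = Path(srt_path)
    return srt.with_name(f"{srt.stem}_viewer.html")


def open_viewer(srt_path):
    """Open the generated viewer in the default browser via a file:// URL."""
    html = viewer_path_for(srt_path).resolve()
    return webbrowser.open(html.as_uri())
```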

HTML Viewer Features

- Two-panel layout: editable subtitle table (left) + media player (right)
- Click any row to seek the player to that position
- Current subtitle highlighted during playback with on-video overlay
- Editable index, start/end timing, and text fields
- Add / delete subtitle rows
- Export SRT button: downloads the current edited state as a .srt file
- Fully self-contained (no internet required, opens via file://)
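
The export feature comes down to serialising the edited rows back into SRT text. The viewer does this in the browser; the Python sketch below only illustrates the format. The helper names are hypothetical, and renumbering cues sequentially on export is an assumption, since the viewer also lets you edit the index by hand.

```python
def format_timestamp(seconds):
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def export_srt(rows):
    """Serialise (start_seconds, end_seconds, text) rows into SRT text.

    Cues are renumbered from 1 so that added or deleted rows stay consistent.
    """
    blocks = []
    for index, (start, end, text) in enumerate(rows, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}")
    # Cues are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"
```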

How It Works

1. Extracts the audio stream with ffmpeg if the input is a video file (no re-encoding)
2. Builds a Soniox context dict from the PDF reference docs (keyword extraction)
3. Splits the audio into 30–60 s segments at silence points
4. Uploads and transcribes each segment via the Soniox async API
5. Merges all tokens with correct time offsets and groups them into subtitle blocks
6. Resolves <split:N> tags for long sentences into separate cues
7. Writes the final SRT file
8. Generates a self-contained HTML viewer and opens it in your browser
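
The segmentation and merge steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's implementation: silence positions are taken as an already-detected, sorted list of timestamps in seconds, tokens are assumed to be dicts carrying `start_ms` and `end_ms` relative to their own segment, and the names `choose_cut_points` and `merge_tokens` are hypothetical.

```python
def choose_cut_points(duration, silences, min_len=30.0, max_len=60.0):
    """Pick cut times so that segments fall between min_len and max_len seconds.

    Prefers the latest silence inside the allowed window; if the window holds
    no silence, falls back to a hard cut at max_len. The final segment may be
    shorter than min_len.
    """
    cuts = []
    start = 0.0
    while duration - start > max_len:
        window = [t for t in silences if start + min_len <= t <= start + max_len]
        cut = window[-1] if window else start + max_len
        cuts.append(cut)
        start = cut
    return cuts


def merge_tokens(segments):
    """Shift each segment's tokens by the segment's start time and concatenate.

    segments: list of (segment_start_seconds, tokens) in playback order.
    """
    merged = []
    for offset_s, tokens in segments:
        offset_ms = round(offset_s * 1000)
        for tok in tokens:
            merged.append({
                **tok,
                "start_ms": tok["start_ms"] + offset_ms,
                "end_ms": tok["end_ms"] + offset_ms,
            })
    return merged
```

Cutting at the latest silence in the window keeps segments long, which reduces the number of uploads, while the hard-cut fallback guarantees progress on audio with no pauses.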

Project Structure

src/atcli/
├── cli.py                 # Typer entry point
├── core/
│   ├── soniox_client.py   # Soniox HTTP client (upload, transcribe, cleanup)
│   ├── audio_segmenter.py # ffmpeg-based silence-aware segmentation
│   ├── context_extractor.py  # PDF → keywords → Soniox context dict
│   ├── token_assembler.py # Soniox tokens → subtitle blocks
│   ├── subtitle_generator.py # blocks → SRT/VTT text + split-tag resolver
│   └── html_viewer.py     # Self-contained HTML editor generator
└── utils/
    └── logger.py          # Rich-based logger

Release history

0.1.0 (this version), May 21, 2026

Download files

Source Distribution

audio_transcribe_cli-0.1.0.tar.gz (29.4 kB)

Uploaded May 21, 2026 Source

Built Distribution

audio_transcribe_cli-0.1.0-py3-none-any.whl (35.0 kB)

Uploaded May 21, 2026 Python 3

File details

Details for the file audio_transcribe_cli-0.1.0.tar.gz.

File metadata

Download URL: audio_transcribe_cli-0.1.0.tar.gz
Upload date: May 21, 2026
Size: 29.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audio_transcribe_cli-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `bfadf9bea44ff0ef36c1a38848b17796ea7d9f5632a96c7658f8e0c44d5cfa65` |
| MD5 | `e529b7ff34c19f86fe1a770567baea6e` |
| BLAKE2b-256 | `bc28a65a275b214cf111c01a1476e8889f729b46a56a6689a5c4e6b282cba6fe` |

Provenance

The following attestation bundles were made for audio_transcribe_cli-0.1.0.tar.gz:

Publisher: publish.yml on sun-asterisk-research/audio-transcribe-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audio_transcribe_cli-0.1.0.tar.gz
- Subject digest: bfadf9bea44ff0ef36c1a38848b17796ea7d9f5632a96c7658f8e0c44d5cfa65
- Sigstore transparency entry: 1590432546
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: sun-asterisk-research/audio-transcribe-cli@0078d3c2e88035f3c23dc3de52942d93dba38345
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sun-asterisk-research
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0078d3c2e88035f3c23dc3de52942d93dba38345
- Trigger Event: push

File details

Details for the file audio_transcribe_cli-0.1.0-py3-none-any.whl.

File metadata

Download URL: audio_transcribe_cli-0.1.0-py3-none-any.whl
Upload date: May 21, 2026
Size: 35.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audio_transcribe_cli-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `f1e6ec823c153918a1959e5d07d630f307cb6986b3e479ea51e2c2c65ab95601` |
| MD5 | `5bed5f7f12c0edbf5f1ae5ed1945abe5` |
| BLAKE2b-256 | `825c98b3aae7411b8b2813d17bec1ca4ea33a971c53215d07de8c3a0856077e8` |

Provenance

The following attestation bundles were made for audio_transcribe_cli-0.1.0-py3-none-any.whl:

Publisher: publish.yml on sun-asterisk-research/audio-transcribe-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: audio_transcribe_cli-0.1.0-py3-none-any.whl
- Subject digest: f1e6ec823c153918a1959e5d07d630f307cb6986b3e479ea51e2c2c65ab95601
- Sigstore transparency entry: 1590432598
- Sigstore integration time: May 21, 2026
Source repository:
- Permalink: sun-asterisk-research/audio-transcribe-cli@0078d3c2e88035f3c23dc3de52942d93dba38345
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/sun-asterisk-research
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@0078d3c2e88035f3c23dc3de52942d93dba38345
- Trigger Event: push
