Generate SRT subtitles from audio/video files using WhisperX

subtitle-engine

Generate .srt subtitle files from audio or video files using WhisperX. Optionally generate a caption from the transcript with a local Ollama LLM.
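
For readers new to the format, an .srt file is a sequence of numbered cues, each with a time range written as HH:MM:SS,mmm and one or more lines of text. The sketch below shows how one cue could be assembled. The helper names `srt_timestamp` and `srt_cue` are hypothetical and are not taken from subtitle-engine's code.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one numbered SRT cue: index line, time range line, text line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


# Hypothetical helpers for illustration only; not subtitle-engine's API.
print(srt_cue(1, 0.0, 1.5, "Hello there"))
```

Cues in a file are separated by a blank line, so joining the results of `srt_cue` with newlines gives a valid file body.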

Installation

Requires Python 3.12 or newer.

pip install subtitle-engine

Or install from source:

git clone https://github.com/leevipuntanen/subtitle-engine.git
cd subtitle-engine
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Usage

# Basic usage: writes <input>.srt next to the source file
subeng video.mp4

# Specify output file
subeng video.mp4 --output subtitles.srt

# Use a different model or language
subeng video.mp4 --model medium --language fi

# Force a device (cpu or cuda) instead of relying on auto-detection
subeng video.mp4 --device cpu

# Speaker diarization (requires a Hugging Face token)
subeng video.mp4 --diarize --hf-token "$HF_TOKEN"

# Generate a caption from the transcript using Ollama
subeng video.mp4 --caption --ollama-model qwen3.5:0.6b

# Generate a caption from an existing SRT file
subeng caption subtitles.srt

# Short-form subtitles (2-5 words per line, default)
subeng video.mp4 --preset shortform

# Long-form subtitles (10-14 words per line)
subeng video.mp4 --preset longform
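
To process a whole folder, the commands above can be wrapped in a small script. This is a minimal sketch that assumes `subeng` is on your PATH. The function names `build_subeng_command` and `transcribe_folder` are hypothetical, and only flags documented in this README are used.

```python
import subprocess
from pathlib import Path


def build_subeng_command(media, model=None, language=None, preset=None):
    """Return the argv list for one subeng run (documented flags only)."""
    command = ["subeng", str(media)]
    if model:
        command += ["--model", model]
    if language:
        command += ["--language", language]
    if preset:
        command += ["--preset", preset]
    return command


def transcribe_folder(folder, pattern="*.mp4", dry_run=True, **options):
    """Build one subeng command per matching file and run them unless dry_run."""
    commands = [
        build_subeng_command(path, **options)
        for path in sorted(Path(folder).glob(pattern))
    ]
    if not dry_run:
        for command in commands:
            # check=True raises on the first failing file
            subprocess.run(command, check=True)
    return commands
```

Calling `transcribe_folder("videos", language="fi")` only returns the commands it would run. Pass `dry_run=False` to execute them.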

Options

| Option | Description |
| --- | --- |
| --output, -o | Output SRT file path |
| --model, -m | WhisperX model: tiny, base, small (default), medium, large-v2, large-v3 |
| --language, -l | ISO language code, e.g. en, fi. Auto-detected if omitted. |
| --device, -d | cpu or cuda. Auto-detected if omitted. |
| --batch-size, -b | Inference batch size (default: 16) |
| --compute-type, -c | int8 or float16. Auto-selected if omitted. |
| --diarize | Enable speaker diarization |
| --hf-token | Hugging Face token for diarization (or set HF_TOKEN env var) |
| --caption | Generate a caption from the transcript via Ollama |
| --ollama-model | Ollama model name. If omitted, installed models are listed and you can pick one. |
| --ollama-host | Ollama API host (default: http://localhost:11434) |
| caption | Generate a caption from an existing SRT file (e.g. subeng caption file.srt) |
| --preset, -p | Subtitle style: shortform (2-5 words, default) or longform (10-14 words) |
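
To see which names are valid for --ollama-model before running a caption job, you can query the Ollama server yourself. The sketch below uses Ollama's `/api/tags` endpoint, which lists locally installed models. Whether subtitle-engine uses this same endpoint internally is an assumption, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def parse_model_names(payload: str) -> list[str]:
    """Extract sorted model names from the JSON body of an /api/tags response."""
    data = json.loads(payload)
    return sorted(model["name"] for model in data.get("models", []))


def list_ollama_models(host: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server for its locally installed models."""
    # Requires the Ollama server to be reachable at `host`.
    with urlopen(f"{host.rstrip('/')}/api/tags", timeout=5) as response:
        return parse_model_names(response.read().decode("utf-8"))
```

`list_ollama_models()` raises a `urllib.error.URLError` when no server is listening, which is a quick way to confirm that the --ollama-host value is reachable.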

Development

Run the test suite:

pytest

License

MIT

Download files

Source Distribution

subtitle_engine-0.1.5.tar.gz (27.8 kB)

Built Distribution

subtitle_engine-0.1.5-py3-none-any.whl (20.8 kB, Python 3)

File details

Details for the file subtitle_engine-0.1.5.tar.gz.

File metadata

  • Download URL: subtitle_engine-0.1.5.tar.gz
  • Size: 27.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for subtitle_engine-0.1.5.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | fd111e0a18295938362b6d7139861bf8d8027ea3728623dab910ada997e961ea |
| MD5 | 6161ecd0283b1c9761df0ff85fe6504f |
| BLAKE2b-256 | e7b2113c0cf70e41b7820b9d73d37da2ab77da5bb11852f89d995c3d9482a6c7 |
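
A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. This is a minimal sketch: the function names are hypothetical, and the file path you pass is wherever you saved the download.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected.strip().lower()
```

For example, `matches_published_hash("subtitle_engine-0.1.5.tar.gz", "fd111e0a...")` with the full SHA256 value from the table should return True for an intact download.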

File details

Details for the file subtitle_engine-0.1.5-py3-none-any.whl.

File hashes

Hashes for subtitle_engine-0.1.5-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ea2bf05d01ea70a2b958c037b2d1a8b2334f82b81ca50d11d1d1f65537842bff |
| MD5 | 74007c65cf4f163d77b60d71aa9c8512 |
| BLAKE2b-256 | b1fea6f87cb2bc73906ef49397b5b9ccc1516115e212fd568dcba52548bfff50 |
