Generate SRT subtitles from audio/video files using WhisperX
subtitle-engine
Generate .srt subtitle files from audio or video files using WhisperX. Optionally generate a caption from the transcript with a local Ollama LLM.
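For orientation, an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line followed by the subtitle text. The sketch below shows how timed segments could be rendered in that format. It is not taken from subtitle-engine's source: the function names are hypothetical, and the segment shape (dicts with `start`, `end` in seconds, and `text`) is an assumption modelled on typical WhisperX output.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues.

    Assumed segment shape (hypothetical): {"start": float, "end": float, "text": str}.
    """
    cues = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        text = segment["text"].strip()
        # Each cue: running index, timing line, text, then a blank separator line.
        cues.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Cue indices start at 1, and a blank line separates consecutive cues.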
Installation
Requires Python 3.12 or newer.
pip install subtitle-engine
Or install from source:
git clone https://github.com/leevipuntanen/subtitle-engine.git
cd subtitle-engine
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
Usage
# Basic usage — writes <input>.srt next to the source file
subeng video.mp4
# Specify output file
subeng video.mp4 --output subtitles.srt
# Use a different model or language
subeng video.mp4 --model medium --language fi
# Force a device (cpu or cuda) instead of auto-detection
subeng video.mp4 --device cpu
# Speaker diarization (requires a Hugging Face token)
subeng video.mp4 --diarize --hf-token $HF_TOKEN
# Generate a caption from the transcript using Ollama
subeng video.mp4 --caption --ollama-model qwen3.5:0.6b
# Generate a caption from an existing SRT file
subeng caption subtitles.srt
# Short-form subtitles (2-5 words per line, default)
subeng video.mp4 --preset shortform
# Long-form subtitles (10-14 words per line)
subeng video.mp4 --preset longform
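To process many files, the commands above can be driven from a script. The following is a minimal sketch that assumes `subeng` is on your `PATH`. It only uses flags documented in this README, and the helper names `build_command` and `transcribe_folder` are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_command(source, *, output=None, model=None, language=None, preset=None):
    """Build a subeng argument list from the documented CLI flags."""
    cmd = ["subeng", str(source)]
    if output is not None:
        cmd += ["--output", str(output)]
    if model is not None:
        cmd += ["--model", model]
    if language is not None:
        cmd += ["--language", language]
    if preset is not None:
        cmd += ["--preset", preset]
    return cmd


def transcribe_folder(folder, pattern="*.mp4", **options):
    """Run subeng once per matching file; stops at the first failing run."""
    for source in sorted(Path(folder).glob(pattern)):
        # check=True raises CalledProcessError if subeng exits with a non-zero status.
        subprocess.run(build_command(source, **options), check=True)
```

For example, `transcribe_folder("recordings", model="medium", language="fi")` would run `subeng <file> --model medium --language fi` for every `.mp4` in a `recordings` directory. Without `--output`, each `.srt` is written next to its source file.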
Options
| Option | Description |
|---|---|
| --output, -o | Output SRT file path |
| --model, -m | WhisperX model: tiny, base, small (default), medium, large-v2, large-v3 |
| --language, -l | ISO language code, e.g. en, fi. Auto-detected if omitted. |
| --device, -d | cpu or cuda. Auto-detected if omitted. |
| --batch-size, -b | Inference batch size (default: 16) |
| --compute-type, -c | int8 or float16. Auto-selected if omitted. |
| --diarize | Enable speaker diarization |
| --hf-token | Hugging Face token for diarization (or set HF_TOKEN env var) |
| --caption | Generate a caption from the transcript via Ollama |
| --ollama-model | Ollama model name. If omitted, installed models are listed and you can pick one. |
| --ollama-host | Ollama API host (default: http://localhost:11434) |
| caption | Generate a caption from an existing SRT file (e.g. subeng caption file.srt) |
| --preset, -p | Subtitle style: shortform (2-5 words, default) or longform (10-14 words) |
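To illustrate what the caption step involves, here is a rough sketch of turning an SRT file into plain text and sending it to Ollama's `/api/generate` endpoint on the default host listed above. This is not subtitle-engine's actual implementation: the prompt wording and the function names are assumptions, and the SRT parsing is deliberately simplified.

```python
import json
import re
import urllib.request

# Matches an SRT timing line such as "00:00:01,250 --> 00:00:02,000".
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt: str) -> str:
    """Strip cue numbers and timing lines, keeping only the spoken text.

    Simplification: a subtitle line consisting solely of digits is also dropped.
    """
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMESTAMP.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


def build_request(transcript: str, model: str, host: str = "http://localhost:11434"):
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {
        "model": model,
        # Hypothetical prompt; the tool's real prompt is not documented here.
        "prompt": f"Write a short caption for this transcript:\n\n{transcript}",
        "stream": False,
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_caption(transcript: str, model: str, host: str = "http://localhost:11434") -> str:
    """Send the request; requires a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_request(transcript, model, host)) as response:
        return json.loads(response.read())["response"].strip()
```

With `"stream": False`, Ollama returns a single JSON object whose `response` field holds the generated text. Installed models can be listed with a GET request to `/api/tags` on the same host.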
Development
Run the test suite:
pytest
License
MIT