# subcap
Burn precisely-timed captions into video. Give it a video and a transcript — it handles alignment, styling, and encoding.
Unlike speech-to-text tools that guess both what is said and when, subcap uses forced alignment: you provide the transcript, and it maps each word to its exact position in the audio waveform. The result is phoneme-level timing accuracy.
## Install

```sh
pip install subcap
```

Requires ffmpeg built with libass support.
## Usage

```sh
# Align a transcript and burn captions in
subcap video.mov transcript.txt -o output.mp4

# Use an existing SRT file (skips alignment)
subcap video.mov subtitles.srt -o output.mp4

# Choose a style
subcap video.mov transcript.txt --style outline

# ProRes output for editing
subcap video.mov transcript.txt --quality studio -o output.mov

# Portrait/vertical video (auto-detected)
subcap shorts.mp4 transcript.txt -o shorts_captioned.mp4
```
## Options

```
subcap <video> <transcript> [options]

-o, --output    Output path (default: <input>_captioned.mp4)
--style         modern | outline | minimal | bold (default: modern)
--quality       standard | high | studio (default: standard)
--max-lines     Max lines per subtitle (default: 2)
--max-chars     Max characters per line (default: auto)
--line-spacing  Gap between lines in px (default: auto)
--position      bottom | center | top (default: bottom)
```
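As an illustration of the `-o` default, the documented `<input>_captioned.mp4` path amounts to renaming the input next to itself. The helper below is hypothetical (not subcap's actual code) and only mirrors the documented behavior:

```python
from pathlib import Path

def default_output(video: str) -> str:
    """Derive the documented default output path "<input>_captioned.mp4".

    Hypothetical helper; subcap's real implementation may differ.
    """
    p = Path(video)
    return str(p.with_name(f"{p.stem}_captioned.mp4"))
```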
## Styles

| Preset | Look |
|---|---|
| `modern` | White bold text, semi-transparent dark box |
| `outline` | White text with black outline |
| `minimal` | Lighter weight, subtle shadow |
| `bold` | Large text, opaque dark box |
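Since the captions are rendered as ASS subtitles (see "How it works"), a preset like `modern` plausibly corresponds to an ASS style definition along these lines. The exact fonts, sizes, and margins here are assumptions, not subcap's actual output; `BorderStyle=3` draws the box, and the `&H80...` back colour makes it semi-transparent:

```
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Modern,Arial,48,&H00FFFFFF,&H00FFFFFF,&H00000000,&H80000000,-1,0,0,0,100,100,0,0,3,2,0,2,40,40,60,1
```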
## Quality

| Preset | Codec | Use case |
|---|---|---|
| `standard` | H.264 | Sharing, uploading |
| `high` | H.265 | Smaller files |
| `studio` | ProRes 422 | Editing, broadcast |
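A rough sketch of how these presets might map onto ffmpeg encoder arguments. The specific flags and CRF values are assumptions for illustration, not subcap's actual settings:

```python
# Sketch: map each quality preset to plausible ffmpeg video-encoder arguments.
# The concrete flags/CRF values are assumptions, not subcap's internals.
QUALITY_PRESETS = {
    "standard": ["-c:v", "libx264", "-crf", "20", "-preset", "medium"],
    "high": ["-c:v", "libx265", "-crf", "24", "-tag:v", "hvc1"],
    "studio": ["-c:v", "prores_ks", "-profile:v", "2"],  # ProRes 422
}

def encoder_args(quality: str) -> list[str]:
    """Return the ffmpeg video-encoder arguments for a quality preset."""
    try:
        return QUALITY_PRESETS[quality]
    except KeyError:
        raise ValueError(f"unknown quality preset: {quality!r}") from None
```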
## How it works

1. Extracts audio from the video
2. Runs forced alignment via stable-ts to map each word to its exact position in the audio
3. Segments the aligned words into readable subtitle chunks
4. Generates styled ASS subtitles adapted to the video's aspect ratio
5. Burns the captions into the video via ffmpeg
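The chunking step can be sketched as a greedy word-packer driven by the `--max-chars` and `--max-lines` options. The function below is hypothetical, not subcap's actual code, and assumes the aligner yields `(word, start, end)` tuples:

```python
def chunk_words(words, max_chars=32, max_lines=2):
    """Greedily pack aligned words into subtitle cues.

    `words` is a list of (text, start, end) tuples from forced alignment.
    Each cue holds at most `max_lines` lines of `max_chars` characters;
    a cue's timing spans its first word's start to its last word's end.
    Hypothetical sketch, not subcap's actual segmentation logic.
    """
    cues, line, lines, cue_words = [], "", [], []

    def flush():
        nonlocal line, lines, cue_words
        if line:
            lines.append(line)
        if cue_words:
            cues.append({
                "text": "\n".join(lines),
                "start": cue_words[0][1],
                "end": cue_words[-1][2],
            })
        line, lines, cue_words = "", [], []

    for text, start, end in words:
        candidate = f"{line} {text}".strip()
        if len(candidate) <= max_chars:
            line = candidate  # word fits on the current line
        elif len(lines) + 1 < max_lines:
            lines.append(line)  # wrap to a new line in the same cue
            line = text
        else:
            flush()  # cue is full: emit it and start a fresh one
            line = text
        cue_words.append((text, start, end))
    flush()
    return cues
```

A word that overflows the last allowed line closes the current cue, so timing stays tied to the words actually shown in each cue.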
## Acknowledgments
Built on:
- stable-ts — Stabilized Whisper timestamps and forced alignment
- OpenAI Whisper — Speech recognition model used as the acoustic backbone
- ffmpeg — Video encoding and subtitle rendering via libass
## License
MIT