
subcap

Burn precisely timed captions into video. Give it a video and a transcript; it handles alignment, styling, and encoding.

Unlike speech-to-text tools that guess both what is said and when, subcap uses forced alignment: you provide the transcript, and it maps each word to its exact position in the audio waveform. The result is phoneme-level timing accuracy.
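In essence, the aligner's output is each transcript word paired with start and end times in the audio, and a caption cue's timing falls straight out of that. A minimal sketch of the idea (the word list and helper below are illustrative, not subcap's actual API):

```python
# Illustrative shape of forced-alignment output: transcript words paired
# with (start, end) times in seconds, recovered from the audio waveform.
words = [
    ("Give", 0.32, 0.55),
    ("it", 0.58, 0.66),
    ("a", 0.69, 0.73),
    ("video", 0.76, 1.10),
]

def to_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

# A cue spans from the first word's start to the last word's end.
start, end = words[0][1], words[-1][2]
print(to_srt_time(start), "-->", to_srt_time(end))
```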

Install

pip install subcap

Requires ffmpeg with libass support.

Usage

# Align a transcript and burn captions in
subcap video.mov transcript.txt -o output.mp4

# Use an existing SRT file (skips alignment)
subcap video.mov subtitles.srt -o output.mp4

# Choose a style
subcap video.mov transcript.txt --style outline

# ProRes output for editing
subcap video.mov transcript.txt --quality studio -o output.mov

# Portrait/vertical video (auto-detected)
subcap shorts.mp4 transcript.txt -o shorts_captioned.mp4

Options

subcap <video> <transcript> [options]

  -o, --output          Output path (default: <input>_captioned.mp4)
  --style               modern | outline | minimal | bold (default: modern)
  --quality             standard | high | studio (default: standard)
  --max-lines           Max lines per subtitle (default: 2)
  --max-chars           Max characters per line (default: auto)
  --line-spacing        Gap between lines in px (default: auto)
  --position            bottom | center | top (default: bottom)
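From a script, the CLI can be driven with subprocess using only the flags documented above. The wrapper function here is a hypothetical convenience, not part of subcap:

```python
import subprocess

def subcap_cmd(video, transcript, output, style="modern", quality="standard"):
    """Build a subcap invocation from the documented options."""
    return [
        "subcap", video, transcript,
        "-o", output,
        "--style", style,
        "--quality", quality,
    ]

cmd = subcap_cmd("talk.mov", "talk.txt", "talk_captioned.mp4", style="outline")
# subprocess.run(cmd, check=True)  # uncomment to actually run the CLI
print(" ".join(cmd))
```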

Styles

Preset    Look
modern    White bold text, semi-transparent dark box
outline   White text with black outline
minimal   Lighter weight, subtle shadow
bold      Large text, opaque dark box
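Under the hood, each preset ultimately becomes a Style: line in the generated ASS file. A sketch of what the outline preset might expand to; the field values are illustrative guesses, not subcap's real preset definitions:

```python
def ass_style(name, font="Arial", size=48, outline=3, shadow=0):
    """Emit an ASS V4+ Style line (illustrative values, not subcap's).

    Fields: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour,
    OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut,
    ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow,
    Alignment, MarginL, MarginR, MarginV, Encoding.
    Colours are &HAABBGGRR: white text, black outline here.
    """
    return (
        f"Style: {name},{font},{size},"
        "&H00FFFFFF,&H00FFFFFF,&H00000000,&H00000000,"
        "-1,0,0,0,100,100,0,0,1,"
        f"{outline},{shadow},2,40,40,40,1"
    )

print(ass_style("outline"))
```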

Quality

Preset     Codec       Use case
standard   H.264       Sharing, uploading
high       H.265       Smaller files
studio     ProRes 422  Editing, broadcast
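In ffmpeg terms, these presets map to encoder choices roughly as follows. libx264, libx265, and prores_ks are the standard ffmpeg encoders for these codecs; the specific flag values are illustrative, since subcap's exact encoder settings are not documented here:

```python
# Rough mapping from quality preset to ffmpeg video-encoder arguments.
# Flag values are illustrative, not subcap's actual settings.
QUALITY_ARGS = {
    "standard": ["-c:v", "libx264", "-crf", "23"],        # H.264
    "high":     ["-c:v", "libx265", "-crf", "26"],        # H.265, smaller files
    "studio":   ["-c:v", "prores_ks", "-profile:v", "2"], # ProRes 422
}
```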

How it works

  1. Extracts audio from the video
  2. Runs forced alignment via stable-ts to map each word to its exact position in the audio
  3. Segments words into readable subtitle chunks
  4. Generates styled ASS subtitles adapted to the video's aspect ratio
  5. Burns captions into the video via ffmpeg
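Step 3 is essentially a line-breaking problem: pack timed words into cues without exceeding the line limits. A greedy sketch of that step (not subcap's actual algorithm), assuming word timings from step 2:

```python
def chunk_words(words, max_chars=32, max_lines=2):
    """Greedily pack (text, start, end) words into subtitle cues.

    Each cue holds at most `max_lines` lines of at most `max_chars`
    characters; a new cue starts when the current one is full.
    """
    cues, lines, line = [], [], []

    def line_len(ws):
        # Total characters including single spaces between words.
        return sum(len(w[0]) for w in ws) + max(len(ws) - 1, 0)

    def flush(lines):
        first, last = lines[0][0], lines[-1][-1]
        cues.append({
            "start": first[1],   # first word's start time
            "end": last[2],      # last word's end time
            "text": "\n".join(" ".join(w[0] for w in l) for l in lines),
        })

    for w in words:
        if line and line_len(line + [w]) > max_chars:
            lines.append(line)
            line = []
            if len(lines) == max_lines:
                flush(lines)
                lines = []
        line.append(w)
    if line:
        lines.append(line)
    if lines:
        flush(lines)
    return cues

words = [("Burn", 0.0, 0.3), ("precisely", 0.35, 0.9), ("timed", 0.95, 1.2),
         ("captions", 1.25, 1.7), ("into", 1.75, 1.9), ("video", 1.95, 2.4)]
cues = chunk_words(words, max_chars=20, max_lines=2)
```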

Acknowledgments

Built on:

  • stable-ts — Stabilized Whisper timestamps and forced alignment
  • OpenAI Whisper — Speech recognition model used as the acoustic backbone
  • ffmpeg — Video encoding and subtitle rendering via libass

License

MIT
