Burn precisely-timed captions into video using forced alignment.

subcap

Burn precisely-timed captions into video. Give it a video and a transcript — it handles alignment, styling, and encoding.

Unlike speech-to-text tools that guess both what is said and when, subcap uses forced alignment: you provide the transcript, and wav2vec2 maps each word to its position in the audio waveform. Because only the timing is estimated, the result is phoneme-level timing without the drift and cascading errors that transcription mistakes introduce.

Install

pip install subcap

Requires Python 3.10–3.12 and ffmpeg with libass support.
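
To confirm the ffmpeg requirement before a first run, a check along these lines may help. It is a minimal sketch and not part of subcap: the helper names `has_libass` and `ffmpeg_version_output` are made up for illustration, and it assumes that a libass-enabled build lists `--enable-libass` in the configuration line printed by `ffmpeg -version`.

```python
import shutil
import subprocess
from typing import Optional


def has_libass(version_output: str) -> bool:
    """Return True if ffmpeg's build configuration mentions libass."""
    return "--enable-libass" in version_output


def ffmpeg_version_output(executable: str = "ffmpeg") -> Optional[str]:
    """Return the text printed by `ffmpeg -version`, or None if ffmpeg is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    output = ffmpeg_version_output()
    if output is None:
        print("ffmpeg not found on PATH")
    elif has_libass(output):
        print("ffmpeg found, built with libass")
    else:
        print("ffmpeg found, but libass is not listed in its build configuration")
```

Some distributions package libass support differently, so a negative result here is a hint to investigate rather than proof that captions cannot be rendered.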

On first run, subcap downloads the wav2vec2 alignment model (~360 MB).

Usage

# Align a transcript and burn captions in
subcap video.mov transcript.txt -o output.mp4

# Use an existing SRT file (skips alignment)
subcap video.mov subtitles.srt -o output.mp4

# Choose a style
subcap video.mov transcript.txt --style outline

# ProRes output for editing
subcap video.mov transcript.txt --quality studio -o output.mov

# Portrait/vertical video (auto-detected)
subcap shorts.mp4 transcript.txt -o shorts_captioned.mp4

Options

subcap <video> <transcript> [options]

  -o, --output          Output path (default: <input>_captioned.mp4)
  --style               modern | outline | minimal | bold (default: modern)
  --quality             standard | high | studio (default: standard)
  --max-lines           Max lines per subtitle (default: 2)
  --max-chars           Max characters per line (default: auto)
  --line-spacing        Gap between lines in px (default: auto)
  --position            bottom | center | top (default: bottom)
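
For batch jobs it can be convenient to assemble the command line from Python. The sketch below is a hypothetical wrapper, not an API that subcap provides: `build_subcap_command` is an invented name, and it only uses the flags and choices listed above.

```python
STYLES = ("modern", "outline", "minimal", "bold")
QUALITIES = ("standard", "high", "studio")
POSITIONS = ("bottom", "center", "top")


def build_subcap_command(video, transcript, output=None, style="modern",
                         quality="standard", position="bottom", max_lines=2):
    """Build an argument list for the subcap CLI, validating the documented choices."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style!r}")
    if quality not in QUALITIES:
        raise ValueError(f"unknown quality: {quality!r}")
    if position not in POSITIONS:
        raise ValueError(f"unknown position: {position!r}")
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")

    command = [
        "subcap", str(video), str(transcript),
        "--style", style,
        "--quality", quality,
        "--position", position,
        "--max-lines", str(max_lines),
    ]
    # Leave -o out when no output is given so subcap applies its own default.
    if output is not None:
        command += ["-o", str(output)]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, for example inside a loop over a folder of video and transcript pairs.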

Styles

Preset    Look
modern    White bold text, semi-transparent dark box
outline   White text with black outline
minimal   Lighter weight, subtle shadow
bold      Large text, opaque dark box

Quality

Preset    Codec       Use case
standard  H.264       Sharing, uploading
high      H.265       Smaller files
studio    ProRes 422  Editing, broadcast
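
As a rough illustration of how presets like these can map onto ffmpeg, the sketch below builds a burn-in command for each one. The encoder choices (`libx264`, `libx265`, `prores_ks` with profile 2 for standard ProRes 422) and the use of the `ass` video filter are assumptions about a typical ffmpeg setup. They are not taken from subcap's source, which may use different encoders and settings, and `build_burn_command` is a hypothetical name.

```python
# Assumed encoder arguments per preset; subcap's real settings may differ.
ENCODER_ARGS = {
    "standard": ["-c:v", "libx264"],
    "high": ["-c:v", "libx265"],
    "studio": ["-c:v", "prores_ks", "-profile:v", "2"],
}


def build_burn_command(video, ass_file, output, quality="standard"):
    """Build an ffmpeg argument list that renders an ASS file onto the video."""
    if quality not in ENCODER_ARGS:
        raise ValueError(f"unknown quality: {quality!r}")
    return [
        "ffmpeg",
        "-i", video,
        "-vf", f"ass={ass_file}",   # libass renders the subtitles into each frame
        *ENCODER_ARGS[quality],
        "-c:a", "copy",             # keep the original audio stream untouched
        output,
    ]
```

Paths that contain colons, backslashes, or quotes need escaping inside the filter argument, which this sketch does not attempt.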

How it works

  1. Extracts audio from the video
  2. Runs phoneme-level forced alignment via WhisperX (wav2vec2) to map each word of your transcript to its exact position in the audio
  3. Segments words into readable subtitle chunks, breaking at sentence boundaries
  4. Generates styled ASS subtitles adapted to the video's aspect ratio
  5. Burns captions into the video via ffmpeg
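
Steps 3 and 4 can be sketched in a few lines. This is an illustrative simplification and not subcap's actual segmentation logic: the `Word` and `Cue` types, the default character budget, and the rule "flush at sentence-ending punctuation or when the budget is exceeded" are all assumptions. `format_ass_time` produces the `H:MM:SS.cc` timestamps that ASS files use.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    start: float
    end: float
    text: str


SENTENCE_END = (".", "!", "?")


def _make_cue(words):
    """Merge consecutive timed words into one subtitle cue."""
    return Cue(words[0].start, words[-1].end, " ".join(w.text for w in words))


def segment_words(words, max_chars=42, max_lines=2):
    """Group timed words into cues, breaking at sentence ends or when the text budget is full."""
    budget = max_chars * max_lines
    cues, current = [], []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        if current and len(candidate) > budget:
            cues.append(_make_cue(current))
            current = []
        current.append(word)
        if word.text.endswith(SENTENCE_END):
            cues.append(_make_cue(current))
            current = []
    if current:
        cues.append(_make_cue(current))
    return cues


def format_ass_time(seconds):
    """Format seconds as an ASS timestamp (H:MM:SS.cc, centisecond precision)."""
    centis = round(seconds * 100)
    hours, rem = divmod(centis, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"
```

A real segmenter would also wrap each cue into lines and account for pauses between words. The point here is only that the word timings from alignment carry straight through to the cue boundaries.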

Because the text is fixed and only the timing is being solved, alignment typically stays precise even for fast speech, accents, or overlapping audio, conditions that often break speech-to-text.
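
The idea of solving only for timing can be shown with a toy dynamic program. This is a simplified sketch, not WhisperX's implementation: real CTC-based alignment also handles a blank token and works on wav2vec2 emissions. The function name `force_align` and the input layout (`emissions[t][k]` is the log-probability of token `k` at frame `t`) are assumptions made for this example.

```python
import math


def force_align(emissions, tokens):
    """Return a (start_frame, end_frame) span per token for the best monotonic path.

    Every frame is assigned to exactly one token, tokens keep their given order,
    and each token receives at least one frame.
    """
    n_frames, n_tokens = len(emissions), len(tokens)
    if n_tokens == 0 or n_frames < n_tokens:
        raise ValueError("need at least one frame per token")

    neg_inf = -math.inf
    score = [[neg_inf] * n_tokens for _ in range(n_frames)]
    advanced = [[False] * n_tokens for _ in range(n_frames)]
    score[0][0] = emissions[0][tokens[0]]

    for t in range(1, n_frames):
        for j in range(n_tokens):
            stay = score[t - 1][j]                         # same token as previous frame
            move = score[t - 1][j - 1] if j > 0 else neg_inf  # step to the next token
            best = max(stay, move)
            if best == neg_inf:
                continue                                   # state not reachable yet
            advanced[t][j] = move > stay
            score[t][j] = best + emissions[t][tokens[j]]

    # Walk back from the last token at the last frame to recover the path.
    path = [0] * n_frames
    j = n_tokens - 1
    for t in range(n_frames - 1, -1, -1):
        path[t] = j
        if advanced[t][j]:
            j -= 1

    spans = []
    for j in range(n_tokens):
        frames = [t for t, p in enumerate(path) if p == j]
        spans.append((frames[0], frames[-1] + 1))  # end frame is exclusive
    return spans
```

Multiplying the frame indices by the model's frame duration gives times in seconds. Because the token sequence is fixed, the search never has to decide what was said, only where each token starts and ends.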

Transcript notes

Your transcript must match what's actually said in the audio. Small edits are tolerated, but missing or extra sentences will cause alignment failures. If the speaker ad-libs or skips text, update the transcript to match the final delivery.

Acknowledgments

Built on:

  • WhisperX — Phoneme-level forced alignment using wav2vec2
  • wav2vec2 — Self-supervised speech model used as the acoustic backbone for alignment
  • ffmpeg — Video encoding and subtitle rendering via libass

License

MIT
