20 audio-driven caption styles for video. Composes text-fx + lyric-sync + audio-arrange.

Project description

caption-cast

20 audio-driven caption styles for video.

License: MIT. Requires Python 3.10+.


caption-cast renders styled, timed captions onto video. It provides 20 named caption styles, each a fixed composition of text-fx effects, timing behaviour, and layout rules. It does not do speech transcription, audio separation, or beat detection itself. Lyric ingestion and beat analysis are handled by the sibling packages lyric-sync and audio-arrange respectively, which caption-cast can consume when installed as optional extras.

The library was built to serve the caption layer of Trollfabriken's automated video pipeline, where clip metadata already carries subtitle or lyric timecodes and the remaining work is rendering them consistently across styles.


What it solves

Pain point | Resolution
Karaoke-style word-by-word highlighting requires tight timing control and per-word colour switching | burn_lyrics() with any karaoke_* style handles word-level timing from a pysubs2 SSAFile or from lyric-sync output
Subtitle burn-in varies wildly in font, position, and animation across tools | 20 named styles give a reproducible, versioned rendering contract; pick a slug, get the same output every time
Beat-synchronised caption pulses need audio analysis glued to render | beat_synced_captions() accepts an audio-arrange beat grid and schedules caption emphasis automatically

Installation

Core package (requires ffmpeg on PATH):

pip install caption-cast

With lyric-sync support (LRC / MusicXML ingestion, word-level timing):

pip install "caption-cast[lyrics]"

With audio-arrange support (beat detection, tempo analysis):

pip install "caption-cast[beats]"

Full extras:

pip install "caption-cast[all]"

Development:

pip install "caption-cast[dev]"

ffmpeg must be available on your system PATH. Install it through your platform's package manager.
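A quick preflight check can confirm that ffmpeg is reachable before any rendering is attempted. The sketch below uses only the Python standard library; the helper name ffmpeg_available is illustrative and is not part of caption-cast.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found on PATH; install it before rendering")
```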


Quick start

1. Burn subtitles from an SRT file

from caption_cast import burn_subtitles

burn_subtitles(
    video="input.mp4",
    subtitles="dialogue.srt",
    style="clean_lower_third",
    output="output.mp4",
)

The subtitles parameter accepts a path to any pysubs2-readable format (SRT, ASS, VTT, SSA).

2. Karaoke word-by-word highlighting from an LRC file

from caption_cast import burn_lyrics

burn_lyrics(
    video="music_video.mp4",
    lyrics="song.lrc",          # LRC with word-level timestamps
    style="karaoke_neon",
    output="music_video_captioned.mp4",
    highlight_color="#FF3399",
    base_color="#FFFFFF",
)

If lyric-sync is installed, you can pass a lyric_sync.LyricTrack object directly instead of a file path. See lyric-sync's README for how to produce one from MusicXML or Spotify API data.

3. Beat-synchronised caption pulses

from caption_cast import beat_synced_captions
from audio_arrange import analyse_beats   # requires caption-cast[beats]

beat_grid = analyse_beats("track.wav")

beat_synced_captions(
    video="clip.mp4",
    subtitles="lyrics.srt",
    beat_grid=beat_grid,
    style="pulse_bold",
    output="clip_captioned.mp4",
)

On each beat onset the active caption receives a scale/opacity pulse governed by the style's beat_emphasis parameters. The default pulse lasts 80 ms and can be overridden per call.
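To make the scheduling idea concrete, here is a minimal sketch of how beat onsets could be turned into pulse windows. It assumes the beat grid is a plain sequence of onset times in seconds and that captions are (start, end) intervals in seconds; the Pulse class and schedule_pulses function are hypothetical and do not describe caption-cast's actual beat.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pulse:
    start: float  # seconds
    end: float    # seconds


def schedule_pulses(beats, captions, pulse_ms=80.0):
    """Return one Pulse for each beat that lands inside a caption interval.

    beats: iterable of beat onset times in seconds (assumed format)
    captions: iterable of (start, end) caption intervals in seconds
    pulse_ms: pulse length in milliseconds (80 ms matches the stated default)
    """
    captions = list(captions)
    duration = pulse_ms / 1000.0
    pulses = []
    for beat in sorted(beats):
        for start, end in captions:
            if start <= beat < end:
                # Clip the pulse so it never outlives its caption.
                pulses.append(Pulse(beat, min(beat + duration, end)))
                break
    return pulses


# Two of the four beats fall inside the single caption interval.
pulses = schedule_pulses([0.5, 1.0, 1.5, 2.0], [(0.9, 1.6)])
print(len(pulses))  # prints 2
```

Beats that fall outside every caption are simply skipped, which keeps the pulse schedule independent of how dense the beat grid is.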


The 20 styles

Slug | Display name | Primary use case
clean_lower_third | Clean Lower Third | Documentary, interview subtitles
clean_center | Clean Center | Narrative subtitles, no music
minimal_top | Minimal Top | B-roll with top-aligned text
bold_lower_third | Bold Lower Third | Social media, short-form video
bold_center | Bold Center | Impact titles doubling as captions
outline_lower_third | Outline Lower Third | Subtitles over bright or variable backgrounds
outline_center | Outline Center | Lyrics on music videos with complex backgrounds
drop_shadow_lower | Drop Shadow Lower | Standard broadcast-style lower third
karaoke_classic | Karaoke Classic | Word-by-word left-to-right wipe highlight
karaoke_neon | Karaoke Neon | Word highlight with neon glow effect
karaoke_fill | Karaoke Fill | Word highlight with solid fill colour swap
karaoke_bounce | Karaoke Bounce | Word highlight with vertical bounce on activation
karaoke_wave | Karaoke Wave | Sequential per-character wave on active word
pulse_bold | Pulse Bold | Caption scales on beat onset, bold weight
pulse_glow | Pulse Glow | Caption glow radius pulses on beat onset
pulse_color | Pulse Color | Caption colour shifts on beat onset
fade_word | Fade Word | Each word fades in on its start timecode
slide_up | Slide Up | Line slides up into position on entry
typewriter | Typewriter | Characters reveal left-to-right at constant rate
pop_center | Pop Center | Caption pops to full size from zero scale

All 20 styles are parametric. Every parameter has a default; pass keyword arguments to apply_caption, burn_lyrics, or burn_subtitles to override individual parameters without changing the base style.
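The override behaviour can be pictured with a small dataclass sketch. This is an illustration of the general pattern only: the CaptionStyle class, its font_size field, and the default values are assumptions, while highlight_color and base_color are taken from the karaoke example above. It does not reproduce caption-cast's real style catalog.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CaptionStyle:
    """Hypothetical style record; field names and defaults are illustrative."""
    slug: str
    font_size: int = 48
    base_color: str = "#FFFFFF"
    highlight_color: str = "#FFFF00"


def with_overrides(style: CaptionStyle, **overrides) -> CaptionStyle:
    """Return a copy of `style` with the given fields replaced.

    Unknown parameter names raise TypeError, so typos fail loudly.
    """
    return replace(style, **overrides)


base = CaptionStyle(slug="karaoke_neon")
custom = with_overrides(base, highlight_color="#FF3399")
print(custom.highlight_color)  # prints #FF3399
print(base.highlight_color)    # prints #FFFF00 (the base style is unchanged)
```

Keeping the base style immutable and returning a modified copy means a per-call override cannot leak into later renders that use the same slug.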


Lyrics input formats

caption-cast accepts timed lyrics in three ways:

pysubs2-readable files — SRT, ASS, SSA, VTT. Word-level timing in ASS/SSA format is supported for karaoke styles. Pass the file path as the lyrics or subtitles argument.

LRC files — Standard LRC with line timestamps. Enhanced LRC with word-level <mm:ss.xx> tags is required for karaoke styles. Pass the file path; caption-cast parses LRC internally via pysubs2.

lyric-sync LyricTrack — When lyric-sync >= 0.1 is installed, pass a LyricTrack object directly. This is the preferred path when you need to ingest MusicXML, parse Spotify's lyrics API, or align phoneme boundaries. lyric-sync produces word-level timecodes that map cleanly onto the karaoke styles.

# With lyric-sync installed
from lyric_sync import parse_lrc, align_words
from caption_cast import burn_lyrics

track = parse_lrc("song.lrc")
aligned = align_words(track, audio="song.wav")   # optional phoneme alignment

burn_lyrics(
    video="clip.mp4",
    lyrics=aligned,          # LyricTrack object accepted directly
    style="karaoke_wave",
    output="out.mp4",
)
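For readers unfamiliar with the Enhanced LRC word-level tags mentioned above, the following standard-library sketch shows what such a line contains and how its timestamps can be read. It is a hedged illustration of the file format, not caption-cast's or lyric-sync's parser; the function name parse_enhanced_lrc_line is hypothetical.

```python
import re

# [mm:ss.xx] at the start of a line marks the line start time.
LINE_TAG = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\]")
# <mm:ss.xx> marks the start time of the text that follows it.
WORD_TAG = re.compile(r"<(\d+):(\d+(?:\.\d+)?)>([^<]*)")


def _seconds(minutes: str, seconds: str) -> float:
    return int(minutes) * 60 + float(seconds)


def parse_enhanced_lrc_line(line: str):
    """Parse one Enhanced LRC line into (line_start, [(word_start, word), ...]).

    Returns None for lines without a leading [mm:ss.xx] tag,
    such as metadata tags or blank lines.
    """
    line = line.strip()
    match = LINE_TAG.match(line)
    if match is None:
        return None
    line_start = _seconds(*match.groups())
    rest = line[match.end():]
    words = [
        (_seconds(minutes, seconds), text.strip())
        for minutes, seconds, text in WORD_TAG.findall(rest)
        if text.strip()
    ]
    return line_start, words


example = "[00:12.00]<00:12.00>Hello <00:12.50>brave <00:13.10>world"
print(parse_enhanced_lrc_line(example))
# prints (12.0, [(12.0, 'Hello'), (12.5, 'brave'), (13.1, 'world')])
```

A line that carries only the leading [mm:ss.xx] tag yields an empty word list, which is why plain LRC is not enough for the karaoke styles.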

CLI

caption-cast ships a CLI entry point named caption-cast.

List all styles:

caption-cast styles

Burn subtitles from the command line:

caption-cast burn \
  --video input.mp4 \
  --subtitles dialogue.srt \
  --style clean_lower_third \
  --output output.mp4

Burn karaoke lyrics:

caption-cast lyrics \
  --video music_video.mp4 \
  --lyrics song.lrc \
  --style karaoke_neon \
  --highlight-color "#FF3399" \
  --output out.mp4

Inspect a style's parameters:

caption-cast info karaoke_neon

Get version:

caption-cast --version

All burn and lyrics subcommand options map 1-to-1 to the Python API keyword arguments. Run caption-cast <subcommand> --help for a full parameter list.
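The 1-to-1 mapping suggests a simple way to drive the CLI from Python keyword arguments. The sketch below assumes each keyword becomes a flag by swapping underscores for hyphens, as in highlight_color and --highlight-color above; the helper kwargs_to_cli_args is hypothetical and only builds the argument list.

```python
def kwargs_to_cli_args(subcommand: str, **kwargs) -> list[str]:
    """Build an argv list for the caption-cast CLI from Python-style kwargs.

    Assumes every keyword maps to a flag with underscores replaced by hyphens.
    """
    argv = ["caption-cast", subcommand]
    for name, value in kwargs.items():
        argv += ["--" + name.replace("_", "-"), str(value)]
    return argv


argv = kwargs_to_cli_args(
    "lyrics",
    video="music_video.mp4",
    lyrics="song.lrc",
    style="karaoke_neon",
    highlight_color="#FF3399",
    output="out.mp4",
)
print(" ".join(argv))
```

The resulting list can be handed to subprocess.run(argv, check=True), which avoids shell quoting problems with values such as "#FF3399".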


Composition with the Trollfabriken stack

caption-cast sits between the ingestion packages (lyric-sync, audio-arrange) and the render engine (text-fx). The dependency chain is:

lyric-sync ───┐
              ├──► caption-cast ──► text-fx ──► ffmpeg
audio-arrange ┘

caption-cast does not need lyric-sync or audio-arrange at runtime if you supply pre-timed subtitle files. Install the extras only when you need programmatic lyrics ingestion or beat analysis.
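Because the extras are optional, calling code may want to check which ones are present before choosing an input path. A minimal sketch, assuming the import names lyric_sync and audio_arrange used in the examples above; the available_extras helper is illustrative and not part of caption-cast.

```python
from importlib.util import find_spec


def available_extras() -> dict[str, bool]:
    """Report which optional sibling packages can be imported."""
    modules = {"lyrics": "lyric_sync", "beats": "audio_arrange"}
    return {
        extra: find_spec(module) is not None
        for extra, module in modules.items()
    }


extras = available_extras()
if not extras["beats"]:
    print('Beat analysis unavailable; install "caption-cast[beats]" to enable it')
```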

For title cards and motion-graphic intro/outro sequences, see title-fx, which builds on text-fx with a separate catalog of 44 cinematic title effects.


Package structure

caption-cast/
  src/
    caption_cast/
      __init__.py          # public API: apply_caption, burn_lyrics, burn_subtitles,
      api.py               #             beat_synced_captions, list_styles, get_style_info
      cli.py               # caption-cast CLI entry point
      renderer.py          # delegates to text-fx render pipeline
      timing.py            # timecode parsing, word-level offset resolution
      beat.py              # beat grid → caption emphasis schedule
      styles/
        __init__.py        # style registry
        catalog.py         # style definitions (parametric dataclasses)
      data/
        caption_styles.json  # serialised style catalog (shipped in wheel)
  tests/

License

MIT. Copyright 2026 Trollfabriken AITrix AB.

Part of the Trollfabriken stack.

Download files

Source Distribution

caption_cast-0.1.0.tar.gz (29.0 kB view details)

Uploaded Source

Built Distribution

caption_cast-0.1.0-py3-none-any.whl (39.0 kB view details)

Uploaded Python 3

File details

Details for the file caption_cast-0.1.0.tar.gz.

File metadata

  • Download URL: caption_cast-0.1.0.tar.gz
  • Upload date:
  • Size: 29.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for caption_cast-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f4864f79a088c9ffe0cc706b420e63b0b66e50af9d98343b3834fc1ff2a1e624
MD5 f40d5f9f1b6cdc769fe8120ec4a143ba
BLAKE2b-256 375a3ecc05c9a2e29983c484ffb1e6bc3577ae3c7905e449207fb96afa9c5b74

Provenance

The following attestation bundles were made for caption_cast-0.1.0.tar.gz:

Publisher: release.yml on tomastimelock/caption-cast

File details

Details for the file caption_cast-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: caption_cast-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 39.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for caption_cast-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 31cf0d52563facdefd8447abdb28799d1840923c72f96d38acbe517c09457be0
MD5 5198308e666db18c8b61078d68810d71
BLAKE2b-256 b611cde5765309dfebb1aedabc5d0075ee7dde72abeba2b18078710fbf8d91dc

Provenance

The following attestation bundles were made for caption_cast-0.1.0-py3-none-any.whl:

Publisher: release.yml on tomastimelock/caption-cast
