20 audio-driven caption styles for video. Composes text-fx + lyric-sync + audio-arrange.

Maintainers

opusmorale

caption-cast

20 audio-driven caption styles for video.

caption-cast renders styled, timed captions onto video. It provides 20 named caption styles, each a fixed composition of text-fx effects, timing behaviour, and layout rules. It does not do speech transcription, audio separation, or beat detection itself. Lyric ingestion and word-level timing come from lyric-sync, and beat detection from audio-arrange; both are sibling packages that caption-cast can consume when installed as optional extras.

The library was built to serve the caption layer of Trollfabriken's automated video pipeline, where clip metadata already carries subtitle or lyric timecodes and the remaining work is rendering them consistently across styles.

What it solves

Pain point	Resolution
Karaoke-style word-by-word highlighting requires tight timing control and per-word colour switching	`burn_lyrics()` with any `karaoke_*` style handles word-level timing from a pysubs2 `SSAFile` or from lyric-sync output
Subtitle burn-in varies wildly in font, position, and animation across tools	20 named styles give a reproducible, versioned rendering contract; pick a slug, get the same output every time
Beat-synchronised caption pulses need audio analysis wired into the renderer	`beat_synced_captions()` accepts an `audio-arrange` beat grid and schedules caption emphasis automatically

Installation

Core package (requires ffmpeg on PATH):

pip install caption-cast

With lyric-sync support (LRC / MusicXML ingestion, word-level timing):

pip install "caption-cast[lyrics]"

With audio-arrange support (beat detection, tempo analysis):

pip install "caption-cast[beats]"

Full extras:

pip install "caption-cast[all]"

Development:

pip install "caption-cast[dev]"

ffmpeg must be available on your system PATH. Install via:

Ubuntu/Debian: sudo apt-get install ffmpeg
macOS: brew install ffmpeg
Windows: choco install ffmpeg or download from https://ffmpeg.org/download.html
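
Because the core package requires ffmpeg on PATH, it can be useful to confirm the binary is discoverable before starting a render. The following is a minimal sketch using only the standard library; `find_ffmpeg` is a hypothetical helper for illustration and is not part of caption-cast.

```python
import shutil


def find_ffmpeg(binary: str = "ffmpeg") -> str:
    """Return the absolute path of the ffmpeg binary, or raise if it is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        raise RuntimeError(
            f"{binary!r} was not found on PATH; install it before rendering captions"
        )
    return path
```

Calling `find_ffmpeg()` at the start of a script turns a missing install into an early, explicit error instead of a failure partway through a render.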

Quick start

1. Burn subtitles from an SRT file

from caption_cast import burn_subtitles

burn_subtitles(
    video="input.mp4",
    subtitles="dialogue.srt",
    style="clean_lower_third",
    output="output.mp4",
)

The subtitles parameter accepts a path to a file in any pysubs2-readable format (SRT, ASS, SSA, VTT).

2. Karaoke word-by-word highlighting from an LRC file

from caption_cast import burn_lyrics

burn_lyrics(
    video="music_video.mp4",
    lyrics="song.lrc",          # LRC with word-level timestamps
    style="karaoke_neon",
    output="music_video_captioned.mp4",
    highlight_color="#FF3399",
    base_color="#FFFFFF",
)

If lyric-sync is installed, you can pass a lyric_sync.LyricTrack object directly instead of a file path. See lyric-sync's README for how to produce one from MusicXML or Spotify API data.

3. Beat-synchronised caption pulses

from caption_cast import beat_synced_captions
from audio_arrange import analyse_beats   # requires caption-cast[beats]

beat_grid = analyse_beats("track.wav")

beat_synced_captions(
    video="clip.mp4",
    subtitles="lyrics.srt",
    beat_grid=beat_grid,
    style="pulse_bold",
    output="clip_captioned.mp4",
)

On each beat onset the active caption receives a scale/opacity pulse governed by the style's beat_emphasis parameters. The default pulse lasts 80 ms and can be overridden per call.
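
To make the scheduling idea concrete, here is a rough sketch of how beat onsets could be turned into pulse windows for whichever caption is on screen. It is not caption-cast's implementation: the function name `pulse_windows`, the representation of the beat grid as a sorted list of onset times in seconds, and the clipping of a pulse at the caption's end are all assumptions made for illustration. Only the 80 ms default comes from the text above.

```python
from bisect import bisect_left
from typing import List, Tuple

PULSE_SECONDS = 0.080  # default pulse length stated above (80 ms)


def pulse_windows(
    beats: List[float],
    captions: List[Tuple[float, float]],
    pulse: float = PULSE_SECONDS,
) -> List[Tuple[float, float]]:
    """Return (start, end) pulse windows for beats that fall inside a caption.

    `beats` must be sorted onset times in seconds; `captions` are (start, end)
    display intervals in seconds. Each pulse is clipped to its caption's end.
    """
    windows = []
    for start, end in captions:
        # Jump to the first beat at or after the caption start.
        i = bisect_left(beats, start)
        while i < len(beats) and beats[i] < end:
            windows.append((beats[i], min(beats[i] + pulse, end)))
            i += 1
    return windows
```

Beats that land while no caption is displayed produce no window, which matches the description of the pulse applying to the active caption only.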

The 20 styles

Slug	Display name	Primary use case
`clean_lower_third`	Clean Lower Third	Documentary, interview subtitles
`clean_center`	Clean Center	Narrative subtitles, no music
`minimal_top`	Minimal Top	B-roll with top-aligned text
`bold_lower_third`	Bold Lower Third	Social media, short-form video
`bold_center`	Bold Center	Impact titles doubling as captions
`outline_lower_third`	Outline Lower Third	Subtitles over bright or variable backgrounds
`outline_center`	Outline Center	Lyrics on music videos with complex backgrounds
`drop_shadow_lower`	Drop Shadow Lower	Standard broadcast-style lower third
`karaoke_classic`	Karaoke Classic	Word-by-word left-to-right wipe highlight
`karaoke_neon`	Karaoke Neon	Word highlight with neon glow effect
`karaoke_fill`	Karaoke Fill	Word highlight with solid fill colour swap
`karaoke_bounce`	Karaoke Bounce	Word highlight with vertical bounce on activation
`karaoke_wave`	Karaoke Wave	Sequential per-character wave on active word
`pulse_bold`	Pulse Bold	Caption scales on beat onset, bold weight
`pulse_glow`	Pulse Glow	Caption glow radius pulses on beat onset
`pulse_color`	Pulse Color	Caption colour shifts on beat onset
`fade_word`	Fade Word	Each word fades in on its start timecode
`slide_up`	Slide Up	Line slides up into position on entry
`typewriter`	Typewriter	Characters reveal left-to-right at constant rate
`pop_center`	Pop Center	Caption pops to full size from zero scale

All 20 styles are parametric. Every parameter has a default; pass keyword arguments to apply_caption, burn_lyrics, or burn_subtitles to override individual parameters without changing the base style.
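
The override behaviour can be pictured as copying a style record and swapping individual fields. The sketch below assumes a frozen dataclass; `CaptionStyle`, its fields, its default values, and `resolve_style` are hypothetical names chosen for illustration, and the real catalog may be shaped differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CaptionStyle:
    """Hypothetical style record; the real catalog's fields may differ."""
    slug: str
    font_size: int = 48
    base_color: str = "#FFFFFF"
    highlight_color: str = "#FFFF00"


def resolve_style(base: CaptionStyle, **overrides) -> CaptionStyle:
    """Return a copy of `base` with the given parameters overridden."""
    # dataclasses.replace raises TypeError for unknown field names, so a
    # misspelled override fails loudly instead of being silently ignored.
    return replace(base, **overrides)


karaoke_neon = CaptionStyle(slug="karaoke_neon")
custom = resolve_style(karaoke_neon, highlight_color="#FF3399")
```

Here `custom` carries the new highlight colour while `karaoke_neon` itself is unchanged, which is the property that keeps a named style reproducible across calls.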

Lyrics input formats

caption-cast accepts timed lyrics in three ways:

pysubs2-readable files — SRT, ASS, SSA, VTT. Word-level timing in ASS/SSA format is supported for karaoke styles. Pass the file path as the lyrics or subtitles argument.

LRC files — Standard LRC with line timestamps. Enhanced LRC with word-level <mm:ss.xx> tags is required for karaoke styles. Pass the file path; caption-cast parses LRC internally via pysubs2.

lyric-sync LyricTrack — When lyric-sync >= 0.1 is installed, pass a LyricTrack object directly. This is the preferred path when you need to ingest MusicXML, parse Spotify's lyrics API, or align phoneme boundaries. lyric-sync produces word-level timecodes that map cleanly onto the karaoke styles.

# With lyric-sync installed
from lyric_sync import parse_lrc, align_words
from caption_cast import burn_lyrics

track = parse_lrc("song.lrc")
aligned = align_words(track, audio="song.wav")   # optional phoneme alignment

burn_lyrics(
    video="clip.mp4",
    lyrics=aligned,          # LyricTrack object accepted directly
    style="karaoke_wave",
    output="out.mp4",
)
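
For readers unfamiliar with Enhanced LRC, the sketch below shows what word-level tags look like once parsed. It is an illustrative standard-library parser, not the one caption-cast or lyric-sync uses; the function name `parse_enhanced_lrc_line` and the assumption that every line starts with a `[mm:ss.xx]` tag followed by `<mm:ss.xx>word` pairs are mine.

```python
import re
from typing import List, Tuple

LINE_TAG = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\]")
WORD_TAG = re.compile(r"<(\d+):(\d+(?:\.\d+)?)>([^<]*)")


def _seconds(minutes: str, seconds: str) -> float:
    """Convert an mm / ss.xx pair into seconds."""
    return int(minutes) * 60 + float(seconds)


def parse_enhanced_lrc_line(line: str) -> Tuple[float, List[Tuple[float, str]]]:
    """Parse one Enhanced LRC line into (line_start, [(word_start, word), ...])."""
    head = LINE_TAG.match(line)
    if head is None:
        raise ValueError(f"missing [mm:ss.xx] line timestamp: {line!r}")
    line_start = _seconds(*head.groups())
    words = [
        (_seconds(m, s), text.strip())
        for m, s, text in WORD_TAG.findall(line[head.end():])
        if text.strip()  # a trailing tag with no text only marks the line end
    ]
    return line_start, words
```

A line such as `[00:12.00]<00:12.00>Never <00:12.50>gonna` yields a line start of 12 seconds and one start time per word, which is the granularity the karaoke styles need.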

CLI

caption-cast ships a command-line entry point, also named caption-cast.

List all styles:

caption-cast styles

Burn subtitles from the command line:

caption-cast burn \
  --video input.mp4 \
  --subtitles dialogue.srt \
  --style clean_lower_third \
  --output output.mp4

Burn karaoke lyrics:

caption-cast lyrics \
  --video music_video.mp4 \
  --lyrics song.lrc \
  --style karaoke_neon \
  --highlight-color "#FF3399" \
  --output out.mp4

Inspect a style's parameters:

caption-cast info karaoke_neon

Get version:

caption-cast --version

All burn and lyrics subcommand options map 1-to-1 to the Python API keyword arguments. Run caption-cast <subcommand> --help for a full parameter list.
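
The 1-to-1 mapping follows the usual argparse convention of turning dashes into underscores. The sketch below illustrates that convention with a stand-in parser; `build_parser` and `cli_to_kwargs` are hypothetical and only mirror the flags shown in the lyrics example above, not the package's actual CLI code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the `lyrics` options shown above."""
    parser = argparse.ArgumentParser(prog="caption-cast lyrics")
    for flag in ("--video", "--lyrics", "--style", "--output"):
        parser.add_argument(flag, required=True)
    parser.add_argument("--highlight-color")
    return parser


def cli_to_kwargs(argv):
    """Turn CLI flags into keyword arguments, dropping options left unset."""
    # argparse converts dashes to underscores, so --highlight-color
    # arrives as highlight_color.
    namespace = build_parser().parse_args(argv)
    return {key: value for key, value in vars(namespace).items() if value is not None}
```

Dropping unset options means a style's own defaults apply unless the user passes a flag explicitly.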

Composition with the Trollfabriken stack

caption-cast sits between the ingestion packages (lyric-sync, audio-arrange) and the render engine (text-fx). The dependency chain is:

lyric-sync ────┐
               ├──► caption-cast ──► text-fx ──► ffmpeg
audio-arrange ─┘

caption-cast does not need lyric-sync or audio-arrange at runtime if you supply pre-timed subtitle files. Install the extras only when you need programmatic lyrics ingestion or beat analysis.
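
A script that should work with or without the extras can check for them up front. This is a small sketch using the standard library; `available_extras` is a hypothetical helper, and the import names `lyric_sync` and `audio_arrange` are taken from the examples earlier in this README.

```python
from importlib.util import find_spec

# Extra name -> import name, as used in the installation and quick start sections.
OPTIONAL_MODULES = {"lyrics": "lyric_sync", "beats": "audio_arrange"}


def available_extras(modules=OPTIONAL_MODULES):
    """Report which optional sibling packages can be imported."""
    return {extra: find_spec(name) is not None for extra, name in modules.items()}
```

If `available_extras()["beats"]` is false, a caller could fall back to plain `burn_subtitles` with a pre-timed file rather than attempting beat analysis.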

For title cards and motion-graphic intro/outro sequences, see title-fx, which builds on text-fx with a separate catalog of 44 cinematic title effects.

Package structure

caption-cast/
  src/
    caption_cast/
      __init__.py          # public API: apply_caption, burn_lyrics, burn_subtitles,
      api.py               #             beat_synced_captions, list_styles, get_style_info
      cli.py               # caption-cast CLI entry point
      renderer.py          # delegates to text-fx render pipeline
      timing.py            # timecode parsing, word-level offset resolution
      beat.py              # beat grid → caption emphasis schedule
      styles/
        __init__.py        # style registry
        catalog.py         # style definitions (parametric dataclasses)
      data/
        caption_styles.json  # serialised style catalog (shipped in wheel)
  tests/

License

Part of the Trollfabriken stack.

PyPI: https://pypi.org/project/caption-cast/
Issues: https://github.com/tomastimelock/caption-cast/issues
text-fx (render engine): https://github.com/tomastimelock/text-fx
lyric-sync (lyrics ingestion): https://github.com/tomastimelock/lyric-sync
audio-arrange (beat detection): https://github.com/tomastimelock/audio-arrange

Release history

Version 0.1.0 (this version), released May 23, 2026

Download files

Source Distribution

caption_cast-0.1.0.tar.gz (29.0 kB)

Uploaded May 23, 2026 Source

Built Distribution

caption_cast-0.1.0-py3-none-any.whl (39.0 kB)

Uploaded May 23, 2026 Python 3

File details

Details for the file caption_cast-0.1.0.tar.gz.

File metadata

Download URL: caption_cast-0.1.0.tar.gz
Upload date: May 23, 2026
Size: 29.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for caption_cast-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f4864f79a088c9ffe0cc706b420e63b0b66e50af9d98343b3834fc1ff2a1e624`
MD5	`f40d5f9f1b6cdc769fe8120ec4a143ba`
BLAKE2b-256	`375a3ecc05c9a2e29983c484ffb1e6bc3577ae3c7905e449207fb96afa9c5b74`
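
A downloaded file can be checked against the SHA256 digest listed above before installing it. The sketch below uses only `hashlib`; the helper names `sha256_of` and `verify_download` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex: str) -> bool:
    """Compare a file's SHA256 against a published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify_download("caption_cast-0.1.0.tar.gz", "f4864f79a088c9ffe0cc706b420e63b0b66e50af9d98343b3834fc1ff2a1e624")` should return true for an unmodified copy of the source distribution.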

Provenance

The following attestation bundles were made for caption_cast-0.1.0.tar.gz:

Publisher: release.yml on tomastimelock/caption-cast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: caption_cast-0.1.0.tar.gz
- Subject digest: f4864f79a088c9ffe0cc706b420e63b0b66e50af9d98343b3834fc1ff2a1e624
- Sigstore transparency entry: 1616142283
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: tomastimelock/caption-cast@64931062396c6cdffb1ecff438bb6cce3af44579
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tomastimelock
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@64931062396c6cdffb1ecff438bb6cce3af44579
- Trigger Event: push

File details

Details for the file caption_cast-0.1.0-py3-none-any.whl.

File metadata

Download URL: caption_cast-0.1.0-py3-none-any.whl
Upload date: May 23, 2026
Size: 39.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for caption_cast-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`31cf0d52563facdefd8447abdb28799d1840923c72f96d38acbe517c09457be0`
MD5	`5198308e666db18c8b61078d68810d71`
BLAKE2b-256	`b611cde5765309dfebb1aedabc5d0075ee7dde72abeba2b18078710fbf8d91dc`

Provenance

The following attestation bundles were made for caption_cast-0.1.0-py3-none-any.whl:

Publisher: release.yml on tomastimelock/caption-cast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: caption_cast-0.1.0-py3-none-any.whl
- Subject digest: 31cf0d52563facdefd8447abdb28799d1840923c72f96d38acbe517c09457be0
- Sigstore transparency entry: 1616142299
- Sigstore integration time: May 23, 2026
Source repository:
- Permalink: tomastimelock/caption-cast@64931062396c6cdffb1ecff438bb6cce3af44579
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tomastimelock
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@64931062396c6cdffb1ecff438bb6cce3af44579
- Trigger Event: push
