Deck-spec to narrated MP4: TTS via ElevenLabs, frame capture via web-overlay, audio mix via audio-arrange, video assembly via video-arrange.

Maintainers

opusmorale

Project description

talk-cast

Built at Trollfabriken AITrix AB to close the loop: AIMOS Insight audit reports, Granskning case briefs, and civic-education explainers all begin as Deck objects authored by an LLM and end as narrated videos posted to the news site, without leaving Python or paying a video-API vendor. talk-cast uses ElevenLabs for narration, audio-arrange for the mix, web-overlay for frame capture, and video-arrange for the final assembly. Because narration audio is cached per slide, re-rendering after editing one slide takes seconds, not minutes.

What it solves

Problem	How talk-cast fixes it
TTS calls cost money on every re-render	Per-slide audio cache; unchanged slides reuse the cached MP3
Frame capture needs a real browser	`web-overlay` drives Playwright Chromium headlessly
Audio timing drifts from slide duration	`audio-arrange` trims/pads each clip to match slide duration exactly
Assembling MP4 from frames + audio requires ffmpeg knowledge	`video-arrange` wraps ffmpeg; one call produces the final file
Different voice per section is messy to wire up	`NarrateConfig.voice_map` maps slide indices to ElevenLabs voice IDs
Subtitle track is optional but painful to add	`talk-cast[subtitles]` writes a WebVTT file alongside the MP4

Installation

pip install talk-cast

With subtitle support:

pip install "talk-cast[subtitles]"

Development extras:

pip install "talk-cast[dev]"

Runtime requirements

ffmpeg must be on PATH:

# Ubuntu / Debian
sudo apt-get install ffmpeg

# macOS
brew install ffmpeg

# Windows
choco install ffmpeg

Playwright Chromium for frame capture:

python -m playwright install chromium

ElevenLabs API key — set the environment variable:

export ELEVENLABS_API_KEY="your-key-here"
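
The requirements above can be checked before a render starts. The following is a minimal sketch and not part of talk-cast: `missing_requirements` is a hypothetical helper that only looks for ffmpeg on PATH and for the `ELEVENLABS_API_KEY` variable. It does not verify that Playwright Chromium is installed.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def missing_requirements(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of problems; an empty list means both checks passed.

    `env` and `which` are injectable so the check can be tested without
    touching the real environment.
    """
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print("missing:", problem)
```

Running the file directly prints one line per missing requirement and nothing when both are present.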

Quick start

from pathlib import Path

from deck_spec import Deck
from talk_cast import NarrateConfig, cast

# Load a deck authored by an LLM or built by hand
deck = Deck.model_validate_json(Path("my_deck.json").read_text(encoding="utf-8"))

config = NarrateConfig(
    voice_id="21m00Tcm4TlvDq8ikWAM",   # ElevenLabs voice ID
    slide_duration=8.0,                   # seconds per slide
    output_path="output/my_video.mp4",
    cache_dir=".talk-cast-cache",         # skip TTS if audio already cached
)

# Render the full narrated video
cast(deck, config)

Re-run after editing one slide and only that slide's TTS call is repeated; all other audio is served from cache.
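
This README does not say how cache entries are keyed. One plausible scheme, sketched here as an assumption, hashes everything that affects the synthesized audio (narration text, voice ID, model ID), so editing one slide's notes changes only that slide's key. The names `cache_key`, `cached_audio_path`, and `needs_tts` are hypothetical and are not talk-cast API.

```python
import hashlib
import json
from pathlib import Path


def cache_key(text: str, voice_id: str, model_id: str) -> str:
    """Derive a stable key from the inputs that determine the audio."""
    payload = json.dumps(
        {"text": text, "voice_id": voice_id, "model_id": model_id},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_audio_path(cache_dir: str, text: str, voice_id: str, model_id: str) -> Path:
    """Location where the MP3 for this narration would be stored."""
    return Path(cache_dir) / f"{cache_key(text, voice_id, model_id)}.mp3"


def needs_tts(cache_dir: str, text: str, voice_id: str, model_id: str) -> bool:
    """True when no cached MP3 exists, i.e. a TTS call would be required."""
    return not cached_audio_path(cache_dir, text, voice_id, model_id).exists()
```

Under this scheme, changing the voice or model also invalidates the entry, which avoids serving audio spoken by the wrong voice.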

The pipeline

Deck object
    │
    ① Read slides + speaker notes
    │
    ② Check cache (.talk-cast-cache/)
    │        │
    │   hit ─┘   miss ─► ③ ElevenLabs TTS → MP3 → cache
    │
    ④ audio-arrange: trim / pad each MP3 to slide_duration
    │
    ⑤ slide-render: render each slide to HTML
    │
    ⑥ web-overlay: Playwright Chromium captures PNG frame per slide
    │
    ⑦ Assemble per-slide: frame PNG + padded MP3
    │
    ⑧ video-arrange: encode each slide segment to MP4 clip
    │
    ⑨ video-arrange: concatenate all clips → final MP4
    │
    ⑩ (optional) write WebVTT subtitle file alongside MP4

Each step is independently testable. Steps ③–④ are skipped when TALK_CAST_SKIP_LIVE_TTS=1.
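
The internals of audio-arrange and video-arrange are not shown in this README. As a rough illustration of steps ④, ⑧, and ⑨, the sketch below builds the kind of ffmpeg command lines such wrappers might issue. The function names are hypothetical; the ffmpeg options (`apad`, `-loop 1`, `-tune stillimage`, the concat demuxer) are standard ffmpeg features, but whether these packages use them is an assumption.

```python
def fit_audio_args(src: str, dst: str, duration: float) -> list[str]:
    """Step 4: pad with silence (apad), then cut the output at `duration`.

    Short clips end up padded and long clips end up trimmed.
    """
    return ["ffmpeg", "-y", "-i", src, "-af", "apad", "-t", f"{duration:.3f}", dst]


def still_clip_args(
    frame_png: str, audio: str, dst: str, duration: float, fps: int = 30
) -> list[str]:
    """Step 8: hold one PNG on screen for `duration` seconds with audio."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", frame_png,  # repeat the single image as video
        "-i", audio,
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-r", str(fps),
        "-c:a", "aac",
        "-t", f"{duration:.3f}",
        dst,
    ]


def concat_list(clips: list[str]) -> str:
    """Step 9 input: one "file '<path>'" line per clip for the concat demuxer.

    Paths containing single quotes would need escaping; not handled here.
    """
    return "".join(f"file '{clip}'\n" for clip in clips)


def concat_args(list_file: str, dst: str) -> list[str]:
    """Step 9: join clips without re-encoding (they share codec settings)."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]
```

Each function only returns an argument list, so it can be inspected or passed to `subprocess.run(args, check=True)` once ffmpeg is installed.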

Configuration

NarrateConfig is a Pydantic model. All fields have defaults except voice_id.

Field	Type	Default	Description
`voice_id`	`str`	required	ElevenLabs voice ID for narration
`voice_map`	`dict[int, str]`	`{}`	Override voice per slide index (0-based)
`slide_duration`	`float`	`8.0`	Seconds each slide is held on screen
`output_path`	`str | Path`	`"output.mp4"`	Destination MP4 file
`cache_dir`	`str | Path`	`".talk-cast-cache"`	Directory for cached TTS audio
`resolution`	`tuple[int, int]`	`(1920, 1080)`	Frame resolution in pixels
`fps`	`int`	`30`	Frames per second in the output video
`model_id`	`str`	`"eleven_multilingual_v2"`	ElevenLabs model
`stability`	`float`	`0.5`	ElevenLabs stability (0.0–1.0)
`similarity_boost`	`float`	`0.75`	ElevenLabs similarity boost (0.0–1.0)
`subtitles`	`bool`	`False`	Write a `.vtt` file alongside the MP4
`theme`	`str`	`"default"`	slide-render theme name
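
With a fixed `slide_duration`, subtitle timing follows directly from the slide index. The sketch below shows how a WebVTT file could be built with one cue per slide. It illustrates the file format and is not the code behind `subtitles=True`; `vtt_timestamp` and `build_vtt` are hypothetical names, and one cue per slide taken from the speaker notes is an assumption.

```python
def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_vtt(notes: list[str], slide_duration: float) -> str:
    """Build WebVTT text with one cue per slide.

    Assumes each note is a single paragraph: blank lines or "-->" inside
    a note would break the cue structure and are not handled here.
    """
    lines = ["WEBVTT", ""]
    for index, text in enumerate(notes):
        start = index * slide_duration
        end = start + slide_duration
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # blank line terminates the cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_vtt(["Welcome.", "Key findings."], slide_duration=8.0))
```

A real implementation would probably split long notes into several shorter cues, since eight seconds of narration rarely fits on screen as one block.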

Voice map example

Assign a different voice to slides 5–9 (0-based indices):

config = NarrateConfig(
    voice_id="21m00Tcm4TlvDq8ikWAM",
    # voice_map is keyed per slide, so list every index in the range
    voice_map={i: "AZnzlk1XvdvUeBnXmlld" for i in range(5, 10)},
    slide_duration=10.0,
    output_path="output/report.mp4",
)
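
How an override is resolved is not spelled out here. Presumably a slide index missing from `voice_map` falls back to `voice_id`. The sketch below shows that lookup under this assumption; `voice_for_slide` is a hypothetical helper, not talk-cast API.

```python
def voice_for_slide(index: int, default_voice: str, voice_map: dict[int, str]) -> str:
    """Pick the per-slide override if present, else the default voice."""
    return voice_map.get(index, default_voice)


if __name__ == "__main__":
    overrides = {5: "AZnzlk1XvdvUeBnXmlld", 6: "AZnzlk1XvdvUeBnXmlld"}
    for slide in range(4, 8):
        print(slide, voice_for_slide(slide, "21m00Tcm4TlvDq8ikWAM", overrides))
```

With this lookup, only slides 5 and 6 use the override; slides 4 and 7 keep the default voice.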

CLI

Render a deck file to MP4:

talk-cast render my_deck.json --voice 21m00Tcm4TlvDq8ikWAM --output output/video.mp4

Set slide duration to 12 seconds and enable subtitles:

talk-cast render my_deck.json \
    --voice 21m00Tcm4TlvDq8ikWAM \
    --duration 12 \
    --subtitles \
    --output output/video.mp4

Purge the TTS cache (forces re-synthesis on next render):

talk-cast cache clear

Inspect the cache — see which slides have audio:

talk-cast cache list

Validate a deck before rendering (runs deck-spec validation):

talk-cast validate my_deck.json

Package structure

talk-cast/
├── src/
│   └── talk_cast/
│       ├── __init__.py          ← public API: cast(), NarrateConfig
│       ├── cli.py               ← talk-cast entry point
│       ├── narrate.py           ← TTS orchestration and cache logic
│       ├── capture.py           ← web-overlay frame capture per slide
│       ├── assemble.py          ← video-arrange + audio-arrange wiring
│       ├── config.py            ← NarrateConfig Pydantic model
│       ├── cache.py             ← cache read/write helpers
│       └── py.typed             ← PEP 561 marker
├── tests/
│   ├── fixtures/                ← small JSON decks + reference WAVs
│   ├── test_narrate.py
│   ├── test_capture.py
│   ├── test_assemble.py
│   └── test_cli.py
├── pyproject.toml
├── README.md
└── LICENSE

Release history

This version: 0.1.0 (May 22, 2026)

Download files

Source Distribution

talk_cast-0.1.0.tar.gz (12.4 kB), uploaded May 22, 2026, Source

Built Distribution

talk_cast-0.1.0-py3-none-any.whl (20.1 kB), uploaded May 22, 2026, Python 3

File details

Details for the file talk_cast-0.1.0.tar.gz.

File metadata

Download URL: talk_cast-0.1.0.tar.gz
Upload date: May 22, 2026
Size: 12.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talk_cast-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`88179ae92b24b657ec2c6420967f986d854b91695aab4f9dc66eb503d3511cc0`
MD5	`1b3e73c156f1f9747fbca9dc08ba9d18`
BLAKE2b-256	`d5b6fd7fcf10a09d694a3ee571b601cd81e3162f825f96179894d412097bb363`

Provenance

The following attestation bundles were made for talk_cast-0.1.0.tar.gz:

Publisher: release.yml on tomastimelock/talk-cast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talk_cast-0.1.0.tar.gz
- Subject digest: 88179ae92b24b657ec2c6420967f986d854b91695aab4f9dc66eb503d3511cc0
- Sigstore transparency entry: 1602221726
- Sigstore integration time: May 22, 2026
Source repository:
- Permalink: tomastimelock/talk-cast@b06f9d29643bcaf70442fa2a36cebbbd82dd73d9
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tomastimelock
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b06f9d29643bcaf70442fa2a36cebbbd82dd73d9
- Trigger Event: push

File details

Details for the file talk_cast-0.1.0-py3-none-any.whl.

File metadata

Download URL: talk_cast-0.1.0-py3-none-any.whl
Upload date: May 22, 2026
Size: 20.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for talk_cast-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8e5f562821aedd180d016d2107e32ad6f097917d77444ee0abf38c986d70335a`
MD5	`8d69bac9976d61bbe6db61c9ede16393`
BLAKE2b-256	`63634804a67cfe40e49be2ed164a95621ea779e3a6a99851ccfd828d75a17ea6`

Provenance

The following attestation bundles were made for talk_cast-0.1.0-py3-none-any.whl:

Publisher: release.yml on tomastimelock/talk-cast

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: talk_cast-0.1.0-py3-none-any.whl
- Subject digest: 8e5f562821aedd180d016d2107e32ad6f097917d77444ee0abf38c986d70335a
- Sigstore transparency entry: 1602221757
- Sigstore integration time: May 22, 2026
Source repository:
- Permalink: tomastimelock/talk-cast@b06f9d29643bcaf70442fa2a36cebbbd82dd73d9
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/tomastimelock
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b06f9d29643bcaf70442fa2a36cebbbd82dd73d9
- Trigger Event: push
